hgbook

diff fr/undo.tex @ 962:bac1c207c76d
Merge Hughues's remarks with William's
author: Romain PELISSE <belaran@gmail.com>
date: Thu Mar 26 08:57:10 2009 +0100 (2009-03-26)
parents: f79542a53cb2
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/fr/undo.tex	Thu Mar 26 08:57:10 2009 +0100
     1.3 @@ -0,0 +1,767 @@
     1.4 +\chapter{Finding and fixing your mistakes}
     1.5 +\label{chap:undo}
     1.6 +
     1.7 +To err might be human, but to really handle the consequences well
     1.8 +takes a top-notch revision control system.  In this chapter, we'll
     1.9 +discuss some of the techniques you can use when you find that a
    1.10 +problem has crept into your project.  Mercurial has some highly
    1.11 +capable features that will help you to isolate the sources of
    1.12 +problems, and to handle them appropriately.
    1.13 +
    1.14 +\section{Erasing local history}
    1.15 +
    1.16 +\subsection{The accidental commit}
    1.17 +
    1.18 +I have the occasional but persistent problem of typing rather more
    1.19 +quickly than I can think, which sometimes results in me committing a
    1.20 +changeset that is either incomplete or plain wrong.  In my case, the
    1.21 +usual kind of incomplete changeset is one in which I've created a new
    1.22 +source file, but forgotten to \hgcmd{add} it.  A ``plain wrong''
    1.23 +changeset is not as common, but no less annoying.
    1.24 +
    1.25 +\subsection{Rolling back a transaction}
    1.26 +\label{sec:undo:rollback}
    1.27 +
    1.28 +In section~\ref{sec:concepts:txn}, I mentioned that Mercurial treats
    1.29 +each modification of a repository as a \emph{transaction}.  Every time
    1.30 +you commit a changeset or pull changes from another repository,
    1.31 +Mercurial remembers what you did.  You can undo, or \emph{roll back},
    1.32 +exactly one of these actions using the \hgcmd{rollback} command.  (See
    1.33 +section~\ref{sec:undo:rollback-after-push} for an important caveat
    1.34 +about the use of this command.)
    1.35 +
    1.36 +Here's a mistake that I often find myself making: committing a change
    1.37 +in which I've created a new file, but forgotten to \hgcmd{add} it.
    1.38 +\interaction{rollback.commit}
    1.39 +Looking at the output of \hgcmd{status} after the commit immediately
    1.40 +confirms the error.
    1.41 +\interaction{rollback.status}
    1.42 +The commit captured the changes to the file \filename{a}, but not the
    1.43 +new file \filename{b}.  If I were to push this changeset to a
    1.44 +repository that I shared with a colleague, the chances are high that
    1.45 +something in \filename{a} would refer to \filename{b}, which would not
    1.46 +be present in their repository when they pulled my changes.  I would
    1.47 +thus become the object of some indignation.
    1.48 +
    1.49 +However, luck is with me---I've caught my error before I pushed the
    1.50 +changeset.  I use the \hgcmd{rollback} command, and Mercurial makes
    1.51 +that last changeset vanish.
    1.52 +\interaction{rollback.rollback}
    1.53 +Notice that the changeset is no longer present in the repository's
    1.54 +history, and the working directory once again thinks that the file
    1.55 +\filename{a} is modified.  The commit and rollback have left the
    1.56 +working directory exactly as it was prior to the commit; the changeset
    1.57 +has been completely erased.  I can now safely \hgcmd{add} the file
    1.58 +\filename{b}, and rerun my commit.
    1.59 +\interaction{rollback.add}
    1.60 +
    1.61 +\subsection{The erroneous pull}
    1.62 +
    1.63 +It's common practice with Mercurial to maintain separate development
    1.64 +branches of a project in different repositories.  Your development
    1.65 +team might have one shared repository for your project's ``0.9''
    1.66 +release, and another, containing different changes, for the ``1.0''
    1.67 +release.
    1.68 +
    1.69 +Given this, you can imagine that the consequences could be messy if
    1.70 +you had a local ``0.9'' repository, and accidentally pulled changes
    1.71 +from the shared ``1.0'' repository into it.  At worst, you could be
    1.72 +paying insufficient attention, and push those changes into the shared
    1.73 +``0.9'' tree, confusing your entire team (but don't worry, we'll
    1.74 +return to this horror scenario later).  However, it's more likely that
    1.75 +you'll notice immediately, because Mercurial will display the URL it's
    1.76 +pulling from, or you will see it pull a suspiciously large number of
    1.77 +changes into the repository.
    1.78 +
    1.79 +The \hgcmd{rollback} command will work nicely to expunge all of the
    1.80 +changesets that you just pulled.  Mercurial groups all changes from
    1.81 +one \hgcmd{pull} into a single transaction, so one \hgcmd{rollback} is
    1.82 +all you need to undo this mistake.
    1.83 +
    1.84 +\subsection{Rolling back is useless once you've pushed}
    1.85 +\label{sec:undo:rollback-after-push}
    1.86 +
    1.87 +The value of the \hgcmd{rollback} command drops to zero once you've
    1.88 +pushed your changes to another repository.  Rolling back a change
    1.89 +makes it disappear entirely, but \emph{only} in the repository in
    1.90 +which you perform the \hgcmd{rollback}.  Because a rollback eliminates
    1.91 +history, there's no way for the disappearance of a change to propagate
    1.92 +between repositories.
    1.93 +
    1.94 +If you've pushed a change to another repository---particularly if it's
    1.95 +a shared repository---it has essentially ``escaped into the wild,''
    1.96 +and you'll have to recover from your mistake in a different way.  What
    1.97 +will happen if you push a changeset somewhere, then roll it back, then
    1.98 +pull from the repository you pushed to, is that the changeset will
    1.99 +reappear in your repository.
   1.100 +
   1.101 +(If you absolutely know for sure that the change you want to roll back
   1.102 +is the most recent change in the repository that you pushed to,
   1.103 +\emph{and} you know that nobody else could have pulled it from that
   1.104 +repository, you can roll back the changeset there, too, but you really
   1.105 +should not rely on this working reliably.  If you do this,
   1.106 +sooner or later a change really will make it into a repository that
   1.107 +you don't directly control (or have forgotten about), and come back to
   1.108 +bite you.)
   1.109 +
   1.110 +\subsection{You can only roll back once}
   1.111 +
   1.112 +Mercurial stores exactly one transaction in its transaction log; that
   1.113 +transaction is the most recent one that occurred in the repository.
   1.114 +This means that you can only roll back one transaction.  If you expect
   1.115 +to be able to roll back one transaction, then its predecessor, this is
   1.116 +not the behaviour you will get.
   1.117 +\interaction{rollback.twice}
   1.118 +Once you've rolled back one transaction in a repository, you can't
   1.119 +roll back again in that repository until you perform another commit or
   1.120 +pull.
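To make the one-transaction limit concrete, here is a toy model in Python.  It is only a sketch under stated assumptions: the class \texttt{ToyRepo} and its methods are hypothetical names invented for illustration, and the model says nothing about how Mercurial actually stores its transaction log.  It shows only the behaviour described above: each commit or pull overwrites the single undo record, and a rollback consumes it.

```python
class ToyRepo:
    """Toy model (not Mercurial's implementation): a list of changesets
    plus a single undo record for the most recent transaction."""

    def __init__(self):
        self.changesets = []
        # Length of the history before the last transaction, or None
        # when there is nothing left to roll back.
        self._undo_length = None

    def _transaction(self, new_changesets):
        # Only the most recent transaction is remembered; any earlier
        # undo record is overwritten.
        self._undo_length = len(self.changesets)
        self.changesets.extend(new_changesets)

    def commit(self, name):
        self._transaction([name])

    def pull(self, incoming):
        # Everything brought in by one pull forms one transaction.
        self._transaction(list(incoming))

    def rollback(self):
        if self._undo_length is None:
            return False  # no transaction to undo
        del self.changesets[self._undo_length:]
        self._undo_length = None
        return True


repo = ToyRepo()
repo.commit("a")
repo.pull(["b", "c", "d"])
first = repo.rollback()   # undoes the whole pull in one step
second = repo.rollback()  # nothing left to undo; "a" stays committed
print(repo.changesets, first, second)
```

In this model the second rollback does nothing until another commit or pull records a new transaction, which mirrors the behaviour shown in the interaction above.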
   1.121 +
   1.122 +\section{Reverting the mistaken change}
   1.123 +
   1.124 +If you make a modification to a file, and decide that you really
   1.125 +didn't want to change the file at all, and you haven't yet committed
   1.126 +your changes, the \hgcmd{revert} command is the one you'll need.  It
   1.127 +looks at the changeset that's the parent of the working directory, and
   1.128 +restores the contents of the file to their state as of that changeset.
   1.129 +(That's a long-winded way of saying that, in the normal case, it
   1.130 +undoes your modifications.)
   1.131 +
   1.132 +Let's illustrate how the \hgcmd{revert} command works with yet another
   1.133 +small example.  We'll begin by modifying a file that Mercurial is
   1.134 +already tracking.
   1.135 +\interaction{daily.revert.modify}
   1.136 +If we don't want that change, we can simply \hgcmd{revert} the file.
   1.137 +\interaction{daily.revert.unmodify}
   1.138 +The \hgcmd{revert} command provides us with an extra degree of safety
   1.139 +by saving our modified file with a \filename{.orig} extension.
   1.140 +\interaction{daily.revert.status}
   1.141 +
   1.142 +Here is a summary of the cases that the \hgcmd{revert} command can
   1.143 +deal with.  We will describe each of these in more detail in the
   1.144 +section that follows.
   1.145 +\begin{itemize}
   1.146 +\item If you modify a file, it will restore the file to its unmodified
   1.147 +  state.
   1.148 +\item If you \hgcmd{add} a file, it will undo the ``added'' state of
   1.149 +  the file, but leave the file itself untouched.
   1.150 +\item If you delete a file without telling Mercurial, it will restore
   1.151 +  the file to its unmodified contents.
   1.152 +\item If you use the \hgcmd{remove} command to remove a file, it will
   1.153 +  undo the ``removed'' state of the file, and restore the file to its
   1.154 +  unmodified contents.
   1.155 +\end{itemize}
   1.156 +
   1.157 +\subsection{File management errors}
   1.158 +\label{sec:undo:mgmt}
   1.159 +
   1.160 +The \hgcmd{revert} command is useful for more than just modified
   1.161 +files.  It lets you reverse the results of all of Mercurial's file
   1.162 +management commands---\hgcmd{add}, \hgcmd{remove}, and so on.
   1.163 +
   1.164 +If you \hgcmd{add} a file, then decide that in fact you don't want
   1.165 +Mercurial to track it, use \hgcmd{revert} to undo the add.  Don't
   1.166 +worry; Mercurial will not modify the file in any way.  It will just
   1.167 +``unmark'' the file.
   1.168 +\interaction{daily.revert.add}
   1.169 +
   1.170 +Similarly, if you ask Mercurial to \hgcmd{remove} a file, you can use
   1.171 +\hgcmd{revert} to restore it to the contents it had as of the parent
   1.172 +of the working directory.
   1.173 +\interaction{daily.revert.remove}
   1.174 +This works just as well for a file that you deleted by hand, without
   1.175 +telling Mercurial (recall that in Mercurial terminology, this kind of
   1.176 +file is called ``missing'').
   1.177 +\interaction{daily.revert.missing}
   1.178 +
   1.179 +If you revert a \hgcmd{copy}, the copied-to file remains in your
   1.180 +working directory afterwards, untracked.  Since a copy doesn't affect
   1.181 +the copied-from file in any way, Mercurial doesn't do anything with
   1.182 +the copied-from file.
   1.183 +\interaction{daily.revert.copy}
   1.184 +
   1.185 +\subsubsection{A slightly special case: reverting a rename}
   1.186 +
   1.187 +If you \hgcmd{rename} a file, there is one small detail that
   1.188 +you should remember.  When you \hgcmd{revert} a rename, it's not
   1.189 +enough to provide the name of the renamed-to file, as you can see
   1.190 +here.
   1.191 +\interaction{daily.revert.rename}
   1.192 +As you can see from the output of \hgcmd{status}, the renamed-to file
   1.193 +is no longer identified as added, but the renamed-\emph{from} file is
   1.194 +still removed!  This is counter-intuitive (at least to me), but at
   1.195 +least it's easy to deal with.
   1.196 +\interaction{daily.revert.rename-orig}
   1.197 +So remember, to revert a \hgcmd{rename}, you must provide \emph{both}
   1.198 +the source and destination names.  
   1.199 +
   1.200 +% TODO: the output doesn't look like it will be removed!
   1.201 +
   1.202 +(By the way, if you rename a file, then modify the renamed-to file,
   1.203 +then revert both components of the rename, when Mercurial restores the
   1.204 +file that was removed as part of the rename, it will be unmodified.
   1.205 +If you need the modifications in the renamed-to file to show up in the
   1.206 +renamed-from file, don't forget to copy them over.)
   1.207 +
   1.208 +These fiddly aspects of reverting a rename arguably constitute a small
   1.209 +bug in Mercurial.
   1.210 +
   1.211 +\section{Dealing with committed changes}
   1.212 +
   1.213 +Consider a case where you have committed a change $a$, and another
   1.214 +change $b$ on top of it; you then realise that change $a$ was
   1.215 +incorrect.  Mercurial lets you ``back out'' an entire changeset
   1.216 +automatically, and provides building blocks that let you reverse part of a
   1.217 +changeset by hand.
   1.218 +
   1.219 +Before you read this section, here's something to keep in mind: the
   1.220 +\hgcmd{backout} command undoes changes by \emph{adding} history, not
   1.221 +by modifying or erasing it.  It's the right tool to use if you're
   1.222 +fixing bugs, but not if you're trying to undo some change that has
   1.223 +catastrophic consequences.  To deal with those, see
   1.224 +section~\ref{sec:undo:aaaiiieee}.
   1.225 +
   1.226 +\subsection{Backing out a changeset}
   1.227 +
   1.228 +The \hgcmd{backout} command lets you ``undo'' the effects of an entire
   1.229 +changeset in an automated fashion.  Because Mercurial's history is
   1.230 +immutable, this command \emph{does not} get rid of the changeset you
   1.231 +want to undo.  Instead, it creates a new changeset that
   1.232 +\emph{reverses} the effect of the to-be-undone changeset.
   1.233 +
   1.234 +The operation of the \hgcmd{backout} command is a little intricate, so
   1.235 +let's illustrate it with some examples.  First, we'll create a
   1.236 +repository with some simple changes.
   1.237 +\interaction{backout.init}
   1.238 +
   1.239 +The \hgcmd{backout} command takes a single changeset ID as its
   1.240 +argument; this is the changeset to back out.  Normally,
   1.241 +\hgcmd{backout} will drop you into a text editor to write a commit
   1.242 +message, so you can record why you're backing the change out.  In this
   1.243 +example, we provide a commit message on the command line using the
   1.244 +\hgopt{backout}{-m} option.
   1.245 +
   1.246 +\subsection{Backing out the tip changeset}
   1.247 +
   1.248 +We're going to start by backing out the last changeset we committed.
   1.249 +\interaction{backout.simple}
   1.250 +You can see that the second line from \filename{myfile} is no longer
   1.251 +present.  Taking a look at the output of \hgcmd{log} gives us an idea
   1.252 +of what the \hgcmd{backout} command has done.
   1.253 +\interaction{backout.simple.log}
   1.254 +Notice that the new changeset that \hgcmd{backout} has created is a
   1.255 +child of the changeset we backed out.  It's easier to see this in
   1.256 +figure~\ref{fig:undo:backout}, which presents a graphical view of the
   1.257 +change history.  As you can see, the history is nice and linear.
   1.258 +
   1.259 +\begin{figure}[htb]
   1.260 +  \centering
   1.261 +  \grafix{undo-simple}
   1.262 +  \caption{Backing out a change using the \hgcmd{backout} command}
   1.263 +  \label{fig:undo:backout}
   1.264 +\end{figure}
   1.265 +
   1.266 +\subsection{Backing out a non-tip change}
   1.267 +
   1.268 +If you want to back out a change other than the last one you
   1.269 +committed, pass the \hgopt{backout}{--merge} option to the
   1.270 +\hgcmd{backout} command.
   1.271 +\interaction{backout.non-tip.clone}
   1.272 +This makes backing out any changeset a ``one-shot'' operation that's
   1.273 +usually simple and fast.
   1.274 +\interaction{backout.non-tip.backout}
   1.275 +
   1.276 +If you take a look at the contents of \filename{myfile} after the
   1.277 +backout finishes, you'll see that the first and third changes are
   1.278 +present, but not the second.
   1.279 +\interaction{backout.non-tip.cat}
   1.280 +
   1.281 +As the graphical history in figure~\ref{fig:undo:backout-non-tip}
   1.282 +illustrates, Mercurial actually commits \emph{two} changes in this
   1.283 +kind of situation (the box-shaped nodes are the ones that Mercurial
   1.284 +commits automatically).  Before Mercurial begins the backout process,
   1.285 +it first remembers what the current parent of the working directory
   1.286 +is.  It then backs out the target changeset, and commits that as a
   1.287 +changeset.  Finally, it merges back to the previous parent of the
   1.288 +working directory, and commits the result of the merge.
   1.289 +
   1.290 +% TODO: to me it looks like mercurial doesn't commit the second merge automatically!
   1.291 +
   1.292 +\begin{figure}[htb]
   1.293 +  \centering
   1.294 +  \grafix{undo-non-tip}
   1.295 +  \caption{Automated backout of a non-tip change using the \hgcmd{backout} command}
   1.296 +  \label{fig:undo:backout-non-tip}
   1.297 +\end{figure}
   1.298 +
   1.299 +The result is that you end up ``back where you were'', only with some
   1.300 +extra history that undoes the effect of the changeset you wanted to
   1.301 +back out.
   1.302 +
   1.303 +\subsubsection{Always use the \hgopt{backout}{--merge} option}
   1.304 +
   1.305 +In fact, since the \hgopt{backout}{--merge} option will do the ``right
   1.306 +thing'' whether or not the changeset you're backing out is the tip
   1.307 +(i.e.~it won't try to merge if it's backing out the tip, since there's
   1.308 +no need), you should \emph{always} use this option when you run the
   1.309 +\hgcmd{backout} command.
   1.310 +
   1.311 +\subsection{Gaining more control of the backout process}
   1.312 +
   1.313 +While I've recommended that you always use the
   1.314 +\hgopt{backout}{--merge} option when backing out a change, the
   1.315 +\hgcmd{backout} command lets you decide how to merge a backout
   1.316 +changeset.  Taking control of the backout process by hand is something
   1.317 +you will rarely need to do, but it can be useful to understand what
   1.318 +the \hgcmd{backout} command is doing for you automatically.  To
   1.319 +illustrate this, let's clone our first repository, but omit the
   1.320 +backout change that it contains.
   1.321 +
   1.322 +\interaction{backout.manual.clone}
   1.323 +As with our earlier example, we'll commit a third changeset, then back
   1.324 +out its parent, and see what happens.
   1.325 +\interaction{backout.manual.backout} 
   1.326 +Our new changeset is again a descendant of the changeset we backed
   1.327 +out; it's thus a new head, \emph{not} a descendant of the changeset
   1.328 +that was the tip.  The \hgcmd{backout} command was quite explicit in
   1.329 +telling us this.
   1.330 +\interaction{backout.manual.log}
   1.331 +
   1.332 +Again, it's easier to see what has happened by looking at a graph of
   1.333 +the revision history, in figure~\ref{fig:undo:backout-manual}.  This
   1.334 +makes it clear that when we use \hgcmd{backout} to back out a change
   1.335 +other than the tip, Mercurial adds a new head to the repository (the
   1.336 +change it committed is box-shaped).
   1.337 +
   1.338 +\begin{figure}[htb]
   1.339 +  \centering
   1.340 +  \grafix{undo-manual}
   1.341 +  \caption{Backing out a change using the \hgcmd{backout} command}
   1.342 +  \label{fig:undo:backout-manual}
   1.343 +\end{figure}
   1.344 +
   1.345 +After the \hgcmd{backout} command has completed, it leaves the new
   1.346 +``backout'' changeset as the parent of the working directory.
   1.347 +\interaction{backout.manual.parents}
   1.348 +Now we have two isolated sets of changes.
   1.349 +\interaction{backout.manual.heads}
   1.350 +
   1.351 +Let's think about what we expect to see as the contents of
   1.352 +\filename{myfile} now.  The first change should be present, because
   1.353 +we've never backed it out.  The second change should be missing, as
   1.354 +that's the change we backed out.  Since the history graph shows the
   1.355 +third change as a separate head, we \emph{don't} expect to see the
   1.356 +third change present in \filename{myfile}.
   1.357 +\interaction{backout.manual.cat}
   1.358 +To get the third change back into the file, we just do a normal merge
   1.359 +of our two heads.
   1.360 +\interaction{backout.manual.merge}
   1.361 +Afterwards, the graphical history of our repository looks like
   1.362 +figure~\ref{fig:undo:backout-manual-merge}.
   1.363 +
   1.364 +\begin{figure}[htb]
   1.365 +  \centering
   1.366 +  \grafix{undo-manual-merge}
   1.367 +  \caption{Manually merging a backout change}
   1.368 +  \label{fig:undo:backout-manual-merge}
   1.369 +\end{figure}
   1.370 +
   1.371 +\subsection{Why \hgcmd{backout} works as it does}
   1.372 +
   1.373 +Here's a brief description of how the \hgcmd{backout} command works.
   1.374 +\begin{enumerate}
   1.375 +\item It ensures that the working directory is ``clean'', i.e.~that
   1.376 +  the output of \hgcmd{status} would be empty.
   1.377 +\item It remembers the current parent of the working directory.  Let's
   1.378 +  call this changeset \texttt{orig}.
   1.379 +\item It does the equivalent of a \hgcmd{update} to sync the working
   1.380 +  directory to the changeset you want to back out.  Let's call this
   1.381 +  changeset \texttt{backout}.
   1.382 +\item It finds the parent of that changeset.  Let's call that
   1.383 +  changeset \texttt{parent}.
   1.384 +\item For each file that the \texttt{backout} changeset affected, it
   1.385 +  does the equivalent of a \hgcmdargs{revert}{-r parent} on that file,
   1.386 +  to restore it to the contents it had before that changeset was
   1.387 +  committed.
   1.388 +\item It commits the result as a new changeset.  This changeset has
   1.389 +  \texttt{backout} as its parent.
   1.390 +\item If you specify \hgopt{backout}{--merge} on the command line, it
   1.391 +  merges with \texttt{orig}, and commits the result of the merge.
   1.392 +\end{enumerate}
   1.393 +
   1.394 +An alternative way to implement the \hgcmd{backout} command would be
   1.395 +to \hgcmd{export} the to-be-backed-out changeset as a diff, then use
   1.396 +the \cmdopt{patch}{--reverse} option to the \command{patch} command to
   1.397 +reverse the effect of the change without fiddling with the working
   1.398 +directory.  This sounds much simpler, but it would not work nearly as
   1.399 +well.
   1.400 +
   1.401 +The reason that \hgcmd{backout} does an update, a commit, a merge, and
   1.402 +another commit is to give the merge machinery the best chance to do a
   1.403 +good job when dealing with all the changes \emph{between} the change
   1.404 +you're backing out and the current tip.  
   1.405 +
   1.406 +If you're backing out a changeset that's~100 revisions back in your
   1.407 +project's history, the chances that the \command{patch} command will
   1.408 +be able to apply a reverse diff cleanly are not good, because
   1.409 +intervening changes are likely to have ``broken the context'' that
   1.410 +\command{patch} uses to determine whether it can apply a patch (if
   1.411 +this sounds like gibberish, see section~\ref{sec:mq:patch} for a
   1.412 +discussion of the \command{patch} command).  Also, Mercurial's merge
   1.413 +machinery will handle files and directories being renamed, permission
   1.414 +changes, and modifications to binary files, none of which
   1.415 +\command{patch} can deal with.
   1.416 +
   1.417 +\section{Changes that should never have been}
   1.418 +\label{sec:undo:aaaiiieee}
   1.419 +
   1.420 +Most of the time, the \hgcmd{backout} command is exactly what you need
   1.421 +if you want to undo the effects of a change.  It leaves a permanent
   1.422 +record of exactly what you did, both when committing the original
   1.423 +changeset and when you cleaned up after it.
   1.424 +
   1.425 +On rare occasions, though, you may find that you've committed a change
   1.426 +that really should not be present in the repository at all.  For
   1.427 +example, it would be very unusual, and usually considered a mistake,
   1.428 +to commit a software project's object files as well as its source
   1.429 +files.  Object files have almost no intrinsic value, and they're
   1.430 +\emph{big}, so they increase the size of the repository and the amount
   1.431 +of time it takes to clone or pull changes.
   1.432 +
   1.433 +Before I discuss the options that you have if you commit a ``brown
   1.434 +paper bag'' change (the kind that's so bad that you want to pull a
   1.435 +brown paper bag over your head), let me first discuss some approaches
   1.436 +that probably won't work.
   1.437 +
   1.438 +Since Mercurial treats history as accumulative---every change builds
   1.439 +on top of all changes that preceded it---you generally can't just make
   1.440 +disastrous changes disappear.  The one exception is when you've just
   1.441 +committed a change, and it hasn't been pushed or pulled into another
   1.442 +repository.  That's when you can safely use the \hgcmd{rollback}
   1.443 +command, as I detailed in section~\ref{sec:undo:rollback}.
   1.444 +
   1.445 +After you've pushed a bad change to another repository, you
   1.446 +\emph{could} still use \hgcmd{rollback} to make your local copy of the
   1.447 +change disappear, but it won't have the consequences you want.  The
   1.448 +change will still be present in the remote repository, so it will
   1.449 +reappear in your local repository the next time you pull.
   1.450 +
   1.451 +If a situation like this arises, and you know which repositories your
   1.452 +bad change has propagated into, you can \emph{try} to get rid of the
   1.453 +change from \emph{every} one of those repositories.  This is, of
   1.454 +course, not a satisfactory solution: if you miss even a single
   1.455 +repository while you're expunging, the change is still ``in the
   1.456 +wild'', and could propagate further.
   1.457 +
   1.458 +If you've committed one or more changes \emph{after} the change that
   1.459 +you'd like to see disappear, your options are further reduced.
   1.460 +Mercurial doesn't provide a way to ``punch a hole'' in history,
   1.461 +while leaving the other changesets intact.
   1.462 +
   1.463 +XXX This needs filling out.  The \texttt{hg-replay} script in the
   1.464 +\texttt{examples} directory works, but doesn't handle merge
   1.465 +changesets.  Kind of an important omission.
   1.466 +
   1.467 +\subsection{Protect yourself from ``escaped'' changes}
   1.468 +
   1.469 +If you've committed some changes to your local repository and they've
   1.470 +been pushed or pulled somewhere else, this isn't necessarily a
   1.471 +disaster.  You can protect yourself ahead of time against some classes
   1.472 +of bad changeset.  This is particularly easy if your team usually
   1.473 +pulls changes from a central repository.
   1.474 +
   1.475 +By configuring some hooks on that repository to validate incoming
   1.476 +changesets (see chapter~\ref{chap:hook}), you can automatically
   1.477 +prevent some kinds of bad changeset from being pushed to the central
   1.478 +repository at all.  With such a configuration in place, some kinds of
   1.479 +bad changeset will naturally tend to ``die out'' because they can't
   1.480 +propagate into the central repository.  Better yet, this happens
   1.481 +without any need for explicit intervention.
   1.482 +
   1.483 +For instance, an incoming change hook that verifies that a changeset
   1.484 +will actually compile can prevent people from inadvertently ``breaking
   1.485 +the build''.
   1.486 +
   1.487 +\section{Finding the source of a bug}
   1.488 +\label{sec:undo:bisect}
   1.489 +
   1.490 +While it's all very well to be able to back out a changeset that
   1.491 +introduced a bug, this requires that you know which changeset to back
   1.492 +out.  Mercurial provides an invaluable command, called
   1.493 +\hgcmd{bisect}, that helps you to automate this process and accomplish
   1.494 +it very efficiently.
   1.495 +
   1.496 +The idea behind the \hgcmd{bisect} command is that a changeset has
   1.497 +introduced some change of behaviour that you can identify with a
   1.498 +simple binary test.  You don't know which piece of code introduced the
   1.499 +change, but you know how to test for the presence of the bug.  The
   1.500 +\hgcmd{bisect} command uses your test to direct its search for the
   1.501 +changeset that introduced the code that caused the bug.
   1.502 +
   1.503 +Here are a few scenarios to help you understand how you might apply
   1.504 +this command.
   1.505 +\begin{itemize}
   1.506 +\item The most recent version of your software has a bug that you
   1.507 +  remember wasn't present a few weeks ago, but you don't know when it
   1.508 +  was introduced.  Here, your binary test checks for the presence of
   1.509 +  that bug.
   1.510 +\item You fixed a bug in a rush, and now it's time to close the entry
   1.511 +  in your team's bug database.  The bug database requires a changeset
   1.512 +  ID when you close an entry, but you don't remember which changeset
   1.513 +  you fixed the bug in.  Once again, your binary test checks for the
   1.514 +  presence of the bug.
   1.515 +\item Your software works correctly, but runs~15\% slower than the
   1.516 +  last time you measured it.  You want to know which changeset
   1.517 +  introduced the performance regression.  In this case, your binary
   1.518 +  test measures the performance of your software, to see whether it's
   1.519 +  ``fast'' or ``slow''.
   1.520 +\item The sizes of the components of your project that you ship
   1.521 +  exploded recently, and you suspect that something changed in the way
   1.522 +  you build your project.
   1.523 +\end{itemize}
   1.524 +
   1.525 +From these examples, it should be clear that the \hgcmd{bisect}
   1.526 +command is useful for more than just finding the sources of bugs.  You can
   1.527 +use it to find any ``emergent property'' of a repository (anything
   1.528 +that you can't find from a simple text search of the files in the
   1.529 +tree) for which you can write a binary test.
   1.530 +
   1.531 +We'll introduce a little bit of terminology here, just to make it
   1.532 +clear which parts of the search process are your responsibility, and
   1.533 +which are Mercurial's.  A \emph{test} is something that \emph{you} run
   1.534 +when \hgcmd{bisect} chooses a changeset.  A \emph{probe} is what
   1.535 +\hgcmd{bisect} runs to tell whether a revision is good.  Finally,
   1.536 +we'll use the word ``bisect'', as both a noun and a verb, to stand in
   1.537 +for the phrase ``search using the \hgcmd{bisect} command''.
   1.538 +
   1.539 +One simple way to automate the searching process would be simply to
   1.540 +probe every changeset.  However, this scales poorly.  If it took ten
   1.541 +minutes to test a single changeset, and you had 10,000 changesets in
   1.542 +your repository, the exhaustive approach would take on average~35
   1.543 +\emph{days} to find the changeset that introduced a bug.  Even if you
   1.544 +knew that the bug was introduced by one of the last 500 changesets,
   1.545 +and limited your search to those, you'd still be looking at over 40
   1.546 +hours to find the changeset that introduced your bug.
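The figures above can be checked with a few lines of Python.  This is a sketch of the arithmetic only, assuming that an exhaustive search finds the culprit, on average, after testing half of the candidate changesets; the function name is made up for illustration.

```python
MINUTES_PER_TEST = 10


def average_exhaustive_minutes(n_changesets, minutes_per_test=MINUTES_PER_TEST):
    # Assumption: on average the culprit turns up halfway through
    # the candidate changesets.
    return n_changesets / 2 * minutes_per_test


days_full = average_exhaustive_minutes(10_000) / 60 / 24
hours_recent = average_exhaustive_minutes(500) / 60
print(f"whole history: {days_full:.1f} days")
print(f"last 500 changesets: {hours_recent:.1f} hours")
```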
   1.547 +
   1.548 +What the \hgcmd{bisect} command does is use its knowledge of the
   1.549 +``shape'' of your project's revision history to perform a search in
   1.550 +time proportional to the \emph{logarithm} of the number of changesets
   1.551 +to check (the kind of search it performs is called a dichotomic
   1.552 +search).  With this approach, searching through 10,000 changesets will
   1.553 +take less than three hours, even at ten minutes per test (the search
   1.554 +will require about 14 tests).  Limit your search to the last hundred
   1.555 +changesets, and it will take only about an hour (roughly seven tests).
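To see where the ``about 14 tests'' comes from, here is a minimal sketch of the halving search in Python.  It assumes a strictly linear history in which the first revision is known to be good, the last is known to be bad, and every revision after the culprit is also bad.  The real \hgcmd{bisect} command works on Mercurial's branching history, which this sketch does not attempt to model, and the names \texttt{bisect\_linear} and \texttt{CULPRIT} are invented for the example.

```python
def bisect_linear(n_changesets, is_bad):
    """Return (first_bad_revision, number_of_probes) for revisions
    0 .. n_changesets - 1.

    Assumes revision 0 is good, the last revision is bad, and every
    revision from the culprit onwards is bad."""
    good, bad = 0, n_changesets - 1
    probes = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        probes += 1
        if is_bad(mid):
            bad = mid    # culprit is mid or earlier
        else:
            good = mid   # culprit is after mid
    return bad, probes


CULPRIT = 7_321  # hypothetical revision that introduced the bug
first_bad, probes = bisect_linear(10_000, lambda rev: rev >= CULPRIT)
_, probes_100 = bisect_linear(100, lambda rev: rev >= 42)
print(first_bad, probes, probes_100)
```

Each probe halves the range that can still contain the culprit, so the number of probes never exceeds the base-2 logarithm of the number of changesets, rounded up: at most 14 for 10,000 changesets and at most 7 for 100.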
   1.556 +
   1.557 +The \hgcmd{bisect} command is aware of the ``branchy'' nature of a
   1.558 +Mercurial project's revision history, so it has no problems dealing
   1.559 +with branches, merges, or multiple heads in a repository.  It can
   1.560 +prune entire branches of history with a single probe, which is how it
   1.561 +operates so efficiently.
   1.562 +
   1.563 +\subsection{Using the \hgcmd{bisect} command}
   1.564 +
   1.565 +Here's an example of \hgcmd{bisect} in action.
   1.566 +
   1.567 +\begin{note}
   1.568 +  In versions 0.9.5 and earlier of Mercurial, \hgcmd{bisect} was not a
   1.569 +  core command: it was distributed with Mercurial as an extension.
   1.570 +  This section describes the built-in command, not the old extension.
   1.571 +\end{note}
   1.572 +
   1.573 +Now let's create a repository, so that we can try out the
   1.574 +\hgcmd{bisect} command in isolation.
   1.575 +\interaction{bisect.init}
   1.576 +We'll simulate a project that has a bug in it in a simple-minded way:
   1.577 +create trivial changes in a loop, and nominate one specific change
   1.578 +that will have the ``bug''.  This loop creates 35 changesets, each
   1.579 +adding a single file to the repository.  We'll represent our ``bug''
   1.580 +with a file that contains the text ``i have a gub''.
   1.581 +\interaction{bisect.commits}
   1.582 +
   1.583 +The next thing that we'd like to do is figure out how to use the
   1.584 +\hgcmd{bisect} command.  We can use Mercurial's normal built-in help
   1.585 +mechanism for this.
   1.586 +\interaction{bisect.help}
   1.587 +
   1.588 +The \hgcmd{bisect} command works in steps.  Each step proceeds as follows.
   1.589 +\begin{enumerate}
   1.590 +\item You run your binary test.
   1.591 +  \begin{itemize}
   1.592 +  \item If the test succeeded, you tell \hgcmd{bisect} by running the
   1.593 +    \hgcmdargs{bisect}{--good} command.
   1.594 +  \item If it failed, run the \hgcmdargs{bisect}{--bad} command.
   1.595 +  \end{itemize}
   1.596 +\item The command uses your information to decide which changeset to
   1.597 +  test next.
   1.598 +\item It updates the working directory to that changeset, and the
   1.599 +  process begins again.
   1.600 +\end{enumerate}
   1.601 +The process ends when \hgcmd{bisect} identifies a unique changeset
   1.602 +that marks the point where your test transitioned from ``succeeding''
   1.603 +to ``failing''.
   1.604 +
   1.605 +To start the search, we must run the \hgcmdargs{bisect}{--reset} command.
   1.606 +\interaction{bisect.search.init}
   1.607 +
   1.608 +In our case, the binary test we use is simple: we check to see if any
   1.609 +file in the repository contains the string ``i have a gub''.  If it
   1.610 +does, this changeset contains the change that ``caused the bug''.  By
   1.611 +convention, a changeset that has the property we're searching for is
   1.612 +``bad'', while one that doesn't is ``good''.
   1.613 +
   1.614 +Most of the time, the revision to which the working directory is
   1.615 +synced (usually the tip) already exhibits the problem introduced by
   1.616 +the buggy change, so we'll mark it as ``bad''.
   1.617 +\interaction{bisect.search.bad-init}
   1.618 +
   1.619 +Our next task is to nominate a changeset that we know \emph{doesn't}
   1.620 +have the bug; the \hgcmd{bisect} command will ``bracket'' its search
   1.621 +between the first pair of good and bad changesets.  In our case, we
   1.622 +know that revision~10 didn't have the bug.  (I'll have more to say
   1.623 +about choosing the first ``good'' changeset later.)
   1.624 +\interaction{bisect.search.good-init}
   1.625 +
   1.626 +Notice that this command printed some output.
   1.627 +\begin{itemize}
   1.628 +\item It told us how many changesets it must consider before it can
   1.629 +  identify the one that introduced the bug, and how many tests that
   1.630 +  will require.
   1.631 +\item It updated the working directory to the next changeset to test,
   1.632 +  and told us which changeset it's testing.
   1.633 +\end{itemize}
   1.634 +
   1.635 +We now run our test in the working directory.  We use the
   1.636 +\command{grep} command to see if our ``bad'' file is present in the
   1.637 +working directory.  If it is, this revision is bad; if not, this
   1.638 +revision is good.
   1.639 +\interaction{bisect.search.step1}
   1.640 +
   1.641 +This test looks like a perfect candidate for automation, so let's turn
   1.642 +it into a shell function.
   1.643 +\interaction{bisect.search.mytest}
   1.644 +We can now run an entire test step with a single command,
   1.645 +\texttt{mytest}.
   1.646 +\interaction{bisect.search.step2}
   1.647 +A few more invocations of our canned test step command, and we're
   1.648 +done.
   1.649 +\interaction{bisect.search.rest}
   1.650 +
   1.651 +Even though we had~35 changesets to search through, the \hgcmd{bisect}
   1.652 +command let us find the changeset that introduced our ``bug'' with
   1.653 +only five tests.  Because the number of tests that the \hgcmd{bisect}
   1.654 +command performs grows logarithmically with the number of changesets to
   1.655 +search, the advantage that it has over the ``brute force'' search
   1.656 +approach increases with every changeset you add.
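         +
         +As a sanity check on that count, assume that the 35 changesets
         +created by the loop are numbered 0 to 34, with the tip (revision~34)
         +marked bad and revision~10 marked good.  The first bad changeset is
         +then one of revisions 11 to 34, and if each test halves the remaining
         +candidates, the number of tests needed is
         +\[
         +  \lceil \log_2 (34 - 10) \rceil = \lceil \log_2 24 \rceil = 5.
         +\]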
   1.657 +
   1.658 +\subsection{Cleaning up after your search}
   1.659 +
   1.660 +When you're finished using the \hgcmd{bisect} command in a
   1.661 +repository, you can use the \hgcmdargs{bisect}{--reset} command to drop
   1.662 +the information it was using to drive your search.  This information
   1.663 +doesn't use much space, so it doesn't matter if you forget to run this
   1.664 +command.  However, \hgcmd{bisect} won't let you start a new search in
   1.665 +that repository until you do a \hgcmdargs{bisect}{--reset}.
   1.666 +\interaction{bisect.search.reset}
   1.667 +
   1.668 +\section{Tips for finding bugs effectively}
   1.669 +
   1.670 +\subsection{Give consistent input}
   1.671 +
   1.672 +The \hgcmd{bisect} command requires that you correctly report the
   1.673 +result of every test you perform.  If you tell it that a test failed
   1.674 +when it really succeeded, it \emph{might} be able to detect the
   1.675 +inconsistency.  If it can identify an inconsistency in your reports,
   1.676 +it will tell you that a particular changeset is both good and bad.
   1.677 +However, it can't do this perfectly; it's about as likely to report
   1.678 +the wrong changeset as the source of the bug.
   1.679 +
   1.680 +\subsection{Automate as much as possible}
   1.681 +
   1.682 +When I started using the \hgcmd{bisect} command, I tried a few times
   1.683 +to run my tests by hand, on the command line.  This is an approach
   1.684 +that I, at least, am not suited to.  After a few tries, I found that I
   1.685 +was making enough mistakes that I was having to restart my searches
   1.686 +several times before finally getting correct results.
   1.687 +
   1.688 +My initial problems with driving the \hgcmd{bisect} command by hand
   1.689 +occurred even with simple searches on small repositories; if the
   1.690 +problem you're looking for is more subtle, or the number of tests that
   1.691 +\hgcmd{bisect} must perform increases, the likelihood of operator
   1.692 +error ruining the search is much higher.  Once I started automating my
   1.693 +tests, I had much better results.
   1.694 +
   1.695 +The key to automated testing is twofold:
   1.696 +\begin{itemize}
   1.697 +\item always test for the same symptom, and
   1.698 +\item always feed consistent input to the \hgcmd{bisect} command.
   1.699 +\end{itemize}
   1.700 +In my tutorial example above, the \command{grep} command tests for the
   1.701 +symptom, and the \texttt{if} statement takes the result of this check
   1.702 +and ensures that we always feed the same input to the \hgcmd{bisect}
   1.703 +command.  The \texttt{mytest} function marries these together in a
   1.704 +reproducible way, so that every test is uniform and consistent.
   1.705 +
   1.706 +\subsection{Check your results}
   1.707 +
   1.708 +Because the output of a \hgcmd{bisect} search is only as good as the
   1.709 +input you give it, don't take the changeset it reports as the
   1.710 +absolute truth.  A simple way to cross-check its report is to manually
   1.711 +run your test at each of the following changesets:
   1.712 +\begin{itemize}
   1.713 +\item The changeset that it reports as the first bad revision.  Your
   1.714 +  test should still report this as bad.
   1.715 +\item The parent of that changeset (either parent, if it's a merge).
   1.716 +  Your test should report this changeset as good.
   1.717 +\item A child of that changeset.  Your test should report this
   1.718 +  changeset as bad.
   1.719 +\end{itemize}
   1.720 +
   1.721 +\subsection{Beware interference between bugs}
   1.722 +
   1.723 +It's possible that your search for one bug could be disrupted by the
   1.724 +presence of another.  For example, let's say your software crashes at
   1.725 +revision 100, and worked correctly at revision 50.  Unknown to you,
   1.726 +someone else introduced a different crashing bug at revision 60, and
   1.727 +fixed it at revision 80.  This could distort your results in one of
   1.728 +several ways.
   1.729 +
   1.730 +It is possible that this other bug completely ``masks'' yours, which
   1.731 +is to say that it occurs before your bug has a chance to manifest
   1.732 +itself.  If you can't avoid that other bug (for example, it prevents
   1.733 +your project from building), and so can't tell whether your bug is
   1.734 +present in a particular changeset, the \hgcmd{bisect} command cannot
   1.735 +help you directly.  Instead, you can mark a changeset as untested by
   1.736 +running \hgcmdargs{bisect}{--skip}.
   1.737 +
   1.738 +A different problem could arise if your test for a bug's presence is
   1.739 +not specific enough.  If you check for ``my program crashes'', then
   1.740 +both your crashing bug and an unrelated crashing bug that masks it
   1.741 +will look like the same thing, and mislead \hgcmd{bisect}.
   1.742 +
   1.743 +Another situation in which \hgcmdargs{bisect}{--skip} is useful is
   1.744 +when you can't test a revision because your project was in a broken and
   1.745 +hence untestable state at that revision, perhaps because someone
   1.746 +checked in a change that prevented the project from building.
   1.747 +
   1.748 +\subsection{Bracket your search lazily}
   1.749 +
   1.750 +Choosing the first ``good'' and ``bad'' changesets that will mark the
   1.751 +end points of your search is often easy, but it bears a little
   1.752 +discussion nevertheless.  From the perspective of \hgcmd{bisect}, the
   1.753 +``newest'' changeset is conventionally ``bad'', and the older
   1.754 +changeset is ``good''.
   1.755 +
   1.756 +If you're having trouble remembering when a suitable ``good'' change
   1.757 +was, so that you can tell \hgcmd{bisect}, you could do worse than
   1.758 +testing changesets at random.  Just remember to eliminate contenders
   1.759 +that can't possibly exhibit the bug (perhaps because the feature with
   1.760 +the bug isn't present yet) and those where another problem masks the
   1.761 +bug (as I discussed above).
   1.762 +
   1.763 +Even if you end up ``early'' by thousands of changesets or months of
   1.764 +history, you will only add a handful of tests to the total number that
   1.765 +\hgcmd{bisect} must perform, thanks to its logarithmic behaviour.
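         +
         +To see why, suppose (as an idealisation) that each test halves the
         +remaining range, so a search over $n$ changesets needs about
         +$\lceil \log_2 n \rceil$ tests.  Widening the bracket by a factor
         +of~$k$ then costs at most $\lceil \log_2 k \rceil$ extra tests:
         +\[
         +  \lceil \log_2 (k n) \rceil \le \lceil \log_2 n \rceil +
         +  \lceil \log_2 k \rceil.
         +\]
         +For instance, going from $n = 100$ to $k n = 10\,000$ changesets
         +raises the estimate from 7 tests to 14.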
   1.766 +
   1.767 +%%% Local Variables: 
   1.768 +%%% mode: latex
   1.769 +%%% TeX-master: "00book"
   1.770 +%%% End: