hgbook
diff fr/undo.tex @ 962:bac1c207c76d
Merge Hughues's remarks with William's
author | Romain PELISSE <belaran@gmail.com> |
---|---|
date | Thu Mar 26 08:57:10 2009 +0100 (2009-03-26) |
parents | f79542a53cb2 |
children |
\chapter{Finding and fixing your mistakes}
\label{chap:undo}

To err might be human, but to really handle the consequences well
takes a top-notch revision control system. In this chapter, we'll
discuss some of the techniques you can use when you find that a
problem has crept into your project. Mercurial has some highly
capable features that will help you to isolate the sources of
problems, and to handle them appropriately.

\section{Erasing local history}

\subsection{The accidental commit}

I have the occasional but persistent problem of typing rather more
quickly than I can think, which sometimes results in me committing a
changeset that is either incomplete or plain wrong. In my case, the
usual kind of incomplete changeset is one in which I've created a new
source file, but forgotten to \hgcmd{add} it. A ``plain wrong''
changeset is not as common, but no less annoying.

\subsection{Rolling back a transaction}
\label{sec:undo:rollback}

In section~\ref{sec:concepts:txn}, I mentioned that Mercurial treats
each modification of a repository as a \emph{transaction}. Every time
you commit a changeset or pull changes from another repository,
Mercurial remembers what you did. You can undo, or \emph{roll back},
exactly one of these actions using the \hgcmd{rollback} command. (See
section~\ref{sec:undo:rollback-after-push} for an important caveat
about the use of this command.)

Here's a mistake that I often find myself making: committing a change
in which I've created a new file, but forgotten to \hgcmd{add} it.
\interaction{rollback.commit}
Looking at the output of \hgcmd{status} after the commit immediately
confirms the error.
\interaction{rollback.status}
The commit captured the changes to the file \filename{a}, but not the
new file \filename{b}. If I were to push this changeset to a
repository that I shared with a colleague, the chances are high that
something in \filename{a} would refer to \filename{b}, which would not
be present in their repository when they pulled my changes. I would
thus become the object of some indignation.

However, luck is with me---I've caught my error before I pushed the
changeset. I use the \hgcmd{rollback} command, and Mercurial makes
that last changeset vanish.
\interaction{rollback.rollback}
Notice that the changeset is no longer present in the repository's
history, and the working directory once again thinks that the file
\filename{a} is modified. The commit and rollback have left the
working directory exactly as it was prior to the commit; the changeset
has been completely erased. I can now safely \hgcmd{add} the file
\filename{b}, and rerun my commit.
\interaction{rollback.add}

\subsection{The erroneous pull}

It's common practice with Mercurial to maintain separate development
branches of a project in different repositories. Your development
team might have one shared repository for your project's ``0.9''
release, and another, containing different changes, for the ``1.0''
release.

Given this, you can imagine that the consequences could be messy if
you had a local ``0.9'' repository, and accidentally pulled changes
from the shared ``1.0'' repository into it. At worst, you could be
paying insufficient attention, and push those changes into the shared
``0.9'' tree, confusing your entire team (but don't worry, we'll
return to this horror scenario later).
However, it's more likely that
you'll notice immediately, because Mercurial will display the URL it's
pulling from, or you will see it pull a suspiciously large number of
changes into the repository.

The \hgcmd{rollback} command will work nicely to expunge all of the
changesets that you just pulled. Mercurial groups all changes from
one \hgcmd{pull} into a single transaction, so one \hgcmd{rollback} is
all you need to undo this mistake.

\subsection{Rolling back is useless once you've pushed}
\label{sec:undo:rollback-after-push}

The value of the \hgcmd{rollback} command drops to zero once you've
pushed your changes to another repository. Rolling back a change
makes it disappear entirely, but \emph{only} in the repository in
which you perform the \hgcmd{rollback}. Because a rollback eliminates
history, there's no way for the disappearance of a change to propagate
between repositories.

If you've pushed a change to another repository---particularly if it's
a shared repository---it has essentially ``escaped into the wild,''
and you'll have to recover from your mistake in a different way. What
will happen if you push a changeset somewhere, then roll it back, then
pull from the repository you pushed to, is that the changeset will
reappear in your repository.

(If you absolutely know for sure that the change you want to roll back
is the most recent change in the repository that you pushed to,
\emph{and} you know that nobody else could have pulled it from that
repository, you can roll back the changeset there, too, but you really
should not rely on this working reliably. If you do this,
sooner or later a change really will make it into a repository that
you don't directly control (or have forgotten about), and come back to
bite you.)
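
The push, roll back, pull sequence described above can be sketched with
a toy model in Python. This is only an illustration under simplifying
assumptions: a repository is reduced to a list of changeset names plus
a record of the most recent transaction, and the class \texttt{ToyRepo}
and its methods are invented for this sketch. It is not how Mercurial
stores history.

```python
# Toy model only: a repository is a list of changeset names plus a record
# of the single most recent transaction. All names here are hypothetical.
class ToyRepo:
    def __init__(self):
        self.changesets = []        # history, oldest first
        self.last_transaction = []  # changesets added by the latest commit or pull

    def commit(self, name):
        self.changesets.append(name)
        self.last_transaction = [name]

    def pull(self, other):
        # Everything fetched by one pull forms a single transaction.
        new = [c for c in other.changesets if c not in self.changesets]
        self.changesets.extend(new)
        self.last_transaction = new

    def push(self, other):
        # Pushing modifies the other repository, not our own history.
        other.pull(self)

    def rollback(self):
        # Undo the most recent transaction, in this repository only.
        for name in self.last_transaction:
            self.changesets.remove(name)
        self.last_transaction = []


local, shared = ToyRepo(), ToyRepo()
local.commit("good")
local.commit("oops")
local.push(shared)                    # "oops" has now escaped
local.rollback()                      # removes "oops" locally only
after_rollback = list(local.changesets)
local.pull(shared)                    # the pull brings "oops" straight back
after_pull = list(local.changesets)
print(after_rollback, after_pull)
```

In this model the rollback shortens only the local history; the shared
repository still holds the unwanted changeset, so the next pull
restores it.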

\subsection{You can only roll back once}

Mercurial stores exactly one transaction in its transaction log; that
transaction is the most recent one that occurred in the repository.
This means that you can only roll back one transaction. If you expect
to be able to roll back one transaction, then its predecessor, this is
not the behaviour you will get.
\interaction{rollback.twice}
Once you've rolled back one transaction in a repository, you can't
roll back again in that repository until you perform another commit or
pull.

\section{Reverting the mistaken change}

If you make a modification to a file, and decide that you really
didn't want to change the file at all, and you haven't yet committed
your changes, the \hgcmd{revert} command is the one you'll need. It
looks at the changeset that's the parent of the working directory, and
restores the contents of the file to their state as of that changeset.
(That's a long-winded way of saying that, in the normal case, it
undoes your modifications.)

Let's illustrate how the \hgcmd{revert} command works with yet another
small example. We'll begin by modifying a file that Mercurial is
already tracking.
\interaction{daily.revert.modify}
If we don't want that change, we can simply \hgcmd{revert} the file.
\interaction{daily.revert.unmodify}
The \hgcmd{revert} command provides us with an extra degree of safety
by saving our modified file with a \filename{.orig} extension.
\interaction{daily.revert.status}

Here is a summary of the cases that the \hgcmd{revert} command can
deal with. We will describe each of these in more detail in the
section that follows.
\begin{itemize}
\item If you modify a file, it will restore the file to its unmodified
  state.
\item If you \hgcmd{add} a file, it will undo the ``added'' state of
  the file, but leave the file itself untouched.
\item If you delete a file without telling Mercurial, it will restore
  the file to its unmodified contents.
\item If you use the \hgcmd{remove} command to remove a file, it will
  undo the ``removed'' state of the file, and restore the file to its
  unmodified contents.
\end{itemize}

\subsection{File management errors}
\label{sec:undo:mgmt}

The \hgcmd{revert} command is useful for more than just modified
files. It lets you reverse the results of all of Mercurial's file
management commands---\hgcmd{add}, \hgcmd{remove}, and so on.

If you \hgcmd{add} a file, then decide that in fact you don't want
Mercurial to track it, use \hgcmd{revert} to undo the add. Don't
worry; Mercurial will not modify the file in any way. It will just
``unmark'' the file.
\interaction{daily.revert.add}

Similarly, if you ask Mercurial to \hgcmd{remove} a file, you can use
\hgcmd{revert} to restore it to the contents it had as of the parent
of the working directory.
\interaction{daily.revert.remove}
This works just as well for a file that you deleted by hand, without
telling Mercurial (recall that in Mercurial terminology, this kind of
file is called ``missing'').
\interaction{daily.revert.missing}

If you revert a \hgcmd{copy}, the copied-to file remains in your
working directory afterwards, untracked. Since a copy doesn't affect
the copied-from file in any way, Mercurial doesn't do anything with
the copied-from file.
\interaction{daily.revert.copy}

\subsubsection{A slightly special case: reverting a rename}

If you \hgcmd{rename} a file, there is one small detail that
you should remember.
When you \hgcmd{revert} a rename, it's not
enough to provide the name of the renamed-to file, as you can see
here.
\interaction{daily.revert.rename}
As you can see from the output of \hgcmd{status}, the renamed-to file
is no longer identified as added, but the renamed-\emph{from} file is
still removed! This is counter-intuitive (at least to me), but at
least it's easy to deal with.
\interaction{daily.revert.rename-orig}
So remember, to revert a \hgcmd{rename}, you must provide \emph{both}
the source and destination names.

% TODO: the output doesn't look like it will be removed!

(By the way, if you rename a file, then modify the renamed-to file,
then revert both components of the rename, when Mercurial restores the
file that was removed as part of the rename, it will be unmodified.
If you need the modifications in the renamed-to file to show up in the
renamed-from file, don't forget to copy them over.)

These fiddly aspects of reverting a rename arguably constitute a small
bug in Mercurial.

\section{Dealing with committed changes}

Consider a case where you have committed a change $a$, and another
change $b$ on top of it; you then realise that change $a$ was
incorrect. Mercurial lets you ``back out'' an entire changeset
automatically, and provides building blocks that let you reverse part
of a changeset by hand.

Before you read this section, here's something to keep in mind: the
\hgcmd{backout} command undoes changes by \emph{adding} history, not
by modifying or erasing it. It's the right tool to use if you're
fixing bugs, but not if you're trying to undo some change that has
catastrophic consequences. To deal with those, see
section~\ref{sec:undo:aaaiiieee}.

\subsection{Backing out a changeset}

The \hgcmd{backout} command lets you ``undo'' the effects of an entire
changeset in an automated fashion. Because Mercurial's history is
immutable, this command \emph{does not} get rid of the changeset you
want to undo. Instead, it creates a new changeset that
\emph{reverses} the effect of the to-be-undone changeset.

The operation of the \hgcmd{backout} command is a little intricate, so
let's illustrate it with some examples. First, we'll create a
repository with some simple changes.
\interaction{backout.init}

The \hgcmd{backout} command takes a single changeset ID as its
argument; this is the changeset to back out. Normally,
\hgcmd{backout} will drop you into a text editor to write a commit
message, so you can record why you're backing the change out. In this
example, we provide a commit message on the command line using the
\hgopt{backout}{-m} option.

\subsection{Backing out the tip changeset}

We're going to start by backing out the last changeset we committed.
\interaction{backout.simple}
You can see that the second line from \filename{myfile} is no longer
present. Taking a look at the output of \hgcmd{log} gives us an idea
of what the \hgcmd{backout} command has done.
\interaction{backout.simple.log}
Notice that the new changeset that \hgcmd{backout} has created is a
child of the changeset we backed out. It's easier to see this in
figure~\ref{fig:undo:backout}, which presents a graphical view of the
change history. As you can see, the history is nice and linear.

\begin{figure}[htb]
  \centering
  \grafix{undo-simple}
  \caption{Backing out a change using the \hgcmd{backout} command}
  \label{fig:undo:backout}
\end{figure}

\subsection{Backing out a non-tip change}

If you want to back out a change other than the last one you
committed, pass the \hgopt{backout}{--merge} option to the
\hgcmd{backout} command.
\interaction{backout.non-tip.clone}
This makes backing out any changeset a ``one-shot'' operation that's
usually simple and fast.
\interaction{backout.non-tip.backout}

If you take a look at the contents of \filename{myfile} after the
backout finishes, you'll see that the first and third changes are
present, but not the second.
\interaction{backout.non-tip.cat}

As the graphical history in figure~\ref{fig:undo:backout-non-tip}
illustrates, Mercurial actually commits \emph{two} changes in this
kind of situation (the box-shaped nodes are the ones that Mercurial
commits automatically). Before Mercurial begins the backout process,
it first remembers what the current parent of the working directory
is. It then backs out the target changeset, and commits that as a
changeset. Finally, it merges back to the previous parent of the
working directory, and commits the result of the merge.

% TODO: to me it looks like mercurial doesn't commit the second merge automatically!

\begin{figure}[htb]
  \centering
  \grafix{undo-non-tip}
  \caption{Automated backout of a non-tip change using the \hgcmd{backout} command}
  \label{fig:undo:backout-non-tip}
\end{figure}

The result is that you end up ``back where you were'', only with some
extra history that undoes the effect of the changeset you wanted to
back out.
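
The shape of the resulting history can be mimicked with a small Python
sketch. It follows the description given above (a backout changeset,
then a committed merge) and assumes that both are created
automatically. The function \texttt{backout\_with\_merge}, the
changeset names and the dictionary representation are all invented for
illustration; none of this is Mercurial's own code.

```python
# Toy history graph: each changeset name maps to the list of its parents.
# The representation and every name below are hypothetical.
def backout_with_merge(parents, working_parent, target):
    """Return (new_history, new_working_parent) after backing out `target`."""
    history = dict(parents)
    orig = working_parent                  # remember where we started
    backout = "backout-of-" + target
    history[backout] = [target]            # the reversal is a child of the target
    if orig == target:
        return history, backout            # target was the tip: nothing to merge
    merge = "merge-" + backout
    history[merge] = [orig, backout]       # merge back to the previous parent
    return history, merge


linear = {"first": [], "second": ["first"], "third": ["second"]}
history, tip = backout_with_merge(linear, working_parent="third", target="second")
for name, its_parents in history.items():
    print(name, "<-", its_parents)
```

Backing out \texttt{second} while the working directory is at
\texttt{third} adds two entries to the toy history: the backout
changeset, whose parent is \texttt{second}, and a merge whose parents
are \texttt{third} and the backout changeset.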

\subsubsection{Always use the \hgopt{backout}{--merge} option}

In fact, since the \hgopt{backout}{--merge} option will do the ``right
thing'' whether or not the changeset you're backing out is the tip
(i.e.~it won't try to merge if it's backing out the tip, since there's
no need), you should \emph{always} use this option when you run the
\hgcmd{backout} command.

\subsection{Gaining more control of the backout process}

While I've recommended that you always use the
\hgopt{backout}{--merge} option when backing out a change, the
\hgcmd{backout} command lets you decide how to merge a backout
changeset. Taking control of the backout process by hand is something
you will rarely need to do, but it can be useful to understand what
the \hgcmd{backout} command is doing for you automatically. To
illustrate this, let's clone our first repository, but omit the
backout change that it contains.

\interaction{backout.manual.clone}
As with our earlier example, we'll commit a third changeset, then back
out its parent, and see what happens.
\interaction{backout.manual.backout}
Our new changeset is again a descendant of the changeset we backed
out; it's thus a new head, \emph{not} a descendant of the changeset
that was the tip. The \hgcmd{backout} command was quite explicit in
telling us this.
\interaction{backout.manual.log}

Again, it's easier to see what has happened by looking at a graph of
the revision history, in figure~\ref{fig:undo:backout-manual}. This
makes it clear that when we use \hgcmd{backout} to back out a change
other than the tip, Mercurial adds a new head to the repository (the
change it committed is box-shaped).

\begin{figure}[htb]
  \centering
  \grafix{undo-manual}
  \caption{Backing out a change using the \hgcmd{backout} command}
  \label{fig:undo:backout-manual}
\end{figure}

After the \hgcmd{backout} command has completed, it leaves the new
``backout'' changeset as the parent of the working directory.
\interaction{backout.manual.parents}
Now we have two isolated sets of changes.
\interaction{backout.manual.heads}

Let's think about what we expect to see as the contents of
\filename{myfile} now. The first change should be present, because
we've never backed it out. The second change should be missing, as
that's the change we backed out. Since the history graph shows the
third change as a separate head, we \emph{don't} expect to see the
third change present in \filename{myfile}.
\interaction{backout.manual.cat}
To get the third change back into the file, we just do a normal merge
of our two heads.
\interaction{backout.manual.merge}
Afterwards, the graphical history of our repository looks like
figure~\ref{fig:undo:backout-manual-merge}.

\begin{figure}[htb]
  \centering
  \grafix{undo-manual-merge}
  \caption{Manually merging a backout change}
  \label{fig:undo:backout-manual-merge}
\end{figure}

\subsection{Why \hgcmd{backout} works as it does}

Here's a brief description of how the \hgcmd{backout} command works.
\begin{enumerate}
\item It ensures that the working directory is ``clean'', i.e.~that
  the output of \hgcmd{status} would be empty.
\item It remembers the current parent of the working directory. Let's
  call this changeset \texttt{orig}.
\item It does the equivalent of a \hgcmd{update} to sync the working
  directory to the changeset you want to back out.
  Let's call this
  changeset \texttt{backout}.
\item It finds the parent of that changeset. Let's call that
  changeset \texttt{parent}.
\item For each file that the \texttt{backout} changeset affected, it
  does the equivalent of a \hgcmdargs{revert}{-r parent} on that file,
  to restore it to the contents it had before that changeset was
  committed.
\item It commits the result as a new changeset. This changeset has
  \texttt{backout} as its parent.
\item If you specify \hgopt{backout}{--merge} on the command line, it
  merges with \texttt{orig}, and commits the result of the merge.
\end{enumerate}

An alternative way to implement the \hgcmd{backout} command would be
to \hgcmd{export} the to-be-backed-out changeset as a diff, then use
the \cmdopt{patch}{--reverse} option to the \command{patch} command to
reverse the effect of the change without fiddling with the working
directory. This sounds much simpler, but it would not work nearly as
well.

The reason that \hgcmd{backout} does an update, a commit, a merge, and
another commit is to give the merge machinery the best chance to do a
good job when dealing with all the changes \emph{between} the change
you're backing out and the current tip.

If you're backing out a changeset that's~100 revisions back in your
project's history, the chances that the \command{patch} command will
be able to apply a reverse diff cleanly are not good, because
intervening changes are likely to have ``broken the context'' that
\command{patch} uses to determine whether it can apply a patch (if
this sounds like gibberish, see section~\ref{sec:mq:patch} for a
discussion of the \command{patch} command).
Also, Mercurial's merge
machinery will handle files and directories being renamed, permission
changes, and modifications to binary files, none of which
\command{patch} can deal with.

\section{Changes that should never have been}
\label{sec:undo:aaaiiieee}

Most of the time, the \hgcmd{backout} command is exactly what you need
if you want to undo the effects of a change. It leaves a permanent
record of exactly what you did, both when committing the original
changeset and when you cleaned up after it.

On rare occasions, though, you may find that you've committed a change
that really should not be present in the repository at all. For
example, it would be very unusual, and usually considered a mistake,
to commit a software project's object files as well as its source
files. Object files have almost no intrinsic value, and they're
\emph{big}, so they increase the size of the repository and the amount
of time it takes to clone or pull changes.

Before I discuss the options that you have if you commit a ``brown
paper bag'' change (the kind that's so bad that you want to pull a
brown paper bag over your head), let me first discuss some approaches
that probably won't work.

Since Mercurial treats history as accumulative---every change builds
on top of all changes that preceded it---you generally can't just make
disastrous changes disappear. The one exception is when you've just
committed a change, and it hasn't been pushed or pulled into another
repository. That's when you can safely use the \hgcmd{rollback}
command, as I detailed in section~\ref{sec:undo:rollback}.

After you've pushed a bad change to another repository, you
\emph{could} still use \hgcmd{rollback} to make your local copy of the
change disappear, but it won't have the consequences you want. The
change will still be present in the remote repository, so it will
reappear in your local repository the next time you pull.

If a situation like this arises, and you know which repositories your
bad change has propagated into, you can \emph{try} to get rid of the
change from \emph{every} one of those repositories. This is, of
course, not a satisfactory solution: if you miss even a single
repository while you're expunging, the change is still ``in the
wild'', and could propagate further.

If you've committed one or more changes \emph{after} the change that
you'd like to see disappear, your options are further reduced.
Mercurial doesn't provide a way to ``punch a hole'' in history,
leaving the other changesets intact.

XXX This needs filling out. The \texttt{hg-replay} script in the
\texttt{examples} directory works, but doesn't handle merge
changesets. Kind of an important omission.

\subsection{Protect yourself from ``escaped'' changes}

If you've committed some changes to your local repository and they've
been pushed or pulled somewhere else, this isn't necessarily a
disaster. You can protect yourself ahead of time against some classes
of bad changeset. This is particularly easy if your team usually
pulls changes from a central repository.

By configuring some hooks on that repository to validate incoming
changesets (see chapter~\ref{chap:hook}), you can automatically
prevent some kinds of bad changeset from being pushed to the central
repository at all.
With such a configuration in place, some kinds of
bad changeset will naturally tend to ``die out'' because they can't
propagate into the central repository. Better yet, this happens
without any need for explicit intervention.

For instance, an incoming change hook that verifies that a changeset
will actually compile can prevent people from inadvertently ``breaking
the build''.

\section{Finding the source of a bug}
\label{sec:undo:bisect}

While it's all very well to be able to back out a changeset that
introduced a bug, this requires that you know which changeset to back
out. Mercurial provides an invaluable command, called
\hgcmd{bisect}, that helps you to automate this process and accomplish
it very efficiently.

The idea behind the \hgcmd{bisect} command is that a changeset has
introduced some change of behaviour that you can identify with a
simple binary test. You don't know which piece of code introduced the
change, but you know how to test for the presence of the bug. The
\hgcmd{bisect} command uses your test to direct its search for the
changeset that introduced the code that caused the bug.

Here are a few scenarios to help you understand how you might apply
this command.
\begin{itemize}
\item The most recent version of your software has a bug that you
  remember wasn't present a few weeks ago, but you don't know when it
  was introduced. Here, your binary test checks for the presence of
  that bug.
\item You fixed a bug in a rush, and now it's time to close the entry
  in your team's bug database. The bug database requires a changeset
  ID when you close an entry, but you don't remember which changeset
  you fixed the bug in. Once again, your binary test checks for the
  presence of the bug.
\item Your software works correctly, but runs~15\% slower than the
  last time you measured it. You want to know which changeset
  introduced the performance regression. In this case, your binary
  test measures the performance of your software, to see whether it's
  ``fast'' or ``slow''.
\item The sizes of the components of your project that you ship
  exploded recently, and you suspect that something changed in the way
  you build your project.
\end{itemize}

From these examples, it should be clear that the \hgcmd{bisect}
command is useful for more than finding the sources of bugs. You can
use it to find any ``emergent property'' of a repository (anything
that you can't find from a simple text search of the files in the
tree) for which you can write a binary test.

We'll introduce a little bit of terminology here, just to make it
clear which parts of the search process are your responsibility, and
which are Mercurial's. A \emph{test} is something that \emph{you} run
when \hgcmd{bisect} chooses a changeset. A \emph{probe} is what
\hgcmd{bisect} runs to tell whether a revision is good. Finally,
we'll use the word ``bisect'', as both a noun and a verb, to stand in
for the phrase ``search using the \hgcmd{bisect} command''.

One simple way to automate the searching process would be simply to
probe every changeset. However, this scales poorly. If it took ten
minutes to test a single changeset, and you had 10,000 changesets in
your repository, the exhaustive approach would take on average~35
\emph{days} to find the changeset that introduced a bug. Even if you
knew that the bug was introduced by one of the last 500 changesets,
and limited your search to those, you'd still be looking at over 40
hours to find the changeset that introduced your bug.
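
The figures for the exhaustive approach are easy to check, and a toy
halving search shows how few tests are needed when each test discards
half of the remaining candidates. The Python sketch below assumes a
strictly linear history numbered by revision, with a test that is good
before some revision and bad from it onwards. The function names and
the sample revision numbers are invented for illustration; this is not
Mercurial's implementation, which must also cope with branches and
merges.

```python
MINUTES_PER_TEST = 10


def exhaustive_minutes(n_changesets):
    # On average, a linear scan finds the culprit halfway through.
    return n_changesets * MINUTES_PER_TEST / 2


def find_first_bad(is_bad, good, bad):
    """Return (first bad revision, tests run), given one known-good and
    one known-bad revision on a linear history (hypothetical helper)."""
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(mid):
            bad = mid       # culprit is at mid or earlier
        else:
            good = mid      # culprit is after mid
    return bad, tests


print(exhaustive_minutes(10000) / 60 / 24)   # days, roughly 34.7
print(exhaustive_minutes(500) / 60)          # hours, roughly 41.7

# Hypothetical search: revision 10 is good, revision 34 is bad, and the
# (made-up) culprit is revision 22.
print(find_first_bad(lambda rev: rev >= 22, good=10, bad=34))  # (22, 5)
```

The halving search needs only five tests to cover the 24 candidate
revisions in this made-up case, where a linear scan would need a dozen
on average.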

What the \hgcmd{bisect} command does is use its knowledge of the
``shape'' of your project's revision history to perform a search in
time proportional to the \emph{logarithm} of the number of changesets
to check (the kind of search it performs is called a dichotomic
search). With this approach, searching through 10,000 changesets will
take less than three hours, even at ten minutes per test (the search
will require about 14 tests). Limit your search to the last hundred
changesets, and it will take only about an hour (roughly seven tests).

The \hgcmd{bisect} command is aware of the ``branchy'' nature of a
Mercurial project's revision history, so it has no problems dealing
with branches, merges, or multiple heads in a repository. It can
prune entire branches of history with a single probe, which is how it
operates so efficiently.

\subsection{Using the \hgcmd{bisect} command}

Here's an example of \hgcmd{bisect} in action.

\begin{note}
  In versions 0.9.5 and earlier of Mercurial, \hgcmd{bisect} was not a
  core command: it was distributed with Mercurial as an extension.
  This section describes the built-in command, not the old extension.
\end{note}

Now let's create a repository, so that we can try out the
\hgcmd{bisect} command in isolation.
\interaction{bisect.init}
We'll simulate a project that has a bug in it in a simple-minded way:
create trivial changes in a loop, and nominate one specific change
that will have the ``bug''. This loop creates 35 changesets, each
adding a single file to the repository. We'll represent our ``bug''
with a file that contains the text ``i have a gub''.
\interaction{bisect.commits}

The next thing that we'd like to do is figure out how to use the
\hgcmd{bisect} command.
We can use Mercurial's normal built-in help
mechanism for this.
\interaction{bisect.help}

The \hgcmd{bisect} command works in steps. Each step proceeds as follows.
\begin{enumerate}
\item You run your binary test.
  \begin{itemize}
  \item If the test succeeded, you tell \hgcmd{bisect} by running the
    \hgcmdargs{bisect}{--good} command.
  \item If it failed, run the \hgcmdargs{bisect}{--bad} command.
  \end{itemize}
\item The command uses your information to decide which changeset to
  test next.
\item It updates the working directory to that changeset, and the
  process begins again.
\end{enumerate}
The process ends when \hgcmd{bisect} identifies a unique changeset
that marks the point where your test transitioned from ``succeeding''
to ``failing''.

To start the search, we must run the \hgcmdargs{bisect}{--reset} command.
\interaction{bisect.search.init}

In our case, the binary test we use is simple: we check to see if any
file in the repository contains the string ``i have a gub''. If it
does, this changeset contains the change that ``caused the bug''. By
convention, a changeset that has the property we're searching for is
``bad'', while one that doesn't is ``good''.

Most of the time, the revision to which the working directory is
synced (usually the tip) already exhibits the problem introduced by
the buggy change, so we'll mark it as ``bad''.
\interaction{bisect.search.bad-init}

Our next task is to nominate a changeset that we know \emph{doesn't}
have the bug; the \hgcmd{bisect} command will ``bracket'' its search
between the first pair of good and bad changesets. In our case, we
know that revision~10 didn't have the bug. (I'll have more words
about choosing the first ``good'' changeset later.)
1.624 +\interaction{bisect.search.good-init} 1.625 + 1.626 +Notice that this command printed some output. 1.627 +\begin{itemize} 1.628 +\item It told us how many changesets it must consider before it can 1.629 + identify the one that introduced the bug, and how many tests that 1.630 + will require. 1.631 +\item It updated the working directory to the next changeset to test, 1.632 + and told us which changeset it's testing. 1.633 +\end{itemize} 1.634 + 1.635 +We now run our test in the working directory. We use the 1.636 +\command{grep} command to see if our ``bad'' file is present in the 1.637 +working directory. If it is, this revision is bad; if not, this 1.638 +revision is good. 1.639 +\interaction{bisect.search.step1} 1.640 + 1.641 +This test looks like a perfect candidate for automation, so let's turn 1.642 +it into a shell function. 1.643 +\interaction{bisect.search.mytest} 1.644 +We can now run an entire test step with a single command, 1.645 +\texttt{mytest}. 1.646 +\interaction{bisect.search.step2} 1.647 +A few more invocations of our canned test step command, and we're 1.648 +done. 1.649 +\interaction{bisect.search.rest} 1.650 + 1.651 +Even though we had~35 changesets to search through, the \hgcmd{bisect} 1.652 +command let us find the changeset that introduced our ``bug'' with 1.653 +only five tests. Because the number of tests that the \hgcmd{bisect} 1.654 +command performs grows logarithmically with the number of changesets to 1.655 +search, the advantage that it has over the ``brute force'' search 1.656 +approach increases with every changeset you add. 1.657 + 1.658 +\subsection{Cleaning up after your search} 1.659 + 1.660 +When you're finished using the \hgcmd{bisect} command in a 1.661 +repository, you can use the \hgcmdargs{bisect}{--reset} command to drop 1.662 +the information it was using to drive your search. This information 1.663 +doesn't use much space, so it doesn't matter if you forget to run this 1.664 +command.
However, \hgcmd{bisect} won't let you start a new search in 1.665 +that repository until you do a \hgcmdargs{bisect}{--reset}. 1.666 +\interaction{bisect.search.reset} 1.667 + 1.668 +\section{Tips for finding bugs effectively} 1.669 + 1.670 +\subsection{Give consistent input} 1.671 + 1.672 +The \hgcmd{bisect} command requires that you correctly report the 1.673 +result of every test you perform. If you tell it that a test failed 1.674 +when it really succeeded, it \emph{might} be able to detect the 1.675 +inconsistency. If it can identify an inconsistency in your reports, 1.676 +it will tell you that a particular changeset is both good and bad. 1.677 +However, it can't do this perfectly; it's about as likely to report 1.678 +the wrong changeset as the source of the bug. 1.679 + 1.680 +\subsection{Automate as much as possible} 1.681 + 1.682 +When I started using the \hgcmd{bisect} command, I tried a few times 1.683 +to run my tests by hand, on the command line. This is an approach 1.684 +that I, at least, am not suited to. After a few tries, I found that I 1.685 +was making enough mistakes that I was having to restart my searches 1.686 +several times before finally getting correct results. 1.687 + 1.688 +My initial problems with driving the \hgcmd{bisect} command by hand 1.689 +occurred even with simple searches on small repositories; if the 1.690 +problem you're looking for is more subtle, or the number of tests that 1.691 +\hgcmd{bisect} must perform increases, the likelihood of operator 1.692 +error ruining the search is much higher. Once I started automating my 1.693 +tests, I had much better results. 1.694 + 1.695 +The key to automated testing is twofold: 1.696 +\begin{itemize} 1.697 +\item always test for the same symptom, and 1.698 +\item always feed consistent input to the \hgcmd{bisect} command.
1.699 +\end{itemize} 1.700 +In my tutorial example above, the \command{grep} command tests for the 1.701 +symptom, and the \texttt{if} statement takes the result of this check 1.702 +and ensures that we always feed the same input to the \hgcmd{bisect} 1.703 +command. The \texttt{mytest} function marries these together in a 1.704 +reproducible way, so that every test is uniform and consistent. 1.705 + 1.706 +\subsection{Check your results} 1.707 + 1.708 +Because the output of a \hgcmd{bisect} search is only as good as the 1.709 +input you give it, don't take the changeset it reports as the 1.710 +absolute truth. A simple way to cross-check its report is to manually 1.711 +run your test at each of the following changesets: 1.712 +\begin{itemize} 1.713 +\item The changeset that it reports as the first bad revision. Your 1.714 + test should still report this as bad. 1.715 +\item The parent of that changeset (either parent, if it's a merge). 1.716 + Your test should report this changeset as good. 1.717 +\item A child of that changeset. Your test should report this 1.718 + changeset as bad. 1.719 +\end{itemize} 1.720 + 1.721 +\subsection{Beware interference between bugs} 1.722 + 1.723 +It's possible that your search for one bug could be disrupted by the 1.724 +presence of another. For example, let's say your software crashes at 1.725 +revision 100, and worked correctly at revision 50. Unknown to you, 1.726 +someone else introduced a different crashing bug at revision 60, and 1.727 +fixed it at revision 80. This could distort your results in one of 1.728 +several ways. 1.729 + 1.730 +It is possible that this other bug completely ``masks'' yours, which 1.731 +is to say that it occurs before your bug has a chance to manifest 1.732 +itself. If you can't avoid that other bug (for example, it prevents 1.733 +your project from building), and so can't tell whether your bug is 1.734 +present in a particular changeset, the \hgcmd{bisect} command cannot 1.735 +help you directly. 
Instead, you can mark a changeset as untested by 1.736 +running \hgcmdargs{bisect}{--skip}. 1.737 + 1.738 +A different problem could arise if your test for a bug's presence is 1.739 +not specific enough. If you check for ``my program crashes'', then 1.740 +both your crashing bug and an unrelated crashing bug that masks it 1.741 +will look like the same thing, and mislead \hgcmd{bisect}. 1.742 + 1.743 +Another useful situation in which to use \hgcmdargs{bisect}{--skip} is 1.744 +if you can't test a revision because your project was in a broken and 1.745 +hence untestable state at that revision, perhaps because someone 1.746 +checked in a change that prevented the project from building. 1.747 + 1.748 +\subsection{Bracket your search lazily} 1.749 + 1.750 +Choosing the first ``good'' and ``bad'' changesets that will mark the 1.751 +end points of your search is often easy, but it bears a little 1.752 +discussion nevertheless. From the perspective of \hgcmd{bisect}, the 1.753 +``newest'' changeset is conventionally ``bad'', and the older 1.754 +changeset is ``good''. 1.755 + 1.756 +If you're having trouble remembering when a suitable ``good'' change 1.757 +was, so that you can tell \hgcmd{bisect}, you could do worse than 1.758 +testing changesets at random. Just remember to eliminate contenders 1.759 +that can't possibly exhibit the bug (perhaps because the feature with 1.760 +the bug isn't present yet) and those where another problem masks the 1.761 +bug (as I discussed above). 1.762 + 1.763 +Even if you end up ``early'' by thousands of changesets or months of 1.764 +history, you will only add a handful of tests to the total number that 1.765 +\hgcmd{bisect} must perform, thanks to its logarithmic behaviour. 1.766 + 1.767 +%%% Local Variables: 1.768 +%%% mode: latex 1.769 +%%% TeX-master: "00book" 1.770 +%%% End:
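
As a rough worked example of the logarithmic behaviour described
above, assume an idealised linear history in which every test halves
the number of candidate changesets. (This is an assumption made for
illustration; histories with merges or skipped revisions may need a
few more tests.) The number of tests for $n$ candidates is then about
$\lceil \log_2 n \rceil$:
\[
  \lceil \log_2 100 \rceil = \lceil 6.64\ldots \rceil = 7,
  \qquad
  \lceil \log_2 10000 \rceil = \lceil 13.28\ldots \rceil = 14.
\]
Under that assumption, widening the bracket from 100 to 10,000
changesets costs roughly seven extra tests, or about 70~minutes at ten
minutes per test.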