hgbook

changeset 130:26b7a4e943aa

Describe the bisect extension.
author Bryan O'Sullivan <bos@serpentine.com>
date Thu Dec 28 14:06:15 2006 -0800 (2006-12-28)
parents 73efa1a01a6c
children 153efeaa8f57
files en/examples/bisect en/undo.tex
line diff
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/en/examples/bisect	Thu Dec 28 14:06:15 2006 -0800
     1.3 @@ -0,0 +1,80 @@
     1.4 +#!/bin/bash
     1.5 +
     1.6 +echo '[extensions]' >> $HGRC
     1.7 +echo 'hbisect =' >> $HGRC
     1.8 +
     1.9 +#$ name: init
    1.10 +
    1.11 +hg init mybug
    1.12 +cd mybug
    1.13 +
    1.14 +#$ name: commits
    1.15 +
    1.16 +buggy_change=37
    1.17 +
    1.18 +for (( i = 0; i < 50; i++ )); do
    1.19 +  if [[ $i = $buggy_change ]]; then
    1.20 +    echo 'i have a gub' > myfile$i
    1.21 +    hg commit -q -A -m 'buggy changeset'
    1.22 +  else
    1.23 +    echo 'nothing to see here, move along' > myfile$i
    1.24 +    hg commit -q -A -m 'normal changeset'
    1.25 +  fi
    1.26 +done
    1.27 +
    1.28 +#$ name: help
    1.29 +
    1.30 +hg help bisect
    1.31 +hg bisect help
    1.32 +
    1.33 +#$ name: search.init
    1.34 +
    1.35 +hg bisect init
    1.36 +
    1.37 +#$ name: search.bad-init
    1.38 +
    1.39 +hg bisect bad
    1.40 +
    1.41 +#$ name: search.good-init
    1.42 +
    1.43 +hg bisect good 10
    1.44 +
    1.45 +#$ name: search.step1
    1.46 +
    1.47 +if grep -q 'i have a gub' *
    1.48 +then
    1.49 +  result=bad
    1.50 +else
    1.51 +  result=good
    1.52 +fi
    1.53 +
    1.54 +echo this revision is $result
    1.55 +hg bisect $result
    1.56 +
    1.57 +#$ name: mytest
    1.58 +
    1.59 +mytest() {
    1.60 +  if grep -q 'i have a gub' *
    1.61 +  then
    1.62 +    result=bad
    1.63 +  else
    1.64 +    result=good
    1.65 +  fi
    1.66 +
    1.67 +  echo this revision is $result
    1.68 +  hg bisect $result
    1.69 +}
    1.70 +  
    1.71 +#$ name: search.step2
    1.72 +
    1.73 +mytest
    1.74 +
    1.75 +#$ name: search.rest
    1.76 +
    1.77 +mytest
    1.78 +mytest
    1.79 +mytest
    1.80 +
    1.81 +#$ name: search.reset
    1.82 +
    1.83 +hg bisect reset
     2.1 --- a/en/undo.tex	Thu Dec 28 09:56:47 2006 -0800
     2.2 +++ b/en/undo.tex	Thu Dec 28 14:06:15 2006 -0800
     2.3 @@ -454,6 +454,296 @@
     2.4  \texttt{examples} directory works, but doesn't handle merge
     2.5  changesets.  Kind of an important omission.
     2.6  
     2.7 +\section{Finding the source of a bug}
     2.8 +
     2.9 +While it's all very well to be able to back out a changeset that
    2.10 +introduced a bug, this requires that you know which changeset to back
    2.11 +out.  Mercurial provides an invaluable extension, called
    2.12 +\hgext{bisect}, that helps you to automate this process and accomplish
    2.13 +it very efficiently.
    2.14 +
    2.15 +The idea behind the \hgext{bisect} extension is that a changeset has
    2.16 +introduced some change of behaviour that you can identify with a
    2.17 +simple binary test.  You don't know which piece of code introduced the
    2.18 +change, but you know how to test for the presence of the bug.  The
    2.19 +\hgext{bisect} extension uses your test to direct its search for the
    2.20 +changeset that introduced the code that caused the bug.
    2.21 +
    2.22 +Here are a few scenarios to help you understand how you might apply this
    2.23 +extension.
    2.24 +\begin{itemize}
    2.25 +\item The most recent version of your software has a bug that you
    2.26 +  remember wasn't present a few weeks ago, but you don't know when it
    2.27 +  was introduced.  Here, your binary test checks for the presence of
    2.28 +  that bug.
    2.29 +\item You fixed a bug in a rush, and now it's time to close the entry
    2.30 +  in your team's bug database.  The bug database requires a changeset
    2.31 +  ID when you close an entry, but you don't remember which changeset
    2.32 +  you fixed the bug in.  Once again, your binary test checks for the
    2.33 +  presence of the bug.
    2.34 +\item Your software works correctly, but runs~15\% slower than the
    2.35 +  last time you measured it.  You want to know which changeset
    2.36 +  introduced the performance regression.  In this case, your binary
    2.37 +  test measures the performance of your software, to see whether it's
    2.38 +  ``fast'' or ``slow''.
    2.39 +\item The sizes of the components of your project that you ship
    2.40 +  exploded recently, and you suspect that something changed in the way
    2.41 +  you build your project.
    2.42 +\end{itemize}
    2.43 +
    2.44 +From these examples, it should be clear that the \hgext{bisect}
    2.45 +extension is not useful only for finding the sources of bugs.  You can
    2.46 +use it to find any ``emergent property'' of a repository (anything
    2.47 +that you can't find from a simple text search of the files in the
    2.48 +tree) for which you can write a binary test.
    2.49 +
    2.50 +We'll introduce a little bit of terminology here, just to make it
    2.51 +clear which parts of the search process are your responsibility, and
    2.52 +which are Mercurial's.  A \emph{test} is something that \emph{you} run
    2.53 +when \hgext{bisect} chooses a changeset.  A \emph{probe} is what
    2.54 +\hgext{bisect} runs to tell whether a revision is good.  Finally,
    2.55 +we'll use the word ``bisect'', as both a noun and a verb, to stand in
    2.56 +for the phrase ``search using the \hgext{bisect} extension''.
    2.57 +
    2.58 +One simple way to automate the searching process would be simply to
    2.59 +probe every changeset.  However, this scales poorly.  If it took ten
    2.60 +minutes to test a single changeset, and you had 10,000 changesets in
    2.61 +your repository, the exhaustive approach would take on average~35
    2.62 +\emph{days} to find the changeset that introduced a bug.  Even if you
    2.63 +knew that the bug was introduced by one of the last 500 changesets,
    2.64 +and limited your search to those, you'd still be looking at over 40
    2.65 +hours to find the changeset that introduced your bug.
    2.66 +
    2.67 +What the \emph{bisect} extension does is use its knowledge of the
    2.68 +``shape'' of your project's revision history to perform a search in
    2.69 +time proportional to the \emph{logarithm} of the number of changesets
    2.70 +to check (the kind of search it performs is called a dichotomic
    2.71 +search).  With this approach, searching through 10,000 changesets will
    2.72 +take less than two hours, even at ten minutes per test.  Limit your
    2.73 +search to the last 500 changesets, and it will take less than an hour.
    2.74 +
    2.75 +The \hgext{bisect} extension is aware of the ``branchy'' nature of a
    2.76 +Mercurial project's revision history, so it has no problems dealing
    2.77 +with branches, merges, or multiple heads in a repoository.  It can
    2.78 +prune entire branches of history with a single probe, which is how it
    2.79 +operates so efficiently.
    2.80 +
    2.81 +\subsection{Using the \hgext{bisect} extension}
    2.82 +
    2.83 +Here's an example of \hgext{bisect} in action.  To keep the core of
    2.84 +Mercurial simple, \hgext{bisect} is packaged as an extension; this
    2.85 +means that it won't be present unless you explicitly enable it.  To do
    2.86 +this, edit your \hgrc\ and add the following section header (if it's
    2.87 +not already present):
    2.88 +\begin{codesample2}
    2.89 +  [extensions]
    2.90 +\end{codesample2}
    2.91 +Then add a line to this section to enable the extension:
    2.92 +\begin{codesample2}
    2.93 +  hbisect =
    2.94 +\end{codesample2}
    2.95 +\begin{note}
    2.96 +  That's right, there's a ``\texttt{h}'' at the front of the name of
    2.97 +  the \hgext{bisect} extension.  The reason is that Mercurial is
    2.98 +  written in Python, and uses a standard Python package called
    2.99 +  \texttt{bisect}.  If you omit the ``\texttt{h}'' from the name
   2.100 +  ``\texttt{hbisect}'', Mercurial will erroneously find the standard
   2.101 +  Python \texttt{bisect} package, and try to use it as a Mercurial
   2.102 +  extension.  This won't work, and Mercurial will crash repeatedly
   2.103 +  until you fix the spelling in your \hgrc.  Ugh.
   2.104 +\end{note}
   2.105 +
   2.106 +Now let's create a repository, so that we can try out the
   2.107 +\hgext{bisect} extension in isolation.
   2.108 +\interaction{bisect.init}
   2.109 +We'll simulate a project that has a bug in it in a simple-minded way:
   2.110 +create trivial changes in a loop, and nominate one specific change
   2.111 +that will have the ``bug''.  This loop creates 50 changesets, each
   2.112 +adding a single file to the repository.  We'll represent our ``bug''
   2.113 +with a file that contains the text ``i have a gub''.
   2.114 +\interaction{bisect.commits}
   2.115 +
   2.116 +The next thing that we'd like to do is figure out how to use the
   2.117 +\hgext{bisect} extension.  We can use Mercurial's normal built-in help
   2.118 +mechanism for this.
   2.119 +\interaction{bisect.help}
   2.120 +
   2.121 +The \hgext{bisect} extension works in steps.  Each step proceeds as follows.
   2.122 +\begin{enumerate}
   2.123 +\item You run your binary test.
   2.124 +  \begin{itemize}
   2.125 +  \item If the test succeeded, you tell \hgext{bisect} by running the
   2.126 +    \hgcmdargs{bisect}{good} command.
   2.127 +  \item If it failed, use the \hgcmdargs{bisect}{bad} command to let
   2.128 +    the \hgext{bisect} extension know.
   2.129 +  \end{itemize}
   2.130 +\item The extension uses your information to decide which changeset to
   2.131 +  test next.
   2.132 +\item It updates the working directory to that changeset, and the
   2.133 +  process begins again.
   2.134 +\end{enumerate}
   2.135 +The process ends when \hgext{bisect} identifies a unique changeset
   2.136 +that marks the point where your test transitioned from ``succeeding''
   2.137 +to ``failing''.
   2.138 +
   2.139 +To start the search, we must run the \hgcmdargs{bisect}{init} command.
   2.140 +\interaction{bisect.search.init}
   2.141 +
   2.142 +In our case, the binary test we use is simple: we check to see if any
   2.143 +file in the repository contains the string ``i have a gub''.  If it
   2.144 +does, this changeset contains the change that ``caused the bug''.  By
   2.145 +convention, a changeset that has the property we're searching for is
   2.146 +``bad'', while one that doesn't is ``good''.
   2.147 +
   2.148 +Most of the time, the revision to which the working directory is
   2.149 +synced (usually the tip) already exhibits the problem introduced by
   2.150 +the buggy change, so we'll mark it as ``bad''.
   2.151 +\interaction{bisect.search.bad-init}
   2.152 +
   2.153 +Our next task is to nominate a changeset that we know \emph{doesn't}
   2.154 +have the bug; the \hgext{bisect} extension will ``bracket'' its search
   2.155 +between the first pair of good and bad changesets.  In our case, we
   2.156 +know that revision~10 didn't have the bug.  (I'll have more words
   2.157 +about choosing the first ``good'' changeset later.)
   2.158 +\interaction{bisect.search.good-init}
   2.159 +
   2.160 +Notice that this command printed some output.
   2.161 +\begin{itemize}
   2.162 +\item It told us how many changesets it must consider before it can
   2.163 +  identify the one that introduced the bug, and how many tests that
   2.164 +  will require.
   2.165 +\item It updated the working directory to the next changeset to test,
   2.166 +  and told us which changeset it's testing.
   2.167 +\end{itemize}
   2.168 +
   2.169 +We now run our test in the working directory.  We use the
   2.170 +\command{grep} command to see if our ``bad'' file is present in the
   2.171 +working directory.  If it is, this revision is bad; if not, this
   2.172 +revision is good.
   2.173 +\interaction{search.step1}
   2.174 +
   2.175 +This test looks like a perfect candidate for automation, so let's turn
   2.176 +it into a shell function.
   2.177 +\interaction{search.mytest}
   2.178 +We can now run an entire test step with a single command,
   2.179 +\texttt{mytest}.
   2.180 +\interaction{search.step2}
   2.181 +A few more invocations of our canned test step command, and we're
   2.182 +done.
   2.183 +\interaction{search.rest}
   2.184 +
   2.185 +Even though we had~40 changesets to search through, the \hgext{bisect}
   2.186 +extension let us find the changeset that introduced our ``bug'' with
   2.187 +only five tests.  Because the number of tests that the \hgext{bisect}
   2.188 +extension grows logarithmically with the number of changesets to
   2.189 +search, the advantage that it has over the ``brute force'' search
   2.190 +approach increases with every changeset you add.
   2.191 +
   2.192 +\subsection{Cleaning up after your search}
   2.193 +
   2.194 +When you're finished using the \hgext{bisect} extension in a
   2.195 +repository, you can use the \hgcmdargs{bisect}{reset} command to drop
   2.196 +the information it was using to drive your search.  The extension
   2.197 +doesn't use much space, so it doesn't matter if you forget to run this
   2.198 +command.  However, \hgext{bisect} won't let you start a new search in
   2.199 +that repository until you do a \hgcmdargs{bisect}{reset}.
   2.200 +\interaction{search.reset}
   2.201 +
   2.202 +\section{Tips for finding bugs effectively}
   2.203 +
   2.204 +\subsection{Give consistent input}
   2.205 +
   2.206 +The \hgext{bisect} extension requires that you correctly report the
   2.207 +result of every test you perform.  If you tell it that a test failed
   2.208 +when it really succeeded, it \emph{might} be able to detect the
   2.209 +inconsistency.  If it can identify an inconsistency in your reports,
   2.210 +it will tell you that a particular changeset is both good and bad.
   2.211 +However, it can't do this perfectly; it's about as likely to report
   2.212 +the wrong changeset as the source of the bug.
   2.213 +
   2.214 +\subsection{Automate as much as possible}
   2.215 +
   2.216 +When I started using the \hgext{bisect} extension, I tried a few times
   2.217 +to run my tests by hand, on the command line.  This is an approach
   2.218 +that I, at least, am not suited to.  After a few tries, I found that I
   2.219 +was making enough mistakes that I was having to restart my searches
   2.220 +several times before finally getting correct results.  
   2.221 +
   2.222 +My initial problems with driving the \hgext{bisect} extension by hand
   2.223 +occurred even with simple searches on small repositories; if the
   2.224 +problem you're looking for is more subtle, or the number of tests that
   2.225 +\hgext{bisect} must perform increases, the likelihood of operator
   2.226 +error ruining the search is much higher.  Once I started automating my
   2.227 +tests, I had much better results.
   2.228 +
   2.229 +The key to automated testing is twofold:
   2.230 +\begin{itemize}
   2.231 +\item always test for the same symptom, and
   2.232 +\item always feed consistent input to the \hgcmd{bisect} command.
   2.233 +\end{itemize}
   2.234 +In my tutorial example above, the \command{grep} command tests for the
   2.235 +symptom, and the \texttt{if} statement takes the result of this check
   2.236 +and ensures that we always feed the same input to the \hgcmd{bisect}
   2.237 +command.  The \texttt{mytest} function marries these together in a
   2.238 +reproducible way, so that every test is uniform and consistent.
   2.239 +
   2.240 +\subsection{Check your results}
   2.241 +
   2.242 +Because the output of a \hgext{bisect} search is only as good as the
   2.243 +input you give it, don't take the changeset it reports as the
   2.244 +absolute truth.  A simple way to cross-check its report is to manually
   2.245 +run your test at each of the following changesets:
   2.246 +\begin{itemize}
   2.247 +\item The changeset that it reports as the first bad revision.  Your
   2.248 +  test should still report this as bad.
   2.249 +\item The parent of that changeset (either parent, if it's a merge).
   2.250 +  Your test should report this changeset as good.
   2.251 +\item A child of that changeset.  Your test should report this
   2.252 +  changeset as bad.
   2.253 +\end{itemize}
   2.254 +
   2.255 +\subsection{Beware interference between bugs}
   2.256 +
   2.257 +It's possible that your search for one bug could be disrupted by the
   2.258 +presence of another.  For example, let's say your software crashes at
   2.259 +revision 100, and worked correctly at revision 50.  Unknown to you,
   2.260 +someone else introduced a different crashing bug at revision 60, and
   2.261 +fixed it at revision 80.  This could distort your results in one of
   2.262 +several ways.
   2.263 +
   2.264 +It is possible that this other bug completely ``masks'' yours, which
   2.265 +is to say that it occurs before your bug has a chance to manifest
   2.266 +itself.  If you can't avoid that other bug (for example, it prevents
   2.267 +your project from building), and so can't tell whether your bug is
   2.268 +present in a particular changeset, the \hgext{bisect} extension cannot
   2.269 +help you directly.  Instead, you'll need to manually avoid the
   2.270 +changesets where that bug is present, and do separate searches
   2.271 +``around'' it.
   2.272 +
   2.273 +A different problem could arise if your test for a bug's presence is
   2.274 +not specific enough.  If you checks for ``my program crashes'', then
   2.275 +both your crashing bug and an unrelated crashing bug that masks it
   2.276 +will look like the same thing, and mislead \hgext{bisect}.
   2.277 +
   2.278 +\subsection{Bracket your search lazily}
   2.279 +
   2.280 +Choosing the first ``good'' and ``bad'' changesets that will mark the
   2.281 +end points of your search is often easy, but it bears a little
   2.282 +discussion neverthheless.  From the perspective of \hgext{bisect}, the
   2.283 +``newest'' changeset is conventionally ``bad'', and the older
   2.284 +changeset is ``good''.
   2.285 +
   2.286 +If you're having trouble remembering when a suitable ``good'' change
   2.287 +was, so that you can tell \hgext{bisect}, you could do worse than
   2.288 +testing changesets at random.  Just remember to eliminate contenders
   2.289 +that can't possibly exhibit the bug (perhaps because the feature with
   2.290 +the bug isn't present yet) and those where another problem masks the
   2.291 +bug (as I discussed above).
   2.292 +
   2.293 +Even if you end up ``early'' by thousands of changesets or months of
   2.294 +history, you will only add a handful of tests to the total number that
   2.295 +\hgext{bisect} must perform, thanks to its logarithmic behaviour.
   2.296 +
   2.297  %%% Local Variables: 
   2.298  %%% mode: latex
   2.299  %%% TeX-master: "00book"