hgbook
annotate en/mq-collab.tex @ 188:d3dd1bedba3c
Backed out changeset 7f07aca44938d38b30ae8713946346123cdf97b6
Bad behaviour has gone away.
author | Bryan O'Sullivan <bos@serpentine.com> |
---|---|
date | Mon Apr 16 14:22:25 2007 -0700 (2007-04-16) |
parents | 9cbc5d0db542 |
children | 34943a3d50d6 |
rev | line source |
---|---|
bos@104 | 1 \chapter{Advanced uses of Mercurial Queues} |
bos@104 | 2 |
bos@104 | 3 While it's easy to pick up straightforward uses of Mercurial Queues, |
bos@104 | 4 use of a little discipline and some of MQ's less frequently used |
bos@104 | 5 capabilities makes it possible to work in complicated development |
bos@104 | 6 environments. |
bos@104 | 7 |
bos@105 | 8 In this chapter, I will use as an example a technique I have used to |
bos@105 | 9 manage the development of an Infiniband device driver for the Linux |
bos@105 | 10 kernel. The driver in question is large (at least as drivers go), |
bos@105 | 11 with 25,000 lines of code spread across 35 source files. It is |
bos@105 | 12 maintained by a small team of developers. |
bos@104 | 13 |
bos@104 | 14 While much of the material in this chapter is specific to Linux, the |
bos@104 | 15 same principles apply to any code base for which you're not the |
bos@104 | 16 primary owner, and upon which you need to do a lot of development. |
bos@104 | 17 |
bos@104 | 18 \section{The problem of many targets} |
bos@104 | 19 |
bos@104 | 20 The Linux kernel changes rapidly, and has never been internally |
bos@104 | 21 stable; developers frequently make drastic changes between releases. |
bos@104 | 22 This means that a version of the driver that works well with a |
bos@104 | 23 particular released version of the kernel will not even \emph{compile} |
bos@104 | 24 correctly against, typically, any other version. |
bos@104 | 25 |
bos@104 | 26 To maintain a driver, we have to keep a number of distinct versions of |
bos@104 | 27 Linux in mind. |
bos@104 | 28 \begin{itemize} |
bos@104 | 29 \item One target is the main Linux kernel development tree. |
bos@104 | 30 Maintenance of the code is in this case partly shared by other |
bos@104 | 31 developers in the kernel community, who make ``drive-by'' |
bos@104 | 32 modifications to the driver as they develop and refine kernel |
bos@104 | 33 subsystems. |
bos@104 | 34 \item We also maintain a number of ``backports'' to older versions of |
bos@104 | 35 the Linux kernel, to support the needs of customers who are running |
bos@105 | 36 older Linux distributions that do not incorporate our drivers. (To |
bos@105 | 37 \emph{backport} a piece of code is to modify it to work in an older |
bos@105 | 38 version of its target environment than the version it was developed |
bos@105 | 39 for.) |
bos@104 | 40 \item Finally, we make software releases on a schedule that is |
bos@104 | 41 necessarily not aligned with those used by Linux distributors and |
bos@104 | 42 kernel developers, so that we can deliver new features to customers |
bos@104 | 43 without forcing them to upgrade their entire kernels or |
bos@104 | 44 distributions. |
bos@104 | 45 \end{itemize} |
bos@104 | 46 |
bos@104 | 47 \subsection{Tempting approaches that don't work well} |
bos@104 | 48 |
bos@104 | 49 There are two ``standard'' ways to maintain a piece of software that |
bos@104 | 50 has to target many different environments. |
bos@104 | 51 |
bos@104 | 52 The first is to maintain a number of branches, each intended for a |
bos@104 | 53 single target. The trouble with this approach is that you must |
bos@104 | 54 maintain iron discipline in the flow of changes between repositories. |
bos@104 | 55 A new feature or bug fix must start life in a ``pristine'' repository, |
bos@104 | 56 then percolate out to every backport repository. Backport changes are |
bos@104 | 57 more limited in the branches they should propagate to; a backport |
bos@104 | 58 change that is applied to a branch where it doesn't belong will |
bos@104 | 59 probably stop the driver from compiling. |
bos@104 | 60 |
bos@104 | 61 The second is to maintain a single source tree filled with conditional |
bos@104 | 62 statements that turn chunks of code on or off depending on the |
bos@104 | 63 intended target. Because these ``ifdefs'' are not allowed in the |
bos@104 | 64 Linux kernel tree, a manual or automatic process must be followed to |
bos@104 | 65 strip them out and yield a clean tree. A code base maintained in this |
bos@104 | 66 fashion rapidly becomes a rat's nest of conditional blocks that are |
bos@104 | 67 difficult to understand and maintain. |
bos@104 | 68 |
bos@104 | 69 Neither of these approaches is well suited to a situation where you |
bos@104 | 70 don't ``own'' the canonical copy of a source tree. In the case of a |
bos@104 | 71 Linux driver that is distributed with the standard kernel, Linus's |
bos@104 | 72 tree contains the copy of the code that will be treated by the world |
bos@104 | 73 as canonical. The upstream version of ``my'' driver can be modified |
bos@104 | 74 by people I don't know, without me even finding out about it until |
bos@104 | 75 after the changes show up in Linus's tree. |
bos@104 | 76 |
bos@104 | 77 These approaches have the added weakness of making it difficult to |
bos@104 | 78 generate well-formed patches to submit upstream. |
bos@104 | 79 |
bos@104 | 80 In principle, Mercurial Queues seems like a good candidate to manage a |
bos@104 | 81 development scenario such as the above. While this is indeed the |
bos@104 | 82 case, MQ contains a few added features that make the job more |
bos@104 | 83 pleasant. |
bos@104 | 84 |
bos@105 | 85 \section{Conditionally applying patches with |
bos@105 | 86 guards} |
bos@104 | 87 |
bos@104 | 88 Perhaps the best way to maintain sanity with so many targets is to be |
bos@104 | 89 able to choose specific patches to apply for a given situation. MQ |
bos@104 | 90 provides a feature called ``guards'' (which originates with quilt's |
bos@104 | 91 \texttt{guards} command) that does just this. To start off, let's |
bos@104 | 92 create a simple repository for experimenting in. |
bos@104 | 93 \interaction{mq.guards.init} |
bos@104 | 94 This gives us a tiny repository that contains two patches that don't |
bos@104 | 95 have any dependencies on each other, because they touch different files. |
bos@104 | 96 |
bos@104 | 97 The idea behind conditional application is that you can ``tag'' a |
bos@104 | 98 patch with a \emph{guard}, which is simply a text string of your |
bos@104 | 99 choosing, then tell MQ to select specific guards to use when applying |
bos@104 | 100 patches. MQ will then either apply, or skip over, a guarded patch, |
bos@104 | 101 depending on the guards that you have selected. |
bos@104 | 102 |
bos@104 | 103 A patch can have an arbitrary number of guards; |
bos@104 | 104 each one is \emph{positive} (``apply this patch if this guard is |
bos@104 | 105 selected'') or \emph{negative} (``skip this patch if this guard is |
bos@104 | 106 selected''). A patch with no guards is always applied. |
bos@104 | 107 |
bos@104 | 108 \section{Controlling the guards on a patch} |
bos@104 | 109 |
bos@104 | 110 The \hgcmd{qguard} command lets you determine which guards should |
bos@104 | 111 apply to a patch, or display the guards that are already in effect. |
bos@104 | 112 Without any arguments, it displays the guards on the current topmost |
bos@104 | 113 patch. |
bos@104 | 114 \interaction{mq.guards.qguard} |
bos@104 | 115 To set a positive guard on a patch, prefix the name of the guard with |
bos@104 | 116 a ``\texttt{+}''. |
bos@104 | 117 \interaction{mq.guards.qguard.pos} |
bos@104 | 118 To set a negative guard on a patch, prefix the name of the guard with |
bos@104 | 119 a ``\texttt{-}''. |
bos@104 | 120 \interaction{mq.guards.qguard.neg} |
bos@104 | 121 |
bos@104 | 122 \begin{note} |
bos@104 | 123 The \hgcmd{qguard} command \emph{sets} the guards on a patch; it |
bos@104 | 124 doesn't \emph{modify} them. What this means is that if you run |
bos@104 | 125 \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on |
bos@104 | 126 the same patch, the \emph{only} guard that will be set on it |
bos@104 | 127 afterwards is \texttt{+c}. |
bos@104 | 128 \end{note} |
bos@104 | 129 |
bos@104 | 130 Mercurial stores guards in the \sfilename{series} file; the form in |
bos@104 | 131 which they are stored is easy both to understand and to edit by hand. |
bos@104 | 132 (In other words, you don't have to use the \hgcmd{qguard} command if |
bos@104 | 133 you don't want to; it's okay to simply edit the \sfilename{series} |
bos@104 | 134 file.) |
bos@104 | 135 \interaction{mq.guards.series} |
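To give a feel for how simple the format is, here is a minimal Python sketch that splits one line of a \sfilename{series} file into a patch name and its guards. It assumes that each guard is written after the patch name as a `#` followed by `+name` or `-name` (for example, `hello.patch #+foo`). The function name and the sample patch names are invented for illustration, and this is not MQ's own parser.

```python
def parse_series_line(line):
    """Split a series line into (patch_name, [guards]).

    Assumption: guards follow the patch name as '#+name' or '#-name'
    tokens; any other token starting with '#' begins an ordinary comment.
    """
    tokens = line.split()
    if not tokens or tokens[0].startswith("#"):
        return None, []              # blank line or comment-only line
    guards = []
    for token in tokens[1:]:
        if token.startswith("#") and token[1:2] in ("+", "-") and len(token) > 2:
            guards.append(token[1:])  # keep the sign, drop the '#'
        elif token.startswith("#"):
            break                     # the rest of the line is a comment
    return tokens[0], guards


print(parse_series_line("hello.patch #+foo"))
print(parse_series_line("goodbye.patch #-quux"))
```

A helper like this can be handy if you edit the file by hand and want to check the result before running \hgcmd{qpush}.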
bos@104 | 136 |
bos@104 | 137 \section{Selecting the guards to use} |
bos@104 | 138 |
bos@104 | 139 The \hgcmd{qselect} command determines which guards are active at a |
bos@104 | 140 given time. The effect of this is to determine which patches MQ will |
bos@104 | 141 apply the next time you run \hgcmd{qpush}. It has no other effect; in |
bos@104 | 142 particular, it doesn't do anything to patches that are already |
bos@104 | 143 applied. |
bos@104 | 144 |
bos@104 | 145 With no arguments, the \hgcmd{qselect} command lists the guards |
bos@104 | 146 currently in effect, one per line of output. Each argument is treated |
bos@104 | 147 as the name of a guard to apply. |
bos@104 | 148 \interaction{mq.guards.qselect.foo} |
bos@104 | 149 In case you're interested, the currently selected guards are stored in |
bos@104 | 150 the \sfilename{guards} file. |
bos@104 | 151 \interaction{mq.guards.qselect.cat} |
bos@104 | 152 We can see the effect the selected guards have when we run |
bos@104 | 153 \hgcmd{qpush}. |
bos@104 | 154 \interaction{mq.guards.qselect.qpush} |
bos@104 | 155 |
bos@104 | 156 A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}'' |
bos@106 | 157 character. The name of a guard must not contain white space, but most |
bos@106 | 158 other characters are acceptable. If you try to use a guard with an |
bos@106 | 159 invalid name, MQ will complain: |
bos@106 | 160 \interaction{mq.guards.qselect.error} |
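The two naming rules just described can be sketched as a small Python check. This covers only the rules stated here (no leading `+` or `-`, no white space); MQ itself may reject further characters, and the function name is an assumption made for this example.

```python
def is_valid_guard_name(name):
    """Check a guard name against the two rules stated in the text."""
    if not name:
        return False      # an empty name is treated as invalid in this sketch
    if name[0] in "+-":
        return False      # the sign belongs to the guard, not to its name
    return not any(ch.isspace() for ch in name)


for candidate in ["foo", "2.6.9", "+foo", "foo bar"]:
    print(candidate, is_valid_guard_name(candidate))
```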
bos@104 | 161 Changing the selected guards changes the patches that are applied. |
bos@106 | 162 \interaction{mq.guards.qselect.quux} |
bos@105 | 163 You can see in the example below that negative guards take precedence |
bos@105 | 164 over positive guards. |
bos@104 | 165 \interaction{mq.guards.qselect.foobar} |
bos@104 | 166 |
bos@105 | 167 \section{MQ's rules for applying patches} |
bos@105 | 168 |
bos@105 | 169 The rules that MQ uses when deciding whether to apply a patch |
bos@105 | 170 are as follows. |
bos@105 | 171 \begin{itemize} |
bos@105 | 172 \item A patch that has no guards is always applied. |
bos@105 | 173 \item If the patch has any negative guard that matches any currently |
bos@105 | 174 selected guard, the patch is skipped. |
bos@105 | 175 \item If the patch has any positive guard that matches any currently |
bos@105 | 176 selected guard, the patch is applied. |
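bos@105 | 177 \item If the patch has positive guards, but none matches any currently selected guard, the patch is skipped. |
bos@105 | 178 \item Otherwise (the patch has only negative guards, and none matches any currently selected guard), the patch is applied. |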
bos@105 | 179 \end{itemize} |
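The decision procedure can be sketched in a few lines of Python. This is an illustration, not MQ's code: guards are written with their sign (`+foo`, `-quux`), selected guards without one, and the function name is invented. One assumption is worth flagging: the sketch treats a patch whose guards are all negative, with none of them selected, as applied, which is my understanding of how MQ behaves.

```python
def should_apply(patch_guards, selected):
    """Decide whether a patch would be applied.

    patch_guards: signed guards on the patch, e.g. ["+foo", "-quux"].
    selected: names of the currently selected guards, e.g. ["foo"].
    """
    positive = {g[1:] for g in patch_guards if g.startswith("+")}
    negative = {g[1:] for g in patch_guards if g.startswith("-")}
    selected = set(selected)

    if not patch_guards:
        return True                        # unguarded patches always apply
    if negative & selected:
        return False                       # a matching negative guard wins
    if positive:
        return bool(positive & selected)   # needs a matching positive guard
    return True                            # assumption: only unmatched negative guards


print(should_apply(["+foo"], ["foo"]))
print(should_apply(["+foo", "-bar"], ["foo", "bar"]))
```

The second call shows a negative guard taking precedence over a positive one.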
bos@105 | 180 |
bos@105 | 181 \section{Trimming the work environment} |
bos@105 | 182 |
bos@105 | 183 In working on the device driver I mentioned earlier, I don't apply the |
bos@105 | 184 patches to a normal Linux kernel tree. Instead, I use a repository |
bos@105 | 185 that contains only a snapshot of the source files and headers that are |
bos@105 | 186 relevant to Infiniband development. This repository is~1\% the size |
bos@105 | 187 of a kernel repository, so it's easier to work with. |
bos@105 | 188 |
bos@105 | 189 I then choose a ``base'' version on top of which the patches are |
bos@105 | 190 applied. This is a snapshot of the Linux kernel tree as of a revision |
bos@105 | 191 of my choosing. When I take the snapshot, I record the changeset ID |
bos@105 | 192 from the kernel repository in the commit message. Since the snapshot |
bos@105 | 193 preserves the ``shape'' and content of the relevant parts of the |
bos@105 | 194 kernel tree, I can apply my patches on top of either my tiny |
bos@105 | 195 repository or a normal kernel tree. |
bos@105 | 196 |
bos@105 | 197 Normally, the base tree atop which the patches apply should be a |
bos@105 | 198 snapshot of a very recent upstream tree. This best facilitates the |
bos@105 | 199 development of patches that can easily be submitted upstream with few |
bos@105 | 200 or no modifications. |
bos@105 | 201 |
bos@105 | 202 \section{Dividing up the \sfilename{series} file} |
bos@105 | 203 |
bos@105 | 204 I categorise the patches in the \sfilename{series} file into a number |
bos@105 | 205 of logical groups. Each section of like patches begins with a block |
bos@105 | 206 of comments that describes the purpose of the patches that follow. |
bos@105 | 207 |
bos@105 | 208 The sequence of patch groups that I maintain follows. The ordering of |
bos@105 | 209 these groups is important; I'll describe why after I introduce the |
bos@105 | 210 groups. |
bos@105 | 211 \begin{itemize} |
bos@105 | 212 \item The ``accepted'' group. Patches that the development team has |
bos@105 | 213 submitted to the maintainer of the Infiniband subsystem, and which |
bos@105 | 214 he has accepted, but which are not present in the snapshot that the |
bos@105 | 215 tiny repository is based on. These are ``read only'' patches, |
bos@105 | 216 present only to transform the tree into a state similar to that of |
bos@105 | 217 the upstream maintainer's repository. |
bos@105 | 218 \item The ``rework'' group. Patches that I have submitted, but that |
bos@105 | 219 the upstream maintainer has requested modifications to before he |
bos@105 | 220 will accept them. |
bos@105 | 221 \item The ``pending'' group. Patches that I have not yet submitted to |
bos@105 | 222 the upstream maintainer, but which we have finished working on. |
bos@105 | 223 These will be ``read only'' for a while. If the upstream maintainer |
bos@105 | 224 accepts them upon submission, I'll move them to the end of the |
bos@105 | 225 ``accepted'' group. If he requests that I modify any, I'll move |
bos@105 | 226 them to the beginning of the ``rework'' group. |
bos@105 | 227 \item The ``in progress'' group. Patches that are actively being |
bos@105 | 228 developed, and should not be submitted anywhere yet. |
bos@105 | 229 \item The ``backport'' group. Patches that adapt the source tree to |
bos@105 | 230 older versions of the kernel tree. |
bos@105 | 231 \item The ``do not ship'' group. Patches that for some reason should |
bos@105 | 232 never be submitted upstream. For example, one such patch might |
bos@105 | 233 change embedded driver identification strings to make it easier to |
bos@105 | 234 distinguish, in the field, between an out-of-tree version of the |
bos@105 | 235 driver and a version shipped by a distribution vendor. |
bos@105 | 236 \end{itemize} |
bos@105 | 237 |
bos@105 | 238 Now to return to the reasons for ordering groups of patches in this |
bos@105 | 239 way. We would like the lowest patches in the stack to be as stable as |
bos@105 | 240 possible, so that we will not need to rework higher patches due to |
bos@105 | 241 changes in context. Putting patches that will never be changed first |
bos@105 | 242 in the \sfilename{series} file serves this purpose. |
bos@105 | 243 |
bos@105 | 244 We would also like the patches that we know we'll need to modify to be |
bos@105 | 245 applied on top of a source tree that resembles the upstream tree as |
bos@105 | 246 closely as possible. This is why we keep accepted patches around for |
bos@105 | 247 a while. |
bos@105 | 248 |
bos@105 | 249 The ``backport'' and ``do not ship'' patches float at the end of the |
bos@106 | 250 \sfilename{series} file. The backport patches must be applied on top |
bos@106 | 251 of all other patches, and the ``do not ship'' patches might as well |
bos@106 | 252 stay out of harm's way. |
bos@106 | 253 |
bos@106 | 254 \section{Maintaining the patch series} |
bos@106 | 255 |
bos@106 | 256 In my work, I use a number of guards to control which patches are to |
bos@106 | 257 be applied. |
bos@106 | 258 |
bos@106 | 259 \begin{itemize} |
bos@106 | 260 \item ``Accepted'' patches are guarded with \texttt{accepted}. I |
bos@106 | 261 enable this guard most of the time. When I'm applying the patches |
bos@106 | 262 on top of a tree where the patches are already present, I can turn |
bos@106 | 263 this guard off, and the patches that follow will apply cleanly. |
bos@106 | 264 \item Patches that are ``finished'', but not yet submitted, have no |
bos@106 | 265 guards. If I'm applying the patch stack to a copy of the upstream |
bos@106 | 266 tree, I don't need to enable any guards in order to get a reasonably |
bos@106 | 267 safe source tree. |
bos@106 | 268 \item Those patches that need reworking before being resubmitted are |
bos@106 | 269 guarded with \texttt{rework}. |
bos@106 | 270 \item For those patches that are still under development, I use |
bos@106 | 271 \texttt{devel}. |
bos@106 | 272 \item A backport patch may have several guards, one for each version |
bos@106 | 273 of the kernel to which it applies. For example, a patch that |
bos@106 | 274 backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard. |
bos@106 | 275 \end{itemize} |
bos@106 | 276 This variety of guards gives me considerable flexibility in |
bos@106 | 277 determining what kind of source tree I want to end up with. For most |
bos@106 | 278 situations, the selection of appropriate guards is automated during |
bos@106 | 279 the build process, but I can manually tune the guards to use for less |
bos@106 | 280 common circumstances. |
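As an illustration of what automating the selection might look like, here is a hypothetical Python sketch that turns a description of a build into a list of guards and an \hgcmd{qselect} command line. The function names and parameters are invented for this example, and the policy it encodes is only one reading of the list above, not my actual build scripts.

```python
def guards_for_build(kernel_version=None, tree_has_accepted=False,
                     include_devel=False):
    """Return the guards to select for one kind of build (hypothetical policy)."""
    guards = []
    if not tree_has_accepted:
        guards.append("accepted")            # tree lacks the accepted patches
    if include_devel:
        guards.extend(["rework", "devel"])   # unfinished work, for developers
    if kernel_version is not None:
        guards.append(kernel_version)        # e.g. "2.6.9" enables its backports
    return guards


def qselect_command(guards):
    """Build the argument list for 'hg qselect'; '--none' clears all guards."""
    return ["hg", "qselect"] + guards if guards else ["hg", "qselect", "--none"]


print(qselect_command(guards_for_build("2.6.9")))
```

The resulting list can be handed to something like `subprocess.call` before running \hgcmd{qpush}.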
bos@106 | 281 |
bos@106 | 282 \subsection{The art of writing backport patches} |
bos@106 | 283 |
bos@106 | 284 Using MQ, writing a backport patch is a simple process. All such a |
bos@106 | 285 patch has to do is modify a piece of code that uses a kernel feature |
bos@106 | 286 not present in the older version of the kernel, so that the driver |
bos@106 | 287 continues to work correctly under that older version. |
bos@106 | 288 |
bos@106 | 289 A useful goal when writing a good backport patch is to make your code |
bos@106 | 290 look as if it was written for the older version of the kernel you're |
bos@106 | 291 targeting. The less obtrusive the patch, the easier it will be to |
bos@106 | 292 understand and maintain. If you're writing a collection of backport |
bos@106 | 293 patches to avoid the ``rat's nest'' effect of lots of |
bos@106 | 294 \texttt{\#ifdef}s (hunks of source code that are only used |
bos@106 | 295 conditionally) in your code, don't introduce version-dependent |
bos@106 | 296 \texttt{\#ifdef}s into the patches. Instead, write several patches, |
bos@106 | 297 each of which makes unconditional changes, and control their |
bos@106 | 298 application using guards. |
bos@106 | 299 |
bos@106 | 300 There are two reasons to divide backport patches into a distinct |
bos@106 | 301 group, away from the ``regular'' patches whose effects they modify. |
bos@106 | 302 The first is that intermingling the two makes it more difficult to use |
bos@106 | 303 a tool like the \hgext{patchbomb} extension to automate the process of |
bos@106 | 304 submitting the patches to an upstream maintainer. The second is that |
bos@106 | 305 a backport patch could perturb the context in which a subsequent |
bos@106 | 306 regular patch is applied, making it impossible to apply the regular |
bos@106 | 307 patch cleanly \emph{without} the earlier backport patch already being |
bos@106 | 308 applied. |
bos@106 | 309 |
bos@106 | 310 \section{Useful tips for developing with MQ} |
bos@106 | 311 |
bos@106 | 312 \subsection{Organising patches in directories} |
bos@106 | 313 |
bos@106 | 314 If you're working on a substantial project with MQ, it's not difficult |
bos@106 | 315 to accumulate a large number of patches. For example, I have one |
bos@106 | 316 patch repository that contains over 250 patches. |
bos@106 | 317 |
bos@106 | 318 If you can group these patches into separate logical categories, you |
bos@106 | 319 can if you like store them in different directories; MQ has no |
bos@106 | 320 problems with patch names that contain path separators. |
bos@106 | 321 |
bos@106 | 322 \subsection{Viewing the history of a patch} |
bos@106 | 323 \label{mq-collab:tips:interdiff} |
bos@106 | 324 |
bos@106 | 325 If you're developing a set of patches over a long time, it's a good |
bos@106 | 326 idea to maintain them in a repository, as discussed in |
bos@106 | 327 section~\ref{sec:mq:repo}. If you do so, you'll quickly discover that |
bos@106 | 328 using the \hgcmd{diff} command to look at the history of changes to a |
bos@106 | 329 patch is unworkable. This is in part because you're looking at the |
bos@106 | 330 second derivative of the real code (a diff of a diff), but also |
bos@106 | 331 because MQ adds noise to the process by modifying time stamps and |
bos@106 | 332 directory names when it updates a patch. |
bos@106 | 333 |
bos@106 | 334 However, you can use the \hgext{extdiff} extension, which is bundled |
bos@106 | 335 with Mercurial, to turn a diff of two versions of a patch into |
bos@106 | 336 something readable. To do this, you will need a third-party package |
bos@106 | 337 called \package{patchutils}~\cite{web:patchutils}. This provides a |
bos@106 | 338 command named \command{interdiff}, which shows the differences between |
bos@106 | 339 two diffs as a diff. Used on two versions of the same diff, it |
bos@106 | 340 generates a diff that represents the diff from the first to the second |
bos@106 | 341 version. |
bos@106 | 342 |
bos@106 | 343 You can enable the \hgext{extdiff} extension in the usual way, by |
bos@106 | 344 adding a line to the \rcsection{extensions} section of your \hgrc. |
bos@106 | 345 \begin{codesample2} |
bos@106 | 346 [extensions] |
bos@106 | 347 extdiff = |
bos@106 | 348 \end{codesample2} |
bos@106 | 349 The \command{interdiff} command expects to be passed the names of two |
bos@106 | 350 files, but the \hgext{extdiff} extension passes the program it runs a |
bos@106 | 351 pair of directories, each of which can contain an arbitrary number of |
bos@106 | 352 files. We thus need a small program that will run \command{interdiff} |
bos@106 | 353 on each pair of files in these two directories. This program is |
bos@106 | 354 available as \sfilename{hg-interdiff} in the \dirname{examples} |
bos@106 | 355 directory of the source code repository that accompanies this book. |
bos@106 | 356 \excode{hg-interdiff} |
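If you'd rather write such an adapter yourself, the core of it is small. The following Python sketch is not the \sfilename{hg-interdiff} script from the repository; it is a minimal illustration with invented function names. It pairs files by relative path and substitutes the null device when a file exists on only one side. It also assumes that the program might be handed two plain files instead of two directories, and passes those straight through.

```python
import os
import subprocess


def relative_files(base):
    """Return the set of file paths under base, relative to base."""
    found = set()
    for root, _dirs, names in os.walk(base):
        for name in names:
            found.add(os.path.relpath(os.path.join(root, name), base))
    return found


def file_pairs(old_dir, new_dir):
    """Pair files by relative path; a missing side becomes os.devnull."""
    if not (os.path.isdir(old_dir) and os.path.isdir(new_dir)):
        return [(old_dir, new_dir)]   # assumption: already a pair of plain files

    def pick(base, rel):
        path = os.path.join(base, rel)
        return path if os.path.exists(path) else os.devnull

    names = sorted(relative_files(old_dir) | relative_files(new_dir))
    return [(pick(old_dir, rel), pick(new_dir, rel)) for rel in names]


def main(argv):
    """Run interdiff on every pair; return 1 if any run reports a failure."""
    status = 0
    for old, new in file_pairs(argv[1], argv[2]):
        if subprocess.call(["interdiff", old, new]) != 0:
            status = 1
    return status
```

A script built on this would finish with `sys.exit(main(sys.argv))` after importing `sys`.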
bos@106 | 357 |
bos@106 | 358 With the \sfilename{hg-interdiff} program in your shell's search path, |
bos@106 | 359 you can run it as follows, from inside an MQ patch directory: |
bos@106 | 360 \begin{codesample2} |
bos@106 | 361 hg extdiff -p hg-interdiff -r A:B my-change.patch |
bos@106 | 362 \end{codesample2} |
bos@106 | 363 Since you'll probably want to use this long-winded command a lot, you |
bos@106 | 364 can get \hgext{extdiff} to make it available as a normal Mercurial |
bos@106 | 365 command, again by editing your \hgrc. |
bos@106 | 366 \begin{codesample2} |
bos@106 | 367 [extdiff] |
bos@106 | 368 cmd.interdiff = hg-interdiff |
bos@106 | 369 \end{codesample2} |
bos@106 | 370 This directs \hgext{extdiff} to make an \texttt{interdiff} command |
bos@106 | 371 available, so you can now shorten the previous invocation of |
bos@106 | 372 \hgcmd{extdiff} to something a little more wieldy. |
bos@106 | 373 \begin{codesample2} |
bos@106 | 374 hg interdiff -r A:B my-change.patch |
bos@106 | 375 \end{codesample2} |
bos@105 | 376 |
bos@107 | 377 \begin{note} |
bos@107 | 378 The \command{interdiff} command works well only if the underlying |
bos@107 | 379 files against which versions of a patch are generated remain the |
bos@107 | 380 same. If you create a patch, modify the underlying files, and then |
bos@107 | 381 regenerate the patch, \command{interdiff} may not produce useful |
bos@107 | 382 output. |
bos@107 | 383 \end{note} |
bos@107 | 384 |
bos@104 | 385 %%% Local Variables: |
bos@104 | 386 %%% mode: latex |
bos@104 | 387 %%% TeX-master: "00book" |
bos@104 | 388 %%% End: |