hgbook: en/mq-collab.tex annotate

annotate en/mq-collab.tex @ 188:d3dd1bedba3c

Backed out changeset 7f07aca44938d38b30ae8713946346123cdf97b6
Bad behaviour has gone away.

author	Bryan O'Sullivan <bos@serpentine.com>
date	Mon Apr 16 14:22:25 2007 -0700 (2007-04-16)
parents	9cbc5d0db542
children	34943a3d50d6

rev	line source
bos@104	1 \chapter{Advanced uses of Mercurial Queues}
bos@104	2
bos@104	3 While it's easy to pick up straightforward uses of Mercurial Queues,
bos@104	4 use of a little discipline and some of MQ's less frequently used
bos@104	5 capabilities makes it possible to work in complicated development
bos@104	6 environments.
bos@104	7
bos@105	8 In this chapter, I will use as an example a technique I have used to
bos@105	9 manage the development of an Infiniband device driver for the Linux
bos@105	10 kernel. The driver in question is large (at least as drivers go),
bos@105	11 with 25,000 lines of code spread across 35 source files. It is
bos@105	12 maintained by a small team of developers.
bos@104	13
bos@104	14 While much of the material in this chapter is specific to Linux, the
bos@104	15 same principles apply to any code base for which you're not the
bos@104	16 primary owner, and upon which you need to do a lot of development.
bos@104	17
bos@104	18 \section{The problem of many targets}
bos@104	19
bos@104	20 The Linux kernel changes rapidly, and has never been internally
bos@104	21 stable; developers frequently make drastic changes between releases.
bos@104	22 This means that a version of the driver that works well with a
bos@104	23 particular released version of the kernel will typically not even
bos@104	24 \emph{compile} correctly against any other version.
bos@104	25
bos@104	26 To maintain a driver, we have to keep a number of distinct versions of
bos@104	27 Linux in mind.
bos@104	28 \begin{itemize}
bos@104	29 \item One target is the main Linux kernel development tree.
bos@104	30 Maintenance of the code is in this case partly shared by other
bos@104	31 developers in the kernel community, who make ``drive-by''
bos@104	32 modifications to the driver as they develop and refine kernel
bos@104	33 subsystems.
bos@104	34 \item We also maintain a number of ``backports'' to older versions of
bos@104	35 the Linux kernel, to support the needs of customers who are running
bos@105	36 older Linux distributions that do not incorporate our drivers. (To
bos@105	37 \emph{backport} a piece of code is to modify it to work in an older
bos@105	38 version of its target environment than the version it was developed
bos@105	39 for.)
bos@104	40 \item Finally, we make software releases on a schedule that is
bos@104	41 necessarily not aligned with those used by Linux distributors and
bos@104	42 kernel developers, so that we can deliver new features to customers
bos@104	43 without forcing them to upgrade their entire kernels or
bos@104	44 distributions.
bos@104	45 \end{itemize}
bos@104	46
bos@104	47 \subsection{Tempting approaches that don't work well}
bos@104	48
bos@104	49 There are two ``standard'' ways to maintain a piece of software that
bos@104	50 has to target many different environments.
bos@104	51
bos@104	52 The first is to maintain a number of branches, each intended for a
bos@104	53 single target. The trouble with this approach is that you must
bos@104	54 maintain iron discipline in the flow of changes between repositories.
bos@104	55 A new feature or bug fix must start life in a ``pristine'' repository,
bos@104	56 then percolate out to every backport repository. Backport changes are
bos@104	57 more limited in the branches they should propagate to; a backport
bos@104	58 change that is applied to a branch where it doesn't belong will
bos@104	59 probably stop the driver from compiling.
bos@104	60
bos@104	61 The second is to maintain a single source tree filled with conditional
bos@104	62 statements that turn chunks of code on or off depending on the
bos@104	63 intended target. Because these ``ifdefs'' are not allowed in the
bos@104	64 Linux kernel tree, a manual or automatic process must be followed to
bos@104	65 strip them out and yield a clean tree. A code base maintained in this
bos@104	66 fashion rapidly becomes a rat's nest of conditional blocks that are
bos@104	67 difficult to understand and maintain.
bos@104	68
bos@104	69 Neither of these approaches is well suited to a situation where you
bos@104	70 don't ``own'' the canonical copy of a source tree. In the case of a
bos@104	71 Linux driver that is distributed with the standard kernel, Linus's
bos@104	72 tree contains the copy of the code that will be treated by the world
bos@104	73 as canonical. The upstream version of ``my'' driver can be modified
bos@104	74 by people I don't know, without me even finding out about it until
bos@104	75 after the changes show up in Linus's tree.
bos@104	76
bos@104	77 These approaches have the added weakness of making it difficult to
bos@104	78 generate well-formed patches to submit upstream.
bos@104	79
bos@104	80 In principle, Mercurial Queues seems like a good candidate to manage a
bos@104	81 development scenario such as the above. While this is indeed the
bos@104	82 case, MQ contains a few added features that make the job more
bos@104	83 pleasant.
bos@104	84
bos@105	85 \section{Conditionally applying patches with
bos@105	86 guards}
bos@104	87
bos@104	88 Perhaps the best way to maintain sanity with so many targets is to be
bos@104	89 able to choose specific patches to apply for a given situation. MQ
bos@104	90 provides a feature called ``guards'' (which originates with quilt's
bos@104	91 \texttt{guards} command) that does just this. To start off, let's
bos@104	92 create a simple repository for experimenting in.
bos@104	93 \interaction{mq.guards.init}
bos@104	94 This gives us a tiny repository that contains two patches that don't
bos@104	95 have any dependencies on each other, because they touch different files.
bos@104	96
bos@104	97 The idea behind conditional application is that you can ``tag'' a
bos@104	98 patch with a \emph{guard}, which is simply a text string of your
bos@104	99 choosing, then tell MQ to select specific guards to use when applying
bos@104	100 patches. MQ will then either apply, or skip over, a guarded patch,
bos@104	101 depending on the guards that you have selected.
bos@104	102
bos@104	103 A patch can have an arbitrary number of guards;
bos@104	104 each one is \emph{positive} (``apply this patch if this guard is
bos@104	105 selected'') or \emph{negative} (``skip this patch if this guard is
bos@104	106 selected''). A patch with no guards is always applied.
bos@104	107
bos@104	108 \section{Controlling the guards on a patch}
bos@104	109
bos@104	110 The \hgcmd{qguard} command lets you determine which guards should
bos@104	111 apply to a patch, or display the guards that are already in effect.
bos@104	112 Without any arguments, it displays the guards on the current topmost
bos@104	113 patch.
bos@104	114 \interaction{mq.guards.qguard}
bos@104	115 To set a positive guard on a patch, prefix the name of the guard with
bos@104	116 a ``\texttt{+}''.
bos@104	117 \interaction{mq.guards.qguard.pos}
bos@104	118 To set a negative guard on a patch, prefix the name of the guard with
bos@104	119 a ``\texttt{-}''.
bos@104	120 \interaction{mq.guards.qguard.neg}
bos@104	121
bos@104	122 \begin{note}
bos@104	123 The \hgcmd{qguard} command \emph{sets} the guards on a patch; it
bos@104	124 doesn't \emph{modify} them. What this means is that if you run
bos@104	125 \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on
bos@104	126 the same patch, the \emph{only} guard that will be set on it
bos@104	127 afterwards is \texttt{+c}.
bos@104	128 \end{note}
bos@104	129
bos@104	130 Mercurial stores guards in the \sfilename{series} file; the form in
bos@104	131 which they are stored is easy both to understand and to edit by hand.
bos@104	132 (In other words, you don't have to use the \hgcmd{qguard} command if
bos@104	133 you don't want to; it's okay to simply edit the \sfilename{series}
bos@104	134 file.)
bos@104	135 \interaction{mq.guards.series}
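If you would rather read the \sfilename{series} file from a script than edit it by hand, the following is a minimal sketch of how one line might be parsed. It assumes the layout is a patch name followed by guards written as \texttt{\#+name} or \texttt{\#-name}, and that a line starting with \texttt{\#} is a comment; the names \texttt{GUARD\_RE} and \texttt{parse\_series\_line} are hypothetical, and this is not MQ's own parser.

```python
import re

# Assumed layout of a series line: "patch-name #+guard #-guard".
# A guard starts with + or -, and its name contains no white space or '#'.
GUARD_RE = re.compile(r"#([-+][^-+#\s][^#\s]*)")


def parse_series_line(line):
    """Return (patch_name, guards) for one series line, or None.

    None is returned for blank lines and for comment lines, which are
    assumed here to start with '#'.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    guards = GUARD_RE.findall(line)
    # Everything before the first '#' is taken to be the patch name.
    name = line.split("#", 1)[0].strip()
    return name, guards


if __name__ == "__main__":
    print(parse_series_line("hello.patch #+foo #-bar"))
```

Each guard is returned with its sign still attached, so a caller can tell positive guards from negative ones.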
bos@104	136
bos@104	137 \section{Selecting the guards to use}
bos@104	138
bos@104	139 The \hgcmd{qselect} command determines which guards are active at a
bos@104	140 given time. The effect of this is to determine which patches MQ will
bos@104	141 apply the next time you run \hgcmd{qpush}. It has no other effect; in
bos@104	142 particular, it doesn't do anything to patches that are already
bos@104	143 applied.
bos@104	144
bos@104	145 With no arguments, the \hgcmd{qselect} command lists the guards
bos@104	146 currently in effect, one per line of output. Each argument is treated
bos@104	147 as the name of a guard to apply.
bos@104	148 \interaction{mq.guards.qselect.foo}
bos@104	149 In case you're interested, the currently selected guards are stored in
bos@104	150 the \sfilename{guards} file.
bos@104	151 \interaction{mq.guards.qselect.cat}
bos@104	152 We can see the effect the selected guards have when we run
bos@104	153 \hgcmd{qpush}.
bos@104	154 \interaction{mq.guards.qselect.qpush}
bos@104	155
bos@104	156 A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}''
bos@106	157 character. The name of a guard must not contain white space, but most
bos@106	158 other characters are acceptable. If you try to use a guard with an
bos@106	159 invalid name, MQ will complain:
bos@106	160 \interaction{mq.guards.qselect.error}
bos@104	161 Changing the selected guards changes the patches that are applied.
bos@106	162 \interaction{mq.guards.qselect.quux}
bos@105	163 You can see in the example below that negative guards take precedence
bos@105	164 over positive guards.
bos@104	165 \interaction{mq.guards.qselect.foobar}
bos@104	166
bos@105	167 \section{MQ's rules for applying patches}
bos@105	168
bos@105	169 The rules that MQ uses when deciding whether to apply a patch
bos@105	170 are as follows.
bos@105	171 \begin{itemize}
bos@105	172 \item A patch that has no guards is always applied.
bos@105	173 \item If the patch has any negative guard that matches any currently
bos@105	174 selected guard, the patch is skipped.
bos@105	175 \item If the patch has any positive guard that matches any currently
bos@105	176 selected guard, the patch is applied.
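bos@105	177 \item If the patch has positive guards, but none matches any currently selected guard, the patch is skipped.
bos@105	178 \item If the patch has only negative guards, and none matches any currently selected guard, the patch is applied.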
bos@105	179 \end{itemize}
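The decision procedure can be sketched as a small function. This is an illustration only, not MQ's source code, and the name \texttt{should\_apply} is hypothetical. One assumption is worth flagging: a patch that carries only negative guards, none of which is selected, is treated as applied, which is my understanding of how MQ behaves.

```python
def should_apply(patch_guards, selected):
    """Decide whether a guarded patch would be applied.

    patch_guards: guards on the patch, each written with its sign,
                  for example ["+foo", "-bar"].
    selected:     names of the currently selected guards, without signs.
    """
    selected = set(selected)
    positive = {g[1:] for g in patch_guards if g.startswith("+")}
    negative = {g[1:] for g in patch_guards if g.startswith("-")}

    # A selected negative guard always wins, so check it first.
    if negative & selected:
        return False
    # With positive guards present, at least one must be selected.
    if positive:
        return bool(positive & selected)
    # No guards at all, or only unselected negative guards (assumption).
    return True


if __name__ == "__main__":
    print(should_apply(["+foo", "-bar"], ["foo", "bar"]))
```

Checking negative guards before positive ones is what gives them precedence when both kinds match.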
bos@105	180
bos@105	181 \section{Trimming the work environment}
bos@105	182
bos@105	183 In working on the device driver I mentioned earlier, I don't apply the
bos@105	184 patches to a normal Linux kernel tree. Instead, I use a repository
bos@105	185 that contains only a snapshot of the source files and headers that are
bos@105	186 relevant to Infiniband development. This repository is~1\% the size
bos@105	187 of a kernel repository, so it's easier to work with.
bos@105	188
bos@105	189 I then choose a ``base'' version on top of which the patches are
bos@105	190 applied. This is a snapshot of the Linux kernel tree as of a revision
bos@105	191 of my choosing. When I take the snapshot, I record the changeset ID
bos@105	192 from the kernel repository in the commit message. Since the snapshot
bos@105	193 preserves the ``shape'' and content of the relevant parts of the
bos@105	194 kernel tree, I can apply my patches on top of either my tiny
bos@105	195 repository or a normal kernel tree.
bos@105	196
bos@105	197 Normally, the base tree atop which the patches apply should be a
bos@105	198 snapshot of a very recent upstream tree. This best facilitates the
bos@105	199 development of patches that can easily be submitted upstream with few
bos@105	200 or no modifications.
bos@105	201
bos@105	202 \section{Dividing up the \sfilename{series} file}
bos@105	203
bos@105	204 I categorise the patches in the \sfilename{series} file into a number
bos@105	205 of logical groups. Each section of like patches begins with a block
bos@105	206 of comments that describes the purpose of the patches that follow.
bos@105	207
bos@105	208 The sequence of patch groups that I maintain follows. The ordering of
bos@105	209 these groups is important; I'll describe why after I introduce the
bos@105	210 groups.
bos@105	211 \begin{itemize}
bos@105	212 \item The ``accepted'' group. Patches that the development team has
bos@105	213 submitted to the maintainer of the Infiniband subsystem, and which
bos@105	214 he has accepted, but which are not present in the snapshot that the
bos@105	215 tiny repository is based on. These are ``read only'' patches,
bos@105	216 present only to transform the tree into a state similar to that of
bos@105	217 the upstream maintainer's repository.
bos@105	218 \item The ``rework'' group. Patches that I have submitted, but that
bos@105	219 the upstream maintainer has requested modifications to before he
bos@105	220 will accept them.
bos@105	221 \item The ``pending'' group. Patches that I have not yet submitted to
bos@105	222 the upstream maintainer, but which we have finished working on.
bos@105	223 These will be ``read only'' for a while. If the upstream maintainer
bos@105	224 accepts them upon submission, I'll move them to the end of the
bos@105	225 ``accepted'' group. If he requests that I modify any, I'll move
bos@105	226 them to the beginning of the ``rework'' group.
bos@105	227 \item The ``in progress'' group. Patches that are actively being
bos@105	228 developed, and should not be submitted anywhere yet.
bos@105	229 \item The ``backport'' group. Patches that adapt the source tree to
bos@105	230 older versions of the kernel tree.
bos@105	231 \item The ``do not ship'' group. Patches that for some reason should
bos@105	232 never be submitted upstream. For example, one such patch might
bos@105	233 change embedded driver identification strings to make it easier to
bos@105	234 distinguish, in the field, between an out-of-tree version of the
bos@105	235 driver and a version shipped by a distribution vendor.
bos@105	236 \end{itemize}
bos@105	237
bos@105	238 Now to return to the reasons for ordering groups of patches in this
bos@105	239 way. We would like the lowest patches in the stack to be as stable as
bos@105	240 possible, so that we will not need to rework higher patches due to
bos@105	241 changes in context. Putting patches that will never be changed first
bos@105	242 in the \sfilename{series} file serves this purpose.
bos@105	243
bos@105	244 We would also like the patches that we know we'll need to modify to be
bos@105	245 applied on top of a source tree that resembles the upstream tree as
bos@105	246 closely as possible. This is why we keep accepted patches around for
bos@105	247 a while.
bos@105	248
bos@105	249 The ``backport'' and ``do not ship'' patches float at the end of the
bos@106	250 \sfilename{series} file. The backport patches must be applied on top
bos@106	251 of all other patches, and the ``do not ship'' patches might as well
bos@106	252 stay out of harm's way.
bos@106	253
bos@106	254 \section{Maintaining the patch series}
bos@106	255
bos@106	256 In my work, I use a number of guards to control which patches are to
bos@106	257 be applied.
bos@106	258
bos@106	259 \begin{itemize}
bos@106	260 \item ``Accepted'' patches are guarded with \texttt{accepted}. I
bos@106	261 enable this guard most of the time. When I'm applying the patches
bos@106	262 on top of a tree where the patches are already present, I can turn
bos@106	263 this guard off, and the patches that follow it will apply cleanly.
bos@106	264 \item Patches that are ``finished'', but not yet submitted, have no
bos@106	265 guards. If I'm applying the patch stack to a copy of the upstream
bos@106	266 tree, I don't need to enable any guards in order to get a reasonably
bos@106	267 safe source tree.
bos@106	268 \item Those patches that need reworking before being resubmitted are
bos@106	269 guarded with \texttt{rework}.
bos@106	270 \item For those patches that are still under development, I use
bos@106	271 \texttt{devel}.
bos@106	272 \item A backport patch may have several guards, one for each version
bos@106	273 of the kernel to which it applies. For example, a patch that
bos@106	274 backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard.
bos@106	275 \end{itemize}
bos@106	276 This variety of guards gives me considerable flexibility in
bos@106	277 determining what kind of source tree I want to end up with. For most
bos@106	278 situations, the selection of appropriate guards is automated during
bos@106	279 the build process, but I can manually tune the guards to use for less
bos@106	280 common circumstances.
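To give a flavour of what automated guard selection might look like, here is a minimal sketch of a helper that a build script could call. The function names and parameters are hypothetical and do not come from my actual build process; the guard names are the ones listed above, and the only Mercurial behaviour relied upon is that \hgcmd{qselect} accepts guard names as arguments and \texttt{--none} to clear the selection.

```python
def guards_for_target(kernel_version=None, upstream_has_accepted=False,
                      include_rework=False, include_devel=False):
    """Return the guard names to select for one build target.

    kernel_version:        a backport target such as "2.6.9", or None
                           when building against the upstream tree.
    upstream_has_accepted: True if the base tree already contains the
                           "accepted" patches, so they must be skipped.
    """
    guards = []
    if not upstream_has_accepted:
        guards.append("accepted")
    if include_rework:
        guards.append("rework")
    if include_devel:
        guards.append("devel")
    if kernel_version is not None:
        guards.append(kernel_version)
    return guards


def qselect_command(guards):
    """Build the argument list for running hg qselect."""
    # With no guards to select, clear the selection instead.
    return ["hg", "qselect"] + (list(guards) if guards else ["--none"])


if __name__ == "__main__":
    print(qselect_command(guards_for_target("2.6.9")))
```

Because \hgcmd{qselect} does not touch patches that are already applied, a script built on this would typically pop all patches before selecting guards, and push them again afterwards.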
bos@106	281
bos@106	282 \subsection{The art of writing backport patches}
bos@106	283
bos@106	284 Using MQ, writing a backport patch is a simple process. All such a
bos@106	285 patch has to do is modify a piece of code that uses a kernel feature
bos@106	286 not present in the older version of the kernel, so that the driver
bos@106	287 continues to work correctly under that older version.
bos@106	288
bos@106	289 A useful goal when writing a good backport patch is to make your code
bos@106	290 look as if it was written for the older version of the kernel you're
bos@106	291 targeting. The less obtrusive the patch, the easier it will be to
bos@106	292 understand and maintain. If you're writing a collection of backport
bos@106	293 patches to avoid the ``rat's nest'' effect of lots of
bos@106	294 \texttt{\#ifdef}s (hunks of source code that are only used
bos@106	295 conditionally) in your code, don't introduce version-dependent
bos@106	296 \texttt{\#ifdef}s into the patches. Instead, write several patches,
bos@106	297 each of which makes unconditional changes, and control their
bos@106	298 application using guards.
bos@106	299
bos@106	300 There are two reasons to divide backport patches into a distinct
bos@106	301 group, away from the ``regular'' patches whose effects they modify.
bos@106	302 The first is that intermingling the two makes it more difficult to use
bos@106	303 a tool like the \hgext{patchbomb} extension to automate the process of
bos@106	304 submitting the patches to an upstream maintainer. The second is that
bos@106	305 a backport patch could perturb the context in which a subsequent
bos@106	306 regular patch is applied, making it impossible to apply the regular
bos@106	307 patch cleanly \emph{without} the earlier backport patch already being
bos@106	308 applied.
bos@106	309
bos@106	310 \section{Useful tips for developing with MQ}
bos@106	311
bos@106	312 \subsection{Organising patches in directories}
bos@106	313
bos@106	314 If you're working on a substantial project with MQ, it's not difficult
bos@106	315 to accumulate a large number of patches. For example, I have one
bos@106	316 patch repository that contains over 250 patches.
bos@106	317
bos@106	318 If you can group these patches into separate logical categories, you
bos@106	319 can, if you like, store them in different directories; MQ has no
bos@106	320 problems with patch names that contain path separators.
bos@106	321
bos@106	322 \subsection{Viewing the history of a patch}
bos@106	323 \label{mq-collab:tips:interdiff}
bos@106	324
bos@106	325 If you're developing a set of patches over a long time, it's a good
bos@106	326 idea to maintain them in a repository, as discussed in
bos@106	327 section~\ref{sec:mq:repo}. If you do so, you'll quickly discover that
bos@106	328 using the \hgcmd{diff} command to look at the history of changes to a
bos@106	329 patch is unworkable. This is in part because you're looking at the
bos@106	330 second derivative of the real code (a diff of a diff), but also
bos@106	331 because MQ adds noise to the process by modifying time stamps and
bos@106	332 directory names when it updates a patch.
bos@106	333
bos@106	334 However, you can use the \hgext{extdiff} extension, which is bundled
bos@106	335 with Mercurial, to turn a diff of two versions of a patch into
bos@106	336 something readable. To do this, you will need a third-party package
bos@106	337 called \package{patchutils}~\cite{web:patchutils}. This provides a
bos@106	338 command named \command{interdiff}, which shows the differences between
bos@106	339 two diffs as a diff. Used on two versions of the same diff, it
bos@106	340 generates a diff that represents the diff from the first to the second
bos@106	341 version.
bos@106	342
bos@106	343 You can enable the \hgext{extdiff} extension in the usual way, by
bos@106	344 adding a line to the \rcsection{extensions} section of your \hgrc.
bos@106	345 \begin{codesample2}
bos@106	346 [extensions]
bos@106	347 extdiff =
bos@106	348 \end{codesample2}
bos@106	349 The \command{interdiff} command expects to be passed the names of two
bos@106	350 files, but the \hgext{extdiff} extension passes the program it runs a
bos@106	351 pair of directories, each of which can contain an arbitrary number of
bos@106	352 files. We thus need a small program that will run \command{interdiff}
bos@106	353 on each pair of files in these two directories. This program is
bos@106	354 available as \sfilename{hg-interdiff} in the \dirname{examples}
bos@106	355 directory of the source code repository that accompanies this book.
bos@106	356 \excode{hg-interdiff}
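The \sfilename{hg-interdiff} listing is the version to use. Purely to illustrate the idea behind it, the sketch below pairs up files by their relative path in the two directories and runs \command{interdiff} on each pair. The function names are hypothetical, and substituting the null device for a file that exists on only one side is an assumption of this sketch.

```python
import os
import subprocess
import sys


def relative_files(base):
    """Map each file's path relative to base onto its full path."""
    if os.path.isfile(base):
        # A single file may be passed instead of a directory.
        return {"": base}
    found = {}
    for root, _dirs, files in os.walk(base):
        for name in files:
            full = os.path.join(root, name)
            found[os.path.relpath(full, base)] = full
    return found


def paired_files(old_dir, new_dir):
    """Pair files by relative path; a missing side becomes os.devnull."""
    old = relative_files(old_dir)
    new = relative_files(new_dir)
    return [(old.get(rel, os.devnull), new.get(rel, os.devnull))
            for rel in sorted(set(old) | set(new))]


def main(argv):
    """Run interdiff on every pair; return 1 if any run fails."""
    status = 0
    for old_path, new_path in paired_files(argv[1], argv[2]):
        if subprocess.call(["interdiff", old_path, new_path]) != 0:
            status = 1
    return status


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

Keeping the pairing logic separate from the call to \command{interdiff} makes it possible to check the pairing without having \package{patchutils} installed.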
bos@106	357
bos@106	358 With the \sfilename{hg-interdiff} program in your shell's search path,
bos@106	359 you can run it as follows, from inside an MQ patch directory:
bos@106	360 \begin{codesample2}
bos@106	361 hg extdiff -p hg-interdiff -r A:B my-change.patch
bos@106	362 \end{codesample2}
bos@106	363 Since you'll probably want to use this long-winded command a lot, you
bos@106	364 can get \hgext{extdiff} to make it available as a normal Mercurial
bos@106	365 command, again by editing your \hgrc.
bos@106	366 \begin{codesample2}
bos@106	367 [extdiff]
bos@106	368 cmd.interdiff = hg-interdiff
bos@106	369 \end{codesample2}
bos@106	370 This directs \hgext{extdiff} to make an \texttt{interdiff} command
bos@106	371 available, so you can now shorten the previous invocation of
bos@106	372 \hgcmd{extdiff} to something a little more wieldy.
bos@106	373 \begin{codesample2}
bos@106	374 hg interdiff -r A:B my-change.patch
bos@106	375 \end{codesample2}
bos@105	376
bos@107	377 \begin{note}
bos@107	378 The \command{interdiff} command works well only if the underlying
bos@107	379 files against which versions of a patch are generated remain the
bos@107	380 same. If you create a patch, modify the underlying files, and then
bos@107	381 regenerate the patch, \command{interdiff} may not produce useful
bos@107	382 output.
bos@107	383 \end{note}
bos@107	384
bos@104	385 %%% Local Variables:
bos@104	386 %%% mode: latex
bos@104	387 %%% TeX-master: "00book"
bos@104	388 %%% End: