% hgbook: en/mq-collab.tex @ 106:9cbc5d0db542
% Finish off advanced MQ chapter (maybe).
% Author: Bryan O'Sullivan <bos@serpentine.com>
% Date: Mon Oct 23 15:43:04 2006 -0700 (2006-10-23)

\chapter{Advanced uses of Mercurial Queues}

While it's easy to pick up straightforward uses of Mercurial Queues,
use of a little discipline and some of MQ's less frequently used
capabilities makes it possible to work in complicated development
environments.

In this chapter, I will use as an example a technique I have used to
manage the development of an Infiniband device driver for the Linux
kernel.  The driver in question is large (at least as drivers go),
with 25,000 lines of code spread across 35 source files.  It is
maintained by a small team of developers.

While much of the material in this chapter is specific to Linux, the
same principles apply to any code base for which you're not the
primary owner, and upon which you need to do a lot of development.

\section{The problem of many targets}

The Linux kernel changes rapidly, and has never been internally
stable; developers frequently make drastic changes between releases.
This means that a version of the driver that works well with a
particular released version of the kernel will typically not even
\emph{compile} correctly against any other version.

To maintain a driver, we have to keep a number of distinct versions of
Linux in mind.
\begin{itemize}
\item One target is the main Linux kernel development tree.
  Maintenance of the code is in this case partly shared by other
  developers in the kernel community, who make ``drive-by''
  modifications to the driver as they develop and refine kernel
  subsystems.
\item We also maintain a number of ``backports'' to older versions of
  the Linux kernel, to support the needs of customers who are running
  older Linux distributions that do not incorporate our drivers.  (To
  \emph{backport} a piece of code is to modify it to work in an older
  version of its target environment than the version it was developed
  for.)
\item Finally, we make software releases on a schedule that is
  necessarily not aligned with those used by Linux distributors and
  kernel developers, so that we can deliver new features to customers
  without forcing them to upgrade their entire kernels or
  distributions.
\end{itemize}

\subsection{Tempting approaches that don't work well}

There are two ``standard'' ways to maintain a piece of software that
has to target many different environments.

The first is to maintain a number of branches, each intended for a
single target.  The trouble with this approach is that you must
maintain iron discipline in the flow of changes between repositories.
A new feature or bug fix must start life in a ``pristine'' repository,
then percolate out to every backport repository.  Backport changes are
more limited in the branches they should propagate to; a backport
change that is applied to a branch where it doesn't belong will
probably stop the driver from compiling.

The second is to maintain a single source tree filled with conditional
statements that turn chunks of code on or off depending on the
intended target.  Because these ``ifdefs'' are not allowed in the
Linux kernel tree, a manual or automatic process must be followed to
strip them out and yield a clean tree.  A code base maintained in this
fashion rapidly becomes a rat's nest of conditional blocks that are
difficult to understand and maintain.

Neither of these approaches is well suited to a situation where you
don't ``own'' the canonical copy of a source tree.  In the case of a
Linux driver that is distributed with the standard kernel, Linus's
tree contains the copy of the code that will be treated by the world
as canonical.  The upstream version of ``my'' driver can be modified
by people I don't know, without me even finding out about it until
after the changes show up in Linus's tree.

These approaches have the added weakness of making it difficult to
generate well-formed patches to submit upstream.

In principle, Mercurial Queues seems like a good candidate to manage a
development scenario such as the above.  While this is indeed the
case, MQ contains a few added features that make the job more
pleasant.

\section{Conditionally applying patches with guards}

Perhaps the best way to maintain sanity with so many targets is to be
able to choose specific patches to apply for a given situation.  MQ
provides a feature called ``guards'' (which originates with quilt's
\texttt{guards} command) that does just this.  To start off, let's
create a simple repository for experimenting in.
\interaction{mq.guards.init}
This gives us a tiny repository that contains two patches that don't
have any dependencies on each other, because they touch different files.

The idea behind conditional application is that you can ``tag'' a
patch with a \emph{guard}, which is simply a text string of your
choosing, then tell MQ to select specific guards to use when applying
patches.  MQ will then either apply, or skip over, a guarded patch,
depending on the guards that you have selected.

A patch can have an arbitrary number of guards;
each one is \emph{positive} (``apply this patch if this guard is
selected'') or \emph{negative} (``skip this patch if this guard is
selected'').  A patch with no guards is always applied.

\section{Controlling the guards on a patch}

The \hgcmd{qguard} command lets you determine which guards should
apply to a patch, or display the guards that are already in effect.
Without any arguments, it displays the guards on the current topmost
patch.
\interaction{mq.guards.qguard}
To set a positive guard on a patch, prefix the name of the guard with
a ``\texttt{+}''.
\interaction{mq.guards.qguard.pos}
To set a negative guard on a patch, prefix the name of the guard with
a ``\texttt{-}''.
\interaction{mq.guards.qguard.neg}

\begin{note}
  The \hgcmd{qguard} command \emph{sets} the guards on a patch; it
  doesn't \emph{modify} them.  What this means is that if you run
  \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on
  the same patch, the \emph{only} guard that will be set on it
  afterwards is \texttt{+c}.
\end{note}

Mercurial stores guards in the \sfilename{series} file; the form in
which they are stored is easy both to understand and to edit by hand.
(In other words, you don't have to use the \hgcmd{qguard} command if
you don't want to; it's okay to simply edit the \sfilename{series}
file.)
\interaction{mq.guards.series}
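
To make the stored form concrete, here is a minimal Python sketch that
splits one line of a \sfilename{series} file into a patch name and its
guards.  It assumes that each guard follows the patch name and is
written as ``\texttt{\#+name}'' or ``\texttt{\#-name}''; the function
name \texttt{parse\_series\_line} is hypothetical, and this is an
illustration of the file layout, not MQ's own parser.

```python
import re

# Assumed series-file layout: a patch name, then zero or more guards,
# each written as "#+name" or "#-name" and separated by white space.
GUARD_RE = re.compile(r"#([-+][^\s#]+)")


def parse_series_line(line):
    """Split one series-file line into (patch_name, [guards])."""
    # Everything before the first "#" is taken to be the patch name.
    name = line.split("#", 1)[0].strip()
    # Each guard keeps its leading "+" or "-" so its sense is preserved.
    guards = GUARD_RE.findall(line)
    return name, guards


print(parse_series_line("hello.patch #+foo #-quux"))
print(parse_series_line("goodbye.patch"))
```

A line with no guards simply yields an empty list, which corresponds
to a patch that is always applied.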

\section{Selecting the guards to use}

The \hgcmd{qselect} command determines which guards are active at a
given time.  The effect of this is to determine which patches MQ will
apply the next time you run \hgcmd{qpush}.  It has no other effect; in
particular, it doesn't do anything to patches that are already
applied.

With no arguments, the \hgcmd{qselect} command lists the guards
currently in effect, one per line of output.  Each argument is treated
as the name of a guard to apply.
\interaction{mq.guards.qselect.foo}
In case you're interested, the currently selected guards are stored in
the \sfilename{guards} file.
\interaction{mq.guards.qselect.cat}
We can see the effect the selected guards have when we run
\hgcmd{qpush}.
\interaction{mq.guards.qselect.qpush}

A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}''
character.  The name of a guard must not contain white space, but most
other characters are acceptable.  If you try to use a guard with an
invalid name, MQ will complain:
\interaction{mq.guards.qselect.error}
Changing the selected guards changes the patches that are applied.
\interaction{mq.guards.qselect.quux}
You can see in the example below that negative guards take precedence
over positive guards.
\interaction{mq.guards.qselect.foobar}
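
The two naming rules given above for selected guards can be written
down as a small check.  The following Python sketch is only an
illustration: the function name \texttt{is\_valid\_guard\_name} is
hypothetical, rejecting an empty name is an assumption, and MQ itself
may reject further characters that are not covered here.

```python
def is_valid_guard_name(name):
    """Check a guard name against the two rules stated in the text."""
    # Assumption: an empty string is not a usable guard name.
    if not name:
        return False
    # Rule 1: the name must not start with "+" or "-".
    if name[0] in "+-":
        return False
    # Rule 2: the name must not contain white space.
    return not any(ch.isspace() for ch in name)


for candidate in ["foo", "2.6.9", "+foo", "-quux", "two words"]:
    print(candidate, is_valid_guard_name(candidate))
```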

\section{MQ's rules for applying patches}

The rules that MQ uses when deciding whether to apply a patch
are as follows.
\begin{itemize}
\item A patch that has no guards is always applied.
\item If the patch has any negative guard that matches any currently
  selected guard, the patch is skipped.
\item If the patch has any positive guard that matches any currently
  selected guard, the patch is applied.
\item If the patch has positive guards, but none matches any
  currently selected guard, the patch is skipped.
\item If the patch has only negative guards, and none matches any
  currently selected guard, the patch is applied.
\end{itemize}
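
The decision procedure described above can be modelled in a few lines
of Python.  This is a sketch, not MQ's code, and the function name
\texttt{should\_apply} is hypothetical.  It checks negative guards
first, so they take precedence over positive ones.  It also assumes
that a patch carrying only negative guards, none of which is selected,
is applied; this follows the earlier description of a negative guard
as ``skip this patch if this guard is selected''.

```python
def should_apply(patch_guards, selected):
    """Return True if a patch with these guards would be applied.

    patch_guards: guards on the patch, e.g. ["+foo", "-bar"]
    selected: set of currently selected guard names, e.g. {"foo"}
    """
    # A patch with no guards is always applied.
    if not patch_guards:
        return True
    negative = {g[1:] for g in patch_guards if g.startswith("-")}
    positive = {g[1:] for g in patch_guards if g.startswith("+")}
    # Any selected negative guard skips the patch (takes precedence).
    if negative & selected:
        return False
    # With positive guards present, at least one must be selected.
    if positive:
        return bool(positive & selected)
    # Assumption: only negative guards, none selected, so apply.
    return True


print(should_apply(["+foo"], {"foo"}))
print(should_apply(["+foo", "-bar"], {"foo", "bar"}))
```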

\section{Trimming the work environment}

In working on the device driver I mentioned earlier, I don't apply the
patches to a normal Linux kernel tree.  Instead, I use a repository
that contains only a snapshot of the source files and headers that are
relevant to Infiniband development.  This repository is~1\% the size
of a kernel repository, so it's easier to work with.

I then choose a ``base'' version on top of which the patches are
applied.  This is a snapshot of the Linux kernel tree as of a revision
of my choosing.  When I take the snapshot, I record the changeset ID
from the kernel repository in the commit message.  Since the snapshot
preserves the ``shape'' and content of the relevant parts of the
kernel tree, I can apply my patches on top of either my tiny
repository or a normal kernel tree.

Normally, the base tree atop which the patches apply should be a
snapshot of a very recent upstream tree.  This best facilitates the
development of patches that can easily be submitted upstream with few
or no modifications.

\section{Dividing up the \sfilename{series} file}

I categorise the patches in the \sfilename{series} file into a number
of logical groups.  Each section of like patches begins with a block
of comments that describes the purpose of the patches that follow.

The sequence of patch groups that I maintain follows.  The ordering of
these groups is important; I'll describe why after I introduce the
groups.
\begin{itemize}
\item The ``accepted'' group.  Patches that the development team has
  submitted to the maintainer of the Infiniband subsystem, and which
  he has accepted, but which are not present in the snapshot that the
  tiny repository is based on.  These are ``read only'' patches,
  present only to transform the tree into a state similar to that of
  the upstream maintainer's repository.
\item The ``rework'' group.  Patches that I have submitted, but that
  the upstream maintainer has requested modifications to before he
  will accept them.
\item The ``pending'' group.  Patches that I have not yet submitted to
  the upstream maintainer, but which we have finished working on.
  These will be ``read only'' for a while.  If the upstream maintainer
  accepts them upon submission, I'll move them to the end of the
  ``accepted'' group.  If he requests that I modify any, I'll move
  them to the beginning of the ``rework'' group.
\item The ``in progress'' group.  Patches that are actively being
  developed, and should not be submitted anywhere yet.
\item The ``backport'' group.  Patches that adapt the source tree to
  older versions of the kernel tree.
\item The ``do not ship'' group.  Patches that for some reason should
  never be submitted upstream.  For example, one such patch might
  change embedded driver identification strings to make it easier to
  distinguish, in the field, between an out-of-tree version of the
  driver and a version shipped by a distribution vendor.
\end{itemize}

Now to return to the reasons for ordering groups of patches in this
way.  We would like the lowest patches in the stack to be as stable as
possible, so that we will not need to rework higher patches due to
changes in context.  Putting patches that will never be changed first
in the \sfilename{series} file serves this purpose.

We would also like the patches that we know we'll need to modify to be
applied on top of a source tree that resembles the upstream tree as
closely as possible.  This is why we keep accepted patches around for
a while.

The ``backport'' and ``do not ship'' patches float at the end of the
\sfilename{series} file.  The backport patches must be applied on top
of all other patches, and the ``do not ship'' patches might as well
stay out of harm's way.

\section{Maintaining the patch series}

In my work, I use a number of guards to control which patches are to
be applied.

\begin{itemize}
\item ``Accepted'' patches are guarded with \texttt{accepted}.  I
  enable this guard most of the time.  When I'm applying the patches
  on top of a tree where the patches are already present, I can turn
  this guard off, and the patches that follow it will apply cleanly.
\item Patches that are ``finished'', but not yet submitted, have no
  guards.  If I'm applying the patch stack to a copy of the upstream
  tree, I don't need to enable any guards in order to get a reasonably
  safe source tree.
\item Those patches that need reworking before being resubmitted are
  guarded with \texttt{rework}.
\item For those patches that are still under development, I use
  \texttt{devel}.
\item A backport patch may have several guards, one for each version
  of the kernel to which it applies.  For example, a patch that
  backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard.
\end{itemize}
This variety of guards gives me considerable flexibility in
determining what kind of source tree I want to end up with.  For most
situations, the selection of appropriate guards is automated during
the build process, but I can manually tune the guards to use for less
common circumstances.
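
As an illustration of how guard selection might be automated, the
Python sketch below maps a build target to the guards to select and
produces the commands to run.  The target names and the
\texttt{TARGET\_GUARDS} table are hypothetical, not taken from the
build process described here.  Because \hgcmd{qselect} does not touch
patches that are already applied, the sketch pops all patches first,
selects guards, then pushes everything again.  It only builds the
command lines; it does not run them.

```python
# Hypothetical mapping from build target to the guards to select.
TARGET_GUARDS = {
    "upstream": [],                       # accepted patches already present
    "release": ["accepted"],
    "backport-2.6.9": ["accepted", "2.6.9"],
}


def commands_for(target):
    """Return the hg command lines (as argument lists) for a target."""
    guards = TARGET_GUARDS[target]
    # "hg qselect --none" deselects every guard.
    select = ["hg", "qselect"] + (guards if guards else ["--none"])
    return [
        ["hg", "qpop", "--all"],   # unapply everything first
        select,                    # choose the guards for this target
        ["hg", "qpush", "--all"],  # reapply whatever the guards allow
    ]


for argv in commands_for("backport-2.6.9"):
    print(" ".join(argv))
```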

\subsection{The art of writing backport patches}

Using MQ, writing a backport patch is a simple process.  All such a
patch has to do is modify a piece of code that uses a kernel feature
not present in the older version of the kernel, so that the driver
continues to work correctly under that older version.

A useful goal when writing a good backport patch is to make your code
look as if it was written for the older version of the kernel you're
targeting.  The less obtrusive the patch, the easier it will be to
understand and maintain.  If you're writing a collection of backport
patches to avoid the ``rat's nest'' effect of lots of
\texttt{\#ifdef}s (hunks of source code that are only used
conditionally) in your code, don't introduce version-dependent
\texttt{\#ifdef}s into the patches.  Instead, write several patches,
each of which makes unconditional changes, and control their
application using guards.

There are two reasons to divide backport patches into a distinct
group, away from the ``regular'' patches whose effects they modify.
The first is that intermingling the two makes it more difficult to use
a tool like the \hgext{patchbomb} extension to automate the process of
submitting the patches to an upstream maintainer.  The second is that
a backport patch could perturb the context in which a subsequent
regular patch is applied, making it impossible to apply the regular
patch cleanly \emph{without} the earlier backport patch already being
applied.

\section{Useful tips for developing with MQ}

\subsection{Organising patches in directories}

If you're working on a substantial project with MQ, it's not difficult
to accumulate a large number of patches.  For example, I have one
patch repository that contains over 250 patches.

If you can group these patches into separate logical categories, you
can, if you like, store them in different directories; MQ has no
problems with patch names that contain path separators.

\subsection{Viewing the history of a patch}
\label{mq-collab:tips:interdiff}

If you're developing a set of patches over a long time, it's a good
idea to maintain them in a repository, as discussed in
section~\ref{sec:mq:repo}.  If you do so, you'll quickly discover that
using the \hgcmd{diff} command to look at the history of changes to a
patch is unworkable.  This is in part because you're looking at the
second derivative of the real code (a diff of a diff), but also
because MQ adds noise to the process by modifying time stamps and
directory names when it updates a patch.

However, you can use the \hgext{extdiff} extension, which is bundled
with Mercurial, to turn a diff of two versions of a patch into
something readable.  To do this, you will need a third-party package
called \package{patchutils}~\cite{web:patchutils}.  This provides a
command named \command{interdiff}, which shows the differences between
two diffs as a diff.  Used on two versions of the same diff, it
generates a diff that represents the diff from the first to the second
version.

You can enable the \hgext{extdiff} extension in the usual way, by
adding a line to the \rcsection{extensions} section of your \hgrc.
\begin{codesample2}
  [extensions]
  extdiff =
\end{codesample2}

The \command{interdiff} command expects to be passed the names of two
files, but the \hgext{extdiff} extension passes the program it runs a
pair of directories, each of which can contain an arbitrary number of
files.  We thus need a small program that will run \command{interdiff}
on each pair of files in these two directories.  This program is
available as \sfilename{hg-interdiff} in the \dirname{examples}
directory of the source code repository that accompanies this book.
\excode{hg-interdiff}
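
The pairing step that such a wrapper has to perform can be sketched in
Python as follows.  This is an illustration written for this
discussion, not the \sfilename{hg-interdiff} program itself.  The
function names are hypothetical, and substituting the null device for
a file that exists on only one side is an assumption.  The sketch
builds the \command{interdiff} command lines without running them.

```python
import os


def relative_files(base):
    """Return the set of file paths below base, relative to base."""
    found = set()
    for root, _dirs, files in os.walk(base):
        for f in files:
            found.add(os.path.relpath(os.path.join(root, f), base))
    return found


def interdiff_commands(dir_a, dir_b):
    """Build one interdiff command line per file name in either tree."""
    names = sorted(relative_files(dir_a) | relative_files(dir_b))

    def pick(base, name):
        # Assumption: use the null device when one side lacks the file.
        path = os.path.join(base, name)
        return path if os.path.exists(path) else os.devnull

    return [["interdiff", pick(dir_a, n), pick(dir_b, n)] for n in names]
```

A real wrapper would take the two directory names from its command
line and run each command in turn, for example with
\texttt{subprocess.call}.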

With the \sfilename{hg-interdiff} program in your shell's search path,
you can run it as follows, from inside an MQ patch directory:
\begin{codesample2}
  hg extdiff -p hg-interdiff -r A:B my-change.patch
\end{codesample2}

Since you'll probably want to use this long-winded command a lot, you
can get \hgext{extdiff} to make it available as a normal Mercurial
command, again by editing your \hgrc.
\begin{codesample2}
  [extdiff]
  cmd.interdiff = hg-interdiff
\end{codesample2}

This directs \hgext{extdiff} to make an \texttt{interdiff} command
available, so you can now shorten the previous invocation of
\hgcmd{extdiff} to something a little more wieldy.
\begin{codesample2}
  hg interdiff -r A:B my-change.patch
\end{codesample2}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: