\chapter{Advanced uses of Mercurial Queues}
\label{chap:mq-collab}

While it's easy to pick up straightforward uses of Mercurial Queues,
use of a little discipline and some of MQ's less frequently used
capabilities makes it possible to work in complicated development
environments.

In this chapter, I will use as an example a technique I have used to
manage the development of an Infiniband device driver for the Linux
kernel.  The driver in question is large (at least as drivers go),
with 25,000 lines of code spread across 35 source files.  It is
maintained by a small team of developers.

While much of the material in this chapter is specific to Linux, the
same principles apply to any code base for which you're not the
primary owner, and upon which you need to do a lot of development.

\section{The problem of many targets}

The Linux kernel changes rapidly, and has never been internally
stable; developers frequently make drastic changes between releases.
This means that a version of the driver that works well with a
particular released version of the kernel will not even \emph{compile}
correctly against, typically, any other version.

To maintain a driver, we have to keep a number of distinct versions of
Linux in mind.
\begin{itemize}
\item One target is the main Linux kernel development tree.
  Maintenance of the code is in this case partly shared by other
  developers in the kernel community, who make ``drive-by''
  modifications to the driver as they develop and refine kernel
  subsystems.
\item We also maintain a number of ``backports'' to older versions of
  the Linux kernel, to support the needs of customers who are running
  older versions of Linux that do not incorporate our drivers.  (To
  \emph{backport} a piece of code is to modify it to work in an older
  version of its target environment than the version it was developed
  for.)
\item Finally, we make software releases on a schedule that is not
  necessarily aligned with those used by Linux distributors and kernel
  developers, so that we can deliver new features to customers without
  forcing them to upgrade their entire kernels or distributions.
\end{itemize}

\subsection{Tempting approaches that don't work well}

There are two standard ways to maintain a piece of software that has
to target many different environments.

The first is to maintain a number of branches, each intended for a
single target.  The trouble with this approach is that you must
maintain iron discipline in the flow of changes between repositories.
A new feature or bug fix must start life in a ``pristine'' repository,
then percolate out to every backport repository.  Backport changes are
more limited in the branches they should propagate to; a backport
change that is applied to a branch where it doesn't belong will
probably stop the driver from compiling.

The second is to maintain a single source tree filled with conditional
statements that turn chunks of code on or off depending on the
intended target.  Because these ``ifdefs'' are not allowed in the
Linux kernel tree, a manual or automatic process must be followed to
strip them out and yield a clean tree.  A code base maintained in this
fashion rapidly becomes a rat's nest of conditional blocks that are
difficult to understand and maintain.

Neither of these approaches is well suited to a situation where you
don't ``own'' the canonical copy of a source tree.  In the case of a
Linux driver that is distributed with the standard kernel, Linus's
tree contains the copy of the code that will be treated by the world
as canonical.  The upstream version of ``my'' driver can be modified
by people I don't know, without me even finding out about it until
after the changes show up in Linus's tree.

These approaches have the added weakness of making it difficult to
generate well-formed patches to submit upstream.

In principle, Mercurial Queues seems like a good candidate to manage a
development scenario such as the one above.  While this is indeed the
case, MQ contains a few added features that make the job more
pleasant.

\section{Conditionally applying patches with guards}

Perhaps the best way to maintain sanity with so many targets is to be
able to choose specific patches to apply for a given situation.  MQ
provides a feature called ``guards'' (which originates with quilt's
\texttt{guards} command) that does just this.  To start off, let's
create a simple repository for experimenting in.
\interaction{mq.guards.init}
This gives us a tiny repository that contains two patches that don't
have any dependencies on each other, because they touch different files.

The idea behind conditional application is that you can ``tag'' a
patch with a \emph{guard}, which is simply a text string of your
choosing, then tell MQ to select specific guards to use when applying
patches.  MQ will then either apply, or skip over, a guarded patch,
depending on the guards that you have selected.

A patch can have an arbitrary number of guards;
each one is \emph{positive} (``apply this patch if this guard is
selected'') or \emph{negative} (``skip this patch if this guard is
selected'').  A patch with no guards is always applied.

\section{Controlling the guards on a patch}

The \hgxcmd{mq}{qguard} command lets you determine which guards should
apply to a patch, or display the guards that are already in effect.
Without any arguments, it displays the guards on the current topmost
patch.
\interaction{mq.guards.qguard}
To set a positive guard on a patch, prefix the name of the guard with
a ``\texttt{+}''.
\interaction{mq.guards.qguard.pos}
To set a negative guard on a patch, prefix the name of the guard with
a ``\texttt{-}''.
\interaction{mq.guards.qguard.neg}

\begin{note}
  The \hgxcmd{mq}{qguard} command \emph{sets} the guards on a patch; it
  doesn't \emph{modify} them.  What this means is that if you run
  \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on
  the same patch, the \emph{only} guard that will be set on it
  afterwards is \texttt{+c}.
\end{note}
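
As a rough sketch of this behaviour (the patch name
\texttt{my-change.patch} is hypothetical), the commands below replace
the guards on a patch rather than adding to them.  The \texttt{--}
separator keeps a negative guard from being read as a command-line
option, and \texttt{--none} drops every guard from the patch.
\begin{codesample2}
  hg qguard my-change.patch +a +b     # guards are now +a +b
  hg qguard my-change.patch +c        # guards are now +c only
  hg qguard my-change.patch +a +b +c  # to keep all three, list them all
  hg qguard my-change.patch -- -quux  # a negative guard, given after "--"
  hg qguard --none my-change.patch    # remove every guard from the patch
\end{codesample2}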

Mercurial stores guards in the \sfilename{series} file; the form in
which they are stored is easy both to understand and to edit by hand.
(In other words, you don't have to use the \hgxcmd{mq}{qguard} command if
you don't want to; it's okay to simply edit the \sfilename{series}
file.)
\interaction{mq.guards.series}
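
For illustration, a \sfilename{series} file carrying guards might look
like the following sketch; the patch and guard names are made up.
Each guard follows the patch name as a \texttt{\#} immediately followed
by \texttt{+} or \texttt{-} and the name of the guard, and a patch
with nothing after its name is unguarded.
\begin{codesample2}
  hello.patch #-quux
  goodbye.patch #+foo
  another.patch #+foo #-bar
  plain.patch
\end{codesample2}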

\section{Selecting the guards to use}

The \hgxcmd{mq}{qselect} command determines which guards are active at a
given time.  The effect of this is to determine which patches MQ will
apply the next time you run \hgxcmd{mq}{qpush}.  It has no other effect; in
particular, it doesn't do anything to patches that are already
applied.

With no arguments, the \hgxcmd{mq}{qselect} command lists the guards
currently in effect, one per line of output.  Each argument is treated
as the name of a guard to apply.
\interaction{mq.guards.qselect.foo}
In case you're interested, the currently selected guards are stored in
the \sfilename{guards} file.
\interaction{mq.guards.qselect.cat}
We can see the effect the selected guards have when we run
\hgxcmd{mq}{qpush}.
\interaction{mq.guards.qselect.qpush}

A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}''
character.  The name of a guard must not contain white space, but most
other characters are acceptable.  If you try to use a guard with an
invalid name, MQ will complain:
\interaction{mq.guards.qselect.error}
Changing the selected guards changes the patches that are applied.
\interaction{mq.guards.qselect.quux}
You can see in the example below that negative guards take precedence
over positive guards.
\interaction{mq.guards.qselect.foobar}
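
Because \hgxcmd{mq}{qselect} leaves applied patches alone, switching
guards usually means popping the stack and pushing it again.  A
minimal sketch of that cycle, with \texttt{foo} standing in for
whatever guard you want to select:
\begin{codesample2}
  hg qpop -a          # unapply every applied patch
  hg qselect foo      # make "foo" the only selected guard
  hg qpush -a         # apply every patch the selected guards allow
  hg qselect          # list the guards currently selected
  hg qselect --none   # deselect all guards again
\end{codesample2}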

\section{MQ's rules for applying patches}

The rules that MQ uses when deciding whether to apply a patch
are as follows.
\begin{itemize}
\item A patch that has no guards is always applied.
\item If the patch has any negative guard that matches any currently
  selected guard, the patch is skipped.
\item If the patch has any positive guard that matches any currently
  selected guard, the patch is applied.
\item If the patch has positive guards, but none matches any currently
  selected guard, the patch is skipped.
\item If the patch has only negative guards, and none matches any
  currently selected guard, the patch is applied.
\end{itemize}
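
As a worked example of these rules, suppose the \sfilename{series}
file lists the hypothetical patches below, and the selected guards are
\texttt{foo} and \texttt{quux}.  The remark after each entry is an
annotation for this example, stating what MQ would be expected to do
with that patch.
\begin{codesample2}
  a.patch               # no guards: applied
  b.patch #+foo         # positive guard foo is selected: applied
  c.patch #-quux        # negative guard quux is selected: skipped
  d.patch #+foo #-quux  # the negative match takes precedence: skipped
  e.patch #+bar         # positive guard bar is not selected: skipped
\end{codesample2}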

\section{Trimming the work environment}

In working on the device driver I mentioned earlier, I don't apply the
patches to a normal Linux kernel tree.  Instead, I use a repository
that contains only a snapshot of the source files and headers that are
relevant to Infiniband development.  This repository is~1\% the size
of a kernel repository, so it's easier to work with.

I then choose a ``base'' version on top of which the patches are
applied.  This is a snapshot of the Linux kernel tree as of a revision
of my choosing.  When I take the snapshot, I record the changeset ID
from the kernel repository in the commit message.  Since the snapshot
preserves the ``shape'' and content of the relevant parts of the
kernel tree, I can apply my patches on top of either my tiny
repository or a normal kernel tree.

Normally, the base tree atop which the patches apply should be a
snapshot of a very recent upstream tree.  This best facilitates the
development of patches that can easily be submitted upstream with few
or no modifications.

\section{Dividing up the \sfilename{series} file}

I categorise the patches in the \sfilename{series} file into a number
of logical groups.  Each section of like patches begins with a block
of comments that describes the purpose of the patches that follow.

The sequence of patch groups that I maintain follows.  The ordering of
these groups is important; I'll describe why after I introduce the
groups.
\begin{itemize}
\item The ``accepted'' group.  Patches that the development team has
  submitted to the maintainer of the Infiniband subsystem, and which
  he has accepted, but which are not present in the snapshot that the
  tiny repository is based on.  These are ``read only'' patches,
  present only to transform the tree into a similar state as it is in
  the upstream maintainer's repository.
\item The ``rework'' group.  Patches that I have submitted, but that
  the upstream maintainer has requested modifications to before he
  will accept them.
\item The ``pending'' group.  Patches that I have not yet submitted to
  the upstream maintainer, but which we have finished working on.
  These will be ``read only'' for a while.  If the upstream maintainer
  accepts them upon submission, I'll move them to the end of the
  ``accepted'' group.  If he requests that I modify any, I'll move
  them to the beginning of the ``rework'' group.
\item The ``in progress'' group.  Patches that are actively being
  developed, and should not be submitted anywhere yet.
\item The ``backport'' group.  Patches that adapt the source tree to
  older versions of the kernel tree.
\item The ``do not ship'' group.  Patches that for some reason should
  never be submitted upstream.  For example, one such patch might
  change embedded driver identification strings to make it easier to
  distinguish, in the field, between an out-of-tree version of the
  driver and a version shipped by a distribution vendor.
\end{itemize}

Now to return to the reasons for ordering groups of patches in this
way.  We would like the lowest patches in the stack to be as stable as
possible, so that we will not need to rework higher patches due to
changes in context.  Putting patches that will never be changed first
in the \sfilename{series} file serves this purpose.

We would also like the patches that we know we'll need to modify to be
applied on top of a source tree that resembles the upstream tree as
closely as possible.  This is why we keep accepted patches around for
a while.

The ``backport'' and ``do not ship'' patches float at the end of the
\sfilename{series} file.  The backport patches must be applied on top
of all other patches, and the ``do not ship'' patches might as well
stay out of harm's way.

\section{Maintaining the patch series}

In my work, I use a number of guards to control which patches are to
be applied.

\begin{itemize}
\item ``Accepted'' patches are guarded with \texttt{accepted}.  I
  enable this guard most of the time.  When I'm applying the patches
  on top of a tree where the patches are already present, I can turn
  this guard off, and the patches that follow it will apply cleanly.
\item Patches that are ``finished'', but not yet submitted, have no
  guards.  If I'm applying the patch stack to a copy of the upstream
  tree, I don't need to enable any guards in order to get a reasonably
  safe source tree.
\item Those patches that need reworking before being resubmitted are
  guarded with \texttt{rework}.
\item For those patches that are still under development, I use
  \texttt{devel}.
\item A backport patch may have several guards, one for each version
  of the kernel to which it applies.  For example, a patch that
  backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard.
\end{itemize}
This variety of guards gives me considerable flexibility in
determining what kind of source tree I want to end up with.  For most
situations, the selection of appropriate guards is automated during
the build process, but I can manually tune the guards to use for less
common circumstances.
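
A sketch of how a \sfilename{series} file organised along these lines
might look.  The patch names are hypothetical; lines that begin with
\texttt{\#} are comments, and a guard is attached to a patch as
\texttt{\#+name}.
\begin{codesample2}
  # Accepted upstream, but not yet in the base snapshot.
  fix-locking.patch #+accepted
  # Submitted, needs rework before resubmission.
  rework-init.patch #+rework
  # Finished, not yet submitted.
  add-counters.patch
  # Still under development.
  new-feature.patch #+devel
  # Backports.
  compat-2.6.9.patch #+2.6.9
\end{codesample2}
With such a layout, preparing a tree for a 2.6.9 kernel could, under
these assumptions, start from something like the following.
\begin{codesample2}
  hg qpop -a
  hg qselect accepted 2.6.9
  hg qpush -a
\end{codesample2}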

\subsection{The art of writing backport patches}

Using MQ, writing a backport patch is a simple process.  All such a
patch has to do is modify a piece of code that uses a kernel feature
not present in the older version of the kernel, so that the driver
continues to work correctly under that older version.

A useful goal when writing a good backport patch is to make your code
look as if it was written for the older version of the kernel you're
targeting.  The less obtrusive the patch, the easier it will be to
understand and maintain.  If you're writing a collection of backport
patches to avoid the ``rat's nest'' effect of lots of
\texttt{\#ifdef}s (hunks of source code that are only used
conditionally) in your code, don't introduce version-dependent
\texttt{\#ifdef}s into the patches.  Instead, write several patches,
each of which makes unconditional changes, and control their
application using guards.

There are two reasons to divide backport patches into a distinct
group, away from the ``regular'' patches whose effects they modify.
The first is that intermingling the two makes it more difficult to use
a tool like the \hgext{patchbomb} extension to automate the process of
submitting the patches to an upstream maintainer.  The second is that
a backport patch could perturb the context in which a subsequent
regular patch is applied, making it impossible to apply the regular
patch cleanly \emph{without} the earlier backport patch already being
applied.

\section{Useful tips for developing with MQ}

\subsection{Organising patches in directories}

If you're working on a substantial project with MQ, it's not difficult
to accumulate a large number of patches.  For example, I have one
patch repository that contains over 250 patches.

If you can group these patches into separate logical categories, you
can, if you like, store them in different directories; MQ has no
problems with patch names that contain path separators.

\subsection{Viewing the history of a patch}
\label{mq-collab:tips:interdiff}

If you're developing a set of patches over a long time, it's a good
idea to maintain them in a repository, as discussed in
section~\ref{sec:mq:repo}.  If you do so, you'll quickly discover that
using the \hgcmd{diff} command to look at the history of changes to a
patch is unworkable.  This is in part because you're looking at the
second derivative of the real code (a diff of a diff), but also
because MQ adds noise to the process by modifying time stamps and
directory names when it updates a patch.

However, you can use the \hgext{extdiff} extension, which is bundled
with Mercurial, to turn a diff of two versions of a patch into
something readable.  To do this, you will need a third-party package
called \package{patchutils}~\cite{web:patchutils}.  This provides a
command named \command{interdiff}, which shows the differences between
two diffs as a diff.  Used on two versions of the same diff, it
generates a diff that represents the diff from the first to the second
version.

You can enable the \hgext{extdiff} extension in the usual way, by
adding a line to the \rcsection{extensions} section of your \hgrc.
\begin{codesample2}
  [extensions]
  extdiff =
\end{codesample2}
The \command{interdiff} command expects to be passed the names of two
files, but the \hgext{extdiff} extension passes the program it runs a
pair of directories, each of which can contain an arbitrary number of
files.  We thus need a small program that will run \command{interdiff}
on each pair of files in these two directories.  This program is
available as \sfilename{hg-interdiff} in the \dirname{examples}
directory of the source code repository that accompanies this book.
\excode{hg-interdiff}

With the \sfilename{hg-interdiff} program in your shell's search path,
you can run it as follows, from inside an MQ patch directory:
\begin{codesample2}
  hg extdiff -p hg-interdiff -r A:B my-change.patch
\end{codesample2}
Since you'll probably want to use this long-winded command a lot, you
can get \hgext{extdiff} to make it available as a normal Mercurial
command, again by editing your \hgrc.
\begin{codesample2}
  [extdiff]
  cmd.interdiff = hg-interdiff
\end{codesample2}
This directs \hgext{extdiff} to make an \texttt{interdiff} command
available, so you can now shorten the previous invocation of
\hgxcmd{extdiff}{extdiff} to something a little less unwieldy.
\begin{codesample2}
  hg interdiff -r A:B my-change.patch
\end{codesample2}

\begin{note}
  The \command{interdiff} command works well only if the underlying
  files against which versions of a patch are generated remain the
  same.  If you create a patch, modify the underlying files, and then
  regenerate the patch, \command{interdiff} may not produce useful
  output.
\end{note}

The \hgext{extdiff} extension is useful for more than merely improving
the presentation of MQ~patches.  To read more about it, go to
section~\ref{sec:hgext:extdiff}.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: