# HG changeset patch # User Bryan O'Sullivan # Date 1163463588 28800 # Node ID b74102b56df5cfe9e2609de15321e4a323c1b3a5 # Parent ccff2b25478e499c3229893243557027effa50be Wow! Lots more work detailing the working directory, merging, etc. diff -r ccff2b25478e -r b74102b56df5 en/Makefile --- a/en/Makefile Mon Nov 13 14:34:57 2006 -0800 +++ b/en/Makefile Mon Nov 13 16:19:48 2006 -0800 @@ -33,7 +33,10 @@ tour-merge-pull.svg \ tour-merge-sep-repos.svg \ wdir.svg \ - wdir-after-commit.svg + wdir-after-commit.svg \ + wdir-branch.svg \ + wdir-merge.svg \ + wdir-pre-branch.svg image-svg := $(filter %.svg,$(image-sources)) diff -r ccff2b25478e -r b74102b56df5 en/concepts.tex --- a/en/concepts.tex Mon Nov 13 14:34:57 2006 -0800 +++ b/en/concepts.tex Mon Nov 13 16:19:48 2006 -0800 @@ -204,6 +204,35 @@ after the corrupted section. This would not be possible with a delta-only storage model. +\section{Revision history, branching, + and merging} + +Every entry in a Mercurial revlog knows the identity of its immediate +ancestor revision, usually referred to as its \emph{parent}. In fact, +a revision contains room for not one parent, but two. Mercurial uses +a special hash, called the ``null ID'', to represent the idea ``there +is no parent here''. This hash is simply a string of zeroes. + +In figure~\ref{fig:concepts:revlog}, you can see an example of the +conceptual structure of a revlog. Filelogs, manifests, and changelogs +all have this same structure; they differ only in the kind of data +stored in each delta or snapshot. + +The first revision in a revlog (at the bottom of the image) has the +null ID in both of its parent slots. For a ``normal'' revision, its +first parent slot contains the ID of its parent revision, and its +second contains the null ID, indicating that the revision has only one +real parent. Any two revisions that have the same parent ID are +branches. A revision that represents a merge between branches has two +normal revision IDs in its parent slots. + +\begin{figure}[ht] + \centering + \grafix{revlog} + \caption{} + \label{fig:concepts:revlog} +\end{figure} + \section{The working directory} In the working directory, Mercurial stores a snapshot of the files @@ -266,58 +295,117 @@ After a commit, Mercurial will update the parents of the working directory, so that the first parent is the ID of the new changeset, -and the second is the null ID. This is illustrated in -figure~\ref{fig:concepts:wdir-after-commit}. - -\subsection{Other contents of the dirstate} - -Because Mercurial doesn't force you to tell it when you're modifying a -file, it uses the dirstate to store some extra information so it can -determine efficiently whether you have modified a file. For each file -in the working directory, it stores the time that it last modified the -file itself, and the size of the file at that time. - -When you explicitly \hgcmd{add}, \hgcmd{remove}, \hgcmd{rename} or -\hgcmd{copy} files, the dirstate is updated each time. - -When Mercurial is checking the states of files in the working -directory, it first checks a file's modification time. If that has -not changed, the file must not have been modified. If the file's size -has changed, the file must have been modified. If the modification -time has changed, but the size has not, only then does Mercurial need -to read the actual contents of the file to see if they've changed. -Storing these few extra pieces of information dramatically reduces the -amount of data that Mercurial needs to read, which yields large -performance improvements compared to other revision control systems. - -\section{Revision history, branching, - and merging} - -Every entry in a Mercurial revlog knows the identity of its immediate -ancestor revision, usually referred to as its \emph{parent}. In fact, -a revision contains room for not one parent, but two. Mercurial uses -a special hash, called the ``null ID'', to represent the idea ``there -is no parent here''. This hash is simply a string of zeroes. - -In figure~\ref{fig:concepts:revlog}, you can see an example of the -conceptual structure of a revlog. Filelogs, manifests, and changelogs -all have this same structure; they differ only in the kind of data -stored in each delta or snapshot. - -The first revision in a revlog (at the bottom of the image) has the -null ID in both of its parent slots. For a ``normal'' revision, its -first parent slot contains the ID of its parent revision, and its -second contains the null ID, indicating that the revision has only one -real parent. Any two revisions that have the same parent ID are -branches. A revision that represents a merge between branches has two -normal revision IDs in its parent slots. - -\begin{figure}[ht] - \centering - \grafix{revlog} - \caption{} - \label{fig:concepts:revlog} -\end{figure} +and the second is the null ID. This is shown in +figure~\ref{fig:concepts:wdir-after-commit}. Mercurial doesn't touch +any of the files in the working directory when you commit; it just +modifies the dirstate to note its new parents. + +\subsection{Creating a new head} + +It's perfectly normal to update the working directory to a changeset +other than the current tip. For example, you might want to know what +your project looked like last Tuesday, or you could be looking through +changesets to see which one introduced a bug. In cases like this, the +natural thing to do is update the working directory to the changeset +you're interested in, and then examine the files in the working +directory directly to see their contents as they werea when you +committed that changeset. The effect of this is shown in +figure~\ref{fig:concepts:wdir-pre-branch}. + +\begin{figure}[ht] + \centering + \grafix{wdir-pre-branch} + \caption{The working directory, updated to an older changeset} + \label{fig:concepts:wdir-pre-branch} +\end{figure} + +Having updated the working directory to an older changeset, what +happens if you make some changes, and then commit? Mercurial behaves +in the same way as I outlined above. The parents of the working +directory become the parents of the new changeset. This new changeset +has no children, so it becomes the new tip. And the repository now +contains two changesets that have no children; we call these +\emph{heads}. You can see the structure that this creates in +figure~\ref{fig:concepts:wdir-branch}. + +\begin{figure}[ht] + \centering + \grafix{wdir-branch} + \caption{After a commit made while synced to an older changeset} + \label{fig:concepts:wdir-branch} +\end{figure} + +\begin{note} + If you're new to Mercurial, you should keep in mind a common + ``error'', which is to use the \hgcmd{pull} command without any + options. By default, the \hgcmd{pull} command \emph{does not} + update the working directory, so you'll bring new changesets into + your repository, but the working directory will stay synced at the + same changeset as before the pull. If you make some changes and + commit afterwards, you'll thus create a new head, because your + working directory isn't synced to whatever the current tip is. + + I put the word ``error'' in quotes because all that you need to do + to rectify this situation is \hgcmd{merge}, then \hgcmd{commit}. In + other words, this almost never has negative consequences; it just + surprises people. I'll discuss other ways to avoid this behaviour, + and why Mercurial behaves in this initially surprising way, later + on. +\end{note} + +\subsection{Merging heads} + +When you run the \hgcmd{merge} command, Mercurial leaves the first +parent of the working directory unchanged, and sets the second parent +to the changeset you're merging with, as shown in +figure~\ref{fig:concepts:wdir-merge}. + +\begin{figure}[ht] + \centering + \grafix{wdir-merge} + \caption{Merging two hehads} + \label{fig:concepts:wdir-merge} +\end{figure} + +Mercurial also has to modify the working directory, to merge the files +managed in the two changesets. Simplified a little, the merging +process goes like this, for every file in the manifests of both +changesets. +\begin{itemize} +\item If neither changeset has modified a file, do nothing with that + file. +\item If one changeset has modified a file, and the other hasn't, + create the modified copy of the file in the working directory. +\item If one changeset has removed a file, and the other hasn't (or + has also deleted it), delete the file from the working directory. +\item If one changeset has removed a file, but the other has modified + the file, ask the user what to do: keep the modified file, or remove + it? +\item If both changesets have modified a file, invoke an external + merge program to choose the new contents for the merged file. This + may require input from the user. +\item If one changeset has modified a file, and the other has renamed + or copied the file, make sure that the changes follow the new name + of the file. +\end{itemize} +There are more details---merging has plenty of corner cases---but +these are the most common choices that are involved in a merge. As +you can see, most cases are completely automatic, and indeed most +merges finish automatically, without requiring your input to resolve +any conflicts. + +When you're thinking about what happens when you commit after a merge, +once again the working directory is ``the changeset I'm about to +commit''. After the \hgcmd{merge} command completes, the working +directory has two parents; these will become the parents of the new +changeset. + +Mercurial lets you perform multiple merges, but you must commit the +results of each individual merge as you go. This is necessary because +Mercurial only tracks two parents for both revisions and the working +directory. While it would be technically possible to merge multiple +changesets at once, the prospect of user confusion and making a +terrible mess of a merge immediately becomes overwhelming. \section{Other interesting design features} @@ -460,6 +548,27 @@ performance and increase the complexity of the software, each of which is much more important to the ``feel'' of day-to-day use. +\subsection{Other contents of the dirstate} + +Because Mercurial doesn't force you to tell it when you're modifying a +file, it uses the dirstate to store some extra information so it can +determine efficiently whether you have modified a file. For each file +in the working directory, it stores the time that it last modified the +file itself, and the size of the file at that time. + +When you explicitly \hgcmd{add}, \hgcmd{remove}, \hgcmd{rename} or +\hgcmd{copy} files, the dirstate is updated each time. + +When Mercurial is checking the states of files in the working +directory, it first checks a file's modification time. If that has +not changed, the file must not have been modified. If the file's size +has changed, the file must have been modified. If the modification +time has changed, but the size has not, only then does Mercurial need +to read the actual contents of the file to see if they've changed. +Storing these few extra pieces of information dramatically reduces the +amount of data that Mercurial needs to read, which yields large +performance improvements compared to other revision control systems. + %%% Local Variables: %%% mode: latex %%% TeX-master: "00book" diff -r ccff2b25478e -r b74102b56df5 en/wdir-after-commit.svg --- a/en/wdir-after-commit.svg Mon Nov 13 14:34:57 2006 -0800 +++ b/en/wdir-after-commit.svg Mon Nov 13 16:19:48 2006 -0800 @@ -18,6 +18,16 @@ sodipodi:docname="wdir-after-commit.svg"> + + + id="g6261" + transform="translate(234,0)"> e7639888bb2f 7b064d8bac5e 000000000000 History in repository + transform="translate(262.3254,24.38544)"> + transform="translate(263.0396,49.83106)"> First parent Second parent Parents of working directory Newchangeset diff -r ccff2b25478e -r b74102b56df5 en/wdir-branch.svg --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/wdir-branch.svg Mon Nov 13 16:19:48 2006 -0800 @@ -0,0 +1,418 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + image/svg+xml + + + + + + + + + e7639888bb2f + + + + 7b064d8bac5e + + + + 000000000000 + + + + + ffb20e1701ea + + + + 000000000000 + + First parent + Second parent + Parents of working directory + + + + + ffb20e1701ea + + + Pre-existing head + Newly created head (and tip) + + + + diff -r ccff2b25478e -r b74102b56df5 en/wdir-merge.svg --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/wdir-merge.svg Mon Nov 13 16:19:48 2006 -0800 @@ -0,0 +1,425 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + image/svg+xml + + + + + + + + + 7b064d8bac5e + + + + 000000000000 + + + + + ffb20e1701ea + + + + e7639888bb2f + + First parent (unchanged) + Second parent + Parents of working directory + + + + + ffb20e1701ea + + + Pre-existing head + Newly created head (and tip) + + + + + + e7639888bb2f + + + diff -r ccff2b25478e -r b74102b56df5 en/wdir-pre-branch.svg --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/en/wdir-pre-branch.svg Mon Nov 13 16:19:48 2006 -0800 @@ -0,0 +1,364 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + image/svg+xml + + + + + + + + e7639888bb2f + + + 7b064d8bac5e + + + + 000000000000 + + History in repository + + + + 7b064d8bac5e + + + + 000000000000 + + First parent + Second parent + Parents of working directory + + + +