hgbook

diff en/ch04-daily.xml @ 683:1a0a78e197c3

Incorporate feedback from Greg Lindahl.
author Bryan O'Sullivan <bos@serpentine.com>
date Fri Apr 24 00:27:05 2009 -0700 (2009-04-24)
parents 29f0f79cf614
children a66f6d499afa
line diff
     1.1 --- a/en/ch04-daily.xml	Thu Apr 16 23:46:45 2009 -0700
     1.2 +++ b/en/ch04-daily.xml	Fri Apr 24 00:27:05 2009 -0700
     1.3 @@ -1,6 +1,6 @@
     1.4  <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
     1.5  
     1.6 -<chapter id="chap:daily">
     1.7 +<chapter &#105;id="chap:daily">
     1.8    <?dbhtml filename="mercurial-in-daily-use.html"?>
     1.9    <title>Mercurial in daily use</title>
    1.10  
    1.11 @@ -673,6 +673,162 @@
    1.12  	track of our progress with each file as we go.</para>
    1.13      </sect2>
    1.14    </sect1>
    1.15 +
    1.16 +  <sect1>
    1.17 +    <title>More useful diffs</title>
    1.18 +
    1.19 +    <para>The default output of the <command role="hg-cmd">hg
    1.20 +	diff</command> command is backwards compatible with the
    1.21 +      regular <command>diff</command> command, but this has some
    1.22 +      drawbacks.</para>
    1.23 +
    1.24 +    <para>Consider the case where we use <command role="hg-cmd">hg
    1.25 +	rename</command> to rename a file.</para>
    1.26 +
    1.27 +    &interaction.ch04-diff.rename.basic;
    1.28 +
    1.29 +    <para>The output of <command role="hg-cmd">hg diff</command> above
    1.30 +      obscures the fact that we simply renamed a file.  The <command
    1.31 +	role="hg-cmd">hg diff</command> command accepts an option,
    1.32 +      <option>--git</option> or <option>-g</option>, to use a newer
    1.33 +      diff format that displays such information in a more readable
    1.34 +      form.</para>
    1.35 +
    1.36 +    &interaction.ch04-diff.rename.git;
    1.37 +
    1.38 +    <para>This option also helps with a case that can otherwise be
    1.39 +      confusing: a file that appears to be modified according to
    1.40 +      <command role="hg-cmd">hg status</command>, but for which
    1.41 +      <command role="hg-cmd">hg diff</command> prints nothing. This
    1.42 +      situation can arise if we change the file's execute
    1.43 +      permissions.</para>
    1.44 +
    1.45 +    &interaction.ch04-diff.chmod;
    1.46 +
    1.47 +    <para>The normal <command>diff</command> command pays no attention
    1.48 +      to file permissions, which is why <command role="hg-cmd">hg
    1.49 +	diff</command> prints nothing by default.  If we supply it
    1.50 +      with the <option>-g</option> option, it tells us what really
    1.51 +      happened.</para>
    1.52 +
    1.53 +    &interaction.ch04-diff.chmod.git;
    1.54 +  </sect1>
    1.55 +
    1.56 +  <sect1>
    1.57 +    <title>Which files to manage, and which to avoid</title>
    1.58 +
    1.59 +    <para>Revision control systems are generally best at managing text
    1.60 +      files that are written by humans, such as source code, where the
    1.61 +      files do not change much from one revision to the next.  Some
    1.62 +      centralized revision control systems can also deal tolerably
    1.63 +      well with binary files, such as bitmap images.</para>
    1.64 +
    1.65 +    <para>For instance, a game development team will typically manage
    1.66 +      both its source code and all of its binary assets (e.g. geometry
    1.67 +      data, textures, map layouts) in a revision control
    1.68 +      system.</para>
    1.69 +
    1.70 +    <para>Because it is usually impossible to merge two conflicting
    1.71 +      modifications to a binary file, centralized systems often
    1.72 +      provide a file locking mechanism that allows a user to say
    1.73 +      <quote>I am the only person who can edit this
    1.74 +	file</quote>.</para>
    1.75 +
    1.76 +    <para>Compared to a centralized system, a distributed revision
    1.77 +      control system changes some of the factors that guide decisions
    1.78 +      over which files to manage and how.</para>
    1.79 +
    1.80 +    <para>For instance, a distributed revision control system cannot,
    1.81 +      by its nature, offer a file locking facility.  There is thus no
    1.82 +      built-in mechanism to prevent two people from making conflicting
    1.83 +      changes to a binary file.  If you have a team where several
    1.84 +      people may be editing binary files frequently, it may not be a
    1.85 +      good idea to use Mercurial&emdash;or any other distributed
    1.86 +      revision control system&emdash;to manage those files.</para>
    1.87 +
    1.88 +    <para>When storing modifications to a file, Mercurial usually
    1.89 +      saves only the differences between the previous and current
    1.90 +      versions of the file.  For most text files, this is extremely
    1.91 +      efficient. However, some files (particularly binary files) are
    1.92 +      laid out in such a way that even a small change to a file's
    1.93 +      logical content results in many or most of the bytes inside the
    1.94 +      file changing.  For instance, compressed files are particularly
    1.95 +      susceptible to this. If the differences between each successive
    1.96 +      version of a file are always large, Mercurial will not be able
    1.97 +      to store the file's revision history very efficiently.  This can
    1.98 +      affect both local storage needs and the amount of time it takes
    1.99 +      to clone a repository.</para>
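
The effect described in the paragraph above can be observed with a few lines of Python. This is only an illustrative sketch: it uses the standard zlib module as a stand-in for any compressed file format, the sample text is invented, and the byte-by-byte comparison is a crude measure, not the delta algorithm Mercurial itself uses.

```python
import zlib


def differing_bytes(a, b):
    """Count positions where a and b differ, plus any difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


# An invented plain-text "document" (assumption: any text would do here).
text = b"".join(b"line %d of a plain text document\n" % i for i in range(2000))
edited = b"L" + text[1:]  # change a single letter at the start

raw_changed = differing_bytes(text, edited)
zip_changed = differing_bytes(zlib.compress(text), zlib.compress(edited))

print(raw_changed, "byte(s) differ in the plain text")
print(zip_changed, "byte(s) differ after compression")
```

The plain text differs in exactly one byte, while the compressed form differs in more places, because the change alters both the compressed stream and its trailing checksum.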
   1.100 +
   1.101 +    <para>To get an idea of how this could affect you in practice,
   1.102 +      suppose you want to use Mercurial to manage an OpenOffice
   1.103 +      document.  OpenOffice stores documents on disk as compressed zip
   1.104 +      files. Edit even a single letter of your document in OpenOffice,
   1.105 +      and almost every byte in the entire file will change when you
   1.106 +      save it. Now suppose that file is 2MB in size.  Because most of
   1.107 +      the file changes every time you save, Mercurial will have to
   1.108 +      store all 2MB of the file every time you commit, even though
   1.109 +      from your perspective, perhaps only a few words are changing
   1.110 +      each time.  A single frequently-edited file that is not friendly
   1.111 +      to Mercurial's storage assumptions can easily have an outsized
   1.112 +      effect on the size of the repository.</para>
   1.113 +
   1.114 +    <para>Even worse, if both you and someone else edit the OpenOffice
   1.115 +      document you're working on, there is no useful way to merge your
   1.116 +      work. In fact, there isn't even a good way to tell what the
   1.117 +      differences are between your respective changes.</para>
   1.118 +
   1.119 +    <para>There are thus a few clear recommendations about specific
   1.120 +      kinds of files to be very careful with.</para>
   1.121 +
   1.122 +    <itemizedlist>
   1.123 +      <listitem>
   1.124 +	<para>Files that are very large and incompressible, e.g. ISO
   1.125 +	  CD-ROM images, will by virtue of sheer size make clones over
   1.126 +	  a network very slow.</para>
   1.127 +      </listitem>
   1.128 +      <listitem>
   1.129 +	<para>Files that change a lot from one revision to the next
   1.130 +	  may be expensive to store if you edit them frequently, and
   1.131 +	  conflicts due to concurrent edits may be difficult to
   1.132 +	  resolve.</para>
   1.133 +      </listitem>
   1.134 +    </itemizedlist>
   1.135 +  </sect1>
   1.136 +
   1.137 +  <sect1>
   1.138 +    <title>Backups and mirroring</title>
   1.139 +
   1.140 +    <para>Since Mercurial maintains a complete copy of history in each
   1.141 +      clone, everyone who uses Mercurial to collaborate on a project
   1.142 +      can potentially act as a source of backups in the event of a
   1.143 +      catastrophe.  If a central repository becomes unavailable, you
   1.144 +      can construct a replacement simply by cloning a copy of the
   1.145 +      repository from one contributor, and pulling any changes they
   1.146 +      may not have seen from others.</para>
   1.147 +
   1.148 +    <para>It is simple to use Mercurial to perform off-site backups
   1.149 +      and remote mirrors.  Set up a periodic job (e.g. via the
   1.150 +      <command>cron</command> command) on a remote server to pull
   1.151 +      changes from your master repositories every hour.  This will
   1.152 +      only be tricky in the unlikely case that the number of master
   1.153 +      repositories you maintain changes frequently, in which case
   1.154 +      you'll need to do a little scripting to refresh the list of
   1.155 +      repositories to back up.</para>
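
A minimal sketch of the "little scripting" mentioned above, in Python. The function names, the layout of one mirror directory per repository, and the shape of the `sources` mapping are all assumptions made for illustration; only the `hg pull -R` and `hg clone -U` invocations come from Mercurial itself.

```python
import subprocess
from pathlib import Path


def mirror_command(source_url, mirror_dir):
    """Return the hg command that brings mirror_dir up to date with source_url."""
    mirror = Path(mirror_dir)
    if (mirror / ".hg").is_dir():
        # Existing mirror: fetch only the changesets it does not have yet.
        return ["hg", "pull", "-R", str(mirror), source_url]
    # Newly listed repository: create the mirror without a working directory.
    return ["hg", "clone", "-U", source_url, str(mirror)]


def refresh_mirrors(sources, mirror_root, run=subprocess.check_call):
    """Update every mirror; `sources` maps a mirror name to its master URL."""
    for name, url in sorted(sources.items()):
        run(mirror_command(url, Path(mirror_root) / name))
```

A script that calls `refresh_mirrors` with the current list of master repositories could then be scheduled hourly from cron. Passing a different `run` callable (for example `print`) shows the commands without executing them.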
   1.156 +
   1.157 +    <para>If you perform traditional backups of your master
   1.158 +      repositories to tape or disk, and you want to back up a
   1.159 +      repository named <filename>myrepo</filename>, use <command>hg
   1.160 +	clone -U myrepo myrepo.bak</command> to create a
   1.161 +      clone of <filename>myrepo</filename> before you start your
   1.162 +      backups.  The <option>-U</option> option doesn't check out a
   1.163 +      working directory after the clone completes, since that would be
   1.164 +      superfluous and make the backup take longer.</para>
   1.165 +
   1.166 +    <para>If you then back up <filename>myrepo.bak</filename> instead
   1.167 +      of <filename>myrepo</filename>, you will be guaranteed to have a
   1.168 +      consistent snapshot of your repository that won't be pushed to
   1.169 +      by an insomniac developer in mid-backup.</para>
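
The snapshot step could be wrapped in a small helper that a backup job runs first. This is a sketch under assumptions: the helper name is hypothetical, and removing an older snapshot before cloning is a choice made here because `hg clone` will not write into a non-empty destination.

```python
import shutil
import subprocess
from pathlib import Path


def snapshot_for_backup(repo, run=subprocess.check_call):
    """Clone `repo` to `<repo>.bak` with no working directory; return the path."""
    snapshot = Path(str(repo) + ".bak")
    if snapshot.exists():
        # Discard the previous snapshot so the clone starts from a clean slate.
        shutil.rmtree(snapshot)
    run(["hg", "clone", "-U", str(repo), str(snapshot)])
    return snapshot
```

The backup tool is then pointed at the returned path, for example `myrepo.bak`, instead of the live repository.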
   1.170 +  </sect1>
   1.171  </chapter>
   1.172  
   1.173  <!--