hgbook: es/intro.tex annotate

annotate es/intro.tex @ 404:1839fd383e50

translated a few subsections of intro

author	Igor TAmara <igor@tamarapatino.org>
date	Sat Nov 08 21:47:35 2008 -0500 (2008-11-08)
parents	4cdeb830118b
children	779944196e2a

rev	line source
igor@403	1 \chapter{Introduction}
igor@402	2 \label{chap:intro}
igor@402	3
igor@403	4 \section{About revision control}
igor@403	5
igor@403	6 Revision control is the process of managing multiple versions of a
igor@403	7 piece of information. In its simplest form, this is something that
igor@403	8 many people do by hand: every time you modify a file, you save it
igor@403	9 under a new name that contains a number, each one higher than the
igor@403	10 number of the preceding version.
igor@403	11
igor@403	12 Manually managing many versions of a file is an error-prone task,
igor@403	13 which is why software tools that help with this process have long
igor@403	14 been available. The earliest tools for automating revision
igor@403	15 control were intended to let a single user manage a single file.
igor@403	16 Over the past few decades, the scope of revision control tools
igor@403	17 has expanded considerably; they now manage many files and make
igor@403	18 it easier for several people to work together. The best
igor@403	19 revision control tools available today have no trouble coping
igor@403	20 with thousands of people working together on projects that
igor@403	21 consist of tens of thousands of
igor@403	22 files.
igor@403	23
igor@403	24 \subsection{Why use revision control?}
igor@403	25
igor@403	26 There are a number of reasons why you or your team might want to use
igor@403	27 an automated revision control tool for a project.
igor@402	28 \begin{itemize}
igor@403	29 \item It will track the history and evolution of your project, so
igor@403	30 you don't have to do so by hand. For every change, you'll have a
igor@403	31 log of \emph{who} made it; \emph{why} they made it;
igor@403	32 \emph{when} they made it; and \emph{what} the change was.
igor@403	33 \item When you're working with other people, revision control
igor@403	34 software makes it easier to collaborate. For example, when several
igor@403	35 people make potentially incompatible changes at more or less the
igor@403	36 same time, the software will help you to identify and resolve
igor@403	37 those conflicts.
igor@403	38 \item It can help you to recover from mistakes. If you make a
igor@403	39 change that later turns out to be a mistake, you can revert to an
igor@403	40 earlier version of one or more files. In fact, a
igor@403	41 \emph{really} good revision control tool will even help you to
igor@403	42 efficiently figure out exactly when a problem was introduced
igor@403	43 (see section~\ref{sec:undo:bisect} for details).
igor@403	44 \item It will help you to work simultaneously on, and manage the
igor@403	45 differences between, multiple versions of your project.
igor@402	46 \end{itemize}
igor@403	47 Most of these reasons are equally valid, at least in theory,
igor@403	48 whether you're working on a project alone or with many people.
igor@403	49
igor@403	50 A key question about the practicality of revision control at these
igor@403	51 two different scales (``lone hacker'' and ``huge team'') is how its
igor@403	52 \emph{benefits} compare to its \emph{costs}. A revision control
igor@403	53 tool that is difficult to understand or use is going to impose a
igor@403	54 high cost.
igor@403	55
igor@403	56 A five-hundred-person project is likely to collapse under its own
igor@403	57 weight almost immediately without a revision control tool and
igor@403	58 process. In this case, the cost of using revision control is hardly
igor@403	59 worth considering, since \emph{without} it, failure is almost
igor@403	60 guaranteed.
igor@403	61
igor@403	62 On the other hand, a one-person ``quick hack'' might seem like a poor
igor@403	63 place to use a revision control tool, because the cost of using one
igor@403	64 would surely be close to the overall cost of the project.
igor@403	65 Right?
igor@403	66
igor@403	67 Mercurial supports \emph{both} of these scales of
igor@403	68 development. You can learn the basics in just a few minutes, and
igor@403	69 thanks to its low overhead, you can easily apply revision control
igor@403	70 to the smallest of projects. Its simplicity means you won't have
igor@403	71 to worry about abstruse concepts or command sequences competing
igor@403	72 for mental space with whatever you're \emph{really} trying to
igor@403	73 do. At the same time, Mercurial's high performance and
igor@403	74 peer-to-peer nature let it scale painlessly to handle large
igor@403	75 projects.
igor@403	76
igor@403	77 No revision control tool can rescue a poorly managed
igor@403	78 project, but a good choice of tools can make a big
igor@403	79 difference to how smoothly you can work on that
igor@403	80 project.
igor@403	81
igor@403	82 \subsection{The many names of revision control}
igor@403	83
igor@403	84 Revision control is a diverse field, so much so that it has no
igor@403	85 single name or acronym. Here are a few of the more common names
igor@403	86 and acronyms you might encounter:
igor@402	87 \begin{itemize}
igor@403	88 \item Revision control (RCS)
igor@403	89 \item Software configuration management (SCM), or configuration
igor@403	90 management
igor@403	91 \item Source code management
igor@403	92 \item Source code control, or source control
igor@403	93 \item Version control (VCS)
igor@402	94 \end{itemize}
igor@403	95 Some people claim that these terms actually have different
igor@403	96 meanings, but in practice they overlap so much that there's no
igor@403	97 agreed or even useful way to tease them apart.
igor@403	98
igor@403	99 \section{A short history of revision control}
igor@403	100
igor@403	101 The oldest well-known revision control tool is SCCS
igor@403	102 (Source Code Control System), which Marc Rochkind wrote at
igor@403	103 Bell Labs in the early 1970s. SCCS operated on individual
igor@403	104 files, and required every person working on a project to have
igor@403	105 access to a shared workspace on a single system. Only one
igor@403	106 person could modify a file at any time; arbitration for
igor@403	107 access to files was handled with locks. It was common for
igor@403	108 people to lock files and later forget to unlock them, which
igor@403	109 prevented anyone else from modifying those files without the
igor@403	110 intervention of an
igor@403	111 administrator.
igor@403	112
igor@403	113 Walter Tichy developed a free alternative to SCCS in the early
igor@403	114 1980s; he called his program RCS (Revision Control System).
igor@403	115 Like SCCS, RCS required developers to work in a single shared
igor@403	116 workspace, and to lock files to prevent several people from
igor@403	117 modifying them
igor@403	118 simultaneously.
igor@403	119
igor@403	120 Later in the 1980s, Dick Grune used RCS as a building block for
igor@403	121 a set of command-line scripts he originally called cmt, but then
igor@403	122 renamed to CVS (Concurrent Versions System). The big innovation
igor@403	123 of CVS was that it let developers work simultaneously and
igor@403	124 somewhat independently in their own personal workspaces. The
igor@403	125 personal workspaces prevented developers from stepping on each
igor@403	126 other's toes all the time, as was common with SCCS and RCS. Each
igor@403	127 developer had a copy of every project file, and could modify
igor@403	128 their copy independently. They had to merge their edits
igor@403	129 before committing changes to the central
igor@403	130 repository.
igor@403	131
igor@403	132 Brian Berliner took Grune's original scripts and rewrote them
igor@403	133 in~C, releasing in 1989 the code that has since developed into
igor@403	134 the modern version of CVS. CVS subsequently acquired the
igor@403	135 ability to operate over a network connection, giving it a
igor@403	136 client/server architecture. CVS's architecture is centralised;
igor@403	137 only the central repository holds the history of the project.
igor@403	138 Client workspaces just contain copies of recent versions of
igor@403	139 the project's files, and a little metadata to tell them where
igor@403	140 the server is. CVS has been enormously successful; it is
igor@403	141 probably the world's most widely used revision control
igor@403	142 system.
igor@402	143
igor@404	144 In the early 1990s, Sun Microsystems developed an early
igor@404	145 distributed revision control system, called TeamWare. A
igor@404	146 TeamWare workspace contains a complete copy of the project's
igor@404	147 history. TeamWare has no notion of a central repository. (CVS
igor@404	148 relied upon RCS for its history storage; TeamWare used
igor@404	149 SCCS.)
igor@404	150
igor@404	151 As the 1990s progressed, people became aware of a number of
igor@404	152 problems with CVS. It records simultaneous changes to multiple
igor@404	153 files individually, instead of grouping them together as a
igor@404	154 single logically atomic operation. It does not manage its file
igor@404	155 hierarchy well; it is easy to make a mess of a repository by
igor@404	156 renaming files and directories. Worse, its source code is
igor@404	157 difficult to read and maintain, which made the ``pain level''
igor@404	158 of fixing these architectural problems prohibitive.
igor@404	159
igor@404	160 In 2001, Jim Blandy and Karl Fogel, two developers who had
igor@404	161 worked on CVS, started a project to replace it with a tool that
igor@404	162 would have a better architecture and cleaner code. The result,
igor@404	163 Subversion, does not stray from CVS's centralised client/server
igor@404	164 model, but it adds multi-file atomic commits, better namespace
igor@404	165 management, and a number of other features that make it a
igor@404	166 better tool than CVS. Since its initial release, it has grown
igor@404	167 steadily in popularity.
igor@404	168
igor@404	169 More or less simultaneously, Graydon Hoare began working on an
igor@404	170 ambitious distributed revision control system that he named
igor@404	171 Monotone. While Monotone addresses some of CVS's design flaws
igor@404	172 and has a peer-to-peer architecture, it goes beyond earlier
igor@404	173 (and subsequent) revision control tools in a number of
igor@404	174 innovative ways. It uses cryptographic hashes as identifiers,
igor@404	175 and has an integral notion of ``trust'' for code from
igor@404	176 different sources.
igor@404	177
igor@404	178 Mercurial began life in 2005. While a few aspects of its design
igor@404	179 are influenced by Monotone, Mercurial focuses on ease of use,
igor@404	180 high performance, and scalability to very large
igor@404	181 projects.
igor@404	182
igor@404	183 \section{Trends in revision control}
igor@404	184
igor@404	185 Several trends have emerged in the development and use of
igor@404	186 revision control tools over the past four decades, as people
igor@404	187 have become familiar with both the capabilities and the
igor@404	188 limitations of their tools.
igor@404	189
igor@404	190 The first generation began by managing single files on
igor@404	191 individual computers. Although these tools represented a major
igor@404	192 advance over manual revision control, their locking model and
igor@404	193 reliance on a single computer limited them to small,
igor@404	194 tightly knit teams.
igor@404	195
igor@404	196 The second generation loosened these constraints by moving to
igor@404	197 network-centred architectures, and managing entire projects at
igor@404	198 a time. As projects grew larger, they ran into new problems.
igor@404	199 With clients needing to talk to servers very frequently, server
igor@404	200 scaling became an issue for large projects. An unreliable
igor@404	201 network connection could prevent remote users from being able
igor@404	202 to talk to the server at all. As open source projects started
igor@404	203 making read-only access available anonymously to anyone, people
igor@404	204 without commit privileges found that they could not use the
igor@404	205 tools to interact with a project in a natural way, as they
igor@404	206 could not record their changes.
igor@404	207
igor@404	208 The current generation of revision control tools is
igor@404	209 peer-to-peer in nature. All of these systems have dropped the
igor@404	210 dependency on a single central server, and allow people to
igor@404	211 distribute their revision control data to where it's actually
igor@404	212 needed. Collaboration over the Internet has moved from being
igor@404	213 constrained by technology to being a matter of choice and
igor@404	214 consensus. Modern tools can operate offline indefinitely and
igor@404	215 autonomously, with a network connection only needed when
igor@404	216 syncing changes with another repository.
igor@404	217
igor@404	218 \section{A few of the advantages of distributed revision control}
igor@404	219
igor@404	220 Even though distributed revision control tools have for several
igor@404	221 years been as robust and usable as their previous-generation
igor@404	222 counterparts, people using older tools have not necessarily
igor@404	223 become aware of their advantages. There are a number of ways in
igor@404	224 which distributed tools shine relative to centralised
igor@404	225 ones.
igor@404	226
igor@404	227 For an individual developer, distributed tools are almost
igor@404	228 always faster than centralised tools. This is for a simple
igor@404	229 reason: a centralised tool needs to talk over the network for
igor@404	230 many common operations, because most metadata is stored in a
igor@404	231 single copy on the central server. A distributed tool stores
igor@404	232 all of its metadata locally. All else being equal, talking over
igor@404	233 the network adds overhead to a centralised tool. Don't
igor@404	234 underestimate the value of a snappy, responsive tool: you're
igor@404	235 going to spend a lot of time interacting with your revision
igor@404	236 control software.
igor@404	237
igor@404	238 Distributed tools are indifferent to the vagaries of your
igor@404	239 server infrastructure, again because they replicate metadata
igor@404	240 to so many locations. If you use a centralised system and your
igor@404	241 server blows up, you'd better hope that your backup media
igor@404	242 are reliable, and that your last backup was recent and
igor@404	243 actually worked. With a distributed tool, you have as many
igor@404	244 backups available as there are contributors' computers.
igor@404	245
igor@404	246 The reliability of your network will affect distributed tools
igor@404	247 far less than it will centralised tools. You can't even use a
igor@404	248 centralised tool without a network connection, except for a
igor@404	249 few highly constrained commands. With a distributed tool, if
igor@404	250 your network connection goes down while you're working, you
igor@404	251 may not even notice. The only thing you won't be able to do is
igor@404	252 talk to repositories on other computers, something that is
igor@404	253 relatively rare compared with local operations. If you have a
igor@404	254 far-flung team of collaborators, this may be significant.
igor@404	255
igor@404	256 \subsection{Advantages for open source projects}
igor@404	257
igor@404	258 If you discover an open source project and decide that you would
igor@404	259 like to start working on it, and that project uses a distributed
igor@404	260 revision control tool, you are at once a peer with the people who
igor@404	261 consider themselves the ``core'' of the project. If they publish
igor@404	262 their repositories, you can immediately copy the project history,
igor@404	263 make changes, and record your work, using the same tools in the
igor@404	264 same ways as they do. By contrast, with a centralised tool, you
igor@404	265 must use the software in a ``read only'' mode unless someone
igor@404	266 grants you permission to commit changes to the central
igor@404	267 repository. Until then, you won't be able to record your changes,
igor@404	268 and your local modifications will be at risk of corruption
igor@404	269 whenever you try to update your view of the repository.
igor@404	270
igor@404	271 \subsubsection{Forks are not a problem}
igor@404	272
igor@404	273 It has been suggested that distributed revision control tools
igor@404	274 pose some sort of risk to open source projects, because they
igor@404	275 make it very easy to ``fork'' the development of a project. A
igor@404	276 fork happens when differences in opinion or attitude between
igor@404	277 groups of developers lead them to decide that they can no
igor@404	278 longer work together. Each side takes a more or less complete
igor@404	279 copy of the project's source code, and goes off in its own
igor@404	280 direction.
igor@404	281
igor@404	282 Sometimes the leaders of a fork reconcile their
igor@404	283 differences. With a centralised revision control system, the
igor@404	284 \emph{technical} process of reconciliation is painful, and has
igor@404	285 to be performed largely by hand. You have to decide whose
igor@404	286 revision history is going to ``win'', and graft the other
igor@404	287 team's changes into the tree somehow. This usually loses some
igor@404	288 or all of one side's revision history.
igor@404	289
igor@404	290 What distributed tools do with respect to forking is make
igor@404	291 forking the \emph{only} way to develop a project. Every single
igor@404	292 change that you make is potentially a fork point. The great
igor@404	293 strength of this approach is that a distributed revision
igor@404	294 control tool has to be really good at \emph{merging} forks,
igor@404	295 because forks are absolutely fundamental: they happen all the
igor@404	296 time.
igor@404	297
igor@404	298 If every piece of work that everybody does, all the time,
igor@404	299 is framed in terms of forking and merging, then what the
igor@404	300 open source world refers to as a ``fork'' becomes
igor@404	301 \emph{purely} a social issue. What distributed tools do is
igor@404	302 \emph{lower} the likelihood of a
igor@404	303 fork, because:
igor@402	304 \begin{itemize}
igor@404	305 \item They eliminate the social distinction that centralised
igor@404	306 tools impose: that between insiders (people with commit access)
igor@404	307 and outsiders (people without).
igor@404	308 \item They make it easier to reconcile after a social fork,
igor@404	309 because all that's involved from the perspective of the
igor@404	310 revision control software is just another merge.
igor@402	311 \end{itemize}
igor@402	312
igor@404	313 Some people resist distributed tools because they want to retain
igor@404	314 tight control over their projects, and they believe that
igor@404	315 centralised tools give them this control. However, if you hold
igor@404	316 this belief and you publish your CVS or Subversion repositories,
igor@404	317 there are plenty of tools available that can pull out the entire
igor@404	318 history (albeit slowly) and recreate it somewhere that you don't
igor@404	319 control. So while your control is illusory, you are giving up the
igor@404	320 ability to collaborate fluidly with whoever feels compelled to
igor@404	321 copy and fork your history.
igor@402	322
igor@402	323 \subsection{Advantages for commercial projects}
igor@402	324
igor@402	325 Many commercial projects are undertaken by teams that are scattered
igor@402	326 across the globe. Contributors who are far from a central server will
igor@402	327 see slower command execution and perhaps less reliability. Commercial
igor@402	328 revision control systems attempt to ameliorate these problems with
igor@402	329 remote-site replication add-ons that are typically expensive to buy
igor@402	330 and cantankerous to administer. A distributed system doesn't suffer
igor@402	331 from these problems in the first place. Better yet, you can easily
igor@402	332 set up multiple authoritative servers, say one per site, so that
igor@402	333 there's no redundant communication between repositories over expensive
igor@402	334 long-haul network links.
igor@402	335
igor@402	336 Centralised revision control systems tend to have relatively low
igor@402	337 scalability. It's not unusual for an expensive centralised system to
igor@402	338 fall over under the combined load of just a few dozen concurrent
igor@402	339 users. Once again, the typical response tends to be an expensive and
igor@402	340 clunky replication facility. Since the load on a central server---if
igor@402	341 you have one at all---is many times lower with a distributed
igor@402	342 tool (because all of the data is replicated everywhere), a single
igor@402	343 cheap server can handle the needs of a much larger team, and
igor@402	344 replication to balance load becomes a simple matter of scripting.
igor@402	345
igor@402	346 If you have an employee in the field, troubleshooting a problem at a
igor@402	347 customer's site, they'll benefit from distributed revision control.
igor@402	348 The tool will let them generate custom builds, try different fixes in
igor@402	349 isolation from each other, and search efficiently through history for
igor@402	350 the sources of bugs and regressions in the customer's environment, all
igor@402	351 without needing to connect to your company's network.
igor@402	352
igor@402	353 \section{Why choose Mercurial?}
igor@402	354
igor@402	355 Mercurial has a unique set of properties that make it a particularly
igor@402	356 good choice as a revision control system.
igor@402	357 \begin{itemize}
igor@402	358 \item It is easy to learn and use.
igor@402	359 \item It is lightweight.
igor@402	360 \item It scales excellently.
igor@402	361 \item It is easy to customise.
igor@402	362 \end{itemize}
igor@402	363
igor@402	364 If you are at all familiar with revision control systems, you should
igor@402	365 be able to get up and running with Mercurial in less than five
igor@402	366 minutes. Even if not, it will take no more than a few minutes
igor@402	367 longer. Mercurial's command and feature sets are generally uniform
igor@402	368 and consistent, so you can keep track of a few general rules instead
igor@402	369 of a host of exceptions.
igor@402	370
igor@402	371 On a small project, you can start working with Mercurial in moments.
igor@402	372 Creating new changes and branches; transferring changes around
igor@402	373 (whether locally or over a network); and history and status operations
igor@402	374 are all fast. Mercurial attempts to stay nimble and largely out of
igor@402	375 your way by combining low cognitive overhead with blazingly fast
igor@402	376 operations.
igor@402	377
igor@402	378 The usefulness of Mercurial is not limited to small projects: it is
igor@402	379 used by projects that have hundreds to thousands of contributors and
igor@402	380 contain tens of thousands of files and hundreds of megabytes of
igor@402	381 source code.
igor@402	382
igor@402	383 If the core functionality of Mercurial is not enough for you, it's
igor@402	384 easy to build on. Mercurial is well suited to scripting tasks, and
igor@402	385 its clean internals and implementation in Python make it easy to add
igor@402	386 features in the form of extensions. There are a number of popular and
igor@402	387 useful extensions already available, ranging from helping to identify
igor@402	388 bugs to improving performance.
igor@402	389
igor@402	390 \section{Mercurial compared with other tools}
igor@402	391
igor@402	392 Before you read on, please understand that this section necessarily
igor@402	393 reflects my own experiences, interests, and (dare I say it) biases. I
igor@402	394 have used every one of the revision control tools listed below, in
igor@402	395 most cases for several years at a time.
igor@402	396
igor@402	397
igor@402	398 \subsection{Subversion}
igor@402	399
igor@402	400 Subversion is a popular revision control tool, developed to replace
igor@402	401 CVS. It has a centralised client/server architecture.
igor@402	402
igor@402	403 Subversion and Mercurial have similarly named commands for performing
igor@402	404 the same operations, so if you're familiar with one, it is easy to
igor@402	405 learn to use the other. Both tools are portable to all popular
igor@402	406 operating systems.
igor@402	407
igor@402	408 Prior to version 1.5, Subversion had no useful support for merges.
igor@402	409 At the time of writing, its merge tracking capability is new, and known to be
igor@402	410 \href{http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword}{complicated
igor@402	411 and buggy}.
igor@402	412
igor@402	413 Mercurial has a substantial performance advantage over Subversion on
igor@402	414 every revision control operation I have benchmarked. I have measured
igor@402	415 its advantage as ranging from a factor of two to a factor of six when
igor@402	416 compared with Subversion~1.4.3's \emph{ra\_local} file store, which is
igor@402	417 the fastest access method available. In more realistic deployments
igor@402	418 involving a network-based store, Subversion will be at a substantially
igor@402	419 larger disadvantage. Because many Subversion commands must talk to
igor@402	420 the server and Subversion does not have useful replication facilities,
igor@402	421 server capacity and network bandwidth become bottlenecks for modestly
igor@402	422 large projects.
igor@402	423
igor@402	424 Additionally, Subversion incurs substantial storage overhead to avoid
igor@402	425 network transactions for a few common operations, such as finding
igor@402	426 modified files (\texttt{status}) and displaying modifications against
igor@402	427 the current revision (\texttt{diff}). As a result, a Subversion
igor@402	428 working copy is often the same size as, or larger than, a Mercurial
igor@402	429 repository and working directory, even though the Mercurial repository
igor@402	430 contains a complete history of the project.
igor@402	431
igor@402	432 Subversion is widely supported by third party tools. Mercurial
igor@402	433 currently lags considerably in this area. This gap is closing,
igor@402	434 however, and indeed some of Mercurial's GUI tools now outshine their
igor@402	435 Subversion equivalents. Like Mercurial, Subversion has an excellent
igor@402	436 user manual.
igor@402	437
igor@402	438 Because Subversion doesn't store revision history on the client, it is
igor@402	439 well suited to managing projects that deal with lots of large, opaque
igor@402	440 binary files. If you check in fifty revisions to an incompressible
igor@402	441 10MB file, Subversion's client-side space usage stays constant. The
igor@402	442 space used by any distributed SCM will grow rapidly in proportion to
igor@402	443 the number of revisions, because the differences between each revision
igor@402	444 are large.
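
As a rough worked estimate of the example above, assume (these are
illustrative assumptions, not measurements) that each revision of an
incompressible file of size $S$ adds a delta of about $S$ to the
stored history, and that a Subversion working copy holds the file
plus one pristine copy of it. With $S = 10$\,MB and $N = 50$
revisions, the symbols $S$ and $N$ being introduced here only for
illustration:
\[
  N \times S = 50 \times 10\,\mathrm{MB} = 500\,\mathrm{MB}
  \quad \mbox{(history held by a distributed repository)},
\]
\[
  2 \times S = 2 \times 10\,\mathrm{MB} = 20\,\mathrm{MB}
  \quad \mbox{(Subversion working copy, independent of $N$)}.
\]
Real figures will vary with how compressible the file is and how
much consecutive revisions differ, but the sketch shows why the
client-side cost grows with $N$ in one case and not in the other.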
igor@402	445
igor@402	446 In addition, it's often difficult or, more usually, impossible to
igor@402	447 merge different versions of a binary file. Subversion's ability to
igor@402	448 let a user lock a file, so that they temporarily have the exclusive
igor@402	449 right to commit changes to it, can be a significant advantage to a
igor@402	450 project where binary files are widely used.
igor@402	451
igor@402	452 Mercurial can import revision history from a Subversion repository.
igor@402	453 It can also export revision history to a Subversion repository. This
igor@402	454 makes it easy to ``test the waters'' and use Mercurial and Subversion
igor@402	455 in parallel before deciding to switch. History conversion is
igor@402	456 incremental, so you can perform an initial conversion, then small
igor@402	457 additional conversions afterwards to bring in new changes.
igor@402	458
igor@402	459
igor@402	460 \subsection{Git}
igor@402	461
igor@402	462 Git is a distributed revision control tool that was developed for
igor@402	463 managing the Linux kernel source tree. Like Mercurial, its early
igor@402	464 design was somewhat influenced by Monotone.
igor@402	465
igor@402	466 Git has a very large command set, with version~1.5.0 providing~139
igor@402	467 individual commands. It has something of a reputation for being
igor@402	468 difficult to learn. Compared to Git, Mercurial has a strong focus on
igor@402	469 simplicity.
igor@402	470
igor@402	471 In terms of performance, Git is extremely fast. In several cases, it
igor@402	472 is faster than Mercurial, at least on Linux, while Mercurial performs
igor@402	473 better on other operations. However, on Windows, the performance and
igor@402	474 general level of support that Git provides is, at the time of writing,
igor@402	475 far behind that of Mercurial.
igor@402	476
igor@402	477 While a Mercurial repository needs no maintenance, a Git repository
igor@402	478 requires frequent manual ``repacks'' of its metadata. Without these,
igor@402	479 performance degrades, while space usage grows rapidly. A server that
igor@402	480 contains many Git repositories that are not rigorously and frequently
igor@402	481 repacked will become heavily disk-bound during backups, and there have
igor@402	482 been instances of daily backups taking far longer than~24 hours as a
igor@402	483 result. A freshly packed Git repository is slightly smaller than a
igor@402	484 Mercurial repository, but an unpacked repository is several orders of
igor@402	485 magnitude larger.
igor@402	486
igor@402	487 The core of Git is written in C. Many Git commands are implemented as
igor@402	488 shell or Perl scripts, and the quality of these scripts varies widely.
igor@402	489 I have encountered several instances where scripts charged along
igor@402	490 blindly in the presence of errors that should have been fatal.
igor@402	491
igor@402	492 Mercurial can import revision history from a Git repository.
igor@402	493
igor@402	494
igor@402	495 \subsection{CVS}
igor@402	496
igor@402	497 CVS is probably the most widely used revision control tool in the
igor@402	498 world. Due to its age and internal untidiness, it has been only
igor@402	499 lightly maintained for many years.
igor@402	500
igor@402	501 It has a centralised client/server architecture. It does not group
igor@402	502 related file changes into atomic commits, making it easy for people to
igor@402	503 ``break the build'': one person can successfully commit part of a
igor@402	504 change and then be blocked by the need for a merge, leaving other
igor@402	505 people to see only a portion of the work that person intended to commit. This
igor@402	506 also affects how you work with project history. If you want to see
igor@402	507 all of the modifications someone made as part of a task, you will need
igor@402	508 to manually inspect the descriptions and timestamps of the changes
igor@402	509 made to each file involved (if you even know what those files were).
igor@402	510
igor@402	511 CVS has a muddled notion of tags and branches that I will not even
igor@402	512 attempt to describe. It does not support renaming of files or
igor@402	513 directories well, making it easy to corrupt a repository. It has
igor@402	514 almost no internal consistency checking capabilities, so it is usually
igor@402	515 not even possible to tell whether or how a repository is corrupt. I
igor@402	516 would not recommend CVS for any project, existing or new.
igor@402	517
igor@402	518 Mercurial can import CVS revision history. However, there are a few
igor@402	519 caveats that apply; these are true of every other revision control
igor@402	520 tool's CVS importer, too. Due to CVS's lack of atomic changes and
igor@402	521 unversioned filesystem hierarchy, it is not possible to reconstruct
igor@402	522 CVS history completely accurately; some guesswork is involved, and
igor@402	523 renames will usually not show up. Because a lot of advanced CVS
igor@402	524 administration has to be done by hand and is hence error-prone, it's
igor@402	525 common for CVS importers to run into multiple problems with corrupted
igor@402	526 repositories (completely bogus revision timestamps and files that have
igor@402	527 remained locked for over a decade are just two of the less interesting
igor@402	528 problems I can recall from personal experience).
igor@402	529
igor@402	532
igor@402	533 \subsection{Commercial tools}
igor@402	534
igor@402	535 Perforce has a centralised client/server architecture, with no
igor@402	536 client-side caching of any data. Unlike modern revision control
igor@402	537 tools, Perforce requires that a user run a command to inform the
igor@402	538 server about every file they intend to edit.
igor@402	539
igor@402	540 The performance of Perforce is quite good for small teams, but it
igor@402	541 falls off rapidly as the number of users grows beyond a few dozen.
igor@402	542 Modestly large Perforce installations require the deployment of
igor@402	543 proxies to cope with the load their users generate.
igor@402	544
igor@402	545
igor@402	546 \subsection{Choosing a revision control tool}
igor@402	547
igor@402	548 With the exception of CVS, all of the tools listed above have unique
igor@402	549 strengths that suit them to particular styles of work. There is no
igor@402	550 single revision control tool that is best in all situations.
igor@402	551
igor@402	552 As an example, Subversion is a good choice for working with frequently
igor@402	553 edited binary files, due to its centralised nature and support for
igor@402	554 file locking.
igor@402	555
igor@402	556 I personally find Mercurial's properties of simplicity, performance,
igor@402	557 and good merge support to be a compelling combination that has served
igor@402	558 me well for several years.
igor@402	559
igor@402	560
igor@402	561 \section{Switching from another tool to Mercurial}
igor@402	562
igor@402	563 Mercurial is bundled with an extension named \hgext{convert}, which
igor@402	564 can incrementally import revision history from several other revision
igor@402	565 control tools. By ``incremental'', I mean that you can convert all of
igor@402	566 a project's history to date in one go, then rerun the conversion later
igor@402	567 to obtain new changes that happened after the initial conversion.
igor@402	568
igor@402	569 The revision control tools supported by \hgext{convert} are as
igor@402	570 follows:
igor@402	571 \begin{itemize}
igor@402	572 \item Subversion
igor@402	573 \item CVS
igor@402	574 \item Git
igor@402	575 \item Darcs
igor@402	576 \end{itemize}
igor@402	577
igor@402	578 In addition, \hgext{convert} can export changes from Mercurial to
igor@402	579 Subversion. This makes it possible to try Subversion and Mercurial in
igor@402	580 parallel before committing to a switchover, without risking the loss
igor@402	581 of any work.
igor@402	582
igor@402	583 The \hgxcmd{convert}{convert} command is easy to use. Simply point it
igor@402	584 at the path or URL of the source repository, optionally give it the
igor@402	585 name of the destination repository, and it will start working. After
igor@402	586 the initial conversion, just run the same command again to import new
igor@402	587 changes.
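
As a minimal sketch of the workflow just described: bundled extensions
are not active by default, so this assumes that \hgext{convert} is
first enabled in the \texttt{[extensions]} section of your
\verb|~/.hgrc|. The exact configuration line may differ between
Mercurial versions; the form below is the one commonly used.

\begin{verbatim}
[extensions]
convert =
\end{verbatim}

With the extension enabled, a conversion session might look like the
following. The paths \texttt{/path/to/old-repo} and
\texttt{new-hg-repo} are placeholders chosen for illustration, not
names from a real project.

\begin{verbatim}
# Initial conversion; the destination name is optional
hg convert /path/to/old-repo new-hg-repo

# Later: rerun the same command to import new changes
hg convert /path/to/old-repo new-hg-repo
\end{verbatim}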
igor@402	588
igor@402	589
igor@402	590 %%% Local Variables:
igor@402	591 %%% mode: latex
igor@402	592 %%% TeX-master: "00book"
igor@402	593 %%% End: