hgbook: b49a7dd4e564 en/hook.tex


view en/hook.tex @ 38:b49a7dd4e564

More content for hook chapter.
Overview of hooks.
Description of hook security implications.

author	Bryan O'Sullivan <bos@serpentine.com>
date	Wed Jul 19 00:06:21 2006 -0700 (2006-07-19)
parents	9fd0c59b009a
children	576fef93bb49


1 \chapter{Handling repository events with hooks}

2 \label{chap:hook}

4 Mercurial offers a powerful mechanism to let you perform automated

5 actions in response to events that occur in a repository. In some

6 cases, you can even control Mercurial's response to those events.

8 The name Mercurial uses for one of these actions is a \emph{hook}.

9 Hooks are called ``triggers'' in some revision control systems, but

10 the two names refer to the same idea.

12 \section{An overview of hooks in Mercurial}

14 Here is a brief list of the hooks that Mercurial supports. For each

15 hook, we indicate when it is run, and a few examples of common tasks

16 you can use it for. We will revisit each of these hooks in more

17 detail later.

18 \begin{itemize}

19 \item[\small\hook{changegroup}] This is run after a group of

20 changesets has been brought into the repository from elsewhere. In

21 other words, it is run after a \hgcmd{pull} or \hgcmd{push} into a

22 repository, but not after a \hgcmd{commit}. You can use this for

23 performing an action once for the entire group of newly arrived

24 changesets. For example, you could use this hook to send out email

25 notifications, or kick off an automated build or test.

26 \item[\small\hook{commit}] This is run after a new changeset has been

27 created in the local repository, typically using the \hgcmd{commit}

28 command.

29 \item[\small\hook{incoming}] This is run once for each new changeset

30 that is brought into the repository from elsewhere. Notice the

31 difference from \hook{changegroup}, which is run once per

32 \emph{group} of changesets brought in. You can use this for the

33 same purposes as the \hook{changegroup} hook; it's simply more

34 convenient sometimes to run a hook once per group of changesets,

35 while other times it's handier once per changeset.

36 \item[\small\hook{outgoing}] This is run after a group of changesets

37 has been transmitted from this repository to another. You can use

38 this, for example, to notify subscribers every time changes are

39 cloned or pulled from the repository.

40 \item[\small\hook{prechangegroup}] This is run before starting to

41 bring a group of changesets into the repository. It cannot see the

42 actual changesets, because they have not yet been transmitted. If

43 it fails, the changesets will not be transmitted. You can use this

44 hook to ``lock down'' a repository against incoming changes.

45 \item[\small\hook{precommit}] This is run before starting a commit.

46 It cannot tell what files are included in the commit, or any other

47 information about the commit. If it fails, the commit will not be

48 allowed to start. You can use this to perform a build and require

49 it to complete successfully before a commit can proceed, or

50 automatically enforce a requirement that modified files pass your

51 coding style guidelines.

52 \item[\small\hook{preoutgoing}] This is run before starting to

53 transmit a group of changesets from this repository. You can use

54 this to lock a repository against clones or pulls from remote

55 clients.

56 \item[\small\hook{pretag}] This is run before creating a tag. If it

57 fails, the tag will not be created. You can use this to enforce a

58 uniform tag naming convention.

59 \item[\small\hook{pretxnchangegroup}] This is run after a group of

60 changesets has been brought into the local repository from another,

61 but before the transaction completes that will make the changes

62 permanent in the repository. If it fails, the transaction will be

63 rolled back and the changes will disappear from the local

64 repository. You can use this to automatically check newly arrived

65 changes and, for example, roll them back if the group as a whole

66 does not build or pass your test suite.

67 \item[\small\hook{pretxncommit}] This is run after a new changeset has

68 been created in the local repository, but before the transaction

69 completes that will make it permanent. Unlike the \hook{precommit}

70 hook, this hook can see which changes are present in the changeset,

71 and it can also see all other changeset metadata, such as the commit

72 message. You can use this to require that a commit message follows

73 your local conventions, or that a changeset builds cleanly.

74 \item[\small\hook{preupdate}] This is run before starting an update or

75 merge of the working directory.

76 \item[\small\hook{tag}] This is run after a tag is created.

77 \item[\small\hook{update}] This is run after an update or merge of the

78 working directory has finished.

79 \end{itemize}

80 Each of the hooks with a ``\texttt{pre}'' prefix has the ability to

81 \emph{control} an activity. If the hook succeeds, the activity may

82 proceed; if it fails, the activity is either not permitted or undone,

83 depending on the hook.

85 \section{Hooks and security}

87 \subsection{Hooks are run with your privileges}

89 When you run a Mercurial command in a repository, and the command

90 causes a hook to run, that hook runs on your system, under your user

91 account, with your privilege level. Since hooks are arbitrary pieces

92 of executable code, you should treat them with an appropriate level of

93 suspicion. Do not install a hook unless you are confident that you

94 know who created it and what it does.

96 In some cases, you may be exposed to hooks that you did not install

97 yourself. If you work with Mercurial on an unfamiliar system,

98 Mercurial will run hooks defined in that system's global \hgrc\ file.

100 If you are working with a repository owned by another user, Mercurial

101 will run hooks defined in that repository. For example, if you

102 \hgcmd{pull} from that repository, and its \sfilename{.hg/hgrc}

103 defines a local \hook{outgoing} hook, that hook will run under your

104 user account, even though you don't own that repository.

105

106 \begin{note}

107 This only applies if you are pulling from a repository on a local or

108 network filesystem. If you're pulling over http or ssh, any

109 \hook{outgoing} hook will run under the account of the server

110 process, on the server.

111 \end{note}

112

113 To see what hooks are defined in a repository, use the

114 \hgcmdargs{config}{hooks} command. If you are working in one

115 repository, but talking to another that you do not own (e.g.~using

116 \hgcmd{pull} or \hgcmd{incoming}), remember that it is the other

117 repository's hooks you should be checking, not your own.

118

119 \subsection{Hooks do not propagate}

120

121 In Mercurial, hooks are not revision controlled, and do not propagate

122 when you clone, or pull from, a repository. The reason for this is

123 simple: a hook is a completely arbitrary piece of executable code. It

124 runs under your user identity, with your privilege level, on your

125 machine.

126

127 It would be extremely reckless for any distributed revision control

128 system to implement revision-controlled hooks, as this would offer an

129 easily exploitable way to subvert the accounts of users of the

130 revision control system.

131

132 Since Mercurial does not propagate hooks, if you are collaborating

133 with other people on a common project, you should not assume that they

134 are using the same Mercurial hooks as you are, or that theirs are

135 correctly configured. You should document the hooks you expect people

136 to use.

137

138 In a corporate intranet, this is somewhat easier to control, as you

139 can for example provide a ``standard'' installation of Mercurial on an

140 NFS filesystem, and use a site-wide \hgrc\ file to define hooks that

141 all users will see. However, this too has its limits; see below.

142

143 \subsection{Hooks can be overridden}

144

145 Mercurial allows you to override a hook definition by redefining the

146 hook. You can disable it by setting its value to the empty string, or

147 change its behaviour as you wish.

148

149 If you deploy a system-~or site-wide \hgrc\ file that defines some

150 hooks, you should thus understand that your users can disable or

151 override those hooks.
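The override rule can be modelled with a short Python sketch. It
assumes that an \hgrc\ file is close enough to INI syntax for Python's
\texttt{configparser} module to read it, and that a value read later
replaces one read earlier. The file contents and the
\texttt{run-site-checks} command are hypothetical, and this is a model
of the rule rather than Mercurial's own configuration code.

```python
import configparser

# Hypothetical site-wide and per-user hgrc contents.
SITE_HGRC = """
[hooks]
pretxncommit.check = run-site-checks
"""

USER_HGRC = """
[hooks]
pretxncommit.check =
"""

# Assumption: a value read later replaces one read earlier. This models the
# override rule; it is not Mercurial's own configuration parser.
config = configparser.ConfigParser()
config.read_string(SITE_HGRC)
config.read_string(USER_HGRC)

# The user's empty value has replaced the site-wide command.
effective = config["hooks"]["pretxncommit.check"]
print(repr(effective))
```

In this model, the user's empty definition is the one that takes
effect, so the site-wide check no longer runs for that user.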

152

153 \subsection{Ensuring that critical hooks are run}

154

155 Sometimes you may want to enforce a policy that you do not want others

156 to be able to work around. For example, you may have a requirement

157 that every changeset must pass a rigorous set of tests. Defining this

158 requirement via a hook in a site-wide \hgrc\ won't work for remote

159 users on laptops, and of course local users can subvert it at will by

160 overriding the hook.

161

162 Instead, you can set up your policies for use of Mercurial so that

163 people are expected to propagate changes through a well-known

164 ``canonical'' server that you have locked down and configured

165 appropriately.

166

167 One way to do this is via a combination of social engineering and

168 technology. Set up a restricted-access account; users can push

169 changes over the network to repositories managed by this account, but

170 they cannot log into the account and run normal shell commands. In

171 this scenario, a user can commit a changeset that contains any old

172 garbage they want.

173

174 When someone pushes a changeset to the server that everyone pulls

175 from, the server will test the changeset before it accepts it as

176 permanent, and reject it if it fails to pass the test suite. If

177 people only pull changes from this filtering server, it will serve to

178 ensure that all changes that people pull have been automatically

179 vetted.

180

181 \section{A short tutorial on using hooks}

182 \label{sec:hook:simple}

183

184 It is easy to write a Mercurial hook. Let's start with a hook that

185 runs when you finish a \hgcmd{commit}, and simply prints the hash of

186 the changeset you just created. The hook is called \hook{commit}.

187

188 \begin{figure}[ht]

189 \interaction{hook.simple.init}

190 \caption{A simple hook that runs when a changeset is committed}

191 \label{ex:hook:init}

192 \end{figure}

193

194 All hooks follow the pattern in example~\ref{ex:hook:init}. You add

195 an entry to the \rcsection{hooks} section of your \hgrc\ file. On the left

196 is the name of the event to trigger on; on the right is the action to

197 take. As you can see, you can run an arbitrary shell command in a

198 hook. Mercurial passes extra information to the hook using

199 environment variables (look for \envar{HG\_NODE} in the example).

200

201 \subsection{Performing multiple actions per event}

202

203 Quite often, you will want to define more than one hook for a

204 particular kind of event, as shown in example~\ref{ex:hook:ext}.

205 Mercurial lets you do this by adding an \emph{extension} to the end of

206 a hook's name. You extend a hook's name by giving the name of the

207 hook, followed by a full stop (the ``\texttt{.}'' character), followed

208 by some more text of your choosing. For example, Mercurial will run

209 both \texttt{commit.foo} and \texttt{commit.bar} when the

210 \texttt{commit} event occurs.

211

212 \begin{figure}[ht]

213 \interaction{hook.simple.ext}

214 \caption{Defining a second \hook{commit} hook}

215 \label{ex:hook:ext}

216 \end{figure}

217

218 To give a well-defined order of execution when there are multiple

219 hooks defined for an event, Mercurial sorts hooks by extension, and

220 executes the hook commands in this sorted order. In the above

221 example, it will execute \texttt{commit.bar} before

222 \texttt{commit.foo}, and \texttt{commit} before both.
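The ordering rule can be illustrated with a short Python sketch. It
assumes that sorting by extension amounts to an ordinary string sort
of the full hook names, which reproduces the order described above; it
is a model of the rule, not Mercurial's own code.

```python
# Hook names as they might appear in the [hooks] section of an hgrc file.
hook_names = ["commit.foo", "commit.bar", "commit"]

# Assumption: a plain string sort models the sort-by-extension rule.
run_order = sorted(hook_names)

print(run_order)
```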

223

224 It is a good idea to use a somewhat descriptive extension when you

225 define a new hook. This will help you to remember what the hook was

226 for. If the hook fails, you'll get an error message that contains the

227 hook name and extension, so using a descriptive extension could give

228 you an immediate hint as to why the hook failed (see

229 section~\ref{sec:hook:perm} for an example).

230

231 \subsection{Controlling whether an activity can proceed}

232 \label{sec:hook:perm}

233

234 In our earlier examples, we used the \hook{commit} hook, which is

235 run after a commit has completed. This is one of several Mercurial

236 hooks that run after an activity finishes. Such hooks have no way of

237 influencing the activity itself.

238

239 Mercurial defines a number of events that occur before an activity

240 starts; or after it starts, but before it finishes. Hooks that

241 trigger on these events have the added ability to choose whether the

242 activity can continue, or will abort.

243

244 The \hook{pretxncommit} hook runs after a commit has all but

245 completed. In other words, the metadata representing the changeset

246 has been written out to disk, but the transaction has not yet been

247 allowed to complete. The \hook{pretxncommit} hook has the ability to

248 decide whether the transaction can complete, or must be rolled back.

249

250 If the \hook{pretxncommit} hook exits with a status code of zero, the

251 transaction is allowed to complete; the commit finishes; and the

252 \hook{commit} hook is run. If the \hook{pretxncommit} hook exits with

253 a non-zero status code, the transaction is rolled back; the metadata

254 representing the changeset is erased; and the \hook{commit} hook is

255 not run.

256

257 \begin{figure}[ht]

258 \interaction{hook.simple.pretxncommit}

259 \caption{Using the \hook{pretxncommit} hook to control commits}

260 \label{ex:hook:pretxncommit}

261 \end{figure}

262

263 The hook in example~\ref{ex:hook:pretxncommit} checks that a commit

264 comment contains a bug ID. If it does, the commit can complete. If

265 not, the commit is rolled back.
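The same kind of check can be sketched in Python. The convention used
here (the word ``bug'' followed by a number) and the names
\texttt{has\_bug\_id} and \texttt{hook\_status} are assumptions made
for illustration; they are not taken from the example above. The
sketch only shows how the result of the check maps onto the status
code that decides whether the commit completes.

```python
import re

# Hypothetical convention: a bug ID is the word "bug" followed by a number.
BUG_ID = re.compile(r"\bbug\s*\d+", re.IGNORECASE)


def has_bug_id(message):
    """Return True if the commit message mentions a bug ID."""
    return BUG_ID.search(message) is not None


def hook_status(message):
    """Map the check onto a status: 0 allows the commit, 1 rolls it back."""
    return 0 if has_bug_id(message) else 1
```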

266

267 \section{Writing your own hooks}

268

269 When you are writing a hook, you might find it useful to run Mercurial

270 either with the \hggopt{-v} option, or the \rcitem{ui}{verbose} config

271 item set to ``true''. When you do so, Mercurial will print a message

272 before it calls each hook.

273

274 \subsection{Choosing how your hook should run}

275 \label{sec:hook:lang}

276

277 You can write a hook either as a normal program---typically a shell

278 script---or as a Python function that is executed within the Mercurial

279 process.

280

281 Writing a hook as an external program has the advantage that it

282 requires no knowledge of Mercurial's internals. You can call normal

283 Mercurial commands to get any added information you need. The

284 trade-off is that external hooks are slower than in-process hooks.

285

286 An in-process Python hook has complete access to the Mercurial API,

287 and does not ``shell out'' to another process, so it is inherently

288 faster than an external hook. It is also easier to obtain much of the

289 information that a hook requires by using the Mercurial API than by

290 running Mercurial commands.

291

292 If you are comfortable with Python, or require high performance,

293 writing your hooks in Python may be a good choice. However, when you

294 have a straightforward hook to write and you don't need to care about

295 performance (probably the majority of hooks), a shell script is

296 perfectly fine.

297

298 \subsection{Hook parameters}

299 \label{sec:hook:param}

300

301 Mercurial calls each hook with a set of well-defined parameters. In

302 Python, a parameter is passed as a keyword argument to your hook

303 function. For an external program, a parameter is passed as an

304 environment variable.

305

306 Whether your hook is written in Python or as a shell script, the

307 hook-specific parameter names and values will be the same. A boolean

308 parameter will be represented as a boolean value in Python, but as the

309 number 1 (for ``true'') or 0 (for ``false'') as an environment

310 variable for an external hook. If a hook parameter is named

311 \texttt{foo}, the keyword argument for a Python hook will also be

312 named \texttt{foo}, while the environment variable for an

313 external hook will be named \texttt{HG\_FOO}.
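The naming and value rules can be written down as a small Python
sketch. The helper names \texttt{env\_name} and \texttt{env\_value}
are hypothetical; the code restates the rules given above and is not
Mercurial's implementation.

```python
def env_name(param):
    """Name of the environment variable that carries a hook parameter."""
    return "HG_" + param.upper()


def env_value(value):
    """Render a parameter value as an external hook would see it."""
    if isinstance(value, bool):
        # Booleans become the number 1 (true) or 0 (false).
        return "1" if value else "0"
    return str(value)
```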

314

315 \subsection{Hook return values and activity control}

316

317 A hook that executes successfully must exit with a status of zero if

318 external, or return boolean ``false'' if in-process. Failure is

319 indicated with a non-zero exit status from an external hook, or an

320 in-process hook returning boolean ``true''. If an in-process hook

321 raises an exception, the hook is considered to have failed.

322

323 For a hook that controls whether an activity can proceed, zero/false

324 means ``allow'', while non-zero/true/exception means ``deny''.
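These rules can be summarised in a brief Python sketch. The function
names are hypothetical, and the sketch assumes that any false value
returned by an in-process hook, including \texttt{None}, counts as
success.

```python
def external_hook_allows(exit_status):
    """An external hook allows the activity only if it exits with zero."""
    return exit_status == 0


def in_process_hook_allows(hook, *args, **kwargs):
    """An in-process hook allows the activity if it returns a false value.

    An exception raised by the hook counts as failure.
    """
    try:
        result = hook(*args, **kwargs)
    except Exception:
        return False
    return not result
```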

325

326 \subsection{Writing an external hook}

327

328 When you define an external hook in your \hgrc\ and the hook is run,

329 its value is passed to your shell, which interprets it. This means

330 that you can use normal shell constructs in the body of the hook.

331

332 An executable hook is always run with its current directory set to a

333 repository's root directory.

334

335 Each hook parameter is passed in as an environment variable; the name

336 is upper-cased, and prefixed with the string ``\texttt{HG\_}''.

337

338 With the exception of hook parameters, Mercurial does not set or

339 modify any environment variables when running a hook. This is useful

340 to remember if you are writing a site-wide hook that may be run by a

341 number of different users with differing environment variables set.

342 In multi-user situations, you should not rely on environment variables

343 being set to the values you have in your environment when testing the

344 hook.
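An external hook does not have to be a shell script. As a minimal
sketch, the following Python program collects the hook parameters it
was given by looking for environment variables that carry the
``\texttt{HG\_}'' prefix. The function name
\texttt{hook\_parameters} is hypothetical, and the sketch assumes that
every variable with that prefix is a hook parameter.

```python
import os


def hook_parameters(environ=None):
    """Collect hook parameters from HG_-prefixed environment variables."""
    if environ is None:
        environ = os.environ
    prefix = "HG_"
    return {
        name[len(prefix):].lower(): value
        for name, value in environ.items()
        if name.startswith(prefix)
    }


if __name__ == "__main__":
    # Print each parameter on its own line, in a stable order.
    for name, value in sorted(hook_parameters().items()):
        print("%s=%s" % (name, value))
```

Because the function reads only prefixed variables, it does not depend
on anything else being set in the environment of the user who happens
to run the hook.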

345

346 \subsection{Telling Mercurial to use an in-process hook}

347

348 The \hgrc\ syntax for defining an in-process hook is slightly

349 different from that for an executable hook. The value of the hook must

350 start with the text ``\texttt{python:}'', and continue with the

351 fully-qualified name of a callable object to use as the hook's value.

352

353 The module in which a hook lives is automatically imported when a hook

354 is run. So long as you have the module name and \envar{PYTHONPATH}

355 right, it should ``just work''.

356

357 The following \hgrc\ example snippet illustrates the syntax and

358 meaning of the notions we just described.

359 \begin{codesample2}

360 [hooks]

361 commit.example = python:mymodule.submodule.myhook

362 \end{codesample2}

363 When Mercurial runs the \texttt{commit.example} hook, it imports

364 \texttt{mymodule.submodule}, looks for the callable object named

365 \texttt{myhook}, and calls it.
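The lookup described here can be approximated in a few lines of
Python. This is a sketch of the idea that uses the standard
\texttt{importlib} module, not Mercurial's own loader, and the name
\texttt{resolve\_hook} is hypothetical.

```python
import importlib


def resolve_hook(value):
    """Resolve a ``python:module.name`` hook value to a callable object."""
    prefix = "python:"
    if not value.startswith(prefix):
        raise ValueError("not an in-process hook: %r" % value)
    # Split "pkg.module.func" into the module path and the attribute name.
    module_name, _, attr = value[len(prefix):].rpartition(".")
    if not module_name:
        raise ValueError("hook must name a module and a callable: %r" % value)
    module = importlib.import_module(module_name)
    hook = getattr(module, attr)
    if not callable(hook):
        raise TypeError("%s is not callable" % value)
    return hook
```

With this sketch, \texttt{resolve\_hook("python:os.path.join")} returns
the standard library's \texttt{os.path.join} function, since
\texttt{os.path} is always importable.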

366

367 \subsection{Writing an in-process hook}

368

369 The simplest in-process hook does nothing, but illustrates the basic

370 shape of the hook API:

371 \begin{codesample2}

372 def myhook(ui, repo, **kwargs):

373     pass

374 \end{codesample2}

375 The first argument to a Python hook is always a

376 \pymodclass{mercurial.ui}{ui} object. The second is a repository object;

377 at the moment, it is always an instance of

378 \pymodclass{mercurial.localrepo}{localrepository}. Following these two

379 arguments are other keyword arguments. Which ones are passed in

380 depends on the hook being called, but a hook can ignore arguments it

381 doesn't care about by dropping them into a keyword argument dict, as

382 with \texttt{**kwargs} above.
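A slightly more useful hook reports the keyword arguments it was
given, which is a convenient way to discover what a particular event
passes in. This is a minimal sketch: the name \texttt{report\_args} is
hypothetical, and depending on the Mercurial version,
\texttt{ui.write} may expect byte strings rather than text.

```python
def report_args(ui, repo, **kwargs):
    """Print the keyword arguments that were passed to the hook."""
    for name in sorted(kwargs):
        ui.write("%s=%s\n" % (name, kwargs[name]))
    # A false return value tells Mercurial that the hook succeeded.
    return False
```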

383

384

385 %%% Local Variables:

386 %%% mode: latex

387 %%% TeX-master: "00book"

388 %%% End: