Before jumping into the broader topic of repository
administration, let's further define what a repository is. How
does it look? How does it feel? Does it take its tea hot or
iced, sweetened, and with lemon? As an administrator, you'll be
expected to understand the composition of a repository both from
a logical perspective—dealing with how data is represented
inside the repository—and from a physical nuts-and-bolts
perspective—how a repository looks and acts with respect
to non-Subversion tools. The following section covers some of
these basic concepts at a very high level.
Understanding Transactions and Revisions
Conceptually speaking, a Subversion repository is a
sequence of directory trees. Each tree is a snapshot of how
the files and directories versioned in your repository looked
at some point in time. These snapshots are created as a
result of client operations, and are called revisions.
Every revision begins life as a transaction tree. When
doing a commit, a client builds a Subversion transaction that
mirrors their local changes (plus any additional changes that
might have been made to the repository since the beginning of
the client's commit process), and then instructs the
repository to store that tree as the next snapshot in the
sequence. If the commit succeeds, the transaction is
effectively promoted into a new revision tree, and is assigned
a new revision number. If the commit fails for some reason,
the transaction is destroyed and the client is informed of the
failure.
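The commit cycle described above can be sketched as a toy model. This is an illustration only, not Subversion's implementation: the `ToyRepository` class, its method names, and the flat path-to-content dictionaries are all hypothetical, and the rule that rejects any transaction whose base is no longer the youngest revision is a simplifying assumption of the sketch.

```python
import copy


class ToyRepository:
    """Toy model: a repository is an ordered list of directory-tree snapshots."""

    def __init__(self):
        # Revision 0 is the empty tree; the list index is the revision number.
        self.revisions = [{}]
        # In-progress transaction trees, keyed by a made-up transaction name.
        self.transactions = {}
        self._next_txn_id = 0

    def youngest(self):
        """Return the number of the most recent revision."""
        return len(self.revisions) - 1

    def begin_txn(self):
        """Start a transaction tree as a private copy of the youngest revision."""
        name = f"txn-{self._next_txn_id}"
        self._next_txn_id += 1
        self.transactions[name] = {
            "base": self.youngest(),
            "tree": copy.deepcopy(self.revisions[-1]),
        }
        return name

    def commit_txn(self, name):
        """Promote a transaction to the next revision, or destroy it on failure."""
        # The transaction is removed either way: promoted or thrown away.
        txn = self.transactions.pop(name)
        if txn["base"] != self.youngest():
            # Simplifying assumption: any intervening commit is a conflict.
            raise RuntimeError("transaction is out of date; commit failed")
        self.revisions.append(txn["tree"])
        return self.youngest()
```

In this sketch a client would call `begin_txn()`, edit the returned transaction's `"tree"`, and then call `commit_txn()`. A successful commit yields the new revision number; a failed one raises an error and leaves no transaction behind.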
Updates work in a similar way. The client builds a
temporary transaction tree that mirrors the state of the
working copy. The repository then compares that transaction
tree with the revision tree at the requested revision (usually
the most recent, or “youngest” tree), and sends
back a description of the changes needed to transform the
client's working copy into a replica of
that revision tree. After the update completes, the temporary
transaction is deleted.
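The comparison step in an update can be pictured as a tree difference. The following is a minimal sketch under stated assumptions: trees are modeled as flat dictionaries mapping paths to file contents, and the function names `describe_update` and `apply_changes` are hypothetical. Subversion's real delta format is considerably richer than this.

```python
def describe_update(working_tree, revision_tree):
    """List the changes that turn working_tree into a replica of revision_tree."""
    changes = []
    for path in sorted(set(working_tree) | set(revision_tree)):
        if path not in revision_tree:
            # Present in the working copy but gone from the requested revision.
            changes.append(("delete", path))
        elif path not in working_tree:
            # New in the requested revision.
            changes.append(("add", path, revision_tree[path]))
        elif working_tree[path] != revision_tree[path]:
            # Present in both, but with different contents.
            changes.append(("modify", path, revision_tree[path]))
    return changes


def apply_changes(tree, changes):
    """Apply the described changes to a copy of tree and return the result."""
    result = dict(tree)
    for change in changes:
        if change[0] == "delete":
            del result[change[1]]
        else:
            # Both "add" and "modify" carry the new contents as the third item.
            result[change[1]] = change[2]
    return result
```

Here `working_tree` plays the role of the temporary transaction tree that mirrors the working copy. Once the changes have been computed and sent, that temporary tree is no longer needed, which matches the short lifetime described above.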
The use of transaction trees is the only way to make
permanent changes to a repository's versioned filesystem.
However, it's important to understand that the lifetime of a
transaction is completely flexible. In the case of updates,
transactions are temporary trees that are immediately
destroyed. In the case of commits, transactions are
transformed into permanent revisions (or removed if the commit
fails). In the case of an error or bug, a transaction can be
accidentally left lying around in the
repository (not really affecting anything, but still taking up
space).
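An administrator can look for such leftover transactions with the `svnadmin lstxns` and `svnadmin rmtxns` subcommands. The sketch below wraps them with Python's `subprocess` module. The helper names and the injectable `run` parameter are assumptions made for illustration (the parameter exists only so that the functions can be exercised without a real repository), and the sketch assumes `svnadmin` is on the `PATH`.

```python
import subprocess


def list_transactions(repos_path, run=subprocess.run):
    """Return the names of uncommitted transactions printed by `svnadmin lstxns`."""
    result = run(
        ["svnadmin", "lstxns", repos_path],
        capture_output=True,
        text=True,
        check=True,
    )
    # The command prints one transaction name per line.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def remove_transactions(repos_path, txn_names, run=subprocess.run):
    """Remove the named transactions with `svnadmin rmtxns`."""
    if not txn_names:
        # Nothing to remove; avoid calling svnadmin with no transaction names.
        return
    run(["svnadmin", "rmtxns", repos_path, *txn_names], check=True)
```

A listed transaction is not necessarily dead: it may belong to a commit that is still in progress, and removing it would cause that commit to fail. It is worth confirming that a transaction is really abandoned before passing its name to `remove_transactions`.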
In theory, someday whole workflow applications might
revolve around more fine-grained control of transaction
lifetime. It is feasible to imagine a system whereby each
transaction slated to become a revision is left in stasis well
after the client finishes describing its changes to the
repository. This would enable each new commit to be reviewed
by someone else, perhaps a manager or engineering QA team, who
can choose to promote the transaction into a revision, or
abort it.