Version Control with Subversion - Repository Maintenance - Migrating a Repository
Migrating a Repository
A Subversion filesystem has its data spread throughout
various database tables in a fashion generally understood by
(and of interest to) only the Subversion developers
themselves. However, circumstances may arise that call for
all, or some subset, of that data to be collected into a
single, portable, flat file format. Subversion provides such
a mechanism, implemented in a pair of svnadmin
subcommands: dump and load.
The most common reason to dump and load a Subversion
repository is a change in Subversion itself. As
Subversion matures, there are times when certain changes made
to the back-end database schema cause Subversion to be
incompatible with previous versions of the repository. Other
reasons for dumping and loading might be to migrate a Berkeley
DB repository to a new OS or CPU architecture, or to switch
between Berkeley DB and FSFS back-ends. The recommended
course of action is relatively simple:
-
Using your current version of svnadmin, dump your repositories to
dump files.
-
Upgrade to the new version of Subversion.
-
Move your old repositories out of the way, and create
new empty ones in their place using your new svnadmin.
-
Again using your new svnadmin, load your dump files into
their respective, just-created repositories.
-
Be sure to copy any customizations from your old
repositories to the new ones, including
DB_CONFIG files and hook scripts.
You'll want to pay attention to the release notes for the
new release of Subversion to see if any changes since your
last upgrade affect those hooks or configuration
options.
-
If the migration process made your repository
accessible at a different URL (e.g., it moved to a different
computer, or is being accessed via a different URL scheme),
then you'll probably want to tell your users to run
svn switch --relocate on their existing
working copies. See svn switch.
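As a rough sketch, the dump, re-create, and load steps above might be scripted along the following lines. The function names dump_repo and reload_repo, the ".old" suffix, and the dump file location are illustrative assumptions, not part of Subversion. Copying hook scripts and DB_CONFIG files is left as a manual step.

```shell
# Hypothetical helpers; the names and the ".old" suffix are assumptions.

# Step 1: run this with your CURRENT (pre-upgrade) svnadmin.
dump_repo() (
    set -e
    repo=$1
    dumpfile=$2
    svnadmin dump "$repo" > "$dumpfile"
)

# Steps 3 and 4: run this after upgrading, with the NEW svnadmin.
reload_repo() (
    set -e
    repo=$1
    dumpfile=$2
    mv "$repo" "$repo.old"                # move the old repository out of the way
    svnadmin create "$repo"               # create a new, empty one in its place
    svnadmin load "$repo" < "$dumpfile"   # replay the dumped revisions
    # Step 5 stays manual: copy hook scripts and DB_CONFIG from "$repo.old".
)
```

You might call dump_repo /path/to/myrepos /path/to/myrepos.dump before the upgrade, and reload_repo with the same two arguments afterward.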
svnadmin dump will output a range of
repository revisions that are formatted using Subversion's
custom filesystem dump format. The dump format is printed to
the standard output stream, while informative messages are
printed to the standard error stream. This allows you to
redirect the output stream to a file while watching the status
output in your terminal window. For example:
$ svnlook youngest myrepos
26
$ svnadmin dump myrepos > dumpfile
* Dumped revision 0.
* Dumped revision 1.
* Dumped revision 2.
…
* Dumped revision 25.
* Dumped revision 26.
At the end of the process, you will have a single file
(dumpfile in the previous example) that
contains all the data stored in your repository in the
requested range of revisions. Note that svnadmin dump
is reading revision trees from the repository
just like any other “reader” process would
(svn checkout, for example). So it's safe
to run this command at any time.
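Because the dump data and the progress messages travel on separate streams, the messages can also be captured in a log instead of being watched in the terminal. A minimal sketch, in which the function name dump_with_log and the file arguments are assumptions:

```shell
# Hypothetical helper: dump data goes to one file, progress messages to another.
dump_with_log() (
    repo=$1
    dumpfile=$2
    logfile=$3
    # stdout carries the dump data; stderr carries the "* Dumped revision N." lines.
    svnadmin dump "$repo" > "$dumpfile" 2> "$logfile"
)
```

For example, dump_with_log myrepos dumpfile dump.log leaves the terminal quiet and keeps the status output in dump.log for later inspection.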
The other subcommand in the pair, svnadmin load,
parses the standard input stream as a
Subversion repository dump file, and effectively replays those
dumped revisions into the target repository for that
operation. It also gives informative feedback, this time
using the standard output stream:
$ svnadmin load newrepos < dumpfile
<<< Started new txn, based on original revision 1
* adding path : A ... done.
* adding path : A/B ... done.
…
------- Committed new rev 1 (loaded from original rev 1) >>>
<<< Started new txn, based on original revision 2
* editing path : A/mu ... done.
* editing path : A/D/G/rho ... done.
------- Committed new rev 2 (loaded from original rev 2) >>>
…
<<< Started new txn, based on original revision 25
* editing path : A/D/gamma ... done.
------- Committed new rev 25 (loaded from original rev 25) >>>
<<< Started new txn, based on original revision 26
* adding path : A/Z/zeta ... done.
* editing path : A/mu ... done.
------- Committed new rev 26 (loaded from original rev 26) >>>
The result of a load is new revisions added to a
repository—the same thing you get by making commits
against that repository from a regular Subversion client. And
just as in a commit, you can use hook scripts to perform
actions before and after each of the commits made during a load
process. By passing the --use-pre-commit-hook
and --use-post-commit-hook options to
svnadmin load, you can instruct Subversion
to execute the pre-commit and post-commit hook scripts,
respectively, for each loaded revision. You might use these,
for example, to ensure that loaded revisions pass through the
same validation steps that regular commits pass through. Of
course, you should use these options with care—if your
post-commit hook sends emails to a mailing list for each new
commit, you might not want to spew hundreds or thousands of
commit emails in rapid succession at that list for each of the
loaded revisions! You can read more about the use of hook
scripts in
the section called “Hook Scripts”.
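As a sketch of the cautious use described above, the following runs the pre-commit hook for each loaded revision while leaving the post-commit hook (and any emails it sends) out of the picture. The function name load_with_validation and its arguments are hypothetical. The option is the one named in the text.

```shell
# Hypothetical helper: load a dump file, running the pre-commit hook for each
# loaded revision but deliberately NOT passing --use-post-commit-hook.
load_with_validation() (
    repo=$1
    dumpfile=$2
    svnadmin load --use-pre-commit-hook "$repo" < "$dumpfile"
)
```

A call such as load_with_validation newrepos dumpfile would then apply the same validation that regular commits receive.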
Note that because svnadmin uses
standard input and output streams for the repository dump and
load process, people who are feeling especially saucy can try
things like this (perhaps even using different versions of
svnadmin on each side of the pipe):
$ svnadmin create newrepos
$ svnadmin dump myrepos | svnadmin load newrepos
By default, the dump file will be quite large—much
larger than the repository itself. That's because every
version of every file is expressed as a full text in the
dump file. This is the fastest and simplest behavior, and nice
if you're piping the dump data directly into some other
process (such as a compression program, filtering program, or
into a loading process). But if you're creating a dump file for
longer-term storage, you'll likely want to save disk space by
using the --deltas switch. With this option,
successive revisions of files will be output as compressed,
binary differences—just as file revisions are stored in
a repository. This option is slower, but results in a
dump file much closer in size to the original
repository.
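A short sketch of the storage-oriented variant follows. The helper name dump_for_storage and its arguments are assumptions made for illustration.

```shell
# Hypothetical helper: write a delta-encoded dump file for longer-term storage.
dump_for_storage() (
    repo=$1
    dumpfile=$2
    # --deltas emits differences between successive file revisions instead of
    # full texts: slower to produce, but a much smaller dump file.
    svnadmin dump "$repo" --deltas > "$dumpfile"
)
```

For example, dump_for_storage myrepos myrepos-deltas.dumpfile. The resulting file is loaded with svnadmin load in the usual way.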
We mentioned previously that svnadmin dump
outputs a range of revisions. Use the
--revision option to specify a single
revision to dump, or a range of revisions. If you omit this
option, all the existing repository revisions will be
dumped.
$ svnadmin dump myrepos --revision 23 > rev-23.dumpfile
$ svnadmin dump myrepos --revision 100:200 > revs-100-200.dumpfile
As Subversion dumps each new revision, it outputs only
enough information to allow a future loader to re-create that
revision based on the previous one. In other words, for any
given revision in the dump file, only the items that were
changed in that revision will appear in the dump. The only
exception to this rule is the first revision that is dumped
with the current svnadmin dump command.
By default, Subversion will not express the first dumped
revision as merely differences to be applied to the previous
revision. For one thing, there is no previous revision in the
dump file! For another, Subversion cannot know the state of
the repository into which the dump data will be loaded (if it
is ever, in fact, loaded). To ensure that the output of each
execution of svnadmin dump is
self-sufficient, the first dumped revision is by default a
full representation of every directory, file, and property in
that revision of the repository.
However, you can change this default behavior. If you add
the --incremental option when you dump your
repository, svnadmin will compare the first
dumped revision against the previous revision in the
repository, the same way it treats every other revision that
gets dumped. It will then output the first revision exactly
as it does the rest of the revisions in the dump
range—mentioning only the changes that occurred in that
revision. The benefit of this is that you can create several
small dump files that can be loaded in succession, instead of
one large one, like so:
$ svnadmin dump myrepos --revision 0:1000 > dumpfile1
$ svnadmin dump myrepos --revision 1001:2000 --incremental > dumpfile2
$ svnadmin dump myrepos --revision 2001:3000 --incremental > dumpfile3
These dump files could be loaded into a new repository with
the following command sequence:
$ svnadmin load newrepos < dumpfile1
$ svnadmin load newrepos < dumpfile2
$ svnadmin load newrepos < dumpfile3
Another neat trick you can perform with this
--incremental option involves appending to an
existing dump file a new range of dumped revisions. For
example, you might have a post-commit hook
that simply appends the repository dump of the single revision
that triggered the hook. Or you might have a script that runs
nightly to append dump file data for all the revisions that
were added to the repository since the last time the script
ran. Used like this, svnadmin's
dump and load commands
can be a valuable means by which to back up changes to your
repository over time in case of a system crash or some other
catastrophic event.
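A nightly script of the kind described above might look roughly like this. It is a sketch under several assumptions: the function name backup_new_revisions, the state file that remembers the last revision dumped, and all paths are hypothetical. It relies only on svnlook youngest and on the --revision and --incremental options shown earlier.

```shell
# Hypothetical nightly backup: append the revisions added since the last run.
backup_new_revisions() (
    set -e
    repo=$1        # repository path
    dumpfile=$2    # dump file that grows over time
    statefile=$3   # assumption: holds the last revision already dumped

    youngest=$(svnlook youngest "$repo")
    if [ -f "$statefile" ]; then
        last=$(cat "$statefile")
    else
        last=-1    # nothing has been dumped yet
    fi

    # Nothing new since the previous run.
    if [ "$youngest" -le "$last" ]; then
        exit 0     # exits only this subshell function
    fi

    first=$((last + 1))
    if [ "$first" -eq 0 ]; then
        # First run: a full, self-sufficient dump starting at revision 0.
        svnadmin dump "$repo" --revision "0:$youngest" >> "$dumpfile"
    else
        # Later runs: only the changes made in the new revisions.
        svnadmin dump "$repo" --revision "$first:$youngest" --incremental >> "$dumpfile"
    fi
    echo "$youngest" > "$statefile"
)
```

Recording the youngest revision only after a successful dump means a failed run is retried from the same starting point the next night.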
The dump format can also be used to merge the contents of
several different repositories into a single repository. By
using the --parent-dir option of svnadmin load,
you can specify a new virtual root directory
for the load process. That means if you have dump files for
three repositories, say calc-dumpfile,
cal-dumpfile, and
ss-dumpfile, you can first create a new
repository to hold them all:
$ svnadmin create /path/to/projects
$
Then, make new directories in the repository which will
encapsulate the contents of each of the three previous
repositories:
$ svn mkdir -m "Initial project roots" \
file:///path/to/projects/calc \
file:///path/to/projects/calendar \
file:///path/to/projects/spreadsheet
Committed revision 1.
$
Lastly, load the individual dump files into their
respective locations in the new repository:
$ svnadmin load /path/to/projects --parent-dir calc < calc-dumpfile
…
$ svnadmin load /path/to/projects --parent-dir calendar < cal-dumpfile
…
$ svnadmin load /path/to/projects --parent-dir spreadsheet < ss-dumpfile
…
$
We'll mention one final way to use the Subversion
repository dump format—conversion from a different
storage mechanism or version control system altogether.
Because the dump file format is, for the most part,
human-readable,[18]
it should be relatively easy to describe generic sets of
changes—each of which should be treated as a new
revision—using this file format. In fact, the
cvs2svn utility (see the section called “Converting a Repository
from CVS to Subversion”) uses the dump format to represent the
contents of a CVS repository so that those contents can be
copied into a Subversion repository.