As is especially the case when developing software, the data
that you maintain under version control is often closely related
to, or perhaps dependent upon, someone else's data. Generally,
the needs of your project will dictate that you stay as
up-to-date as possible with the data provided by that external
entity without sacrificing the stability of your own project.
This scenario plays itself out all the time—anywhere that
the information generated by one group of people has a direct
effect on that which is generated by another group.
For example, software developers might be working on an
application which makes use of a third-party library.
Subversion has just such a relationship with the Apache Portable
Runtime library (see
the section called “The Apache Portable Runtime Library”). The
Subversion source code depends on the APR library for all its
portability needs. In earlier stages of Subversion's
development, the project closely tracked APR's changing API,
always sticking to the “bleeding edge” of the
library's code churn. Now that both APR and Subversion have
matured, Subversion attempts to synchronize with APR's library
API only at well-tested, stable release points.
Now, if your project depends on someone else's information,
there are several ways that you could attempt to synchronize that
information with your own. Most painfully, you could issue oral
or written instructions to all the contributors of your project,
telling them to make sure that they have the specific versions
of that third-party information that your project needs. If the
third-party information is maintained in a Subversion
repository, you could also use Subversion's externals
definitions to effectively “pin down” specific
versions of that information to some location in your own
working copy directory (see
the section called “Externals Definitions”).
But sometimes you want to maintain custom modifications to
third-party data in your own version control system. Returning
to the software development example, programmers might need to
make modifications to that third-party library for their own
purposes. These modifications might include new functionality
or bug fixes, maintained internally only until they become part
of an official release of the third-party library. Or the
changes might never be relayed back to the library maintainers,
existing solely as custom tweaks to make the library further
suit the needs of the software developers.
Now you face an interesting situation. Your project could
house its custom modifications to the third-party data in some
disjointed fashion, such as using patch files or full-fledged
alternate versions of files and directories. But these quickly
become maintenance headaches, requiring some mechanism by which
to apply your custom changes to the third-party data, and
necessitating regeneration of those changes with each successive
version of the third-party data that you track.
The solution to this problem is to use vendor
branches. A vendor branch is a directory tree in
your own version control system that contains information
provided by a third-party entity, or vendor. Each version of
the vendor's data that you decide to absorb into your project is
called a vendor drop.
Vendor branches provide two key benefits. First, by storing
the currently supported vendor drop in your own version control
system, the members of your project never need to question
whether they have the right version of the vendor's data. They
simply receive that correct version as part of their regular
working copy updates. Second, because the data lives in your
own Subversion repository, you can store your custom changes to
it in-place—you have no more need of an automated (or
worse, manual) method for swapping in your customizations.
General Vendor Branch Management Procedure
Managing vendor branches generally works like this. You
create a top-level directory (such as /vendor) to hold the
vendor branches. Then you import the third-party code into a
subdirectory of that top-level directory. You then copy that
subdirectory into your main development branch (for example,
/trunk) at the appropriate location. You always make your local
changes in the main development branch. With each new release
of the code you are tracking, you bring it into the vendor
branch and merge the changes into /trunk, resolving whatever
conflicts occur between your local changes and the upstream
changes.
Perhaps an example will help to clarify this algorithm.
We'll use a scenario where your development team is creating a
calculator program that links against a third-party complex
number arithmetic library, libcomplex. We'll begin with the
initial creation of the vendor branch, and the import of the
first vendor drop. We'll call our vendor branch directory
libcomplex, and our code drops will go into a subdirectory of
our vendor branch called current. And since svn import creates
all the intermediate parent directories it needs, we can
actually accomplish both of these steps with a single command.
$ svn import /path/to/libcomplex-1.0 \
https://svn.example.com/repos/vendor/libcomplex/current \
-m 'importing initial 1.0 vendor drop'
…
We now have the current version of the libcomplex source
code in /vendor/libcomplex/current. Now we tag that version
(see the section called “Tags”) and then copy it into the main
development branch. Our copy will create a new directory called
libcomplex in our existing calc project directory. It is in
this copied version of the vendor data that we will make our
customizations.
$ svn copy https://svn.example.com/repos/vendor/libcomplex/current \
https://svn.example.com/repos/vendor/libcomplex/1.0 \
-m 'tagging libcomplex-1.0'
…
$ svn copy https://svn.example.com/repos/vendor/libcomplex/1.0 \
https://svn.example.com/repos/calc/libcomplex \
-m 'bringing libcomplex-1.0 into the main branch'
…
We check out our project's main branch—which now
includes a copy of the first vendor drop—and we get to
work customizing the libcomplex code. Before we know it, our
modified version of libcomplex is now completely integrated
into our calculator program.
A few weeks later, the developers of libcomplex release a
new version of their library—version 1.1—which
contains some features and functionality that we really want.
We'd like to upgrade to this new version, but without losing
the customizations we made to the existing version. What we
essentially would like to do is to replace our current
baseline version of libcomplex 1.0 with a copy of libcomplex
1.1, and then re-apply the custom modifications we previously
made to that library to the new version. But we actually
approach the problem from the other direction, applying the
changes made to libcomplex between versions 1.0 and 1.1 to our
modified copy of it.
To perform this upgrade, we check out a copy of our vendor
branch and replace the code in the current directory with the
new libcomplex 1.1 source code. We quite literally copy new
files on top of existing files, perhaps exploding the
libcomplex 1.1 release tarball atop our existing files and
directories. The goal here is to make our current directory
contain only the libcomplex 1.1 code, and to ensure that all
that code is under version control. Oh, and we want to do
this with as little version control history disturbance as
possible.
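Unpacking a tarball over existing files does not remove files
that the new release no longer ships, so the overlay alone may
leave 1.0 leftovers behind. One way to find them is to compare
the working copy against a pristine copy of the release. The
sketch below assumes the 1.1 tarball has also been unpacked
into a separate scratch directory; the function name and both
directory arguments are hypothetical.

```shell
# stale_files WORKING_COPY PRISTINE_RELEASE
# Print files that exist in the working copy but not in the pristine
# release tree, as paths relative to the working copy root.
stale_files() {
    wc=$1
    release=$2
    tmp=$(mktemp -d) || return 1
    # Sorted relative file lists; skip Subversion's .svn metadata directories.
    (cd "$wc" && find . -name .svn -prune -o -type f -print | LC_ALL=C sort) > "$tmp/wc"
    (cd "$release" && find . -type f -print | LC_ALL=C sort) > "$tmp/release"
    # comm -23 keeps the lines that appear only in the first list.
    LC_ALL=C comm -23 "$tmp/wc" "$tmp/release" | sed 's|^\./||'
    rm -rf "$tmp"
}
```

Each path it prints is a candidate for svn delete, since it
was part of the old drop but is absent from the new one.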
After replacing the 1.0 code with 1.1 code, svn status will
show files with local modifications as well as, perhaps, some
unversioned or missing files. If we did what we were supposed
to do, the unversioned files are only those new files
introduced in the 1.1 release of libcomplex, and we run
svn add on those to get them under version control. The
missing files are files that were in 1.0 but not in 1.1, and
on those paths we run svn delete. Finally, once our current
working copy contains only the libcomplex 1.1 code, we commit
the changes we made to get it looking that way.
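With many files, picking the unversioned and missing paths out
of the status output by hand gets tedious. The sketch below
filters status output by its first column, where ? marks an
unversioned item and ! marks a missing one. The status_paths
helper and the canned sample (including its file names) are
hypothetical; in a real working copy the input would come from
svn status itself.

```shell
# status_paths CODE: read `svn status` output on stdin and print the paths
# whose first status column equals CODE ('?' = unversioned, '!' = missing).
status_paths() {
    awk -v code="$1" '
        substr($0, 1, 1) == code {
            sub(/^. +/, "")   # drop the status column and the padding after it
            print
        }'
}

# Canned status output for illustration; the file names are hypothetical.
sample_status() {
    cat <<'EOF'
M       src/complex.c
?       src/polar.c
!       src/legacy.c
?       doc/polar.txt
EOF
}

to_add=$(sample_status | status_paths '?')
to_delete=$(sample_status | status_paths '!')
printf 'svn add:\n%s\n' "$to_add"
printf 'svn delete:\n%s\n' "$to_delete"
```

Against a real working copy, each list can be fed to the
matching subcommand one path at a time, for example
svn status | status_paths '?' | while IFS= read -r p; do svn add "$p"; done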
Our current branch now contains the new vendor drop. We tag
the new version (in the same way we previously tagged the
version 1.0 vendor drop), and then merge the differences
between the tag of the previous version and the new current
version into our main development branch.
$ cd working-copies/calc
$ svn merge https://svn.example.com/repos/vendor/libcomplex/1.0 \
https://svn.example.com/repos/vendor/libcomplex/current \
libcomplex
… # resolve all the conflicts between their changes and our changes
$ svn commit -m 'merging libcomplex-1.1 into the main branch'
…
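Before committing a merge like this, it can help to confirm
that no conflicts remain. The sketch below scans status output
for a C in the columns that flag conflicts: column 1 for text,
column 2 for properties, and column 7 for tree conflicts. It
assumes the status column layout of Subversion 1.6 and later.
The conflict_lines helper and the canned sample are
hypothetical; real input would come from svn status run in the
calc working copy.

```shell
# conflict_lines: read `svn status` output on stdin and print the lines that
# flag a conflict: 'C' in column 1 (text), 2 (properties), or 7 (tree).
conflict_lines() {
    awk 'substr($0, 1, 1) == "C" ||
         substr($0, 2, 1) == "C" ||
         substr($0, 7, 1) == "C"'
}

# Canned status output for illustration; the file names are hypothetical.
sample_status() {
    cat <<'EOF'
M       libcomplex/src/complex.c
C       libcomplex/src/parse.c
 C      libcomplex/README
A  +    libcomplex/src/polar.c
EOF
}

conflicts=$(sample_status | conflict_lines)
if [ -n "$conflicts" ]; then
    printf 'unresolved conflicts:\n%s\n' "$conflicts"
else
    echo 'no conflicts reported'
fi
```

An empty result suggests the merge is ready to commit; any
line printed names a path that still needs to be resolved.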
In the trivial use case, the new version of our
third-party tool would look, from a files-and-directories
point of view, just like the previous version. None of the
libcomplex source files would have been deleted, renamed or
moved to different locations—the new version would
contain only textual modifications against the previous one.
In a perfect world, our modifications would apply cleanly to
the new version of the library, with absolutely no
complications or conflicts.
But things aren't always that simple, and in fact it is
quite common for source files to get moved around between
releases of software. This complicates the process of
ensuring that our modifications are still valid for the new
version of code, and can quickly degrade into a situation
where we have to manually recreate our customizations in the
new version. Once Subversion knows about the history of a
given source file—including all its previous
locations—the process of merging in the new version of
the library is pretty simple. But we are responsible for
telling Subversion how the source file layout changed from
vendor drop to vendor drop.