Using CVS branches
I do a lot of crazy programming for a living and I use CVS extensively. It allows me to keep track of various versions, see what others have done and so on. It is a great tool.
Recently, I got it into my head to use “branches”. The idea is this: suppose I want to work on crazy code without breaking everyone else’s code. I create a private (and maybe temporary) version of the code where I can break everything I want to break. Later, presumably, I can merge my changes back into “HEAD” (HEAD being the main “branch” everyone uses by default).
Branches are sparsely documented: you have the semi-official documentation and other pages written by average users.
Here’s what I understand. First, make sure you commit all your current changes (“cvs ci” will do it). Go into the subdirectory that you want to branch off (I assume here you don’t want to branch off the entire source base). Now, create a new branch like this:
cvs tag -b Branchname
You’ve now created a branch, but your current directory “remains” in the main branch. To switch, do:
cvs update -r Branchname
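The commit, branch and switch steps above can be sketched as one small shell function. This is only a sketch under assumptions: the names run, start_branch and DRY_RUN are made up for illustration and are not part of CVS, and the function expects to be called from inside the checked-out directory you want to branch.

```shell
#!/bin/sh
# Sketch: commit pending work, create a branch, and switch to it.
# run, start_branch and DRY_RUN are illustrative names, not CVS features.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

start_branch() {
    branch=$1
    # Commit outstanding changes so the branch point is a known state.
    run cvs commit -m "Checkpoint before branching $branch" || return 1
    # Create the branch tag; the working directory stays on the trunk.
    run cvs tag -b "$branch" || return 1
    # Move the working directory onto the new branch (sets a sticky tag).
    run cvs update -r "$branch" || return 1
}

# Example: preview the commands without touching the repository.
#   DRY_RUN=1 start_branch Branchname
```

Running it with DRY_RUN=1 first is a cheap way to check the commands before they touch the repository.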
Hack as much as you want and commit your changes when you are done (“cvs ci”).
Now, suppose your branch is completed and you want to go back to the trunk. You must first take your current directory out of the branch, like this:
cvs update -A
(Question: would “cvs update -r HEAD” do it?)
Finally, bring the changes you did in the branch back to the trunk (j stands for “join”):
cvs update -j Branchname
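Put together, the trip back to the trunk might look like the sketch below. The helper names (run, merge_branch_to_trunk, DRY_RUN) are hypothetical. The join only changes the working directory, so the sketch stops short of committing and leaves the review and the commit to you.

```shell
#!/bin/sh
# Sketch: leave the branch and join its changes into the trunk.
# run, merge_branch_to_trunk and DRY_RUN are illustrative names.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

merge_branch_to_trunk() {
    branch=$1
    # Clear sticky tags so the working directory follows the trunk again.
    run cvs update -A || return 1
    # Join the branch's changes into the working directory (not committed yet).
    run cvs update -j "$branch" || return 1
    echo "Review the result, resolve any conflicts, then run: cvs commit"
}

# Example: DRY_RUN=1 merge_branch_to_trunk Branchname
```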
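That’s it, apart from a final “cvs ci” to commit the merged result, since the join only changes your working directory. Now, I’ve got no idea if this is safe or if you can go back to the branch later easily.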
(Disclaimer: I’ve got no idea if this is accurate or not, but it worked for me.)
Branches can actually be extremely powerful once you learn to trust them. As you suggest, you can have arbitrarily many concurrent experiments without disrupting the version that has to keep working, and you can merge them all together in the end if you want. CVS is actually amazingly good at properly combining even large-scale changes.
Another thing it’s very handy for is tracking third party sources. In this model, you import the third party source directly onto a vendor branch (and you can have multiple parallel vendor branches if your project incorporates multiple projects). You do your own work on the HEAD branch. When the third party source is updated, you commit it to the vendor branch and then merge the differences into your own branch. Very handy, and usually simple.
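A rough sketch of that vendor-branch cycle follows. The module name thirdparty/libfoo, the vendor tag LIBFOO_DIST and the release tags LIBFOO_1_0 and LIBFOO_1_1 are all hypothetical, as are the helper names. The sketch assumes "cvs import" is run from inside the unpacked third-party source tree, and that the merge is done with a fresh checkout that joins the differences between two release tags.

```shell
#!/bin/sh
# Sketch: track third-party source on a vendor branch.
# All module and tag names used in the examples are hypothetical.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

import_vendor_release() {
    # Run from the top of the unpacked third-party source tree.
    module=$1
    vendor_tag=$2
    release_tag=$3
    run cvs import -m "Import $release_tag" "$module" "$vendor_tag" "$release_tag"
}

merge_vendor_release() {
    # Check out the module and join the changes between two vendor releases.
    module=$1
    old_release=$2
    new_release=$3
    run cvs checkout -j "$old_release" -j "$new_release" "$module"
}

# Example (dry run):
#   DRY_RUN=1 import_vendor_release thirdparty/libfoo LIBFOO_DIST LIBFOO_1_1
#   DRY_RUN=1 merge_vendor_release thirdparty/libfoo LIBFOO_1_0 LIBFOO_1_1
```

After the checkout you would resolve any conflicts and commit, as with any other merge.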
There is a detail regarding the -j option to update that is worth learning. To quote from the CVS manual:
“With two `-j’ options, merge changes from the revision specified with the first `-j’ option to the revision specified with the second `j’ option, into the working directory.
With one `-j’ option, merge changes from the ancestor revision to the revision specified with the `-j’ option, into the working directory. The ancestor revision is the common ancestor of the revision which the working directory is based on, and the revision specified in the `-j’ option.”
So if you use just one -j, you will get what you expect the first time: the differences between the latest version on the branch and the original branch point will be merged into the trunk (HEAD) version. But if you then make more changes on the branch and repeat the procedure, you will again merge all the differences between the latest version of the branch and the branch point, including all the differences that were included in the first merge. If you had conflicts to resolve the first time, you will have to deal with them again. To avoid this, you can use the two -j form to merge just the differences since the last merge.
Here is a nice safe way to do a merge. The steps are accurate, but the commands are from fading memory and may not be exact — in particular, I almost always use ‘cvs rtag’ to tag whole modules rather than ‘cvs tag’ to tag checked out subsets of files, and I don’t know if the ‘-r branchtag’ option is interpreted the same way. Now that I think about it, branching (and later merging) subsets of a module is probably inviting trouble if you ever have to do merges of overlapping subsets. Was there a particular reason for wanting to avoid tagging a whole module? The cost (both space and time) of tagging an entire module is very low. In any case, here’s the procedure:
1. Tag the branch to label the merge point (as ‘merge_1’ for the first merge, ‘merge_2’ for the second, and so on):
cvs tag -r Branchname merge_1
2. Tag the trunk to create a recoverable version in case the merge goes awry:
cvs tag premerge_Branchname_1
3. Update to the HEAD revision:
cvs update -A
4. Merge the changes:
cvs update -j merge_1 (for the first merge)
cvs update -j merge_1 -j merge_2 (and similarly for all subsequent merges)
5. Resolve conflicts as necessary and commit the changes before complicating things with new changes:
cvs commit
6. Tag the trunk one more time so that you can exactly recreate the result of the merge if you have to track down a merge-related problem later on:
cvs tag postmerge_Branchname_1
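The six steps could be wrapped in a script along these lines. It is a sketch, not a tested tool: the function names and the DRY_RUN switch are invented for illustration, and the tag names simply follow the merge_N, premerge and postmerge pattern used above. One deliberate difference from the list: the sketch switches to the trunk before tagging it, on the assumption that the working directory may still be on the branch, because a plain "cvs tag" labels whatever revisions are checked out.

```shell
#!/bin/sh
# Sketch: repeatable branch-to-trunk merge with recovery tags.
# run, join_args, safe_merge, finish_merge and DRY_RUN are illustrative names.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

join_args() {
    # First merge: one -j (branch point to merge_1).
    # Later merges: two -j options (previous merge point to this one).
    n=$1
    if [ "$n" -le 1 ]; then
        echo "-j merge_1"
    else
        echo "-j merge_$((n - 1)) -j merge_$n"
    fi
}

safe_merge() {
    branch=$1
    n=$2
    # Step 1: label the merge point on the branch.
    run cvs tag -r "$branch" "merge_$n" || return 1
    # Step 3, moved earlier: make sure the working directory is on the trunk.
    run cvs update -A || return 1
    # Step 2: recoverable label on the trunk before anything changes.
    run cvs tag "premerge_${branch}_$n" || return 1
    # Step 4: merge; word splitting of the join_args output is intentional.
    run cvs update $(join_args "$n") || return 1
    echo "Resolve conflicts, then call: finish_merge $branch $n"
}

finish_merge() {
    branch=$1
    n=$2
    # Step 5: commit the merged result before making new changes.
    run cvs commit -m "Merge $branch (merge_$n) into trunk" || return 1
    # Step 6: label the trunk so the merge result can be recreated exactly.
    run cvs tag "postmerge_${branch}_$n" || return 1
}

# Example (dry run of the second merge):
#   DRY_RUN=1 safe_merge Branchname 2
```

Splitting the work into safe_merge and finish_merge leaves room to resolve conflicts by hand between the merge and the commit.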
Also note that it’s hard to reconstruct the steps you took from the log file. It’s a good practice to maintain a hand-written log of the exact steps you took (and when) in a file that is kept on the HEAD branch.
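One low-effort way to keep such a log is a tiny helper that appends a timestamped line to a file in the working directory. The file name MERGELOG.txt, the MERGE_LOG variable and the log_step function are assumptions made for this sketch; the file would still have to be added and committed on the HEAD branch like any other file.

```shell
#!/bin/sh
# Sketch: append timestamped notes about merge steps to a plain-text log.
# log_step, MERGE_LOG and MERGELOG.txt are illustrative names.

log_step() {
    logfile=${MERGE_LOG:-MERGELOG.txt}
    # UTC timestamp, two spaces, then the note exactly as given.
    printf '%s  %s\n' "$(date -u '+%Y-%m-%d %H:%M:%SZ')" "$*" >> "$logfile"
}

# Example:
#   log_step "cvs tag -r Branchname merge_1"
#   log_step "cvs update -j merge_1 (two conflicts, resolved by hand)"
```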
Comment by Scott — 7/4/2005 @ 13:39