Sunday, 9 March 2008

Branching and merging strategies in Subversion

I've been using svn privately for several years, and for a bit over a year as part of our core workflow at work. In all that time I had only ever thought about one way of using it.

Turns out there are at least two distinct usage patterns, plus variants. They're even described in the svnbook, though I had overlooked the relevant section. At first I couldn't believe there was any other sensible way to do things, but I'm quickly coming to realise that this is an important learning point, so I thought I'd share my progress so far in case it benefits others.

The way I thought it worked:

In Subversion, assuming you're creating a repository per project, you set up three main folders within your repository: Trunk, Tags and Branches.

  • Trunk is the main run of development. This is the release code.
  • Tags are used to snapshot significant versions of the trunk.
  • Branches are used to separate out any parallel course of development, like a major new feature that's going to be worked on over days, weeks or months.
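
As a rough sketch of that layout, here is a small Python snippet that only builds the svn command line for creating the three folders; it doesn't run anything. The repository URL, the lowercase folder names and the log message are my own assumptions for illustration.

```python
import shlex

# Hypothetical repository URL; substitute your own.
REPO = "http://svn.example.com/myproject"


def layout_command(repo_url):
    """Build the `svn mkdir` command that creates the three top-level folders."""
    folders = [f"{repo_url}/{name}" for name in ("trunk", "tags", "branches")]
    # `svn mkdir` accepts several URLs at once and commits them in one revision.
    return ["svn", "mkdir", *folders, "-m", "Create trunk, tags and branches"]


# Print the command in a form that could be pasted into a shell.
print(shlex.join(layout_command(REPO)))
```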

This was such a foundational understanding for me that when a colleague started talking about a different way to do things, it sounded like crazy talk. More fool me.

The other way, with the same folder structure:

  • Trunk is dev. This is the main course of development and is usually ahead of release code. All checkins are done against the trunk.
  • When code is nearly ready to be released, it's copied to a release branch (e.g. /branches/1.0) for final testing. Development can continue against both the trunk and the branch independently, but only bug fixes get checked into the release branch; feature work continues in the trunk. Changes only get copied between the two at the discretion of the developers.
  • When testing is complete, the release branch is copied to a tag as a reference snapshot, then packaged and released to customers.
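
The release branch steps above might look roughly like the following. Again, this Python sketch only builds the svn command lines rather than running them. The repository URL, the tag name 1.0.0, the revision number and the log messages are all made up for illustration.

```python
import shlex

# Hypothetical repository URL; substitute your own.
REPO = "http://svn.example.com/myproject"


def cut_release_branch(repo, version):
    """Copy the trunk to a release branch, e.g. /branches/1.0."""
    return ["svn", "copy", f"{repo}/trunk", f"{repo}/branches/{version}",
            "-m", f"Create {version} release branch"]


def port_fix_to_trunk(repo, version, revision):
    """Merge one bug-fix revision from the release branch into a trunk
    working copy (run from inside that working copy, then review and commit)."""
    return ["svn", "merge", "-c", str(revision),
            f"{repo}/branches/{version}", "."]


def tag_release(repo, version, tag):
    """Copy the tested release branch to a tag as a reference snapshot."""
    return ["svn", "copy", f"{repo}/branches/{version}", f"{repo}/tags/{tag}",
            "-m", f"Tag release {tag}"]


# Revision 1234 is a placeholder for whichever commit contains the fix.
for cmd in (cut_release_branch(REPO, "1.0"),
            port_fix_to_trunk(REPO, "1.0", 1234),
            tag_release(REPO, "1.0", "1.0.0")):
    print(shlex.join(cmd))
```

Both the branch and the tag are made with svn copy; what distinguishes them is only the convention that nobody commits to a tag afterwards.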

The more I think about it, the more useful this new (to me) way of doing things seems. With the first way, you always have problems if you have multiple concurrent lines of development because they quickly get out of step with each other and the trunk, and merging becomes a real problem.

This is because development continues in the trunk as well as on each of the feature branches, and developers don't find and reconcile any conflicts until potentially much later on. By definition, feature branches are based on out-of-date versions of the codebase and this creates enormous scope for bugs, regressions and general mistakes when merging later.

But the release branch method comes at a cost: the trunk may often be broken. That doesn't matter much, because if you're using either method properly you should always have tagged revisions which describe the current state of your release code. Either way, you can always take that working code and iterate on it for minor features or bug fixes without causing problems in the development version.

One thing all this has revealed is that we're not using svn properly at work at the moment. Another is that doing it properly, either way, requires much better management of, and information about, the state of the project and the repository than we're used to having.

Here's a link to the svnbook chapter which contains the relevant section on Common branching strategies

There's also some useful information over at Collabnet in a post called "Branching strategy questioned"
