Friday 16 October 2009

Merge-Friendly Code

If you are working a large application where you have to simultaneously support hotfixes, service packs and new features for customers running several different versions, while developing new versions of your software, chances are you have some kind of branching structure in your source control system.

And before too long, you will experience the joy of merging features from one branch into another. Here’s my top six tips for writing code that is easy to merge…

1. Little and Often

The first principle is that it is better to make many small, focused check-ins, and merge them early, rather than checking a vast collection of changes in one hit and attempting a gigantic merge after several months of development. Sadly, this is not always possible, as sometimes a major change has to be kept out of a branch for a long time.

2. Get Your Branch Strategy Right

It is worth spending some time making sure you choose the correct branching strategy. A lot could be said about this, but here’s just two things to consider.

Don’t put two major features into one branch unless you are willing to deploy them together. Although you can “cherry-pick” changesets to merge, the reality is that this only works if your changes are completely isolated from one another. In other words if changeset B depends on something that was in changeset A, then you can’t just merge changeset B into another branch, changeset A has to come along for the ride too.

Also, avoid getting into the situation where you need to merge between two branches that are not directly related. In TFS, these are called “baseless merges”, and they can result in deleted code getting re-inserted.

3. Check in Clean Code

Nothing messes up a merge more quickly than someone making widespread “cosmetic” changes – introducing or deleting whitespace, renaming things, moving methods to a different part of the file, etc. Sweeping changes like this have a high probability of conflicting with someone else’s change.

The solution is of course, to reduce the need for this kind of change by making sure that what you check in is formatted correctly, and follows the appropriate coding standards and naming conventions. Tools like StyleCop, Resharper, and FxCop are all able to help here.

4. Single Responsibility Principle (SRP)

Simply put, the Single Responsibility Principle dictates that every class should have one and only one responsibility. If it has two or more, you should extract functionality into additional classes. Similarly every method should perform one and only one task.

Adhering to this straightforward principle results in many classes, each composed of short methods. Very often merge conflicts are due to two people working on the same file or method, but changing it for very different reasons. But if a class or method has only “one reason to change”, then the chances of two developers working on different features needing to simultaneously change it are greatly reduced.

5. Open Closed Principle (OCP)

The Open Closed Principle states that classes should be open for extension but closed for modification. Or to say it another way, it should be possible to add new features and capabilities to your codebase simply by creating new classes, rather than having to mess with the internals of existing classes. And if you use technologies like MEF, it really is possible to add whole new features without touching a single line of your existing codebase.

Obviously, in any large real-world application, there will always be the need to make some changes to legacy code. But this should be the exception rather than the norm. In fact, the only real reasons to change the existing code are to fix bugs, and to make it more extensible.

If commits to source control of new features consist mostly of changes to existing files rather than the creation of new ones, maybe you need to do some research into how you can better design your classes to conform to OCP.

6. Unit Tests

After performing a merge, it is vital that you are able to determine as quickly as possible if you made any mistakes in the conflict resolution process. A unit test suite with good code coverage is invaluable in helping with this task. Obviously this may need to be complemented with further integration testing and manual testing, but the quicker you can identify problems with a merge, the better.

So those are my top recommendations for avoiding merging nightmares. Anyone got any other suggestions?

2 comments:

Matt said...

Really good points.

Two of my colleagues recently ended up at work at 10 o'clock at night trying to complete a merge - unfortunately there was absolutely no need for them to have done what they were doing - merging check-ins out of sequence!

Here are some of my thoughts:
1) Totally agree. We follow a scrum-like process so we have break everything down into tasks that are generally less than two days in length - there could be a number of tasks in a user story but there are probably 4 or so user stories in an iteration (2 weeks) - I personally don't think we've got the balance quite right - smaller user stories would allow us to get into the 5-10 user story per iteration range and allow much more frequent check-ins.

2) For this we have a "king" user story - only this story can check-in until it is "done-done" at which point the next feature becomes king. This seems to work well for small stories and a small team. As we've gone from 4 to 8 people it has become more difficult to share work on features that are not the king (shelvesets in TFS are a pain for sharing code). We haven't solved this problem yet and may have to move to branches for each user story (our user stories tend to be reasonably large).

3) I'm not entirely convinced by this - I think it depends on how long it has been since the merge happened - often it is important to be able to move things around and rename them as time goes by otherwise the code can rot. Also good merge tools help (I use Beyond Compare)

4 & 5) I find it worrying that a lot of software developers don't even seem to think about this stuff - good developers generally end up at these patterns even if they don't know the "official" terminology.

6) I very much agree. Unit testing is SO vital - my colleagues who made such a hash of their merge were saved by the unit tests otherwise there would have been a disaster.

Unknown said...

Hi Matt,
great points. I would like to give beyond compare a try as I have heard very good things about it. On point 3 I should clarify that I do a lot of cleaning up on our main development branch, but on support and hotfix branches we discourage this as we want to keep the scope of those changes very limited.