Tuesday, 2 November 2010

Why you should use DVCS for Personal Projects

For a long time it bugged me that there wasn’t an easy way for me to use version control for my personal projects. I have over 100 small applications sitting around on my computer. Most of them are just test apps that will probably never be visited again. But others are more useful utilities, or perhaps future open source projects currently in incubation. Some of them are work related, others purely for personal enjoyment or learning. Often I found myself wishing that I could have the benefits of source control, so I could back out of changes that broke something.

Pre Distributed Version Control Systems


In the days before DVCS (or, to be more accurate, before I knew about DVCS), at different times, I tried all the following approaches:
  
Store projects on my company’s VCS. This usually involved asking permission to have some space on SourceSafe or TFS for my personal projects. Whilst this has the benefit that I can share my work with others easily (and it is guaranteed to be backed up), there is the hassle of getting it set up in the first place, plus the fact that some of these projects are very short-lived, while others are “skunkworks” ideas which you don’t want to give publicity to until they are ready for it.

Run a private VCS server on my dev machine. It is possible to install a Subversion, SourceSafe or TFS server on your local machine and use it for version control. However, as well as using up valuable resources on your personal machine, this has a very poor migration story. If you rebuild your PC, or need to quickly copy code onto a USB stick and work on it from home, this option ends up being more hassle than it is worth.

Subversion file-based repository. A few years ago I discovered that you could point Subversion at a repository on a file:// path. I thought this would be the answer to all my issues. The reality was that it was quite fragile, especially since I was storing code on USB sticks, so the drive letter might change. I ended up corrupting my repositories so regularly that I gave up on this option pretty quickly.

Make it open source. When CodePlex showed up, I immediately moved several of my projects there. This meant I had access to a free central repository, enabling me to work from different computers if I needed to. The downside is that most of my projects weren’t appropriate for making into open source applications.

Don’t bother with version control and make backup zip files. This ended up being my most common approach. Every now and then I would back up to a zip file. Of course, those backups were few and far between and didn’t even exist for most projects. And I almost never had a backup available on those few occasions when I genuinely needed one.

Advantages of DVCS


But all that changed when I decided to find out what all the fuss was about with Distributed Version Control Systems. Whilst the idea seemed a little crazy to me at first (everyone gets a copy of the entire repository?), the obvious advantages for personal projects won me over pretty quickly. I decided to try out Mercurial, since it seemed to have slightly better support on Windows, although I’m sure Git is just as good. Here are some of the top advantages of using it on your personal projects:

No Server Required. This is a huge benefit. I don’t need a central server on the internet, or on my company network. If I want to move to another PC, I can just copy the code folder over (or sync folders using something like Dropbox) and it just works.

Version Control Everything. It’s now a no-brainer to put a new test project under version control. It takes only a few seconds to do. If for some reason you decide you don’t want version control anymore, just delete the .hg folder and it’s gone.

Migrate to a Central Repository Later. When I added NAudio to CodePlex, it had already been in development for several years. However, I had no checkin history up to that point, just a bunch of backup zip files. With Mercurial, you can move to using a centralised repository at any point in the future (whether public or private) and all your checkin history comes along for the ride.

Unconstrained branching strategy. Admittedly, for small personal projects, branching is not often that important. But it can come in handy when you are halfway through implementing one feature, and then want to work on a different task. Without version control, you have to decide whether to bin the half-finished changes, or to copy them somewhere else and manually merge them back in later. With Mercurial, it becomes trivial to create as many branches as you need. And you can merge directly between any two branches, irrespective of how many intermediate branches were created between them.

Merging divergent copies. Sometimes I have a copy of a personal project at home and at work and have no idea which one is the latest and greatest. Or maybe after backing it up to a few places, I have inadvertently made changes to two separate copies. One membership application I wrote for a youth group 8 years ago turns out to be still in use and they asked me for new features recently. I had to work out which of several copies was the one I should be using. With Mercurial, it is trivial to ensure that one copy is not missing any changes in another.

Little and often checkins. One really nice feature for my open source projects is that I can check in little changes without needing to immediately push them to the central server. This means I can check in little and often, and only do a push to the server once I have tested and made sure my feature is robust.

What was I doing? Another advantage is that by using a DVCS with my personal projects, if I need to come back to one after a couple of years I can quickly examine the log, looking at diffs to see what I was up to last time I worked on it. This can be handy if I had left it in a state where there were some half-finished new features in progress, and actually I want to discard them and resume from an earlier point.

Trivial rollback. One of the things that scared me about DVCS was the idea that once I had checked in a file, it lives on in the repository forever. So if you accidentally check in several megabytes of compiled binaries, you have an unnecessarily bloated repository. But the reality is that this issue only exists if you push that to a central repository (and even then there are usually ways of working round it). If you haven’t pushed, you can just clone up to the prior revision and your mistake is gone. I’ll perhaps do a post later on my thoughts about DVCS in the enterprise, where matters like this are quite important.

If you take anything away from this post, let it be this: learn a DVCS and use it wherever you can. You will be glad you did.

4 comments:

blorq said...

Just as soon as the tooling crawls out of the dark ages...

Mark H said...

@blorq - have you tried TortoiseHg? I think the tooling is fine. And to be honest, once you get the hang of using the command line, it quickly becomes very natural.

There's also VisualHg for Visual Studio integration if that's important to you.

yanjost said...

and bitbucket.org offers free hosting for mercurial projects, with unlimited users !

Mark H said...

@yanjost - yes, bitbucket is brilliant. now don't need to worry about whether my personal projects can be open sourced or not