Subversion vs Distributed Source Control

Jeff Atwood over at Coding Horror had a post this weekend espousing the wonders of Subversion (SVN) on Windows. Atwood has long argued that developers should be using some form of source control (anything except Visual SourceSafe), a sentiment that I completely agree with.

However, I find myself disagreeing with Atwood on certain points. First off, I've actually met a developer who was happier with SourceSafe than with Microsoft's new Team Foundation Server. Of course, this was largely because of some bizarro practices at the company he worked for: they had a multitude of database stored procedures (some of which handled things like sending e-mails), and SourceSafe allowed them to share a small selection of these stored procedures between repositories. From what I gathered, they had one repository that contained all the stored procedures, plus individual project repositories. SourceSafe let them map a selection of stored procedures into each project repository in such a way that a project only saw the procedures relevant to it, but any change made to a shared procedure propagated to every project that used it. Team Foundation Server doesn't allow this. Neither would Subversion or Perforce, for that matter: in both of those systems (I don't know enough about Team Foundation Server to comment), there seems to be no easy way to map files like that, as both would create a discrete copy of the procedure file, which loses the auto-update capability.

Incidentally, PlasticSCM would support this sort of behavior through the use of custom selectors, though this would take some work to set up. Regrettably, I don't know of any free source control systems capable of this, though there are several projects that I simply have no experience with. However, I'm willing to bet that if you need this level of interplay between repositories, there was likely a major design flaw at some point. git's "super projects" support might fulfill this need, but I've not used it.
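For the curious, here is a rough sketch of how git's "super project" support (submodules) might approximate the shared stored-procedure setup described above. I haven't used this in anger, and the repository and file names here are purely hypothetical:

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)
cd "$work"

# A bare repository holding every shared stored procedure (hypothetical).
git init --bare procs.git
git clone procs.git procs-work
cd procs-work
echo "CREATE PROCEDURE send_mail ..." > send_mail.sql
git add send_mail.sql
git commit -m "add shared send_mail procedure"
git push origin HEAD
cd ..

# An individual project repository pulls the shared procs in as a submodule.
git init project
cd project
git commit --allow-empty -m "initial commit"
git -c protocol.file.allow=always submodule add "$work/procs.git" procs

# The project now sees the shared file; later changes to the shared
# repository are picked up with `git submodule update --remote`.
ls procs/send_mail.sql
```

The catch, compared to the SourceSafe behavior described above, is that a submodule maps a whole repository, not a hand-picked subset of files, and updates are pulled explicitly rather than appearing automatically.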

Linus Torvalds, creator of git and, of course, the Linux kernel, gave a talk at Google a little while back. Regrettably, git does not yet run effectively on Windows, due to some file-system weirdness between Unix and Windows. Git came about after BitKeeper's makers pulled their free licensing plan because of an accusation that Andrew Tridgell had reverse engineered BitKeeper. Torvalds harbors a deep hatred of CVS, and by extension SVN. His talk is amusing, and it is clearly colored by his feelings. While I may not agree with him completely, I do agree with the general sentiment that distributed source control is a necessity.

Atwood makes the claim that since source control has only recently gone mainstream, the idea that most developers would even consider distributed source control is ridiculous. Frankly, I think once you get past the idea that source control is hard, distributed source control is a very easy step, and it's incredibly useful.

With Plastic 2.0, it appears that Codice Software has taken large strides toward allowing distributed development: each developer can have their own Plastic server, which they control entirely, and then simply create change packages to send to whoever needs them. But Plastic also supports centralized development, allowing a shop to do both.

Should you rush out and buy Plastic? Maybe not. For my development at home, I favor Git without question. At work, I favored Plastic because we do Windows-based development, and centralized source control is more convenient for backup purposes as well as for keeping tabs on who is committing what. This is the real reason why I believe distributed systems will have trouble catching on: by their very nature, true distributed systems make it harder to track statistical information about who is committing what, how often, and how much. Project managers live for this sort of statistical information (though this is changing as Agile development and Scrum catch on). In a distributed system, the only thing to measure a contributor on is the final product of their contribution.

Distributed development just makes sense, though, when you look at the new world of Agile development. Everyone works in their own sandbox, able to check in and branch at will. Developers are free to share their in-progress changes (if necessary); otherwise, integration issues only arise when you reach an integration step, which is the only place you want merge conflicts and issues to occur. Everywhere else, development needs to be fluid. Plastic does a great job of bridging the gap between centralized and distributed development. It's not a 'perfect' distributed system, and it's missing a few features I really want (mainly an API and a 'triggers' system), but it allows for heavy branching and distributed repository servers.
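The sandbox-and-integrate workflow above can be sketched with git in a few commands. This is just an illustration of the idea, not a prescription; the branch and file names are made up:

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
cd "$(mktemp -d)"

git init sandbox
cd sandbox
echo "baseline" > app.txt
git add app.txt
git commit -m "baseline"
trunk=$(git symbolic-ref --short HEAD)   # whatever the default branch is

# Private sandbox: commit half-finished work as often as you like.
git checkout -b feature/shared-procs
echo "in-progress change" >> app.txt
git commit -am "wip: nobody else sees this yet"

# Conflicts, if any, surface only here, at the integration step.
git checkout "$trunk"
git merge --no-ff feature/shared-procs -m "integrate feature"
git log --oneline
```

Until that final merge, nothing the developer does on the feature branch can break anyone else's build, which is exactly the fluidity argued for above.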

Ultimately, any source control is better than none. However, a distributed model has some really powerful benefits. Consider something like Plastic with its distributed support, git if you're on Unix, or even Mercurial (which I've never used). Depending on a central source repository creates a single point of failure, as well as a single point of attack. If your central repository is compromised, you'll have to restore from backup (and depending on your backup scheme, you could lose data). With a distributed development system, everyone has the shared development history of the entire project, so you can grab that history from any other developer, losing only the non-shared changes you had been working on. Permissions aren't an issue, since everyone works on their own local system, and all anyone has to submit are deltas to whoever requires them. To anyone who has begun using source control, the benefits of a system like this should be obvious.
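To make the recovery argument concrete, here is a small git demonstration of surviving the loss of a central server by fetching from a teammate's clone. All the repository names and paths are hypothetical:

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)
cd "$work"

# A "central" repository and two developers' full clones.
git init central
(cd central && echo "v1" > file.txt && git add file.txt && git commit -m "v1")
git clone central alice
git clone central bob

# Disaster: the central server is gone.
rm -rf central

# Alice simply points her remote at Bob's clone instead; the entire
# shared history survives, because every clone carries all of it.
cd alice
git remote set-url origin "$work/bob"
git fetch origin
git log --oneline origin/HEAD
```

The only work lost in this scenario is whatever each developer had not yet shared with anyone, which is precisely the claim made above.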