On Source Control

Posted on November 21st, 2007

A frighteningly large number of developers don’t use any sort of Source Control for their projects. I know I didn’t throughout most of my collegiate career. This is a shame, really, because oftentimes we end up spending hours trying to ‘undevelop’ a path that didn’t work out, when with good source control we’d never have to waste that sort of time again. I recently sat through an interview with a gentleman who, in 20+ years in the software game, had used Source Control at only one of the companies he’d worked with as a contract programmer. And these were big companies, like Disney and Lockheed Martin. To be fair to those companies, he was working on older Mainframe applications, but even he acknowledged that, based on his brief experience with Source Control, things would have been easier had it existed at those companies.

It doesn’t matter how large a team you have, from a single developer to a team of hundreds: Source Control will simply make you more efficient. It provides a full history of the application from the time of the import, it provides the ability to back out ‘bad’ changes from source files, and it protects users from overwriting another developer’s changes. At my current job, we lack Source Control, and it has happened many times that two people began working on the same file and only noticed when an editor started complaining about the file being modified outside of itself. To a degree, good communication can fix this problem, but it’s just not cost-effective to have to verify that a file isn’t in use each time you go to use it.

When it comes to Source Control, I see two primary ways of thinking. The first is the File-centric method. This was the original method, and it is still embodied in tools like Subversion and Perforce. These systems concern themselves primarily with tracking the history of each file in the repository as it has changed over time. They can be configured either to let a user lock a file, so no one else can modify it, or simply to allow multiple users to edit a file and race for the check-in. Depending on the software, there will be different levels of maturity in the tools used to resolve conflicts when a file has been modified both locally and in the repository between check-out and check-in. These systems tend not to have very mature branching models, as any branch created is more a copy of the trunk than an extension of it; even when the changes are merged from the branch to the trunk, you lose much of the history of the branch. In many respects, this style represents the old guard, the old way of thinking, and these tools have typically begun to integrate a great deal of the features and ideas from the next methodology.
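To make the difference between locking a file and racing for the check-in a little more concrete, here is a minimal sketch in Python. It is a toy model, not how Subversion or Perforce actually work internally; the `FileHistory` class, its methods, and the user names are all hypothetical and exist only for illustration.

```python
class CheckInConflict(Exception):
    """Raised when a check-in is based on an out-of-date revision."""


class FileLocked(Exception):
    """Raised when someone else holds the lock on the file."""


class FileHistory:
    """Toy per-file history (hypothetical; not any real tool's API)."""

    def __init__(self, initial_content):
        self.revisions = [initial_content]  # list index = revision number
        self.locked_by = None

    def check_out(self, user, lock=False):
        """Return (revision number, content); optionally take the lock."""
        if lock:
            if self.locked_by not in (None, user):
                raise FileLocked(f"locked by {self.locked_by}")
            self.locked_by = user
        head = len(self.revisions) - 1
        return head, self.revisions[head]

    def check_in(self, user, base_revision, content):
        """Append a new revision if the caller started from the latest one."""
        if self.locked_by not in (None, user):
            raise FileLocked(f"locked by {self.locked_by}")
        if base_revision != len(self.revisions) - 1:
            raise CheckInConflict("file changed since check-out; merge first")
        self.revisions.append(content)
        if self.locked_by == user:
            self.locked_by = None  # checking in releases the lock
        return len(self.revisions) - 1


def demo_race():
    """Two users edit without locks; the second check-in loses the race."""
    history = FileHistory("v0")
    alice_base, _ = history.check_out("alice")
    bob_base, _ = history.check_out("bob")
    history.check_in("alice", alice_base, "alice's edit")
    try:
        history.check_in("bob", bob_base, "bob's edit")
    except CheckInConflict:
        return "conflict"
    return "ok"
```

In this sketch, whoever checks in second has to merge before trying again, which is where the maturity of a tool's conflict resolution starts to matter. With `lock=True` the second user is turned away up front instead.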

The second is the Tree-centric method. Tree-centric programs don’t treat the files as any more or less important than the other aspect of the tree, the directory structure. These systems typically have a much easier time moving files within the tree. PlasticSCM and git are the two systems matching this way of thinking that I’m most familiar with. In addition to handling file movement better, these systems tend to have much better branching mechanisms, and better support the “Branch-Per-Task” methodology of development, where you create a new branch for every task (either a new feature or a bug) that you encounter. By keeping all the development in various branches, developers get full source control while they edit files, yet ensure that trunk always remains buildable and runnable, which can be very, very important in production environments. By branching heavily, developers can ensure that their modifications are working without having to worry about bizarre interplays between the code they’re modifying and the code someone else is modifying, until both pieces of code are deemed ready and need to be integrated.

More branching does mean more merging, but git and Plastic both have excellent merge-tracking capabilities, largely because in their tree-centric model they track versions instead of files. A version of a file or directory can have multiple parents, offering the full history of a file back through every branch that has ever modified it, up to the current version. It is a very powerful tool that makes it easier to track who has done what and when.
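The idea of a version with multiple parents can be sketched in a few lines of Python. This is only an illustration of the concept, not git’s or Plastic’s real storage format; the `Version` class, the helper functions, and the version and author names are hypothetical.

```python
class Version:
    """One node in the version graph (hypothetical, for illustration only)."""

    def __init__(self, name, author, parents=()):
        self.name = name
        self.author = author
        self.parents = tuple(parents)  # a merge has more than one parent


def full_history(version):
    """Return every version reachable from `version`, each listed once."""
    seen = []
    stack = [version]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already reached through another parent
        seen.append(current)
        stack.extend(current.parents)
    return seen


def authors_in_history(version):
    """Everyone who contributed to any ancestor of `version`."""
    return {v.author for v in full_history(version)}


def build_example():
    """A task branch and a trunk edit that both start from `base`, then merge."""
    base = Version("base", "alice")
    feature = Version("task-feature", "bob", parents=[base])
    trunk_edit = Version("trunk-edit", "alice", parents=[base])
    return Version("merge", "alice", parents=[trunk_edit, feature])
```

Because the merged version keeps both parents, walking back from it reaches the work done on the task branch as well as the work done on trunk, so the branch’s history is not lost when it is merged.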

I think it’s obvious that I’m a big fan of Plastic and git. I have more experience with git, and will continue to love it for Open Source development on distributed projects. Plastic is my choice for team development, however. Sure, it’s expensive (but still cheaper than Perforce), but it’s a great program and it offers really solid tools for maintaining code. There are plenty of other options out there too that I’m simply not as familiar with. My current team is looking at Microsoft’s Team Foundation Server as an alternative, which my manager likes because it offers a lot of built-in reporting support that doesn’t exist elsewhere. I’ll be posting more impressions of Perforce, Plastic, and TFS as I become more exposed to those options.

Every team needs to research and decide on the tool that will work best for them. I think Plastic is a great choice, and I’m pulling for it with my team, but in the end, implementing any source control mechanism will be an enormous benefit to our team. The question is “Which Source Control should I use?”, not “Should we use Source Control?”