
Inline SCM: A Proposal For A Better SVN or Git

Recently, my team had to shift focus in development slightly. Our changes required us to go back to the latest QA-verified build in our source control repository (in our case, Subversion). We started working off of this version to ensure that any new work branched off from a verified source. There are some problems with this, of course. Source control programs are great at tagging, branching, and merging code. Where they are weak is in reverting and rolling back changes.

An Example

Joe at Apple checks in r1000 to his team’s SVN repository. This version marks the release candidate of, say, Photo Booth 1.0. QA tests this version and confirms it works, and Apple ships r1000. Gearing up for the next release, Joe’s team starts committing to the repository. Joe checks in r1500, which is in very good condition. Over the last 500 check-ins, Joe’s team has added a bunch of cool new features and fixed a bunch of pesky bugs that existed in r1000 (the last public release). The only thing wrong with r1500 is that a few of the new features are not fully implemented, and some things that worked in r1000 don’t work in r1500.

Now, Steve Jobs rolls into Joe’s office with a great idea for a quick new feature to add to Photo Booth, and he wants that feature done and released to the public in one week. The feature itself won’t take long at all, but for Joe’s team to fix all the things wrong with r1500, they would need much longer than a week. So instead, Joe reverts to r1000, which he knows works pretty well. The team branches off of r1000 and creates the new feature Steve wanted. They are now ready for release.

But… old bugs that were fixed between r1000 and r1500 are now reappearing. Since all the work that took place between r1000 and r1500 (new features and bug fixes) was dropped to get this new release out, many problems that had been fixed come back.

Now begins the painful process of finding the commits at which certain bugs were fixed and then trying to merge all of them back. This is painful and messy, and it needs to be reworked.

A New Way

I suggest a new way to handle source control management. SCM has always done a great job of tracking changes among a large set of files. What it does less well is track changes within the same file. SVN will let you diff two revisions of a file, but we cannot comment on what specifically changed at the code level, nor can we even comment on what happened at the file level, only at the commit level.

Consider the following screenshot:

[Screenshot: svn]

Imagine if Xcode or any IDE of your choice handled your SCM. Any time you chose to do a commit, you would be presented with a global list of all files that had modifications (i.e., svn stat). You would then go through these modified files and be presented with a diff of the previous revision and the current revision (i.e., FileMerge). Instead of just confirming that you are changing what you think you are changing, you take a little extra time for each commit. In place of writing one commit message for the entire commit with its many files, you can write a separate message for each particular block of code that changed. This could become tedious, but there could be easy ways to select large portions of a file and mark them all as the same Change.

For example, say we are cleaning up a large method. We might have many different changes here and there every few lines, but all of these little changes group together to make one large Change: that of cleaning up this method. So instead of commenting on each one of these minor changes, we would just comment on the larger Change and say “we cleaned up this method.” Just having this sort of inlining could greatly help when trying to track down when and why some piece of code changed somewhere in the history of the repository. As of right now, if you are trying to figure out what changed in a particular file, the only information you have is the commit message, which is usually only a line or two and covers the entire commit, not a specific part of it.

This is fine and semi-useful, but I think the next feature of Inline SCM would be incredibly useful. You might have noticed in the picture above the “Mark as” check boxes. Each Change (Change with a capital C does not necessarily mean a single change in the diff but a collective group of minor changes) allows you to specify some sort of category. I’ve listed two possible categories: Bug Fix and New Feature. Every time you add some new code or rework some old code, you can tag it as a Bug Fix or a New Feature. The repository would then have all this extra data and would know exactly how every line changed between one revision and another.
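To make the idea of a tagged Change concrete, here is a minimal sketch in Python of the record such a repository might store. Nothing here comes from Subversion or Git: the `Change` and `Category` names, the fields, and the sample revision, path, and message are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The two example categories from the "Mark as" check boxes."""
    BUG_FIX = "Bug Fix"
    NEW_FEATURE = "New Feature"


@dataclass(frozen=True)
class Change:
    """One capital-C Change: a group of minor edits sharing one comment."""
    revision: int        # commit the Change belongs to
    path: str            # file the Change touches
    line_ranges: tuple   # inclusive (start, end) pairs in the new revision
    message: str         # Change-level comment
    category: Category   # tag chosen at commit time


# Hypothetical sample data: several small edits grouped into one Change.
fix = Change(
    revision=1204,
    path="PhotoBooth/Camera.m",
    line_ranges=((40, 52), (60, 61)),
    message="Guard against a missing camera",
    category=Category.BUG_FIX,
)
print(f"r{fix.revision} {fix.path}: [{fix.category.value}] {fix.message}")
```

Storing several line ranges on one record is what lets many scattered edits count as a single Change with a single comment and a single tag.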

Returning to Joe

Now imagine our story with Joe. If Joe and his team were using Inline SCM, the repository would know which of their Changes fixed bugs and which added new features. When Steve asks Joe to pump out a new version, Joe could roll back to r1000, but instead of just throwing away all that hard work, he could keep all the bug fixes between r1000 and r1500 and toss the new features.

Joe now would have a working version of code he knows passed QA along with all the bug fixes they found since they last released.
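Joe's selection step could be sketched roughly as follows, assuming the repository can hand back its Changes as simple records. The `Change` record, the `changes_to_keep` function, and the sample history are hypothetical; this only picks out which Changes to keep and does not apply them to a working copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    revision: int
    category: str   # "Bug Fix" or "New Feature"
    message: str


def changes_to_keep(history, base, head, category="Bug Fix"):
    """Return Changes after `base`, up to and including `head`,
    that carry the given category, oldest first."""
    picked = [
        c for c in history
        if base < c.revision <= head and c.category == category
    ]
    return sorted(picked, key=lambda c: c.revision)


# Hypothetical history; revision numbers and messages are made up.
history = [
    Change(1000, "New Feature", "release candidate"),
    Change(1120, "Bug Fix", "hypothetical fix A"),
    Change(1300, "New Feature", "hypothetical half-finished feature"),
    Change(1450, "Bug Fix", "hypothetical fix B"),
    Change(1600, "Bug Fix", "landed after r1500"),
]

for change in changes_to_keep(history, base=1000, head=1500):
    print(f"r{change.revision}: {change.message}")
```

Selecting the Changes is the easy half. A bug fix that touches code introduced by a dropped feature would still need manual attention when it is reapplied on top of r1000.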

Other Opportunities

This only scratches the surface of what Inline SCM could accomplish. There could be many more tags than just Bug Fix and New Feature to add relevance and metadata to each Change within a commit. You could even link Changes with other Changes to communicate interdependencies and other important information.

The other clear opportunity here is for easy scanning of history. Commit messages are broad and global. Change-level comments would be focused and relevant. This could greatly help developers when trying to figure out what was going on in between revisions.

Subversion and Git allow great branching and tagging of entire commits, but they lack the power that Inline SCM could add. Inline SCM would tag and add information at a level much closer to the source code than a single commit message could ever hope to achieve.

If you can think of any other useful applications of Inline SCM, I’d love to hear more in the comments. I wrote this article to facilitate discussion, so please add your opinion.

Finally, I wanted to note that this type of inline commenting would not even require a new UI. You could make this work with TextEdit. For instance, you could place your Inline SCM messages in comments like below. Then, the SCM application would analyze your comments and interpret them properly.

[Screenshot: snv2]
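As a rough sketch of how an SCM application might interpret such comments, the Python below scans source text for begin and end markers. The marker syntax (`// @scm begin [Category] message` and `// @scm end`) is invented here for illustration and is not necessarily what the screenshot shows; the function name and the sample source are likewise hypothetical.

```python
import re

# Hypothetical marker syntax, assumed for this sketch only.
BEGIN = re.compile(
    r"^\s*//\s*@scm\s+begin\s+\[(?P<category>[^\]]+)\]\s*(?P<message>.*)$"
)
END = re.compile(r"^\s*//\s*@scm\s+end\s*$")


def parse_inline_changes(source):
    """Return one dict per annotated block, with 1-based inclusive
    line numbers for the code between the two markers."""
    changes = []
    current = None
    for number, line in enumerate(source.splitlines(), start=1):
        begin = BEGIN.match(line)
        if begin:
            current = {
                "category": begin.group("category").strip(),
                "message": begin.group("message").strip(),
                "start": number + 1,
            }
        elif END.match(line) and current is not None:
            current["end"] = number - 1
            changes.append(current)
            current = None
    return changes


sample = """\
int width = 640;
// @scm begin [Bug Fix] Guard against a missing camera
if (camera == NULL) {
    return;
}
// @scm end
int height = 480;
"""

for change in parse_inline_changes(sample):
    print(change)
```

A real tool would also have to decide what to do with unterminated or nested markers, and whether to strip the markers from the file once the commit has recorded them.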

Perceived Problems

The major deterrent to this whole approach would be the time investment required of developers. I know many developers (I might succumb to it too once in a while) who don’t even bother writing descriptive commit messages. Getting these developers to write comments for code-level Changes might be difficult. Inline SCM would have to provide some benefit that far outweighs the cost of having to tag everything. For some companies, rolling back might be critical. For others, perhaps not. That’s why I hope to get feedback on other good reasons to use it.

Conclusion

This idea came to me today because of our rollback to an older revision, and it struck me as an incredibly useful way to solve the very problem of rollbacks. Obviously, there are tons of other things that could make use of Inline SCM, and there are, of course, many negatives to it as well. Please join the discussion and add your opinions or ideas.

  • Jason Skicewicz
    This is a very interesting idea, and is very similar to an idea I had at one point that had to do with carving up the useful parts of the web, and attaching notes to blocks of web content. It’s similar in that it is carving up the web at a finer granularity than bookmarks, which is what inline SCM is doing for source control. I think the usefulness would far outweigh how tedious it may become, and would encourage smaller commits, which is useful in its own right. I personally check the diff on every file before I commit code to the repository to 1) come up with a descriptive commit message, and 2) make sure I did not leave any debug code in what I am about to commit. So this whole process of looking at every block of changed code would work extremely well in my workflow, since I do it anyway.

    I thought I would point out that your example is not exactly valid, although it is the way that many teams work. In a team that is following good process, all new features are developed in an entirely different branch, and are pulled in to your release branch as they are finalized / wanted / requested by biz. This way, the main branch, which is usually the trunk, only contains bug fixes and any approved features that were pulled in (nothing is developed in the main branch). I think what happened to the project that you are referencing, is that new features were developed in the main release branch. This should never be the case. The reason this happened, is because it was felt that these features would never be stripped out of the product, and would be present in the next release. A very bad assumption. This idea is very interesting because no teams (or very few) follow good version control practice. I mean, just take a look at how useful commit messages are on most projects.

    In any case, this is a very good idea. One that I would love to help you build out =). You could easily build this on top of subversion or git without rewriting anything by essentially creating a markup language in the commit message itself. Sort of like the way InterTrac does it.
  • Guest
    Interactive rebasing in git solves your problem.