Subversion or Git

There’s been a lot of ambient (and not so ambient) noise about Git in the geeky corners where I often linger. There seems to be some sort of mass exodus underway as folks indicate that they have moved, or are planning to move, away from Subversion for their source control needs, and I have to admit that I’m feeling a little ignorant.

I haven’t used Git (yet?), but I’ve been doing some reading and I still don’t quite understand the infatuation. Don’t get me wrong, it looks like there are some nice features, but at what cost?

The key difference between the two systems seems to be the model itself. Where Subversion offers a centralized model, Git provides a decentralized model. At the risk of oversimplification, this means that Git offers each developer their very own, fully autonomous copy of the entire repository. With Subversion, each developer has their own working copy, but commits changes to a single, central repository.

The arguments in favor of the decentralized approach, as I understand them, include:

Speed

Okay, I like speed. Since each developer has a local copy of the entire repository, fundamental actions like diff, commit, etc. are all local. That’s always going to be faster. Fast is good.
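To make the "everything is local" point concrete, here is a minimal sketch, assuming Git is installed. The temporary directory, file name, and identity are made up for illustration. No remote is ever configured, so none of these commands can touch a server.

```shell
#!/bin/sh
# Sketch: every command below runs against the local repository only.
# The directory, file name, and identity are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"

git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"      # recorded in the local .git directory

echo "second draft" > notes.txt
git diff --stat                   # compares the working file to the last local commit
git log --oneline                 # reads history from the local .git directory

# No remote was configured, so nothing above could have contacted a server.
remotes=$(git remote)
commits=$(git rev-list --count HEAD)
```

The equivalent Subversion commit and log commands would each need a round trip to the central repository.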

That said, server interaction with Subversion isn’t exactly excessive. Developers work in local working directories and only interact with the server in short bursts. I’ve never found myself thinking, “Great Scott, this is unbearably slow.” For me, at least, this seems like a fix for something that isn’t broken.

Smaller Space Requirement

A Subversion working directory contains two copies of the entire code base: one that is actually being worked on and another tucked away in the .svn directory. I remember reading somewhere that the Mozilla repository required ~12GB of storage in Subversion and ~420MB in Git.

This is a substantial difference, to be sure, but there’s a phrase that everyone in technology hears on a daily basis: storage is cheap. That can’t be a compelling argument for one problem and not for another, so I don’t think this difference is one that’s going to precipitate a switch all by itself.

Line Ending Conversion

This is an important factor for mixed-environment teams. Evidently early versions of Git made no changes and more recent versions handle the conversion automatically. With Subversion the decision is manual, but it’s spectacularly easy: make sure enable-auto-props = yes is set in the [miscellany] section of ~/.subversion/config, then add a line to the [auto-props] section for each affected file extension to specify the desired line ending:

[auto-props]
*.sh = svn:eol-style=LF
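
For comparison, a hedged sketch of the Git side. The automatic handling appears to be governed by the core.autocrlf setting. Git releases recent enough to support the text and eol attributes also accept per-pattern rules in a .gitattributes file. The *.sh pattern below simply mirrors the Subversion example and is an assumption about what a team would want:

```
# .gitattributes (committed at the repository root)
# Treat shell scripts as text and always check them out with LF endings.
*.sh text eol=lf
```

Because this file lives in the repository itself, the rule travels with the project instead of depending on each developer's personal configuration.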

Other Concerns

In addition to the fact that none of the arguments presented in favor of Git are particularly compelling to me, at least not the way I understand them, I have a few other concerns about the decentralized model. Actually, they probably all roll up to a single concern – accountability.

Every project I’ve worked on requires some level of immediate or semi-immediate accountability: a project manager who needs to ensure that an appropriate amount of progress has been made against the project plan, a boss who needs to spot-review code, or maybe a continuous integration process that needs regular repository updates so that it’s not just rebuilding the same old thing.

It seems like the decentralization of the repository would hamper those efforts. Sure, it can be done, but it would have to be done with process.
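
One way that process might look, as a rough sketch rather than a recommendation: the team agrees that one bare repository is the shared one, and work only counts once it has been pushed there. Everything below runs on the local filesystem, assuming Git is installed. The paths and identity are hypothetical.

```shell
#!/bin/sh
# Sketch: a shared "central" repository layered on top of Git by convention.
# All paths and names are hypothetical.
set -e

work=$(mktemp -d)
shared="$work/shared.git"

# The agreed-upon shared repository (bare: history only, no working files).
git init -q --bare "$shared"

# A developer clones it, commits locally, then publishes with an explicit push.
git clone -q "$shared" "$work/dev"
cd "$work/dev"
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"    # visible only in this clone so far

git push -q origin HEAD           # now visible in the shared repository

# Count the commits that a reviewer or build process polling $shared would see.
published=$(git --git-dir="$shared" rev-list --count --all)
```

Until the push, the commit exists only in the developer's clone, which is exactly the accountability gap that the process has to close.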

I also wonder if, in a team environment, having too many people doing too much of their own thing for intervals that stretch too long would make effective merges all but impossible. It seems like it would.

Am I missing something? Are there additional benefits? Is there anything else I haven’t even mentioned that I should be considering? I’d love to hear from other source control users who have moved to Git from Subversion.

18 Comments on Subversion or Git

  1. Bill Mill said...

    Well, I don’t currently use Git or Mercurial, though I just opened up my first mercurial repo and plan to learn how to use it.

    The reasons for it, in my case, are threefold:

    1. Reasonable branching and merging. Branching in svn is slow, but the real killer of it is that svn doesn’t remember when you branched. This means that, when you go to merge, you have to manually go back and recover the version at which you last merged, create a diff, and patch it back into the main branch. And god help you if you needed to merge just a few small changes from trunk back into a branch, and then a branch back into trunk. Who knows what was in there and what wasn’t? I’ve had to check, manually, thousand-line diffs and more or less click and pray. It’s a bad scene, and hacks like eazysvn are just evidence that it needs to be fixed.
    2. Offline use. I was offline for a few days there, and not being able to check in was extremely frustrating. It also led to a huge checkin once I got back online, which defeats the purpose of version control in the first place. This is compounded by:
    3. Committing chunks. Svn always assumes that you want to check in all of your changes to a file. Git/Hg, on the other hand, will (if you want) ask you about each change and whether you want to commit it, and make separate patches for each chunk that you want to commit.
    4. Svn is slow-moving. It takes forever to get changes into the main builds of svn; I found and hit a bug with locking that had existed with a fix for over a year that just hadn’t been put in yet.

    All of these are leading me to try mercurial. As for accountability, in my opinion this is already solved by process and using decentralized source control doesn’t affect it much. Having a developer who only checks in to a local repo is exactly the same as having a developer who never checks in his code.

  2. Bill Mill said...

    I got fired up there and remembered a fourth reason while I was writing; that should be “the reasons are four-fold”

  3. Rob Wilkerson said...

    Fair enough. You make some good points or, more accurately, put some personal experience behind some that I had already seen, and what you say makes sense. I wasn’t aware of the branching pain you’ve had. I work primarily on the trunk and only branch for:

    1. Developing large, destabilizing features, and
    2. Maintenance of current releases

    As a result, my merging needs are minimal. If they became more sophisticated, then I might have to look elsewhere.

    I do like the idea of offline use (even though I feel like my entire life is spent online), but I find it hard to wrap my mind around how it could work in a team environment. Maybe installing a process of daily merges with the “king” repository is all it would take.

  4. Bill Mill said...

    Well, I think merges should take place once a feature is complete. Another nice thing about the new breed of source control is that it makes it very easy to create email patchsets, which are perfect for code review before merging into the trunk on the shared repo.

    It was enlightening for me to read this tutorial about how the SymPy team uses mercurial to fix bugs and review code before checkin to the main repository.

  5. Rob Wilkerson said...

    @Bill – True. I didn’t mean to imply that I think merges should be done daily regardless of current state. I was responding to “typical” development where lots of small changes are being made. Iterative improvements, if you will, rather than new features or larger scale changes.

    I’ll check out the tutorial you linked. It’s always interesting to see how others are using tools within their daily workflow. Thanks.

  6. Bill Mill said...

    Well, interesting how these things happen in bunches. Jeff Atwood just posted on source control, and a lot of his commenters suggested DVCS to him. Then, Bill de Hora posted on why he prefers DVCS, including a couple reasons we talked about and one we didn’t.

  7. Chad Kieffer said...

    The version control realm is certainly expanding and diversifying. Seems like this is a direct result of the growing number of decentralized development teams brought about by the growth of the open source community and an increase in telecommuting.

  8. Rob Wilkerson said...

    @Chad –

    And offshoring, although that could be considered a specialized type of telecommuting. I have a team in India, but I guess my point is that even with that team to consider, Git seems to solve problems that I don’t have. I was wondering what, if anything, I was missing. If I can find a free service offering Git hosting, I may move a project over and try it out in order to get some hands-on experience.

  9. Bill Mill said...

    Sorry to monopolize the thread, but Ryan Tomayko also just weighed in.

  10. Rob Wilkerson said...

    @Bill Mill – As long as you’re providing good information, monopolize away. :-) As Git’s leading evangelist (that I know), do you (or anyone you know) have any invites to the GitHub beta? I wouldn’t mind seeing what all the fuss is about first hand.

  11. Bill Mill said...

    I’ve never even installed git, I’ve been playing around with mercurial, mostly just because I like that it’s in python. I’ve liked the idea of distributed source control for a while, but am just now getting up to switching momentum.

    You can create a project at assembla, then go to admin/tools and do “add git & trac” to get a git repo and a trac set up for you; I did it with mercurial and it was pretty easy.

  12. Bill Mill said...

    Oh, and I’ll admit to some github jealousy as well, that’s a pretty sweet service.

  13. Rob Wilkerson said...

    No shit? I use Assembla’s Svn+Trac offering for several personal projects (including this site). I didn’t realize that they had a Git offering. I thought I actually looked a while back, but either they didn’t have it then or I didn’t see it. I’ll definitely be checking that out.

    Thanks.

  14. Bill Mill said...

    Since I seem to be posting my source control link feed here, here’s another good “why svn is suboptimal” post: http://blogs.sun.com/smarks/entry/why_i_don_t_like .

  15. Dustin said...

    You mentioned you use branches for “Developing large, destabilizing features, and Maintenance of current releases.” Is that because they’re difficult to use in your tools, or do you just really feel like it’s the right thing to do?

    I use branches when I think, “Huh, I wonder if this will work?” Or, “Note to self, we’re going to upgrade soon and I’ll have to do this a different way.” Or, “Ugh, none of those changes I made are ready to go to production yet, I need to go back and slice these last few changesets into a branch and work on another feature.”

    Small distances can seem very far away when you have to walk them.

  16. Rob Wilkerson said...

    @Dustin –

    I don’t think branches are particularly difficult, but since most work occurs on the trunk I tend to move off on a branch for those efforts that will impact the experience of other developers working on the trunk.

    If I’m reading you right, one of the key differences between our experiences is that, using Svn, I tend to feel like I have to think ahead and identify those changes that should be branched before I start working on them. I don’t suppose that’s entirely accurate, but to the extent that I feel that way, I guess branches are more difficult.

    I’m not sure that answers your question very well, but maybe it’s a start.

  17. João Marcus said...

    @Rob

    I didn’t really understand all the fuss about DVCS. Not until I got to use it regularly.
    We use SVN at work, but I’m using Git-SVN myself. Now that I got the hang of Git’s branching/merging, I realize how much SVN branches suck. I no longer have to remind myself what exactly I’ve been doing, and usually all I need to do is to “git merge my_branch”.

    It’s well worth the try. Nobody really sees you’re using Git-SVN. It’s transparent.

  18. Rob Wilkerson said...

    Thanks, João. It’s at the top of my list. Once I get a minute to migrate one of my existing projects over, I’m going to start learning. I have an 18-hour flight coming up and it would be nice to be able to develop and commit. My goal is to transition something by then so that I can work during the flight.