Switching projects to Git

The purpose of this post is to tell you the story of the Version Control System (VCS) choices I have made while maintaining my open source projects ATF, Kyua and Lutok. It also details where my thoughts are headed these days.

This is not a description of centralized vs. distributed VCSs, and it does not intend to be one. This does not intend to compare Monotone to Git either, although you'll probably feel like it while reading the text. Note that I have fully known the advantages of DVCSs over centralized systems for many years, but for some reason or another I have been "forced" to use centralized systems on and off. The Subversion hiccup explained below is... well... regrettable, but it's all part of the story!

Hope you enjoy the read.

Looking back at Monotone (and ATF)

I still remember the moment I discovered Monotone in 2004: simply put, it blew my mind. It was clear to me that Distributed Version Control Systems (DVCSs) were going to be the future, and I eagerly adopted Monotone for my own projects. A year later, Git appeared and it took all the praise for DVCSs: developers all around started migrating en masse to Git, leaving behind other (D)VCSs. Many of these developers then went on to make Git usable (it certainly wasn't at first) and well-documented. (Note: I really dislike Git's origins... but I won't get into details; it has been many years since that happened.)

One of the projects in which I chose to use Monotone was ATF. That might have been a good choice at the time despite being very biased, but it has caused problems over time. These have been:

Difficulty getting Monotone installed: While most Linux distributions come with a Monotone binary package these days, that was not the case years ago. But even if all Linux distributions have binary packages nowadays, the main consumers of ATF are NetBSD users, and their only choice is to build their own binaries. This generates discomfort because there is a lot of FUD surrounding C++ and Boost.

High entry barrier to potential contributors: It is a fact that Monotone is not popular, which means that hardly anybody is familiar with it. Monotone's CLI is very similar to CVS's, and I'd say the knowledge transition for basic usage is trivial, but the process of cloning a remote project was really convoluted until "recently". The lack of binary packages, combined with complex instructions on just how to fetch the sources of a project, only helps scare people away.

Missing features: Although years have passed, Monotone still lacks some important features that impact its usability. For example, to my knowledge, it's still not possible to do work-directory merges and, while the interactive merges offered by the tool seem like a cool idea, they are not really practical as you get no chance to validate the merge. It is also not possible, for example, to reference the parent commit of any given commit without looking at the parent's ID. (Yeah, yeah, in a DAG there may be more than one parent, but that's not the common case.) Nor is it possible to know what a push/pull operation is going to change on both sides of the connection. And key management and trust have been broken since day one and are still not fixed. Etc, etc, etc.

No hosting: None of the major project hosting sites support Monotone. While there are some playground hosting sites, they are toys. I have also maintained my own servers sometimes, but it's certainly inconvenient and annoying.

No tools support: Pretty much no major development tools support Monotone as a VCS backend. Consider Ohloh, your favorite bug tracking system or your editor/IDE. (I attempted to install Trac with some alpha plugin to add Monotone support and it was a huge mess.)

No more active development: This is the last straw. The developers of Monotone that created the foundations of the project left years ago. While the rest of the developers did a good job in coming up with a 1.0 release by March 2011, nothing else has happened since then. To me, it looks like a dead project at this point :-(

Despite all this, I have been maintaining ATF in its Monotone repository, but I have felt the pain points above for years. Furthermore, the few times an end user has approached ATF to offer a contribution, they have had tons of trouble getting a fresh checkout of the repository and have given up. So staying with Monotone hurts the project more than it helps.

The adoption of Subversion (in Kyua)

To fix this mess, when I created the Kyua project two years ago, I decided to use Subversion instead of a DVCS. I knew upfront that it was a clear regression from a functionality point of view, but I was going to live with it. The rationale for this decision was to make the entry barrier to Kyua much lower by using off-the-shelf project hosting. And, because NetBSD developers use CVS (shrug), choosing Subversion was a reasonable choice because of the workflow similarities to CVS and thus, supposedly, the low entry barrier.

Honestly, the choice of Subversion has not fixed anything, and it has introduced its own trouble. Let's see why:

ATF continues to be hosted in a Monotone repository, and Kyua depends on ATF. You can spot the problem, can't you? It's a nightmare to check out all the dependencies of Kyua, using different tools, just to get the thing working.

As of today, Git is as popular as Subversion, if not more so. All the major operating systems have binary packages for Git and/or bundle Git in their base installation (hello, OS X!). Installing Git on NetBSD is arguably easier (at least faster!) than installing Subversion. Developers are used to Git. Or let me fix that: developers love Git.

Subversion gets in the way more than it helps; it really does once you have experienced what other VCSs have to offer. I currently maintain independent checkouts of the repository (appropriately named 1, 2 and 3) so that I can develop different patches on each before committing the changes. This gets old really quickly. Not to mention when I have to fly for hours, as being stuck without an internet connection and plain-old Subversion... is suboptimal. Disconnected operation is key.

The fact that Subversion is slowing down development, and the fact that it really does not help in getting new contributors more than Git would, make me feel it is time to say goodbye to Subversion.

The migration to Git

At this point, I am seriously considering switching all of ATF, Lutok and Kyua to Git. No Mercurial, no Bazaar, no Fossil, no anything else. Git.

I am still not decided, and at this point all I am doing is toying with the migration process of the existing Monotone and Subversion repositories to Git while preserving as much of the history as possible. (It's not that hard, but there are a couple of details I want to sort out first.)

But why Git?

First and foremost, because it is the most popular DVCS. I really want to have the advantages of disconnected development back. (I have tried git-svn and svk and they don't make the cut.)

At work, I have been using Git for a while to cope with the "deficiencies" of the centralized VCS of choice. We use the squashing functionality intensively, and I find this invaluable to constantly and shamelessly commit incomplete/broken pieces of code that no-one will ever see. Not everything deserves being in the recorded history!

Related to the above, I've grown accustomed to keeping unnamed, private branches in my local copy of the repository. These branches needn't match the public repository. In Monotone, you had this functionality in the form of "multiple heads for a given branch", but this approach is not as flexible as named private branches.

Monotone is able to export a repository to Git, so the transition is easy for ATF. I have actually been doing this periodically so that Ohloh can gather stats for ATF.

Lutok and ATF are hosted on Google Code, and this hosting platform now supports Git out of the box.

No Mercurial? Mercurial looks a lot like Monotone, and it is indeed very tempting. However, the dependency on Python is not that appropriate in the NetBSD context. Git, without its documentation, builds very quickly and is lightweight enough. Plus, if I have to change my habits, I would rather go with Git given that the other open source projects I am interested in use Git.

No Bazaar? No, not that popular. And the fact that this is based on GNU arch makes me cringe.

No Fossil? This tool looks awesome and provides much more than DVCS functionality: think about distributed wiki and bug tracking; cool, huh? It also appears to be a strong contender in the current discussions of what system NetBSD should choose to replace CVS. However, it is a one-man effort, much like Monotone was. And few people are familiar with it, so Fossil wouldn't solve the issue of lowering the entry barrier. Choosing Fossil would mean repeating the same mistake as choosing Monotone.

So, while Git has its own deficiencies (e.g. I still don't like the fact that it is unable to record file moves; heuristics are not the same), it seems like a very good choice. The truth is, it will ease development by a factor of a million (OK, maybe not that much) and, because the only person (right?) that currently cares about the upstream sources for any of these projects is me, nobody should be affected by the change.

The decision may seem a bit arbitrary given that the points above don't provide too much rationale to compare Git against the other alternatives. But if I want to migrate, I have to make a choice and this is the one that seems most reasonable.

Comments? Encouragements? Criticisms?

February 11, 2012 · Tags: <a href="/tags/atf">atf</a>, <a href="/tags/git">git</a>, <a href="/tags/kyua">kyua</a>, <a href="/tags/lutok">lutok</a>, <a href="/tags/monotone">monotone</a>, <a href="/tags/vcs">vcs</a>

New version of the monotone-server package in pkgsrc

Wow, it has been a long time... 5 years ago, I created the monotone-server package in pkgsrc, a package that provided an interactive script to set up a monotone server from scratch with, what I thought, minimal hassle. My package did the job just fine, but last year I was blown away by the simplicity of the same package in Fedora: their init.d script provides a set of extra commands to initialize the server before starting it up, and that is it. No need to mess with a separate interactive script; no need to create and memorize passphrases that you will never use; and, what's more, all integrated in the one place that makes sense: the init.d "service management" script. I have been jealous of their approach for a while, but I've finally got to it: I've spent the last few days rewriting the monotone-server package in pkgsrc and came up with a similar scheme. And this new package just made its way to pkgsrc-HEAD! The new package comes with what I think is a detailed manual page that explains how to configure the server from scratch. Take a look and, if you find any mistakes, inconsistencies or improvements to be made, let me know! In the meantime, I will log into my home server, rebuild the updated package and put it in production :-)

March 12, 2010 · Tags: <a href="/tags/monotone">monotone</a>, <a href="/tags/netbsd">netbsd</a>, <a href="/tags/pkgsrc">pkgsrc</a>

Back to Stone Age

For a rather long while I had been able to avoid the use of the Subversion services offered by my research group even if they were omnipresent. But today, this lucky streak ended. I have been "forced" to use one of these devilish repositories to add some of my stuff. Using this goes against my "principles", as a colleague said. If you don't know it, Subversion is a centralized version control system. Linear history, the non-transparent way to back up the master server, primitive merging interfaces and, the worst thing of all, the need to access the network for every single operation are unbearable facts. Using a centralized VCS is like going back in time a million years. (Oh, excuse me, a million is too few.) I hate it! I recently went on a trip and didn't have Internet access either on the plane or at the hotel; do you know how cool it was to still have full access (not just the working copy, that is) to my code, documents and everything else? And even if you have Internet access, can you imagine how fast you can work without having to wait for the network? Well, I can't really blame the administrators. As far as I can tell, they are not too familiar with VCSs and, when making a decision, they just went for what was everywhere, which unfortunately is Subversion. Everybody is making that mistake in this department and university. Let's see when I will have some free time to prepare a presentation about DVCSs (including Monotone as a case study) and give it to the whole department. Given today's events, I should do this as soon as possible. Administrators, I know you are reading me. Don't take this the wrong way! ;-)

April 12, 2008 · Tags: <a href="/tags/monotone">monotone</a>, <a href="/tags/vcs">vcs</a>

Daggy fixes (in Monotone)

If you inspect ATF's source code history, you'll see a lot of merges. But why is that, if I'm the only developer working on the project? Shouldn't the revision history be linear? Well, the thing is it needn't and it shouldn't; the subtle difference is important here :-)

It needn't be linear because Monotone is a VCS that stores history in a DAG, so it is completely natural to have a non-linear history. In fact, distributed development requires such a model if you want to preserve the original history (instead of stacking changes on top of revisions different from the original ones).

On the other hand, it shouldn't be linear because there are better ways to organize the history. As the DaggyFixes page in the Monotone Wiki mentions:

All software has bugs, and not all changes that you commit to a source tree are entirely good. Therefore, some commits can be considered "development" (new features), and others can be considered "bugfixes" (redoing or sometimes undoing previous changes). It can often be advantageous to separate the two: it is common practice to try and avoid mixing new code and bugfixes together in the same commit, often as a matter of project policy. This is because the fix can be important on its own, such as for applying critical bugfixes to stable releases without carrying along other unrelated changes.

The key idea here is that you should commit each bug fix right on top of the change that introduced the bug, provided it is clear which commit that is and you can easily locate it. If you do that, you end up with a non-linear history that requires one merge per bug fix to resolve the divergences inside a single branch.

I certainly recommend reading the DaggyFixes page. One more reason to switch to Monotone (or any other DAG-based VCS, of course)? ;-) Oh, I now notice I once blogged about this same idea, but that page is far clearer than my explanation.

That is why you'll notice lots of merges in the ATF source tree: I've started applying this methodology to see how well it behaves and I find it very interesting so far. I'd now hate switching to CVS and losing all the history for the project (because attempting to convert it to CVS's model could be painful), even if it is not that interesting.
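To make the daggy-fix idea a bit more concrete, here is a minimal sketch in Python. It is only a toy model of a revision DAG, not anything Monotone provides: the revision names, the `history` dictionary and the `ancestors` helper are all invented for illustration. It shows that a fix committed as a child of the buggy revision carries no unrelated changes, so it can be merged into both the mainline and a stable line.

```python
def ancestors(rev, parents):
    """Return the set of all revisions reachable from rev via parent links."""
    seen = set()
    stack = list(parents[rev])
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents[cur])
    return seen


# Hypothetical history: each revision maps to the list of its parents.
history = {
    "base": [],
    "bug": ["base"],      # the commit that introduced the bug
    "feature": ["bug"],   # unrelated development done afterwards
    "stable": ["bug"],    # a release line cut before "feature"
    "fix": ["bug"],       # the daggy fix: a child of the buggy commit
}

# One merge per line of development that needs the fix.
history["main-merge"] = ["feature", "fix"]
history["stable-merge"] = ["stable", "fix"]

# The fix itself only depends on the buggy commit and what came before it.
fix_carries = ancestors("fix", history)

# The stable line gets the fix without picking up the unrelated feature.
stable_has_fix = "fix" in ancestors("stable-merge", history)
stable_has_feature = "feature" in ancestors("stable-merge", history)
```

Had the fix been committed on top of "feature" instead, merging it into the stable line would have dragged "feature" along with it, which is exactly what the approach avoids.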

July 17, 2007 · Tags: <a href="/tags/atf">atf</a>, <a href="/tags/monotone">monotone</a>

Monotone's help rewrite merged

I have just merged my net.venge.monotone.help-rewrite branch into Monotone's mainline source code. I already explained its purpose in a past post, so please refer to it to see what has changed. There is still some work to do in the "help rewrite" area, but I won't have the time to do it in the near future. Hence I added some items to the ROADMAP file explaining what needs to be done, hoping that someone else can pick them up and do the work. They are not difficult, and they can introduce you to Monotone's development if you are interested! ;-)

May 20, 2007 · Tags: <a href="/tags/monotone">monotone</a>

Talk about Git

I've been using Git (or, more precisely, Cogito) recently as part of my PFC and, although I don't like the way Git was started, I must confess I like it a lot. In some ways it is very similar to Monotone (the version control system I prefer now) but it has its own features that make it very interesting. One of these is the difference between local and remote branches, something I'll talk about in a future post. For now I would just like to point you to a talk about Git given by Linus at Google. He focuses more on general concepts of distributed version control systems than on Git itself, so most of the ideas given there apply to many other systems as well. If you still don't see the advantages of distributed VCSs over centralized ones, you must watch this. Really. Oh, and it is quite "funny" too ;-)

May 19, 2007 · Tags: <a href="/tags/cogito">cogito</a>, <a href="/tags/dvcs">dvcs</a>, <a href="/tags/git">git</a>, <a href="/tags/monotone">monotone</a>

Monotone's help rewrite

A couple of weeks ago, I updated Monotone to 0.34 and noticed a small style problem in the help output: the line wrapping was not working properly, so some words got cut at the terminal's boundary. After resolving this minor issue, I realized that I didn't know what most of the commands shown in the main help screen did. Virtually all other command-line utilities that have integrated help show some form of an abstract description for each command, which allows the novice to quickly see what they are about. So why wouldn't Monotone? I started extending the internal commands interface to accept a little abstract for each command and command group, to be later shown in the help output. This was rather easy, and I posted some preliminary changes to the mailing list. But you know what happens when proposing trivial changes... People complained that the new output was too long to be useful, which I agreed with and fixed by only showing the commands in a given group at a time. But... there was also an interesting request: allow the documentation of subcommands (e.g. list keys) in a way consistent with how primary commands (e.g. checkout) are defined. There is even a bug (#18281) about this issue. And... that has kept me busy for way longer than I expected. I've ended up rewriting the way commands are defined internally by constructing a tree of commands instead of a plain list. This allows the generic command lookup algorithm to locate commands at any level in the tree, and thus to standardize the way help and options are defined on them. The work is almost done and can be seen in the net.venge.monotone.help-rewrite branch. I've also been messing with Cogito recently and found some of its user interface features to be very convenient. These include automatic paging of long output and colored diffs straight on the console. Something to borrow from them if I ever have the time for it, I guess ;-)
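For the curious, the command-tree idea can be sketched in a few lines of Python. This is not Monotone's actual code (which is C++); the `Command` class, its methods and the abstracts below are all made up for illustration. It only shows how a tree lets the same lookup and the same help mechanism work for commands at any depth.

```python
class Command:
    """A node in a tree of commands; subcommands are child nodes."""

    def __init__(self, name, abstract):
        self.name = name
        self.abstract = abstract
        self.children = {}

    def add(self, child):
        """Register a subcommand under this command and return it."""
        self.children[child.name] = child
        return child

    def find(self, path):
        """Walk the tree following path; return None if any step is unknown."""
        node = self
        for name in path:
            node = node.children.get(name)
            if node is None:
                return None
        return node

    def help_lines(self):
        """One 'name: abstract' line per direct subcommand, sorted by name."""
        ordered = sorted(self.children.values(), key=lambda c: c.name)
        return [f"{c.name}: {c.abstract}" for c in ordered]


# Hypothetical tree with illustrative abstracts.
root = Command("", "top-level command group")
root.add(Command("checkout", "Checks out a revision into a new directory"))
list_cmd = root.add(Command("list", "Shows database objects"))
list_cmd.add(Command("keys", "Lists the known keys"))

# The same lookup serves primary commands and subcommands alike.
primary = root.find(["checkout"])
nested = root.find(["list", "keys"])
unknown = root.find(["list", "nonexistent"])
```

Because every node has the same shape, printing the help for a group is just `help_lines()` on whichever node the lookup returns, no matter how deep it sits.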

April 23, 2007 · Tags: <a href="/tags/monotone">monotone</a>

Monotone: Got T-Shirt

On the last weekend of last November, Monotone's main author, Graydon Hoare, proposed a "bugathon": a time frame in which all efforts could be focused on fixing existing bugs in the code. The prize for solving a bug was a T-Shirt (or any other object from the project's CafePress shop), so I decided to help a bit by fixing some NetBSD portability bugs. And today, I received the long-sleeved T-Shirt I ordered in exchange for the fixes :-) (Will post a photo of this one too when I learn how to do it.)

December 23, 2005 · Tags: <a href="/tags/monotone">monotone</a>

Monotone: Using mini-branches to apply patches

The Monotone VCS provides the concept of mini-branches. A mini-branch is a lightweight branch created inside a formal branch whenever a commit causes "conflicts" with the actual contents of the repository. For example, if your working copy is not up to date and you commit something, you will create a new head within the branch (that is, a mini-branch), that you will later need to (possibly manually) merge with the other head to remove the divergence.
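As a rough illustration of what "a new head within the branch" means, here is a small Python sketch. It is a toy model rather than anything taken from Monotone: the revision names and the `find_heads` helper are invented. A head is simply a revision that no other revision lists as a parent, so two commits on top of the same revision leave the branch with two heads until a merge joins them.

```python
def find_heads(parents):
    """Return the revisions that no other revision lists as a parent."""
    all_parents = {p for ps in parents.values() for p in ps}
    return sorted(rev for rev in parents if rev not in all_parents)


# Hypothetical branch: each revision maps to the list of its parents.
history = {
    "a": [],
    "b": ["a"],
    "c": ["b"],  # committed by someone else on top of "b"
    "d": ["b"],  # committed from a working copy that was still at "b"
}

# Two heads: the branch has diverged, so "d" acts as a mini-branch.
heads_before = find_heads(history)

# Merging both heads removes the divergence.
history["m"] = ["c", "d"]
heads_after = find_heads(history)
```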

October 8, 2005 · Tags: <a href="/tags/monotone">monotone</a>
Continue reading (about 3 minutes)

Manual ChangeLogs; a thing of the past?

If you have ever examined the source distribution of an open source project, you'll probably have noticed a ChangeLog file. This file lists, in good detail, all changes made to the source code in reverse order, giving their description, the names of the affected files and the name of the author who made the change. So far, so good. But I really think that these files, or rather the way they are written and managed, are flawed. Let's see why:

August 21, 2005 · Tags: <a href="/tags/monotone">monotone</a>
Continue reading (about 3 minutes)

Monotone's CVS gateway, part 2

After explaining what Monotone's CVS gateway is, I've been asked to post a little step-by-step tutorial about it. I'll focus the example on pkgsrc. Here it goes: The first step is to create a local database for Monotone and a key for personal use:

    $ monotone --db=~/pkgsrc.db db init
    $ monotone --db=~/pkgsrc.db genkey user@example.com

Once this is done, we can proceed to import the CVS repository into the database. We can do this in two different ways:

July 6, 2005 · Tags: <a href="/tags/monotone">monotone</a>
Continue reading (about 2 minutes)

Monotone's CVS gateway

After a long time, I've finally decided to give Monotone's net.venge.monotone.cvssync branch a try. The code in it implements a bidirectional gateway between Monotone and CVS. What this means is that Monotone can be used for private development while working on a project that already uses CVS (doing the inverse could be... stupid?). The way it works is basically the following: first of all, you synchronize your local Monotone database with a remote CVS repository, importing the whole revision tree into it using cvs_pull. Secondly, you commit to your local Monotone tree as much as you want. Finally, when you want to publish your changes, you push them to the CVS repository using cvs_push and they get integrated nicely (each revision in your local database is translated into a single CVS commit). There are some small problems, though: during a push, all the new CVS revisions get the same date, but I think this is unsolvable.

June 28, 2005 · Tags: <a href="/tags/monotone">monotone</a>
Continue reading (about 2 minutes)

Monotone dedicated server

Some weeks ago, I installed Monotone on my main machine to act as a dedicated server for Vigipac's source code. During the process, I had to write an rc.d script and configure multiple things to get everything working safely. The overall process is not difficult once you know how Monotone works, but it is quite time-consuming and error-prone (due to the specific file permissions required, for example). So I thought I could share my work to make this process easier for other people and make them love pkgsrc even more ;-)

January 12, 2005 · Tags: <a href="/tags/monotone">monotone</a>
Continue reading (about 2 minutes)

Impressions on Monotone

I'm amazed after having played with Monotone during the whole evening. Simply put, it is a distributed version control system, similar to CVS in the sense that it keeps track of changes across files and lets multiple people work on them at the same time. But, unlike CVS, it has many other cool features. The front page of its website contains a nice paragraph summarizing all available features, so I'm not repeating them here. But let me discuss what has caught my attention.

November 28, 2004 · Tags: <a href="/tags/monotone">monotone</a>
Continue reading (about 2 minutes)