Why merge tracking matters

Thursday, December 16, 2010

My goal with this blog post is to explain why merge tracking is so huge for all the Subversion, CVS and VSS users out there who still think “hey, I don’t care about the new SCM systems, I don’t need distributed development!”

I know the GitHubbers and Mercurial users and Plastikers out there know what I’m talking about, but for the sake of clarity: the new DVCS systems are good not only because they’re distributed, but especially because they’re able to handle merging correctly. (Yes, I hear SVN users shouting: “we do have merge tracking after 1.5”, I know, I know, and I still say… switch to another system! :P.)

So, I will try to explain why merge tracking is so important with a very simple scenario. Let’s go.

Change my code


I start with the following piece of code. We’re going to modify, in parallel, the fragment that is highlighted in blue.



Basically we’re going to do what the next image shows: one of the developers is going to make a change to the code on the “main” branch (“master” in Git jargon) and the other coder will modify it on the “featurebranch” branch. You can visualize your Git repo history using GitJungle, which displays graphics exactly like this.



Merging nightmare


Let’s go now and merge “featurebranch” back into “main”. This time we’re facing a “manual merge” because we’ve created a non-automatic conflict. It isn’t a very tough conflict to resolve, because only a small code fragment is involved. But in the real world, you sometimes have to deal with heavily modified files, and the merge isn’t so easy.

Almost every 3-way merge tool out there works in the way pictured below (whether you’re using KDiff3, WinMerge, Araxis Merge, Guiffy or Scooter Software’s great Beyond Compare!).

As you can see you’ll have to deal with the “base” (or common ancestor) and the two “contributors”. In our case: the file as it was at the beginning (base) and the two parallel changes made on different branches (destination, the revision made on “main” and source, the revision made on “featurebranch”).
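To make the roles of the base and the two contributors a bit more concrete, here is a minimal sketch in Python of the decision a 3-way merge makes for a single region of the file. The function name `merge3` and the sample strings are made up for illustration; real merge tools work on diffed blocks of lines and are far more elaborate, but the basic rule looks roughly like this:

```python
def merge3(base, source, destination):
    """Decide the result for ONE region of a file (illustrative sketch only).

    base        -- the region as it was in the common ancestor
    source      -- the region as changed on the branch being merged
    destination -- the region as changed on the branch receiving the merge
    Returns (result, conflict).
    """
    if source == destination:
        return source, False        # both sides agree (or nobody changed it)
    if source == base:
        return destination, False   # only the destination changed it
    if destination == base:
        return source, False        # only the source changed it
    return None, True               # both changed it differently: manual merge


# Hypothetical region contents, standing in for the highlighted fragment.
print(merge3("original", "original", "changed on main"))
print(merge3("original", "changed on featurebranch", "changed on main"))
```

The second call is our scenario: both contributors differ from the base and from each other, so the tool cannot decide alone and hands the region to you.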



Because we modified exactly the same code, the 3-way merge tool shows something like the following:



You, the developer or integrator, have to decide what goes into the result file. In this case the desired result is the following: we keep the new method introduced on “main” and the new loop variables introduced on “featurebranch”. Most important of all, notice the “merge arrow” that now goes from “featurebranch” into “master”. It means the underlying SCM “knows” the branch is already merged. That’s the key.



And the fun starts when you merge again


What if you now need to make a new change to the same file on “featurebranch”? Suppose the “merge” wasn’t so trivial and it took you quite a few minutes to figure out how to resolve it. I bet you wouldn’t like to go through it again, would you?

Well, the magic of “merge tracking” is that it will allow you to make a new change on “featurebranch”, on the same file, and merge it again but without having to merge again the conflicts you already resolved! That’s the key and that’s something you didn’t have with older systems (and probably the main reason why you used to hate branching and merging before…).

The following image shows how you’ve introduced a new change on the branch, and then it will be time to merge again.



Common ancestors – finding the root of all revisions


So, what will happen now when you try to merge from “featurebranch” back into “master”? Well, thanks to the “merge link” the system created before, some things will change.

Look at the situation before doing our first merge. The contributors and the ancestor are highlighted. As you can see, the changeset (commit) tagged as BL189 is the common ancestor.



But now, when you’re going to merge again, the common-ancestor calculation will be different.



Now the newly calculated “common ancestor” for the merge is the changeset on “featurebranch” that was the contributor to the first merge. The results of the first merge, including all the difficult-to-resolve conflicts, are recorded in the “main” branch contributor. With all the complicated work already behind it, the merge algorithm only has to deal with the new changes on “featurebranch”, and it can merge them automatically for you.
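If you want to see how the merge link changes the calculation, here is a rough sketch in Python. It models the history as a dictionary mapping each changeset to its parents. Only BL189 comes from the pictures above; the other changeset names (`main1`, `feat1`, `merge1`, `feat2`) and the helper functions are invented for this example, and no real SCM is claimed to implement the search exactly this way.

```python
def ancestors(graph, start):
    """Return start plus every changeset reachable through parent links."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def common_ancestors(graph, a, b):
    """Return the nearest common ancestors of a and b, as a sorted list.

    A common ancestor is dropped when another common ancestor descends
    from it, because the descendant is the closer (better) merge base.
    """
    common = ancestors(graph, a) & ancestors(graph, b)
    return sorted(
        c for c in common
        if not any(c != other and c in ancestors(graph, other) for other in common)
    )


# Before the first merge: both branches start from BL189.
before = {"BL189": [], "main1": ["BL189"], "feat1": ["BL189"]}
print(common_ancestors(before, "main1", "feat1"))

# After the first merge WITH merge tracking: merge1 records feat1 as a parent.
tracked = dict(before, merge1=["main1", "feat1"], feat2=["feat1"])
print(common_ancestors(tracked, "merge1", "feat2"))

# The same history WITHOUT the merge link: the system forgets feat1 was merged.
untracked = dict(before, merge1=["main1"], feat2=["feat1"])
print(common_ancestors(untracked, "merge1", "feat2"))
```

With the merge link in place the base moves forward to the first “featurebranch” changeset, so the region you resolved by hand is unchanged on the branch side and the result on “main” wins automatically. Without the link the base stays at BL189 and the old conflict comes back.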

Wrapping up


So, well, that’s all: I’m sure you’ve faced a situation like the one above. Only full merge tracking will save the day!
