More on recursive merge strategy
NB: This article has been updated through out the years. It's the continuation to this initial post explaining how recursive merge works. Last update was done on December 2018.
Example: Only one ancestor is the right option
Recursive merge is not only good for “criss-cross” merge situations but also for more “regular” ones where merge history simply gets complicated. I’d like to go back to the original example and explain it now in bigger detail:We’re going to merge from changeset 5 to changeset 6. In this diagram (another explanation using code will come later) we simply say the “content” of the file is A or B or C, but it applies to lines of code (containing important changes, bug fixes and so on).
DAG format
Just in case anyone has trouble understanding that the green lines are merges and they’re rendered from source to destination, here’s an alternative diagram:Common ancestors
In order to run a merge, you need to identify the “source”, the “destination” and the nearest common ancestor. In this case the “src” and “dst” are clear but the problem is the “common ancestor” because there’s more than one candidate.If you look carefully you’ll see that both changesets “2” and “3” can be taken as common ancestors.
A non-recursive merge algorithm will only use one of them, and depending on the “choice”, the result can be quite different.
Selecting “3” as common ancestor
If the algorithm chooses “3” (and as we explained in our previous post, this is the choice taken by Hg, and also would be the one taken by former plastic’s algorithm previous to “merge recursive” implementation) then the following situation will happen:It will lead to a manual merge, where the user will be forced to choose.
Selecting “2” as common ancestor – the lucky shot
If the algorithm uses “2” as common ancestor, then the merge will be between content “B”, “B” and “C•”, so, since two contributors are the same, the result will be “C”.It is the expected result as we will see later, and it is a pretty valid choice, the problem is that there’s no good consistent way to choose “2”. This time, the result matches the expected result, but sometimes it simply leads to an incorrect automatic choice, which is the real problem and the real reason why “recursive” is implemented by git and plastic.
The recursive option
Again, here’s how the “recursive works”: since there’re two ancestors at the same distance, it will “merge them” creating a new virtual ancestor that will be used as the “base” for the final merge between the merge candidates “5” and “6”:The following image depicts the first step, the calculation of the “virt” common ancestor (the intermediate one):
And then the second step, where the result is calculated automatically, this time without user intervention:
Conversation was posted on Reddit today:
ReplyDeletehttp://www.reddit.com/r/programming/comments/ovviu/discussion_on_recursive_merge_hg_vs_git/
Your first diagram is missing an arrow in comparison to the second. (In the first, the base for a 3-way-merge is pretty obvious.)
ReplyDelete@Andreas: which one is missing an arrow? http://3.bp.blogspot.com/-EvhJDUNxc0k/TxiiWdrT-gI/AAAAAAAABQA/kDmQJ_clNAQ/s1600/image00.png Are you sure?
ReplyDeleteWell, yes. In the second image (where time goes down), Node 6 is the result of a merge of 2 and 3, but in the first image (where time goes right) there is no arrow from 3 to 6.
ReplyDelete@Andreas: should be fixed now.
ReplyDeleteAnd what about the situation when there are more than 2 common ancestors? How will be they merged? In what order?
ReplyDeleteThanks
Once I understand how a 3-way merge is "smarter" than a 2-way merge, I can understand how a recursive merge can result in a "smarter" "virtual merge" with fewer conflicts; recursive is great! Thanks for your explanations.
ReplyDeleteOnce I understand "3-way merge" from another SCM blog; I can appreciate how a "recursive merge" results in a "smarter" virtual merge from common ancestors. Great explanation , thanks!
ReplyDeleteHi Pablo,
ReplyDeletesorry for resurrecting this old topic, but I think you failed to address Mercurial-guy's only interesting point: What to do when you can't automatically create the virtual ancestor because the merge to create it has conflicts.
Do you have any thoughts on that?
Thanks,
Jorge
Well, I actually addressed some of the mercurial's post mistakes, starting with the assumption that Plastic was a layer on top of Git :-)
ReplyDeleteThe thing is: if there is a merge in the intermediate virtual ancestor Plastic will go and ask you to merge it. And it will do a better job than Git dealing with intermediate file merges, because it won't write the resulting file with several >> marks saying what was modified, but will launch the 3-way merge tool on each intermediate merge.
Taking the wrong ancestor is worse than having to fix some conflicts :-)