Get your hands on a programming-language-aware, refactor-ready merge tool
Merging source code files can rapidly become a nightmare if the files to be merged were first refactored. You move a method to a different location in a class while someone else modifies the same method, and the resulting merge defeats current-generation merge tools. Look at the following example:
You have a “Socket” class, and then another developer decides to sort the methods in visibility order (because he just read “Implementation Patterns” or something), while yet another developer needs to modify the “Send” method.
Try to merge this with text-based 3-way merge tools. You’ll get a horrible merge, something you can’t cope with. You’ll need to go and manually place the “Send” method inside the reordered “Socket” class.
But what if the merge tools were able to “parse” the code, then understand that “developer 2” just reordered methods and “developer 1” modified “Send”? Well, a nightmare merge would turn into an automatic one.
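To make the idea concrete, here is a minimal sketch in Python. It is not the actual tool and does not claim to show its algorithm. It parses each version of a toy Socket class with the standard ast module, takes the method order from whichever side reordered, and takes each method body from whichever side changed it. The names methods_of and merge_methods, the toy class, and the assumption that both developers keep the same set of methods are made up for illustration.

```python
import ast

def methods_of(source):
    """Parse a single class and return its methods as an ordered {name: source} dict."""
    cls = ast.parse(source).body[0]
    return {node.name: ast.get_source_segment(source, node)
            for node in cls.body if isinstance(node, ast.FunctionDef)}

def merge_methods(base, ours, theirs):
    """Three-way merge at method granularity (assumes the same methods on all sides)."""
    base_order, our_order, their_order = list(base), list(ours), list(theirs)
    # Order: keep the order of the side that moved methods around.
    if our_order == base_order or our_order == their_order:
        order = their_order
    elif their_order == base_order:
        order = our_order
    else:
        raise ValueError("both sides reordered the methods differently")

    merged = {}
    for name in order:
        # Body: keep the version of the side that edited the method.
        if ours[name] == base[name] or ours[name] == theirs[name]:
            merged[name] = theirs[name]
        elif theirs[name] == base[name]:
            merged[name] = ours[name]
        else:
            raise ValueError(f"conflict in method {name!r}")
    return merged

BASE = '''
class Socket:
    def send(self, data):
        return self._write(data)
    def _write(self, data):
        return len(data)
    def close(self):
        pass
'''

# Developer 1 modifies send() and leaves the order alone.
DEV1 = '''
class Socket:
    def send(self, data):
        self._check_open()
        return self._write(data)
    def _write(self, data):
        return len(data)
    def close(self):
        pass
'''

# Developer 2 only sorts the methods by visibility (public first).
DEV2 = '''
class Socket:
    def send(self, data):
        return self._write(data)
    def close(self):
        pass
    def _write(self, data):
        return len(data)
'''

merged = merge_methods(methods_of(BASE), methods_of(DEV1), methods_of(DEV2))
print(list(merged))
```

Because the comparison happens per method and not per line, the reorder and the edit to send() never touch the same unit, so nothing needs manual resolution in this toy case.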
Enter a new kind of merge tool
We’ve been working on source code merge technology for a while now. We believe that good branching needs extremely good merging, and that having the two lets your team implement cleaner and higher quality code, making everyone happier from coders (who like to feel they’re doing things right) to sales (better code means lower maintenance costs). That’s why we’ve created a merge tool that is “programming language aware” and hence it “understands your code”.
And we’re looking for feedback!
If you want to be one of the first coders to try the tool, visit our “teaser page” and grab a beta.
The long story: "... But it should understand the code!"
Whenever I get the chance to introduce “file merging” to developers who are not familiar with it, their first concern is: “Hey, but how is this going to understand whether the changes collide or not?” Then I explain how the 3-way merge works and how it is able to solve conflicts automatically if the same lines were not modified in parallel, and they tend to buy the idea. But their first gut feeling was: “It should understand the code”.
The funny thing here is that if you ask someone experienced in version control, familiar with the state-of-the-art in merge technology, he will explain why text-based merging does a good job and why you can trust 3-way merges. His original gut feeling just sort of vanished.
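For readers who have not seen it spelled out, the text-based 3-way merge can be sketched roughly as follows. This is a simplified illustration built on Python's difflib, not the algorithm of any particular merge tool: each side is diffed against the common base, edits that touch different base lines are both applied, and edits that touch or overlap the same base lines are reported as a conflict. The names edits, merge3 and MergeConflict are hypothetical.

```python
from difflib import SequenceMatcher

class MergeConflict(Exception):
    """Raised when both sides changed the same region of the base differently."""

def edits(base, side):
    """Return the changes turning base into side as (start, end, new_lines) on base indices."""
    matcher = SequenceMatcher(a=base, b=side, autojunk=False)
    return [(i1, i2, tuple(side[j1:j2]))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]

def merge3(base, ours, theirs):
    """Merge two descendants of base, given as lists of lines."""
    our_edits, their_edits = edits(base, ours), edits(base, theirs)
    for a in our_edits:
        for b in their_edits:
            # Conservative rule: edits that touch or overlap must be identical.
            touching = a[0] <= b[1] and b[0] <= a[1]
            if touching and a != b:
                raise MergeConflict(
                    f"base lines {min(a[0], b[0])}-{max(a[1], b[1])}")
    merged = list(base)
    # Apply from the bottom of the file upwards so earlier indices stay valid.
    for start, end, new_lines in sorted(set(our_edits + their_edits), reverse=True):
        merged[start:end] = new_lines
    return merged

base = ["open()", "send(data)", "flush()", "close()"]
ours = ["open(timeout)", "send(data)", "flush()", "close()"]      # edits the first line
theirs = ["open()", "send(data)", "flush()", "close(force)"]      # edits the last line
result = merge3(base, ours, theirs)
print(result)
```

Note that the merge succeeds purely because the two edits sit on different lines. Nothing in this procedure knows whether those lines belong to the same method or whether the combined result still makes sense.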
Sometimes interesting points arise: “What if two developers modify the same method in parallel? I touch the beginning and you touch the end of the method... The merge may be automatic but it’s also incorrect!”
Interesting, right?
The Merge Panacea
I wrote about “code aware version control” long ago but, more interestingly, the well-known Martin Fowler did it too in his famous “Semantic Conflict” blog post. Very recently I found a post from ThoughtWorks’ Paul Hammant titled “Features I would love source control tools to have” where the number one feature is “Semantic Diff/Merge”. He seemed to be reading my mind when he wrote:
“source-control tool[s] at the top level, should also be able to understand ‘method rename’ (and other refactorings), rather than a series of adds and deletes. Another example is ‘method reorder’ where one method/function is moved below another. There are many others that are closer to our understanding of refactoring operations.”
We’ve been thinking about a merge tool able to do this sort of thing for years. We started with XDiff/Xmerge, tools able to detect code that has been moved inside a file, as a first step towards our “final” goal.
Making Things a Little Bit More Complex
Let’s now go for the next merge scenario that comes to everyone’s mind: What if you split a class, moving and modifying methods along the way? Take a look at the following figure:
- You have a “Socket” class
- Developer 1 goes and creates a new “ClientSocket” class and moves a couple of methods from “Socket” into “ClientSocket”. Then he modifies the “Send()” method.
- Meanwhile Developer 2 goes and creates “ServerSocket”, moves two methods there, and finally renames “Socket” to “DNS”. Then he modifies the “Send()” method.
Nightmare is the word that comes to the mind of the coder doing the merge of the two files.
But, again, what if the merge tool were able to parse the code and see what happened?
Well, you’d get the resulting file with the 3 classes... automatically. And the conflict in the “Send()” method would be detected and you’d be prompted to solve it independently of where the method is located.
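A rough way to picture this scenario in code is to track every method by name, and to merge its location (the class it lives in) separately from its body. The Python sketch below is only an illustration under several assumptions that are not in the scenario above: method names are unique, no methods are added or deleted, the rename of “Socket” to “DNS” has already been detected by some earlier step and is passed in as renames2, Developer 1 moved “Send” into “ClientSocket”, and the method names other than “Send” are invented. It does not describe how the real tool works.

```python
def pick(base, ours, theirs):
    """Three-way rule for a single value; returns (value, is_conflict)."""
    if ours == theirs or theirs == base:
        return ours, False
    if ours == base:
        return theirs, False
    return ours, True  # both sides changed it in different ways

def flatten(classes):
    """{class: {method: body}} -> {method: (class, body)}."""
    return {method: (cls, body)
            for cls, methods in classes.items()
            for method, body in methods.items()}

def merge_classes(base, dev1, dev2, renames2):
    """renames2 maps old class names to new ones on dev2's side (assumed given)."""
    undo = {new: old for old, new in renames2.items()}
    b, d1, d2 = flatten(base), flatten(dev1), flatten(dev2)
    merged, conflicts = {}, []
    for method, (base_cls, base_body) in b.items():
        cls1, body1 = d1[method]
        cls2, body2 = d2[method]
        # Compare locations with dev2's rename undone, then re-apply the rename.
        cls, cls_conflict = pick(base_cls, cls1, undo.get(cls2, cls2))
        cls = renames2.get(cls, cls)
        body, body_conflict = pick(base_body, body1, body2)
        if cls_conflict or body_conflict:
            conflicts.append(method)
        merged.setdefault(cls, {})[method] = body
    return merged, conflicts

base = {"Socket": {"Connect": "v1", "Send": "v1", "Listen": "v1",
                   "Accept": "v1", "Resolve": "v1"}}
dev1 = {"Socket": {"Listen": "v1", "Accept": "v1", "Resolve": "v1"},
        "ClientSocket": {"Connect": "v1", "Send": "v2 by dev1"}}
dev2 = {"DNS": {"Connect": "v1", "Send": "v2 by dev2", "Resolve": "v1"},
        "ServerSocket": {"Listen": "v1", "Accept": "v1"}}

merged, conflicts = merge_classes(base, dev1, dev2, renames2={"Socket": "DNS"})
print(sorted(merged), conflicts)
```

The three classes fall out of the location merge on their own, and “Send” is the only method left for a human to resolve, regardless of which class it ended up in.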
Feedback is much appreciated... :-)
There's a conversation going on at Hacker News too: https://news.ycombinator.com/item?id=5520321
I hope you include(d) [INSERT PROGRAMMING LANGUAGE HERE]. We don't write in [OTHER LANGUAGE YOU INCLUDED].
I had this idea back when I began my programming career (as has pretty much every programmer I'm guessing) and actually spent about 7 years of my life trying to implement it. My idea was to write an extension to PEGs automatically able to handle AST generation suitable for semantic-aware diff/merge. Spent several years working on research extending PEGs then implemented this research in several different languages... only to decide I should tackle smaller projects and move on to something more basic.
Good luck with this, I hope you'll consider open sourcing it!
Hey James, your work sounds awesome!!
I think it would be worth having a conversation with you.
I'm impressed.
We simplified the resolution, we deal with it on a method basis... we don't handle the bodies as "structure" but as text, using xdiff/xmerge (our own).
Guys, there's a conversation going on here too at reddit:
http://redd.it/1c0djq
In case you need to set it up with git, we made a fix to the instructions. Consider this:
.gitconfig:
[merge]
tool = SemanticMerge
[mergetool "SemanticMerge"]
path = C:/Program Files (x86)/PlasticSCM4/semanticmerge/semanticmergetool.exe
keepBackup = false
trustExitCode = false
cmd = \"C:/Program Files (x86)/PlasticSCM4/semanticmerge/semanticmergetool.exe\" -b=\"$BASE\" -d=\"$LOCAL\" -s=\"$REMOTE\" -r=\"$MERGED\" -l=csharp -emt=\"mergetool.exe -b=\"\"@basefile\"\" -bn=\"\"@basesymbolic\"\" -s=\"\"@sourcefile\"\" -sn=\"\"@sourcesymbolic\"\" -d=\"\"@destinationfile\"\" -dn=\"\"@destinationsymbolic\"\" -r=\"\"@output\"\" -t=\"\"@filetype\"\" -i=\"\"@comparationmethod\"\" -e=\"\"@fileencoding\"\"\" -edt=\"mergetool.exe -s=\"\"@sourcefile\"\" -sn=\"\"@sourcesymbolic\"\" -d=\"\"@destinationfile\"\" -dn=\"\"@destinationsymbolic\"\" -t=\"\"@filetype\"\" -i=\"\"@comparationmethod\"\" -e=\"\"@fileencoding\"\"\"
The blog post certainly makes this seem like very cool functionality. I spent some time on making a diff tool that would find common changes across multiple change-locations. I think my tool shares some functionality with what you present. Is there any way to get more info on your approach and technology? My spdiff tool is documented somewhat at http://www.diku.dk/~jespera/ and the corresponding git repo is on GitHub: https://github.com/jespera/spdiff