Put your hands on a programming-language-aware, refactor ready, merge tool

Tuesday, April 09, 2013 Pablo Santos 8 Comments

Merging source code files can rapidly become a nightmare if the files to be merged were first refactored. You move a method to a different location in a class while someone else modifies the same method, and the resulting merge overwhelms current-generation merge tools.

Look at the following example:

You have a “Socket” class, and then another developer decides to sort its methods in visibility order (because he just read “Implementation Patterns” or something), while yet another developer needs to modify the “Send” method.

Try to merge this with text-based 3-way merge tools. You’ll get a horrible merge, something you can’t cope with. You’ll need to go and manually place the “Send” method inside the reordered “Socket” class.

But what if the merge tools were able to “parse” the code, then understand that “developer 2” just reordered methods and “developer 1” modified “Send”? Well, a nightmare merge would turn into an automatic one.
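
To make the idea concrete, here is a minimal Python sketch of a merge that works on methods instead of lines. It is an illustration only, not a description of how our tool is implemented. It assumes each version of the class has already been parsed into a mapping from method name to body (in file order) and that nobody added or deleted methods; `merge_class`, `base`, `dev1` and `dev2` are names made up for the example.

```python
def merge_class(base, ours, theirs):
    """Three-way merge of {method_name: body} dicts.

    Dict insertion order stands for the order of the methods in the file.
    Assumption: all three versions contain the same set of methods.
    """
    base_order, our_order, their_order = list(base), list(ours), list(theirs)

    # Decide the method order: keep the order of whichever side changed it.
    if our_order == base_order:
        order = their_order
    elif their_order == base_order or our_order == their_order:
        order = our_order
    else:
        raise ValueError("both sides reordered the methods differently")

    merged, conflicts = {}, []
    for name in order:
        b, o, t = base[name], ours[name], theirs[name]
        if o == t or t == b:      # same result on both sides, or only we edited
            merged[name] = o
        elif o == b:              # only they edited
            merged[name] = t
        else:                     # both edited the same method differently
            conflicts.append(name)
            merged[name] = o      # placeholder until a human resolves it
    return merged, conflicts


# Base version of the hypothetical Socket class.
base = {"Log": "log v0", "Connect": "connect v0",
        "Send": "send v0", "Close": "close v0"}

# Developer 1 only modifies Send.
dev1 = dict(base, Send="send v1 (retries added)")

# Developer 2 only reorders the methods (public ones first).
dev2 = {name: base[name] for name in ["Connect", "Send", "Close", "Log"]}

merged, conflicts = merge_class(base, dev1, dev2)
print(list(merged), conflicts)
```

The order comes from developer 2 and the new “Send” body from developer 1, so the sketch reports no conflict at all.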

Enter a new kind of merge tool

We’ve been working on source code merge technology for a while now. We believe that good branching needs extremely good merging, and that having the two lets your team implement cleaner and higher quality code, making everyone happier from coders (who like to feel they’re doing things right) to sales (better code means lower maintenance costs).

That’s why we’ve created a merge tool that is “programming language aware” and hence it “understands your code”.

And we’re looking for feedback!

If you want to be one of the first coders to try the tool, visit our “teaser page” and grab a beta.

The long story: "... But it should understand the code!"

Whenever I get the chance to introduce “file merging” to developers who are not familiar with it, their first concern is: “Hey, but how is this going to understand whether the changes collide or not?” Then I explain how 3-way merge works, how it can solve conflicts automatically as long as the same lines were not modified in parallel, and so on, and then they tend to buy the idea.
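
The rule behind that explanation fits in a few lines of Python. The sketch below rests on a loud assumption: all three versions have the same number of lines (only in-place edits), so no diff alignment is needed. Real 3-way merge tools align the files first. `merge3_lines` is a hypothetical name.

```python
def merge3_lines(base, ours, theirs):
    """Line-by-line three-way merge for versions of equal length.

    Returns (merged_lines, conflicting_line_indexes).
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t or t == b:   # both agree, or only our side changed the line
            merged.append(o)
        elif o == b:           # only their side changed the line
            merged.append(t)
        else:                  # the same line was changed in parallel
            conflicts.append(i)
            merged.append(o)   # placeholder until a human resolves it
    return merged, conflicts


base = ["open()", "send(data)", "close()"]
ours = ["open(timeout)", "send(data)", "close()"]      # we touch line 0
theirs = ["open()", "send(data)", "close(force)"]      # they touch line 2

clean, no_conflicts = merge3_lines(base, ours, theirs)

# Now both sides touch line 0 in different ways.
theirs_clash = ["open(port)", "send(data)", "close()"]
_, clash = merge3_lines(base, ours, theirs_clash)
print(clean, no_conflicts, clash)
```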

But their first gut feeling was: “It should understand the code”.

The funny thing here is that if you ask someone experienced in version control, familiar with the state-of-the-art in merge technology, he will explain why text-based merging does a good job and why you can trust 3-way merges. His original gut feeling has just sort of vanished.

Sometimes interesting points arise: “What if two developers modify the same method in parallel? I touch the beginning and you touch the end of the method… The merge may be automatic but it’s also incorrect!!”

Interesting, right?

The Merge Panacea

I wrote about “code aware version control” long ago, but more interestingly the well-known Martin Fowler did too in his famous “Semantic Conflict” blog post.

Very recently I found a post from ThoughtWorks’ Paul Hammant titled “Features I would love source control tools to have” where the number one feature is “Semantic Diff/Merge”. He seemed to be reading my mind when he wrote:

“source-control tool[s] at the top level, should also be able to understand ‘method rename’ (and other refactorings), rather than a series of adds and deletes. Another example is ‘method reorder’ where one method/function is moved below another. There are many others that are closer to our understanding of refactoring operations.”

We’ve been thinking about a merge tool able to do this sort of thing for years. We started with XDiff/XMerge, tools able to detect code that has been moved inside a file, as a first step towards our “final” goal.

Making Things a Little Bit More Complex

Let’s now go for the next merge scenario that comes to everyone’s mind: What if you split a class, moving and modifying methods?

Take a look at the following figure:

  • You have a “Socket” class
  • Developer 1 goes and creates a new “ClientSocket” class and moves a couple of methods from “Socket” into “ClientSocket”. Then he modifies the “Send()” method.
  • Meanwhile Developer 2 goes and creates “ServerSocket”, moves two methods there, and finally renames “Socket” to “DNS”. Then he modifies the “Send()” method.

Nightmare is the word that comes to the mind of the coder doing the merge of the two files.

But, again, what if the merge tool were able to parse the code and see what happened?

Well, you’d get the resulting file with the 3 classes... automatically. And the conflict in the “Send()” method will be detected, and you’ll be prompted to solve it regardless of where the method ended up.
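
Here is a rough Python sketch of that scenario, again under stated assumptions and not a description of our actual engine. It assumes a parser has already given every class a stable identity (`C1`, `C2`, `C3` are invented ids) so that a rename is just a name change on the same id, and that each method is known by name together with the class it lives in. Which methods each developer moved is also an assumption, since it depends on the figure. The location and the body of each method are then merged independently.

```python
def pick(base, ours, theirs):
    """Three-way choice for one value. Returns (value, is_conflict)."""
    if ours == theirs or theirs == base:
        return ours, False
    if ours == base:
        return theirs, False
    return ours, True


def merge_split(base, dev1, dev2):
    """Merge class names, method locations and method bodies separately."""
    conflicts, class_names = [], {}
    for cid in sorted(set(dev1["classes"]) | set(dev2["classes"])):
        name, clash = pick(base["classes"].get(cid),
                           dev1["classes"].get(cid),
                           dev2["classes"].get(cid))
        if clash:
            conflicts.append("class " + cid)
        class_names[cid] = name

    layout = {}
    for method, (base_cid, base_body) in base["methods"].items():
        cid1, body1 = dev1["methods"][method]
        cid2, body2 = dev2["methods"][method]
        cid, moved_twice = pick(base_cid, cid1, cid2)
        body, edited_twice = pick(base_body, body1, body2)
        if moved_twice or edited_twice:
            conflicts.append(method)   # reported wherever the method now lives
        layout.setdefault(class_names[cid], {})[method] = body
    return layout, conflicts


base = {"classes": {"C1": "Socket"},
        "methods": {m: ("C1", m.lower() + " v0")
                    for m in ["Connect", "Send", "Listen", "Accept", "Resolve"]}}

# Developer 1: new ClientSocket class, moves Connect and Send, edits Send.
dev1 = {"classes": {"C1": "Socket", "C2": "ClientSocket"},
        "methods": {**base["methods"],
                    "Connect": ("C2", "connect v0"),
                    "Send": ("C2", "send v1")}}

# Developer 2: new ServerSocket class, moves Listen and Accept,
# renames Socket to DNS, and edits Send in place.
dev2 = {"classes": {"C1": "DNS", "C3": "ServerSocket"},
        "methods": {**base["methods"],
                    "Listen": ("C3", "listen v0"),
                    "Accept": ("C3", "accept v0"),
                    "Send": ("C1", "send v2")}}

layout, conflicts = merge_split(base, dev1, dev2)
print({cls: list(methods) for cls, methods in layout.items()}, conflicts)
```

The three classes come out automatically, and “Send” is the only conflict reported, placed in “ClientSocket” because only one developer moved it.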

We’re Working On It

Merge is our core concern down here while developing Plastic SCM and that’s why we took some bits and pieces from the Plastic SCM merge engine, plus XMerge/XDiff, plus a ton of new stuff, and we’ve created a tool able to deal with the cases described above and many more! :-)

Become a pioneer!!

Want to try cutting-edge technology? Go to our teaser site at http://plasticscm.com/sm/index.html, download the tool and give it a try.
Pablo Santos
I'm the CTO and Founder at Códice.
I've been leading Plastic SCM since 2005. My passion is helping teams work better through version control.
I had the opportunity to see teams from many different industries at work while I helped them improve their version control practices.
I really enjoy teaching (I've been a University professor for 6+ years) and sharing my experience in talks and articles.
And I love simple code. You can reach me at @psluaces.

8 comments:

  1. Feedback is much appreciated... :-)

    There's a conversation going on in hackernews too: https://news.ycombinator.com/item?id=5520321

  2. I hope you include(d) [INSERT PROGRAMMING LANGUAGE HERE]. We don't write in [OTHER LANGUAGE YOU INCLUDED].

  3. I had this idea back when I began my programming career (as has pretty much every programmer I'm guessing) and actually spent about 7 years of my life trying to implement it. My idea was to write an extension to PEGs automatically able to handle AST generation suitable for semantic-aware diff/merge. Spent several years working on research extending PEGs then implemented this research in several different languages... only to decide I should tackle smaller projects and move onto something more basic.

    Good luck with this, I hope you'll consider open sourcing it!

  5. Hey James, your work sounds awesome!!

    I think it would be worth having a conversation with you.

    I'm impressed.

    We simplified the resolution, we deal with it on a method basis... we don't handle the bodies as "structure" but as text, using xdiff/xmerge (our own).

  6. Guys, there's a conversation going on here too at reddit:

    http://redd.it/1c0djq

  7. In case you need to set it up with git, we made a fix to the instructions. Consider this:

    .gitconfig:

    [merge]
    tool = SemanticMerge

    [mergetool "SemanticMerge"]
    path = C:/Program Files (x86)/PlasticSCM4/semanticmerge/semanticmergetool.exe
    keepBackup = false
    trustExitCode = false
    cmd = \"C:/Program Files (x86)/PlasticSCM4/semanticmerge/semanticmergetool.exe\" -b=\"$BASE\" -d=\"$LOCAL\" -s=\"$REMOTE\" -r=\"$MERGED\" -l=csharp -emt=\"mergetool.exe -b=\"\"@basefile\"\" -bn=\"\"@basesymbolic\"\" -s=\"\"@sourcefile\"\" -sn=\"\"@sourcesymbolic\"\" -d=\"\"@destinationfile\"\" -dn=\"\"@destinationsymbolic\"\" -r=\"\"@output\"\" -t=\"\"@filetype\"\" -i=\"\"@comparationmethod\"\" -e=\"\"@fileencoding\"\"\" -edt=\"mergetool.exe -s=\"\"@sourcefile\"\" -sn=\"\"@sourcesymbolic\"\" -d=\"\"@destinationfile\"\" -dn=\"\"@destinationsymbolic\"\" -t=\"\"@filetype\"\" -i=\"\"@comparationmethod\"\" -e=\"\"@fileencoding\"\"\"

  8. The blogpost certainly makes this seem like very cool functionality. I spent some time on making a diff tool that would find common changes across multiple change-locations. I think my tool shares some functionality with what you present. Is there any way to get more info on your approach and technology? My spdiff tool is documented somewhat at : http://www.diku.dk/~jespera/ and the corresponding git repo is at github: https://github.com/jespera/spdiff
