Track refactored code across files with Plastic SCM

Tuesday, August 11, 2015, 4 Comments

You refactor some code before doing a bug fix: you clean it up to better understand how it works, and you leave the code in better shape than you found it, in the purest boy-scout style :P

Then you move methods to a new class in a new file, and later you make some modifications to the moved methods. Easy, huh?

Well, Plastic SCM can now track the refactor and diff it correctly, because we have just implemented multi-file semantic diff or "analyze refactors":

This is another step toward semantic version control, as we explained about a month ago with our initial release.

Now Plastic can track moved code across files, and soon we'll apply the same tech to merging.

Step one: a multi-file refactor scenario - splitting a class in three

To explain how Plastic can track a multi-file refactor I'm going to split a class in three, step by step. I'm using Visual Studio 2015 and the latest ReSharper to perform the method extractions.

I start with the following sample code (deliberately trivial, to focus on code structure and keep it as simple as possible):

What I'm going to do is pretty simple: I'll create a ServerSocket class, a ClientSocket class and a DNS class (to contain GetHostByName) and move the methods of the original Socket class to these three new destinations. Note that this refactor is going to be slightly more complex than the one displayed in the first screenshot of this blog post.

Step two: extract methods to new classes

First I extract a few methods into a new SocketServer class (later I renamed it to ServerSocket):

ReSharper makes it quite simple: you just have to type the name of the new class, select the methods to move and decide whether you want the new class in the same file or in a new one. I select "its own file" since I want to show multi-file diff in its full glory :P

Then I extract the method GetHostByName to a new DNS class, as follows:

Finally I'm going to rename the old Socket class to ClientSocket, since I consider that it now contains "client"-oriented methods (again, it is sample code, so don't take it too seriously).

Once I'm done with the refactor I'm going to check in the code. Since I'm on Visual Studio 2015 I just go to View/Plastic SCM and select the Pending Changes view:

Note that:

  • The old Socket.cs file is renamed to ClientSocket.cs (and Plastic is correctly tracking the move, since it is one of its strengths).
  • There are two new files, DNS.cs and ServerSocket.cs. These are the two new classes I created with ReSharper.
  • The old Socket.cs, now renamed to ClientSocket.cs, shows up as both changed and moved, because its contents were edited and the file was renamed.
  • The .csproj is changed due to the two new files, and the main Program.cs code that invokes the Socket classes is also changed (I don't show it for the sake of simplicity).

Once I check in, I can display the Branch Explorer within Visual Studio 2015:

Analyze Refactors - the new option to launch multi-file semantic diff

What I'm going to do now is right-click on the changeset I just created to launch a changeset diff. Then I'm going to click on the new "Analyze Refactors" button and... here comes the magic:

Look at it carefully because this is the main thing we've been working on :-)

  • First you see there is a new element in the tree called "Refactor Group". It basically says it has detected a group of files that have been modified together, in this case ClientSocket.cs, DNS.cs and ServerSocket.cs.
  • Once you click on the Refactor Group itself you see the Visual Semantic Diff, but it is now able to explain what happened because it is showing the 3 involved files (in fact it also shows the old Socket.cs on the left, prior to being renamed to ClientSocket.cs).
  • Check "GetHostByName()": it was moved from Socket.cs into DNS.cs and Plastic detects it was not only moved but also modified. Why? Click on the "M" "C" icons on the diagram and you'll be able to diff the code as follows: basically I made the method static, so that's why it is changed. Awesome, isn't it? :-D
  • The "Listen()", "Accept()" and "Recv()" methods were moved from Socket to the new ServerSocket class. It is quite clear in the diagram and a pain to track if you don't have it. Note that the "Accept()" method is also marked as changed. Let's see why: the return type is no longer Socket but ServerSocket.
  • And finally the "ConnectTo()" and "Send()" methods and the member "mSocket" are detected as moved from Socket to the new ClientSocket class. Here Plastic is detecting the class as new because there are so many overall changes (remember the method bodies are empty, which is not a realistic scenario) that it can't say the old one is "the same" (more than 50% of it changed). Anyway, it knows the methods are the same, so you can diff "ConnectTo()" if needed.
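
To make the idea behind the diagram a bit more concrete, here is a minimal Python sketch of cross-file move detection. It is not how Plastic SCM is implemented (the real tool parses the code); it just matches declarations by name between two snapshots. The declaration texts, the `find_moves` and `looks_like_same_class` helpers, and the reading of the 50% rule as "share of members kept" are all assumptions made for illustration.

```python
# Toy model of the refactor: file -> {declaration name: declaration text}.
# The declaration texts are made-up stand-ins; only the names, the static
# GetHostByName and the new return type of Accept come from the walkthrough.
BEFORE = {
    "Socket.cs": {
        "Listen": "void Listen() { }",
        "Accept": "Socket Accept() { }",
        "Recv": "void Recv() { }",
        "ConnectTo": "void ConnectTo() { }",
        "Send": "void Send() { }",
        "GetHostByName": "string GetHostByName() { }",
        "mSocket": "object mSocket;",
    },
}

AFTER = {
    "ClientSocket.cs": {
        "ConnectTo": "void ConnectTo() { }",
        "Send": "void Send() { }",
        "mSocket": "object mSocket;",
    },
    "ServerSocket.cs": {
        "Listen": "void Listen() { }",
        "Accept": "ServerSocket Accept() { }",
        "Recv": "void Recv() { }",
    },
    "DNS.cs": {
        "GetHostByName": "static string GetHostByName() { }",
    },
}


def find_moves(before, after):
    """Return {name: (old_file, new_file, changed)} for declarations
    that live in a different file after the refactor."""
    old_home = {
        name: (path, text)
        for path, decls in before.items()
        for name, text in decls.items()
    }
    moves = {}
    for path, decls in after.items():
        for name, text in decls.items():
            if name not in old_home:
                continue  # brand new declaration, not a move
            old_path, old_text = old_home[name]
            if old_path != path:
                moves[name] = (old_path, path, old_text != text)
    return moves


def looks_like_same_class(old_members, new_members, threshold=0.5):
    """Assumed reading of the 50% rule: the class keeps its identity
    only if at least half of the old members are still there."""
    kept = len(set(old_members) & set(new_members))
    return kept / len(old_members) >= threshold


if __name__ == "__main__":
    for name, (src, dst, changed) in sorted(find_moves(BEFORE, AFTER).items()):
        print(f"{name}: {src} -> {dst}{' (changed)' if changed else ''}")
```

In this toy model GetHostByName and Accept come out as moved and changed, the rest as plain moves, and ClientSocket keeps only 3 of the 7 original members, so it would be treated as a new class.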

This basically shows the full power of the new feature. We've been using it internally for a few days already, and I found myself easily diffing code that had been moved (which happens very often). Previously you had to trust your memory to say "yep, it was just moved", and you probably missed some changes during code reviews.

By the way, the regular file diffs are also improved; check the following screenshot:

There is a new category called "Multi-file moved", and the "Moved" icon now gets a different color and improved actions: "Diff moved code..." will show a diff that takes the code from the two involved files and lets you diff the method, as you have seen before from the diagram. "Go to moved code..." will now jump to a different file.
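
The idea of diffing one method taken from two different files can be imitated with Python's standard `difflib`. This is only a sketch of the concept, not Plastic's diff engine; the method text and the `diff_moved` helper are hypothetical, and the only detail taken from the walkthrough is that GetHostByName became static when it moved to DNS.cs.

```python
import difflib

# Hypothetical text of the same method before and after the move.
OLD_METHOD = [
    "public string GetHostByName()",
    "{",
    "}",
]
NEW_METHOD = [
    "public static string GetHostByName()",
    "{",
    "}",
]


def diff_moved(old_lines, new_lines, old_path, new_path):
    """Unified diff of a method whose old and new versions
    live in two different files."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=old_path,
            tofile=new_path,
            lineterm="",
        )
    )


if __name__ == "__main__":
    for line in diff_moved(OLD_METHOD, NEW_METHOD, "Socket.cs", "DNS.cs"):
        print(line)
```

The interesting part is that the "from" and "to" sides point at different files, which is what a single-file diff can't give you.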

Watch it in action

We just recorded a short screencast to demo the new feature:

Motivation - refactoring as a key best practice

Needless to say, refactoring is a key practice to keep code readable and ready to be changed. And developers don't do it for fun, but in order to be able to react fast, move forward and adapt to customer and market changes.

But in the real world, teams sometimes avoid refactoring because someone else is working in parallel on the same code, and understanding the changes and merging the code would be a nightmare.

Most of our work on branching, merging and semantics is focused on helping developers work in parallel without having to hold back refactors because of parallel changes.

But we also think that version control handles tons of info that is not being used on a daily basis. We're not talking about reporting or metrics (and other management stuff) but about useful data that can make developers more effective. And that's precisely what this "analyze refactors" is all about: making it easier to diff code even when you moved it to a different file.

As you might expect, we'll be applying the same ideas to Semantic Method History and merge.

Get it now!

Download Plastic SCM (and later) to get your hands on the most advanced diff machine known to man :-D

If you are a .NET developer you'll also find this link interesting: Version control for .NET developers, focusing on all the new semantic stuff.



  1. Been waiting for this one for a long time. Excellent work. The multi-file merge functionality will be awesome when it lands, I'm sure. :-)

    1. Thanks Gerard! Thanks for your support.

      Yes, multi-file semantic merge will be the next big step :)

  2. C'mon, Pablo, I'm ANXIOUSLY waiting for multi-file semantic merge to do the switch!

    1. Gracias Ignacio!

      Most likely multi-file will be initially released just as part of Plastic, before making it available in SemanticMerge.

      The reason is that we need to get all the files involved in the merge, and the standard way to invoke merge tools doesn't do that, so we can't make it version control agnostic.

      We already studied how to modify the git mergetool command to support it, so we'll probably go for Git support too, but only after giving it a try in Plastic.