Moving from SVN to Plastic SCM - How we did the migration at Surgical Science

Thursday, September 24, 2015

Hi there! My name is Göran Wallgren and I was invited to write this guest blog post to share how we switched our version control system from SVN to Plastic SCM. I work at Surgical Science (Sweden), where we have been developing products for medical simulation training for more than 15 years.

Background

We started out using CVS for version control, and then migrated to SVN (Subversion) and TortoiseSVN almost 10 years ago. In summer 2015 we finally decided to migrate to Plastic SCM. Besides the advanced merging and handling of large binary files, one of our main reasons for choosing Plastic over Git or Mercurial was that it can work both centralized and distributed (DVCS), which made the switch from SVN easier. The changeset numbering scheme also closely resembles the one in SVN.

At the point of migration, we had close to 28,000 revisions from 14 years of code and data history in our main SVN repository (about 20 GB). We wanted to keep all revisions from trunk but decided to leave inactive branches out of the import and to split some parts into separate repositories in Plastic. In the end, this left us with a bit over 17,000 imported changesets in the main Plastic repo.

The migration has to go via Git, since Plastic does not import directly from SVN. Earlier, one had to use Git fast-export and Plastic fast-import, which had some issues (mainly because Git does not log directory removals in the fast-export file). That has changed with the Plastic GitSync feature, which lets Plastic SCM speak directly with Git over the HTTP/HTTPS and GIT protocols.

Besides the actual import of the data, we needed to find ways to replace some of the features we had been using in SVN, mainly svn:externals and the SubWCRev tool from TortoiseSVN. At the end of this post we'll give some useful tips regarding this.

Preparations

To make things a bit faster, we did the migration locally on the Windows Server machine where our SVN repos resided. Before starting, we installed Git for Windows and Plastic SCM Server. The Git distribution includes a Git Bash shell that makes it easy to use Unix-like commands and scripts for the migration, and the command lines listed below are meant to be entered in such a shell. When installing the Plastic server we chose to enable SSL on port 8088, which means the local Plastic URL is ssl://localhost:8088.

NOTE: The migration instructions given here target the Windows platform, but the shell-based (and Java-based) commands should work on Mac and Linux as well. However, the Plastic GitSync feature does not have a GUI yet on Mac and Linux, so on these platforms we need to use the cm sync git shell command instead.

For the following guide, let's assume we are migrating a repository from SVN located at D:/Dev/svn/repo1/. It has a non-standard layout, where the trunk and branches directories are located not directly at the repository root but below a directory called myroot. We'll import the trunk and a branch called mybranch.

Step 1: Start local SVN server

For fast local read-only access to Subversion, we start a lightweight SVN server daemon by issuing the following command in a separate shell window:

> svnserve -d -r D:/Dev/svn/ -R

The svnserve command creates a server process which will be busy until we stop it. Our repository is now accessible locally over the SVN protocol (default port 3690). For security, we may need to ensure this port is not exposed to the Internet.

Step 2: Initialize Git repository

We'll now ask Git to create a new Git repository at D:/Dev/git/repo1/ and prepare it for importing from our SVN repo.

> mkdir -p D:/Dev/git/repo1/
> cd D:/Dev/git/repo1/
> git svn init svn://localhost/repo1/

The git svn init command creates a .git sub-directory and initializes an empty Git repository in the current directory. Now, we should edit the local file .git/config to specify the branches we want to import:

...
[svn-remote "svn"]
url = svn://localhost/repo1
fetch = myroot/trunk:refs/heads/master
fetch = myroot/branches/mybranch:refs/heads/mybranch_imported
...

As you can see in this example, we map our trunk to the Git master branch and our mybranch into a Git branch that we call mybranch_imported. This way, we get a chance to re-map branch names during migration.

Step 3: Fetch version history into Git

Having edited the above config-file, we continue by fetching all the relevant revisions from SVN:

> git svn fetch
> git reset --hard

The git svn fetch command will fetch any previously unfetched revisions from the SVN repository we are tracking. The git reset command is needed to clean up the status of the Git working tree after the fetch is completed. (The git svn commands can be used for bidirectional sync with SVN, but here we are only using them to import history.)

IMPORTANT: The git svn fetch command can take a few days (!) to complete if we have a very large number of versions and a lot of data in our repository. If our users are continuing to commit into SVN, we can re-run the above fetch command to fetch any newer revisions. Before continuing from this step, we may want to change permissions on our SVN repository to make it read-only and then run the fetch command again one last time...

When the fetch is completed, we should check the imported branches and history in Git. If our SVN history contains some moved branches, Git could have created some extra branches that we may want to delete before importing into Plastic.
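As a rough sketch of such a check, the helper below filters a list of branch names down to those ending in an "@<revision>" suffix. That naming is our assumption about how the extra branches from moved SVN branches tend to look, and the name extra_branches and the revision number 812 are made up for illustration. The Git commands in the trailing comments are standard ones that could be used alongside it.

```shell
# Print only branch names that carry an "@<revision>" suffix.
# Reads one branch name per line on stdin; prints nothing if none match.
extra_branches() {
  grep -E '@[0-9]+$' || true
}

# Typical use inside the Git repository (not run here):
#   git branch --list                      # show all local branches
#   git log --oneline -n 5 master          # spot-check the imported history
#   git for-each-ref --format='%(refname:short)' refs/heads | extra_branches
#   git branch -D 'mybranch_imported@812'  # delete one unwanted branch
```

Any branch reported this way should be inspected before deleting it, since it may still hold history we want to keep.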

ADVANCED: If we want to re-map author usernames during migration, we can pass a mapping text file to the fetch command: git svn fetch -A authors.txt. Also, for each imported commit, Git appends to the log message a line starting with git-svn-id that references the corresponding branch and revision number in SVN. If we want to simplify these lines for readability, we can rewrite the messages with git filter-branch --msg-filter <command>. (While Plastic allows changing checkin comments in the GUI, it is not currently possible to do this from a script. However, it may be possible using advanced queries and the execquery command in Plastic.)
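To make the two options above more concrete, here is a minimal sketch. The author names and e-mail addresses are invented, the function name simplify_svn_id and the short "svn: r<rev>" form are our own choices for illustration, and the filter assumes the git-svn-id line has the shape "git-svn-id: <url>@<revision> <uuid>".

```shell
# authors.txt maps each SVN username to a Git identity, one per line
# (the entries below are made up):
#   goran = Goran Example <goran@example.com>
#   build = Build Server <build@example.com>

# Replace the "git-svn-id: <url>@<rev> <uuid>" line with a short "svn: r<rev>".
# Reads a commit message on stdin and writes the rewritten message to stdout.
simplify_svn_id() {
  sed -E 's|^git-svn-id: .*@([0-9]+) [0-9a-fA-F-]+$|svn: r\1|'
}

# Possible use on all branches (rewrites history, so try it on a copy first):
#   git filter-branch --msg-filter \
#     'sed -E "s|^git-svn-id: .*@([0-9]+) [0-9a-fA-F-]+\$|svn: r\1|"' -- --all
```

Rewriting the messages is only reasonable here because we use git svn for a one-way import. A repository that should keep syncing with SVN relies on the original git-svn-id lines.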

Step 4: Start local Git server

Here, we first tried using the git daemon command for access to the Git repository via the GIT protocol. This worked at first, but we soon ran into timeout problems with repositories containing lots of very large files. So we ended up using the HTTP protocol instead.

While there are several ways to serve Git over HTTP/HTTPS, we found a free Java-based tool called SCM-Manager that made it quick and easy to serve our Git repository locally over HTTP on a custom port. (For security, we may need to ensure this port is not exposed to the Internet.)

The instructions for how to set up the SCM-Manager tool are found in the wiki on their site. Basically, after installation we edit the included file conf/server-config.xml to set the custom HTTP port to use. In our example we set it like this:

<SystemProperty name="jetty.port" default="8082" />

Then we start the SCM-Manager web server via the included bin/scm-server.bat file. (On Mac/Linux use bin/scm-server.) We open a web browser on http://localhost:8082/ and log in using the default user/password (scmadmin/scmadmin).

The SCM-Manager web interface

We then use the web interface to import our existing Git repository for access over HTTP. We go to Config - Repository Types - Git Settings - Repository directory and enter the path to our Git repos. We Save this setting, then go to Main - Import Repositories - Git - Import from directory. When done, we can test the connection by using the following Git command:

> git ls-remote http://localhost:8082/scm/git/repo1/

Step 5: Import into Plastic via GitSync

Now, we should be all set for importing from Git into Plastic via the GitSync feature. (Note that here we only use GitSync to import, but it's actually bi-directional so we could continue syncing with Git if we wanted.) First, we start the main Plastic GUI and create a new repository on our Plastic server. Right-click the new repo and select View branches. In the Branches view, right-click on /main and select Replication – Sync with Git.

Choose 'Sync with Git' from the Replication sub-menu

Enter the Repository URL as http://localhost:8082/scm/git/repo1/ and enter the default username and password for the SCM-Manager tool.

Enter Repository URL, username and password

Click Sync and wait for Plastic to replicate our Git repository.

Example of GitSync progress dialog while syncing

Once the sync is finished, the master branch from Git (originally trunk from SVN) has been imported into the /main branch in Plastic. The other branches have been imported as top-level branches.

NOTE: On Mac and Linux there is no GUI for GitSync yet, so on these platforms we must use the following shell command instead:

> cm sync repo1@ssl://localhost:8088 git http://localhost:8082/scm/git/repo1/ --user=scmadmin --pwd=scmadmin

Step 6: Finding replacements for some useful SVN features

Once the migration was finished we needed to find suitable replacements for a few special features we had been using in Subversion. Here we list our most useful tips:

svn:ignore

Replacing svn:ignore is simple: we just put a file ignore.conf in the root of our workspace, specifying files/folders that should be considered Private items (not put under version control). This file can be checked in so our coworkers get the same settings. The syntax is described in the Plastic SCM documentation. Additionally, in the Items view of the Plastic GUI there are context menu commands to add/remove ignored file-patterns.
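As an illustration only, an ignore.conf for a typical Visual Studio project might look roughly like the fragment below. The entries are our own examples, not the file we actually use, and we assume here that a plain name matches a folder or file of that name and that a pattern such as *.pdb matches by extension. Please check the Plastic documentation for the exact rules.

```text
bin
obj
*.pdb
*.user
```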

svn:externals

Replacing svn:externals with Xlinks in Plastic is also simple as long as we put each of the external folders in a separate repository (or at least in a separate branch).

Back in the SVN days, we were mainly using externals in order to exclude folders full of very large data files from checkout/update. In Plastic this can be done by the feature called cloaking, which is similar to ignoring. We put a file cloaked.conf in the root of our workspace, specifying version-controlled files/folders that should be skipped during an Update operation. In the Items view of the Plastic GUI there are context menu commands to add/remove cloaked file-patterns.

However, we ran into problems when we tried to cloak Xlinks. These were solved when we found out that we must not cloak the actual Xlink folder; instead we must cloak the contents of the xlinked folder!

Text/binary file types

When importing from Git, some text file-types were incorrectly detected as binary, which means diffs won't be displayed correctly. This is easily fixed through the cm chgrevtype command. We can apply it (once per existing branch) to every file with a certain file-extension (here exemplified on .vcxproj files), by issuing the following command at a Windows CMD prompt or in a .bat file:

> dir /S /B *.vcxproj | cm chgrevtype -type=txt -

NOTE: To control the txt/bin detection for files added in the future we can set up a filetypes.conf file in the workspace root folder.
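A filetypes.conf along the following lines could cover the .vcxproj case from above. This is a sketch that assumes the file uses one "extension:type" entry per line with txt and bin as the type names. The .sln and .png entries are added only as examples.

```text
.vcxproj:txt
.sln:txt
.png:bin
```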

SubWCRev

In SVN we had been using the SubWCRev tool included in TortoiseSVN. This is a small command line utility that reads the status of an SVN working copy and can perform keyword replacement in a template file. We used it in our build process to stamp revision numbers into the output executable. To replace this in Plastic, we wrote a little C# console program PlasticWcRev that mimics part of the functionality from SubWCRev in a Plastic workspace. The source code is available here for anyone who finds it useful.
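The core idea of the keyword replacement can be sketched in a few lines of shell. This is not the PlasticWcRev program itself (which is written in C#), only an illustration: the function name stamp_revision and the file names are made up, $WCREV$ is the revision keyword known from SubWCRev templates, and the changeset number is simply passed in as an argument rather than queried from Plastic.

```shell
# Replace every $WCREV$ keyword in the template (stdin) with the given number.
# $1: changeset number (how it is obtained from Plastic is left to the caller).
stamp_revision() {
  sed 's/\$WCREV\$/'"$1"'/g'
}

# Example: stamp_revision 17042 < version.h.tmpl > version.h
```

In a build script, the generated file would then be compiled into the executable in the same way as the output of SubWCRev was before.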

Final thoughts

At Surgical Science, we had an intense learning period during and after the migration process, but in the end we're very happy about finally making the switch to Plastic SCM. We are now much more confident in branching and merging, and the branch-per-task pattern has already improved our workflow!
