Cross-firewall, P2P inspired DVCS

Thursday, September 17, 2015

You're coding on your laptop (anywhere but the office), you check in to your local repo, and then you want to share your changes with your colleague bob for a quick review.

One option is to go and push to the central server, then ask bob to pull from there. Not bad, the usual cloud based server placed somewhere.

Another option is to find out bob's public IP address, ask him to tweak the router so his Plastic server becomes reachable and... well, we won't go that route…

Wouldn't it be just awesome if you could simply do something as follows?

Locate bob's repo just using his email, no matter where he is, as long as he is connected to the internet.

This way it won't matter whether you are both behind firewalls or whether you switch locations: you'll always be reachable.

This way of reaching Plastic SCM servers is what we called The Tube, and we opened it up for beta testing back in May 2015.

What is The Tube

In order to reach bob@robotmaker.com we need to locate him first. So some sort of "directory service" is required for this to happen. This "directory service" or "rendezvous point" is what we call the tube server.

Each of the servers establishes an SSL connection to the tube server in order to be "reachable" by others.

We call this connection the control connection. Its only purpose is to sign in the server to the cloud and let the system know that it is reachable. It simply opens up the SSL control connection, negotiates the sign in, and keeps listening for potential requests coming from the Tube Server.

At the end of the day the servers that are behind the firewalls can't be reached from the outside (without tweaking the router, which is uncomfortable at best, and impossible if you're on a public Wi-Fi), so the option is to make them open a socket connection to the central rendezvous point, and keep listening for commands.
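As a rough illustration of the control connection idea, here is a minimal Python sketch of the client side: sign in over an already-open outbound socket, then keep listening for requests. The line-based messages (`SIGNIN`, `OK`, `CONNECT`) and the function name are invented for this example, since the real Tube wire protocol is not described here, and the SSL layer is left out to keep it short.

```python
import socket
import threading


def run_control_connection(sock, email, on_request):
    """Sign in over an already-connected socket, then wait for requests.

    The SIGNIN / OK / CONNECT messages are hypothetical; they only
    illustrate "open outbound, sign in, keep listening".
    """
    reader = sock.makefile("r", encoding="utf-8", newline="\n")
    writer = sock.makefile("w", encoding="utf-8", newline="\n")
    try:
        writer.write(f"SIGNIN {email}\n")
        writer.flush()
        if reader.readline().strip() != "OK":
            raise ConnectionError("sign in rejected")
        # The outbound connection stays open, so requests can reach a
        # server that sits behind a firewall.
        for line in reader:
            command, _, argument = line.strip().partition(" ")
            if command == "CONNECT":
                on_request(argument)  # e.g. open a data connection back
    finally:
        reader.close()
        writer.close()
        sock.close()
```

In a real deployment the socket would be an outbound TLS connection to the rendezvous point (for instance one wrapped with Python's `ssl` module) rather than a plain socket.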

How the Tube data connections are established

When I want to reach the server bob@robotmaker.com I prefix it with tube: so Plastic knows it has to use the Tube protocol.
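To make the prefix idea concrete, here is a small Python sketch of how a client could tell a Tube address apart from a regular one. Only the `tube:` prefix comes from the text; `ServerSpec` and `parse_server_spec` are hypothetical names, and this is not how Plastic parses server specs internally.

```python
from dataclasses import dataclass

TUBE_PREFIX = "tube:"  # the prefix mentioned in the text


@dataclass(frozen=True)
class ServerSpec:
    via_tube: bool  # True when the Tube protocol should be used
    address: str    # an email for Tube specs, the raw spec otherwise


def parse_server_spec(spec: str) -> ServerSpec:
    """Split an optional 'tube:' prefix from a server spec string."""
    if spec.startswith(TUBE_PREFIX):
        email = spec[len(TUBE_PREFIX):]
        if "@" not in email:
            raise ValueError(f"expected an email after {TUBE_PREFIX!r}: {spec!r}")
        return ServerSpec(via_tube=True, address=email)
    return ServerSpec(via_tube=False, address=spec)
```

With this sketch, `parse_server_spec("tube:bob@robotmaker.com")` yields a spec flagged for the Tube, while anything without the prefix is passed through untouched.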

The process is depicted in the following image and explained below:

The first thing my server does (step 1) is to start a "data connection" to tube.plasticscm.com and ask whether a connection to bob@robotmaker.com is possible. The Tube Server first checks whether the connection between me and bob is allowed or not (more on this later). Then it checks whether bob is connected, and if he isn't, it will just abort my connection saying bob is not online.

If bob is connected, the Tube Server uses the "control connection" (step 2) to send bob a request to establish a data connection.

Bob's server will then open a new connection to the Tube Server (step 3), ready to start sending and receiving data from another peer.

Once bob's server has "called back", the Tube Server can create "the tube connection" between the two peers. This "tube" is actually what names the whole thing, since we always saw the entire thing as a data tube between the two peers across the internet, crossing firewalls and so on.
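The decision logic in steps 1 and 2 can be sketched in a few lines of Python. This is an in-memory model under stated assumptions, not the Tube Server's actual code: `TubeDirectory`, its method names, and the returned status strings are all hypothetical, and the "control connection" is stood in for by a plain callback.

```python
class TubeDirectory:
    """Toy rendezvous point: who is online and who may reach whom."""

    def __init__(self):
        self._online = {}      # email -> callback standing in for the control connection
        self._allowed = set()  # (caller, callee) pairs allowed to form a tube

    def sign_in(self, email, send_request):
        self._online[email] = send_request

    def allow(self, caller, callee):
        self._allowed.add((caller, callee))

    def request_tube(self, caller, callee):
        # Step 1: is this connection allowed at all?
        if (caller, callee) not in self._allowed:
            return "denied"
        # Then: is the callee currently signed in?
        send_request = self._online.get(callee)
        if send_request is None:
            return "offline"
        # Step 2: ask the callee, over its control connection, to call back.
        send_request(caller)
        return "requested"
```

Step 3 (the callee opening its own data connection) would happen outside this class, in response to the request it receives.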

All the Tube Server does once the "tube connection" is established is quite simple: it will send to me everything coming from bob, and it will send to bob everything coming from me. The Tube Server doesn't know the underlying Plastic protocol being used and doesn't do anything with the data, it just sits in the middle exchanging bytes between the two peers.
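A protocol-agnostic relay like this is easy to sketch. The Python below is a minimal illustration, not the Tube Server's implementation: `pump` and `relay` are hypothetical names, and it uses one thread per direction for simplicity.

```python
import socket
import threading


def pump(src, dst, bufsize=65536):
    """Copy bytes from src to dst until src reaches end of stream."""
    while True:
        chunk = src.recv(bufsize)
        if not chunk:
            break
        dst.sendall(chunk)  # forwarded untouched: the relay never parses the data
    try:
        dst.shutdown(socket.SHUT_WR)  # propagate the end of stream to the other peer
    except OSError:
        pass  # the other side is already closed


def relay(peer_a, peer_b):
    """Start forwarding in both directions; returns the worker threads."""
    threads = [
        threading.Thread(target=pump, args=(peer_a, peer_b), daemon=True),
        threading.Thread(target=pump, args=(peer_b, peer_a), daemon=True),
    ]
    for thread in threads:
        thread.start()
    return threads
```

Because the relay only copies opaque bytes, whatever the two peers negotiate on top of it (including encryption) passes through unchanged.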

How The Tube preserves privacy

The interesting piece is that once the "tube connection" is established, my server will start a regular Plastic SCM SSL connection to the remote server. Instead of opening up a new socket to a given IP and port, it will use the socket that was already negotiated with the Tube Server. But the regular communication will be tunneled through the "tube connection", which means it is secured as a regular SSL connection would be.

You'll be using the SSL certificate provided by bob in order to secure your communication. Yes, it will be tunneled through a central Tube Server, but even if someone compromised this server, he couldn't decrypt your data, or at least he wouldn't be more successful than if he did it at the network level by compromising your Wi-Fi.

This is the reason why you'll get an error message if you try to connect your server to The Tube but your server is not configured to accept SSL connections. For security reasons, there's no other option to receive Tube traffic than through an SSL connection. (I mean, it would be perfectly possible to do it through TCP, but we just removed the option to avoid potential security issues.)

How do you allow or deny access to your server through The Tube?

There are 3 layers to consider.

The first thing you need to do is to connect your server to The Tube. If your server is not connected, nobody will be able to access it.

Second, you need to configure who can connect to your server through The Tube. Currently you can do that from the CLI using the cm tube create command:

> cm tube create bob@robotmaker.com
The tube bob@robotmaker.com -> pablo@robotmaker.com was correctly created.

Third, you need to grant the tube user access to your repo. You can use the cm tube share command for that. You can grant push access, pull access or both.

Once you grant permissions, the user will be able to access your repo until you "unshare" your repo or you remove the tube connection.

While the "permission to establish a tube connection" is handled by the Tube Server, the actual permissions to do operations on your repo are controlled by your own server. Take a look at the permissions of my codice repo:

As you can see, there is a group of special users prefixed with tube:, which are tube users.

The tube users are different from regular Plastic SCM users because they're not retrieved and authenticated from LDAP, User/Password or any other of the auth modes we support. Tube users are plasticscm.com users, so they're authenticated by the Tube Server and once they reach your server through a Tube connection, your server doesn't need to authenticate them, just grant or deny access.

As you can see, you can also grant checkin permission (or any other permission) to the Tube users. It means that while I was always talking about using the Tube for push and pull operations, you can actually perform any other operation you want.

We just introduced the concept of share/unshare to simplify the whole setup, but at the end of the day the permissions are just common Plastic SCM permissions mapped to ACLs.
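To illustrate how share/unshare can be thin wrappers over ordinary ACL entries, here is a toy Python model. Everything in it is an assumption made for the example: `RepoAcl`, `share`, `unshare`, and the permission names are hypothetical and do not reflect Plastic SCM's real permission model or the behavior of `cm tube share`; only the `tube:` user prefix comes from the text.

```python
class RepoAcl:
    """Toy ACL: maps a user name to the set of permissions granted."""

    def __init__(self):
        self._granted = {}

    def grant(self, user, *permissions):
        self._granted.setdefault(user, set()).update(permissions)

    def revoke_all(self, user):
        self._granted.pop(user, None)

    def is_allowed(self, user, permission):
        return permission in self._granted.get(user, set())


def tube_user(email):
    # Tube users show up in the ACL with a "tube:" prefix.
    return f"tube:{email}"


def share(acl, email, push=True, pull=True):
    """Sharing is just granting regular permissions to the tube user."""
    wanted = [name for name, on in (("push", push), ("pull", pull)) if on]
    acl.grant(tube_user(email), *wanted)


def unshare(acl, email):
    """In this sketch, unsharing removes every grant for the tube user."""
    acl.revoke_all(tube_user(email))
```

Since the entries are plain ACL entries, nothing stops you from granting a tube user other permissions directly, for example `acl.grant(tube_user("bob@robotmaker.com"), "checkin")` in this model.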

Instead of using the command line, you can use the Windows GUI to share and unshare repos. Whether you share a repo from the GUI or the CLI, the connection for the remote user is granted automatically.

If your license is activated to use The Tube you'll have an icon on the Windows GUI as follows:

Some potential scenarios

There are many potential scenarios, just to mention a few:

  • Direct Peer to Peer communication between team members without the need of a central server.
  • Granting access to a central server that is behind a firewall. We consider this case very useful because you can just set up a server at the office and forget about tweaking your firewall, setting rules and so on. You can just make the server accessible from anywhere through The Tube. This also avoids having to purchase a DNS or a static IP.
  • Granting access to a cloud server you set up on Amazon, Azure, Rackspace... (or your favorite provider). This is just a variation of the previous case applied to a server accessible on the internet, but saving the setup and cost of a DNS + static IP (not very expensive these days but anyway :P).

By the way, The Tube has one downside: every single roundtrip to the server is routed through the remote Tube Server, so the performance will always be worse than a direct connection. How much worse depends on your scenario, so it is worth running some tests.

We've been using it for months to access one of our main servers and for daily push/pull operations you don't really care whether you go through The Tube or not. Of course, if you have to transfer several GB of data then you'll be hit by the extra roundtrip.

How can you activate The Tube for your team?

If you're a customer of Team Edition or Enterprise Edition, just contact support and we'll activate The Tube for you. You'll have to install a new license.

If you're a Personal Edition or Community Edition user or you're just evaluating, reach out to support as well to ask for access to The Tube. At the time of writing this it is still in a pre-commercial phase so we're not charging for it, but since it runs on cloud infrastructure the goal is to cover the transfer costs with a small fee in the future.

Remember that all emails reachable through The Tube must belong to registered plasticscm.com users. So every team member must be able to log in to plasticscm.com with his email.

More info

You can find extended info about The Tube on the official page, and we also recorded a video with a detailed step by step tutorial on YouTube.

Conclusion

We started thinking about some sort of P2P cross-firewall communication long ago, I think it was around 2011 or even earlier. We ran some initial tests, we played with NAT libraries to do firewall hole punching and so on. But at the end of the day most firewalls come totally closed by default, so you have to enable the ability to allow software to set rules and open ports... which is something we wanted to avoid.

So we ended up implementing the central rendezvous point option, which is always there in P2P designs as a fallback solution if firewalls can't be directly crossed.

We think it is a really cool feature and while similar things can be achieved setting up ad-hoc VPNs, it is very useful to have it out-of-the-box.

Coincidentally, I read a post about a torrent-based git P2P protocol after we launched the closed beta back in May. That implementation is more about everyone having a copy of the same repo and then finding several sources to pull changesets from, whereas The Tube is designed with teams working on commercial software in mind.

Since we work with a lot of game studios we're dabbling with the idea of taking advantage of torrent-based solutions to use peer computers inside the same LAN to speed up the download of huge files... even when you use a remote cloud server... but that's a totally different story :-)

On the other hand, we're doing some tests with Wi-Fi Direct and the new Windows 10 APIs. The goal is to enable casual pushing and pulling between laptops of team members when they're at an event, on the train, or when the hotel Wi-Fi simply is not working as expected. Again, that's a story for a future blog post :-)



Enjoy!
