Plastic stress testing

Tuesday, January 08, 2008

I've been focusing on performance for the last month or so. Basically I've been checking how Plastic works under heavy load.

It is very interesting to note how behavior differs between a single user accessing the server and several users running operations at the same time. I've always been concerned about speeding up Plastic operations (as you've probably seen from my previous posts), and simulating hundreds of clients against a single server has been a good experience.

OK, so what was the testing scenario and how did we set it up? Well, we have several ''independent'' testing agents performing the following operations:

  • Create a workspace
  • Download the latest on the main branch
  • Create a branch (*)
  • Switch to the branch
  • Modify up to 10 files (3 changes each), checking in intermediate changes
  • Check in everything
  • Go to (*)

The whole process is repeated 5 times. From the command line, the operations look like this:

    $ cm mkwk <name>-wk /tmp/<name>-wk
    $ cd /tmp/<name>-wk
    $ cm update .
    $ cm mkbr br:/main/task[ITERATION NUMBER]
    $ cm co file01.c
    # modify the file
    $ cm ci file01.c
    $ cm co file01.c
    # modify the file again
    $ cm ci file01.c
    $ cm co file01.c
    # modify the file (last time)
    # (no check in is done this time)
    # go to another file and repeat the process

    # check in all the checked out files
    $ cm fco --format={4} | cm ci -

Well, it's a really simple ''testbot'' which lets you measure how well the server scales up. We have another, more complex ''scenario'' with a more complete ''testbot'' which is able to mark a branch as finished (using an attribute) and jump to the ''recommended'' baseline. Another ''bot'' plays the integrator role: it waits until at least 10 branches have been finished, then integrates everything into main (of course it doesn't make clever decisions when a manual conflict arises) and moves the ''recommended'' baseline. This way we can simulate heavy load with very simple and independent test bots.

How do we launch the tests and gather the results? Well, using PNUnit, the NUnit extension we developed a long time ago. PNUnit is totally open source and will be integrated into the next NUnit release, so maybe it will become better known.

So, basically we're using one server which runs the Plastic SCM server and the PNUnit launcher. The launcher reads an XML configuration file and tells the agents which tests they should run. This way, using different XML files, we can define different testing scenarios.
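To give an idea of the launcher model (one XML file mapping tests to agent machines), here is a small Python sketch. The XML layout below is invented for illustration and is not PNUnit's actual configuration schema; only the idea of "one scenario file, many agents" comes from the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical scenario file -- NOT PNUnit's real schema, just an
# illustration: the launcher reads one config and dispatches tests
# to the agents by machine name.
SCENARIO = """
<TestGroup>
  <ParallelTest name="stress-40-bots">
    <TestConf name="testbot-01" machine="agent01" test="TestBot.Run" />
    <TestConf name="testbot-02" machine="agent02" test="TestBot.Run" />
    <TestConf name="integrator" machine="agent03" test="Integrator.Run" />
  </ParallelTest>
</TestGroup>
"""


def read_scenario(xml_text):
    # Build {machine: [test, ...]} so the launcher knows what each
    # agent should run for this scenario.
    root = ET.fromstring(xml_text)
    plan = {}
    for conf in root.iter("TestConf"):
        plan.setdefault(conf.get("machine"), []).append(conf.get("test"))
    return plan


plan = read_scenario(SCENARIO)
print(plan["agent01"])  # ['TestBot.Run']
print(plan["agent03"])  # ['Integrator.Run']
```

Swapping scenario files then changes nothing on the agents themselves, which is what makes it cheap to go from 1 to 80 testbots.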

As I mentioned above, it is very interesting to study the server's performance under heavy load. Code that works fast with only a few users tends to be horribly slow when hundreds of clients make requests at the same time. I won't say anything really new here, but the main things we've changed are:

  • Replacing lock statements with ReaderWriterLocks. This can have a real impact under heavy load. The problem here is that some profilers tend to identify lock statements as the root of the problem, and this is not always the case. We're currently using AQTime, an excellent product, but I've run into the following problem repeatedly: it tells you a method is eating 12% (for instance) of the time due to a lock. You get rid of the lock and the method is no longer eating 12% but just 0.08%. Sometimes the overall time running inside the profiler improves, but normally you don't get a benefit when the test is run against the standalone server. OK, fortunately this is not always the case, and you can find lots of performance bottlenecks using the profiler. We're also starting to use the one included in Mono.

  • Reducing remoting operations. We make extensive use of .NET/Mono remoting, and reducing the roundtrips normally leads to a worse design but better performance, especially when lots and lots of calls are being made.

  • Trying to reduce the number of database operations. Things get really interesting here: some optimizations which make no difference with 1 client can save lots of seconds with a big number of client testbots.
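To illustrate the first point: a readers-writer lock lets many readers proceed concurrently while a writer gets exclusive access, where a plain lock statement serializes everyone. Below is a minimal Python sketch of the idea, analogous in spirit to .NET's ReaderWriterLock; it is not Plastic's code, and it ignores writer starvation, which a production implementation would have to handle.

```python
import threading


class ReadWriteLock:
    # Minimal readers-writer lock: many concurrent readers OR one writer.
    # No writer priority -- a steady stream of readers could starve writers.
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


# Read-mostly paths stop serializing: 8 readers share the lock,
# while the single writer still gets exclusive access.
lock = ReadWriteLock()
log = []


def reader():
    lock.acquire_read()
    log.append("r")
    lock.release_read()


def writer():
    lock.acquire_write()
    log.append("w")
    lock.release_write()


threads = [threading.Thread(target=reader) for _ in range(8)]
threads.append(threading.Thread(target=writer))
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(log))  # 9: all 8 reads plus the write completed
```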
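The roundtrip trade-off from the second point can be shown with a toy example: a fake "remote server" that counts calls. Everything here is invented for illustration; the batched API is uglier, but it pays the fixed per-call cost once instead of once per item, which is exactly what matters when hundreds of clients are hammering the server.

```python
class FakeRemoteServer:
    # Stand-in for a remoting endpoint: every method call counts as
    # one network roundtrip.
    def __init__(self):
        self.roundtrips = 0

    def get_revision(self, item_id):
        # Chatty interface: one roundtrip per item. Cleaner design.
        self.roundtrips += 1
        return f"rev-of-{item_id}"

    def get_revisions(self, item_ids):
        # Batched interface: one roundtrip for the whole list. Uglier,
        # but the per-call latency is paid only once.
        self.roundtrips += 1
        return [f"rev-of-{i}" for i in item_ids]


items = list(range(100))

chatty = FakeRemoteServer()
revs_a = [chatty.get_revision(i) for i in items]  # 100 roundtrips

batched = FakeRemoteServer()
revs_b = batched.get_revisions(items)             # 1 roundtrip

print(chatty.roundtrips, batched.roundtrips)  # 100 1
assert revs_a == revs_b  # same answers, two orders of magnitude fewer calls
```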
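And for the third point, a sketch of why database batching only shows up under load, using Python's sqlite3 as a stand-in (not Plastic's actual backend): both versions insert the same 30 rows and look identical to a single client, but one pays a commit per row while the other pays one commit per batch.

```python
import sqlite3


def insert_one_by_one(conn, rows):
    # One INSERT plus one commit per row: invisible with 1 client,
    # multiplies roundtrips/fsyncs with many concurrent testbots.
    for path, rev in rows:
        conn.execute("INSERT INTO revisions VALUES (?, ?)", (path, rev))
        conn.commit()


def insert_batched(conn, rows):
    # Same rows, single transaction, one commit for the whole batch.
    conn.executemany("INSERT INTO revisions VALUES (?, ?)", rows)
    conn.commit()


rows = [(f"file{i:02d}.c", i) for i in range(30)]

for inserter in (insert_one_by_one, insert_batched):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE revisions (path TEXT, rev INTEGER)")
    inserter(conn, rows)
    count = conn.execute("SELECT COUNT(*) FROM revisions").fetchone()[0]
    print(count)  # 30 both times; only the number of commits differs
```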

And what about the numbers? Well, we've tried with up to 40 machines (clients) and 1 server so far. All the clients were running Linux and had exactly the same hardware configuration. We have tried up to 200 simultaneous testbots, but the regular test suite ran 1, 20, 35, 40 and 80 testbots against a single server. It is important to note that a testbot is not equivalent to a single user but to a number of them: a regular user doesn't create a branch, modify 30 files and check all the changes back in in less than 6 seconds, which is basically what a testbot does. So in reality we're simulating hundreds (even thousands) of simultaneous users.

How good are the results? And compared to what? Well, my intention is to publish the entire test suite in a few weeks, so Plastic users can set up the testing environment to check how it performs in their own environments before they make a buying decision, and then you'll be able to really check our numbers. What I can say right now is that we've created exactly the same tests for Subversion and some other SCM products (which I won't disclose yet), and right now (using BL079 as the code base; preview 1 is BL081) we're faster than any of them in the specific scenario I described above. How much faster? Well, from 3 to 6 times depending on the load if you compare us with SVN, for instance. But we still have to run other scenarios to gain a more complete view and provide better results.