
GuiTestSharp: Multiplatform GUI Testing with dotnet

Wednesday, January 30, 2019, by Sergio L.

We are happy to announce the release of GuiTestSharp, an open source GUI testing framework used to automate the testing of desktop apps written in C# on Windows, Linux, and macOS.

As far as we know, it might be the first cross-platform GUI testing framework for .NET/Mono/Xamarin.

You can see it in action here:

And you can grab the source code right now from its GitHub repository.

We've been using this framework internally for years with success to test our Windows Forms, GtkSharp, Xamarin.Mac and WPF applications.

The framework itself is quite thin; you simply have to implement a number of "test interfaces" to make your app testable.

This blogpost and the GitHub repository are the starting points to show you how to best use it and start GUI testing your C# applications.

We'd really like to find other teams who find it as useful as we do!

Grab the framework source code and examples

You can find the repository with the framework and samples here: github.com/PlasticSCM/GuiTestSharp

Multiplatform GUI


We added a simple, multiplatform application with some GUI tests, and documented it with instructions on how to build and run everything. I recommend you give it a try before continuing – at least clone the repository and browse the code for a few minutes. This way, you get a first-hand peek at what I'm about to tell you.

How GuiTestSharp works

Your GUI application loads a test assembly during startup. This test assembly is where your tests are. Your application invokes the tests and they run on a separate "test thread".

How GuiTestSharp works


The test code is typical NUnit code; we'll look at it in more detail later.

The test code looks like this:
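
(A simplified sketch; how the testeable dialog adapter is obtained is covered later in the post, so GetTesteableInputDialog() here is just a placeholder.)

[Test]
public void OkButtonClearsTheInput()
{
    // GetTesteableInputDialog() is a placeholder; how the adapter is
    // obtained is explained later in the post.
    ITesteableInputDialog dialog = GetTesteableInputDialog();

    dialog.ChangeEntryText("Sergio");
    dialog.ClickOkButton();

    Assert.IsTrue(string.IsNullOrWhiteSpace(dialog.GetEntryText()));
}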

As you can see, it can do things like ClickOkButton() which, not surprisingly, will click a button.

Now, how does it work?

We don't rely on UIAutomation on Windows, nor on any equivalent on GNU/Linux and macOS. We don't inject messages or anything like that. We tried these approaches over the years with commercial tools and open source projects built on top of UIAutomation (remember White?), and they always ended up with the same problem: you repeatedly run into "control not found" situations. You can open a GUI object inspector and confirm that the spy tool simply doesn't detect the control.

So, we decided to "cheat" and do things differently. When you invoke the ClickOkButton() method, there's some code doing an okButton.Click() underneath. The actual code varies depending on whether the GUI is Windows Forms, WPF, GtkSharp or Xamarin.Mac, but the idea is the same: you perform actions on the GUI layer directly.

It can be thought of as "cheating", but it works like a charm and doesn't fail.

Of course, you shouldn't fool yourself. I mean, you might be tempted to implement ClickOkButton() as a call to the actual code that okButton.Click() calls, but that wouldn't be what you want to achieve with GUI testing at all.

So, very simply put:

  • You inject some testing code into your application.
  • You code tests in the NUnit style.
  • Your application has to implement some "testing interfaces" to allow the test code to perform actions.
  • GuiTestSharp provides the infrastructure (PNUnit; more on this below) to start your application in test mode, collect results, provide synchronization if needed, etc.
  • It is a good idea to load the test code only when the application runs tests, and not to include it in production builds. We actually provide examples of how to achieve that.

PNUnit: the secret sauce

If you are familiar with .NET, you probably already know the popular NUnit testing framework. But most likely you have never heard of PNUnit, which stands for Parallel NUnit. We contributed it to the NUnit project many years ago, and we have used it for more than a decade as the foundation of most of our testing efforts.

Although it was contributed to the community back then (and it is publicly available in the NUnit 2.x GitHub repository and on NuGet), we have since made some changes that were long overdue to be made public. Don't worry, the code is indeed available for you to use; we just need to polish it, write some documentation, and present it as a standalone project.

Don't let the name confuse you. "Parallel NUnit? I can already run my tests in parallel!" Well, it is not parallelism just for the sake of it. PNUnit implements a mechanism to orchestrate test execution.

In our case, we develop a client and a server. When we need to test something end-to-end on the client, we need to start a server. Sometimes we need to start two servers for more complex scenarios. And the server and the client might be on different machines and different operating systems. The test code driving the client shouldn't start until the servers are ready. That's why we added some cross-machine synchronization mechanisms - basically, barriers.

Also, we want to stop the tests if the server is unable to start, and, in that case, we need to know exactly why.

The video looks aged for sure, but if something works, don't change it!


That's where PNUnit really shines; it allows the tests to communicate and to wait for each other using synchronization barriers. The tests run in parallel and coordinate with each other to achieve testing scenarios that are just impossible with traditional NUnit without further work on the developer's side. And, of course, once every parallel test finishes, the framework gathers the results, just as you'd expect it to. The cherry on top is that it works fine regardless of the operating system, thanks to the portability of the .NET Framework.

In very general terms, PNUnit is structured as follows:

How PNUnit is structured


PNUnit works as follows: you start by executing the Agent, specifying an agent configuration file that looks like this:
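
(The element names are an assumption based on PNUnit's configuration format; check the repository for the exact schema.)

<AgentConfig>
    <Port>8080</Port>
    <PathToAssemblies>..\testassemblies</PathToAssemblies>
</AgentConfig>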

The Agent knows where the test assemblies (the DLLs that contain the test fixtures) are (the PathToAssemblies key) and publishes an instance of IPNUnitAgent through .NET Remoting at the specified port.

Then, you execute the Launcher, specifying a test suite file and, optionally, which of those tests should run. A test suite file looks like this (mind the omitted part, we'll get to it later):
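
(Same caveat about element names as before; the interesting part is left out for now.)

<TestGroup>
    <ParallelTests>
        <!-- the ParallelTest definitions go here; see below -->
    </ParallelTests>
</TestGroup>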

The Launcher then publishes an instance of IPNUnitServices using .NET Remoting (we know it's old, but since it is a minor part of the grand scheme, it is easily replaceable in the future), and connects to the Agent through the IPNUnitAgent instance published at $host:$port. Please note that thanks to this, the Launcher and the Agent don't need to be on the same host, so we can orchestrate parallel tests to run across different machines.

Through IPNUnitAgent, the Launcher tells the Agent which tests must run. With that information, the Agent prepares some data and launches the pnunittestrunner, which is the process that will load the test assemblies, and that will host test execution.

The Agent uses the IPNUnitServices to tell the Launcher if there was any problem executing the test runner; for example, if it crashed immediately after launch. The test runner uses the IPNUnitServices for test synchronization and to send back the results of the test execution, along with any messages the programmer left behind to debug the tests, report back progress, or whatever is needed at the time.

How to write and execute a test using PNUnit

Before we continue, I think we should dig a little deeper into the PNUnit framework, as it might be easier to grasp the concept using an example.

Is writing PNUnit tests really that different from using NUnit, or any other testing framework for that matter? The answer is no. You only change the tools and add a new assembly dependency. And the synchronization mechanism is really easy to use too. Here is a simple example of two tests synchronized through the SERVERSTART barrier:
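
(A sketch: the using directives and the PNUnitServices.Get().EnterBarrier(...) call are assumptions about PNUnit's API; check the repository for the exact signatures.)

using NUnit.Framework;
using PNUnit.Framework;

namespace Examples
{
    [TestFixture]
    public class SyncedTests
    {
        [Test]
        public void ServerTest()
        {
            StartTestServer(); // hypothetical helper that boots the server

            // Tell the client test that the server is ready.
            PNUnitServices.Get().EnterBarrier("SERVERSTART");

            // Server-side checks and teardown would go here.
        }

        [Test]
        public void ClientTest()
        {
            // Wait until the server test reports that the server is up.
            PNUnitServices.Get().EnterBarrier("SERVERSTART");

            // Client-side actions against the server would go here.
        }

        static void StartTestServer() { /* hypothetical */ }
    }
}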

Our test suite definition file looks as follows inside the ParallelTests (plural) section (the one we previously omitted):
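
(Using the names that appear in the rest of this section and an agent listening on localhost:8080; the element names are, again, an assumption.)

<ParallelTests>
    <ParallelTest>
        <Name>ExampleTest</Name>
        <Tests>
            <TestConf>
                <Name>Server</Name>
                <Assembly>tests.dll</Assembly>
                <TestToRun>Examples.SyncedTests.ServerTest</TestToRun>
                <Machine>localhost:8080</Machine>
            </TestConf>
            <TestConf>
                <Name>Client</Name>
                <Assembly>tests.dll</Assembly>
                <TestToRun>Examples.SyncedTests.ClientTest</TestToRun>
                <Machine>localhost:8080</Machine>
            </TestConf>
        </Tests>
    </ParallelTest>
</ParallelTests>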

Notice that PNUnit does not discover the tests in an assembly. That is because several unit tests might be part of a bigger, synchronized test, so it doesn't make sense to discover and run tests automatically. You have to specify them using their assembly name (tests.dll) and their fully qualified name (Namespace.Class.Method). The Machine section refers to where the agent that can launch these tests is running.

Once the test assembly is compiled and placed inside the PathToAssemblies directory, we can start the agent...:

C:\wkspace\pnunit> agent.exe agent.conf
2018-12-27 13:27:30,589 INFO ConfigureRemoting - Registering channel
2018-12-27 13:27:30,610 INFO ConfigureRemoting - Registering channel on port 8080

...and launch the parallel tests:

C:\wkspace\pnunit> launcher.exe testsuite.conf --test=ExampleTest

The Launcher will tell the Agent to run the test named ExampleTest, which is composed of two parallel tests. It will also tell the Agent that the first one is found inside tests.dll, at Examples.SyncedTests.ServerTest, and that the second one is found inside the same assembly, at Examples.SyncedTests.ClientTest.

With this information, the Agent will start two instances of pnunittestrunner.exe – one instance per test – with the necessary parameters for them to know which test to load and run, and how to contact the Launcher.

If any of the pnunittestrunner.exe instances fail to spawn (for whatever reason; for example, if the test assembly is not there), the Agent will inform the Launcher about it and the tests will be considered failed. If not, pnunittestrunner.exe will contact the Launcher, informing it about the progress, and synchronizing the running tests.

Once both tests finish, the pnunittestrunner.exe instances will report back the results to the Launcher, the Launcher will exit reporting these results, and the agent will remain idle waiting for new connections.

Every GUI app is a test runner

The key idea in GuiTestSharp is that every tested GUI application becomes a test runner, launched by the PNUnit Agent. The Agent starts the application and tells it which test to run.

Let's now go into the details.

As in any other testing framework, the test runner is the tool that loads the test assembly dynamically, discovers the tests inside it, runs them, and reports back the results through the console or log files. In our case, PNUnit reports results through the IPNUnitServices remoting interface.

So, the GUI application becomes a test runner itself. It must be able to load test assemblies and run the tests within its context, because the tests need to be omniscient regarding the GUI. The tests will take control of the application and access windows and dialogs in a way that is otherwise impossible. That's why all our applications with GUIs are secretly test runners – if you launch them with the correct parameters.

Test execution workflow


A full picture of the application architecture

Now that you know the PNUnit framework and its components, it is time to take a step back once again to see the full picture. Let me show you the most important components of the architecture of our applications. A lot of details are still omitted for simplicity, but this is pretty much what you need to know without digging into the code:

Full application architecture


Let's focus on what's changed from the previous architecture diagram. Now, the application has a GuiTestRunner class, which implements the test runner behavior of the application. It depends on the IPNUnitServices interface, located in the pnunitframework.dll assembly, to communicate back and forth with the Launcher.

The test code in guitests.dll also relies on IPNUnitServices, but this time for test synchronization through barriers.

Let's dissect the new components.

ITesteableView interfaces – giving tests access to the GUI

The test code accesses the application's GUI through the ITesteableView interfaces (ITesteableApplicationWindow, ITesteableErrorDialog, and so on; you get the picture).

These interfaces are implemented by classes inside the GUI application. Their methods' names express basic actions that the user can perform, and basic ways to retrieve information from the window. In short, they simulate the hands and the eyes of a user. Take this one as an example:
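
(A sketch with illustrative member names, matching the test snippet shown earlier.)

public interface ITesteableInputDialog
{
    // The "hands": actions a user could perform on the dialog.
    void ChangeEntryText(string text);
    void ClickOkButton();
    void ClickCancelButton();

    // The "eyes": information a user could read from the dialog.
    string GetEntryText();
}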

ITesteable is not an interface but a family of interfaces that you must implement to give tests access to the GUI.

This is the Adapter Pattern at its finest. The test must be multiplatform, so it cannot rely on things such as Windows Forms controls, GTK Widgets, nor Cocoa NSViews. So, the ITesteableView interfaces define the contract that every adapter must comply with, and each application implements a version of this adapter.

These interfaces reside in the guitestinterfaces.dll assembly, so the application is decoupled from the tests, and vice versa.

Be as blind as your user is when you write testing code

Your TesteableView classes will need to perform actions in the same fashion that a user would.

Although your test code can have unrestricted access to the application, you must act like it doesn't.

If you want the test to be meaningful, you must avoid the urge of hacking your application to bypass – for whatever reason – the UI framework you are using.

This means that things such as looking for the Click event handler of a button and directly invoking it are strictly forbidden. This seems obvious, but sometimes it is not.

For example, it is also forbidden to directly access the model of a list; if you want to get the text of a row in an NSTableView, you must look for the NSTextView of the cell you are interested in. This requires a deep understanding of how the framework you are using works, because sometimes it is not easy to retrieve information from some GUI elements; they seem to be designed as write-only black boxes. But don't worry, there is always a way.

And, the code to gain access to the innards of your framework of choice is code you'll only have to write once. The main goal is to elaborate helper classes that express basic, reusable actions such as ClickButton, ChangeEntryText, or GetTextInRow.

And remember, because the test is executing in its own thread, you'll need to perform invokes on the graphical item you are manipulating! For Windows Forms, an easy example of this helper class would look as follows:
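
(A minimal sketch with illustrative names; the important part is that every access to a control is marshaled to the UI thread through Control.Invoke.)

using System;
using System.Windows.Forms;

public static class TestHelper
{
    public static void ClickButton(Button button)
    {
        // Marshal the click to the UI thread that owns the control.
        button.Invoke(new Action(button.PerformClick));
    }

    public static void ChangeEntryText(TextBox entry, string text)
    {
        entry.Invoke(new Action(() => entry.Text = text));
    }

    public static string GetEntryText(TextBox entry)
    {
        string result = null;
        entry.Invoke(new Action(() => result = entry.Text));
        return result;
    }
}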

You will have to write one of these helper classes for each GUI framework you use, but most controls are easy to deal with. I used the NSTableView example before because lists (and trees) are a major pain regardless of the framework (and we have covered them in the example code we provided before!).

The TesteableView classes

What we have so far in this example is the ITesteableInputDialog interface, which expresses basic actions, and the TestHelper class that will help us manipulate the graphical elements of our GUIs.

How do we implement the classes that perform the actions? As I said before, we follow the Adapter Pattern. We will have to do it once per GUI framework we use, because this TesteableView family of classes does indeed rely on said framework (through the TestHelper class).

For this to work, the class being adapted must expose its controls, either as public members or as public properties. Let's say this is our InputDialog implementation:
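
(A Windows Forms sketch with illustrative names; layout code is omitted for brevity.)

using System.Windows.Forms;

public class InputDialog : Form
{
    // Exposed so the testeable adapter can reach them.
    internal TextBox NameEntry;
    internal Button OkButton;
    internal Button CancelButton;

    public InputDialog()
    {
        NameEntry = new TextBox();
        OkButton = new Button { Text = "Ok" };
        CancelButton = new Button { Text = "Cancel" };

        Controls.Add(NameEntry);
        Controls.Add(OkButton);
        Controls.Add(CancelButton);
    }
}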

...and it really doesn't matter if it really looks like this...:

...or like this:
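
It makes no difference whether the controls are exposed as fields or as properties, or which GUI framework sits underneath, as long as the adapter can reach them. A GtkSharp flavor of the same dialog could be sketched like this (illustrative names again, packing code omitted):

using Gtk;

public class InputDialog : Dialog
{
    internal Entry NameEntry;
    internal Button OkButton;
    internal Button CancelButton;

    public InputDialog()
    {
        NameEntry = new Entry();
        OkButton = new Button { Label = "Ok" };
        CancelButton = new Button { Label = "Cancel" };

        // Packing the widgets into the dialog is omitted for brevity.
    }
}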

The actual implementation of the ITesteableInputDialog comes as naturally as this, just replacing the platform-dependent components with the given framework's counterparts:
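
(A sketch for the Windows Forms dialog above, not the exact code from the repository: every interface method just delegates to the TestHelper.)

public class TesteableInputDialog : ITesteableInputDialog
{
    readonly InputDialog mDialog;

    public TesteableInputDialog(InputDialog dialog)
    {
        mDialog = dialog;
    }

    public void ChangeEntryText(string text)
    {
        TestHelper.ChangeEntryText(mDialog.NameEntry, text);
    }

    public void ClickOkButton()
    {
        TestHelper.ClickButton(mDialog.OkButton);
    }

    public void ClickCancelButton()
    {
        TestHelper.ClickButton(mDialog.CancelButton);
    }

    public string GetEntryText()
    {
        return TestHelper.GetEntryText(mDialog.NameEntry);
    }
}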

With this, your test code is easy to read (supposing that the testeableDialog instance is of the type ITesteableInputDialog):

testeableDialog.ChangeEntryText("Sergio");
testeableDialog.ClickOkButton();
Assert.IsNullOrWhitespace(testeableDialog.GetEntryText());

A central point for your application's windows and dialogs

Right now, you might be asking: where does that ITesteableInputDialog instance come from? How can we access it from the testing code? The answer is: it depends on the importance of the window.

We write our code pretty symmetrically across platforms – even the code that is not cross-platform, even the code that is not strictly related to GUIs. On all three platforms (Gluon on Windows, Plastic and Gluon on the rest) we have a centerpiece class called the WindowHandler. We know, adding the "handler" suffix to a class name should at least raise some eyebrows, as it is way too generic. But I swear, that class's purpose is to handle windows within the application.

When the application starts, after parsing the arguments, initializing everything that needs to be ready and the like, there is something like the following code:
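
(The names here are hypothetical; the idea is that once the main window exists and the arguments say "run in test mode", the WindowHandler kicks off the test.)

// Somewhere at the end of application startup; applicationWindow and
// testArgs come from the initialization and argument parsing above.
WindowHandler windowHandler = new WindowHandler(applicationWindow);

if (testArgs != null)
{
    // Loads the test assembly, initializes GuiTesteableServices
    // and starts the requested test on its own thread.
    windowHandler.LaunchTest(testArgs);
}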

The WindowHandler, which holds a reference to the ApplicationWindow (or whatever it is called in your application), will initialize the GuiTesteableServices class in the LaunchTest method call. The GuiTesteableServices class is the other thing that lives in the guitestinterfaces.dll assembly, apart from the actual interfaces I told you about before.

Then, because the test assembly also depends on the guitestinterfaces one, and because the GuiTesteableServices class is initialized way before the tests actually begin executing, a test can access the ITesteableView as easily as this:
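
(The accessor name is illustrative; the repository exposes something equivalent.)

// Inside any test method:
ITesteableApplicationWindow window =
    GuiTesteableServices.GetTesteableApplicationWindow();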

But what happens when that ApplicationWindow launches our InputDialog? How do we get access to it? The most logical answer is: "through its parent window". The ITesteableApplicationWindow interface could look as follows:
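
(A sketch with illustrative member names.)

public interface ITesteableApplicationWindow
{
    // Actions on the main window itself.
    void ClickShowInputDialogButton();

    // Access to child dialogs once they are shown (null until then).
    ITesteableInputDialog GetInputDialog();
}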

Now, completing our test method a little bit...
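
(Still a sketch built on the illustrative names above.)

[Test]
public void OkButtonClearsTheInput()
{
    ITesteableApplicationWindow window =
        GuiTesteableServices.GetTesteableApplicationWindow();

    window.ClickShowInputDialogButton();

    ITesteableInputDialog testeableDialog = window.GetInputDialog();

    testeableDialog.ChangeEntryText("Sergio");
    testeableDialog.ClickOkButton();
    Assert.IsTrue(string.IsNullOrWhiteSpace(testeableDialog.GetEntryText()));
}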

This starts to look good – simple, expressive code that achieves something! Let's move on.

Wait for things to happen

Look again at the testing code we just wrote. We acquired a testable window, we made a dialog appear by clicking a button, we acquired the testable class to interact with said dialog, and through it, we changed an entry, clicked the Ok button, and immediately checked that the entry had been cleared.

But depending on how the application behaves, this test can be fragile.

What happens if the dialog does not appear immediately after clicking the button in the window?

What happens if we fire some I/O or intensive CPU work when we hit the OK button in the dialog? If something like this happens, causing a delay between the action and the reaction (click – dialog appears, click – text input clears), the test, as it is now, will fail.

See, applications don't always respond immediately to actions – and that's OK, as long as they provide some visual feedback to the user. What we usually do is disable the "action controls" (those controls the user can interact with, either to send data to the application or to retrieve data from it).

For example, in Plastic SCM, when you create a workspace, as long as the background operation that actually creates the workspace is running, all of the inputs and buttons on the dialog are disabled – both to tell the user that something is happening behind the curtains, and to prevent them from firing more events that could lead to an inconsistent state of the application.

The buttons, the text input, and the list are all disabled while the application does its background work


This way, when a test performs an action on the GUI that fires a long-running job, it waits for a hint from the GUI that the operation ended, just as a user would. Of course, there are timeouts to prevent a test from hanging for too long. For this purpose, another helper class enters the ring: the WaitingAssert.

Its name is self-explanatory: it is your everyday Assert class, but it waits for a specified period for the condition to be successful before actually failing.
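
A minimal sketch of the idea (the real class in the repository is more complete) is an assert that polls a condition until it holds or a timeout expires:

using System;
using System.Threading;
using NUnit.Framework;

public static class WaitingAssert
{
    public static void IsTrue(
        Func<bool> condition, string message, int timeoutMs = 5000)
    {
        int waited = 0;
        while (waited < timeoutMs)
        {
            if (condition())
                return;

            Thread.Sleep(50);
            waited += 50;
        }

        Assert.Fail(message);
    }
}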

Now, we should add a new method to our TestHelper class...:
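
(An assumption about what that method does, consistent with the waits used below: it asks whether a control is enabled, again going through Control.Invoke, so it belongs next to the other TestHelper methods sketched earlier.)

public static bool IsControlEnabled(Control control)
{
    bool result = false;
    control.Invoke(new Action(() => result = control.Enabled));
    return result;
}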

...and three more methods to our ITesteableInputDialog interface (whose implementation I'm going to skip):
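
(Hypothetical names, matching the controls of the dialog sketched earlier.)

bool IsEntryEnabled();
bool IsOkButtonEnabled();
bool IsCancelButtonEnabled();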

With this, we can adapt the test to work under the scenario described before:
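
(Still a sketch built on the illustrative names used so far.)

[Test]
public void OkButtonClearsTheInput()
{
    ITesteableApplicationWindow window =
        GuiTesteableServices.GetTesteableApplicationWindow();

    window.ClickShowInputDialogButton();

    // The dialog might not show up immediately.
    WaitingAssert.IsTrue(
        () => window.GetInputDialog() != null,
        "The input dialog did not appear in time.");

    ITesteableInputDialog testeableDialog = window.GetInputDialog();

    testeableDialog.ChangeEntryText("Sergio");
    testeableDialog.ClickOkButton();

    // Clicking Ok fires background work; wait until the dialog
    // is responsive again before checking the result.
    WaitingAssert.IsTrue(
        () => testeableDialog.IsOkButtonEnabled(),
        "The dialog did not finish its background work in time.");

    Assert.IsTrue(string.IsNullOrWhiteSpace(testeableDialog.GetEntryText()));
}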

Of course, this is not the only way to test the same behavior. For example, you could wait for the entry to be cleared, and then just check if the inputs are enabled immediately:
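
(Sketching only the tail of the test.)

testeableDialog.ClickOkButton();

// Wait for the visible effect of the operation...
WaitingAssert.IsTrue(
    () => string.IsNullOrWhiteSpace(testeableDialog.GetEntryText()),
    "The entry was not cleared in time.");

// ...and by then the controls should already be enabled again.
Assert.IsTrue(testeableDialog.IsEntryEnabled());
Assert.IsTrue(testeableDialog.IsOkButtonEnabled());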

Is there a problem with it? Maybe yes, maybe no. It doesn't check if the controls are disabled at some point, so a more complete test would be this:
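
(Again, only the tail, and still a sketch.)

testeableDialog.ClickOkButton();

// First, check that the controls actually get disabled while
// the background work is running...
WaitingAssert.IsTrue(
    () => !testeableDialog.IsOkButtonEnabled(),
    "The controls were never disabled.");

// ...then wait for them to come back...
WaitingAssert.IsTrue(
    () => testeableDialog.IsOkButtonEnabled(),
    "The dialog did not finish its background work in time.");

// ...and finally check the result of the operation.
Assert.IsTrue(string.IsNullOrWhiteSpace(testeableDialog.GetEntryText()));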

What I'm trying to tell you with these examples is that writing a GUI test is sometimes tricky. There are a lot of things to check for, and many of them seem so mundane that they are easy to overlook. Be thorough with what you check, keeping in mind how your GUI works. And writing tests is also a good occasion to rethink whether the behavior of your GUI makes sense!

DRY – Don't Repeat Yourself

As you saw before, the ITesteableInputDialog interface implements "atomic" UI operations. Why isn't it more high-level? For example:
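
(A hypothetical method that would fill the entry, click Ok, and wait for the result in a single call.)

// A "high-level" method we deliberately avoid:
void EnterTextAndConfirm(string text);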

We started coding the tests that way. We wrapped high-level actions (for example, creating a workspace) in methods that would do a lot of the GUI work all at once (click the workspace switcher button, wait for the dialog to appear, navigate to the workspaces tab, click the "New workspace" button, fill in the fields, click the OK button, wait for the action to complete, check that the workspace was correctly created and that it appeared on the workspaces list, open it, and close the workspace switcher).

As time passed, we realized that having a high-level action wrapped in a method call made us lazy. Instead of thinking "uhm, do I really need to create a workspace using the GUI for this test, which is not related to the workspace switcher window at all?" we would call this CreateWorkspace method and be happy with it.

This caused long test runs: even though the code was short, concise, and easy to read, it took ages to execute!

We already had a test to check that creating workspaces from the GUI worked OK. Did we really need to repeat that cycle over and over, even if creating a workspace was just collateral to what the test was really checking? No, of course not. Instead, we now create a workspace from the GUI only in the test that specifically covers that part of the application, and just rely on the cm makeworkspace command for the rest of the tests (through our CmdRunner library).

Because we only take the long path in one test, we don't need high-level functions that perform a lot of GUI actions in just one method call. We simply forbade them, as they enable what we consider a bad practice.

Of course, your application might not have a command line interface, and you may just have to perform some actions from the GUI over and over for each test. That's fine, as long as you keep in mind that every action you repeat through the GUI without adding any value to your testing translates into waste: in time, in money, and, in the long run, in your test maintainers' mental health when they have to debug a test and wait over and over for the same repetitive, mind-numbing action to complete.

Don't let exceptions fly under the radar

I'm going to cover some specific things now that might not apply to your use case, but that I still consider important to go over.

One of them is exception handling. Or, better put, unhandled exception handling. Chances are you are a cautious developer and, even though you try to make your code exception-proof, you have some piece of code that looks a lot like this:
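
(A Windows Forms sketch; ApplicationWindow and the logging helper are illustrative.)

using System;
using System.Windows.Forms;

static class Program
{
    [STAThread]
    static void Main()
    {
        // Last-resort safety nets for exceptions nobody caught.
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            LogAndShowFriendlyError((Exception)e.ExceptionObject);

        Application.ThreadException += (sender, e) =>
            LogAndShowFriendlyError(e.Exception);

        Application.Run(new ApplicationWindow());
    }

    static void LogAndShowFriendlyError(Exception ex)
    {
        // Log the exception and tell the user something went wrong,
        // instead of letting the application crash rudely.
    }
}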

And this is not bad practice at all. All applications ship with bugs, and ensuring that our application does not crash (or at least does not crash in a rude way) because of some exception we didn't think of is perfectly fine. But if you are testing the application and an unhandled exception arises, instead of congratulating yourself because it didn't crash the application, make the exception crash it big time, so it can be fixed. The unhandled exception handlers are a safety net once the application is deployed – pass your tests without that safety net.

You can crash the application as soon as an unhandled exception is thrown, or you can save the exceptions for later, make the test fail, and log each one of them. That's your call.

And then you wouldn't even need to check for this manually in each one of your tests, like this:
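
(A sketch of such a manual check; the registry where the test-mode handler stores what it caught is hypothetical.)

// At the end of every test:
Assert.IsEmpty(
    UnhandledExceptionsRegistry.GetAll(),
    "Unhandled exceptions were thrown during the test.");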

What we did was implement the tests in such a way that they implicitly check for a few things on each run (remember the DRY point!). I'll tell you how at the end of the next section.

Don't let unexpected dialogs ruin your day

Sometimes, an exception arises from a background thread. Maybe the Plastic client couldn't connect to the server. Maybe the disk is full. Maybe something else. But because it is an error that the user can actually do something about, we just want to show them the message and carry on.

During testing, this kind of error also happens. But because the test is not expecting an error dialog, said dialog shows up and the test tries to continue behind it. Then, the test will fail with some message like "couldn't find the recently created workspace in the workspaces list", but we will miss the root cause, which is displayed in the unexpected dialog.

How do we ensure this doesn't happen? It is good that you can retrieve expected dialogs from their parent window, as we saw before, but to be sure we don't miss any dialog, every dialog should register and de-register itself in a central place. That place is, once again, the WindowHandler.

Then, we ensure this is done automatically by inheriting all our dialogs from a base class that might look like this:
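
(A Windows Forms sketch; the Instance, RegisterDialog and UnregisterDialog members of WindowHandler are hypothetical.)

using System;
using System.Windows.Forms;

public class TrackedDialog : Form
{
    protected override void OnShown(EventArgs e)
    {
        base.OnShown(e);
        // However your application exposes its WindowHandler
        // (a singleton here, for the sake of the sketch).
        WindowHandler.Instance.RegisterDialog(this);
    }

    protected override void OnFormClosed(FormClosedEventArgs e)
    {
        WindowHandler.Instance.UnregisterDialog(this);
        base.OnFormClosed(e);
    }
}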

Then, at the end of our tests, we could do this:
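
(Another sketch; how the test reaches the registered dialogs depends on what you expose to it, so the accessor below is hypothetical.)

// At the end of a test:
Assert.IsEmpty(
    GuiTesteableServices.GetOpenDialogNames(),
    "There were unexpected dialogs left open after the test.");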

But, as I promised before, there is a better way.

Don't implement basic checks over and over

We have two things we will always want to check for: unexpected exceptions and unexpected dialogs. Implementing a way to check for them on each test run is easy. I'm only covering it here in case it spares you one of those "eureka" moments that otherwise take weeks or months of doing it the hard way:

Wrap your test execution code.

Not your test code. Your test execution code.

Remember that your application is implementing a full test runner? Use it to your advantage and customize it as needed.
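
As a sketch of the idea (the method and the registries are hypothetical), the runner embedded in the application would execute every test body through something like this, so the common checks become implicit:

// Inside the application's GuiTestRunner.
void RunSingleTest(Action testBody)
{
    testBody();

    // Implicit checks that every test gets for free:
    Assert.IsEmpty(
        UnhandledExceptionsRegistry.GetAll(),
        "Unhandled exceptions were thrown during the test.");

    Assert.IsEmpty(
        WindowHandler.Instance.GetOpenDialogNames(),
        "There were unexpected dialogs left open after the test.");
}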

You can't test it all

Unluckily for us, this testing system has a drawback: it prevents you from taking advantage of some standard GUI components present in many frameworks.

For example, the MessageBox class in Windows Forms does not expose its message or buttons in any way, so we can't wrap it with our TestableMessageDialog adapter. Of course, we could try reflection, but we decided not to go down that path, and instead implemented our own standard MessageBox-like classes (which also allows us to show them from the platform-agnostic code; you can discover how by browsing the framework's repository).

Error message


Another standard component that doesn't usually expose its elements is the typical file or directory chooser dialog. It is used for opening files, saving them, or choosing an application to open another file. Because these kinds of dialogs can get quite complex, instead of implementing our own, we just don't have tests that cover the functionality where we use them.

This means that some parts of the GUIs, such as the "Open with..." functionality on the Workspace Explorer view, go untested. We rely on QA for that.

Is QA still necessary? The caveats of this system

Yes, of course Quality Assurance is still necessary, no matter how complete your GUI tests are!

As I stated before, these tests are a great safety net to ensure that everything that is tested works OK. But there are still things that are impossible to check automatically and need a pair of human eyes (at least for now; who knows if AI will take over this... ;)).

Let me give you a couple of recent examples of issues that the GUI tests didn't detect, and that the QA team caught and prevented from going into a public release of Plastic SCM:

The mergetool window on macOS was, for some reason, small and not resizable. Because all the buttons, sliders, text boxes, and the like were still there, the tests passed OK on the first try, but a human user could not interact with the application in any way.

Another example I can tell you about came when we adapted the macOS interfaces to Mojave's dark theme. We missed some icon overlays in the process (those little icons in the workspace explorer that indicate whether an item is checked out, ignored, locally changed...). The image view was there, but it couldn't load the icons from disk, so they were not displayed. Again, because the tests relied on the Status column instead of on the icon overlay, they passed OK.

What was obvious to a human user was completely invisible to the machine.

Wrapping up

Let's do a quick recap.

Because the boundaries of the application are an issue for GUI testing, I revealed that our applications are also test runners that work with PNUnit. You learned the scenarios where PNUnit is useful, we went through its architecture, and wrote a couple of parallel tests.

Then, you discovered the main components of our application's architecture, and what's more important – their purpose, and how they interact with each other. We wrote a simple dialog that takes some input, and we started developing a test which we improved incrementally to overcome every little caveat we could think of.

By now, you might still have some questions about specific details that I might have overlooked. Once again, there is a repository with a thorough, working example of what I showed you: a multiplatform application (Windows, GNU/Linux and macOS) with four GUI tests.

macOS GUI testing running


You will find instructions on how to run the application and the tests within the repository, and useful comments inside the code that will hopefully guide you in the right direction.

If you find something weird or have any doubts, remember that this is a work in progress, and that you are more than welcome to open an issue on said repository. Or to leave a comment in this blogpost. Or to ping us on Twitter at @plasticscm.

But we also have some questions for you! Do you do GUI testing? On which platform? Maybe it's Android or iOS instead of a desktop application? Or React instead of a "native" framework? Did you find GUI testing useful? Or did you stop doing it at some point for any reason?

We're looking forward to your input on the subject!

Sergio Luis
After an intense internship at Codice during spring and part of summer 2015, I joined the ranks of Plastic SCM as junior developer.
I have already contributed code to the Plastic REST API and the HAL Slack bot that controls our CI system, migrated our internal main server to "new" hardware, coded an Android repo browser, and hacked wifi-direct for the upcoming Plastic version.
You can reach me at @_sluis_.
