Window Driver

Provide a programmatic API to drive and interrogate a UI window.

26 August 2004

This is part of the Further Enterprise Application Architecture development writing that I was doing in the mid 2000’s. Sadly too many other things have claimed my attention since, so I haven’t had time to work on them further, nor do I see much time in the foreseeable future. As such this material is very much in draft form and I won’t be doing any corrections or updates until I’m able to find time to work on it again.

A user interface window acts as an important gateway to a system. Despite the fact that it runs on a computer, it often isn't really very computer friendly - in particular it's often difficult to program it to carry out tasks automatically. This is particularly a problem for testing where automated tests do a great deal to simplify the whole development process.

Window Driver is a programmatic API for a UI window. A Window Driver should allow programs to control all dynamic aspects of a window, invoking any action and retrieving any information that's available to a human user.

How it Works

The basic rule of thumb for a Window Driver is that it should allow a software client to do anything and see anything that a human can. It should also provide an interface that's easy to program to and hides the underlying widgetry in the window. So to access a text field you should have accessor methods that take and return a string, check boxes should use booleans, and buttons should be represented by action-oriented method names. The Window Driver should encapsulate the mechanics required to actually manipulate the data in the GUI control itself. A good test is to imagine changing the concrete control: if you did, the Window Driver interface shouldn't change.
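As a rough illustration of this style of interface, here is a minimal sketch. The names AlbumFormDriver and SwingAlbumForm, and the behavior of the Apply button, are assumptions made up for the illustration; they are not part of the album example shown later.

```java
import javax.swing.JButton;
import javax.swing.JCheckBox;
import javax.swing.JTextField;

// Hypothetical driver interface: names follow the on-screen labels,
// and no Swing type appears in any signature.
interface AlbumFormDriver {
    String getTitle();
    void setTitle(String text);
    boolean isClassical();
    void setClassical(boolean value);
    void apply();
}

// Hypothetical view that implements its own driver.
class SwingAlbumForm implements AlbumFormDriver {
    private final JTextField titleField = new JTextField(20);
    private final JCheckBox classicalBox = new JCheckBox("Classical");
    private final JButton applyButton = new JButton("Apply");
    private String appliedTitle = "";

    SwingAlbumForm() {
        // Stand-in for whatever a real Apply button would do.
        applyButton.addActionListener(e -> appliedTitle = titleField.getText());
    }

    public String getTitle() { return titleField.getText(); }
    public void setTitle(String text) { titleField.setText(text); }
    public boolean isClassical() { return classicalBox.isSelected(); }

    public void setClassical(boolean value) {
        // Click only when the state has to change, as a user would.
        if (classicalBox.isSelected() != value) {
            classicalBox.doClick();
        }
    }

    public void apply() { applyButton.doClick(); }

    // Not part of the driver: lets the sketch show that apply() had an effect.
    String getAppliedTitle() { return appliedTitle; }
}
```

Because clients see only strings, booleans and actions, replacing the check box with, say, a toggle button would leave AlbumFormDriver untouched.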

The names in the interface of the Window Driver should reflect the visible UI, so name elements after their labels on the screen.

Rich client windows usually organize their controls into a complicated hierarchical structure. This is particularly the case when you use flow-based layouts as opposed to absolute coordinate layouts. The Window Driver should hide the layout design as much as possible from the interface. That way changes to the internal layout shouldn't cause clients to change. An exception to this may be multipart windows using techniques like tabs or sidebar selectors. In this case the sheer number of controls makes it worthwhile to split up the programmatic API into separate Window Driver classes.
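One way to split a multipart window is to give each tab its own small driver and have the window's driver hand them out. The sketch below assumes a hypothetical two-tab window; every class, tab and method name in it is invented for the illustration.

```java
import javax.swing.JCheckBox;
import javax.swing.JPanel;
import javax.swing.JTabbedPane;
import javax.swing.JTextField;

// Driver for the hypothetical "Details" tab only.
class DetailsTabDriver {
    private final JPanel panel = new JPanel();
    private final JTextField titleField = new JTextField(20);

    DetailsTabDriver() { panel.add(titleField); }

    JPanel getPanel() { return panel; }
    String getTitle() { return titleField.getText(); }
    void setTitle(String text) { titleField.setText(text); }
}

// Driver for the hypothetical "Options" tab only.
class OptionsTabDriver {
    private final JPanel panel = new JPanel();
    private final JCheckBox classicalBox = new JCheckBox("Classical");

    OptionsTabDriver() { panel.add(classicalBox); }

    JPanel getPanel() { return panel; }
    boolean isClassical() { return classicalBox.isSelected(); }
    void setClassical(boolean value) { classicalBox.setSelected(value); }
}

// Driver for the whole window: selects a tab and returns that tab's driver.
class TabbedAlbumWindowDriver {
    private final JTabbedPane tabs = new JTabbedPane();
    private final DetailsTabDriver details = new DetailsTabDriver();
    private final OptionsTabDriver options = new OptionsTabDriver();

    TabbedAlbumWindowDriver() {
        tabs.addTab("Details", details.getPanel());
        tabs.addTab("Options", options.getPanel());
    }

    DetailsTabDriver showDetails() { tabs.setSelectedIndex(0); return details; }
    OptionsTabDriver showOptions() { tabs.setSelectedIndex(1); return options; }
    String getSelectedTabTitle() { return tabs.getTitleAt(tabs.getSelectedIndex()); }
}
```

A client then writes driver.showOptions().setClassical(true) and never touches the panels or the tabbed pane, so the layout inside each tab stays free to change.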

One of the trickier aspects of a Window Driver with modern UIs is dealing with multiple threads. Usually once a window is launched it runs on a different thread to the driving program. This can breed nasty threading bugs that make the Window Driver unreliable. There are a couple of ways to deal with this. One is to use a library that will put requests onto the UI thread. Another is to not actually launch the window so that it never actually goes into the UI thread.
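To make the first option concrete for Swing, here is a minimal sketch that routes every driver call through SwingUtilities.invokeAndWait, so the widget is only touched on the event dispatch thread. The class EdtTextFieldDriver and its helper onUiThread are hypothetical names for this illustration, not part of any library.

```java
import java.lang.reflect.InvocationTargetException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;
import javax.swing.JTextField;
import javax.swing.SwingUtilities;

// Hypothetical driver for one text field; all widget access runs on the UI thread.
class EdtTextFieldDriver {
    private final JTextField field;

    EdtTextFieldDriver(JTextField field) { this.field = field; }

    String getText() { return onUiThread(field::getText); }

    void setText(String text) {
        onUiThread(() -> { field.setText(text); return null; });
    }

    // Runs the task on the event dispatch thread and blocks until it has a result.
    static <T> T onUiThread(Supplier<T> task) {
        if (SwingUtilities.isEventDispatchThread()) {
            // Already on the UI thread; invokeAndWait would throw an Error here.
            return task.get();
        }
        AtomicReference<T> result = new AtomicReference<>();
        try {
            SwingUtilities.invokeAndWait(() -> result.set(task.get()));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the UI thread", e);
        } catch (InvocationTargetException e) {
            // The task itself failed on the UI thread.
            throw new IllegalStateException(e.getCause());
        }
        return result.get();
    }
}
```

Blocking until the UI thread has finished keeps the driving program and the window in step, at the cost of making each call synchronous.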

The interface of a Window Driver can either expose the widgets themselves or expose methods that make the changes you're interested in making on the widgets. So if you want to expose a text field in Swing, you can either do it by a single method JTextField getArtistField(); or with methods for the different aspects of a field: String getArtistField(), void setArtistField(String text), boolean getArtistFieldEnabled(). Returning the field itself is simpler, but does mean that the client of the Window Driver is dependent upon the windowing system and programmers using the Window Driver need to be familiar with how the windowing system works. It also means that the windowing system has to fire events on changes to the widgets that are done by calling methods on them directly. On the whole I prefer to return the widgets unless there is a good reason not to.
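The point about events can bite in practice. In Swing, calling setSelected on a check box changes its state without firing an ActionEvent, while doClick behaves like a user's click and does fire one. The sketch below uses a hypothetical ClassicalPanel that offers both styles of access, so the difference is visible.

```java
import javax.swing.JCheckBox;
import javax.swing.JTextField;

// Hypothetical panel: the composer field is enabled only for classical albums.
class ClassicalPanel {
    private final JCheckBox classicalBox = new JCheckBox("Classical");
    private final JTextField composerField = new JTextField(20);

    ClassicalPanel() {
        composerField.setEnabled(false);
        // The view reacts to action events, which a user's click produces.
        classicalBox.addActionListener(
                e -> composerField.setEnabled(classicalBox.isSelected()));
    }

    // Widget-exposing style: the client must know which Swing calls fire events.
    JCheckBox getClassicalCheckBox() { return classicalBox; }
    JTextField getComposerField() { return composerField; }

    // Method style: the driver picks a call that fires the action event.
    void setClassical(boolean value) {
        if (classicalBox.isSelected() != value) {
            classicalBox.doClick();
        }
    }
}
```

With this panel, getClassicalCheckBox().setSelected(true) checks the box but leaves the composer field disabled, whereas setClassical(true) enables it. A client holding the raw widget has to know to call doClick, which is the kind of windowing-system knowledge the paragraph above warns about.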

When to Use It

The most common use for a Window Driver is for testing, particularly when using an Autonomous View. The Window Driver isolates the tests from the details of the implementation of the view, which simplifies writing the tests and protects them from changes in how the view is organized.

Window Driver can also be used to provide a scripting interface on top of the application. However, in most cases it's better to write such an interface on a lower layer in the system. One case where this may be difficult is when you have an application with lots of behavior embedded into the view and it's too difficult to move this behavior into a lower layer. I'd still prefer to move the behavior if I could.

Window Driver may not be needed if you work hard to provide a really thin view. Patterns such as Supervising Controller, Passive View and Presentation Model aim to make a Window Driver unnecessary.

Example: The Swing Album Example (Java)

Here's the Window Driver I used for the album running example. One of my requirements was that I could write a single set of tests which would test multiple implementations of this window using different patterns. This isn't going to be a common requirement, but it helps show how the Window Driver can provide some implementation independence.

I begin by defining the common interface for the Window Driver:

public interface AlbumWindowDriver {
    JList getAlbumList();
    JTextField getTitleField();
    JPanel getMainPane();
    JPanel getAlbumDataPane();
    JPanel getApplyPanel();
    JScrollPane getAlbumListPane();
    JTextField getComposerField();
    JCheckBox getClassicalCheckBox();
    JSplitPane getSplitPane();
    JButton getApplyButton();
    JButton getCancelButton();
    JTextField getArtistField();
    JFrame getWindow();
}

As you can see this just exposes the various widgets for manipulation. I then implement this interface directly in the various view implementations that I have.

In my case I'm driving the tests using Jemmy. The test case class uses Jemmy's operators to wrap the controls exposed by the Window Driver.

private JTextFieldOperator title;
private JTextFieldOperator artist;
private JListOperator list;
private JFrameOperator window;
private JCheckBoxOperator isClassical;
private JTextFieldOperator composer;
private JButtonOperator applyButton, cancelButton;
private Album[] albums = (Album[]) Mother.albums().toArray(new Album[Mother.albums().size()]);

protected void setUp() throws Exception {
    AlbumWindowDriver frame = doCreateFrame();
    window = new JFrameOperator(frame.getWindow());
    title = new JTextFieldOperator(frame.getTitleField());
    list = new JListOperator(frame.getAlbumList());
    artist = new JTextFieldOperator(frame.getArtistField());
    isClassical = new JCheckBoxOperator(frame.getClassicalCheckBox());
    composer = new JTextFieldOperator(frame.getComposerField());
    applyButton = new JButtonOperator(frame.getApplyButton());
    cancelButton = new JButtonOperator(frame.getCancelButton());
}
protected abstract AlbumWindowDriver doCreateFrame();

I'm using Jemmy's operators to handle the threading issues. Jemmy can also find controls in a form's hierarchy, but I don't need that as the Window Driver gets me directly to the controls I need.

The abstract method doCreateFrame() is there so I can use a subclass to set up the actual view implementation I'm using. Unless you have this odd requirement you can just instantiate the view inline.

With the various variables set up, I can now write tests that directly manipulate the controls.

public void testCheckClassicalBoxEnablesComposerField() {
    list.setSelectedIndex(4);
    assertEquals("Zero Hour", title.getText());
    isClassical.doClick();
    assertTrue(isClassical.isSelected());
    assertTrue("composer field not enabled", composer.isEnabled());
    applyButton.doClick();
    list.setSelectedIndex(0);
    list.setSelectedIndex(4);
    assertTrue("composer field not enabled after switch", composer.isEnabled());
}