Unit Testing Code Boundaries

Ashley Bye

June 04, 2019

When I first learned to unit test my software, I noticed that I struggled to test code that interacted with components at the boundaries of my design. These components would often be input/output related, whether that was getting input from the command line or firing off a request to an external HTTP API. Due to these external dependencies, I couldn't work out how to write a test that didn't need to use the service in some way or another. Having read that I should keep my tests decoupled from any external dependencies, I regularly chose not to test these aspects of the system due to it being "too hard" to do.

Hopefully you will agree that my initial approach was a bad decision. Such components serve an important role in the systems we develop—they are the interfaces through which we send and receive the data our application depends upon, which implies that they are also depended upon by our customers, and, ipso facto, our businesses.

By not writing unit tests to cover my software's interaction with boundary code, I was introducing a great deal of risk into my applications, reducing the flexibility and extendability of my software, and potentially increasing the cost of future changes.

After battling with this problem for a while, I started to learn about mocks. Ah ha, I hear you say. At last! Unfortunately, this sad little story doesn't quite end there. Mocking is certainly a good solution, but as I investigated the various mocking frameworks, I found that I was still struggling. Most of the examples I found relied on mocking the method calls and responses of library software designed to interact with an external API.

This meant that I needed an intimate knowledge of the internals of the library, which I just didn't have. Further, my code was completely coupled to the library code, and any changes to it rippled throughout my software. It also meant I couldn't easily swap out one external service for another without significant changes to my code. That smells like a violation of several SOLID principles to me. In short, I still hadn't resolved the business-impacting problems I wrote about above.

I think that my problems stemmed, in large part, from where the wealth of literature on unit testing and test-driven development puts its emphasis. I agree with my colleague, Thomas Countz, who opined in his recent article on Essential and Relevant Unit Tests that much has been written on why to do unit testing but little has been written on how to do unit testing well.

A few weeks ago I was ramping up on a new project that is primarily written in Python. Coming from a Java background, I'm familiar with the main unit testing and mocking frameworks, as well as writing manual mocks. My experience with Python is more limited and I wanted to know what mocking frameworks were available. As I researched the popular tools, I noticed that their "Quick Start" guides gave examples of directly mocking methods on external APIs. I believe this is bad advice that is likely to trip up new developers, as well as a fair number of seasoned pros. So what's the solution?

In the remainder of this article I'll explain what boundary code is, highlight the benefits of designing the boundaries of a system with testability in mind, cover the key areas in which I think developers can improve their unit tests, and provide a couple of examples along the way. I'll use both Python and Java for the examples, since they are popular languages.

What is boundary code?

When I talk about boundary code, I mean any part of a system that interacts with the outside world. That is, anything that receives input or transmits output. One way of categorising such boundary components comes from Hexagonal Architecture—any component that is a port is considered to be on the boundary.

There are two main types of boundaries, one of which is significantly more obvious than the other:

  1. External systems.
  2. Standard libraries and packages.

Code that interacts with external systems is, I hope, an obvious boundary component. Examples include interacting with database systems, file systems, REST APIs, audio devices, etc.

Standard libraries and packages are perhaps less obviously boundary components. These libraries are often provided by programming language authors to make developers’ lives easier, and are in many instances baked into our language of choice. But let's take a quick moment to examine the functionality that some of them provide: filesystem IO, console IO, HTTP clients, etc. Since these types of libraries exist to receive input and transmit output, I consider them to be boundary components.

What are the benefits of designing boundaries with testability in mind?

As I mentioned above, the main problems that arise when developers are encouraged to couple their code to a boundary are that:

  • Decision making cannot be deferred.
  • Risk is increased.
  • Flexibility is reduced.
  • The cost of changes goes up.

As software developers, we should strive to give our teams as many options as possible for as long as possible, reduce risk, increase flexibility, and minimise costs. Doing so provides them with the best opportunity to respond and adapt to changing market demands. Whilst effectively tested code boundaries are not the prime factor in determining how well software meets these goals, they go a long way. As my mum used to say when getting me to tidy my room, take care of the edges and the rest takes care of itself.

How can unit testing at code boundaries be improved?

As I outlined above, I believe there are two key areas that constitute boundary code: code that interacts with obviously external systems and code that uses standard libraries to interact with the local system. The solution is the same for each of these—by employing the Adapter Pattern we can use the Humble Object test pattern to push the complexity of interacting with code boundaries to the very edges of our system. Below, I'll provide an example for each of the two cases in turn.

Unit testing code boundaries with external systems

When developing unit tests, it can be hard to write tests for code that interacts with external systems. This is because we are writing code that interacts with services that we do not own and which, more often than not, do not have a deterministic output. In terms of functional programming, functions that interact with external systems cause side effects. Since components at the boundary rely on side effects, they are at odds with the deterministic nature of unit testing. We regularly use third-party libraries and frameworks to help us interact with such systems.

We can avoid directly mocking calls on these external APIs by defining an interface that serves as an adapter to the service. The adapter wraps the code that interacts with the external service in methods that describe the functionality we need from it. We then implement this interface once with the code that calls the real API, and again with as many mocks as we need for testing purposes. In a dynamic language such as Python, duck typing lets us do this without declaring the interface explicitly.
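Although duck typing makes the interface implicit, it can still be written down. The following is a minimal sketch, assuming Python 3.8 or later for typing.Protocol. The names CsvRepository, InMemoryCsvRepository and count_rows are illustrative only and do not come from any real project.

```python
import io
from typing import Protocol, TextIO


class CsvRepository(Protocol):
    """Hypothetical adapter interface: the one operation the application needs."""

    def download_csv(self, filename: str) -> TextIO:
        ...


class InMemoryCsvRepository:
    """Test double. It matches CsvRepository by shape, without inheriting from it."""

    def __init__(self, files):
        self._files = files  # maps filename -> CSV text

    def download_csv(self, filename: str) -> TextIO:
        return io.StringIO(self._files[filename])


def count_rows(filename: str, repository: CsvRepository) -> int:
    """Application code that depends only on the adapter interface."""
    lines = repository.download_csv(filename).read().splitlines()
    return len(lines) - 1  # exclude the header row


repository = InMemoryCsvRepository({"report.csv": "a,b\n1,2\n3,4"})
row_count = count_rows("report.csv", repository)
```

Writing the protocol down is optional; its main value is that a type checker can confirm that both the real adapter and every mock offer the same methods.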

For example, suppose I am writing an application that needs to download CSV files from an Amazon S3 bucket. Further, suppose that the business has said they will be moving all their infrastructure over to a similar service on the Microsoft Azure platform within the next 12 months. If I don't use a wrapper, my code might look something like the following:

import io
import os

import boto3

def download_csv(filename):
    # Build an S3 client from settings held in environment variables.
    s3_client = boto3.session.Session().client(
        service_name='s3',
        endpoint_url=os.environ['S3_URL']
    )

    with io.BytesIO() as data:
        s3_client.download_fileobj(
            os.environ['BUCKET_NAME'], filename, data
        )

        # Read the buffer before the with block closes it.
        return io.StringIO(data.getvalue().decode('UTF-8'))

By embedding the code that interacts with S3 directly in the program code, I am unable to unit test anything that calls it without a live connection to S3. As such, I'm unlikely to test any behaviour of this part of the program. To overcome this, I can use the Adapter pattern to create a wrapper object:

import io
import os

import boto3


class S3Repository(object):
    def download_csv(self, filename):
        s3_client = boto3.session.Session().client(
            service_name='s3',
            endpoint_url=os.environ['S3_URL']
        )

        with io.BytesIO() as data:
            s3_client.download_fileobj(
                os.environ['BUCKET_NAME'], filename, data
            )

            return io.StringIO(data.getvalue().decode('UTF-8'))

Not much has changed: the download_csv() function is now a method on an S3Repository class, which holds the complex and hard-to-test logic. Clients of the class no longer rely on the implementation of download_csv(); they depend only on its signature, and the repository itself plays the role of a Humble Object. I can use duck typing to substitute a mock wrapper in place of the real implementation during tests (note that we should extract the logic to load the environment variables into its own adapter too):

import io


def example(filename, repository):
    return repository.download_csv(filename)


class TestRequiringS3Repository(object):
    def test_example(self):
        repository = MockS3Repository()
        csv = example("filename.csv", repository)

        assert csv.read() == "Some,Test,Data\n1,2,3"
        assert repository.download_csv_called_with_filename == "filename.csv"


class MockS3Repository(object):
    def __init__(self):
        self.download_csv_called_with_filename = ""

    def download_csv(self, filename):
        self.download_csv_called_with_filename = filename
        return io.StringIO("Some,Test,Data\n1,2,3")

My unit test is now decoupled from boto3, and thus from Amazon S3, meaning I can test behaviour without needing to have the necessary settings configured to connect to the correct S3 account. I can use different mocks to exhibit different behaviour, such as when a file is not found, or if there are network issues. When I need to swap Amazon S3 for Microsoft Azure, I only need to implement a new class wrapping the same functionality but for Azure. If the external library changes its API, such changes are isolated to one place in my code.
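As a sketch of the "file not found" case, a second mock can raise an error instead of returning data. Everything named here is hypothetical: FileNotFoundInRepository is an exception I am assuming the adapter would define (the real S3Repository would have to translate the library's own errors into it), and load_rows is an illustrative client.

```python
class FileNotFoundInRepository(Exception):
    """Hypothetical error the adapter raises instead of leaking a library exception."""


class MissingFileRepository:
    """Mock that behaves as if the requested file does not exist."""

    def __init__(self):
        self.requested = []

    def download_csv(self, filename):
        self.requested.append(filename)
        raise FileNotFoundInRepository(filename)


def load_rows(filename, repository):
    """Hypothetical client: returns parsed rows, or an empty list if the file is missing."""
    try:
        csv_file = repository.download_csv(filename)
    except FileNotFoundInRepository:
        return []
    return [line.split(",") for line in csv_file.read().splitlines()]


def test_missing_file_yields_no_rows():
    repository = MissingFileRepository()

    assert load_rows("absent.csv", repository) == []
    assert repository.requested == ["absent.csv"]
```

Because the client only ever sees the adapter's own exception type, the same test would keep passing if the storage backend were replaced.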

You may be wondering how you would test the code that actually communicates with the S3 bucket. Such code should certainly be tested, but it is not the role of a unit test to do so. Instead, exercise this functionality with automated integration tests—these need not, and should not, test every single eventuality, but they should verify that the repository code works as expected in at least one scenario.
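The earlier aside about moving the environment variable lookups into their own adapter follows the same shape, and it gives an integration test one place to supply a test bucket. This is only a sketch: EnvironmentSettings, FakeSettings and describe_target are hypothetical names, and the URL and bucket values are placeholders.

```python
import os


class EnvironmentSettings:
    """Hypothetical adapter: the only place that reads os.environ."""

    def __init__(self, environ=None):
        # Accept any mapping so tests can pass a plain dict.
        self._environ = os.environ if environ is None else environ

    @property
    def s3_url(self):
        return self._environ["S3_URL"]

    @property
    def bucket_name(self):
        return self._environ["BUCKET_NAME"]


class FakeSettings:
    """Test double with fixed placeholder values."""

    s3_url = "http://localhost:9000"
    bucket_name = "test-bucket"


def describe_target(settings):
    """Illustrative client that depends on the settings adapter, not on os.environ."""
    return f"{settings.bucket_name} at {settings.s3_url}"
```

A repository could then take a settings object in its constructor rather than reading os.environ inside download_csv().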

Unit testing code boundaries with standard libraries

A similar approach should be taken when testing code that uses standard libraries to perform IO-type actions. The main difference, and perhaps objection, when doing so is that these APIs are generally stable and unlikely to change. However, using a wrapper object makes testing the behaviour at the boundary much simpler.

To demonstrate this I'll use a simple example, which asks a user for their name and tells them how many letters it contains. The main logic is counting the number of characters in the input string, but to test that everything is wired up correctly, I want to simulate input and output. This can be done with the following test:


@Test
void tellsAshleyHeHasSixLettersInHisName() {
    MockConsole console = new MockConsole("Ashley");
    CliApp app = new CliApp(console);

    app.run();

    assertTrue(console.displayedMessageWas("Ashley has 6 letters"));
}

To implement this, I created a simple interface called Console—not to be confused with java.io.Console—that serves as an Adapter for Java’s input and output functionality, making it a Humble Object too. My mock console then simulates the input and asserts that the correct output was requested:


public interface Console {
    String getName();

    void displayMessage(String message);
}

public class MockConsole implements Console {
    private String input;
    private String message;

    public MockConsole(String input) {
        this.input = input;
    }

    @Override
    public String getName() {
        return input;
    }

    @Override
    public void displayMessage(String message) {
        this.message = message;
    }

    public boolean displayedMessageWas(String expectedMessage) {
        // Comparing this way round returns false, rather than throwing,
        // when no message has been displayed yet.
        return expectedMessage.equals(message);
    }
}

The application logic is very simple, as follows:


public class CliApp {
    private Console console;

    public CliApp(Console console) {
        this.console = console;
    }

    public void run() {
        String name = console.getName();
        int numberOfLetters = name.length();
        // Use the name that was entered rather than a hard-coded one.
        console.displayMessage(String.format("%s has %d letters", name, numberOfLetters));
    }
}

By declaring Console as a wrapper interface, I am able to remove any dependency on an IO library from the application logic, resulting in more robust tests, more flexibility, and more options. In this trivial example the real implementation would most likely only read from System.in and write to System.out, but I hope you can see how powerful this approach to unit testing code boundaries with standard libraries is. Adapters can prove especially beneficial when working with more complex libraries, such as sockets.
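For comparison, here is a rough Python rendering of what the real adapter might look like. It is a sketch under assumptions rather than a port of the Java code: StreamConsole is a hypothetical name, the prompt text is invented, and the streams default to sys.stdin and sys.stdout but can be replaced with in-memory ones.

```python
import io
import sys


class StreamConsole:
    """Hypothetical real adapter: a thin wrapper over an input and an output stream."""

    def __init__(self, stdin=None, stdout=None):
        self._stdin = sys.stdin if stdin is None else stdin
        self._stdout = sys.stdout if stdout is None else stdout

    def get_name(self):
        self._stdout.write("What is your name? ")  # prompt text is an assumption
        self._stdout.flush()
        return self._stdin.readline().strip()

    def display_message(self, message):
        self._stdout.write(message + "\n")


class CliApp:
    """Application logic; it knows only the console's two methods."""

    def __init__(self, console):
        self._console = console

    def run(self):
        name = self._console.get_name()
        self._console.display_message(f"{name} has {len(name)} letters")


# Drive the real adapter with in-memory streams instead of a terminal.
fake_stdin = io.StringIO("Ashley\n")
fake_stdout = io.StringIO()
CliApp(StreamConsole(fake_stdin, fake_stdout)).run()
output = fake_stdout.getvalue()
```

The adapter stays small enough that there is very little in it to get wrong, which is the point of keeping the object at the boundary humble.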

Wrapping up

Unit testing is an important tool in a developer’s toolbox. However, it is not enough to merely have the tool at our disposal. Rather, we need to become adept at wielding it in each of the situations in which it is needed. One area where I think existing instruction could be improved is how to use unit testing to improve the quality of our code boundaries. By doing so, we benefit by designing software with reduced risk, increased flexibility, and decreased cost of changes. To enable these benefits we can use the Adapter and Humble Object patterns.