Testing Code That's hard to test

Testing Code That's hard to test

Josh Cheek
Josh Cheek

October 01, 2011

So if you're like me, you're 5'10" and sport a ruggedly awesome beard. And if you're really like me, you swallowed the T/BDD pill years ago and have struggled to test on 15 to 20 projects trying to really feel like you're experiencing those advantages. I'm not going to write a "why you should be using TDD" blog, that's been discussed in the Ruby and Agile communities ad nauseum. Instead, I'm going to show you one of the patterns I've discovered during my apprenticeship at 8th Light that has allowed me to stop saying "I tried to test it" and start saying "I did test it".


First, of all, we need to talk about dependencies. If you're familiar with the SOLID principles of Object Oriented design, then you're doing good. Specifically we're going to talk about:

  • Dependency Inversion Principle: Your high-level modules should not depend on low-level modules. Both should depend on abstractions. Abstractions should not depend upon details. Details should depend upon abstractions.
  • Single Responsibility Principle: Every object should have a single responsibility, and that responsibility should be entirely encapsulated by the class.

We're going to use the Dependency Injection pattern to help with this. The basic idea is that instead of referencing dependencies directly (creating instances ourselves or specifying which namespace to find methods in), we'll reference objects that abstract those dependencies. This will allow us to test the code in question apart from the details they depend on. Then we're going to extract responsibilities into classes so they're easier to test and reason about.

Code That's Hard to Test

Let me show you some code that's hard to test. It is a low level class that prompts the user for an integer and returns it. If the user doesn't enter a valid input, it tells the user and reprompts.

1class UserPrompter
2 def self.integer
3 print "Enter an integer: "
4 input = gets
5 return input.to_i if input =~ /^\\d+$/
6 puts "That is not an integer"
7 self.integer
8 end

Now how do we go about testing this code? Not too long ago, my brain would have first jumped to to popen3 or something similar. That's a long miserable tangent that leads you to hate testing -- if you can even get it to work at all (which I have).

As I learned more, it got better, I'd mock $stdin and $stdout like this:

 1class UserPrompterTest < Test::Unit::TestCase
 3 require 'stringio'
 4 def mock_stdin(user_input)
 5 stdin = $stdin
 6 $stdin = mock = StringIO.new user_input
 7 yield
 8 mock.read
 9 ensure
10 $stdin = stdin
11 end
13 def mock_stdout
14 stdout = $stdout
15 $stdout = mock = StringIO.new
16 yield
17 mock.rewind
18 mock.read
19 ensure
20 $stdout = stdout
21 end
23 def test_integer_happy_path
24 stdout = mock_stdout do
25 stdin = mock_stdin "12" do
26 assert_equal 12, UserPrompter.integer
27 end
28 assert_equal '', stdin
29 end
30 assert_equal 'Enter an integer: ', stdout
31 end
33 # ... more tests for unhappy path ...

But wow, that's pretty miserable, too! We can't test that it works without mirroring the messages (to be fair, we can make it a bit better by matching keywords, but its still problematic). We're hijacking global variables, so our tests can't run concurrently! What if our test suite prints to $stdout? That could mess our test up (plus the output would be really confusing). What if we change our messages? We'll have to go change our tests. And what if we ever want it to talk to something other than standard ouput? We might want it to read its input from a file, or over a stream from some remote terminal.

People say when your code is hard to test, it's a design smell. This is what they mean. This code is hard to test because it's coupled to its dependencies. It's fragile, it's hard to read, and it just generally reeks of hackiness.

Inject the Dependencies

So what is the problem here? Why is this code so hard to test? It's pretty simple code, but we just can't get in there because we have hard dependencies on standard input and standard output. The method invokes Kernel#print directly (Kernel is mixed into Object, so all objects inherit it), forcing us to intercept $stdout as shown above.

So what's the solution? Remove the dependencies on stdin and stdout. Instead, we'll inject those dependencies like this:

1class UserPrompter
2 def self.integer(input_stream, output_stream)
3 output_stream.print "Enter an integer: "
4 input = input_stream.gets
5 return input.to_i if input =~ /^\\d+$/
6 output_stream.puts "That is not an integer"
7 self.integer input_stream, output_stream
8 end

Now what does our test look like?

 1class UserPrompterTest < Test::Unit::TestCase
 3 require 'stringio'
 5 def test_integer_happy_path
 6 stdin = MockStream.new('12')
 7 stdout = MockStream.new
 8 assert_equal 12, UserPrompter.integer(stdin, stdout)
 9 assert_equal '', stdin.read
10 assert_equal 'Enter an integer: ', stdout.string
11 end
13 # ... more tests for unhappy path ...

That's a lot better. Instead of relying on standard input and standard output, we rely on the inputstream and outputstream parameters. We no longer change global state. Any code that wants to use this can decide what it wants to read from and what it wants to write to (e.g. UserPrompter.integer($stdin, $stdout) would result in the initial behaviour).

The Single Responsibility Principle

Personally, I'd take it a step further, and abstract away from this code even more. This would allow me to get away from the details of the message (dependency inversion) and just test the logic itself. We do this by delegating the responsibilities of reading from the input stream and writing to the output stream to other objects.

 1class UserPrompter
 2 def self.integer(reader=Reader, writer=Writer)
 3 writer.ask_for_int
 4 int = reader.read_int
 5 return int if int
 6 writer.notify_not_an_int
 7 self.integer reader, writer
 8 end
10 class Writer
11 def self.ask_for_int(output_stream=$stdout)
12 output_stream.print 'Enter an integer: '
13 end
15 def self.notify_not_an_int(output_stream=$stdout)
16 output_stream.puts 'That is not an integer'
17 end
18 end
20 class Reader
21 def self.read_int(input_stream=$stdin)
22 input = input_stream.gets
23 input[/^\\d+$/] && input.to_i
24 end
25 end

This can then be tested like this:

 1class UserPrompterTest < Test::Unit::TestCase
 2 def setup
 3 @reader = MockReader.new
 4 @writer = MockWriter.new
 5 end
 7 def test_integer_happy_path
 8 @reader.int_responses = [12]
 9 assert_equal 12, UserPrompter.integer(@reader, @writer)
10 assert_equal 1, @reader.times_read
11 assert_equal 1, @writer.times_asked_for_int
12 assert_equal 0, @writer.times_notified_not_an_int
13 end
15 # ... test non-happy path ...
18# ... declare mocks here ...

This is robust because we know it does what it says it does, as we no longer have to parse strings, but can instead check that it invoked the methods that are responsible for printing the strings. What's more, the integer prompter knows nothing about strings or streams! It could prompt from anything, even another method (treat a service as the user and wrap its API with a custom reader and dummy writer). All the details of what it means to ask for the integer, and what it means to get the integer are now encapsulated by objects responsible for those tasks.

Because we've separated these responsibilities out, we can then test the reader and writer in isolation. There used to be several reasons our tests could break. They could break because we needed to change the logic of prompting for an integer (client thinks error message is redundant, just reprompt), or because the message itself changed (client wants the error message to explain that an integer contains only digits) or the process of prompting changed (need to check that the stream is open before writing). Now these changes are encapsulated in their own objects. We could perhaps take this even further by pulling the message contents out and placing them in a settings file.


Code should be easy to test. If it isn't, figure out why. If you're having difficulty intercepting a dependency, pass it in instead. If the code is doing too much, extract the lower responsibility into its own method or object.

Another nice advantage of this is that if you break something while refactoring, probably only one method will fail, the method that checks the thing that broke. When your tests let the objects use their dependencies, then they will fail when the dependent code fails, making for a lot of spam in your test results and reducing your ability to quickly identify the regression.

Here is the full version of the final code.