Add Approval Testing to Your Toolbox

The term “legacy code” might conjure images of crusty old COBOL programs running on VAX mainframes doing batch processing of really boring calculations that no developer today wants to go anywhere near. You also could throw in a few paper tapes, punch cards, cobwebs, and spiders, and I reckon that might be the view of “modern” developers.

However, I like Michael Feathers’ description of legacy code from his book Working Effectively With Legacy Code. He defines legacy code as “code without tests.”

Maverick developer: “Whoa, dude! You’re saying all my code is legacy?”
Me: “Yep!”
Maverick developer: “I’ve written a gazillion lines of code, man. You expect me to test all of that?”
Me: “Yep!”
Maverick developer: “Well, that’s gonna take forever.”
Me: “Well, maybe not, and I’m sure you’ve not written that much. There is a way …”

Getting serious for a moment, what I am suggesting to “Maverick developer” is that we need to write a bunch of tests that preserve the behaviour of the current system, so that we can change it with confidence in the future. Such tests are called “characterization tests,” and we have Michael Feathers to thank for the term. He describes them in the same book as:

“Tests that we need when we want to preserve behaviour.”

Depending on the size of the application, writing a whole suite of tests could take some time. However, there is a technique called “approval testing” that helps you add tests to code that has none. It isn’t widely used, but a lot of developers could benefit from it.

In this article, I introduce some techniques to get started with approval testing, and then show how to advance your skills with combination approvals.

Step Forward With Approval Testing

Approval testing allows you to quickly add tests to untested code by comparing the output of the current test run with the output you approved previously. Any difference between the two will cause the test to fail, and you need to approve the difference for the test to pass. For example, suppose you have the following (trivial) legacy code that has no tests:


fun add(a: Int, b: Int): Int {
    return a + b
}

Normally you’d write a unit test for this and pass in some arbitrary values to see what it does, then write an assertion that captures how it works. You might use the built-in assertion calls in your testing library to verify it, but you can use approval testing library assertions instead.

When the test runs with approval testing, the provided result is stored in a “received” file, which requires “approval.” When you approve the result, it is stored in an “approved” file. The test fails when the two files are different.

You “approve” the received values by merging or copying the result from the “received” file to the “approved” file. Once the two files are the same, the test passes. For example, given this test and implementation code:


@Test
public void verifyAddingGivesCorrectResult() {
    int result = add(2, 4);
    Approvals.verify(result);
}

int add(int a, int b) {
    return a + b;
}

The test above will fail on its first run. A diff viewer will pop up showing the received.txt file containing 6 next to a blank approved.txt file. Merge the data across, and the test will pass. Then write your next test and continue in the same way, copying each result from the received.txt file into the approved.txt file.
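To make the received/approved cycle concrete, here is a minimal sketch of the mechanism in plain Java. It is not how the approval testing library is implemented; the class ApprovalSketch and its verify and approve methods are names I made up for illustration, and the file naming simply mirrors the received.txt and approved.txt files described above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of the received/approved comparison.
// This is NOT the approval testing library's code; every name here is made up.
public class ApprovalSketch {

    /** Returns true when the result matches the content of the approved file. */
    public static boolean verify(Path dir, String testName, String result) throws IOException {
        Path received = dir.resolve(testName + ".received.txt");
        Path approved = dir.resolve(testName + ".approved.txt");

        // With no approved file yet, the very first run can never match.
        boolean matches = Files.exists(approved)
                && new String(Files.readAllBytes(approved), StandardCharsets.UTF_8).equals(result);

        if (matches) {
            Files.deleteIfExists(received); // nothing left to review
            return true;
        }
        // Keep the new output on disk so a human can diff it against the approved file.
        Files.write(received, result.getBytes(StandardCharsets.UTF_8));
        return false;
    }

    /** "Approving" simply replaces the approved file with the received one. */
    public static void approve(Path dir, String testName) throws IOException {
        Path received = dir.resolve(testName + ".received.txt");
        Path approved = dir.resolve(testName + ".approved.txt");
        Files.move(received, approved, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("approval-sketch");
        String result = String.valueOf(2 + 4);

        System.out.println(verify(dir, "add", result)); // false: nothing approved yet
        approve(dir, "add");                            // the manual review step
        System.out.println(verify(dir, "add", result)); // true: files now match
    }
}
```

In the real workflow the approve step is you, looking at the diff and deciding that the received output is the behaviour you want to preserve.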

So how is this better than using normal assertions, considering you end up with a similar number of tests? The real power comes from “combination approvals.”

Combination Approvals

Let’s say you have a function with two parameters, each of which can take a number of values. It’s easy to see that there is potentially quite a bit to test here.

An illustrative example is the Tennis kata, which converts points to tennis scores. Given the points won by each player (from 0 upward), it returns a score such as Fifteen-Love or Deuce. The rules are:

Scores from zero to 3 points are known as Love, Fifteen, Thirty, and Forty.
If each player has scored at least 3 points and the scores are equal, this is Deuce.
If each player has scored at least 3 points and one player has one more point, then this is known as “Advantage” for the player in the lead.
The game is won by the player who’s won at least 4 points in total and has at least 2 points more than the opponent.

Code to implement this could look like this:


import kotlin.math.abs

class TennisCalc {
    fun score(player1Points: Int, player2Points: Int): String {
        // Once either player reaches 4 points, unequal scores mean a win or an advantage.
        if (maxOf(player1Points, player2Points) >= 4 && player1Points != player2Points) {
            return (if (abs(player1Points - player2Points) > 1) "Win for" else "Advantage") + " player " + if (player1Points > player2Points) "1" else "2"
        }
        // Equal scores are "-All" below 3 points and Deuce from 3 points upward.
        return if (player2Points == player1Points) {
            if (player1Points >= 3) "Deuce" else scoreFor(player1Points) + "-All"
        } else scoreFor(player1Points) + "-" + scoreFor(player2Points)
    }

    // Maps 0 to 3 points to their tennis names.
    private fun scoreFor(points: Int): String? =
        hashMapOf(0 to "Love", 1 to "Fifteen", 2 to "Thirty", 3 to "Forty")[points]
}

If the above were “legacy” code, you could write some unit tests to preserve the expected behaviour. If you were to be thorough about it, then you might start off with some cases such as these:


@Test
fun testScores() {
    assertEquals("Fifteen-Love", TennisCalc().score(1, 0))
    assertEquals("Thirty-Love", TennisCalc().score(2, 0))
    assertEquals("Forty-Love", TennisCalc().score(3, 0))
    assertEquals("Love-Fifteen", TennisCalc().score(0, 1))
    assertEquals("Love-Thirty", TennisCalc().score(0, 2))
    assertEquals("Love-Forty", TennisCalc().score(0, 3))
}

This test is growing long, and I haven’t even included cases for tied scores (the “-All” results), Deuce, Advantage, or Win. It also needs to cover an advantage and a win for both player 1 and player 2. That adds up to a lot of test cases and a fair bit of work. Rather than listing every possible case by hand, you can let combination approvals generate them. The test leveraging combination approvals looks like this:


@Test
fun verifyScores() {
    val tennisCalc = TennisCalc()
    val p1points = arrayOf(0, 1, 2, 3, 4, 8)
    val p2points = arrayOf(0, 1, 2, 3, 4, 8)
    CombinationApprovals.verifyAllCombinations({ p1: Int?, p2: Int? ->
        tennisCalc.score(
            p1!!, p2!!
        )
    }, p1points, p2points)
}

That’s it! A little bit of explanation is required: The p1points and p2points arrays are the scores needed to pass into the score function, and there are six values in each so the total number of combinations becomes 6 * 6 = 36.

Calling CombinationApprovals.verifyAllCombinations() will verify the results, and in this case, it takes three parameters. The first parameter is a lambda function that takes two parameters, p1 and p2. These values are the individual scores that come out from the array. Then, inside this lambda function, you call the score function (the one you want to test), passing it the p1 and p2 values passed in via the lambda. Finally, the last two parameters to the verifyAllCombinations function are the two arrays.

This test will fail on its first run, and again the diff viewer will show a received.txt file and an empty approved.txt file, e.g.:


[0, 0] => Love-All 
[0, 1] => Love-Fifteen 
[0, 2] => Love-Thirty 



[8, 2] => Win for player 1 
[8, 3] => Win for player 1 
[8, 4] => Win for player 1 
[8, 8] => Deuce

Copy the contents of the received.txt file to approved.txt and run the test again, and it passes! Also, if you’ve been checking your test coverage, you should notice that the score function now has 100% test coverage. Nice going for such little effort.

You might have looked at the verifyScores test and thought it looked a bit like a parameterized test. If it were, you’d have to write out the combinations of inputs yourself, all 36 of them. With “combination approvals,” you just list the values you’d like for each parameter, and the tool generates every combination to pass into the function under test.
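The idea of generating every combination can be sketched in a few lines of plain Java. This is only an illustration under my own assumptions, not the library’s implementation: CombinationSketch and allCombinations are made-up names, and the score method is a straightforward Java port of the Kotlin function shown earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical sketch of enumerating every input pair, in the spirit of
// combination approvals. Not the approval testing library's implementation.
public class CombinationSketch {

    private static final String[] NAMES = {"Love", "Fifteen", "Thirty", "Forty"};

    /** Java port of the article's Kotlin score function, for illustration only. */
    public static String score(int p1, int p2) {
        if (Math.max(p1, p2) >= 4 && p1 != p2) {
            String kind = Math.abs(p1 - p2) > 1 ? "Win for" : "Advantage";
            return kind + " player " + (p1 > p2 ? "1" : "2");
        }
        if (p1 == p2) {
            return p1 >= 3 ? "Deuce" : NAMES[p1] + "-All";
        }
        return NAMES[p1] + "-" + NAMES[p2];
    }

    /** Calls fn for every (a, b) pair and records one "[a, b] => result" line per pair. */
    public static <A, B, R> List<String> allCombinations(BiFunction<A, B, R> fn, A[] as, B[] bs) {
        List<String> lines = new ArrayList<>();
        for (A a : as) {
            for (B b : bs) {
                lines.add("[" + a + ", " + b + "] => " + fn.apply(a, b));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Integer[] p1points = {0, 1, 2, 3, 4, 8};
        Integer[] p2points = {0, 1, 2, 3, 4, 8};

        List<String> lines = allCombinations(CombinationSketch::score, p1points, p2points);

        System.out.println(lines.size());   // 36, one line per combination
        lines.forEach(System.out::println); // the text that would be sent for approval
    }
}
```

The nested loops are the whole trick: you supply the values for each parameter, and the cross product is built for you.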

If the function took a third parameter that could also take six values, there would be 6 * 6 * 6 = 216 combinations, so hopefully you can see how much work it would be to write those tests retrospectively by hand.

Conclusion

Even the newest, shiniest code goes rotten very quickly, especially if no one is writing tests (of course, no one reading this article would ever dream of skipping them). There are many tools and techniques a modern software engineer can use to work effectively with legacy code, and approval tests deserve a place in your toolbox.

In summary, approval testing can provide broad test coverage in just six steps:

1. Call the function under test with some inputs.
2. Use Approvals.verify on the result, rather than the usual assertEquals.
3. Run the test. It will fail and open a diff viewer showing a received file and an approved file.
4. The result from the test goes in the received file. Approve it by using your diff viewer to copy the result into the approved file.
5. Run the test again, and it will pass.
6. Go to step 1.

Addendum

The testing tool pops up a diff viewer when a test fails, so you need to have one installed. I used the one built into Visual Studio Code, which seems to work well. You may need to configure it; see your favourite Q&A site for how to set it up. For me (on macOS), it just worked.

The tool generates a lot of .approved.txt and .received.txt files. You should check the .approved.txt files into source control but add the .received.txt files to your .gitignore.
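As a rough example, assuming the received files keep the default .received. naming shown above, a single pattern in .gitignore should be enough:

```gitignore
# Ignore output from failing runs; the .approved.txt files stay under version control.
*.received.*
```

Adjust the pattern if your project writes received files with a different naming scheme.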

References

The approval testing library
Tennis kata rules
Working Effectively With Legacy Code - Michael Feathers
Tennis Refactoring Kata