Nested Contexts are Only Kind of Evil

If there's anything where opinions amongst my development colleagues differ heavily, it's about the evil of context blocks inside a spec suite. Proponents of nesting contexts will advocate the clarity they provide, and about how they can help deal with duplication in a readable, functional style. Opponents of this style, however, claim that they create too much state and context that can cause all kinds of "wait, what is the value of 'x' again?" moments as you attempt to figure out a test.

This conflict is interesting... mostly because I'm not really sure exactly where I stand on the topic. Mutation of state can be difficult to pick up on quickly and usually requires a closer look, but duplication of setup and concerns can cause a headache as you search for the specific differences in various bits and pieces. To get to the bottom of the matter, I started with a few "virtues" of what I think are good tests: - communicates the behavior of the system under test effectively - minimizes cognitive overhead (i.e., simplicity and transparency is better, and can be understood faster) - fails clearly, correctly, and informatively

I can add a few more, of course (e.g., doesn't stand in the way of rapid refactoring, runs quickly to facilitate feedback, etc), but I find the three guidelines above to be the most important in pursuit of a simple tests. I'd also say that I prefer tests to maximize communication, even at the cost of creating duplication: once that duplication exists, you have a nice clean point from which you can figure out what to do with it.

Sometimes removing code duplication depends on the tools we have available. With tools like Jasmine or RSpec, this is often accomplished with nested contexts. But sometimes I've seen the approach of extracting a function or method that encapsulates the duplicated assertions, and the test can call that explicitly. Let's look at JS for this case, since its local variable scoping makes things relatively clear. Just to be honest, I don't like this approach at all, but it's a nice point for us to start with since we can refactor.

describe("Widgets", function() {
		var result, widget;

		function behavesLikeWidget() {
				it("has a base URL", function() {
						expect(result.url).toEqual("the-url");
				});

				it("has a document selector", function() {
						expect(result.selector).toEqual(".a-class");
				});
		}

		context("when the URL has query string", function() {
				beforeEach(function() {
						widget = new Widget({ url: "the-url?with=params" });
						result = Widgets.launch(widget, { otherParam: "foo" });
				});

				it("adds parameters after an ampersand", function() {
						expect(result.queryString).toEqual("?with=params&otherParam=foo");
				});

				behavesLikeWidget();
		});

		context("when the URL has no query string", function() {
				beforeEach(function() {
						widget = new Widget({ url: "the-url" });
						result = Widgets.launch(widget, { otherParam: "foo" });
				});

				it("adds parameters after a question mark", function() {
						expect(result.queryString).toEqual("?otherParam=foo");
				});

				behavesLikeWidget();
		});
});

Seeing this is a bit jarring in JavaScript or Ruby, where blocks and closures give us lots of flexibility. Because behavesLikeWidget is defining examples and not assertions, this code depends on some local state being set up properly in the beforeEach before it can be executed. You can't exactly pass a local variable into a Jasmine example, so mutation is a necessary evil to accomplish what we're seeing here.

However, like most mutation of local state, I think this can be made much more clear. I stumble across the function first, and wonder what result is. Then I see it all set in a beforeEach, and then I get to scroll back up to see what the function is doing with the mutated local state. It feels like obfuscation!

Let's try abandoning the letter of the "one assertion per spec" rule in favor of being able to pass the result as an argument as opposed to having to use the mutated local state. In general, I tend to prefer this sort of referential transparency, even when I have to make a sacrifice in terms of following another one of the accepted TDD rules.

describe("Widgets", function() {
		var result, widget;

		function behavesLikeWidget(result) {
				expect(result.url).toEqual("the-url");
				expect(result.selector).toEqual(".a-class");
		}

		context("when the URL has query string", function() {
				beforeEach(function() {
						widget = new Widget({ url: "the-url?with=params" });
						result = Widgets.launch(widget, { otherParam: "foo" });
				});

				it("adds parameters after an ampersand", function() {
						expect(result.queryString).toEqual("?with=params&otherParam=foo");
				});

				it("does everything else a widget does", function() {
						behavesLikeWidget(result);
				});
		});
});

As you can probably tell, I struggled with trying to name that last example. It "does everything else a widget does"? Not the best name, right? I tried "has a url and selector" but then I realized: these things don't care about the context's changed state. And that finally allowed me to see the real problem: we're doing this inside-out. We're running assertions over and over that don't even care about whether or not the URL has a query string, and yet the query-string's presence has some sort of massive importance in our test suite.


describe("Widgets", function() {
		var result, widget;

		beforeEach(function() {
				widget = new Widget({ url: "the-url" });
				result = Widgets.launch(widget, { otherParam: "foo" });
		});

		it("has a base URL", function() {
				expect(result.url).toEqual("the-url");
		});

		it("has a document selector", function() {
				expect(result.selector).toEqual(".a-class");
		});

		it("adds launch parameters to the query string", function() {
				expect(result.queryString).toEqual("otherParam=foo");
		});

		it("preserves existing query string parameters", function() {
				widget = new Widget({ url: "the-url?with=params" });
				result = Widgets.launch(widget, { otherParam: "foo" });
				expect(result.queryString).toEqual("?with=params&otherParam=foo");
		});
});

Now our query-string has been delegated to the level of relative unimportance it deserves. I'd say our test suite is a lot more informative now. As opposed to describing the specific environments that the code is executing in, it describes how the code behaves generally: "has a URL", "has a selector", "adds parameters to query string", and then it dips into more specifics: "it preserves existing query string parameters." This is describing high-level behavior. Contrast that to "when the query string is present, adds parameters after an ampersand." There are a lot of implementation details here, and they may not exactly be important at-a-glance.

So, are nested contexts truly evil? I think not: contexts are excellent tools for BDD-style development, but only if the code is TELLING you that it's necessary. In general, I think the reaction against nested contexts is really a reaction against tests with all kinds of mutable state and lots of implementation specifics. Mutation is hard! Specifics are hard! Using the context to help create a shared vocabulary for your tests can help your test suite work for your team, but using it improperly can cause some mind-bending results, so use this tool with care.

Nested Contexts are Only Kind of Evil

Brian Pratt

Empowering Game Developers with Faster Builds and Better Tools

Essential Strategies for Upgrading Microservices Without Downtime

Getting Started with TDD: A Practical Guide to Beginning a Lasting Practice