Ruby DeRailed: Fast Tests

I use TDD as my development workflow. Red, green, refactor. It creates a quick feedback cycle, allows me to build momentum, and gives me confidence in my refactoring. It becomes the pulse at which I develop features quickly and well.

Conversely, a slow feedback cycle is a huge hindrance. As my test suite gets slower, each of these three steps carries a larger burden. I run the tests less often and try to coast on the momentum from the last feedback. The design deteriorates because refactoring now costs too much development time.

Every Rails project I've worked on has reached the point where its test suite takes more than 15 minutes to run. At this point design is an afterthought and fast feedback is a long-lost dream. I don't know specifically, but something about the Omakase way of building Rails apps causes slow test suites. With this in mind, I decided to find a different way to build Rails apps.

My first attempt

In February 2011, I gave a talk about Abstracting your Application Away from Rails. The general idea was to leave anything web related in the Rails app and to extract all of the business logic out. In this way, the business logic could be tested independently of the Rails application and tests would run quickly. In tandem with the talk, I released an example payroll application.

Shortly after I gave this talk I got the chance to test my ideas. From February 2012 until July 2013 I was part of a team that designed and built a large-scale Rails application from the ground up, using the ideas from my talk. Boiled down, the architecture was two Rails applications, a user-facing application and an admin application, sharing a gem that contained all the logic common to both.

Unfortunately, the test suite ended the same way as every other Rails app I have built. The user-facing application tests took about 16 minutes, the admin application tests took about 5 minutes and the shared gem tests took about 3 minutes. This project was a long and painful process. For a while, we were able to fight the test times and keep the test suite reasonably fast. However, our efforts eventually failed. I was left puzzled. It took a while to understand, but I was able to come up with some explanations of what went wrong.

We still had a lot of code in our Rails apps

When I proposed the idea of extracting all non-web logic out of the Rails application, I envisioned that the Rails app controllers would be nothing more than single line function calls into the gem. I was wrong. In practice, both applications that consumed the shared gem had vastly different use cases. This was good. We were able to clearly see which pieces of logic were application specific and which were generic and reusable. However, the amount of code that was application specific was nearly equivalent to the amount of generic code. This meant that we had to move all the application specific code back into the Rails ecosystem.

The shared gem still depended on the database

The shared gem contained generic behavior. However, its tests depended heavily on the database and other external services (e.g. Elasticsearch) in order to run. This was a major slowdown.

All in all, I learned two big lessons from this project:

- Just pulling code out of the Rails ecosystem was not enough
- DON'T DEPEND ON EXTERNAL SERVICES IN YOUR TEST SUITE
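
To make the second lesson concrete, here is a minimal sketch of keeping a search service out of the test suite by handing the business logic an in-memory stand-in. The names `InMemorySearchIndex` and `ProductCatalog` are hypothetical and are not taken from the project described above; a production implementation would wrap the real search client behind the same two methods.

```ruby
# Hypothetical in-memory stand-in for a search service. It answers the same
# messages a real client wrapper would, but never touches the network.
class InMemorySearchIndex
  def initialize
    @documents = {}
  end

  def index(id, text)
    @documents[id] = text.downcase
  end

  # Returns the ids of documents whose text contains the query.
  def search(query)
    needle = query.downcase
    @documents.select { |_id, text| text.include?(needle) }.keys
  end
end

# Hypothetical business object. It receives its search collaborator
# instead of creating one, so tests can pass in the in-memory version.
class ProductCatalog
  def initialize(search_index)
    @search_index = search_index
  end

  def add(id, name)
    @search_index.index(id, name)
  end

  def find_ids(query)
    @search_index.search(query)
  end
end

catalog = ProductCatalog.new(InMemorySearchIndex.new)
catalog.add(1, 'Red Bicycle')
puts catalog.find_ids('bicycle').inspect
```

Because the collaborator is injected, a test for `ProductCatalog` needs neither a running search server nor a database.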

Figuring out the root problem

After my first failed attempt at solving the problem of slow tests, I decided to go back to the drawing board. After a lot of investigation, I finally discovered why the Rails ecosystem was so slow.

Preloading

Preloading is a pervasive mechanism in Rails. In essence, it means that the "environment" should be loaded and initialized before your code gets run. The "environment" includes the Rails framework, any dependencies listed in your Gemfile, plus the Rails initialization process. If you look at an empty Rails application, you can see where this idea starts to seep into code.

Your test_helper.rb contains this line:
```
require File.expand_path('../../config/environment', __FILE__)
```
This sneaky one-liner loads the entire Rails "environment". This means that whenever you run a single test file, you have to load all of Rails.

Your Rakefile looks like this:

```
# Add your own tasks in files placed in lib/tasks ending in .rake,
# for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.

require File.expand_path('../config/application', __FILE__)

MyApp::Application.load_tasks
```

This is a little less invasive. It won't load the entire environment, only part of it. However, it does invoke the slowest part of the environment loading process: dependency loading.

Unfortunately, the idea of preloading has expanded beyond the Rails framework and into the Rails ecosystem. It is conventional for Ruby gems to contain a "require farm" to load all the contents of the gem for you. For example, to use any part of the gem Virtus I just require "virtus" and I'm good to go! The dark side is that the entire Virtus gem gets loaded, even if I don't need all of it.
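
One rough way to see how much a single `require` pulls in is to count the entries it adds to `$LOADED_FEATURES`, Ruby's list of every file loaded so far. This is only a sketch: the helper name `files_loaded_by` is made up for illustration, and the standard library's `json` stands in for whichever gem you are curious about.

```ruby
# Counts how many files a block of requires adds to $LOADED_FEATURES.
def files_loaded_by
  before = $LOADED_FEATURES.size
  yield
  $LOADED_FEATURES.size - before
end

# 'json' is only a stand-in; substitute the gem you want to inspect.
count = files_loaded_by { require 'json' }
puts "require 'json' loaded #{count} new file(s)"
```

The count is zero when the library was already loaded earlier in the process, so run the measurement in a fresh Ruby process for a meaningful number.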

Active Record

Active Record is the ORM that ships with Rails. All Rails apps use Active Record by default. Even though Active Record offers flexibility in choosing which SQL database to use, you still have to pick a SQL database. This is fine, except when running a test suite. Database calls are expensive, especially when test speed matters. In the past I've used SQLite's in-memory mode (`:memory:`) to ease the pain of these expensive calls. However, SQLite is still a SQL database. This means the database calls will always be expensive with a large enough test suite.

My second attempt

A few months ago, I started working on another Rails app from scratch. This time around I decided to address the root issue. Here's what we did.

Lazy Loading over Preloading

Instead of loading the environment before any of our code executes, we load nothing until we absolutely need it. This has drastic effects on our code base.
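
As a small illustration of the principle, the sketch below defers a `require` until the code that needs it actually runs. This is an assumption about one way to load on demand, not the project's actual code: the class name `PayslipSerializer` is hypothetical, and the standard library's `json` stands in for a heavier dependency.

```ruby
# Hypothetical class; stdlib 'json' stands in for a heavier dependency.
class PayslipSerializer
  def call(payslip)
    # Loaded the first time this method runs rather than at boot.
    # Later calls are cheap because require skips files already loaded.
    require 'json'
    JSON.generate(payslip)
  end
end

puts PayslipSerializer.new.call({ name: 'Ada', hours: 40 })
```

Placing the `require` at the top of the file that uses the dependency is the more common layout. Either way, nothing is loaded for tests that never touch this class.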

Only load the parts of Rails we use.

In the `application.rb` file, instead of this,
```
require 'rails/all'
```
we have this,
```
require 'action_controller/railtie'
require 'sprockets/railtie'
```
Since we only use Action Controller and Sprockets, we don't need to load the whole Rails framework.

We removed this line from our application.rb file,
```
Bundler.require(:default, Rails.env)
```
This removes dependency loading. Now the consumers of the gems are responsible for requiring them.

Only load the bare minimum from Active Support

We added this to our application.rb configuration block,
```
config.active_support.bare = true
```
By default, Rails loads the entire Active Support library. This option tells Rails to load only the parts of Active Support it uses.

Use multiple spec helpers
- spec_helper.rb - This is our default spec helper. It loads nothing besides RSpec. It runs extremely fast.
- rails_spec_helper.rb - This spec helper loads the Rails environment. It still runs fast because we optimized the Rails initialization procedure in the first few steps, but it's much slower than the main spec helper. We only use this spec helper when testing controllers.

The Repository Pattern over Active Record

In order to avoid the overhead of database calls in our test suite, we decided to configure the system to use a pure in-memory data store, not just SQLite's in-memory mode. In this blog post, Mike Ebert talks about how to implement the repository pattern. We used this idea across the whole system. Our test suite now talks to an in-memory Ruby database and the real system talks to a MySQL database.
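
A minimal sketch of the idea, assuming a small registry that hands out repositories by name. All names here (`Employee`, `InMemoryEmployeeRepository`, `Repositories`) are hypothetical and are not the project's actual classes; a MySQL-backed repository exposing the same `save`, `find`, and `all` methods would be registered in production.

```ruby
Employee = Struct.new(:id, :name)

# Keeps records in a Hash, so tests never open a database connection.
class InMemoryEmployeeRepository
  def initialize
    @records = {}
    @next_id = 1
  end

  # Assigns an id on first save, then stores the record under that id.
  def save(employee)
    employee.id ||= next_id
    @records[employee.id] = employee
    employee
  end

  def find(id)
    @records[id]
  end

  def all
    @records.values
  end

  private

  def next_id
    id = @next_id
    @next_id += 1
    id
  end
end

# Registry that business logic asks for a repository, so the backing
# store is chosen once, at configuration time.
module Repositories
  def self.register(name, repository)
    registry[name] = repository
  end

  def self.for(name)
    registry.fetch(name)
  end

  def self.registry
    @registry ||= {}
  end
end

# A test setup would register the in-memory version like this.
Repositories.register(:employee, InMemoryEmployeeRepository.new)
```

Business logic then calls `Repositories.for(:employee).save(employee)` and never needs to know which store sits behind the repository.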

The Results

Let's look at some metrics taken from my first speed up attempt and compare them to the second attempt.

Total test suite time:

- Attempt #1: 4853 examples, 3377.13 seconds
- Attempt #2: 1706 examples, 7.19 seconds
- Attempt #2: Projected to 4853 examples, 20.45 seconds

Time to run one test:

- Attempt #1: 18.195 seconds, 69 gems
- Attempt #2: 1.580 seconds, 40 gems

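From these results, we see the second attempt at fixing this problem was far more successful. Projected to the same number of examples, the full suite runs roughly 165 times faster, and a single test runs about 11.5 times faster.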
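
The projected figure and the relative speedups follow directly from the numbers reported above. The short script below reproduces the arithmetic, assuming suite time grows in proportion to the number of examples; the variable names are mine.

```ruby
# Figures copied from the results listed above.
attempt_one = { examples: 4853, seconds: 3377.13, single_test: 18.195 }
attempt_two = { examples: 1706, seconds: 7.19, single_test: 1.580 }

# Linear projection: scale attempt #2's time up to attempt #1's example count.
projected = attempt_two[:seconds] * attempt_one[:examples] / attempt_two[:examples]

suite_speedup = attempt_one[:seconds] / projected
single_speedup = attempt_one[:single_test] / attempt_two[:single_test]

puts format('projected suite time: %.2f seconds', projected)
puts format('full suite speedup:   %.0fx', suite_speedup)
puts format('single test speedup:  %.1fx', single_speedup)
```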

After months of working on this project, our feedback cycle has remained consistently fast. Large-scale refactorings no longer carry the burden of a slow feedback cycle. Refactoring is cheap and we can focus on producing the best design possible.