Hide Your Queries

Hide Your Queries

Mike Jansen
Mike Jansen

October 23, 2014

One of the quickest ways to turn a well-factored Rails project into an old-fashioned legacy app is to use ActiveRecord where and find_by_sql calls outside of the model classes.

These methods have the appearance of abstraction because of the flexibility of the ActiveRecord query methods, but they are really introducing reasons for change that classes outside the model should not have. Here are two common examples of this Single Responsibility violation I've seen across Rails projects:

Field Names Change

Let's say you have an articles table, which includes a published field. In Rails, this likely means you have an Article model. To get 10 published articles for a certain author, you use a where statement like this inside of a controller:

def published
			@articles = Article.where(:author_id => params[:author_id], :published => true).limit(10)
end

When you want 10 articles that are ready for review you use a where statement like this:

def ready_for_review
			@articles = Article.where(:author_id => params[:author_id], :ready_for_review => true).limit(10)
end

This week there is a call for a new feature around flagging articles as "modified after published". It becomes obvious that fields for each status are not the way to go. A migration to a status field is in order. This can contain values like ready-for-review, published, modified-after-published, and any other crazy thing the product team can think up.

This is a change we want to make, but now our calls inside the controller must be modified to use the new field and to check for a certain status.

Instead, from the very beginning consider adding a specific method to the Article class itself:

def self.ready_for_review_by_author(author_id)
			@articles = Article.where(:author_id => author_id, :ready_for_review => true).limit(10)
end

When the field changes, the data and the query logic is encapsulated in the model itself. The controller can simply call Article.ready_for_review_by_author, and all of the field-specific changes can happen in one place. Do the same for Article.published_by_author, and for the new method Article.modified_after_published_by_author.

In our simple example, we are saved from changing one place in a controller. The primary benefit, however, is keeping our database details from escaping out into the system.

Changing Databases

As an ORM, ActiveRecord gives tremendous power to the developer in terms of not having to worry about writing database-specific queries. It also allows the developer to write their own custom query statements as strings and pass them through a find_by_sql call.

For our ready for review by author example, let's say we are running on MS SQLServer 2008 and using the activerecord-sqlserver-adapter gem. An even more shortsighted way to implement this would be something like this:

def ready_for_review
			@articles = Article.find_by_sql('SELECT TOP 10 * FROM articles WHERE author_id = ? AND ready_for_review = ?', params[:author_id], true)
end

Our first problem is that we should be using the ORM to our advantage by using the limit method. This delegates the query creation to ActiveRecord and hides this detail from the developer.

The more alarming issue is that we now have an MS SQLServer 2008 dependency in our codebase in the form of the database-specific query language.

Inside of the model, this problem is contained. Outside of the model, it is always an SRP violation. It's a clear reason for change that should be grouped in the model itself. If we ever want to change away from MS SQLServer, our job is made harder the more references like this we have.

The problem is limited if you have an Article.ready_for_review_by_author(author_id) method defined, and the task of migration later becomes that much easier.

Costs

With these changes, the public interface of the model class has increased. There are more individual methods to maintain, and there is some slight duplication.

Consider how wide the ActiveRecord interface is to begin with, though—there are hundreds, even thousands of options for queries generated using method_missing for ActiveRecord.

In my experience, the cost of explicit duplication with flexibility for future data model changes is typically worth it. Given that you can refactor out the duplicate where calls within the model, you can reduce this cost further.

Encapsulate Your Data

These are simple examples, but this is how the pain begins. Consistently used over time, the problem splits and propagates like a virus, eventually crippling the project's ability to change.

If you have these Single Responsibility violations in your project, start fixing them today. Refactor and move these queries to the model, and the rest of your application will be better for it.