Do you use validates_uniqueness_of in Rails? Do you feel confident that it works to prevent duplicate records? If you’re like most of us, you won’t have given it much thought, but since I’m asking, you’re second-guessing yourself.
And indeed you should. More and more, the Rails documentation and community blogs reflect the problems with validates_uniqueness_of (see Further Reading below), and the fact that you can’t depend on it to prevent duplicates in your database.
“You can’t?!?” Right. It’s called validates_uniqueness_of, and it will perform that validation… right up until your project hits production.
First, let’s take a look at what happens normally. When saving an ActiveRecord object, your validations run first. So if a User model calls validates_uniqueness_of on its email, ActiveRecord will check the database to see if this User is the only one with that email address. If so, great! Save away!
Otherwise, the save fails, and errors are populated on the object to let you know what went wrong.
Everything looks good so far. Now what’s the big deal about your app going to production? Well, with any reasonably sized application, you’ll need multiple application server processes or at least threads (Mongrel, Passenger, Tomcat, etc.) to handle incoming traffic. And that’s when the problem strikes.
The Problem
Now Impatient Ian creates an account, quadruple-clicking the “Create Account” button after filling out his info. Let’s say either there isn’t any Javascripty double-click protection on that button or he has JavaScript turned off.
Now somewhere in the intertubes, those four requests are all vying for your app servers’ attention, and two of them happen to hit the app servers simultaneously.
Each process checks the database to see if the newly constructed but unsaved object is unique for the given scoping, as before. Neither has saved yet, so both of them say “yes”, and both of them go ahead and save. Hooray!
Wait. Not hooray—the opposite of that. Now we have duplicate records in our database, after we explicitly said that we didn’t want that to happen.
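To make the interleaving concrete, here is a minimal sketch in plain Ruby. It is not Rails code: UsersTable, email_taken? and simulate_double_submit are hypothetical names of mine, and the "table" is just an array standing in for a database with no unique index. The two requests are replayed by hand in the unlucky order described above.

```ruby
# Toy in-memory stand-in for a users table with NO unique index.
class UsersTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Roughly what the uniqueness validation asks the database.
  def email_taken?(email)
    @rows.any? { |row| row[:email] == email }
  end

  def insert(email)
    @rows << { :email => email }
  end
end

# Replays two simultaneous requests by hand:
# both run the validation before either one saves.
def simulate_double_submit(table, email)
  request_a_valid = !table.email_taken?(email) # process A validates
  request_b_valid = !table.email_taken?(email) # process B validates
  table.insert(email) if request_a_valid       # process A saves
  table.insert(email) if request_b_valid       # process B saves
  table.rows.count { |row| row[:email] == email }
end

puts simulate_double_submit(UsersTable.new, "ian@example.com") # prints 2
```

Both validations pass because neither insert has happened yet, so Ian ends up with two rows.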
So that’s the problem: validates_uniqueness_of doesn’t work as our intuitions might lead us to expect. What’s the solution? Well, if you need to make sure you’ve got no duplicate records, you’ll want a database-level constraint: a unique index.
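In a Rails migration, a unique index is typically added with something like add_index :users, :email, :unique => true (the table and column names here are assumptions based on the User example). To show why that closes the gap, here is a toy sketch in plain Ruby: UniquelyIndexedUsers and UniqueViolation are hypothetical stand-ins of mine, not real database or ActiveRecord classes. The point is that the check and the write happen as one atomic step.

```ruby
# Hypothetical error class standing in for the database's uniqueness violation.
class UniqueViolation < StandardError; end

# Toy in-memory table whose insert enforces uniqueness atomically,
# loosely the way a unique index does inside the database.
class UniquelyIndexedUsers
  attr_reader :rows

  def initialize
    @rows = []
    @lock = Mutex.new
  end

  def email_taken?(email)
    @rows.any? { |row| row[:email] == email }
  end

  # Check and write happen under one lock, so no writer can slip in between.
  def insert(email)
    @lock.synchronize do
      raise UniqueViolation, "duplicate email: #{email}" if email_taken?(email)
      @rows << { :email => email }
    end
  end
end

table = UniquelyIndexedUsers.new
a_valid = !table.email_taken?("ian@example.com") # process A validates
b_valid = !table.email_taken?("ian@example.com") # process B validates
table.insert("ian@example.com") if a_valid       # process A saves
begin
  table.insert("ian@example.com") if b_valid     # process B's save is rejected
rescue UniqueViolation => e
  puts "rejected: #{e.message}"
end
puts table.rows.size # prints 1
```

The validation still passes twice, but the second write is refused, so only one row survives.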
A has_one association has similar problems, though there are dozens of possible root causes of those in application logic, so I’ll leave that as an exercise for the reader.
A Solution: consistency_fail
consistency_fail is a brand-new gem I wrote that aims to make it easier to fix these problems. By installing the gem for Rails 3…
…or alternatively for Rails 2.3…
…you get a consistency_fail executable that will print out a report of the indexes you’re missing.
Here’s an example. We’ve got two models in this Rails 3 project, both of which have missing indexes:
So, getting an exhaustive list of missing indexes is as easy as consistency_fail:
The first column of the report, labeled Model, shows you where the call to validates_uniqueness_of or has_one issues from, and the Table Columns column shows you the table with the database columns that need a unique index. Multiple column names in parentheses mean that a composite unique index is required.
For performance reasons, you’ll want to think carefully about how you order the columns in a multicolumn unique index, but for our purposes, any ordering will work to enforce uniqueness.
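The idea behind such a report can be sketched in a few lines of plain Ruby. To be clear, this is not how consistency_fail is implemented, and the output format is not the gem's: the Declared struct, the missing_unique_indexes method, the Membership model and all the data below are hypothetical, chosen only to illustrate comparing declared uniqueness rules against existing unique indexes without caring about column order.

```ruby
# Hypothetical input: a uniqueness rule declared in a model.
Declared = Struct.new(:model, :table, :columns)

# Returns the declarations that no existing unique index covers.
# Columns are compared as sorted lists, because any column order
# enforces uniqueness equally well.
def missing_unique_indexes(declared, existing)
  covered = existing.map { |table, columns| [table, columns.map(&:to_s).sort] }
  declared.reject do |d|
    covered.include?([d.table, d.columns.map(&:to_s).sort])
  end
end

declared = [
  Declared.new("User", "users", [:email]),
  Declared.new("Membership", "memberships", [:user_id, :group_id])
]
# Unique indexes already present in the (imaginary) database.
existing = [["memberships", [:group_id, :user_id]]]

missing_unique_indexes(declared, existing).each do |d|
  puts "#{d.model}  #{d.table}  (#{d.columns.join(', ')})"
end
# prints: User  users  (email)
```

The composite index on memberships counts as present even though its columns are listed in the opposite order, which is the sense in which any ordering works for enforcing uniqueness.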
It’s worth noting here that by adding these indexes, we will protect our data, but at the risk of bubbling a database-level uniqueness violation exception to the top level, and potentially showing our user an error page.
Unless, that is, we want to hack ActiveRecord up to catch the database-specific exception (done it!) and give a nice explanation. So, buyer beware: for the time being, there’s application-specific code that will need to be written if handling these user-facing errors is a concern.
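The shape of that application-specific code is roughly "rescue the violation, turn it into an ordinary validation error". Here is a minimal sketch in plain Ruby, with a hypothetical Signup class and UniqueViolation error standing in for a real model and for whatever exception your adapter raises (ActiveRecord::RecordNotUnique, in Rails versions that provide it). The error message text is my own choice.

```ruby
# Hypothetical stand-in for the database-specific uniqueness exception.
class UniqueViolation < StandardError; end

# Minimal stand-in for a record: save either succeeds or records an error.
class Signup
  attr_reader :errors

  def initialize(email, taken_emails)
    @email = email
    @taken_emails = taken_emails # plays the role of the unique index
    @errors = []
  end

  # Returns true on success. On a uniqueness violation, returns false and
  # stores a readable message instead of letting the exception escape.
  def save
    write_row
    true
  rescue UniqueViolation
    @errors << "Email has already been taken"
    false
  end

  private

  # Simulates the INSERT being rejected by the unique index.
  def write_row
    raise UniqueViolation if @taken_emails.include?(@email)
    @taken_emails << @email
  end
end

taken = []
first = Signup.new("ian@example.com", taken)
second = Signup.new("ian@example.com", taken)
puts first.save            # prints true
puts second.save           # prints false
puts second.errors.inspect # prints ["Email has already been taken"]
```

With this in place, the losing request looks to the user like any other failed validation rather than an error page.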
I’d love to hear whether you find this reporting from consistency_fail to be valuable, and what improvements you’d like to see.