With the release of Rails 3.2, comes a host of new features. There are two in particular in ActiveRecord which seem to have attracted a lot of attention, so I thought I would look at them each more closely. This post, as you have no doubt realized from the title, is focused on ActiveRecord::Base.explain
This may come as a shock, but explain
is only new in Ruby on Rails; it - or similar features - has been around in the database world forever. Stated most simple, “The EXPLAIN
statement provides information about the execution plan for a SELECT
statement.” (MySQL 7.8.2. EXPLAIN Output Format) In other words, when you explain
a query, the database returns information about how it is going to go about finding all the data. It’s a very useful tool which can help developers uncover why queries are running slowly and what might be done to speed them up.
Let’s look at an example so we can see the value of explain
.
We’ll need a couple tables with data for our example. I’ll use users and user_restaurants.
class CreateUsers < ActiveRecord::Migration
def change
create_table :users do |t|
t.string :name
t.string :email
t.timestamps
end
end
end
class CreateUserRestaurants < ActiveRecord::Migration
def change
create_table :user_restaurants do |t|
t.integer :user_id
t.string :name
t.boolean :like
t.timestamps
end
end
end
The users table is self-explanatory. user_restaurants, on the other hand, is a table of restaurants a user likes (I guess that’s self-explanatory as well.)
Well need a couple of users, so let’s add them here.
class AddUsers < ActiveRecord::Migration
def up
User.create(:name => "John Galt", :email => "john@galtsgulch.com")
User.create(:name => "Howard Roark", :email => "howard@architect.com")
end
end
And our users will want to “like” or “dislike” restaurants, so we’ll need to create the liked restaurant records. I’ll name the restaurants “Restaurant n” to save myself from having to come up with 100,000 restaurant names. We’ll split those restaurants between the two users and define 20% of them to be restaurants the users did not “like”.
class AddRestaurants < ActiveRecord::Migration
def up
users = User.all
100000.times do |i|
UserRestaurant.create(:user_id => i % 2 == 0 ? users.first.id : users.second.id, :name => "Restaurant #{i}", :like => i % 5 == 0 ? false : true)
end
end
end
The last thing we have to do is associate the users table to the user_restaurants table.
class User < ActiveRecord::Base
has_many :user_restaurants
end
class UserRestaurant < ActiveRecord::Base
belongs_to :user
end
Okay, we’ve created our tables and populated them with data, and we’ve got our models and associations. Let’s run a query to find “John Galt’s” favorite restaurants and see what’s going on.
User.where(:email => "john@galtsgulch.com").
joins(:user_restaurants).where("user_restaurants.like = 1")
When I run this from the Rails Console, explain
kicks in automatically and returns the table below (this is output from MySQL, PostgreSQL looks and is worded differently). “Active Record monitors queries and if they take more than [config.active_record.auto_explain_threshold_in_seconds
] their query plan will be logged using warn.” (What’s new in Edge Rails: EXPLAIN)
Note: To manually execute explain on a query, just append .explain
to it.
+----+-------------+------------------+------+---------------+------+---------+------+--------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+------+---------------+------+---------+------+--------+--------------------------------+
| 1 | SIMPLE | users | ALL | PRIMARY | NULL | NULL | NULL | 2 | Using where |
| 1 | SIMPLE | user_restaurants | ALL | NULL | NULL | NULL | NULL | 100041 | Using where; Using join buffer |
+----+-------------+------------------+------+---------------+------+---------+------+--------+--------------------------------+
The columns you’re going to be most interested in are “key” and “rows”. The “key” columns lists the indexes, if any, the database uses to find data. “The rows column indicates the number of rows MySQL believes it must examine to execute the query.” 7.8.2. EXPLAIN Output Format
In the output above, we can see that the query would use no indexes, and would have to search through 100,041 rows (Hey, that’s more rows than are in the database). “For InnoDB tables, this number is an estimate, and may not always be exact.” 7.8.2. EXPLAIN Output Format
Let’s add some indexes to a couple of the key table columns and see if we can’t reduce the pain.
class AddTableIndexes < ActiveRecord::Migration
def up
add_index :users, :email
add_index :user_restaurants, :user_id
add_index :user_restaurants, :like
end
end
Now, when we run our query it still issues an explain (because it’s a really stupid example), but we now see how adding the indexes improved performance.
+----+-------------+------------------+------+------------------------------------------------------------------+-----------------------------------+---------+---------------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+------+------------------------------------------------------------------+-----------------------------------+---------+---------------------+-------+-------------+
| 1 | SIMPLE | users | ref | PRIMARY,index_users_on_email | index_users_on_email | 768 | const | 1 | Using where |
| 1 | SIMPLE | user_restaurants | ref | index_user_restaurants_on_user_id,index_user_restaurants_on_like | index_user_restaurants_on_user_id | 5 | ar_explain.users.id | 25010 | Using where |
+----+-------------+------------------+------+------------------------------------------------------------------+-----------------------------------+---------+---------------------+-------+-------------+
Looking at the new explain
output, we see two things: 1) the query is now using the indexes we created; and 2) the number of potential rows that have to be searched through has been quartered.
It may not seem like a big deal that we went from two to one for the search by email, but imagine if you had 10,000 users, and by using the index here it was still able to go directly to that one record. And if the data we were searching for was something other than a boolean, as it is with the “like” column, we would be able to further reduce the number of potential records through which were searched.
Of course, if we didn’t have our handy dandy explain
tool we might not even have realized it was a database problem to begin with.
As I alluded to in the beginning, there is nothing new about explain
. It’s an excellent tool which provides insight into what course of action the database will take to find the data you are searching for. I’ve been using explain
for some time directly from the database command line; it’s nice to finally have it accessible from the Rails Console.
I’ve purposely ignored the various implementations and configurations of explain
in order to focus more on why it’s useful. If you are interested in configuration and implementation, check out What’s new in Edge Rails: EXPLAIN.
##Further Reader