Tuesday, December 6, 2011

What's new in Edge Rails: EXPLAIN

Posted by fxn

There are some new features related to EXPLAIN in the forthcoming Ruby on Rails 3.2 we’d like to share:

  • Running EXPLAIN manually
  • Automatic EXPLAIN for slow queries
  • Silencing automatic EXPLAIN

As of this writing they are available for the adapters sqlite3, mysql2, and postgresql.

Running EXPLAIN Manually

You can now run EXPLAIN on the SQL generated by a relation this way:

User.where(:id => 1).joins(:posts).explain

The result of that method call is a string that carefully imitates the output of database shells. For example, under MySQL you get something similar to

EXPLAIN for: SELECT `users`.* FROM `users` INNER JOIN `posts` ON `posts`.`user_id` = `users`.`id` WHERE `users`.`id` = 1
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | Extra       |
|  1 | SIMPLE      | users | const | PRIMARY       | PRIMARY | 4       | const |    1 |             |
|  1 | SIMPLE      | posts | ALL   | NULL          | NULL    | NULL    | NULL  |    1 | Using where |
2 rows in set (0.00 sec)

and under PostgreSQL the same call yields something like

EXPLAIN for: SELECT "users".* FROM "users" INNER JOIN "posts" ON "posts"."user_id" = "users"."id" WHERE "users"."id" = 1
                                  QUERY PLAN
 Nested Loop Left Join  (cost=0.00..37.24 rows=8 width=0)
   Join Filter: (posts.user_id = users.id)
   ->  Index Scan using users_pkey on users  (cost=0.00..8.27 rows=1 width=4)
         Index Cond: (id = 1)
   ->  Seq Scan on posts  (cost=0.00..28.88 rows=8 width=4)
         Filter: (posts.user_id = 1)
(6 rows)

Please note that explain runs the query or queries and asks the database for their respective query plan afterwards. This is because due to eager loading a relation may trigger several queries to fetch the records and their associations, and in such cases some queries need the result of the previous ones.

If the relation triggers several queries the method still returns a single string with all the query plans. This is an output meant for human consumption so we preferred to present everything as a string in a format which is familiar right away rather than a structure.

Automatic EXPLAIN For Slow Queries

Rails 3.2 has the ability to help in detecting slow queries.

New applications get

config.active_record.auto_explain_threshold_in_seconds = 0.5

in config/environments/development.rb. Active Record monitors queries and if they take more than that threshold their query plan will be logged using warn.

That works for anything running find_by_sql (which is almost everything, since most of Active Record ends up calling that method). In the particular case of relations, the threshold is compared against the total time needed to fetch the records, not against the time taken by each individual query. Because conceptually we are concerned with the cost of the call

User.where(:id => 1).joins(:posts).explain

rather than the cost of the different queries that call may trigger due to the implementation.

By default the threshold is nil in the test and production environments, which means the feature is disabled.

The value of that parameter is nil also if the threshold is not set, so existing applications will need to add it by hand if they migrate to 3.2 to be able to enable automatic EXPLAIN.

Silencing Automatic EXPLAIN

With automatic EXPLAIN enabled, it could still be the case that some queries are just slow and you know they have to be. For example, a heavyweight report in the backoffice.

The macro silence_auto_explain allows you to avoid having EXPLAIN run repeatedly in those areas of code:

ActiveRecord::Base.silence_auto_explain do
  # no automatic EXPLAIN here

Interpreting Query Plans

The interpretation of the query plans is another topic, these are some pointers:

Happy debugging!