The Artima Developer Community
Sponsored Link

Ruby Buzz Forum
Slow queries: Rails habtm gotcha

0 replies on 1 page.

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 0 replies on 1 page
Patrick Lenz

Posts: 168
Nickname: scoop
Registered: Apr, 2005

Patrick Lenz is the lead developer at freshmeat.net and a contributor to the typo weblog engine
Slow queries: Rails habtm gotcha Posted: Nov 9, 2007 12:19 AM
Reply to this message Reply

This post originated from an RSS feed registered with Ruby Buzz by Patrick Lenz.
Original Post: Slow queries: Rails habtm gotcha
Feed Title: poocs.net
Feed URL: http://feeds.feedburner.com/poocsnet
Feed Description: Personal weblog about free and open source software, personal development projects and random geek buzz.
Latest Ruby Buzz Posts
Latest Ruby Buzz Posts by Patrick Lenz
Latest Posts From poocs.net

Advertisement

Back in the old days when some brave souls started with Rails, the framework would automatically supply helper methods to check the availability of associated records for every has_many or has_and_belongs_to_many association in the form of has_foos?.

For example, given a declaration like this:


class User
  has_and_belongs_to_many :interests
end

would get you User#has_interests? back in the days.

Fast forward to the present, these methods have long been removed from Rails core. There’s no 1:1 replacement, really. When you want to check a collection in a conditional for whether or not it’s empty, there are several things you can do.

Association Proxies, a quick review

Basically, every collection is a derivative of the Array class, as such most of Arrays methods have been (re-)implemented for the respective collection proxies.

First of all, there’s collection#length, which is the most brutal of them all because it simply loads the whole collection into memory and runs #size on it.

Next up is collection#count, which uses an SQL COUNT() statement to just fetch the number of associated records and is usually the quickest way at that.

Finally, there’s collection#size, which is basically a combination of both length and count since it behaves differently depending on whether or not the collection has already been loaded at the time it is being called. Given the collection has already been loaded (by a prior statement in your code, what have you) it’s identical to collection#length. If it hasn’t been loaded, its behavior is identical to collection#count.

Best practice

Judging from the rundown above, it’s usually best practice to simply use collection#size and trust it to do the right thing. In case you’re planning to use a collection as soon as you found out there are records associated with it, use collection#length to load the connection right away and then loop over it. Beware, however, that this is advisable only for those cases that don’t use pagination. (You’re using will_paginate for that, aren’t you?)

In a conditional, it makes most sense to use collection#empty?, which uses size#zero? internally.


unless user.interests.empty?
  # ...
end

The problem

99% of what I said above applies to both has_many and has_and_belongs_to_many associations. Admittedly, poor little habtm has had its momentum and seems to be on the way out of the cool kids’ block. Still, there are a few cases where it’s still the best fit (such as the User#interests example from first section of this article) and some people *do* have Rails code that’s been in existence for a few years. (Well, maybe only I do.)

So, the 1% that behaves differently is that size on a habtm collection is not as intelligent as its little brother from the has_many camp. In fact, it’s pretty dumb because it’s in every case behaving identical to length meaning it’s loading the whole collection. Ouch.

>> user.interests.size
  Join Table Columns (0.001284)   SHOW FIELDS FROM interests_users
  Interest Load (0.001263)   SELECT * FROM interests
    INNER JOIN interests_users
    ON interests.id = interests_users.interest_id
    WHERE (interests_users.user_id = 142841 )

The solution

You have to fill in the gaps of habtm#size’s lack of intelligence yourself. If you trap over habtm queries that take way too long for your taste, it might be about time to swap out that size call for an always speedy count call.

>> u.interests.count
  Interest Columns (0.001006)   SHOW FIELDS FROM interests
  SQL (0.009749)   SELECT count(*) AS count_all FROM interests
    INNER JOIN interests_users
    ON interests.id = interests_users.interest_id
    WHERE (interests_users.user_id = 142841 )

You would then also avoid using collection#empty? in your conditionals, since that would then resort back to using size internally. Instead, use the count#zero? form or define your own instance methods on the association.


if user.interests.count.zero?
  # ..
end

Postscript

To actually find out what Rails is transforming your method calls into (in terms of generated SQL) I prefer to use this technique by the always-wise Jamis Buck. Redirecting the ActiveRecord logger to the shell’s standard output will intermix the SQL statements with the output of your method calls, making everything relevant visible at first glance.

Read: Slow queries: Rails habtm gotcha

Topic: RubyURL Bookmarklet - Now with Google Maps Support Previous Topic   Next Topic Topic: Ruby and .NET - Making a Connection

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use