This post originated from an RSS feed registered with Ruby Buzz
by David Heinemeier Hansson.
Original Post: Twitter trouble
Feed Title: Loud Thinking
Feed URL: http://feeds.feedburner.com/LoudThinking
Feed Description: All about the full-stack, web-framework Rails for Ruby and on putting it to good effect with Basecamp
Twitter is an amazing success story in terms of rapid user uptake and flattering press. I had a chance to speak with the team a while back about the wild ride they've been on. At that time they were fielding spikes of up to 11,000 requests per second across some 16 cores with very little caching thrown into the mix to mitigate. No wonder their site had been feeling slow.
It sounded like they had a good plan at the time, though. Roll in a rack of new servers, look into doing substantial caching, and move beyond a single database server. The normal road to sanity employed by most any web application experiencing rapid growth.
The solutions to this are caching the hell out of everything and setting up multiple read-only slave databases, neither of which are quick fixes to implement. So it's not just cost, it's time, and time is that much more precious when people can['t] reach your site. None of these scaling approaches are as fun and easy as developing for Rails.
On some level, I can empathize with this sentiment. Rails makes the act of developing such a pleasant experience that when you need to follow the same scaling path as every other shared-nothing stack, the contrast can feel stark. And perhaps it's a natural reaction to feel a need to blame something for that contrast.
But I would still like to address two points raised in the interview. First, Alex mentions that scaling the application by adding more Mongrels and servers eventually puts a greater strain on the database. That's absolutely correct, but also the intended consequence, not an unexpected side-effect. There should have been no surprises there.
Scaling is the act of removing bottlenecks. When you remove one bottleneck (like application code execution), you tend to reveal another (like database queries). That's natural and means you're making progress. But you have to keep your marbles straight when doing this. If your bottleneck has moved to the database, you probably won't see big results by replacing pretty constructs with ugly ones. In other words, if a database query is taking 0.5 seconds, improving a loop from 0.05 to 0.01 seconds is not worth bothering with at this point.
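To put rough numbers on that last point, here is a small Ruby sketch of the arithmetic. The 0.5, 0.05 and 0.01 second figures come from the paragraph above; the method names and the "halve the query time" scenario are assumptions added purely for illustration.

```ruby
# Timings in seconds. The first three come from the paragraph above;
# the halved query time is a hypothetical comparison point.
DB_QUERY        = 0.5
LOOP_BEFORE     = 0.05
LOOP_AFTER      = 0.01
DB_QUERY_HALVED = DB_QUERY / 2

# Total time for one request, treating it as query time plus loop time.
def request_time(db_time, loop_time)
  db_time + loop_time
end

# Fraction of the original request time that was saved.
def improvement(before, after)
  (before - after) / before
end

baseline   = request_time(DB_QUERY, LOOP_BEFORE)        # about 0.55
tuned_loop = request_time(DB_QUERY, LOOP_AFTER)         # about 0.51
tuned_db   = request_time(DB_QUERY_HALVED, LOOP_BEFORE) # about 0.30

puts format("Faster loop:  %.1f%% less time per request", improvement(baseline, tuned_loop) * 100)
puts format("Faster query: %.1f%% less time per request", improvement(baseline, tuned_db) * 100)
```

Under these assumptions, making the loop five times faster trims the request by roughly 7%, while merely halving the query time trims it by roughly 45%. The effort belongs wherever the bottleneck currently sits.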
Second, when you work with open source and you discover new requirements not met by the software, it's your shining opportunity to give something back. Rather than just sit around idle waiting for some vendor to fix your problems, you get the unique chance of being a steward of your own destiny. To become a participant in the community rather than a mere spectator. This is especially true with frameworks like Rails that are implemented in high-level languages like Ruby. The barriers to contribution are exceptionally low.
In this case, it seems that Twitter requires more sophisticated ways of talking to many databases at the same time. Alex puts it a little black and white with "...there's no facility in Rails to talk to more than one database at a time", which isn't really true, but it could definitely be done better. Last I spoke with Twitter, we discussed this and they sounded enthusiastic about being able to further this area of Rails. It's disappointing to hear that they've forsaken that opportunity for an arms-crossed alternative.
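As a rough illustration of what "talking to many databases" can mean, here is a plain-Ruby sketch of sending writes to one primary and spreading reads across read-only replicas. It is not Rails code and not anything Twitter or Rails is known to have used: `FakeConnection`, `ReadWriteRouter` and the SQL strings are all hypothetical names made up for this example. Within Rails itself, a model class can call `establish_connection` with its own settings, which is one reason the quoted claim is too black and white.

```ruby
# Hypothetical stand-in for a database connection. It only records
# the statements it receives so the routing can be observed.
class FakeConnection
  attr_reader :name, :log

  def initialize(name)
    @name = name
    @log  = []
  end

  def execute(sql)
    @log << sql
    "#{name}: #{sql}"
  end
end

# Hypothetical router: SELECT statements rotate across the replicas,
# everything else goes to the primary.
class ReadWriteRouter
  def initialize(primary:, replicas:)
    raise ArgumentError, "need at least one replica" if replicas.empty?

    @primary  = primary
    @replicas = replicas
    @next     = 0
  end

  def execute(sql)
    connection_for(sql).execute(sql)
  end

  private

  def connection_for(sql)
    return @primary unless sql.lstrip.match?(/\Aselect\b/i)

    replica = @replicas[@next % @replicas.size]
    @next += 1
    replica
  end
end

primary  = FakeConnection.new("primary")
replicas = [FakeConnection.new("replica-1"), FakeConnection.new("replica-2")]
router   = ReadWriteRouter.new(primary: primary, replicas: replicas)

puts router.execute("INSERT INTO updates (body) VALUES ('hello')")
puts router.execute("SELECT * FROM updates")
puts router.execute("SELECT * FROM users")
```

A real setup would also have to account for replication lag, since a read sent to a replica right after a write may not see that write yet. That is part of why this kind of change is not a quick fix.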
In any case, I'm proud that Twitter is having to push the envelope on scaling Rails. Fielding 11,000+ requests per second is no small feat for any dynamic web application. Once the stress of having to deal with that in the moment subsides, I'm convinced that the team will grow beyond the blame game, get their hands dirty as full participants in an open source community, and contribute back their advances to the framework. We'll all be richer for it.