Frameworks that Scale

Via Ted Leung, an interview with Alex Payne of Twitter.  Ted highlighted the comments about Ruby's performance, but what jumped out at me were the comments on Ruby on Rails:

The common wisdom in the Rails community at this time is that scaling Rails is a matter of cost: just throw more CPUs at it. The problem is that more instances of Rails (running as part of a Mongrel cluster, in our case) means more requests to your database...The solutions to this are caching the hell out of everything and setting up multiple read-only slave databases, neither of which are quick fixes to implement. So it’s not just cost, it’s time, and time is that much more precious when people can[’t] reach your site.

None of these scaling approaches are as fun and easy as developing for Rails. All the convenience methods and syntactical sugar that makes Rails such a pleasure for coders ends up being absolutely punishing, performance-wise. Once you hit a certain threshold of traffic, either you need to strip out all the costly neat stuff that Rails does for you (RJS, ActiveRecord, ActiveSupport, etc.) or move the slow parts of your application out of Rails, or both.

This fault isn't unique to Rails.  I've done enough .NET development to have seen the same thing, and .NET doesn't even provide the level of syntactic sugar that Rails does.  Alex identifies the problem perfectly: it's difficult and expensive to scale at the DB layer.  It seems to me that if I were creating something I wanted to be successful, I'd start with a couple of assumptions:

  1. You will need to talk to the cache much more than to the backing store.
  2. You'll need a way to monitor the system in real time, all the time, and find out how it's performing.  It's like having a patient in the ICU, with the EKG monitor on at all times, ready to set off bells to alert the nursing staff.
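
To make those two assumptions concrete, here is a minimal Python sketch of a read-through cache that counts its own hits and misses.  Everything in it is an illustration rather than anything Rails or Twitter actually ships: `ReadThroughCache`, `slow_lookup`, and the 60-second expiry are names and numbers I made up, and a real site would use a shared cache rather than a dictionary inside one process.

```python
import time
from typing import Callable, Dict, List, Tuple


class ReadThroughCache:
    """Serve reads from memory; go to the backing store only on a miss."""

    def __init__(self, load: Callable[[str], str], ttl_seconds: float = 60.0) -> None:
        self._load = load          # hypothetical loader, e.g. a database query
        self._ttl = ttl_seconds    # assumed expiry; tune per data type
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.hits = 0              # counters are the "EKG" for the cache
        self.misses = 0

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired: ask the backing store and remember the answer.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


store_calls: List[str] = []


def slow_lookup(key: str) -> str:
    """Stand-in for the expensive backing store; records each call."""
    store_calls.append(key)
    return f"profile for {key}"


cache = ReadThroughCache(slow_lookup)
for _ in range(10):
    cache.get("alice")
```

Ten reads of the same key reach the backing store once.  The hit rate is the sort of number you would want on a dashboard with an alarm attached, since a falling hit rate means the database is about to take the load.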

It seems like this is where frameworks need to go.  Having a nice way to talk to the backing store (DB, DBM, flat file) is wonderful and allows you to get a site up quickly, but it's not pushing you down the road to scalability.  I'm wondering now if you shouldn't start by building a site that only lives in memory and think about persistence later - almost like working with an editor, where you start by typing and hit save as an afterthought.
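
Here is roughly what I mean by the editor analogy, as a small Python sketch.  The `Timeline` class and its methods are hypothetical, and JSON on local disk is only the simplest stand-in I could think of for "save"; the point is that the site logic never touches storage unless someone explicitly asks it to.

```python
import json
import os
import tempfile
from typing import Dict, List


class Timeline:
    """All reads and writes hit memory; save() is an explicit afterthought."""

    def __init__(self) -> None:
        self._posts: Dict[str, List[str]] = {}

    def post(self, user: str, text: str) -> None:
        self._posts.setdefault(user, []).append(text)

    def recent(self, user: str, limit: int = 20) -> List[str]:
        return self._posts.get(user, [])[-limit:]

    def save(self, path: str) -> None:
        # Write to a temporary file, then rename it over the target, so a
        # crash in the middle of a save cannot leave a half-written file.
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as handle:
            json.dump(self._posts, handle)
        os.replace(tmp_path, path)

    @classmethod
    def load(cls, path: str) -> "Timeline":
        timeline = cls()
        with open(path) as handle:
            timeline._posts = json.load(handle)
        return timeline


timeline = Timeline()
timeline.post("alice", "hello")
timeline.post("alice", "still here")
```

The obvious cost of this design is that anything typed since the last save is lost in a crash, which is exactly the tradeoff an editor makes.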

— Gordon Weakliem at permanent link

More on Scaling Ruby on Rails

Also commenting on the RoR interview I posted about earlier today, LaughingMeme:

More importantly he gives a quick insight into the how of making social software scale. It’s hard, it has ugly network effects, it makes databases cry. Alex mentions cache like mad. (because frankly no one but the content creator needs to see fresh data)

Also denormalize like mad, federate like mad, and prune features that make your site slow.

You’ll never build a successful site if you build to scale from day 1, scaling is always a catch up game, but it’s the best game there is.

Kellan, having worked on Flickr, probably knows these lessons better than I do.  Good points all: Rails intentionally trades framework performance for developer productivity, and every site has to deal with scaling at some point.  I just wonder if this isn't a false dichotomy.  I read comments in the original interview that suggested that RoR already has some sort of caching built in.  It seems to me that caching ought to be the default.  This is all just mind candy for me at the moment, but I really have to wonder if you can design in scaling, to some extent, from the beginning.  It seems like treating the application as in-memory only, with persistence as a feature to be added later, would be the way you'd go about it.  It seems like everybody concludes somewhat hastily that they need a database, and starts the project by building one.  RoR definitely fits that model - you build your database schema and tell Rails to generate code to fit.  Databases are a royal pain.  I'm really wondering if "database first" is the right way to go.
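
As a thought experiment only, "persistence as a feature to be added later" might look like the Python sketch below.  `MemoryStore` and `on_write` are invented for the example, and the `journal` list stands in for whatever you would really write to (a file, a queue, a database).  The application is written against the in-memory store from day one, and durability is bolted on by registering a listener, without changing any calling code.

```python
from typing import Callable, Dict, List, Optional

Listener = Callable[[str, str], None]


class MemoryStore:
    """The in-memory copy is the source of truth; persistence is optional."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._listeners: List[Listener] = []

    def on_write(self, listener: Listener) -> None:
        """Register a callback that is told about every later write."""
        self._listeners.append(listener)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value
        for listener in self._listeners:
            listener(key, value)

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


store = MemoryStore()
store.put("greeting", "hello")      # day one: nothing is persisted at all

journal: List[str] = []             # hypothetical stand-in for durable storage
store.on_write(lambda key, value: journal.append(f"{key}={value}"))
store.put("greeting", "hello again")
```

Only writes made after the listener is attached reach the journal, so a real version would also need a way to dump the existing state once when persistence is switched on.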

— Gordon Weakliem at permanent link