Low hanging fruit for Ruby performance optimization in Rails

Our goal: we currently spend about 150-180 ms in Ruby for an average web request, and we think it’s reasonable to improve that to be around 100 ms.

Two of the most expensive operations in Ruby are: object allocation and garbage collection. Unfortunately, in Ruby (unlike Java), garbage collection is synchronous and can have a major impact on the performance of requests. If you’ve ever noticed that a partial template rendering occasionally randomly takes a couple of seconds, you’re probably observing a request triggering garbage collection.

The good news: it’s easy to quickly (< 10 min) see how much your app is impacted from garbage collection. You’re likely to improve your performance by 20-30% just by tuning your garbage collection parameters.

If you have a production Rails app, and you’re even remotely interested in performance, I’m going to assume you’re using the excellent New Relic service.  If you’re using REE or a version of Ruby with the Railsbench GC patches, it’s easy to turn on garbage collection stats that will be visualized by New Relic.

You’ll get pretty charts like this:

by adding:


somewhere in your initialization.  After enabling garbage collection stats in our application, we can see that approximately 20% of our Ruby time is spent in garbage collection, which implies that there’s also a not-so-insignificant portion of time spent in object allocation.

What’s next

We last tuned our Ruby garbage collection parameters about 5 months ago and, after an initial performance boost, we’ve seen the application time spent in Ruby creep back up.  To try to bring the response time back down, our next steps are to:

  • Consider taking another pass at garbage collection parameter tuning.  Since we’ve already taken one pass at this, I’m not sure if we’ll be as impactful the second time around, but we’ll see.
  • Identify the slowest controller actions via New Relic and profile them using ruby-prof or perftools.

Performance tuning using ruby-prof is likely going to vary a lot depending on the code, but if we find techniques that might apply more broadly, we’ll be sure to blog about it here.