Adventures in Scaling, Part 1: Using REE

The engineering team at Miso has been quite busy the last few weeks hacking on new features for the site, as well as on experimental ideas regarding the ‘future of social TV’. We are also busy fleshing out the Developer Platform that allows others to build applications using our API and to embed widgets displaying a user’s latest watching activity. [A post on building an OAuth REST API is forthcoming…] As always, we are very focused on infrastructure and stability as our user base grows.

There are a lot of important lessons we have learned as we continue to scale our application. An ongoing theme of this blog will be detailing our adventures in scaling our various web services as traffic grows. We decided to start simple: this first post explains how developers hosting a Rails application on Ruby 1.8.7 can use Ruby Enterprise Edition and take advantage of its performance tuning capabilities.

Ruby Enterprise Edition is, very simply, an improved version of the MRI 1.8.7 Ruby runtime. The improvements include a copy-on-write friendly garbage collector, an improved memory allocator, the ability to debug and tune garbage collection, and various threading bug fixes and performance improvements. In short, if you are using 1.8.7 in your Rails or Rack application anyway, there is really no reason not to switch to REE. We have been using it in production for months and the benefits are worth the switch.

First, let’s get REE installed on your servers. For this tutorial, we will assume you are running 32-bit Ubuntu in production (if not, certain details might be slightly different):

wget http://rubyenterpriseedition.googlecode.com/files/ruby-enterprise_1.8.7-2011.01_i386_ubuntu10.04.deb
sudo dpkg -i ruby-enterprise_1.8.7-2011.01_i386_ubuntu10.04.deb

This will install the REE package onto your Ubuntu system, under the /usr/local/ directory by default. Don’t worry, it can happily co-exist with your existing Ruby installation. Next, you need to reconfigure your web server to use this version of Ruby. If you are using Passenger, for instance, install the passenger gem with REE’s gem command, run the Passenger installer for the nginx or apache module, and then point your web server configuration at the new Ruby.
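For example, with Passenger the steps look roughly like the following. This is a sketch assuming REE’s default /usr/local prefix; the exact paths may differ on your system:

# Verify the new interpreter identifies itself as Ruby Enterprise Edition
/usr/local/bin/ruby -v

# Install Passenger against REE and build its nginx (or apache) module
sudo /usr/local/bin/gem install passenger
sudo /usr/local/bin/passenger-install-nginx-module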

REE comes with the ability to tune performance by tweaking the garbage collector. This can have a significant impact on your application and is worth the extra effort. We have seen a 20-30% increase in Ruby performance simply by fine-tuning these parameters. To tune Ruby, we need to create a wrapper script that sets the appropriate variables and then launches Ruby.
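Your mileage will vary, so it is worth a quick measurement on your own workload before settling on values; the variables can be set inline for a one-off run. A rough sketch, assuming your application has a GC-heavy rake task such as a test suite (the values preview the settings discussed below):

# Baseline run with REE's default GC settings
time /usr/local/bin/ruby -S rake spec

# Same task with tuned GC settings applied for this run only
time env RUBY_HEAP_MIN_SLOTS=800000 RUBY_GC_MALLOC_LIMIT=79000000 \
  /usr/local/bin/ruby -S rake spec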

There are many different recommended settings for the variables and these do depend on your application. Let’s take a look at the various adjustable options for the garbage collector. Each has a different effect on the performance of your server:

RUBY_HEAP_MIN_SLOTS

The first option controls the initial number of heap slots Ruby will allocate upon startup. This affects memory usage, because the larger the heap, the more initial memory is required. However, most Ruby applications need much more memory than the default allocation provides. Increasing this value decreases the startup time of your application and increases throughput. The default is 10000 slots, but the recommended range is between 500000 and 1250000. For most applications I have tested, the sweet spot is roughly 800000.

RUBY_HEAP_SLOTS_INCREMENT

This option is the number of additional heap slots that will be allocated the first time Ruby is forced to allocate new heap slots. The default value is 10000, but this is lower than most Rails applications can benefit from. The recommended range is between 100000 and 300000; with a higher value, Ruby grows its heap in much larger steps, which allows for better throughput and faster response times depending on your application. Our recommended setting is 250000.

RUBY_HEAP_SLOTS_GROWTH_FACTOR

This option is the multiplier Ruby uses to calculate the number of new slots to allocate the next time it needs more heap slots. The default is 1.8, but if you raise the slots increment to a much higher value as recommended above, this should be changed to 1: each allocation is already sized correctly, so there is no need for the increments themselves to grow.

RUBY_GC_MALLOC_LIMIT

This option is the amount of memory that can be allocated for data structures before a garbage collection sweep occurs. This value is really important, because the garbage collector in Ruby 1.8 can be very slow, and minimizing how frequently it executes can significantly increase performance. The default is 8000000 (bytes), but recommended values range from 30000000 to 80000000, which allows many more structures to be created before a collection is triggered. This means more memory consumption in exchange for less frequent sweeps, which can translate to significant performance gains.
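You can watch this frequency for yourself: REE (unlike stock MRI 1.8.7) exposes GC statistics methods such as GC.enable_stats, GC.collections, and GC.time. A minimal sketch with a synthetic allocation loop:

# Count the collections triggered by ~100k short-lived strings
/usr/local/bin/ruby -e '
  GC.enable_stats
  100_000.times { "x" * 100 }
  puts "collections: #{GC.collections}"
  puts "time in GC (usec): #{GC.time}"
'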

RUBY_HEAP_FREE_MIN

This option is the number of heap slots that should be free after a garbage collection has executed; it determines when Ruby must allocate more heap space because too little remains free after a sweep. The default value is 4096, but a much higher value should be used in conjunction with the other settings above. A value closer to 100000 is more suitable: allocations will happen less frequently, and each one will add a much larger number of slots, as defined above. This translates to higher performance in most cases.

Bringing it all together

These changes significantly impact the way Ruby manages memory and performs garbage collection. Ruby now starts with enough memory to hold the application from the initial launch; normally it starts with far too little for a production web application. Memory grows linearly as more is required, rather than with the default exponential growth, and garbage collection happens far less frequently during the execution of your application. The downside is higher peak memory usage; the upside is significant performance gains.

In our case, we ended up using settings very similar to GitHub and Twitter. We will show those settings below, but feel free to research and tweak based on your own analysis of your application’s needs.

Let’s create a wrapper for the tweaked ruby settings and save it to /usr/local/bin/tuned_ruby:

#!/bin/bash
export RUBY_HEAP_MIN_SLOTS=800000
export RUBY_HEAP_FREE_MIN=100000
export RUBY_HEAP_SLOTS_INCREMENT=300000
export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
export RUBY_GC_MALLOC_LIMIT=79000000
exec "/usr/local/bin/ruby" "$@"

Then let’s set the appropriate permissions:

sudo chmod a+x /usr/local/bin/tuned_ruby
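As a quick sanity check, the wrapper should run and report the same version string as the REE binary it wraps:

/usr/local/bin/tuned_ruby -v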

Once that tweaked ruby is executable, change your configuration so that the web server uses this new version of ruby. For instance in Passenger 3 for Nginx (our deployment tool of choice), the change would look like:

# /etc/nginx/conf/includes/passenger.conf
# ...
passenger_ruby /usr/local/bin/tuned_ruby;
# ...

Then restart your web server for the changes to take effect. In our case:

sudo god restart nginx

Now your application will be using the new tuned REE runtime and will likely be more memory efficient, with decreased response times for users. Miso uses a variety of profiling tools to gauge the performance of our application (post forthcoming), but it is important to at least mention that after switching to REE and changing these parameters, profiling to measure the actual impact is essential.
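If you are on Passenger, passenger-memory-stats is a quick way to watch the memory side of this trade-off, since it reports the real private memory of each worker process. The path assumes the gem binaries landed in REE’s /usr/local/bin:

sudo /usr/local/bin/passenger-memory-stats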

This is only the first blog post of this series. We will be releasing another one soon about database optimization (for Postgres or MySQL) and how to tune your database for better performance as you scale.

Easy Monitoring of Varnish with Munin

If you’re looking for a reverse proxy server, Varnish is an excellent choice. It’s fast, and it’s used by Facebook and Twitter, as well as plenty of others. For most sites, it can be used effectively pretty much out of the box with minimal tuning.

Like many decently-sized Rails apps, we leverage a lot of open source code. Dozens of gems and plugins, a variety of cloud services, Varnish and Nginx for caching and load balancing, and various persistence solutions. The point is, as our app usage has grown over the last year, we’ve had our share of stressful, on-the-fly debugging while our app was down. That’s not the best time to learn about all the fun nuances and interactions of your technology stack.

It’s a good idea to know what your services are doing and the key metrics to watch, so you’re better prepared when you hit those inevitable scaling pain points. New Relic has been tremendously useful for monitoring and debugging our database and Rails app. The rest of this post goes over some key metrics for Varnish and setting up Munin to monitor them.

Optimizing and Inspecting Varnish

Unless your application has an extremely high volume of traffic, you likely won’t have to optimize Varnish itself (e.g., cache sizes, thread pool settings, etc.). Most of the work will be in verifying that your resources have appropriate HTTP caching parameters (Expires/max-age and ETag/Last-Modified). You’re most of the way there if you do the following:

  • Run Varnish on a 64-bit machine. It’ll run on a 32-bit machine, but it likes the virtual address space of a 64-bit machine. Also, Varnish’s test suites are only run on 64-bit distributions.
  • Normalize the hostname. e.g., www.website.com => website.com, to avoid caching the same resource multiple times. Details here.
  • Unset cookies for any resource that should be cacheable (this and the hostname rule are sketched in VCL just after this list). Details here.
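A minimal VCL sketch of those last two rules, in Varnish 2.x syntax; the hostname and file extensions are placeholders to adapt for your site:

sub vcl_recv {
    # Normalize www.website.com to website.com so each resource is cached once
    if (req.http.host ~ "^www\.website\.com$") {
        set req.http.host = "website.com";
    }
    # Strip cookies from static assets so Varnish can cache them
    if (req.url ~ "\.(css|js|gif|jpg|png|ico)$") {
        unset req.http.Cookie;
    }
}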

Varnish includes a variety of command line tools to inspect what Varnish is doing. SSH into the server running Varnish, and let’s take a look.

Inspecting an individual resource

First, let’s look at how Varnish handles an individual resource. On a client machine, point a web browser to a resource cached by Varnish. On the server, type:

$ varnishlog -c -o ReqStart <IP address of client machine>

The output of this command will be communication between the client machine and Varnish. In another SSH terminal, type:

$ varnishlog -b -o TxHeader <IP address of client machine>

The output of this command will be communication between Varnish and a backend server (i.e., an origin server, the actual application). Try reloading the resource in the browser. If it is cached correctly, you shouldn’t see any communication between Varnish and any backend servers. If you do see something printed there, inspect the HTTP caching headers and verify they are correct.
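You can also check the headers from the client side. A sketch with curl (substitute a real URL from your site); note that Varnish’s X-Varnish response header contains one request ID on a cache miss and two on a hit:

$ curl -sI http://website.com/some/resource | \
    egrep -i 'age|cache-control|expires|etag|last-modified|x-varnish'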

Varnish statistics

Now that we’ve seen that Varnish is working for an individual resource, let’s see how it’s doing overall. In your SSH session, type:

$ varnishstat

The most important metrics to note here are the hitrate and the uptime. Varnish has a parent process whose only function is to monitor and restart a child process. If Varnish is restarting itself frequently, that’s something to be investigated by looking at its output in /var/log/syslog.
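varnishstat can also print its counters once and exit, which is handy for scripts or for a quick snapshot of the hit rate:

$ varnishstat -1 | egrep 'uptime|cache_hit|cache_miss'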

Other than that, check out Varnishstat For Dummies for a good overview.

It’s great that we can check on Varnish fairly easily, but the key is to automate this process; otherwise, it can be very difficult to detect warning patterns early. Also, it’s not realistic to have a huge, manual, pre-flight checklist to check on the health of all your services. Enter Munin…

Get Started with Munin in 15 minutes

Munin is a monitoring tool with a plug-in framework. Munin nodes periodically report back to a Munin server, which collects the data and generates an HTML page with graphs. The default install of Munin contains a plug-in for reporting Varnish statistics, and it generates a variety of graphs.

Installing Munin

If you’re installing Munin on an Ubuntu machine (or any distribution that uses apt), use the commands below. For other platforms, see the installation instructions here.

For every server you want to monitor, type:

$ sudo apt-get install munin-node

Designate a server to collect the data. The server can also be a Munin node. On the server, type:

$ sudo apt-get install munin

Configuring Munin

For each node, open the configuration file at /etc/munin/munin-node.conf and add the IP address of the Munin server. The value is a regular expression, so escape the dots:

allow ^xxx\.xxx\.xxx\.xxx$

After you modify the configuration file, restart the Munin node by typing:

$ sudo service munin-node restart

For the server, open the configuration file at /etc/munin/munin.conf. Add each node that you want to monitor.

[Domain;serverA]
  address xxx.xxx.xxx.xxx
  use_node_name yes

Choose any value you like for Domain and serverA above; the names are purely for organization. When the Munin server was installed, it also installed a cron job that runs every 5 minutes and collects data from each node. After editing the configuration file, wait 5 minutes for the charts to be generated. If you’re impatient, type:

$ sudo -u munin /usr/bin/munin-cron

View Munin Graphs

If you have lighttpd or Apache, point it at /var/cache/munin/www. If the charts have been generated properly, there should be an index.html file in that directory.
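Since the rest of this stack runs nginx, here is a hypothetical nginx equivalent (server name and port are placeholders):

# /etc/nginx/conf/includes/munin.conf
server {
    listen 80;
    server_name munin.example.com;
    root /var/cache/munin/www;
    index index.html;
}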

Troubleshooting Munin

If the Munin charts aren’t being generated:

  • Make sure the directories listed in /etc/munin/munin.conf exist and have appropriate permissions for the munin user (a quick check is shown below).
  • Try manually executing munin-cron and look for any error output.
  • Check /var/log/syslog for any Munin-related errors.
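A quick way to eyeball those directories and their ownership (the paths are Ubuntu defaults):

$ ls -ld /var/lib/munin /var/log/munin /var/cache/munin/www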

Conclusion

That’s it! Varnish is optimized and working correctly, and Munin is reporting the important stats so you can sleep easy at night. Enjoy!

Additional Resources

Web caching references

Caching Tutorial – Excellent overview of web caching by Mark Nottingham.
Things Caches Do – Overview of reverse proxy caches like Varnish and Rack-Cache.
HTTP 1.1 Caching Specification – Official HTTP 1.1 Caching Specification.

Varnish references

A Varnish Crash Course For Aspiring Sysadmins
Varnishstat for Dummies
Varnish Best Practices
Achieving a High Hitrate

Munin references

Munin Tutorial