Getting Vagrant VMware Plugin Installed on OS X El Capitan and Homebrew

I finally decided to bite the bullet this morning and install VMware Fusion 8 and the corresponding Vagrant plugin.

Both were paid upgrades (€41.42 + $39 from Fusion 7), which stung a bit given that I had bought the previous versions only a couple of months earlier. Yet the word on the street was that the new version would be much more stable and less CPU-hungry than the previous generation. So what the heck – maybe I’d again be able to get more than a couple of hours of productive work done without draining the battery.

The purchase process itself was painless, as was installing VMware itself. But when I started installing the plugin, an ugly yak raised its hairy head.

The first thing I tried was to just use the old plugin with the new VMware version. Would it perhaps work?

➜  ~ vagrant suspend
This provider only works with VMware Fusion 5.x, 6.x, or 7.x. You have
Fusion '8.0.1'. Please install the proper version of VMware
Fusion and try again.

That sounds like a resounding “No”.

So onwards: I bought a license for the plugin as well and tried to install it.

➜  ~ vagrant plugin install vagrant-vmware-fusion
Installing the 'vagrant-vmware-fusion' plugin. This can take a few minutes...
Bundler, the underlying system Vagrant uses to install plugins,
reported an error. The error is shown below. These errors are usually
caused by misconfigured plugin installations or transient network
issues. The error from Bundler is:

An error occurred while installing hitimes (1.2.3), and Bundler cannot continue.
Make sure that `gem install hitimes -v '1.2.3'` succeeds before bundling.

Gem::Installer::ExtensionBuildError: ERROR: Failed to build gem native extension.

    /opt/vagrant/embedded/bin/ruby extconf.rb
creating Makefile

make "DESTDIR="


Agreeing to the Xcode/iOS license requires admin privileges, please re-run as root via sudo.

No biggie – a little Googling confirmed what I suspected: I didn’t actually need to run this as root, I just needed to accept the EULA of the latest Xcode version before running the install process again. I popped open Xcode, YOLOed the agreement, and was back in the terminal.

➜  ~ vagrant plugin install vagrant-vmware-fusion
Installing the 'vagrant-vmware-fusion' plugin. This can take a few minutes...
Bundler, the underlying system Vagrant uses to install plugins,
reported an error. The error is shown below. These errors are usually
caused by misconfigured plugin installations or transient network
issues. The error from Bundler is:

An error occurred while installing eventmachine (1.0.8), and Bundler cannot continue.
Make sure that `gem install eventmachine -v '1.0.8'` succeeds before bundling.

Gem::Installer::ExtensionBuildError: ERROR: Failed to build gem native extension.

[LOTS OF CRUFT REMOVED FOR BREVITY]

make "DESTDIR="
compiling binder.cpp
warning: unknown warning option '-Werror=unused-command-line-argument-hard-error-in-future'; did you mean '-Werror=unused-command-line-argument'? [-Wunknown-warning-option]
In file included from binder.cpp:20:
./project.h:116:10: fatal error: 'openssl/ssl.h' file not found
#include <openssl/ssl.h>
         ^
1 warning and 1 error generated.
make: *** [binder.o] Error 1


Gem files will remain installed in /Users/jarkko/.vagrant.d/gems/gems/eventmachine-1.0.8 for inspection.
Results logged to /Users/jarkko/.vagrant.d/gems/gems/eventmachine-1.0.8/ext/gem_make.out

Again, the reason is clear: I need to build the gem against the proper OpenSSL headers – in this case Homebrew’s, with -I/usr/local/opt/openssl/include. Thus:

➜  ~  gem install eventmachine -v '1.0.8' -- --with-cppflags=-I/usr/local/opt/openssl/include
Fetching: eventmachine-1.0.8.gem (100%)
Building native extensions with: '--with-cppflags=-I/usr/local/opt/openssl/include'
This could take a while...
Successfully installed eventmachine-1.0.8
Parsing documentation for eventmachine-1.0.8
Installing ri documentation for eventmachine-1.0.8
Done installing documentation for eventmachine after 6 seconds
1 gem installed

Perfect. So I tried to install the Vagrant plugin again – and got the same error. Crap.

It turns out that Vagrant uses its own gem folder, so it doesn’t pick up what’s installed in your primary gem directory. The question was: how do I tell Vagrant to use the correct cpp flags in its build process?

Fortunately I didn’t have to figure that out, because the end of the error message above gave me enough of a pointer towards a solution: if I could just install the eventmachine gem by hand to the correct location with the proper cpp flags, I should be fine. But how?

I dug into gem -h, and the defaults listed at the end gave the correct option away: with the --install-dir option I could install the gem wherever I wanted. Thus:

~  gem install eventmachine -v '1.0.8' --install-dir /Users/jarkko/.vagrant.d/gems  -- --with-cppflags=-I/usr/local/opt/openssl/include
Fetching: eventmachine-1.0.8.gem (100%)
Building native extensions with: '--with-cppflags=-I/usr/local/opt/openssl/include'
This could take a while...
Successfully installed eventmachine-1.0.8
Parsing documentation for eventmachine-1.0.8
Installing ri documentation for eventmachine-1.0.8
Done installing documentation for eventmachine after 5 seconds
1 gem installed

Et voilà. Now one more try:

➜    vagrant plugin install vagrant-vmware-fusion
Installed the plugin 'vagrant-vmware-fusion (4.0.2)'!

…and we’re off to the races!

…well, at least almost.

➜  ~ vagrant up
Bringing machine 'default' up with 'vmware_fusion' provider...
==> default: Verifying vmnet devices are healthy...
==> default: Preparing network adapters...
==> default: Starting the VMware VM...
An error occurred while executing `vmrun`, a utility for controlling
VMware machines. The command and output are below:

Command: ["start", "/Users/jarkko/vmware/955a09a5-f7f6-451e-b565-22f41c8fced0/packer-ubuntu-14.04-amd64.vmx", "nogui", {:notify=>[:stdout, :stderr], :timeout=>45}]

Stdout: 2015-10-26T11:10:45.943| ServiceImpl_Opener: PID 84443
Error: The operation was canceled

Stderr:

Oh well, that wasn’t very helpful. So I turned to the proven trick #1: I killed the VMware app on OS X – and its menubar daemon, just to be certain – and sure enough, that did it. The Vagrant VM is now back on track.

How Do I Know Whether My Rails App Is Thread-safe or Not?

Photo by Joseph Francis, used under a Creative Commons license.

In January Heroku started promoting Puma as the preferred web server for Rails apps deployed on its hugely successful platform. Puma – as a threaded app server – can better use the scarce resources available for an app running on Heroku.

This is obviously good for the client, since they can now serve more concurrent users with a single Dyno. However, it’s also good for Heroku itself, since small apps (probably the vast majority of apps deployed on Heroku) will now consume far fewer resources on its servers.

The recommendation comes with a caveat, however. Your app needs to be thread-safe. The problem with this is that there is no simple way to say with absolute certainty whether an app as a whole is thread-safe. We can get close, however.

Let’s have a look at how.

For the purpose of this issue, an app can be split into three parts:

  1. The app code itself.
  2. Rails the framework.
  3. Any 3rd party gems used by the app.

All three of these need to be thread-safe. Rails and its included gems have been declared thread-safe since version 2.2, i.e. since 2008. This alone, however, does not automatically make your app as a whole thread-safe. Your own app code and all the gems you use need to be thread-safe as well.

What is and what isn’t thread-safe in Ruby?

So when is your app code not thread-safe? Simply put, when you share mutable state between threads in your app.

But what does this even mean?

None of the core data structures in Ruby (except for Queue) are thread-safe. The structures are mutable, and when shared between threads, there are no guarantees that the threads won’t overwrite each other’s changes. Fortunately, such sharing rarely happens in Rails apps.

Any code that comprises more than a single operation (meaning a single Ruby operation implemented in C) is not thread-safe. The classic example of this is the += operator, which is in fact two operations combined, = and +. Thus, the final value of the shared variable in the following code is indeterminate:

@n = 0
threads = 3.times.map do
  Thread.start { 100.times { @n += 1 } }
end
threads.each(&:join) # wait for all the threads before reading @n
@n # may be less than 300 when increments are lost

However, neither of the two things above alone makes code thread-unsafe. Code only becomes thread-unsafe when they are combined with shared data. Let’s get back to that in a minute, but first…

Aside: But what about GIL?

More informed readers might object at this point that MRI Ruby uses a GIL, a.k.a. Global Interpreter Lock.

The general wisdom on the street is that GIL is bad because it does not let your threads run in parallel (true, in a sense), but good, because it makes your code thread-safe.

Unfortunately, the GIL does not make your code thread-safe. It only guarantees that two threads can’t run Ruby code at the same time, which is how it inhibits parallelism. Threads can still be paused and resumed at any given point, however, which means that they absolutely can clobber each other’s data.

GIL does accidentally make some operations (such as Array#<<) atomic. However, there are two issues with this:

  • It only applies to cases where what you’re doing is truly a single Ruby operation. When what you’re doing is multiple operations, context switches can and will happen, and you won’t be happy.
  • It only applies to MRI. JRuby and Rubinius support true parallelism and thus don’t use a GIL. I wouldn’t count on the GIL being there forever in MRI either, so relying on it to guarantee your code’s thread-safety is irresponsible at best.

Go read Jesse Storimer’s Nobody understands the GIL (also parts 2 and 3) for much more detail about it (than you can probably even stomach). But for the love of the flying spaghetti monster, don’t count on it making your app thread-safe.

Thread-safety in Rails the framework

A bit of history:

Rails and its dependencies were declared thread-safe already in version 2.2, in 2008. At that point, however, the consensus was that so many third party libraries were not thread-safe that the whole request in Rails was enclosed within a giant mutex lock. This meant that while a request was being processed, no other thread in the same process could be running.

In order to take advantage of threaded execution, you had to declare in your app configuration that you really wanted to ditch the lock:

config.threadsafe!

However, en route to Rails 4 Aaron Tenderlove Patterson demonstrated that what config.threadsafe! did was

  • effectively irrelevant in multi-process environments (such as Unicorn), where a single process never processed multiple requests concurrently.
  • absolutely necessary every time you used a threaded server such as Puma or Passenger Enterprise.

What this meant was that there was no reason not to have the thread-safe option always on. And that is exactly what was done for Rails 4 in 2012.

Key takeaway: Rails and its dependencies are thread-safe. You don’t have to do anything to “turn that feature on”.

Making your app code thread-safe

Good news: since Rails uses the shared-nothing architecture, Rails apps lend themselves very well to thread-safety. In general, Rails creates a new controller object for every HTTP request, and everything else flows from there. This isolates most objects in a Rails app from other requests.

As noted above, built-in Ruby data structures (save for Queue) are not thread-safe. This does not, however, matter unless you actually share them between threads. Because of the way Rails is architected, this almost never happens in a Rails app.

There are, however, some patterns that can come bite you in the ass when you want to switch to a threaded app server.

Global variables

Global variables are, well, global. This means they are shared between threads. If you weren’t already convinced you should avoid global variables, here’s another reason to never touch them. If you really want to share something globally across an app, you are more than likely better served by a constant (but see below) anyway.
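
As a minimal sketch of the problem – the counts here are purely illustrative – a global counter mutated from several threads can lose updates, because += is a read followed by a separate write:

$counter = 0

threads = 10.times.map do
  Thread.new { 1_000.times { $counter += 1 } }
end
threads.each(&:join)

# On a truly parallel Ruby (JRuby, Rubinius) this will often print less
# than 10000; even on MRI there is no guarantee it won't.
puts $counter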

Class variables

For the purpose of a discussion about threads, class variables are not much different from global variables. They are shared across threads just the same way.

The problem isn’t so much about using class variables, but about mutating them. And if you are not going to mutate a class variable, in many cases a constant is again a better choice.

Class instance variables

But maybe you’ve read that you should always use class instance variables instead of class variables in Ruby. Well, maybe you should, but they are just as problematic for threaded programs as class variables.

It’s worth pointing out that both class variables and class instance variables can also be set by class methods. This isn’t such an issue in your own code, but you can easily fall into this trap when calling other APIs. Here’s an example from Pratik Naik where the app developer strays into thread-unsafe territory just by calling Rails class methods:

class HomeController < ApplicationController
  before_filter :set_site

  def index
  end

  private

  def set_site
    @site = Site.find_by_subdomain(request.subdomains.first)
    if @site.layout?
      self.class.layout(@site.layout_name)
    else
      self.class.layout('default_lay')
    end
  end
end

In this case, calling the layout method causes Rails to set the class instance variable @_layout on the controller class. If two concurrent requests (served by two threads) hit this code simultaneously, they might end up in a race condition and overwrite each other’s layouts.

In this case, the correct way to set the layout is to use a symbol with the layout call:

class HomeController < ApplicationController
  before_filter :set_site
  layout :site_layout

  def index
  end

  private

  def set_site
    @site = Site.find_by_subdomain(request.subdomains.first)
  end

  def site_layout
    if @site.layout?
      @site.layout_name
    else
      'default_lay'
    end
  end
end

However, this is beside the point. The point is, you might end up using class variables and class instance variables by accident, thus making your app thread-unsafe.

Memoization

Memoization is a technique where you lazily set a variable if it is not already set. It is commonly used when computing the value is at least moderately expensive and the result is used several times within a request.

A common case would be to set the current user in a controller:

class SekritController < ApplicationController
  before_filter :set_user

  private

  def set_user
    @current_user ||= User.find(session[:user_id])
  end
end

Memoization by itself is not a thread safety issue. However, it can easily become one for a couple of reasons:

  • It is often used to store data in class variables or class instance variables (see the previous points).
  • The ||= operator is in fact two operations, so there is a potential context switch happening in the middle of it, causing a race condition between threads (see the sketch below).
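
To see why the second point bites, here’s roughly what ||= expands to (a simplified sketch; the exact semantics differ slightly in edge cases):

# @current_user ||= User.find(session[:user_id]) behaves like:
@current_user || (@current_user = User.find(session[:user_id]))

# A thread can be paused between the read (@current_user) and the
# write (@current_user = ...), so two threads may both see nil, both
# run the query, and the second one overwrites the first.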

It would be easy to dismiss memoization as the cause of the problem, and tell people just to avoid class variables and class instance variables. However, the issue is more complex than that.

In this issue, Evan Phoenix squashes a really tricky race condition bug in the Rails codebase caused by calling super in a memoization function. So even though you would only be using instance variables, you might end up with race conditions with memoization.

What’s a developer to do, then?

  • Make sure memoization actually makes sense and makes a difference in your case. In many cases Rails caches the result anyway, so you may not be saving many resources, if any, with your memoization method.
  • Don’t memoize to class variables or class instance variables. If you need to memoize something on the class level, use thread local variables (Thread.current[:baz]) instead. Be aware, though, that it is still kind of a global variable. So while it’s thread-safe, it still might not be good coding practice.
def set_expensive_var
  Thread.current[:expensive_var] ||= MyModel.find(session[:goo_id])
end
  • If you absolutely think you must share the result across threads, use a mutex to synchronize the memoizing part of your code. Keep in mind, though, that you’re kinda breaking the shared-nothing model of Rails with that. It’s kind of a half-assed sharing method anyway, since it only works across threads, not across processes.

    Also keep in mind that a mutex only saves you from race conditions inside itself. It doesn’t help you a whole lot with class variables unless you put the lock around the whole controller action, which was exactly what we wanted to avoid in the first place.

class GooController < ApplicationController
  @@lock = Mutex.new
  before_filter :set_expensive_var

  private

  def set_expensive_var
    @@lock.synchronize do
      @@stupid_class_var ||= Foo.bar(params[:subdomain])
    end
  end
end
  • Use different instance variable names when you use inheritance and super in memoization methods.
class Foo
  def env_config
    @env_config ||= {foo: 'foo', bar: 'bar'}
  end
end

class Bar < Foo
  def env_config
    @bar_env_config ||= super.merge({foo: 'baz'})
  end
end

Constants

Yes, constants. You didn’t believe constants are really constant in Ruby, did you? Well, they kinda are:

irb(main):008:0> CON
=> [1]
irb(main):009:0> CON = [1,2]
(irb):9: warning: already initialized constant CON

So you do get a warning when trying to reassign a constant, but the reassignment still goes through. That’s not the real problem, though. The real issue is that the constancy of constants only applies to the object reference, not the referenced object. And if the referenced object can be mutated, you have a problem.

Yeah, you remember right. All the core data structures in Ruby are mutable.

irb(main):010:0> CON
=> [1, 2]
irb(main):011:0> CON << 3
=> [1, 2, 3]
irb(main):012:0> CON
=> [1, 2, 3]

Of course, you should never, ever do this. And few will. There’s a catch, however. Since Ruby assignment copies references, not objects, you might end up mutating a constant by accident.

irb(main):010:0> CON
=> [1, 2]
irb(main):011:0> arr = CON
=> [1, 2]
irb(main):012:0> arr << 3
=> [1, 2, 3]
irb(main):013:0> CON
=> [1, 2, 3]

If you want to be sure that your constants are never mutated, you can freeze them upon creation:

irb(main):001:0> CON = [1,2,3].freeze
=> [1, 2, 3]
irb(main):002:0> CON << 4
RuntimeError: can't modify frozen Array
  from (irb):2
  from /Users/jarkko/.rbenv/versions/2.1.2/bin/irb:11:in `<main>'

Keep in mind, though, that freeze is shallow. It only applies to the actual Array object in this case, not its items.
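
For example, freezing an array of arrays doesn’t freeze the inner arrays, so they can still be mutated:

irb(main):001:0> CON = [[1], [2]].freeze
=> [[1], [2]]
irb(main):002:0> CON << [3]
RuntimeError: can't modify frozen Array
irb(main):003:0> CON[0] << 42
=> [1, 42]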

Environment variables

ENV is really just a hash-like construct referenced by a constant. Thus, everything that applies to constants above also applies to it.

ENV['version'] = "1.2" # Don't do this

Making sure 3rd party code is thread-safe

If you want your app to be thread-safe, all the third-party code it uses also needs to be thread-safe in the context of your app.

The first thing you should probably do with any gem is to read through its documentation and Google whether it is deemed thread-safe. That said, even if it is, there’s no escaping double-checking yourself. Yes, by reading through the source code.

As a general rule, all that I wrote above about making your own code thread-safe applies here as well. However…

With 3rd party gems and Rails plugins, context matters.

If the third-party code you use is just a library that your own code calls, you’re fairly safe (provided you use it in a thread-safe way yourself). It can be thread-unsafe just the same way Array is, but if you don’t share its structures between threads, you’re more or less fine.

However, many Rails plugins actually extend or modify the Rails classes, in which case all bets are off. In this case, you need to scrutinize the library code much, much more thoroughly.

So how do you know which type of the two above a gem or plugin is? Well, you don’t. Until you read the code, that is. But you are reading the code anyway, aren’t you?

What smells to look for in third party code?

Everything we mentioned above regarding your own code applies.

  • Class variables (@@foo)
  • Class instance variables (@bar, trickier to find since they look the same as any old ivar)
  • Constants, ENV variables, and any variables through which they can be accidentally mutated.
  • Memoization, especially when one of the two above points are involved
  • Creation of new threads (Thread.new, Thread.start). These obviously aren’t smells by themselves. However, the risks mentioned above only materialize when data is shared across threads, so you should at least be familiar with the cases in which the library spawns new threads.

Again, context matters. Nothing above alone makes code thread-unsafe. Even sharing data with them doesn’t. But modifying that data does. So pay close attention to whether the libs provide methods that can be used to modify shared data.
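
A crude but useful starting point is to unpack a gem and grep its source for the smells listed above. This is only a sketch – somegem and its version are placeholders for whatever you’re auditing:

$ gem unpack somegem                         # unpacks to ./somegem-1.2.3
$ grep -rn '@@' somegem-1.2.3/lib            # class variables
$ grep -rnE 'Thread\.(new|start)' somegem-1.2.3/lib
$ grep -rn '||=' somegem-1.2.3/lib           # memoization sites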

The final bad news

No matter how thoroughly you read through the code in your application and the gems it uses, you cannot be 100% sure that the whole is thread-safe. Heck, even running and profiling the code in a test environment might not reveal lingering thread safety issues.

This is because many race conditions only appear under serious, concurrent load. That’s why you should both try to squash them from the code and keep a close eye on your production environment on a continual basis. Your app being perfectly thread-safe today does not guarantee the same is true a couple of sprints later.

Recap

To make a Rails app thread-safe, you have to make sure the code is thread-safe on three different levels:

  • Rails framework and its dependencies.
  • Your app code.
  • Any third party code you use.

The first one of these is handled for you, unless you do stupid shit with it (like the memoization example above). The rest is your responsibility.

The main thing to keep in mind is to never mutate data that is shared across threads. Most often this happens through class variables, class instance variables, or by accidentally mutating objects that are referenced by a constant.

There are, however, some pretty esoteric ways an app can end up thread-unsafe, so be prepared to track down and fix the last remaining threading issues while running in production.

Have fun!

Acknowledgments: Thanks to James Tucker, Evan Phoenix, and the whole Bear Metal gang for providing feedback for the drafts of this article.

Rails Garbage Collection: Tuning Approaches

MRI maintainers have put a tremendous amount of work into improving the garbage collector in Ruby 2.0 through 2.2. The engine has thus gained a lot more horsepower. However, it’s still not trivial to get the most out of it. In this post we’re going to gain a better understanding of how and what to tune for.

Photo by Steve Arnold, used under a Creative Commons license.

Koichi Sasada (_ko1, Ruby MRI maintainer) famously mentioned in a presentation (slide 89):

Try GC parameters

  • There is no silver bullet
    • No one answer for all applications
    • You should not believe other applications settings easily
  • Try and try and try!

This is true in theory but a whole lot harder to pull off in practice due to three primary problems:

  • Interpreter GC semantics and configuration change over time.
  • One GC config isn’t optimal for all app runtime contexts: tests, requests, background jobs, rake tasks, etc.
  • During the lifetime and development cycles of a project, it’s very likely that existing GC settings are invalidated quickly.

An evolving Garbage Collector

The garbage collector has changed frequently in recent MRI Ruby releases. The changes have also broken many existing assumptions and the environment variables that tune the GC. Compare GC.stat on Ruby 2.1:

{ :count=>7, :heap_used=>66, :heap_length=>66, :heap_increment=>0,
  :heap_live_slot=>26397, :heap_free_slot=>507, :heap_final_slot=>0,
  :heap_swept_slot=>10698, :heap_eden_page_length=>66, :heap_tomb_page_length=>0,
  :total_allocated_object=>75494, :total_freed_object=>49097,
  :malloc_increase=>465840, :malloc_limit=>16777216, :minor_gc_count=>5,
  :major_gc_count=>2, :remembered_shady_object=>175,
  :remembered_shady_object_limit=>322, :old_object=>9109, :old_object_limit=>15116,
  :oldmalloc_increase=>1136080, :oldmalloc_limit=>16777216 }

…with Ruby 2.2:

{ :count=>6, :heap_allocated_pages=>74, :heap_sorted_length=>75,
  :heap_allocatable_pages=>0, :heap_available_slots=>30162, :heap_live_slots=>29729,
  :heap_free_slots=>433, :heap_final_slots=>0, :heap_marked_slots=>14752,
  :heap_swept_slots=>11520, :heap_eden_pages=>74, :heap_tomb_pages=>0,
  :total_allocated_pages=>74, :total_freed_pages=>0,
  :total_allocated_objects=>76976, :total_freed_objects=>47247,
  :malloc_increase_bytes=>449520, :malloc_increase_bytes_limit=>16777216,
  :minor_gc_count=>4, :major_gc_count=>2, :remembered_wb_unprotected_objects=>169,
  :remembered_wb_unprotected_objects_limit=>278, :old_objects=>9337,
  :old_objects_limit=>10806, :oldmalloc_increase_bytes=>1147760,
  :oldmalloc_increase_bytes_limit=>16777216 }

In Ruby 2.2 we can see a lot more to introspect and tune, but this also comes with a steep learning curve which is (and should be) out of scope for most developers.

One codebase, different roles

A modern Rails application is typically used day to day in different contexts:

  • Running tests
  • rake tasks
  • database migrations
  • background jobs

They all start pretty much the same way with the VM compiling code to instruction sequences. Different roles affect the Ruby heap and the garbage collector in very different ways, however.

This job typically runs for 13 minutes, triggers 133 GC cycles and allocates a metric ton of objects. Allocations are very bursty and in batches.

class CartCleanupJob < ActiveJob::Base
  queue_as :default

  def perform(*args)
    Cart.cleanup(Time.now)
  end
end

This controller action allocates 24,555 objects. Allocator throughput isn’t very variable.

class CartsController < ApplicationController
  def show
    @cart = Cart.find(params[:id])
  end
end

Our test case contributes 175 objects to the heap. Test cases generally are very variable and bursty in allocation patterns.

def test_add_to_cart
  cart = carts(:empty)
  cart << products(:butter)
  assert_equal 1, cart.items.size
end

The default GC behavior isn’t optimal for all of these execution paths within the same project and neither is throwing a single set of RUBY_GC_* environment variables at it.

We’d like to refer to processing in these different contexts as “units of work”.

Fast development cycles

During the lifetime and development cycle of a project, it’s very likely that garbage collector settings that were valid yesterday aren’t optimal anymore after the next two sprints. Changes to your Gemfile, rolling out new features, and bumping the Ruby interpreter all affect the garbage collector.

source 'https://rubygems.org'

ruby '2.2.0' # Invalidates most existing RUBY_GC_* variables

gem 'mail' # slots galore

Process lifecycle events

Let’s have a look at a few events that are important during the lifetime of a process. They help the tuner to gain valuable insights into how well the garbage collector is working and how to further optimize it. They all hint at how the heap changes and what triggered a GC cycle.

How many mutations happened for example while

  • processing a request
  • between booting the app and processing a request
  • during the lifetime of the application?

When it booted

When the application is ready to start doing work. For a Rails application, this is typically when the app has been fully loaded in production: ready to serve requests, ready to accept background work, etc. All source files have been loaded and most resources acquired.

When processing started

At the start of a unit of work. Typically the start of an HTTP request, when a background job has been popped off a queue, the start of a test case or any other type of processing that is the primary purpose of running the process.

When processing ended

At the end of a unit of work. Typically the end of a HTTP request, when a background job has finished processing, the end of a test case or any other type of processing that is the primary purpose of running the process.

When it terminated

Triggered when the application terminates.

Knowing when and why GC happens

Tracking GC cycles interleaved with the aforementioned application events yields insights into why a particular GC cycle happens. The progression from BOOTED to TERMINATED, and everything in between, is important because mutations that happen during the fifth HTTP request of a new Rails process also contribute to a GC cycle during request number eight.
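
As a minimal sketch of this kind of bookkeeping, you can snapshot GC.stat around a unit of work and diff the counters. Here process_unit_of_work is a hypothetical stand-in for handling one request, job or test case:

before = GC.stat

process_unit_of_work

after = GC.stat
puts "minor GCs: #{after[:minor_gc_count] - before[:minor_gc_count]}"
puts "major GCs: #{after[:major_gc_count] - before[:major_gc_count]}"
puts "allocated: #{after[:total_allocated_objects] - before[:total_allocated_objects]} objects"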

On tuning

Primarily the garbage collector exposes tuning variables in these three categories:

  • Heap slot values: where Ruby objects live
  • Malloc limits: off heap storage for large strings, arrays and other structures
  • Growth factors: by how much to grow slots, malloc limits etc.

Tuning GC parameters is generally a tradeoff between tuning for speed (thus using more memory) and tuning for low memory usage while giving up speed. We think it’s possible to infer a reasonable set of defaults from observing the application at runtime that is conservative with memory, yet maintains reasonable throughput.
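
To make the tradeoff concrete, here’s a hypothetical illustration using real RUBY_GC_* variables – the values are made up for contrast, not recommendations:

# Tuned for speed: start with a big heap and grow it aggressively,
# so GC cycles are rare (at the cost of memory).
$ RUBY_GC_HEAP_INIT_SLOTS=1000000 RUBY_GC_HEAP_GROWTH_FACTOR=1.8 ruby script/bench.rb

# Tuned for memory: start small and grow slowly, trading
# throughput for a smaller resident set.
$ RUBY_GC_HEAP_INIT_SLOTS=10000 RUBY_GC_HEAP_GROWTH_FACTOR=1.1 ruby script/bench.rb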

A solution

We’ve been working for a few weeks on a product, TuneMyGC, that attempts to do just that. Our goals and objectives are:

  • A repeatable and systematic tuning process that respects fast development cycles
  • It should have awareness of runtime profiles being different for HTTP requests, background job processing etc.
  • It should support current mainline Ruby versions without developers having to keep up to date with changes
  • Deliver reasonable memory footprints with better runtime performance
  • Provide better insights into GC characteristics both for app owners and possibly also ruby-core

Here’s an example of Discourse being automatically tuned for better 99th percentile throughput. Response times in milliseconds, 200 requests:

Controller    GC defaults    Tuned GC
categories    227            160
home          163            113
topic         55             40
user          92             76

GC defaults:

$ RUBY_GC_TUNE=1 RUBY_GC_TOKEN=a5a672761b25265ec62a1140e21fc81f ruby script/bench.rb -m -i 200

Raw GC stats from Discourse’s bench.rb script:

GC STATS:
count: 106
heap_allocated_pages: 2447
heap_sorted_length: 2455
heap_allocatable_pages: 95
heap_available_slots: 997407
heap_live_slots: 464541
heap_free_slots: 532866
heap_final_slots: 0
heap_marked_slots: 464530
heap_swept_slots: 532876
heap_eden_pages: 2352
heap_tomb_pages: 95
total_allocated_pages: 2447
total_freed_pages: 0
total_allocated_objects: 27169276
total_freed_objects: 26704735
malloc_increase_bytes: 4352
malloc_increase_bytes_limit: 16777216
minor_gc_count: 91
major_gc_count: 15
remembered_wb_unprotected_objects: 11669
remembered_wb_unprotected_objects_limit: 23338
old_objects: 435435
old_objects_limit: 870870
oldmalloc_increase_bytes: 4736
oldmalloc_increase_bytes_limit: 30286118

TuneMyGC recommendations

$ RUBY_GC_TUNE=1 RUBY_GC_TOKEN=a5a672761b25265ec62a1140e21fc81f RUBY_GC_HEAP_INIT_SLOTS=997339 RUBY_GC_HEAP_FREE_SLOTS=626600 RUBY_GC_HEAP_GROWTH_FACTOR=1.03 RUBY_GC_HEAP_GROWTH_MAX_SLOTS=88792 RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=2.4 RUBY_GC_MALLOC_LIMIT=34393793 RUBY_GC_MALLOC_LIMIT_MAX=41272552 RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR=1.32 RUBY_GC_OLDMALLOC_LIMIT=39339204 RUBY_GC_OLDMALLOC_LIMIT_MAX=47207045 RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR=1.2 ruby script/bench.rb -m -i 200

Raw GC stats from Discourse’s bench.rb script:

GC STATS:
count: 44
heap_allocated_pages: 2893
heap_sorted_length: 2953
heap_allocatable_pages: 161
heap_available_slots: 1179182
heap_live_slots: 460935
heap_free_slots: 718247
heap_final_slots: 0
heap_marked_slots: 460925
heap_swept_slots: 718277
heap_eden_pages: 2732
heap_tomb_pages: 161
total_allocated_pages: 2893
total_freed_pages: 0
total_allocated_objects: 27167493
total_freed_objects: 26706558
malloc_increase_bytes: 4352
malloc_increase_bytes_limit: 34393793
minor_gc_count: 34
major_gc_count: 10
remembered_wb_unprotected_objects: 11659
remembered_wb_unprotected_objects_limit: 27981
old_objects: 431838
old_objects_limit: 1036411
oldmalloc_increase_bytes: 4736
oldmalloc_increase_bytes_limit: 39339204

We can see a couple of interesting points here:

  • There is much less GC activity – only 44 rounds instead of 106.
  • Slot buffers are still decent for high throughput. There are 718247 free slots (heap_free_slots) out of 1179182 available slots (heap_available_slots), which is 64% of the current live objects (heap_live_slots). This value, however, is slightly skewed because the Discourse benchmark script forces a major GC before dumping these stats – there are about as many swept slots as free slots (heap_swept_slots).
  • Malloc limits (malloc_increase_bytes_limit and oldmalloc_increase_bytes_limit) and growth factors (old_objects_limit and remembered_wb_unprotected_objects_limit) are in line with actual app usage. The TuneMyGC service considers when limits and growth factors are bumped during the app lifecycle and attempts to raise limits via environment variables slightly higher to prevent excessive GC activity.

Now it’s your turn.

Feel free to take your Rails app for a spin too!

1. Add to your Gemfile.

gem 'tunemygc'

2. Register your Rails application.

$ bundle exec tunemygc -r [email protected]
      Application registered. Use RUBY_GC_TOKEN=08de9e8822c847244b31290cedfc1d51 in your environment.

3. Boot your app. We’ll recommend an optimal GC configuration when the process ends.

$ RUBY_GC_TOKEN=08de9e8822c847244b31290cedfc1d51 RUBY_GC_TUNE=1 bundle exec rails s

It’s Not About You

This is a slightly expanded version of a talk I gave in Rails Girls Helsinki in February 2015.

Photo by Sean MacEntee, used under a Creative Commons license.

I used to suffer from terrible stage fright. I was super nervous every time I presented. I forgot stuff I was supposed to say on stage. I never vomited before a talk, though, I’ll give you that.

It got better over time through lots of practice, but I still get all sweaty and shaky before getting on stage.

Then I recently stumbled upon an article by Kathy Sierra called Presentation Skills Considered Harmful. In it she tells about having had similar problems, and how all the tutorials told her what you should do, and how, to give a good presentation. You, you, you.

Then she realized that a presentation is really a UX – just a user experience. You present ideas – hopefully good ones – to your audience. What does that make you, the presenter? A UI. You are just a user interface. You yourself don’t matter that much. All that matters is that your ideas touch your audience.

Is that bad? No, it’s great. It’s a huge relief. What matters is not you but what you have to say, the meat of your talk.

And that brings us nicely to the topic of this article.

It’s not about you.

This is maybe the most important thing I’ve learned during the past decade1. It’s not only profound; it spans your entire life.

Writing

These days, everyone and their cat has a blog.

Most blogs are about the author, which is totally fine — if the author writes for themselves or their relatives. But when you think about the most helpful blogs – the ones you go back to all the time – they really aren’t about the author. They are about helping the reader, either by informing them or by teaching them new things.

But Jarkko, I hear you say. You just started this article with a story about yourself.

That is correct. Storytelling gets kind of an exception.

Except not really. Because storytelling isn’t really about the storyteller, either. You know, we didn’t always know how to write. Telling stories was the only way to pass information to younger generations. Thus, our brains are quite literally wired to react to storytelling. We’re evolutionarily built to learn better from stories.

Thus, stories are not so much about you, the teller, either. Stories are about the listener/reader, and how they relate to the protagonist of the narrative.

And in the end, unless you’re writing fiction, stories are just a tool as well. A powerful one, yes, but just a tool to bring home a lesson to the reader.

Because writing is not about you.

The Cincinnati Enquirer learned this the hard way recently. After they laid off their whole copy desk, they were shocked to find out that readers were outraged by the deteriorating quality of the paper’s articles. John E. McIntyre describes the issue vividly:

The reader doesn’t care how hard you worked, what pressures you are under, or how good your intentions are. The reader sees the product, online or in print; if the product looks sloppy and substandard, the reader will form, and likely express, a low opinion of it. And the reader is under no obligation whatsoever to be kind.

What the Enquirer forgot was that their writing is really not about them.

But you don’t wanna hear me babble about writing, let’s get to business.

Business ideas

Are you still looking for that killer business idea?

Stop.

Ideas are all about you. The story goes that to get into business, you should – apparently through divine intervention or something – come up with a dazzling idea.

Ideas are also dangerous because once you get one, it makes you a victim of confirmation bias. You’re going to start retrofitting the world to your idea, which is totally backwards.

Instead, find an audience you can relate to and sell to. Then find out about their pains, their problems, and ways to help them make more money. Then solve that pain and you’ll do well.

Because – to quote Peter Drucker – the purpose of business is to create and keep a customer. It’s that customer, not you, who is going to decide the fate of your business.

Because successful business isn’t about you.

Marketing

You know what’s special about Apple ads? They almost never list features or specs. Instead, they show what their users can do with the products. Shoot video. FaceTime with their relatives on the other side of the globe. Throw a DJ gig with just an iPad or two.

My friend Amy Hoy has this formula for great sales copy called PDF, for Pain–Dream–Fix:

  1. Start with describing the pain your users are having, with crisp, clear words, so that they will go all ahhh, they understand me.
  2. Then flip it around. “Imagine a world where you wouldn’t have to manually scan your taxi receipts. Instead, they would magically appear in your accounting system.”
  3. And fix. “We got you covered. Uberreceipt 9000 will hook up Uber with your accounting and you will never again touch another taxi receipt!”

Now that is how you make people pay attention.

Because great marketing and sales copy is not about you, either.

Finally: Product

Last, and indeed least, we get to the actual product, software in our case.

It – in and of itself – is not all that important. Because having a great product is not about you, or even the product itself. It’s about solving a customer’s pain, or a job they need to tackle.

Because people don’t buy a quarter-inch drill, they buy a quarter-inch hole in the wall.

By now, you already know the pains of your audience2. Now it’s time to solve them. From previous points, you should already have a nice roadmap to a product people like.

Extra credit: Make your users kick ass

For an extra credit, let’s see how we could transform people from liking your product to loving and raving about it.

We started this talk with Kathy Sierra, and we’re going to end it with her as well.

Kathy’s big idea is that the main purpose of software is to make its users kick ass. What she means by this is that software – or any product, really – should help its users get better at what they do, not just at using the product3.

Screw gamification.

Final Cut Pro should not make its users better at using Final Cut Pro. It should make them better film editors.

WODConnect should not make its users better at using the app. It should make them stronger and faster.

This should happen through everything related to your product. The product itself, its marketing, its manuals, its customer support, everything.

Because your success is not about you or your product.

It’s about the users.

It’s about empathy.


Discuss on Hacker News.


  1. Well, let’s just say it’s a tie with learning about the growth mindset.

  2. If not, go back to the Business Ideas part above. Do not – and I can’t stress this enough – start building a product before you are sure it solves a problem people have, know they have, and are willing to pay for.

  3. All this is also the subject of Kathy’s upcoming book, Badass: Making Users Awesome

The Tremendous Scale of AWS and the Hidden Benefit of the Cloud

Photo by Susanne Nilsson, used under a Creative Commons license.

Finland, 2013. Vantaa, the fourth-largest municipality in Finland, buys a new web form for welfare applications from CGI (née Logica, née WM-Data) for a whopping €1.9 million. The story doesn’t end there, though. A month later it turns out that Helsinki has bought the exact same form from CGI as well, for €1.85 million.

Now, you can argue about what a fair price is for a single web form, especially one that has to be integrated into an existing information system. What is clear, though, is that it is not almost two million euros – twice.

“How on earth was that possible,” I hear you ask. Surely someone would have offered to build the form for, say, a million a pop. Heck, even Finnish public procurement law mandates competitive bidding for such projects.

Vendor lock-in. CGI was administering the information system on which the form was to be built. And since they held the keys, they could pretty much charge whatever the municipalities could potentially pay for the form.

Now hold that thought.


Over at High Scalability, Todd Hoff writes about James Hamilton’s talk at the AWS re:Invent conference last November. It reveals how gigantic the scale of Amazon Web Services really is:

Every day, AWS adds enough new server capacity to support all of Amazon’s global infrastructure when it was a $7B annual revenue enterprise (in 2004).

This also means that AWS is leaps and bounds above its competitors when it comes to capacity:

All 14 other cloud providers combined have 1/5th the aggregate capacity of AWS (estimate by Gartner)

This of course gives AWS a huge advantage over its competitors. It can run larger datacenters both close to and far from each other; it can get sweetheart deals and custom-made components from Intel for its servers, just like Apple does with laptops and desktops. And it can afford to design its own network gear, the one field where progress hasn’t followed Moore’s law. The only other companies doing the same are internet giants like Google and Facebook, but they’re not in the same business as AWS1.

All this is leading to a situation where AWS is becoming the IBM of the 21st century, for better or for worse. Just like no one ever got fired for buying IBM in the ’80s, few will likely get fired for choosing AWS in the years to come. This will be a tough, tough nut to crack for Amazon’s competitors.

So far the situation doesn’t seem to have slowed down Amazon’s rate of innovation, and perhaps they have learned the lessons of Big Blue. Only the future will tell.

From a customer’s perspective, a cloud platform like AWS brings lots and lots of benefits – well listed in the article above – but of course also downsides. Computing power is still much cheaper when bought in physical servers. You can rent a monster Xeon server with basically unlimited bandwidth for less than €100/month. AWS or platforms built on it such as Heroku can’t compete with that on price. So if you’re very constrained on cash and have the sysadmin chops to operate the server, you will get a better deal.

Of course we’re comparing apples and oranges here. You won’t get similar redundancy and flexibility with physical servers as you can with AWS for any money – except when you do. The second group for which using a commercial cloud platform doesn’t make sense is those whose scale merits a cloud platform of their own. Open source software for such platforms – such as Docker and Flynn – is slowly getting to a point where you can rent your own servers and basically build your own AWS on them2. Of course, this will demand a lot more knowledge from your operations team, especially if you want to attain redundancy and high availability similar to what AWS Availability Zones give you.

There is – however – one hidden benefit of going with a commercial cloud platform such as AWS, that you might not have thought about: going with AWS will lessen your vendor lock-in a lot. Of course you can still shoot yourself in the foot by handing off the intellectual property rights of the software to your vendor or some other braindead move. But given that you won’t, hosting is another huge lock-in mechanism large IT houses use to screw their clients. It not only effectively locks the client to the vendor, but it also massively slows down any modifications made by other projects that need to integrate with the existing system, since everything needs to be handled through the vendor. They can, and will, block any progress you could make yourself.

With AWS, you can skip all of that. You are not tied to a particular vendor to develop, operate, and extend your system. While running apps on PaaS platforms requires some specific knowledge, it is widely available and standard. If you want to take your systems to another provider, you can. If you want to build your own cloud platform, you can do it and move your system over bit by bit.

It is thus no wonder that large IT consultancies are racing to build their own platforms to tick all the necessary buzzword checkboxes. However, I would be very wary of their offerings. I’m fairly certain the savings they get from better server utilization through virtualization are not passed on to the customer. And even if some of them are, the lock-in is still there. They have absolutely no incentive to make their platforms compatible with existing ones – quite the contrary. Lock-in is on their side. It is not on yours. Beware.


  1. Apart from Google Compute Engine, but even it doesn’t provide a generic cloud platform similar to the one AWS does.

  2. If you’re at such a state, we can help, by the way. The same goes for building a server environment on AWS, of course.

Of Course You May Eat Your Own Food Here

Are you with or against your customers?

Do your signs read like those “Private property. Keep out!” signs? “No eating of own food”. “Food may not be taken out of the restaurant!” Don’t you have more important things to care about, like, whether your customers love you?

I recently saw this sign on the outside wall of a small roadside restaurant1:

It reads: “You may of course eat your own food here as well.”

So, why should you be more like Tiinan Tupa and less like ABC?

First of all, very few people will probably abuse it. Even if someone eats their own packed lunch there, they will probably reciprocate and buy something as a favor to the place.

Second, it is the ultimate purple cow. It is something unexpected, something remarkable people will pay attention to. People are so accustomed to passive-aggressive notes forbidding this and that that a sign being actually nice will be noticed and talked about.

Lastly, and most importantly, it’s just plain being nice. So stop just using glorious and empty words about how you care about your customers and actually walk the walk.

For these three reasons, being ridiculously accommodating with your customers will probably also be good for you in the long term. So stop picking up nickels in front of bulldozers. Make sure people give you money because they love you, not because they have to.


  1. Coincidentally, I found it on Vaihtoehto ABC:lle, a site and an app dedicated to finding alternatives to the ubiquitous, generic and boring ABC gas station chain that seems to have infiltrated the whole country in just a couple of years. ABC specifically forbids eating food bought from its own supermarket in the cafeteria of the same gas station.

Help! My Rails App Is Melting Under the Launch Day Load

Photo by Pascal

It was to be our day. The Finnish championship relay in orienteering was about to be held close to us, in a terrain type we knew inside and out. We had a great team, with both young guns and old champions. My friend Samuli had been fourth in the individual middle distance championships the day before. My only job was to handle the first leg securely and pass the baton to the more experienced and tougher guys who would then take care of our success. And I failed miserably.

My legs were like dough from the beginning. I was supposed to be in good shape, but I couldn’t keep up with anyone. I was supposed to take it easy and orienteer cleanly, but I ran like a headless chicken, making one mistake after another. Although I wouldn’t learn the term until years later, this was my crash course in ego depletion.


The day before the relay we organized the middle distance Finnish championships in my old hometown Parainen. For obvious reasons, I was the de facto webmaster of our club pages, which also hosted the result service. The site was running on OpenACS, a Tcl-based system I had a year or so of work experience with. I was supposed to know it.

After the race was over, I headed back to my friend’s place, opened up my laptop… only to find out that the local orienteering forums were ablaze with complaints about our results page being down. Crap.

After hours of hunting down the issue, getting help on the OpenACS IRC channel, and serving the results from a static page in the meantime, I finally managed to fix it. The app wasn’t running enough server processes to keep up with the load. And the most embarrassing thing was that the load wasn’t even that high – from high dozens to hundreds of simultaneous users. I headed to bed with my head spinning, hoping to scramble up my self-confidence for the next day’s race (with well-known results).

What does this have to do with Ruby or Rails? Nothing, really. And yet everything. The point is that most of us have a similar story to share. It’s much more common to have a meltdown story with a reasonably low number of users than actually have a slashdotting/hackernewsing/daring fireball hit your app. If you aren’t old enough to have gone through something like this, you probably will. But you don’t have to.


During the dozen or so years since the aforementioned episode, I’ve gone through some serious loads. Some of them we have handled badly, but most – including the Facebook terms of service vote campaign – with at least reasonable grace. This series of articles about Rails performance builds upon those war stories.

We have already posted a couple of articles to start off the series.

  1. Rails Garbage Collection: Naive Defaults
  2. Does Rails Scale?
  3. Rails Garbage Collection: Age Matters

This article will serve as kind of a belated intro to the series, introducing our high level principles regarding the subject but without going more to the details.

The Bear Metal ironclad rules of Rails performance

Scalability is not the same as performance

As I already noted in Does Rails Scale?, it’s worth pointing out that performance is not the same thing as scalability. They are related, for sure. But you can perform similarly poorly from your first to your millionth user and still be “scalable”. There is also the difference that performance is right here and now. If your app scales well, you can just throw more hardware (virtual or real) at the problem and solve it that way.

The good news is that Rails scales quite well out of the box for the vast majority of real-world needs. You probably won’t be the next Twitter or even Basecamp. In your dreams and VC prospectus maybe, but let’s be honest, the odds are stacked against you. So don’t sweat about that too much.

Meanwhile, you do want your app to perform well enough for your initial wave of users.

Perceived performance is what matters

There are basically three different layers of performance for any web app: the server level (I’m bundling everything from the data store to the frontend server here), browser rendering and the performance perceived by the user. The two latter ones are very close to each other but not exactly the same. You can tweak the perceived performance with tricks on the UI level, something that often isn’t even considered performance.

The most important lesson here is that perceived performance is what matters. It makes no difference what the synthetic performance tests on your server say if the UI of the app feels sluggish. There is no panacea here, but make no mistake: perceived performance is what counts when the chickens come home to roost.

Start with quick, large, and obvious wins

The fact is that even a modest number of users can make your app painfully slow. The good news is that you can probably fix that just by hitting the low-hanging fruit. You won’t believe how many Rails apps reveal that they’re running in development mode even in production by – when an error occurs – leaking the whole error trace to the end user.

Other examples of issues that are fairly easy to spot and fix are N+1 query issues with ActiveRecord associations, missing database indexes, and running a single, non-concurrent app server instance, where any longer-running action will block the whole app from other users.
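
For instance, here’s what a typical N+1 fix looks like with ActiveRecord’s eager loading (Post and Comment are hypothetical models):

# N+1: one query for the posts, plus one query per post for its comments
posts = Post.limit(10)
posts.each { |post| puts post.comments.size }

# Eager-loaded: all the comments are fetched in one additional query
posts = Post.includes(:comments).limit(10)
posts.each { |post| puts post.comments.size }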

YAGNI

Once you have squashed all the low-hanging fruit with your metal-reinforced bat, relax. Tuning app performance shouldn’t be your top priority at the moment – unless it is, but in that case you will know for sure. What you should be focusing on is how to get paying customers and how you can make them kick ass. If you have clear performance issues, by all means fix them. However…

Don’t assume, measure

You probably don’t have any idea how many users your app needs to support from the get-go. That’s fine. Reality will teach you. As long as you don’t royally fuck up the shared-nothing (and 12-factor, if you’re on a cloud platform such as Heroku) architecture, you should be able to react to issues quickly.

That said, you probably do want to do some baseline load testing if you’re opening your app up to a much larger private group or to the public. The good news is that it is very cheap to spin up a virtual server instance for a couple of hours and hit your app hard with it. Heck, you can handle the baseline from your laptop if needed. With that you should be able to get through the initial, frightening launch.
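
A baseline doesn’t require fancy tooling, either; plain ApacheBench from a spare machine will do (the URL and numbers here are made up):

➜  ~ ab -n 1000 -c 50 http://staging.example.com/

That fires 1 000 requests at a concurrency of 50 and reports requests per second along with latency percentiles.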

Once your app is up and running under load from real users, the tuning work starts for real. Only now can you measure where the real hot paths and bottlenecks in your app are, based on real usage data rather than assumptions. At this point you have a plethora of tools at your disposal, from the good old request log analyzer to commercial offerings such as Skylight and New Relic.

On the frontend, most browsers nowadays ship with tools for optimizing end-user performance, from Chrome’s and Safari’s built-in developer tools to Firebug for Firefox.

Wrap-up

In this introductory article on building performant Rails (or any other, for that matter) web apps, we took a look at five basic rules of performance optimization:

  1. Scalability is not the same as performance.
  2. Perceived performance is what matters.
  3. Start with the low-hanging fruit.
  4. YAGNI.
  5. Don’t assume, measure.

We will get (much) deeper into the details of Rails performance optimization in later articles. At that point we’ll enter territory where one size no longer fits all. Whatever your particular performance problem is, though, keep the five rules above at the top of your mind.


How to Install Ruby 2.2.0 on OS X Yosemite With Homebrew and Rbenv

When trying to install Ruby 2.2.0 on Yosemite with rbenv and Homebrew, I got a weird error:

➜  ~ ✗ rbenv install 2.2.0
Downloading ruby-2.2.0.tar.gz...
-> http://dqw8nmjcqpjn7.cloudfront.net/7671e394abfb5d262fbcd3b27a71bf78737c7e9347fa21c39e58b0bb9c4840fc
Installing ruby-2.2.0...

BUILD FAILED (OS X 10.10.1 using ruby-build 20141225)

Inspect or clean up the working tree at /var/folders/s_/_zh3skrd0qz2933nns5y425r0000gn/T/ruby-build.20150106151200.81466
Results logged to /var/folders/s_/_zh3skrd0qz2933nns5y425r0000gn/T/ruby-build.20150106151200.81466.log

Last 10 log lines:
make[2]: *** Waiting for unfinished jobs....
compiling raddrinfo.c
compiling ifaddr.c
make[1]: *** [ext/openssl/all] Error 2
make[1]: *** Waiting for unfinished jobs....
linking shared-object zlib.bundle
linking shared-object socket.bundle
linking shared-object date_core.bundle
linking shared-object ripper.bundle
make: *** [build-ext] Error 2

When I opened up the log file mentioned in the output above, I could see the actual cause of the error:

ossl_ssl.c:125:5: error: use of undeclared identifier 'TLSv1_2_method'
    OSSL_SSL_METHOD_ENTRY(TLSv1_2),
    ^
ossl_ssl.c:119:69: note: expanded from macro 'OSSL_SSL_METHOD_ENTRY'
#define OSSL_SSL_METHOD_ENTRY(name) { #name, (SSL_METHOD *(*)(void))name##_method }
                                                                    ^
<scratch space>:148:1: note: expanded from here
TLSv1_2_method
^
ossl_ssl.c:126:5: error: use of undeclared identifier 'TLSv1_2_server_method'
    OSSL_SSL_METHOD_ENTRY(TLSv1_2_server),
    ^
ossl_ssl.c:119:69: note: expanded from macro 'OSSL_SSL_METHOD_ENTRY'
#define OSSL_SSL_METHOD_ENTRY(name) { #name, (SSL_METHOD *(*)(void))name##_method }
                                                                    ^
<scratch space>:148:1: note: expanded from here
TLSv1_2_server_method
^
ossl_ssl.c:127:5: error: use of undeclared identifier 'TLSv1_2_client_method'
    OSSL_SSL_METHOD_ENTRY(TLSv1_2_client),
    ^
ossl_ssl.c:119:69: note: expanded from macro 'OSSL_SSL_METHOD_ENTRY'
#define OSSL_SSL_METHOD_ENTRY(name) { #name, (SSL_METHOD *(*)(void))name##_method }
                                                                    ^
<scratch space>:148:1: note: expanded from here
TLSv1_2_client_method
^
ossl_ssl.c:131:5: error: use of undeclared identifier 'TLSv1_1_method'
    OSSL_SSL_METHOD_ENTRY(TLSv1_1),
    ^
ossl_ssl.c:119:69: note: expanded from macro 'OSSL_SSL_METHOD_ENTRY'
#define OSSL_SSL_METHOD_ENTRY(name) { #name, (SSL_METHOD *(*)(void))name##_method }
                                                                    ^
<scratch space>:148:1: note: expanded from here
TLSv1_1_method
^
ossl_ssl.c:132:5: error: use of undeclared identifier 'TLSv1_1_server_method'
    OSSL_SSL_METHOD_ENTRY(TLSv1_1_server),
    ^
ossl_ssl.c:119:69: note: expanded from macro 'OSSL_SSL_METHOD_ENTRY'
#define OSSL_SSL_METHOD_ENTRY(name) { #name, (SSL_METHOD *(*)(void))name##_method }
                                                                    ^
<scratch space>:148:1: note: expanded from here
TLSv1_1_server_method
^
ossl_ssl.c:133:5: error: use of undeclared identifier 'TLSv1_1_client_method'
    OSSL_SSL_METHOD_ENTRY(TLSv1_1_client),
    ^
ossl_ssl.c:119:69: note: expanded from macro 'OSSL_SSL_METHOD_ENTRY'
#define OSSL_SSL_METHOD_ENTRY(name) { #name, (SSL_METHOD *(*)(void))name##_method }
                                                                    ^
<scratch space>:148:1: note: expanded from here
TLSv1_1_client_method
^
ossl_ssl.c:210:21: error: invalid application of 'sizeof' to an incomplete type 'const struct <anonymous struct at ossl_ssl.c:115:14> []'
    for (i = 0; i < numberof(ossl_ssl_method_tab); i++) {
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ossl_ssl.c:19:35: note: expanded from macro 'numberof'
#define numberof(ary) (int)(sizeof(ary)/sizeof((ary)[0]))
                                  ^~~~~
ossl_ssl.c:1127:13: warning: using the result of an assignment as a condition without parentheses [-Wparentheses]
            if (rc = SSL_shutdown(ssl))
                ~~~^~~~~~~~~~~~~~~~~~~
ossl_ssl.c:1127:13: note: place parentheses around the assignment to silence this warning
            if (rc = SSL_shutdown(ssl))
                   ^
                (                     )
ossl_ssl.c:1127:13: note: use '==' to turn this assignment into an equality comparison
            if (rc = SSL_shutdown(ssl))
                   ^
                   ==
ossl_ssl.c:2194:23: error: invalid application of 'sizeof' to an incomplete type 'const struct <anonymous struct at ossl_ssl.c:115:14> []'
    ary = rb_ary_new2(numberof(ossl_ssl_method_tab));
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ossl_ssl.c:19:35: note: expanded from macro 'numberof'
#define numberof(ary) (int)(sizeof(ary)/sizeof((ary)[0]))
                                  ^~~~~
ossl_ssl.c:2195:21: error: invalid application of 'sizeof' to an incomplete type 'const struct <anonymous struct at ossl_ssl.c:115:14> []'
    for (i = 0; i < numberof(ossl_ssl_method_tab); i++) {
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ossl_ssl.c:19:35: note: expanded from macro 'numberof'
#define numberof(ary) (int)(sizeof(ary)/sizeof((ary)[0]))
                                  ^~~~~
1 warning and 9 errors generated.
make[2]: *** [ossl_ssl.o] Error 1
make[2]: *** Waiting for unfinished jobs....
compiling raddrinfo.c
compiling ifaddr.c
make[1]: *** [ext/openssl/all] Error 2
make[1]: *** Waiting for unfinished jobs....
linking shared-object zlib.bundle
linking shared-object socket.bundle
linking shared-object date_core.bundle
linking shared-object ripper.bundle
make: *** [build-ext] Error 2

The weird thing here is that I did not get the usual ‘Missing the OpenSSL lib?’ warning. The lib was found, but somehow the headers were fucked up. The problem also did not occur with older rbenv-installed Rubies.

Thanks to Tarmo I found the solution here.

What I had to do was this:

➜  ~  brew update
➜  ~  brew install openssl
➜  ~  /usr/local/opt/openssl/bin/c_rehash

Now make sure that your new binary is in your PATH before the system one.

➜  ~  ln -s /usr/local/opt/openssl/bin/openssl /usr/local/bin/openssl
➜  ~ which openssl
/usr/bin/openssl
➜  ~ echo $PATH
/usr/local/heroku/bin:/Users/jarkko/.rbenv/shims:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin

That’s no good. Let’s fix our PATH. I’m using zsh, so for me it’s set in ~/.zshrc. Which file you need depends on the shell you’re using (for bash it would be ~/.bashrc or ~/.bash_profile, but see the caveat here).

➜  ~ vim ~/.zshrc
# Change the line that sets PATH so that /usr/local/bin
# comes BEFORE /usr/bin. For me, it looks like this:
# export PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin

Open up a new terminal window and check that the PATH is correct:

➜  ~  echo $PATH
/usr/local/heroku/bin:/Users/jarkko/.rbenv/shims:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
➜  ~  which openssl
/usr/local/bin/openssl

Better. Now let’s make sure that Homebrew-installed libs link against the newer OpenSSL.

➜  ~  brew unlink openssl
➜  ~  brew link --overwrite --force openssl
➜  ~  openssl version -a

OpenSSL 1.0.1j 15 Oct 2014
built on: Sun Dec  7 02:14:31 GMT 2014
platform: darwin64-x86_64-cc
options:  bn(64,64) rc4(ptr,char) des(idx,cisc,16,int) idea(int) blowfish(idx) 
compiler: clang -fPIC -fno-common -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -arch x86_64 -O3 -DL_ENDIAN -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
OPENSSLDIR: "/usr/local/etc/openssl"

Splendid.

After that, Ruby 2.2.0 installed cleanly without any specific parameters needed:

➜  ~  rbenv install 2.2.0
Downloading ruby-2.2.0.tar.gz...
-> http://dqw8nmjcqpjn7.cloudfront.net/7671e394abfb5d262fbcd3b27a71bf78737c7e9347fa21c39e58b0bb9c4840fc
Installing ruby-2.2.0...
Installed ruby-2.2.0 to /Users/jarkko/.rbenv/versions/2.2.0

[UPDATE 1, Jan 7] The original version of this post told you to rm /usr/bin/openssl, based on the link above. As James Tucker pointed out, this is a horrible idea. I fixed the article so that we now fix the $PATH instead.


Rails Garbage Collection: Age Matters

In a previous post in the Rails Performance series we stated that the default garbage collection settings for Ruby on Rails applications are not optimal. In this post we’ll explore the basics of object age in RGenGC, Ruby 2.1’s new restricted generational garbage collector.

As a prerequisite of this and subsequent posts, basic understanding of a mark and sweep1 collector is assumed.

A somewhat simplified mark and sweep cycle goes like this:

  1. A mark and sweep collector traverses the object graph.
  2. It checks which objects are in use (referenced) and which ones are not. This is called object marking, a.k.a. the MARK PHASE.
  3. All unused objects are freed, making their memory available. This is called sweeping, a.k.a. the SWEEP PHASE.
  4. Nothing changes for used objects.

A GC cycle prior to Ruby 2.1 works exactly like that. A typical Rails app boots with around 300 000 live objects, all of which need to be scanned during the MARK phase. That usually yields a much smaller set to SWEEP.
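
To make the two phases concrete, here is a toy mark and sweep pass in plain Ruby – purely conceptual, and nothing like the real C implementation:

# Node stands in for a Ruby object: a name, its outgoing references,
# and a mark bit.
Node = Struct.new(:name, :refs, :marked)

# MARK PHASE: recursively flag everything reachable from the root.
def mark(node)
  return if node.marked
  node.marked = true
  node.refs.each { |child| mark(child) }
end

# SWEEP PHASE: drop everything unmarked, reset marks for the next cycle.
def sweep(heap)
  heap.select!(&:marked)
  heap.each { |node| node.marked = false }
end

root = Node.new("root", [], false)
kept = Node.new("kept", [], false)
lost = Node.new("lost", [], false)   # nothing references this one
root.refs << kept

heap = [root, kept, lost]
mark(root)
sweep(heap)
heap.map(&:name)  # => ["root", "kept"]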

A large percentage of the graph is going to be traversed over and over again but will never be reclaimed. This is not only CPU intensive during GC cycles, but also incurs memory overhead for accounting and anticipation for future growth.

Old and young objects

What generally makes an object old?

  • All new objects are considered to be young.
  • Old objects have survived at least one GC cycle (major or minor). The collector thus reasons that the object will stick around and not become garbage quickly.

The idea behind the new generational garbage collector is this:

MOST OBJECTS DIE YOUNG.

To take advantage of this fact, the new GC classifies objects on the Ruby heap as either OLD or YOUNG. This segregation allows the garbage collector to work with two distinct generations, of which the OLD generation is much less likely to yield significant memory recovery.

For a typical Rails request, some examples of old and new objects would be:

  • Old: compiled routes, templates, ActiveRecord connections, cached DB column info, classes, modules, etc.
  • New: short-lived strings within a partial, a string column value from an ActiveRecord result, a coerced DateTime instance, etc.

Young objects are more likely to reference old objects than old objects are to reference young ones. Old objects also frequently reference other old objects.

  u = User.first
  #<User id: 1, email: "[email protected]", encrypted_password: "blahblah...", reset_password_token: nil, reset_password_sent_at: nil, remember_created_at: nil, sign_in_count: 2, current_sign_in_at: "2014-10-31 11:52:30", last_sign_in_at: "2014-10-29 10:04:01", current_sign_in_ip: "127.0.0.1", last_sign_in_ip: "127.0.0.1", created_at: "2014-10-29 10:04:01", updated_at: "2014-11-30 14:07:15", provider: nil, uid: nil, first_name: "dfdsfds", last_name: "dfdsfds", confirmation_token: nil, confirmed_at: "2014-10-30 10:11:42", confirmation_sent_at: nil, unconfirmed_email: nil, onboarded_at: nil>

Notice how the transient attribute keys reference the long-lived column type objects here:

{"id"=>
   #<ActiveRecord::ConnectionAdapters::PostgreSQLAdapter::OID::Integer:0x007fbe756d1d30>,
  "email"=>
   #<ActiveRecord::ConnectionAdapters::PostgreSQLAdapter::OID::Identity:0x007fbe756d1718>,
  "encrypted_password"=>
   #<ActiveRecord::ConnectionAdapters::PostgreSQLAdapter::OID::Identity:0x007fbe756d1718>,
  "reset_password_token"=>
   #<ActiveRecord::ConnectionAdapters::PostgreSQLAdapter::OID::Identity:0x007fbe756d1718>,
  "reset_password_sent_at"=>
   #<ActiveRecord::AttributeMethods::TimeZoneConversion::Type:0x007fbe741f63c0
    @column=
     #<ActiveRecord::ConnectionAdapters::PostgreSQLAdapter::OID::Timestamp:0x007fbe756d0e58>>,

Age segregation is also just a classification – old and young objects aren’t stored in distinct memory spaces; they’re just conceptual buckets. The generation of an object refers to the number of GC cycles it has survived:

irb(main):009:0> ObjectSpace.dump([])
=> "{\"address\":\"0x007fd24c007668\", \"type\":\"ARRAY\", \"class\":\"0x007fd24b872038\", \"length\":0, \"file\":\"(irb)\", \"line\":9, \"method\":\"irb_binding\", \"generation\":8, \"flags\":{\"wb_protected\":true}}\n"
irb(main):010:0> GC.count
=> 8

What the heck is major and minor GC?

You may have heard of, read about, or noticed in GC.stat output the terms “minor” and “major” GC.

irb(main):003:0> pp GC.stat
{:count=>32,
 :heap_used=>1181,
 :heap_length=>1233,
 :heap_increment=>50,
 :heap_live_slot=>325148,
 :heap_free_slot=>156231,
 :heap_final_slot=>0,
 :heap_swept_slot=>163121,
 :heap_eden_page_length=>1171,
 :heap_tomb_page_length=>10,
 :total_allocated_object=>2050551,
 :total_freed_object=>1725403,
 :malloc_increase=>1462784,
 :malloc_limit=>24750208,
 :minor_gc_count=>26,
 :major_gc_count=>6,
 :remembered_shady_object=>3877,
 :remembered_shady_object_limit=>4042,
 :old_object=>304270,
 :old_object_limit=>259974,
 :oldmalloc_increase=>23639792,
 :oldmalloc_limit=>24159190}

Minor GC (or “partial marking”): This cycle only traverses the young generation and is very fast. Based on the hypothesis that most objects die young, this cycle is the most effective at reclaiming a large amount of memory in proportion to the number of objects traversed.

It runs quite often – 26 times for the GC.stat dump of the booted Rails app above.

Major GC: Triggered by out-of-memory conditions, when the Ruby heap space needs to be expanded (not the OS’s OOM killer! :-)). Both old and young objects are traversed, so it’s significantly slower than a minor GC round. Generally a significant increase in old objects triggers a major GC. Every major GC cycle an object survives bumps its current generation.

It runs much less frequently – six times for the stats dump above.
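
You can trigger both kinds by hand and watch the counters move – a quick irb sketch, assuming Ruby 2.1 or newer:

before = GC.stat.values_at(:minor_gc_count, :major_gc_count)
GC.start(full_mark: false)  # minor GC: marks only the young generation
GC.start(full_mark: true)   # major GC: marks old and young alike
after  = GC.stat.values_at(:minor_gc_count, :major_gc_count)
# before and after might be e.g. [26, 6] and [27, 7]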

The following diagram represents a minor GC cycle (MARK phase completed, SWEEP still pending) that identifies and promotes some objects to old.

A subsequent minor GC cycle (MARK phase completed, SWEEP still pending) ignores old objects during the mark phase.

Most of the reclaiming effort is thus focused on the young generation (new objects). Generally, about 95% of objects are dead by their first GC. The current generation of an object is the number of major GC cycles it has survived.
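
You can watch a single object grow old with ObjectSpace – a hypothetical irb session; the exact promotion threshold varies by Ruby version:

require 'objspace'

str = "will I grow old?"
puts ObjectSpace.dump(str)  # no "old" flag in the output yet
3.times { GC.start }        # let the string survive a few GC cycles
puts ObjectSpace.dump(str)  # flags now include "old":true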

RGenGC

At a very high level, CRuby 2.1’s collector has the following properties:

  • High throughput - it can sustain a high rate of allocations / collections due to faster minor GC cycles and very rare major GC cycles.
  • GC pauses are still long (“stop the world”) for major GC cycles.
  • Generational collectors have much shorter mark cycles as they traverse only the young generation, most of the time.

This is a marked improvement over the previous CRuby GC and serves as a base for implementing other advanced features going forward. Ruby 2.2 supports incremental GC and object ages beyond just old and young. A major GC cycle in 2.1 still runs in a “stop the world” manner, whereas the more involved incremental implementation in Ruby 2.2 interleaves short steps of the mark and sweep cycles with other VM operations.

Object references

In the simple example below we create a String array with three elements.

irb(main):001:0> require 'objspace'
=> true
irb(main):002:0> ObjectSpace.trace_object_allocations_start
=> nil
irb(main):003:0> ary = %w(a b c)
=> ["a", "b", "c"]

Very much like a river flowing downstream, the array has knowledge of (a reference to) each of its String elements. The strings, by contrast, have no awareness of (or references back to) the array container.

irb(main):004:0> ObjectSpace.dump(ary)
=> "{\"address\":\"0x007fd24b890fd8\", \"type\":\"ARRAY\", \"class\":\"0x007fd24b872038\", \"length\":3, \"embedded\":true, \"references\":[\"0x007fd24b891050\", \"0x007fd24b891028\", \"0x007fd24b891000\"], \"file\":\"(irb)\", \"line\":3, \"method\":\"irb_binding\", \"generation\":7, \"flags\":{\"wb_protected\":true}}\n"
irb(main):004:0> ObjectSpace.reachable_objects_from(ary)
=> [Array, "a", "b", "c"]
irb(main):006:0> ObjectSpace.reachable_objects_from(ary[1])
=> [String]
irb(main):007:0> ObjectSpace.dump(ary[1])
=> "{\"address\":\"0x007fd24b891028\", \"type\":\"STRING\", \"class\":\"0x007fd24b829658\", \"embedded\":true, \"bytesize\":1, \"value\":\"b\", \"encoding\":\"UTF-8\", \"file\":\"(irb)\", \"line\":3, \"method\":\"irb_binding\", \"generation\":7, \"flags\":{\"wb_protected\":true, \"old\":true, \"marked\":true}}\n"

We stated earlier that:

Young objects are more likely to reference old objects than old objects are to reference young ones. Old objects also frequently reference other old objects.

However, it is possible for old objects to reference new objects. What happens when they do?

Old objects with references to new objects are recorded in a “remembered set”. The remembered set is a container of references from old objects to new objects, and acts as a shortcut that spares the collector from scanning the whole heap to find such references.
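
Conceptually – this is an illustration of the mechanism, not observable API behavior – the write barrier catches moments like this:

# CACHE is long-lived and will be promoted to the old generation.
CACHE = []
3.times { GC.start }

# An old object now references a brand-new string. The write barrier
# records CACHE in the remembered set, so the next minor GC can find
# the young string without rescanning the whole old generation.
CACHE << "fresh data"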

Implications for Rails

As our friend Ezra used to say, “no code is faster than no code.” The same applies to automatic memory management. Every object allocation also carries a variable recycling cost. Allocation itself is generally low-overhead, as it happens only once – except when there are no free object slots left on the Ruby heap and a major GC is triggered as a result.

A major drawback of this limited OLD vs. YOUNG segregation is that many transient objects are in fact promoted to old during large units of work such as a Rails request. These long-lived objects eventually become unexpected “memory leaks”. Such transient objects can conceptually be classified as “medium lifetime”: they need to stick around for the duration of a request, but no longer. There is, however, a high probability that a minor GC runs during the request, promoting young objects to old and effectively extending their lifetime well beyond the end of the request. This can only be undone by a major GC, which runs infrequently and sweeps both old and young objects.
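
A hypothetical Rack middleware sketch makes this visible; the class and its logging are made up for illustration, but GC.stat[:old_object] is the real counter shown in the dump above:

# Logs how many objects graduated to the old generation while a
# request was being served.
class OldObjectGrowth
  def initialize(app)
    @app = app
  end

  def call(env)
    before   = GC.stat[:old_object]
    response = @app.call(env)
    grew     = GC.stat[:old_object] - before
    warn "old object count grew by #{grew}" if grew > 0
    response
  end
end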

Each generation can be tweaked individually, with the older generation being particularly important for balancing total process memory use against keeping the transient (young) object set per request minimal – and against promoting objects from the young to the old generation too quickly.

In our next post we will explore how you’d approach tuning the Ruby GC for Rails applications, balancing tradeoffs of speed and memory. Leave your email address below and we’ll let you know as soon as it’s posted.


  1. See the Ruby Hacking Guide’s GC chapter for further context and nitty-gritty details. I’d recommend scanning the content below the first few headings, until turned off by C.


Does Rails Scale?

Photo by Soctech

Back when Rails was still not mainstream, a common dismissal by developers using other – more established – technologies was that Rails is cool and stuff, but it will never scale1. While the question isn’t (compared to Rails’ success) as common these days, it still appears in one form or another every once in a while.

Last week on the Ruby on Rails Facebook group, someone asked this question:

Can Rails stand up to making a social platform like FB with millions of users using it at the same time?

If so what are the pro’s and the cons?

So in other words, can Rails scale a lot?

Just as is customary for a Facebook group, the question got a lot of clueless answers. There were a couple of gems like this:

Tony if you want to build some thing like FB, you need to learn deeper mountable engine and SOLID anti pattern.

The worst, however, are answers from people who don’t know that they don’t know shit, but insist on giving advice that is only bound to either confuse the original poster or lead them astray – and probably both:

Twitter is not a good example. They stopped using Rails because it couldn’t handle millions of request per second. They began using Scala.

This is of course mostly BS with a hint of truth in it, but we’ll get back to that in a bit.

The issue with the question itself is that it’s the wrong question to ask, and this has nothing to do with Ruby or Rails per se.

Why is it the wrong question? Let’s have a look.

Sure, Ruby is slow in raw performance. It has gotten a lot faster during the past decade, but it is still a highly dynamic interpreted scripting language. Its main shtick has always been programmer happiness, and its primary way to attain that goal has definitely not been to return from that test run as fast as possible. The same goes for Rails.

That said, there are two reasons bare Ruby performance doesn’t matter that much. First, it’s only a tiny part of the performance the user perceives. Rails has gone out of its way to automatically make the frontend of your web app performant. This includes frontend caching, the asset pipeline, and more opinionated things like Turbolinks. You can of course screw all that up, but you would be amazed how much actual end-user performance you’d lose if you wrote the same app from scratch – not to mention the time you’d waste building it.

Second, and most important for this discussion: scaling is not the same thing as performance. Rails has always been built on the shared-nothing architecture, where in theory the only thing you need to do to scale out your app is to throw more hardware at it – the app should scale linearly. Of course there are limits to this, but they are by no means specific to Rails or Ruby.

Scaling and performance are two separate things. They are related as terms, but not strictly connected. Your app can be very fast for a couple of users but awful for a thousand (it didn’t scale). Or it can scale at O(1) to a million users while loading a page takes 10 seconds even for a single concurrent user (it scales but doesn’t perform).

As stated above, a traditional CRUD-style Rails app can be made to scale very well just by adding app server instances, cores, and finally physical app servers. This is what is meant by scaling out2. Most often the limiting factor is not Rails but the datastore, which is still usually the one shared component in the equation. Scaling the database out is harder than scaling out the app servers, but nevertheless possible. That is way outside the scope of this article, however.

Is this the right tool for the job?

It’s clear in hindsight that a Rails app wasn’t the right tool for what Twitter became – a juggernaut where millions of people were basically realtime chatting with the whole world.

That doesn’t mean that Rails wasn’t a valid choice for the original app. Maybe it wasn’t the best option even then from a technical perspective, but it sure made developing Twitter a whole lot faster in its initial stages. You know, Twitter wasn’t the only contender on the microblogging stage in the late noughties. We Finns fondly remember Jaiku. Then there was that other San Francisco startup using Django whose name I can’t even remember anymore.

Anyway, the point is that reaching a scale where you have to think harder about scalability is a very, very nice problem to have. Either you have built a real business and are making money hand over fist, or you are playing – and winning – the eyeball lotto and have VCs knocking on your door (or, more realistically, have taken on several millions already). The vast majority of businesses never reach this stage.

More likely you will just fail at the hockey-stick game (the VC option), or perhaps build a sustainable business (the old-fashioned “people pay me for helping them kick ass” kind). In either case, you won’t have to worry about scaling to millions of concurrent users.

Even in the very profitable, high-scale SaaS market there are hordes of examples of apps running on Rails. Kissmetrics runs its frontend on Rails, as does GitHub, not to mention Groupon, LivingSocial3, and many others.

However, at a certain scale you have to go for a more modular architecture – SOA, if I may. You can use a message queue for message passing, a NoSQL database for non-relational and ephemeral data, Node.js for realtime features, and so on. A good tool for every particular sub-task of your app.

That said, you need to keep in mind what I said above. It is pretty unlikely you will ever reach a state where you really need to scale. Thus, thinking too much about the architecture at the initial stage is a form of premature optimization. As long as you don’t do anything extra stupid, you can probably get away with a simple Rails app. Splitting your app up into lots of components early on makes several things harder and more expensive:

  • Complexity of development.
  • Operating and deploying a bunch of different apps.
  • Keeping track that all apps are up and running.
  • Hunting bugs.
  • Making changes in a lean development environment where things change rapidly.
  • Cognitive cost of understanding and learning how the app works. This is especially true when you’re expanding your team.

This doesn’t mean that you shouldn’t do the split at some point. There might come a time when the balance of the points above tips and a monorail app becomes a burden. But then again, there might not. So do what makes sense now, not what makes sense in your imaginary future.

Of course Rails alone won’t scale to a gazillion users for an app it was never really meant for in the first place. Neither is it supposed to. However, it is amazing how far you can get with it, in much the same way that old, boring PostgreSQL still beats the shit out of its more “modern” competitors in most common use cases4.

Questions you should be asking

When making a technology decision, instead of “Does it scale?”, here’s what you should be asking:

  • What is the right tool for each of the jobs in my app?
  • How far can I likely get away with a single Rails app?
  • Will we ever really reach the scale we claim in our investor prospectus? No need to lie to yourself here.
  • What is more important: getting the app up and running and in front of real users fast, or making it scalable in an imaginary future that may never come?

Only after answering those are you equipped to make a decision.

P.S. Reached the point where optimizing Rails and Ruby performance does make a difference? We’re writing a series of articles about just that. Pop your details in the form ☟ down there and we’ll keep you posted.


  1. Another good one I heard at EuroOSCON 2005 was that the only thing good about Rails is its marketing.

  2. Versus scaling up, which means making the single core or thread faster.

  3. OK, the last two might not pass the profitable bit.

  4. There are obviously special cases where a single Rails app doesn’t cut it even from the beginning. E.g. computationally intensive apps such as Kissmetrics or Skylight.io obviously won’t run their stats aggregation processes on Rails.
