Narrowband IoT (NB-IoT) is a Low Power Wide Area Network (LPWAN) radio technology standard developed by 3GPP to enable a wide range of cellular devices and services. The specification was frozen in 3GPP Release 13 (LTE Advanced Pro), in June 2016. Other 3GPP IoT technologies include eMTC (enhanced Machine-Type Communication) and EC-GSM-IoT.
NB-IoT focuses specifically on indoor coverage, low cost, long battery life, and high connection density. NB-IoT uses a subset of the LTE standard, but limits the bandwidth to a single narrow band of 200 kHz. It uses OFDM modulation for downlink communication and SC-FDMA for uplink communication.
We managed to source a Quectel BC68 NB-IoT module and started working on data integration, soon realizing that these Neul-based NB-IoT modems (Neul being a company Huawei acquired in 2014) require special AT commands to communicate with the outside world (as opposed to “normal” modems, where the host dials in using PPP).
Okay, let’s scan the Quectel BC95-G&BC68 AT Commands Manual. There’s a command for creating UDP sockets, great. Another one for TCP sockets, too. Awesome! And then there’s something mysteriously referred to as the “Huawei IoT Platform”. Wait, what?!?
In TELCO speech it’s called a CDP (which stands for Connected Device Platform). It’s a fancy way to describe a gateway that accepts messages from a mobile device and routes them to your application server. There are a few flavours around.
But what if we want to use our own server? There are a few reasons one would do this, including cost and efficiency, or the fact that the Huawei partner signup website was broken for two weeks. (Actually it turns out it just can’t accept Ü as a character in the company name. But I digress.) Nokia’s CDP, on the other hand, is only available in the U.S.
Scanning the manual further, there are hints as to what the protocol might be. To set the IP address of the CDP platform, there’s a specific AT command for it.
6.1. AT+NCDP Configure and Query CDP Server Settings
The command is used to set and query the server IP address and port for the CDP server. It is used when there is a HiSilicon CDP or Huawei’s IoT platform acting as gateway to network server applications. The values assigned are persistent across reboots.
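Based on the manual’s description, a minimal exchange might look something like this (the address is a placeholder, and the exact response formatting may differ between firmware revisions):

```
AT+NCDP=198.51.100.1,5683
OK
AT+NCDP?
+NCDP:198.51.100.1,5683
OK
```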
And the default port for it is 5683. That’s the port for CoAP:
Constrained Application Protocol (CoAP) is a specialized Internet Application Protocol for constrained devices, as defined in RFC 7252. It enables those constrained devices called “nodes” to communicate with the wider Internet using similar protocols. CoAP is designed for use between devices on the same constrained network (e.g., low-power, lossy networks), between devices and general nodes on the Internet, and between devices on different constrained networks both joined by an internet. CoAP is also being used via other mechanisms, such as SMS on mobile communication networks.
Furthermore, another command confirms CoAP and adds OMA LwM2M:
6.5. AT+QLWULDATA Send Data
The command is used to send data to Huawei’s IoT platform with the LwM2M protocol. It will give an error code and description as an intermediate message if the message cannot be sent. If the module has not yet registered to the IoT platform, executing the command will trigger the register operation and discard the data. Please refer to Chapter 7 for possible values.
OMA LwM2M is a protocol that builds on top of CoAP:
OMA Lightweight M2M is a protocol from the Open Mobile Alliance for M2M or IoT device management. Lightweight M2M enabler defines the application layer communication protocol between a LWM2M Server and a LWM2M Client, which is located in a LWM2M Device. The OMA Lightweight M2M enabler includes device management and service enablement for LWM2M Devices. The target LWM2M Devices for this enabler are mainly resource constrained devices. Therefore, this enabler makes use of a light and compact protocol as well as an efficient resource data model. It provides a choice for the M2M Service Provider to deploy a M2M system to provide service to the M2M User. It is frequently used with CoAP
With this we’re ready for experimentation. Let’s fire up the modem, attach it to the network, make it use our own server as the CDP, and connect it to Eclipse Wakaama running on the server.
Once the modem boots up, we disable the radio with AT+CFUN=0, point the modem at our server with AT+NCDP=198.51.100.1, re-enable the radio with AT+CFUN=1, define the PDP context with AT+CGDCONT=1,"IP","APN", enable AT+CSCON=1 and AT+CEREG=1 to monitor network connection and registration status, and finally attach to the network with AT+CGATT=1. Once we receive +CSCON:1 and +CEREG:1, we can send a test ping with AT+NPING=198.51.100.1.
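Put together, the bring-up sequence looks roughly like this (a sketch from memory; URC ordering and timing vary by firmware and network):

```
AT+CFUN=0
OK
AT+NCDP=198.51.100.1
OK
AT+CFUN=1
OK
AT+CGDCONT=1,"IP","APN"
OK
AT+CSCON=1
OK
AT+CEREG=1
OK
AT+CGATT=1
OK
+CSCON:1
+CEREG:1
AT+NPING=198.51.100.1
OK
```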
Sending AT+QLWSREGIND=0 initiates the registration process, and receiving +QLWEVTIND:0 means registration was successful. We can observe this with tcpdump and the lwm2mserver example included in Wakaama.
Great! We’ve registered the modem. Let’s try sending data with AT+QLWULDATA.
This is a failure. Not only does it error out, it actually triggers a new registration process. Something is missing. Scanning the docs again, there’s a reference to +QLWEVTIND:3
being sent when:
//IoT platform has observed the data object 19. When the module reports this message, the customer can send data to the IoT platform.
This took me a while to figure out, but it became clear after reading more about OMA LwM2M. More specifically, object 19, as specified in the OMA LightweightM2M (LwM2M) Object and Resource Registry, means LwM2M APPDATA:
This LwM2M object provides the application service data related to a LwM2M Server, eg. Water meter data.
This is further specified in Lightweight M2M – Binary App Data Container.
Things are getting clearer now. For the modem to send out data, it tunnels the data inside object 19 and the server has to subscribe to receiving messages on that object. In lwm2mserver
there’s a command for it:
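I no longer have the exact listing, but from lwm2mserver’s interactive prompt the subscription looks roughly like the following (the client ID and URI are examples; check the `help` output of your Wakaama build):

```
> observe 0 /19/0/0
OBSERVE 0 /19/0/0
```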
Observe request and ACK in tcpdump:
Meanwhile on the modem side, we’ve received the +QLWEVTIND:3
message and can send data now:
On the lwm2mserver
side we can see data coming in:
Yay! We can now successfully receive data from the modem. In follow-up posts, we’ll try to figure out the differences between AT+QLWULDATA
and AT+NMGS
, look at receiving data from the server, and maybe write our own LwM2M server to forward data to MQTT.
I’ve heard about resin.io before, and while appealing, the control freak in me wanted something with an open server infrastructure. I’m also not sold on having Docker on production embedded devices (while it’s surely useful for prototyping and experimentation).
There’s the Nerves project, mostly focused around the Elixir ecosystem. It’s something I definitely want to check out in more detail, both to learn more about Elixir and for simpler embedded projects.
Then I stumbled onto mender. On first glance, it seems perfect. Let’s take a look, shall we?
We’re gonna be roughly following along the mender getting started guide while keeping things OSX compatible.
Instead of running our own server infrastructure (which is nice to have as an option, but not required for initial experimenting), we’ll be using hosted mender. That means we will have to inject our hosted mender token into the initial disk image that we will boot the RPi3 from.
Download the Raspberry Pi 3 disk image from https://docs.mender.io/1.3/getting-started/download-test-images .
Decompress and change the file extension to make it palatable for hdiutil.
Verify we have a good image
We can see the boot partition, the primary and secondary root partitions and the data partition. Since the root partitions are using ext2fs, and we’re on OSX, we need to install FUSE for macOS along with FUSE-Ext2 to be able to mount and write to these partitions.
Download the OSX package from https://osxfuse.github.io and follow the instructions to install it. Make sure to tick the checkbox for the MacFUSE Compatibility Layer; that’s required for FUSE-Ext2 support.
This gets a little more complicated.
Instead of compiling everything from source, like described here, we’re using the excellent Homebrew package manager to install the dependencies and just compile FUSE-Ext2
itself. (we should probably create a formula for FUSE-Ext2 too …)
Install the dependencies
Install FUSE-Ext2 itself
Attach the original mender image
Since /dev/disk1 is our OSX boot disk, and we have nothing else mounted, /dev/disk2 is the .img file we just attached. Pay attention to use the correct device in case you have more disks attached.
We will have to mount both of the root partitions and edit some files in there.
Grab your hosted mender token (top right menu, under My organization) and inject it into the image.
Replace <token from hosted mender> with your token.
Verify the config file contents
Unmount and burn the image and we’re done. Adjust /dev/disk3 to your SD card device.
Boot the RPi and check the mender dashboard; you should see a new authorization request pop up.
Log in to the MikroTik box. We’re using the command-line interface via SSH, but you could use the web UI too. If you haven’t done this before, check out First time startup:
Every router is factory pre-configured with the IP address 192.168.88.1/24 on the ether1 port. The default username is admin with no password.
We checked the LTE interface and realized it is not joining any networks. If you can’t see the LTE interface/modem at all, you need to enable the mini-PCIe interface.
With the help of the AT Command Interface Specification for the HUAWEI ME906s LTE M.2 Module, we can see that roaming status is queried and set using the AT^SYSCFGEX command.
To check the existing roaming status, we need to look at the third parameter returned. Notice that we’re escaping the question mark ? with the backslash character \, because the RouterOS command-line interface interprets ? as the help command.
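The query looks roughly like this (the interface name lte1 and the response values are illustrative; your modem will report its own):

```
[admin@MikroTik] > /interface lte at-chat lte1 input="AT^SYSCFGEX\?"
  output: ^SYSCFGEX: "00",3FFFFFFF,0,2,7FFFFFFFFFFFFFFF
          OK
```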
Command | Possible Response(s)
---|---
AT^SYSCFGEX=? | <CR><LF>^SYSCFGEX: (list of supported <acqorder>s),(list of supported (<band>,<band_name>)s),(list of supported <roam>s),(list of supported <srvdomain>s),(list of supported (<lteband>,<lteband_name>)s)<CR><LF><CR><LF>OK<CR><LF>

<roam> indicates whether roaming is supported:

<roam> | Meaning
---|---
0 | Not supported
1 | Supported
2 | No change
The third parameter is 0 here, meaning roaming is not enabled. Let’s set it to 1 instead (notice that we’re keeping all the rest of the parameters unchanged from the output of the previous command).
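The set command, sketched from memory (reuse the exact <acqorder>, <band> and <srvdomain> values your modem reported; only the third parameter changes to 1, and the inner quotes need escaping in RouterOS):

```
[admin@MikroTik] > /interface lte at-chat lte1 input="AT^SYSCFGEX=\"00\",3FFFFFFF,1,2,7FFFFFFFFFFFFFFF"
  output: OK
```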
Query the status again.
Boom, roaming enabled. Verify with lte info to make sure we’re registered to a network.
One last thing we need to do is persist this across (router or modem) reboots. For this we have to set a ‘modem-init’ string that gets executed every time the modem is started.
To verify
Both were paid upgrades (41.42€ + $39, upgrading from Fusion 7), which was a bit bitter given I had only had the previous versions for a couple of months. Yet the word on the street was that the new version would be much more stable and less CPU-hungry than the previous generation. So what the heck, maybe I’d again be able to get more than a couple of hours of productive work done without draining the battery.
The purchase process itself was painless, as was installing VMware itself. But when I started installing the plugin, an ugly yak raised its hairy head.
The first thing I tried was to just use the old plugin with the new VMware version. Would it perhaps work?
That sounds like a resounding “No”.
So onwards: bought a license for the plugin as well and tried to install it.
No biggie, a little Googling revealed what I suspected: I don’t actually need to run this as root, I just need to accept the EULA of the latest Xcode version before running the install process again. I popped up Xcode, YOLOed the agreement, and was back at the terminal.
Again, the reason is clear: I need to build the gem against the proper OpenSSL libs – in this case, -I/usr/local/opt/openssl/include. Thus:
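If I recall the fix correctly, it boils down to passing the include path through to the extension build (the path assumes Homebrew’s OpenSSL keg):

```
gem install eventmachine -- --with-cppflags=-I/usr/local/opt/openssl/include
```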
Perfect. So I tried to install the Vagrant plugin again – and got the same error. Crap.
It turns out that Vagrant uses its own gem folder, so it’s not picking up what’s installed in one’s primary gem directory. The issue was, how do I tell Vagrant to use the correct cpp flags in its build process?
Fortunately I didn’t have to figure that out, because the end of the error message above gave me enough of a pointer towards a solution: if I only managed to install the eventmachine gem by hand to the correct location with the proper cpp flags, I should be fine. But how?
I dug into gem -h, and the defaults at the end gave the correct option away: with the --install-dir option I could install the gem wherever I wanted. Thus:
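Something along these lines (the gem directory is Vagrant’s own; double-check the path your Vagrant version actually uses):

```
gem install eventmachine --install-dir ~/.vagrant.d/gems -- --with-cppflags=-I/usr/local/opt/openssl/include
```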
Et voilà. Now one more try:
…and we’re off to the races!
…well, at least almost.
Oh well, that wasn’t very helpful. So I tried the proven trick #1: I killed the VMware app on OS X, and even its menubar daemon just to be certain, and sure enough, it did the trick. The Vagrant VM is now back on track.
Photo by Joseph Francis, used under a Creative Commons license.
In January Heroku started promoting Puma as the preferred web server for Rails apps deployed on its hugely successful platform. Puma – as a threaded app server – can better use the scarce resources available for an app running on Heroku.
This is obviously good for a client since they can now run more concurrent users with a single Dyno. However, it’s also good for Heroku itself since small apps (probably the vast majority of apps deployed on Heroku) will now consume much fewer resources on its servers.
The recommendation comes with a caveat, however. Your app needs to be thread-safe. The problem with this is that there is no simple way to say with absolute certainty whether an app as a whole is thread-safe. We can get close, however.
Let’s have a look how.
For the purpose of this issue, an app can be split into three parts:

- the Rails framework and the gems it bundles
- your own application code
- the third-party gems and plugins the app uses
All three of these need to be thread-safe. Rails and its included gems have been declared thread-safe since 2.2, i.e. since 2008. This alone, however, does not automatically make your app as a whole so. Your own app code and all the gems you use need to be thread-safe as well.
So when is your app code not thread-safe? Simply put, when you share mutable state between threads in your app.
But what does this even mean?
None of the core data structures (except for Queue) in Ruby are thread-safe. The structures are mutable, and when shared between threads, there are no guarantees the threads won’t overwrite each others’ changes. Fortunately, this is rarely the case in Rails apps.
Any code that consists of more than a single operation (as in a single Ruby call implemented in C) is not thread-safe. The classic example of this is the += operator, which is in fact two operations combined, + and =. Thus, the final value of a shared variable that several threads increment with += at the same time is undetermined.
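A minimal sketch of the race, in pure Ruby with no Rails involved:

```ruby
counter = 0

threads = 5.times.map do
  Thread.new do
    # += is read, add, write: the scheduler can switch threads
    # between the read and the write, losing updates.
    100_000.times { counter += 1 }
  end
end
threads.each(&:join)

# On MRI the GIL makes lost updates rare but not impossible; the only
# guarantee is that the final value is at most 500_000.
counter
```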
However, none of the two above things alone makes code thread-unsafe. It only becomes so when it is mated with shared data. Let’s get back to that in a minute, but first…
More informed readers might object at this point and point out that MRI Ruby uses a GIL, a.k.a. Global Interpreter Lock.
The general wisdom on the street is that GIL is bad because it does not let your threads run in parallel (true, in a sense), but good, because it makes your code thread-safe.
Unfortunately, GIL does not make your code thread-safe. It only guarantees that two threads can’t run Ruby code at the same time. Thus it does inhibit parallelism. However, threads can still be paused and resumed at any given point, which means that they absolutely can clobber each others’ data.
GIL does accidentally make some operations (such as Array#<<) atomic. However, there are two issues with this: it only holds on MRI, since alternative implementations such as JRuby and Rubinius have no GIL; and it is an accident of the implementation rather than a documented guarantee, so it may change in any future version.
Go read Jesse Storimer’s Nobody understands the GIL (also parts 2 and 3) for much more detail about it (than you can probably even stomach). But for the love of the flying spaghetti monster, don’t count on it making your app thread-safe.
A bit of history:
Rails and its dependencies were declared thread-safe already in version 2.2, in 2008. At that point, however, the consensus was that so many third party libraries were not thread-safe that the whole request in Rails was enclosed within a giant mutex lock. This meant that while a request was being processed, no other thread in the same process could be running.
In order to take advantage of threaded execution, you had to explicitly declare in your configuration that you really wanted to ditch the lock:
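The listing is lost from this copy, but it was presumably the familiar one-liner, which in Rails 3.x lived in the production environment config:

```ruby
# config/environments/production.rb (Rails 3.x)
config.threadsafe!
```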
However, en route to Rails 4, Aaron "Tenderlove" Patterson demonstrated that what config.threadsafe! actually did amounted to very little in a threaded production setup. What this meant was that there was no reason not to have the thread-safe option always on. And that was exactly what was done for Rails 4 in 2012.
Key takeaway: Rails and its dependencies are thread-safe. You don’t have to do anything to “turn that feature on”.
Good news: since Rails uses the shared-nothing architecture, Rails apps are consequently well suited to being thread-safe as well. In general, Rails creates a new controller object for every HTTP request, and everything else flows from there. This isolates most objects in a Rails app from other requests.
As noted above, built-in Ruby data structures (save for Queue) are not thread-safe. This does not, however, matter unless you are actually sharing them between threads. Because of the way Rails is architected, this almost never happens in a Rails app.
There are, however, some patterns that can come bite you in the ass when you want to switch to a threaded app server.
Global variables are, well, global. This means that they are shared between threads. If you weren’t convinced about not using global variables by now, here’s another reason to never touch them. If you really want to share something globally across an app, you are more than likely better served by a constant (but see below), anyway.
For the purpose of a discussion about threads, class variables are not much different from global variables. They are shared across threads just the same way.
The problem isn’t so much about using class variables, but about mutating them. And if you are not going to mutate a class variable, in many cases a constant is again a better choice.
But maybe you’ve read that you should always use class instance variables instead of class variables in Ruby. Well, maybe you should, but they are just as problematic for threaded programs as class variables.
It’s worth pointing out that both class variables and class instance variables can also be set by class methods. This isn’t such an issue in your own code, but you can easily fall into this trap when calling other APIs. Here’s an example from Pratik Naik where the app developer gets into thread-unsafe territory just by calling Rails class methods:
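The original listing is missing from this copy; a reconstruction of the kind of pattern Pratik warned about (not his exact code; mobile_request? is a hypothetical helper) looks roughly like this:

```ruby
class PostsController < ApplicationController
  before_filter :set_layout

  private

  # Calling the class-level `layout` macro per request mutates class
  # state (the @_layout class instance variable) shared by all threads.
  def set_layout
    self.class.layout(mobile_request? ? "mobile" : "application")
  end
end
```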
In this case, calling the layout
method causes Rails to set the class instance variable @_layout
for the controller class. If two concurrent requests (served by two threads) hit this code simultaneously, they might end up in a race condition and overwrite each others’ layout.
In this case, the correct way to set the layout is to use a symbol with the layout call:
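A reconstructed sketch of the fix (again, mobile_request? is a hypothetical helper): the symbol form defers the decision to an instance method, evaluated per request, so no shared class state is mutated.

```ruby
class PostsController < ApplicationController
  layout :determine_layout

  private

  # Evaluated on the controller instance for each request;
  # nothing is shared between threads.
  def determine_layout
    mobile_request? ? "mobile" : "application"
  end
end
```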
However, this is beside the point. The point is, you might end up using class variables and class instance variables by accident, thus making your app thread-unsafe.
Memoization is a technique where you lazily set a variable if it is not already set. It is a common technique used where the original functionality is at least moderately expensive and the resulting variable is used several times within a request.
A common case would be to set the current user in a controller:
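The lost listing was presumably something like this common idiom (a sketch, not the original code):

```ruby
class ApplicationController < ActionController::Base
  private

  # Memoized in an instance variable, i.e. per controller object,
  # i.e. per request: fine on its own.
  def current_user
    @current_user ||= User.find_by(id: session[:user_id])
  end
end
```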
Memoization by itself is not a thread safety issue. However, it can easily become one for a couple of reasons:

- It is often used to store data in class variables or class instance variables, which are shared between threads.
- The ||= operator is in fact two operations, so there is a potential context switch happening in the middle of it, causing a race condition between threads.

It would be easy to dismiss memoization as the cause of the problem, and tell people just to avoid class variables and class instance variables. However, the issue is more complex than that.
In this issue, Evan Phoenix squashes a really tricky race condition bug in the Rails codebase caused by calling super
in a memoization function. So even though you would only be using instance variables, you might end up with race conditions with memoization.
What’s a developer to do, then?

First, avoid sharing mutable state across threads in the first place. If you need data that lives for the duration of a thread, use a thread-local variable (Thread.current[:baz]) instead. Be aware, though, that it is still kind of a global variable. So while it’s thread-safe, it still might not be good coding practice.
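A quick sketch of thread-locals in action:

```ruby
Thread.current[:request_id] = "abc123"

other = Thread.new do
  # Each thread gets its own slot; this does not touch
  # the main thread's value.
  Thread.current[:request_id] = "def456"
  Thread.current[:request_id]
end

other.value                  # => "def456"
Thread.current[:request_id]  # => "abc123"
```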
If you absolutely think you must be able to share the result across threads, use a mutex to synchronize the memoizing part of your code. Keep in mind, though, that you’re kinda breaking the Shared nothing model of Rails with that. It’s kind of a half-assed sharing method anyway, since it only works across threads, not across processes.
Also keep in mind, that a mutex only saves you from race conditions inside itself. It doesn’t help you a whole lot with class variables unless you put the lock around the whole controller action, which was exactly what we wanted to avoid in the first place.
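A minimal sketch of mutex-guarded memoization (plain Ruby, no Rails; AppConfig is a made-up class for illustration):

```ruby
class AppConfig
  @lock = Mutex.new

  def self.instance
    # Synchronize only the memoizing part, keeping the critical
    # section as small as possible.
    @lock.synchronize { @instance ||= new }
  end
end

instances = 8.times.map { Thread.new { AppConfig.instance } }.map(&:value)
instances.uniq.size  # => 1, every thread saw the same object
```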
Finally, avoid calling super in memoization methods; as the Rails issue above shows, it can introduce race conditions even when only instance variables are involved.
Yes, constants. You didn’t believe constants are really constant in Ruby, did you? Well, they kinda are:
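The lost listing presumably showed something like this:

```ruby
TIMEOUT = 30
TIMEOUT = 60  # warning: already initialized constant TIMEOUT
TIMEOUT       # => 60, the reassignment went through regardless
```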
So you do get a warning when trying to reassign a constant, but the reassignment still goes through. That’s not the real problem, though. The real issue is that the constancy of constants only applies to the object reference, not the referenced object. And if the referenced object can be mutated, you have a problem.
Yeah, you remember right. All the core data structures in Ruby are mutable.
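Along the lines of the lost example:

```ruby
ALLOWED_ROLES = ["admin", "editor"]
ALLOWED_ROLES << "intern"  # no warning at all: we mutated the array, not the reference
ALLOWED_ROLES              # => ["admin", "editor", "intern"]
```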
Of course, you should never, ever do this. And few will. There’s a catch, however. Since Ruby variable assignments also use references, you might end up mutating a constant by accident.
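A sketch of how an accidental mutation sneaks in through an aliased reference:

```ruby
DEFAULT_OPTIONS = { retries: 3 }

def options_for(user_opts)
  opts = DEFAULT_OPTIONS  # copies the reference, not the hash
  opts.merge!(user_opts)  # ...so this mutates the hash behind the constant
end

options_for(retries: 10)
DEFAULT_OPTIONS[:retries]  # => 10, the "constant" changed under us
```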
If you want to be sure that your constants are never mutated, you can freeze them upon creation:
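With freeze, the mutation attempt raises instead:

```ruby
COLORS = ["red", "green"].freeze

error = begin
  COLORS << "blue"
  nil
rescue => e
  e
end
error.class  # FrozenError on Ruby >= 2.5 (a RuntimeError subclass)
```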
Keep in mind, though, that freeze is shallow. It only applies to the actual Array
object in this case, not its items.
ENV
is really just a hash-like construct referenced by a constant. Thus, everything that applies to constants above, also applies to it.
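For example (APP_MODE is a made-up variable name):

```ruby
ENV["APP_MODE"] = "production"  # mutates process-global state
seen_in_thread = Thread.new { ENV["APP_MODE"] }.value
seen_in_thread  # => "production", ENV is shared across every thread
```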
If you want your app to be thread-safe, all the third-party code it uses also needs to be thread-safe in the context of your app.
The first thing you probably should do with any gem is to read through its documentation and Google for whether it is deemed thread-safe. That said, even if it were, there’s no escaping double-checking yourself. Yes, by reading through the source code.
As a general rule, all that I wrote above about making your own code thread-safe applies here as well. However…
With 3rd party gems and Rails plugins, context matters.
If the third party code you use is just a library that your own code calls, you’re fairly safe (considering you’re using it in a thread-safe way yourself). It can be thread-unsafe just the same way as Array
is, but if you don’t share the structures between threads, you’re more or less fine.
However, many Rails plugins actually extend or modify the Rails classes, in which case all bets are off. In this case, you need to scrutinize the library code much, much more thoroughly.
So how do you know which type of the two above a gem or plugin is? Well, you don’t. Until you read the code, that is. But you are reading the code anyway, aren’t you?
Everything we mentioned above regarding your own code applies. In particular, keep an eye out for:

- Class variables (@@foo)
- Class instance variables (@bar, trickier to find since they look the same as any old ivar)
- Spawning threads (Thread.new, Thread.start). These obviously aren’t smells just by themselves. However, the risks mentioned above only materialize when data is shared across threads, so you should at least be familiar with in which cases the library is spawning new threads.

Again, context matters. Nothing above alone makes code thread-unsafe. Even sharing data with them doesn’t. But modifying that data does. So pay close attention to whether the libs provide methods that can be used to modify shared data.
No matter how thoroughly you read through the code in your application and the gems it uses, you cannot be 100% sure that the whole is thread-safe. Heck, even running and profiling the code in a test environment might not reveal lingering thread safety issues.
This is because many race conditions only appear under serious, concurrent load. That’s why you should both try to squash them from the code and keep a close eye on your production environment on a continual basis. Your app being perfectly thread-safe today does not guarantee the same is true a couple of sprints later.
To make a Rails app thread-safe, you have to make sure the code is thread-safe on three different levels: the framework (Rails and the gems it depends on), your own application code, and the third-party gems and plugins the app uses.
The first one of these is handled for you, unless you do stupid shit with it (like the memoization example above). The rest is your responsibility.
The main thing to keep in mind is to never mutate data that is shared across threads. Most often this happens through class variables, class instance variables, or by accidentally mutating objects that are referenced by a constant.
There are, however, some pretty esoteric ways an app can end up thread-unsafe, so be prepared to track down and fix the last remaining threading issues while running in production.
Have fun!
Acknowledgments: Thanks to James Tucker, Evan Phoenix, and the whole Bear Metal gang for providing feedback for the drafts of this article.
This article is a part of a series about Rails performance optimization and GC tuning. Other articles in the series:
Koichi Sasada (_ko1, Ruby MRI maintainer) famously mentioned in a presentation (slide 89):
Try GC parameters
- There is no silver bullet
- No one answer for all applications
- You should not believe other applications settings easily
- Try and try and try!
This is true in theory but a whole lot harder to pull off in practice due to three primary problems:
The garbage collector has frequently changed in the latest MRI Ruby releases. The changes have also broken many existing assumptions and environment variables that tune the GC. Compare GC.stat
on Ruby 2.1:
…with Ruby 2.2:
In Ruby 2.2 we can see a lot more to introspect and tune, but this also comes with a steep learning curve which is (and should be) out of scope for most developers.
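To get a feel for it, here is a minimal look at the collector’s counters (key names are the Ruby 2.2+ ones and can vary between versions):

```ruby
before = GC.stat
before[:count]            # total number of GC cycles run so far
before[:heap_live_slots]  # live object slots on the heap

GC.start  # force a full collection
GC.stat[:count] > before[:count]  # => true, the counter advanced
```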
A modern Rails application is typically used day to day in different contexts: serving HTTP requests through an application server, processing background jobs, and running the test suite in development and CI.
They all start pretty much the same way with the VM compiling code to instruction sequences. Different roles affect the Ruby heap and the garbage collector in very different ways, however.
This job typically runs for 13 minutes, triggers 133 GC cycles and allocates a metric ton of objects. Allocations are very bursty and in batches.
This controller action allocates 24 555 objects. Allocator throughput isn’t very variable.
Our test case contributes 175 objects to the heap. Test cases generally are very variable and bursty in allocation patterns.
The default GC behavior isn’t optimal for all of these execution paths within the same project and neither is throwing a single set of RUBY_GC_*
environment variables at it.
We’d like to refer to processing in these different contexts as “units of work”.
During the lifetime and development cycle of a project, it’s very likely that garbage collector settings that were valid yesterday aren’t optimal anymore after the next two sprints. Changes to your Gemfile, rolling out new features, and bumping the Ruby interpreter all affect the garbage collector.
Let’s have a look at a few events that are important during the lifetime of a process. They help the tuner gain valuable insights into how well the garbage collector is working and how to further optimize it. They all hint at how the heap changes and what triggered a GC cycle: how many mutations happened, for example, while serving a single request.

BOOTED: when the application is ready to start doing work. For a Rails application, this is typically when the app has been fully loaded in production, ready to serve requests, ready to accept background work, etc. All source files have been loaded and most resources acquired.

STARTED UNIT OF WORK: at the start of a unit of work. Typically the start of an HTTP request, when a background job has been popped off a queue, the start of a test case or any other type of processing that is the primary purpose of running the process.

FINISHED UNIT OF WORK: at the end of a unit of work. Typically the end of an HTTP request, when a background job has finished processing, the end of a test case or any other type of processing that is the primary purpose of running the process.

TERMINATED: triggered when the application terminates.
Tracking GC cycles interleaved with the aforementioned application events yields insights into why a particular GC cycle happened. The progression from BOOTED to TERMINATED and everything in between is important, because mutations that happen during the fifth HTTP request of a new Rails process also contribute to a GC cycle during request number eight.
Primarily the garbage collector exposes tuning variables in three categories: heap slot counts and growth (e.g. RUBY_GC_HEAP_INIT_SLOTS, RUBY_GC_HEAP_GROWTH_FACTOR), malloc limits (e.g. RUBY_GC_MALLOC_LIMIT and its growth factor), and old-generation / major GC behavior (e.g. RUBY_GC_OLDMALLOC_LIMIT, RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR).
Tuning GC parameters is generally a tradeoff between tuning for speed (thus using more memory) and tuning for low memory usage while giving up speed. We think it’s possible to infer a reasonable set of defaults from observing the application at runtime that’s conservative with memory, yet maintain reasonable throughput.
We’ve been working for a few weeks on a product, TuneMyGC, that attempts to do just that.
Here’s an example of Discourse being automatically tuned for better 99th percentile throughput. Response times in milliseconds, 200 requests:
Controller | GC defaults | Tuned GC
---|---|---
categories | 227 | 160
home | 163 | 113
topic | 55 | 40
user | 92 | 76
Raw GC stats from Discourse’s bench.rb script, with the GC defaults:
Raw GC stats from Discourse’s bench.rb script, with the tuned GC:
We can see a couple of interesting points here:

- There is a large number of free slots (heap_free_slots) out of the 1179182 available slots (heap_available_slots), which is 64% of the current live objects (heap_live_slots). This value however is slightly skewed, because the Discourse benchmark script forces a major GC before dumping these stats – there are about as many swept slots as free slots (heap_swept_slots).
- Malloc limits (malloc_increase_bytes_limit and oldmalloc_increase_bytes_limit) and growth factors (old_objects_limit and remembered_wb_unprotected_objects_limit) are in line with actual app usage. The TuneMyGC service considers when limits and growth factors are bumped during the app lifecycle, and attempts to raise limits via environment variables slightly higher to prevent excessive GC activity.

Now it’s your turn.
This article is a part of a series about Rails performance optimization and GC tuning. Other articles in the series:
I used to suffer from terrible stage fright. I was super nervous every time I presented. I forgot stuff I was supposed to say on stage. I never vomited before a talk, though, I’ll give you that.
It got better over time through lots of practice, but I still get all sweaty and shaky before getting on stage.
Then I recently stumbled upon an article by Kathy Sierra called Presentation Skills Considered Harmful. In it she tells of having had similar problems, and of how all the tutorials told her what you should do and how you should do it to give a good presentation. You, you, you.
Then she realized that a presentation is really a UX. A presentation is just a user experience. You present ideas – hopefully good ones – to your audience. What does this make you, the presenter? A UI. You are just a UI, a user interface. You yourself don’t matter that much. All that matters is that your ideas touch your audience.
Is that bad? No, it’s great. It’s a huge relief. What matters is not you but what you have to say, the meat of your talk.
And that brings us nicely to the topic of this article.
It’s not about you.
This is maybe the most important thing I’ve learned during the past decade1. It’s also not only profound, but spans your entire life.
These days, everyone and their cat has a blog.
Most blogs tell about the author, which is totally fine — if the author writes it for themselves or their relatives. But when you think about the most helpful blogs – the ones you go back to all the time – they really aren’t about the author. They are about helping the reader, either by informing them or teaching them new things.
But Jarkko, I hear you say. You just started this article with a story about yourself.
That is correct. Storytelling gets kind of an exception.
Except not really. Because storytelling isn’t really about the storyteller, either. You know, we didn’t always know how to write. Telling stories was the only way to pass information to younger generations. Thus, our brains are quite literally wired to react to storytelling. We’re evolutionarily built to learn better from stories.
Thus, stories are not so much about you, the teller, either. Stories are about the listener/reader, and how they relate to the protagonist of the narrative.
And in the end, unless you’re writing fiction, stories are just a tool as well. A powerful one, yes, but just a tool to bring home a lesson to the reader.
Because writing is not about you.
Cincinnati Enquirer learned this the hard way recently. After they laid off their whole copy desk, they were shocked to find out that readers were outraged about the deteriorating quality of the paper’s articles. John E. McIntyre describes the issue vividly:
The reader doesn’t care how hard you worked, what pressures you are under, or how good your intentions are. The reader sees the product, online or in print; if the product looks sloppy and substandard, the reader will form, and likely express, a low opinion of it. And the reader is under no obligation whatsoever to be kind.
What the Enquirer forgot was that their writing is really not about them.
But you don’t wanna hear me babble about writing, let’s get to business.
Are you still looking for that killer business idea?
Stop.
Ideas are all about you. The story goes that to get into business you should – apparently through divine intervention or something – come up with a dazzling idea.
Ideas are also dangerous because once you get one, it makes you a victim of confirmation bias. You’re going to start retrofitting the world to your idea, which is totally backwards.
Instead, find an audience you can relate to and sell to. Then, find out about their pains, their problems, and ways to help them make more money. Then solve that pain and you’ll do well.
Because – to quote Peter Drucker – the purpose of business is to create and keep a customer. It’s that customer, not you, who is going to decide the fate of your business.
Because successful business isn’t about you.
You know what’s special about Apple ads? They almost never list features or specs. Instead, they show what their users can do with them. Shoot video. Facetime with their relatives on the other side of the globe. Throw a DJ gig with just an iPad or two.
My friend Amy Hoy has this formula for great sales copy called PDF, for Pain–Dream–Fix: first agitate the reader’s pain, then paint the dream of life without it, and only then present your fix.
Now that is how you make people pay attention.
Because great marketing and sales copy is not about you, either.
Last, and indeed least, we get to the actual product, software in our case.
It – as and of itself – is not all that important. Because having a great product is not about you, or the product itself. It’s about solving a customer pain or a job they need to tackle.
Because people don’t buy a quarter-inch drill, they buy a quarter-inch hole in the wall.
By now, you already know the pains of your audience2. Now it’s time to solve them. From previous points, you should already have a nice roadmap to a product people like.
For extra credit, let’s see how we could transform people from liking your product to loving and raving about it.
We started this talk with Kathy Sierra, and we’re going to end it with her as well.
Kathy’s big idea is that the main purpose of software is to make its users kick ass. What she means by this is that software – or any product really – should help their users to get better at what they do, not just using the product3.
Screw gamification.
Final Cut Pro should not make its users better at using Final Cut Pro. It should make them better film editors.
WODConnect should not make its users better at using the app. It should make them stronger and faster.
This should happen through everything related to your product. The product itself, its marketing, its manuals, its customer support, everything.
Because your success is not about you or your product.
It’s about the users.
It’s about empathy.
Discuss on Hacker News.
Well, let’s just say it’s a tie with learning about the growth mindset.↩
If not, go back to the Business Ideas part above. Do not – and I can’t stress this enough – start building a product before you are sure it solves a problem people have, know they have, and are willing to pay for.↩
All this is also the subject of Kathy’s upcoming book, Badass: Making Users Awesome↩
Photo by Susanne Nilsson, used under a Creative Commons license.
Finland, 2013. Vantaa, the second largest municipality in Finland, buys a new web form for welfare applications from CGI (née Logica, née WM-Data) for a whopping €1.9 million. The story doesn’t end there, though. A month later it turns out that Helsinki has bought the exact same form from CGI as well, for €1.85 million.
Now, you can argue about what is a fair value for a single web form, especially when it has to be integrated into an existing information system. What is clear, though, is that it is not almost 2 million Euros, twice.
“How on earth was that possible,” I hear you ask. Surely someone would have offered to do that form for, say, 1 million a pop. Heck, even the Finnish law for public procurements mandates public competitive bidding for such projects.
Vendor lock-in. CGI was administering the information system on which the form was to be built. And since they held the key, they could pretty much ask for as much as the municipalities could potentially pay for the form.
Now hold that thought.
Over at High Scalability, Todd Hoff writes about James Hamilton’s talk at the AWS re:Invent conference last November. It reveals how gigantic the scale of Amazon Web Services really is:
Every day, AWS adds enough new server capacity to support all of Amazon’s global infrastructure when it was a $7B annual revenue enterprise (in 2004).
This also means that AWS is leaps and bounds above its competitors when it comes to capacity:
All 14 other cloud providers combined have 1/5th the aggregate capacity of AWS (estimate by Gartner)
This of course gives AWS a huge benefit compared to its competitors. It can run larger datacenters both close to and far from each other; they can get sweetheart deals and custom-made components from Intel for servers, just like Apple does with laptops and desktops. And they can afford to design their own network gear, the one field where progress hasn’t followed Moore’s law. There the only other companies who do the same are other internet giants like Google and Facebook, but they’re not in the same business as AWS1.
All this is leading to a situation where AWS is becoming the IBM of the 21st century, for better or for worse. Just like no one ever got fired for buying IBM in the 80’s, few will likely get fired for choosing AWS in the years to come. This will be a tough, tough nut to crack for Amazon’s competitors.
So far the situation doesn’t seem to have slowed down Amazon’s rate of innovation, and perhaps they have learned the lessons of Big Blue. Only the future will tell.
From a customer’s perspective, a cloud platform like AWS brings lots and lots of benefits – well listed in the article above – but of course also downsides. Computing power is still much cheaper when bought in physical servers. You can rent a monster Xeon server with basically unlimited bandwidth for less than €100/month. AWS or platforms built on it such as Heroku can’t compete with that on price. So if you’re very constrained on cash and have the sysadmin chops to operate the server, you will get a better deal.
Of course we’re comparing apples and oranges here. You won’t get similar redundancy and flexibility with physical servers as you can with AWS for any money – except when you do. The second group where using a commercial cloud platform doesn’t make sense is when your scale merits a cloud platform of your own. Open source software for such platforms – such as Docker and Flynn – is slowly getting to a point where you can rent your own servers and basically build your own AWS on them2. Of course this will take a lot more knowledge from your operations team, especially if you want to attain redundancy and high availability similar to what you can with AWS Availability Zones.
There is – however – one hidden benefit of going with a commercial cloud platform such as AWS, that you might not have thought about: going with AWS will lessen your vendor lock-in a lot. Of course you can still shoot yourself in the foot by handing off the intellectual property rights of the software to your vendor or some other braindead move. But given that you won’t, hosting is another huge lock-in mechanism large IT houses use to screw their clients. It not only effectively locks the client to the vendor, but it also massively slows down any modifications made by other projects that need to integrate with the existing system, since everything needs to be handled through the vendor. They can, and will, block any progress you could make yourself.
With AWS, you can skip all of that. You are not tied to a particular vendor to develop, operate, and extend your system. While running apps on PaaS platforms requires some specific knowledge, it is widely available and standard. If you want to take your systems to another provider, you can. If you want to build your own cloud platform, you can do it and move your system over bit by bit.
It is thus no wonder that large IT consultancies are racing to build their own platforms, to hit all the necessary buzzword checkboxes. However, I would be very wary of their offerings. I’m fairly certain the savings they get from better utilization of their servers by virtualization are not passed on to the customer. And even if some of them are, the lock-in is still there. They have absolutely no incentive to make their platform compatible with the existing ones, quite the contrary. Lock-in is on their side. It is not on your side. Beware.
Do your signs read like those “Private property. Keep out!” signs? “No eating of own food”. “Food may not be taken out of the restaurant!” Don’t you have more important things to care about, like, whether your customers love you?
I recently saw this sign on the outside wall of a small roadside restaurant1:
It reads: “You may of course eat your own food here as well.”
So, why should you be more like Tiinan Tupa and less like ABC?
First of all, very few people will probably abuse it. Even if someone eats their own lunch packs there, they will probably reciprocate and buy something as a favor to the place.
Second, it is the ultimate purple cow. It is something unexpected, something remarkable people will pay attention to. People are so accustomed to passive-aggressive notes forbidding this and that that a sign being actually nice will be noticed and talked about.
Lastly, and most importantly, it’s just plain being nice. So stop just using glorious and empty words about how you care about your customers and actually walk the walk.
For these three reasons, being ridiculously accommodating toward your customers will probably also be good for you in the long term. So stop picking up nickels in front of bulldozers. Make sure people give you money because they love you, not because they have to.
Coincidentally, I found it on Vaihtoehto ABC:lle, a site and an app dedicated to finding alternatives to the ubiquitous, generic and boring ABC gas station chain that seems to have infiltrated the whole country in just a couple of years. ABC specifically forbids eating food bought from its own supermarket in the cafeteria of the same gas station.↩
Photo by Pascal
It was to be our day. The Finnish championship relay in orienteering was about to be held close to us, in a terrain type we knew inside and out. We had a great team, with both young guns and old champions. My friend Samuli had been fourth in the individual middle distance championships the day before. My only job was to handle the first leg securely and pass the baton to the more experienced and tougher guys who would then take care of our success. And I failed miserably.
My legs were like dough from the beginning. I was supposed to be in good shape, but I couldn’t keep up with anyone. I was supposed to take it easy and orienteer cleanly, but I ran like a headless chicken, making one mistake after another. Although I wouldn’t learn the term until years later, this was my crash course in ego depletion.
The day before the relay we organized the middle distance Finnish champs in my old hometown Parainen. For obvious reasons, I was the de facto webmaster of our club pages, which also hosted the result service. The site was running on OpenACS, a system running on TCL I had a year or so of work experience with. I was supposed to know it.
After the race was over, I headed back to my friend’s place, opened up my laptop… only to find out that the local orienteering forums were ablaze with complaints about our results page being down. Crap.
After hours of hunting down the issue, getting help on the OpenACS IRC channel, and serving the results from a static page in the meantime, I finally managed to fix the issue. The app wasn’t running enough server processes to keep up with the load. And the most embarrassing thing was that the load wasn’t even that high – from high dozens to hundreds of simultaneous users. I headed to bed with my head spinning, hoping to scramble up my self confidence for the next day’s race (with well-known results).
What does this have to do with Ruby or Rails? Nothing, really. And yet everything. The point is that most of us have a similar story to share. It’s much more common to have a meltdown story with a reasonably low number of users than actually have a slashdotting/hackernewsing/daring fireball hit your app. If you aren’t old enough to have gone through something like this, you probably will. But you don’t have to.
During the dozen or so years since the aforementioned episode, I’ve gone through some serious loads. Some of them we have handled badly, but most – including the Facebook terms of service vote campaign – with at least reasonable grace. This series of articles about Rails performance builds upon those war stories.
We have already posted a couple of articles to start off the series.
This article will serve as a kind of belated intro to the series, introducing our high level principles regarding the subject without going further into the details.
As I already noted in Does Rails Scale?, it’s worth pointing out that performance is not the same thing as scalability. They are related for sure. But you can perform similarly poorly from your first to your millionth user and be “scalable”. There is also the difference that performance is right here and now. If your app scales well, you can just throw more hardware (virtual or real) at the problem and solve it by that.
The good news is that Rails scales quite well out of the box for the vast majority of real-world needs. You probably won’t be the next Twitter or even Basecamp. In your dreams and VC prospectus maybe, but let’s be honest, the odds are stacked against you. So don’t sweat about that too much.
Meanwhile, you do want your app to perform well enough for your initial wave of users.
There are basically three different layers of performance for any web app: the server level (I’m bundling everything from the data store to the frontend server here), browser rendering and the performance perceived by the user. The two latter ones are very close to each other but not exactly the same. You can tweak the perceived performance with tricks on the UI level, something that often isn’t even considered performance.
The most important lesson here is that the perceived performance is what matters. It makes no difference what your synthetic performance tests on your server say if the UI of the app feels sluggish. There is no panacea to solve this, but make no mistake, it is what matters when the chickens come home to roost.
The fact is that even a modest amount of users can make your app painfully slow. The good thing is that you can probably fix that just by hitting the low-hanging fruit. You won’t believe how many Rails apps reveal that they’re running in development mode even in production by – when an error occurs – leaking the whole error trace out to the end user.
Other examples of issues that are fairly easy to spot and fix are N+1 query issues with ActiveRecord associations, missing database indexes, and running a single, non-concurrent app server instance, where any longer-running action will block the whole app from other users.
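To make the N+1 point concrete, here’s a dependency-free sketch (no ActiveRecord, queries just counted; the posts/comments schema is made up) of why eager loading – `Post.includes(:comments)` in Rails – beats firing one comments query per post:

```ruby
# Simulated query log: every "query" just records its SQL
QUERIES = []

def query(sql)
  QUERIES << sql
  [] # pretend result set
end

posts = (1..50).map { |id| { id: id } }

# The N+1 pattern: one query for the posts, then one more per post
query("SELECT * FROM posts")
posts.each { |post| query("SELECT * FROM comments WHERE post_id = #{post[:id]}") }
puts QUERIES.size # => 51

# Eager loading: two queries total, regardless of the number of posts
QUERIES.clear
query("SELECT * FROM posts")
query("SELECT * FROM comments WHERE post_id IN (#{posts.map { |p| p[:id] }.join(',')})")
puts QUERIES.size # => 2
```

With 50 posts the naive loop issues 51 queries; eager loading stays at two no matter how many rows there are.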
Once you have squashed all the low-hanging fruit with your metal-reinforced bat, relax. Tuning app performance shouldn’t be your top priority at the moment – unless it is, but in that case you will know for sure. What you should be focusing on is how to get paying customers and how you can make them kick ass. If you have clear performance issues, by all means fix them. However…
You probably don’t have any idea how many users your app needs to support from the get go. That’s fine. The reality will teach you. As long as you don’t royally fuck up the share nothing (and 12 factor if you’re on a cloud platform such as Heroku) architecture, you should be able to react to issues quickly.
That said, you probably do want to do some baseline load testing with your app if you’re opening for a much larger private group or the public. The good news is that it is very cheap to spin up a virtual server instance just for a couple of hours and hit your app hard with it. Heck, you can handle the baseline from your laptop if needed. With that you should be able to get over the initial, frightening launch.
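A baseline doesn’t need fancy tooling; plain ApacheBench from any machine will do (the URL and numbers here are purely illustrative):

```shell
# 1000 requests, 50 at a time, against a staging copy of the app
ab -n 1000 -c 50 https://staging.example.com/
```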
Once your app is up and running under load from real users, your tuning work starts for real. Only now will you be able to measure where the real hot paths and bottlenecks in your app are, based on real usage data, not just assumptions. At this point you’ll have a plethora of tools at your disposal, from the good old request log analyzer to commercial offerings such as Skylight, and New Relic.
On the frontend, most browsers nowadays have developer tools to optimize end-user performance, from Chrome and Safari’s built-in developer tools to Firebug for Firefox.
In this introductory article to building performant Rails (or any, for that matter) web apps, we took a look at five basic rules of performance optimization:

1. Performance is not the same thing as scalability.
2. Perceived performance is what matters, not synthetic server benchmarks.
3. Pick the low-hanging fruit first.
4. Don’t make performance tuning your top priority before you have real users and clear issues.
5. Once your app is under real load, measure before you optimize.
We will get (much) more into the details of Rails performance optimization in later articles. At that point we’ll enter a territory where one size does not fit all anymore. However, whatever your particular performance problem is, you should keep the five rules above at the top of your mind.
When I opened up the log file mentioned in the output above, I could see the actual cause of the error.
The weird thing here is that I did not get the usual ‘Missing the OpenSSL lib?’ warning. The lib was found but somehow the headers were fucked up. It also did not happen with older rbenv Rubies.
Thanks to Tarmo I found the solution here.
What I had to do was this:
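In short: install a fresh OpenSSL with Homebrew (a sketch; your Homebrew prefix may differ):

```shell
brew update
brew install openssl
```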
Now make sure that your new binary is in your PATH before the system one.
```
$ which openssl
/usr/bin/openssl
```
That’s no good. Let’s fix our PATH. I’m using zsh, so for me it’s set in `~/.zshrc`. Your particular file depends on the shell you’re using (for bash it would be `~/.bashrc` or `~/.bash_profile`, but see the caveat here).
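Something along these lines (assuming Homebrew’s default `/usr/local` prefix):

```shell
# In ~/.zshrc (or your shell's equivalent): put Homebrew's bin
# directory ahead of the system paths
export PATH="/usr/local/bin:$PATH"
```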
Open up a new terminal window and check that the PATH is correct:
```
$ which openssl
/usr/local/bin/openssl
```
Better. Now, let’s make sure that homebrew libs symlink to the newer openssl.
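A sketch of the relinking step (`--force` was needed for keg-only formulas like openssl at the time; newer Homebrew versions may refuse it):

```shell
brew unlink openssl && brew link openssl --force
```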
Splendid.
After that, Ruby 2.2.0 installed cleanly, without any specific parameters needed.
[UPDATE 1, Jan 7] The original version of this post told you to `rm /usr/bin/openssl`, based on the link above. As James Tucker pointed out, this is a horrible idea. I fixed the article so that we now fix the `$PATH` instead.
As a prerequisite of this and subsequent posts, basic understanding of a mark and sweep1 collector is assumed.
A somewhat simplified mark and sweep cycle goes like this:
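Those steps can be sketched in deliberately naive Ruby (a toy model, not MRI’s actual implementation):

```ruby
# A toy heap object: it knows what it references and whether it's marked
Obj = Struct.new(:refs, :marked)

# MARK: walk the graph from the roots, flagging everything reachable
def mark(obj)
  return if obj.marked
  obj.marked = true
  obj.refs.each { |ref| mark(ref) }
end

# SWEEP: reclaim everything left unmarked, reset marks for the next cycle
def sweep(heap)
  live, dead = heap.partition(&:marked)
  live.each { |o| o.marked = false }
  [live, dead.size]
end

a      = Obj.new([], false)
b      = Obj.new([a], false) # b references a
orphan = Obj.new([], false)  # nothing references this one

heap  = [a, b, orphan]
roots = [b] # in a real VM: the stack, globals, etc.

roots.each { |root| mark(root) }
heap, reclaimed = sweep(heap)
puts reclaimed  # => 1 (the orphan)
puts heap.size  # => 2
```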
A GC cycle prior to Ruby 2.1 works like that. A typical Rails app boots with 300 000 live objects, all of which need to be scanned during the MARK phase. That usually yields a smaller set to SWEEP.
A large percentage of the graph is going to be traversed over and over again but will never be reclaimed. This is not only CPU intensive during GC cycles, but also incurs memory overhead for accounting and anticipation for future growth.
What generally makes an object old?
The idea behind the new generational garbage collector is this:
MOST OBJECTS DIE YOUNG.
To take advantage of this fact, the new GC classifies objects on the Ruby heap as either OLD or YOUNG. This segregation now allows the garbage collector to work with two distinct generations, with the OLD generation much less likely to yield significant gains in recovered memory.
For a typical Rails request, some examples of old and new objects would be: the framework and application code, model classes, and database column information loaded at boot (old), versus request parameters, transient strings, and per-request ActiveRecord attribute sets (young).
Young objects are more likely to reference old objects than old objects referencing young objects. Old objects also frequently reference other old objects.
Notice how the transient attribute keys and names reference the long lived columns here:
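A hedged, dependency-free sketch of the same idea (`COLUMNS` and `attributes_for` are made-up stand-ins for ActiveRecord’s column cache and per-request attributes):

```ruby
# Column-name strings live for the whole process (old generation after a
# few GC cycles); per-request attribute structures are young and
# transient - but they point at the long-lived strings.
COLUMNS = %w[id title created_at].map(&:freeze) # long lived

def attributes_for(row)
  COLUMNS.zip(row) # young, per-request pairs referencing old strings
end

attrs = attributes_for([1, "Hello", "2015-02-26"])
puts attrs.first.first.equal?(COLUMNS.first) # => true: the very same String
```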
Age segregation is also just a classification - old and young objects aren’t stored in distinct memory spaces - they’re just conceptual buckets. The generation of an object refers to the number of GC cycles it has survived:
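You can watch a promotion happen with the objspace extension (the exact JSON flags vary by Ruby version):

```ruby
require "objspace"

obj = Object.new
3.times { GC.start } # survive a few cycles so the object gets promoted
# ObjectSpace.dump serializes the object's heap entry; once promoted,
# the output contains "old":true among its GC flags
puts ObjectSpace.dump(obj)
```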
You may have heard, read about or noticed in GC.stat output the terms “minor” and “major” GC.
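For example, the generational counters look like this (key names as of Ruby 2.1+; some keys differ slightly across versions):

```ruby
# GC.stat returns a Hash of collector counters and thresholds
stats = GC.stat
puts stats[:minor_gc_count] # partial-mark cycles so far
puts stats[:major_gc_count] # full-mark cycles so far
puts stats[:count]          # total: minor + major
```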
Minor GC (or “partial marking”): This cycle only traverses the young generation and is very fast. Based on the hypothesis that most objects die young, this GC cycle is thus the most effective at reclaiming back a large ratio of memory in proportion to objects traversed.
It runs quite often - 26 times for the GC dump of a booted Rails app above.
Major GC: Triggered by out-of-memory conditions - Ruby heap space needs to be expanded (not OOM killer! :-)) Both old and young objects are traversed and it’s thus significantly slower than a minor GC round. Generally when there’s a significant increase in old objects, a major GC would be triggered. Every major GC cycle that an object survived bumps its current generation.
It runs much less frequently - six times for the stats dump above.
The following diagram represents a minor GC cycle (MARK phase completed, SWEEP still pending) that identifies and promotes some objects to old.
A subsequent minor GC cycle (MARK phase completed, SWEEP still pending) ignores old objects during the mark phase.
Most of the reclaiming efforts are thus focused on the young generation (new objects). Generally 95% of objects are dead by the first GC. The current generation of an object is the number of major GC cycles it has survived.
At a very high level, C Ruby 2.1’s collector has the following properties: it is generational, segregating old from young objects; minor GC cycles are fast and frequent while major cycles are slower and rarer; write barriers maintain a remembered set of old-to-young references; and collection still stops the world.
This is a marked improvement to the C Ruby GC and serves as a base for implementing other advanced features moving forward. Ruby 2.2 supports incremental GC and object ages beyond just old and new definitions. A major GC cycle in 2.1 still runs in a “stop the world” manner, whereas a more involved incremental implementation (Ruby 2.2) interleaves short steps of mark and sweep cycles between other VM operations.
In this simple example below we create a String array with three elements.
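The example can be reconstructed roughly like this, using the objspace extension to show the one-way references:

```ruby
require "objspace"

array = ["one", "two", "three"]

# The array knows about (references) each of its String elements...
refs = ObjectSpace.reachable_objects_from(array)
puts refs.grep(String).sort.inspect # => ["one", "three", "two"]

# ...but a string knows nothing about the array holding it
puts ObjectSpace.reachable_objects_from(array[0]).include?(array) # => false
```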
Very much like a river flowing downstream, the array has knowledge of (a reference to) each of its String elements. On the contrary, the strings don’t have an awareness of (or references back to) the array container.
We stated earlier that:
Young objects are more likely to reference old objects, than old objects referencing young objects. Old objects also frequently reference other old objects.
However it’s possible for old objects to reference new objects. What happens when old objects reference new ones?
Old objects with references to new objects are stored in a “remembered set”. The remembered set is a container of references from old objects to new objects and is a shortcut for preventing heap scans for finding such references.
As our friend Ezra used to say, “no code is faster than no code.” The same applies to automatic memory management. Every object allocation also has a variable recycle cost. Allocation generally is low overhead as it happens once, except for the use case where there are no free object slots on the Ruby heap and a major GC is triggered as a result.
A major drawback of this limited segregation of OLD vs YOUNG is that many transient objects are in fact promoted to old during large contexts such as a Rails request. These long lived objects eventually become unexpected “memory leaks”. These transient objects can be conceptually classified as of “medium lifetime” as they need to stick around for the duration of a request. There’s however a large probability that a minor GC would run during request lifetime, promoting young objects to old, effectively increasing their lifetime to well beyond the end of a request. This situation can only be revisited during a major GC which runs infrequently and sweeps both old and young objects.
Each generation can be specifically tweaked, with the older generation being particularly important for balancing total process memory use against maintaining a minimal transient object set (young objects) per request, and against too-fast promotion from the young to the old generation.
In our next post we will explore how you’d approach tuning the Ruby GC for Rails applications, balancing tradeoffs of speed and memory. Leave your email address below and we’ll let you know as soon as it’s posted.
See the Ruby Hacking Guide’s GC chapter for further context and nitty gritty details. I’d recommend scanning the content below the first few headings, until turned off by C.↩
Photo by Soctech
Back when Rails was still not mainstream, a common dismissal by developers using other – more established – technologies was that Rails is cool and stuff, but it will never scale1. While the question isn’t (compared to Rails’ success) as common these days, it still appears in one form or another every once in a while.
Last week on the Ruby on Rails Facebook group, someone asked this question:
Can Rails stand up to making a social platform like FB with millions of users using it at the same time?
If so what are the pro’s and the cons?
So in other words, can Rails scale a lot?
Just as is customary for a Facebook group, the question got a lot of clueless answers. There were a couple of gems like this:
Tony if you want to build some thing like FB, you need to learn deeper mountable engine and SOLID anti pattern.
The worst however are answers from people who don’t know they don’t know shit but insist on giving advice that is only bound to either confuse the original poster or lead them astray – and probably both:
Twitter is not a good example. They stopped using Rails because it couldn’t handle millions of request per second. They began using Scala.
This is of course mostly BS with a hint of truth in it, but we’ll get back to that in a bit.
The issue with the question itself is that it’s the wrong question to ask, and this has nothing to do with Ruby or Rails per se.
Why is it the wrong question? Let’s have a look.
Sure, Ruby is slow in raw performance. It has gotten a lot faster during the past decade, but it is still a highly dynamic interpreted scripting language. Its main shtick has always been programmer happiness, and its primary way to attain that goal has definitely not been to return from that test run as fast as possible. The same goes for Rails.
That said, there are two reasons bare Ruby performance doesn’t matter that much. First, it’s only a tiny part of the perceived app performance for the user. Rails has gone out of its way to automatically make the frontend of your web app performant. This includes frontend caching, asset pipeline, and more opinionated things like Turbolinks. You can of course screw all that up, but you would be amazed how much actual end-user performance you’d miss if you’d write the same app from scratch – not to mention the time you’d waste building it.
Second, and most important for this discussion: scaling is not the same thing as performance. Rails has always been built on the shared nothing architecture, where in theory the only thing you need to do to scale out your app is to throw more hardware at it – the app should scale linearly. Of course there are limits to this, but they are by no means specific to Rails or Ruby.
Scaling and performance are two separate things. They are related as terms, but not strictly connected. Your app can be very fast for a couple users but awful for a thousand (didn’t scale). Or it can scale at O(1) to a million users but loading a page for even a single concurrent user can take 10 seconds (scales but doesn’t perform).
As stated above, a traditional CRUD-style app on Rails can be made to scale very well just by adding app server instances, cores, and finally physical app servers serving the app. This is what is meant by scaling out2. Most often the limiting factor here is not Rails but the datastore, which is still often the one shared component in the equation. Scaling the database out is harder than scaling the app servers, but nevertheless possible. That is way outside the scope of this article, however.
It’s clear in hindsight that a Rails app wasn’t the right tool for what Twitter became – a juggernaut where millions of people were basically realtime chatting with the whole world.
That doesn’t mean that Rails wasn’t a valid choice for the original app. Maybe it wasn’t the best option even then from the technical perspective, but it for sure made developing Twitter a whole lot faster in its initial stages. You know, Twitter wasn’t the only contender in the microblogging space in the late noughties. We Finns fondly remember Jaiku. Then there was that other San Francisco startup using Django that I can’t even name anymore.
Anyway, the point is that reaching a scale where you have to think harder about scalability is a very, very nice problem to have. Either you built a real business and are making money hand over fist, or you are playing – and winning – the eyeball lotto and have VCs knocking on your door (or, more realistically, have taken on several millions already). The vast majority of businesses never reach this stage.
More likely you just fail in the hockeystick game (the VC option), or perhaps build a sustainable business (the old-fashioned people pay me for helping them kick ass kind). In any case, you won’t have to worry about scaling to millions of concurrent users.
Even in the very profitable, high-scale SaaS market there are hordes of examples of apps running on Rails. Kissmetrics runs its frontend on Rails, as does GitHub, not to mention Groupon, Livingsocial3, and many others.
However, at a certain scale you have to go for a more modular architecture – SOA, if I may. You can use a message queue for message passing, a NoSQL database for non-relational and ephemeral data, Node.js for realtime apps, and so on. A good tool for every particular sub-task of your app.
That said, you need to keep in mind what I said above. It is pretty unlikely you will ever reach a state where you really need to scale. Thus, thinking too much about the architecture at the initial stage is a form of premature optimization. As long as you don’t do anything extra stupid, you can probably get away with a simple Rails app. Because splitting up your app into lots of components early on makes several things harder and more expensive.
This doesn’t mean that at some point you shouldn’t do the split. There might be a time where the scale for the points above tips, and a monorail app becomes a burden. But then again, there might not. So do what makes sense now, not what makes sense in your imaginary future.
Of course Rails alone won’t scale to a gazillion users for an app it wasn’t really meant for to begin with. Nor is it supposed to. However, it is amazing how far you can get with it, just the same way that old boring PostgreSQL still beats the shit out of its more “modern” competitors in most common use cases4.
When making a technology decision, instead of “Does it scale?”, here’s what you should be asking:
Only after answering those are you equipped to make a decision.
P.S. Reached the point where optimizing Rails and Ruby performance does make a difference? We’re writing a series of articles about just that. Pop your details in the form ☟ down there and we’ll keep you posted.
Another good one I heard in EuroOSCON 2005 was that the only thing good about Rails is its marketing.↩
Versus scaling up, which means making the single core or thread faster.↩
OK, the last two might not pass the profitable bit.↩
There are obviously special cases where a single Rails app doesn’t cut it even from the beginning. E.g. computationally intensive apps such as Kissmetrics or Skylight.io obviously won’t run their stats aggregation processes on Rails.↩
Photo by martin, used under the Creative Commons license.
The vast majority of Ruby on Rails applications deploy to production with the vanilla Ruby GC configuration: a conservative combination of growth factors and accounting that “works” for everything from IRB sessions (still my preferred calculator) to massive monolithic Rails apps (the fate of most successful ones). In practice, however, this one-size-fits-all tuning doesn’t work very well.
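To make this concrete, here’s a quick way to peek at what the vanilla configuration gives you – `GC.stat` exposes the runtime’s accounting, and environment variables such as `RUBY_GC_HEAP_GROWTH_FACTOR` and `RUBY_GC_HEAP_INIT_SLOTS` are the main knobs. (A minimal sketch; exact key names vary a bit between Ruby versions.)

```ruby
# Peek at the GC's bookkeeping for the current process.
# Tunables such as RUBY_GC_HEAP_GROWTH_FACTOR and
# RUBY_GC_HEAP_INIT_SLOTS shape these numbers at boot.
stats = GC.stat

puts "GC runs so far:       #{stats[:count]}"
puts "Heap pages allocated: #{stats[:heap_allocated_pages]}"
puts "Live object slots:    #{stats[:heap_live_slots]}"
puts "Free object slots:    #{stats[:heap_free_slots]}"
```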
Let’s use a metaphor most of us can better relate to: dreaded household chores. Your ability and frequency of hosting dinners at home are limited by four things (takeaways and paper plates aside):
This is what you have to work with at home:
Some of your friends are also vegetarian.
Let’s have a look at two different scenarios.
You’ve invited and subsequently prepared dinner and the table—seats, plates and cutlery sets—for four, popped open your bottle of wine and fired up the grill. However, only one friend arrives, quite late. You’re grilling steak number three, yet he’s the vegetarian…and only drinks beer. And even then doesn’t talk very much.
In the end, you down the whole bottle of wine and the three steaks. Life’s good again. There’s plenty to clean up and pack away, still.
17 guests show up at your door. Half of them are heavily intoxicated because Dylan invited the rest of his wine tasting group, too. Only one eats any of your food, yet breaks four plates. Beer disappeared in three minutes. The group members reveal seven new bottles of wine, make your dog drink one and he kernel panics as a result.
You were not f*cking prepared. At all. Marinated steak’s now ruined, there’s less inventory and 30+ bottles to recycle. You’re hungry and now there are no plates left!
In both of these scenarios, from the perspective of your friends it mostly worked out just fine. It wasn’t optimal for you or your environment, though. What’s important is that you learned a few things:
In the same manner, different use cases for the Ruby runtime require different preparations. Let’s tie the dinner metaphor back to Ruby land and its memory model.
The Ruby runtime, with everything else inside. Pages, objects and auxiliary object data.
The number of major features and facets you need to support. Gems and engines are good candidates along with individual models, controllers, views etc. These “seats” are also connected - different guests mingle together.
Rails provides a framework for building applications, and thus should be considered part of the guest list too. Like some family members that make their way to get-togethers. First and second tier cousins you may hear of once a year and never talk with - they’re present (consume memory), yet don’t always add much value to the ambience.
The amount and distribution of objects required to make a feature or facet work. A mixed bag of small entrees (embedded objects like 2-char strings), main dishes (a Rails request and all its context) and cocktails and tequila shots (threads!).
An object slot on the Ruby heap. One String, Array, Hash or any other object. Keep in mind that they can overflow and be recycled too - a wine glass is good for multiple servings. For buffets, a single plate can go very far too :-)
Ruby pages - containers for objects. All of the plates and glasses on a given table. They’re mostly prepared in advance, but you can “construct” and improvise as needed.
Some dishes incur a lot of work to prepare and to clean up. Cooked basmati rice will leave a very very different footprint in your kitchen than a paella or salmon option would.
The GC defaults for most Rails applications assume a reasonable sized home environment, a well defined guest list and just enough food and drinks for each. Everyone can sit at the same table, wine and dine on fine dishes, all with a minimal cleanup burden.
In reality, it’s a frat party. Gone seriously wrong.
In the next part of this series, we’re going to take a look at how the Ruby runtime can better host Rails applications. And what you can optimize for.
Because if there’s a promotion, you buy.↩
Thanks, and hi, everyone! It’s a real honor to be here. I’ve been a big fan of Reaktor for a long time, that is, UNTIL ALL MY GEEK FRIENDS DEFECTED THERE. There’s been lots of talk about Rosatom building a new nuclear plant here in Finland. I say fuck that, we already have enough nuclear knowledge locally. But I digress.
I’d like to be one of the cool kids and Start with Why just like Simon Sinek told us. However, before that it’s worth defining the term metaprogramming in the context of this talk.
What do we mean by metaprogramming? In its simplest form, metaprogramming means code that writes code.
A-ha! So, code generators are metaprogramming, too? Not really. I go with the definition where the code is generated on the fly at runtime. We could perhaps call it dynamic metaprogramming. This means that most of what I’m going to talk about is not possible in a static language.
So a more appropriate definition might be, to quote Paolo Perrotta,
Writing code that manipulates language constructs at runtime.
But why, I hear you ask. What’s in it for me? Well, first of all, because metaprogramming is…
…magic. And magic is good, right? Right? RIGHT? Well, it can be. At least it’s cool.
It’s also a fun topic for a conference like this, because it’s, frankly, quite often mind-boggling. Think of it like this. You take your brain out of your head. You put it in your back pocket. Then you sit on it. Does it bend? If it does, you’re talking about metaprogramming.
I also like things that make me scratch my head. I mean, scratching your head is a form of exercise. Just try it yourself. Scratch your head vigorously and your Fitbit will tell you you worked out like crazy today. That’s healthy.
But all joking aside, we don’t use metaprogramming to be clever, we use it to be flexible. And with Ruby – and any other sufficiently dynamic language – at the end of the day, metaprogramming is just a fancy word for normal, advanced programming.
So why Ruby? First of all, Ruby is the language I know by far the best. Second, Ruby combines Lisp-like dynamism and flexibility with a syntax that humans can actually decipher.
As said, in Ruby there’s really no distinction between metaprogramming and advanced OO programming in general. Thus, before we get to things that are more literally metaprogramming, let’s have a look at Ruby’s object model and the constructs that lay the groundwork for metaprogramming.
Thus, in a way, this talk can be reduced to advanced OO concepts in Ruby.
Before we delve more deeply into the Ruby object model, let’s take a step back and have a look at what we mean by object-orientation.
Generally, there are two main approaches to object-oriented programming. By far the most popular is class-based OO, used in languages such as C++ and Java. The other one is prototype-based OO, which is most commonly seen in JavaScript. So which of the two does Ruby use?
Class-based? Well, let’s have a look.
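The original listings have been lost from this transcript; here’s a sketch of the same idea – first class-based OO, then prototype-style OO in plain Ruby:

```ruby
# Class-based OO: behaviour lives in a class, objects are instances of it.
class Dog
  def speak
    "Woof!"
  end
end

rex = Dog.new
rex.speak            # => "Woof!"

# Prototype-style OO: no class needed. Define behaviour directly
# on an individual object, then clone the "prototype".
proto = Object.new
def proto.speak
  "Woof!"
end

puppy = proto.clone  # clone copies singleton methods (dup does not)
def puppy.speak
  "Yip!"
end

puppy.speak          # => "Yip!"
proto.speak          # => "Woof!"
```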
How’s that for prototype-oriented OO in Ruby? But, no one does anything like this with Ruby. No? Just ask the DCI guys. Or, well, ask Gary about DCI (and Snuggies).
But I get your point: mostly, when you do Ruby programming, you use something that more resembles the good ole class-based OO model. However, in Ruby it comes with a twist – or a dozen.
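The listing here is lost, too; a sketch of the first twist it demonstrated – a class body is plain executable code, run top to bottom:

```ruby
class Greeter
  puts "I run while the class is being defined!"

  def hello
    "hi"
  end

  # Introspection works mid-definition, too:
  puts "Methods so far: #{instance_methods(false).inspect}"
end
```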
In Ruby, everything is executable, even the class definitions. But it doesn’t end there. What does the following produce?
Event? Let’s see.
Bet you didn’t see that coming. Ok, I’ll admit, I hid something out of the original listing.
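The exact listings for this reveal are gone from the transcript; in all likelihood the hidden part was a class being quietly reopened. A hedged reconstruction of the shape of the trick:

```ruby
class Event
  def kind
    "meetup"
  end
end

# The hidden part: somewhere else entirely, the class is reopened
# and the method silently redefined.
class Event
  def kind
    "frat party"
  end
end

Event.new.kind  # => "frat party"
```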
Remember, everything is executable. Thus, this would be just as valid:
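A sketch of the kind of (silly but legal) thing the lost listing showed – branching inside a class body to decide which methods get defined:

```ruby
class Duck
  # The if statement runs while the class is being defined,
  # so only one of the two methods ever exists.
  if Time.now.friday?
    def quack
      "QUACK! It's Friday!"
    end
  else
    def quack
      "quack."
    end
  end
end
```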
Stupid? Yes, but valid nonetheless.
In Ruby, you can open any class, even the built-in classes, to modify it. This is something that is called monkey-patching, or duck punching for extra giggles.
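The listing is gone; a minimal sketch of duck punching a core class:

```ruby
# Reopen the built-in String class and graft on a new method.
class String
  def shout
    upcase + "!"
  end
end

"release 13".shout  # => "RELEASE 13!"
```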
Even methods are objects. Thus you can even do functional-style programming with Ruby. Think about it: you can use your favorite language to cook a delicious meal of callback spaghetti. Believe me, I’ve tried.
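The lost listing most likely grabbed a method as an object; a sketch:

```ruby
# Methods are objects, too: detach one and pass it around like a lambda.
add = 1.method(:+)

add.call(2)          # => 3
[1, 2, 3].map(&add)  # => [2, 3, 4]
```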
Wait, what?
But, classes are different, I hear you say. They have class methods, and stuff.
I’ll let you in on a secret. In Ruby, class methods are just like Ukraine in docent Bäckman’s rhetoric: they don’t really exist. Want proof?
Here’s an example of a class method in Ruby. `self` is the current object, which in the case of a class definition is the class itself. Does this look familiar?
It should.
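The two lost listings, side by side as a sketch – a “class method” and a singleton method on a plain object are the very same construct:

```ruby
# A "class method"...
class Animal
  def self.kingdom
    "Animalia"
  end
end

# ...is just this, with the class object as the receiver:
fido = Object.new
def fido.speak
  "Woof!"
end

Animal.kingdom  # => "Animalia"
fido.speak      # => "Woof!"
```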
Singleton methods are methods that are defined for a single object, not for the whole object class.
Above is a simple (and pretty, huh?) diagram of Ruby method lookup. Methods reside in the object’s class, to the right of the object in the image. But where do singleton methods live? They can’t sit in the class, since then they’d be shared by all the objects of the same class. Neither can they be in the Object class, for the same reason.
Turns out they live in something called a singleton class.
Singleton class, a.k.a. ghost class, metaclass, or eigenclass, is a special case of a class. It’s a regular class except for a couple of details: it is hidden from normal introspection – for example, the `#ancestors` method for a class never lists singleton classes – and it can have only one instance, the object it belongs to.

So, what are class methods? They’re simply singleton methods for the class object itself. And like all singleton methods, they live in the singleton class of the object in question – in this case, the class object. Because classes are just objects themselves.
This has an interesting corollary. Singleton classes are classes, and classes are objects, so…
…wait for it…
…a singleton class must have its own singleton class as well.
That’s right, it’s turtles…errr…singleton classes all the way down. Is it starting to feel like metaprogramming already? We have barely started.
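You can watch the turtles stack up in IRB; a sketch:

```ruby
obj = Object.new

def obj.hello          # a singleton method...
  "hi"
end

obj.singleton_class                          # ...lives here: #<Class:#<Object:0x...>>
obj.singleton_class.instance_methods(false)  # => [:hello]

# And singleton classes have singleton classes, all the way down:
obj.singleton_class.singleton_class          # => #<Class:#<Class:#<Object:0x...>>>
```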
We’re going to have a look at four different ways to generate code dynamically in Ruby:

- `eval`
- `instance_eval` & `class_eval`
- `define_method`
- `method_missing`
eval
Eval is the simplest and barest way to dynamically execute code in Ruby. It takes a string of code and then executes it in the current scope. You can also give eval an explicit scope using a binding object as the second argument.
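Two sketches standing in for the lost listings – a bare eval, and one with an explicit binding:

```ruby
eval("2 + 2")  # => 4

# A binding captures a scope; eval can execute code inside it.
def make_binding
  secret = 42
  binding
end

eval("secret", make_binding)  # => 42
```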
Eval is super powerful, but has a few huge drawbacks: the code is an opaque string, so editors and tooling can’t help you and errors surface only at runtime – and evaluating strings built from user input is a classic code injection vulnerability.

For these reasons eval has slowly fallen out of favor, but there are still some cases where you have to drop down to bear metal (excuse the pun) means. As a rule of thumb, however, you should first reach for one of the following constructs.
instance_eval
Put simply, `instance_eval` takes a block of code and executes it in the context of the receiving object. It can – just like `eval` – take a string, but also a real code block:
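A sketch of `instance_eval` in action, standing in for the lost listing:

```ruby
class Bear
  def initialize
    @name = "Otso"   # no getter defined, on purpose
  end
end

bear = Bear.new
# The block runs with self set to bear, so private state is reachable:
bear.instance_eval { @name }   # => "Otso"
bear.instance_eval("@name")    # the string form works too
```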
For the reasons above, you should probably use a code block with `instance_eval` instead of a string of code, unless you know what you’re doing and have a good reason for your choice.
A very common use case for `instance_eval` is to build domain-specific languages.
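The original 18-line DSL listing is lost; here’s a hypothetical mini-DSL of my own (names invented) built on the same trick:

```ruby
class Dinner
  attr_reader :dishes

  def initialize
    @dishes = []
  end

  def serve(dish)
    @dishes << dish
  end
end

# The block is instance_eval'd against a Dinner, so bare `serve`
# calls read like a little language of their own.
def dinner(&block)
  menu = Dinner.new
  menu.instance_eval(&block)
  menu
end

feast = dinner do
  serve "marinated steak"
  serve "basmati rice"
end

feast.dishes  # => ["marinated steak", "basmati rice"]
```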
class_eval
`class_eval` is the sister method of `instance_eval`. It changes the scope to inside the class definition of the receiving class. Thus, unlike `instance_eval`, it can only be called on classes and modules.
Because of this, a bit counterintuitively, methods defined inside `class_eval` will become instance methods for that class’s objects, while methods defined inside `ClassName.instance_eval` will become its class methods.
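A sketch contrasting the two:

```ruby
class Animal
end

Animal.class_eval do
  def speak          # becomes an instance method
    "generic noise"
  end
end

Animal.instance_eval do
  def kingdom        # becomes a class method
    "Animalia"
  end
end

Animal.new.speak   # => "generic noise"
Animal.kingdom     # => "Animalia"
```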
define_method
`define_method` is the most straightforward and highest-level way to dynamically create new methods. It is just the same as using the normal `def` syntax, except:

- With `define_method` you can set the method name dynamically.
- You pass the method body to `define_method` as a block.
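A sketch standing in for the lost listing – the method names come from data:

```ruby
class Robot
  [:beep, :boop].each do |sound|
    # Name chosen at runtime, body passed as a block:
    define_method(sound) do
      "#{sound.to_s.upcase}!"
    end
  end
end

Robot.new.beep  # => "BEEP!"
Robot.new.boop  # => "BOOP!"
```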
It is worth noting that you often use both `*_eval` and `define_method` together, e.g. when defining class methods.
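A sketch of the combination – a dynamically named *class* method, defined by pairing the singleton class with `define_method`:

```ruby
class Animal
end

# class_eval on the singleton class + define_method
# yields a dynamically named class method.
Animal.singleton_class.class_eval do
  define_method(:kingdom) { "Animalia" }
end

Animal.kingdom  # => "Animalia"
```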
method_missing
`method_missing` is a special case of dynamic code in Ruby in that it doesn’t just by itself generate any dynamic code. However, you can use it to catch method calls that would otherwise go unanswered.
`method_missing` is called for an object when the called method is not found in either the object’s class or any of its ancestors. By default `method_missing` raises a `NoMethodError`, but you can redefine it for any class to work as you need it to.
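A sketch of a ghost-method handler, standing in for the lost listing:

```ruby
class Pirate
  # Catch any call starting with "arr", pass the rest on.
  def method_missing(name, *args)
    if name.to_s.start_with?("arr")
      "Arrr! Ye called #{name}."
    else
      super   # default behaviour: raise NoMethodError
    end
  end

  # Keep respond_to? honest about the ghost methods.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("arr") || super
  end
end

pirate = Pirate.new
pirate.arrrr               # => "Arrr! Ye called arrrr."
pirate.respond_to?(:arrr)  # => true
```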
`method_missing` is an example of a hook method in Ruby. Hook methods are similar to event handlers in JavaScript in that they are called whenever a certain event (such as an unanswered method call above) happens during runtime. There are a bunch of hook methods in Ruby, but we don’t have time to dive deeper into them during this talk.
`method_missing` differs from the previous concepts in this talk in that it doesn’t by itself generate new methods. This has two implications:

- Calls it handles are slower than regular method calls, because the whole method lookup chain is traversed before `method_missing` is finally reached.
- No real methods are generated by `method_missing`. This means that e.g. `#instance_methods` won’t return the “ghost methods” that only `method_missing` catches. Likewise, `#respond_to?` will return false regardless of whether `method_missing` would have caught the call or not, unless you also overwrite the `respond_to_missing?` method to be aware of the ghost method.

attr_accessor Rewritten in Ruby

To top off this talk, we’re going to combine the topics we have learned so far in a simple exercise. Namely, we’re going to rewrite a simple Ruby language construct ourselves, in pure Ruby.
Ruby has a simple construct called `attr_accessor` that creates getter and setter methods for named instance variables of the class’s objects.
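The listing is lost; an illustrative sketch (names assumed) of `attr_accessor` in use:

```ruby
class Animal
  attr_accessor :name   # generates #name and #name=
end

a = Animal.new
a.name = "Rex"
a.name  # => "Rex"
```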
While `attr_accessor` above looks like some kind of keyword, it is actually just a call to a class method1. Remember, the whole class definition is executable code and `self` inside the class definition is set to the class itself. Thus, the line is the same as an explicit `self.attr_accessor(:name)` call.
So, how to add the method to the class?
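The listing itself is gone, but the surrounding description pins down its shape; a reconstruction (method names assumed):

```ruby
class Animal
  def self.nattr_accessor(*meths)
    meths.each do |meth|
      # getter
      define_method(meth) do
        instance_variable_get("@#{meth}")
      end
      # setter
      define_method("#{meth}=") do |value|
        instance_variable_set("@#{meth}", value)
      end
    end
  end

  nattr_accessor :name, :sound
end

dog = Animal.new
dog.name = "Rex"
dog.name  # => "Rex"
```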
In the code above we define a new class method, nattr_accessor2. Then we iterate over all the method names the method is called with3. For each name, we use `define_method` twice, to generate both the getter and the setter method. Inside them, we use the `instance_variable_get` and `instance_variable_set` methods to dynamically read and write the variable value. Thanks to these methods we can again avoid having to evaluate a string of code, just as with `define_method`.
Trying it out shows that our code works: we can construct an Animal, assign values through the generated setters, and read them back through the getters.
But what if we want to make the method more reusable? Where should it go then?
We could obviously put it into the `Object` class, making it available in every class.
But what if we don’t want it everywhere, cluttering the inheritance chain? Let’s put it in a module and reuse it where needed.
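A sketch of the module version:

```ruby
module Nattr
  def nattr_accessor(*meths)
    meths.each do |meth|
      define_method(meth) { instance_variable_get("@#{meth}") }
      define_method("#{meth}=") { |value| instance_variable_set("@#{meth}", value) }
    end
  end
end
```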
Now we can use it in our class:
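A sketch of the attempt (repeating the module definition so the snippet stands alone):

```ruby
module Nattr
  def nattr_accessor(*meths)
    meths.each do |meth|
      define_method(meth) { instance_variable_get("@#{meth}") }
      define_method("#{meth}=") { |value| instance_variable_set("@#{meth}", value) }
    end
  end
end

begin
  class Animal
    include Nattr
    nattr_accessor :name   # boom – it's an *instance* method now
  end
rescue NoMethodError => e
  puts e.message  # undefined method `nattr_accessor' for Animal
end
```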
Oops. What happened?
We used `include` to get the Nattr module into Animal. However, `include` takes the methods in the module and makes them instance methods of the including class – and we need the method as a class method. What to do?
Fortunately, Ruby has a similar method called `extend`. It works the same way as `include`, except that it makes the methods from the module class methods4 of our Animal class.
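The final, working version as a self-contained sketch – `extend` instead of `include`:

```ruby
module Nattr
  def nattr_accessor(*meths)
    meths.each do |meth|
      define_method(meth) { instance_variable_get("@#{meth}") }
      define_method("#{meth}=") { |value| instance_variable_set("@#{meth}", value) }
    end
  end
end

class Animal
  extend Nattr            # nattr_accessor is now a class method
  nattr_accessor :name, :sound
end

cat = Animal.new
cat.name = "Kissa"
cat.name  # => "Kissa"
```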
Now we’re talking.
Lemme tell you a story. About a dozen or so years ago I was living in Zürich as an exchange student. I hadn’t yet found a permanent apartment, so I was living at some friends’ place while they were abroad. A permanent internet connection wasn’t a ubiquitous thing back then, and the Swiss aren’t big into TVs, so I had to figure out things to do at night. I was living alone, and as a somewhat geeky guy I wasn’t that much into social life. Thus, I mostly read at night. I had just found this Joel guy and a shitload of his writings, so I used the printers at the university to print his somewhat ranting articles on the thin brownish paper (hey, it was free!) and then spent nights reading about camels and rubber duckies, the Joel test – and leaky abstractions. And that is what metaprogramming in many cases is: an abstraction.
Now, there is nothing inherently wrong with abstractions – otherwise we’d all be programming in Assembler – but we’ll have to keep in mind that they always come at a cost. So keep in mind that metaprogramming is a super powerful tool to reduce duplication and to add power to your code, but you do have to pay a price for it.
Using too much metaprogramming, your code can become harder to read, understand, and debug.
So use it like any powerful but potentially dangerous tool: start simple, and when the complexity gets out of hand, sprinkle some metaprogramming magic dust to get back in the driver’s seat. Never use metaprogramming just for the sake of metaprogramming.
As Dave Thomas once said:
“The only thing worth worrying about when looking at code is ‘is it easy to change?’”
Keep this in mind. Will metaprogramming make your code easier to change in this particular case? If yes, go for it. If not, don’t bother.
We’ve only had time to scratch the surface of the Ruby object model and metaprogramming. It’s a fractal of sometimes mind-boggling stuff, which also makes it so interesting. If you want to take the next steps in your advanced Ruby object model and metaprogramming knowledge, I’d recommend checking out the following:
Yeah, I know, singleton method of the class itself.↩
Let’s name it something other than the built-in method just to avoid name collisions and nasty surprises.↩
The asterisk before the parameter name means that the method can take any number of arguments, which will be passed to the method in an array called `meths`.↩
Technically, it opens up the singleton class of the Animal class and throws the methods in there. Thus they’ll become singleton methods for the Animal class, just like we want them to.↩
Photo by Quinn Dombrowski, used under the Creative Commons license.
At this point started the blame-throwing. The provider duped the client with waterfall and exorbitant change fees. The buyer didn’t know how to act as a client in an information system project. The specs weren’t good/detailed/strict/loose enough. The consultants just weren’t that good in the first place. On and on and on.
While one or more of the above invariably are true in failed software projects, there’s one issue that almost each and every failed enterprise software project has in common: the buyers were not (going to be) the users of the software.
This simple fact has huge implications. Ever heard that “the client didn’t really know what they wanted”? Well, that’s because they didn’t. Thus, most such software projects are built with something completely different than the end user in mind. Be it the ego of the CTO, his debt to his mason brothers who happen to be in the software business1, or just the cheapest initial bid2. In any case, it’s in the software provider’s best interest to appeal to the decision-maker, not the people actually using the system.
Of course, not every software buyer is as bad as described above. Many truly care about the success of the system and even its users. If for no other reason, at least because it has a direct effect on the company’s bottom line. But even then, they just don’t have the first-hand experience of working in the daily churn. They simply can’t know what’s best for the users. Of course, this gets even worse in the design-by-committee, big-spec-upfront projects.
Since it’s not very likely that we could change the process of making large software project purchases any time soon, what can we as software vendors do? One word: empathy. If you just take a spec and implement it with no questions asked, shame on you. You deserve all the blame. Your job is not to implement what the spec says. Heck, your job isn’t even to create what the client wants. Your job is to build what the client – no, the end users – need. For this – no matter how blasphemous it might sound to an engineer – you have to actually talk to the people that will be using your software.
This is why it’s so important to put the software developers to actually do what the end-users would. If you’re building call-center software, make the developers work in the call center a day or a week. If you’re building web apps, make the developers and designers work the support queue, don’t just outsource it to India.
There is no better way to understand the needs of the software you’re building than to talk directly to its users or use it yourself for real, in a real-life situation. While there aren’t that many opportunities for dogfooding when building (perhaps internal) enterprise software for a client, there’s nothing preventing you from sending your people to the actual cost center. Nothing will give as much insight into the needs and pains of the actual users. No spec will ever give you as broad a picture. No technical brilliance will ever make up for lacking domain knowledge. And no client will ever love you as much as the one in the project where you threw yourself (even without being asked) into the line of fire. That’s what we here at Bear Metal insist on doing at the start of every project. I think you should, too.
We at Bear Metal have some availability open for short and mid-term projects. If you’re looking for help building, running, scaling or marketing your web app, get in touch.
This is a talk I gave at Monitorama.eu in Berlin, September 19, 2013.
Did you know that bear is Bär in German? Which, on the other hand, is berry in Swedish, and bears obviously eat berries as breakfast. Meanwhile, a berry is Beer in German, which does sound very German when you think about it. But I’m already digressing.
Germans, and the Berliner especially, are of course very fond of bears, which is the only explanation I could come up with for why I was chosen1 to give this talk here. In particular, they like polar bears here – Eisbären in the local lingo. But it wasn’t always like that.
In 1930 in Stuttgart, an innkeeper threw a large feast serving smoked polar bear ham. The result: 100 falling ill and 13 dead because of trichinosis caused by Trichinella spiralis, a little fella that looks like this:
The moral of the story: always cook your bear meat well done. And now, after hearing this tale, I’ll guarantee you, you will remember it every time you’re cooking polar bear meat. And that is the power of a story.
We’ll get back to the topic of storytelling in a little bit, but let’s first have a quick look at what we know about the human brain and mind.
In 1998, psychologists Daniel Simons and Daniel Levin carried out an experiment. They hired a professional actor to approach people walking on the street and ask them for route instructions on a map. While the targets were looking at the map intently, something weird happened: two workmen carrying a door walked between the target and the actor. The door, of course, was smoke and mirrors. Behind it, the person who had asked for help was swapped for another person. Most of the targets did not notice. The actor was swapped for another with different hair color, then different clothes, and finally from a man to a woman. And yet, more than half of the subjects failed to notice that they were talking to a completely different person.
What this tells us is that our attention is very, very limited. This comes mostly down to two things.
The human vision is a bit like a digital camera. Light is directed through a lens to a “sensor”, the retina. However, this human CMOS is nothing like the one made of silicon. While a digital camera sensor has an even grid of pixels, the brain’s pixels are anything but. In the center of our vision, called the fovea, we can resolve as much as 100 pixels in the area of a needle pin held at arm’s length. This is more or less where the so-called retina screen resolution comes from.
However, the fovea, at that same length, is only about the size of a thumbnail. Outside that, the “pixel density” goes down really fast. In the periphery of our vision, we can’t really detect any details at all.
The obvious question here is, how then can we process a more or less sharp image of our surroundings? The answer is: we don’t. But we cheat. We move our eyes rapidly to scan our vision, which creates an illusion of a sharper image than it really is.
But this isn’t such an issue, is it? I mean, we can just memorize what we just saw to create a more comprehensive picture of what we just saw. Right? Well, yes and no.
We can, indeed, store items in what is called short-term or working memory. To stay in computer metaphors2, it is a bit like RAM. It is fast, but limited, and when something new goes into it and it gets full, something else must be thrown out. However, unlike its tech counterpart, working memory in us humans has not grown during the last years or even centuries. It is still ridiculously small: somewhere around 3-4. No, I don’t mean 3-4 gigs, or even megs. Hell, not even kilobytes or bytes. 3-4, period.
Let’s look at a short demo video of this. Please don’t continue reading this article further before you have watched it. It takes less than two minutes.
Did you notice the gorilla (or one of the other changes if you had seen the original gorilla video beforehand)? About 50% of people don’t, even though they are looking several times (this was proven with eye tracking equipment) right at the beast, which is quite an amazing demonstration of the limits of our attention.
So what does this lack of attention mean to us as graphic and visualization designers? To put it short, it means the world. As an example, you can’t put two things the viewer should be comparing against each other very far from each other, because the viewer just can’t keep the other one in her memory long enough to make the comparison. Thus the first rule of thumb is: make comparisons viewable with as few eye fixations as possible, preferably one.
The second rule is: maximize the data-ink ratio. The ratio, coined by the visualization guru Edward Tufte, means the amount of data conveyed by the visualization divided by the amount of “ink” used. To put it in another way, the less stuff you have that is only there for visual looks and doesn’t have any particular meaning, the better. Good examples of this are needless and redundant color coding, the infamous PowerPoint pattern backgrounds3, and 3D effects now running amok in the Keynote world. Each of these makes the cognitive load of the viewer higher by fighting for her attention, which then leaves fewer resources in her brain left to actually make sense of the real information in the graph.
The whole field of human attention and cognitive science is huge both in general and applied to visuals in particular. We don’t have the opportunity to delve into it deeper here, but here are some pointers for you to learn more:
In The Magazine, one of the several things Marco Arment has sold during the past year, pediatrician Saul Hymes recently wrote an article called Give It Your Best Shot. In the article, Hymes writes about one of his patients, a three-week-old girl who died of bacterial meningitis, an illness passed to her by her unvaccinated older brother.
It was all, of course, preventable. There has been a vaccine against the bacterium in question, Haemophilus influenzae type b, since 1984. So afterwards Hymes asked the mother of the two whether she’d now “give her children the benefit of modern medicine’s vaccinations”. The answer was no.
What’s going on here?
In his best-selling book Thinking, Fast and Slow, the Nobel laureate psychologist Daniel Kahneman lays out his theory of human thinking, splitting it into two systems, which he calls quite unimaginatively systems 1 and 2. System 1 is fast, intuitive, automatic and direct. System 2 is slow, analytical, and not activated in many day-to-day tasks at all. It is also lazy, trusting the intuition of system 1 much more than it should. This wouldn’t be such a problem if system 1 weren’t as prone to errors and biases as it is. It draws conclusions long before the conscious mind does. What makes matters worse, we almost always think we made these intuitive, erroneous decisions knowingly.
And this, in many ways, is what is going on in the heads of the people in the anti-vaccination community. Let’s look at some of the biases potentially at play here.
We prefer wrong information to no information.
– Rolf Dobelli in The Art of Thinking Clearly
Because of information readily available to us, we often make totally erroneous assumptions about how common or proven something actually is. If our grandfather smoked a lot but still lived to a hundred, we easily think that smoking can’t be that bad for you. Or if a celebrity on TV claims that her son got autism from vaccinations, hey, why not? We use statements like these to prove something, but of course they don’t prove anything. The plural of ‘anecdote’ is not ‘data’.
Because of the availability bias, we systematically overestimate the risk of catastrophes we see often in the media, such as terrorist attacks or natural disasters, and underestimate the boring but much more likely causes of death, such as diabetes and cancer. We attach much more likelihood to spectacular outcomes. And what could be more spectacular than a centerfold model and her son with an illness obviously caused by greedy pharma companies and their conspiracies with public health organizations?
The conjunction fallacy means that the more vividly something is presented, the more likely we are to believe it is the truth. At an intuitive level, we have a soft spot for plausible stories.
So when Jenny McCarthy goes on Oprah and says of her son, “My science is Evan, and he’s at home. That’s my science”, no matter that…
…people still cry and clap their hands. As Hymes writes,
“To paraphrase George Lucas: So this is how science dies — to thunderous applause? In the court of public opinion, data, and statements, and science are no match for an emotional parent and her child.”
We want our lives to follow a tight-knit story that is easy to follow. We talk about understanding surprising events, but that’s not really true. We simply build the meaning into them afterwards.
The media is a champion at this. Just think about the rampant “Apple is doomed, just like the PC makers in the 1980s” narrative. No matter what the facts say, the tech journalists who subscribe to that notion will distort and retrofit them to their preferred storyline. Hollywood is of course another master at it, and this obviously gives McCarthy an edge over her opponents, the science community, who try to convince the public with hard data and statistics.
Unfortunately in this case, stories attract us (and you’ll soon learn why) while the abstract makes us bored out of our minds. Thus, entertaining but irrelevant issues are often prioritized over relevant facts.
Confirmation bias means that we systematically ignore and discredit facts and opinions that disagree with our own beliefs and worldviews. If we really like BMWs, we easily disregard test articles that give them bad grades and eagerly read through every word of pieces that adore them. The more strongly a belief is held, the stronger the bias.
When we combine these four biases, it’s not so hard to understand why the science community has a hard time convincing the McCarthys of the world. As a result, there have recently been several outbreaks of measles in the US, a disease that had already been completely eliminated from the country. The cases have, almost without exception, happened – like recently in North Texas – in vaccine-skeptical communities.
The anti-vaccination community is an extreme example, of course. I mean, we’re mostly talking about religious whackos, right? We, who are pro-science, would never succumb to such fallacies, right? Let me tell you about another cognitive bias.
As proven over and over again, we systematically overestimate our knowledge, our talent and our ability to predict – and not just by a little bit, but on a giant scale. The overconfidence effect doesn’t deal with whether we’re correct or wrong in single estimates. Rather, it measures the difference between what we know and what we think we know. The most surprising thing about the effect is that experts are no less susceptible to it than normal people – on the contrary. As Dobelli writes:
If asked to forecast oil prices in five years time, an economics professor will be as wide of the mark as a zookeeper will. However, the professor will offer his forecast with certitude.
But let’s not be negative here. The flipside of all this is that stories are a very powerful way to get your point across and people to remember what you’re trying to teach them. Why is this?
Quite simply, our brains are evolutionarily wired to respond strongly to stories. When we listen to a presentation with mostly boring bullet points, it hits the language processing areas of the brain, where we simply decode words into meaning. And then what? Nothing.
On the other hand, when we’re told stories, those are not the only areas that fire. So do any other areas of our brain that we’d use when actually experiencing the events of the story. If we hear a story about a delicious dish, our sensory cortex gets fired up. If the story is about action sports, our motor cortex is activated. Thus, a good story can put our whole brain to work.
Because of this, in a way we’re synchronizing our brains with our listeners. As Uri Hasson from Princeton says:
“When the woman spoke English, the volunteers understood her story, and their brains synchronized. When she had activity in her insula, an emotional brain region, the listeners did too. When her frontal cortex lit up, so did theirs. By simply telling a story, the woman could plant ideas, thoughts and emotions into the listeners’ brains.”
So what do you need for a good story? Copyblogger lists the following five things.
Granted, telling stories visually is much harder than verbally. It should not be treated as impossible, though. After all, movies and cartoons are to a large degree visual. So while the above five points are mostly meant for verbal storytelling, keeping them in mind even when weaving narrative with visualization can be of huge help.
It is important to build a continuum, a narrative, into your visualizations. The information presented needs to be integrated, rather than a bunch of unrelated pieces. You also want to bring relevant emotion and affect to your presentation, and here it helps to link it to the viewers’ existing knowledge. However you do it, try to make your message more memorable and thus more likely to impact behavior.
And whatever you do, keep in mind that both a story and a visualization have to make sense.
The above graph courtesy of WTF Visualizations.
And if you’re still convinced you can’t tell stories with visualizations, watch the TED talks by Hans Rosling.
So, how did Saul Hymes solve the problem of fighting a convincing, storytelling opponent? By telling stories himself. So while he still quoted the relevant stats and facts about the risks of taking vs. not taking vaccines, he also started telling vibrant, vivid stories of individual kids dying or going deaf under his care. After all, he didn’t have to convince people that taking vaccines is dangerous. He had to convince them that not taking them is. And that is, of course, easy with a meaty story.
I want you to remember two things from this article.
And wait, there’s more. I’ll just leave this thought here for you to ponder:
If you’re into data visualization, you’re not in the data business – you’re in the human communications business.
Visualization is just a tool to attain goals. Keep that in mind.
As a Bear Metal cubby.↩
Isn’t it awesome to describe real, natural things with metaphors from the tech world that no one would have understood just a few decades ago?↩
It is fair to ask why they are provided in the first place. They certainly don’t make the graphs look any better either, quite the contrary.↩
Photo by Robb North
A few months of work during a sabbatical yielded a product that nailed a problem in the preventative healthcare space. After a freemium window, the product gained good market traction and you spawned a new company with three coworkers. Customers are raving, sales trends are on the up, the engineering team is growing and there are conceptual products in the pipeline.
Three months down the line, there are 4 production applications, hordes of paying customers, a few big contracts with strict SLAs (service-level agreements) and enough resources to spin off a presence in Europe. A new feature that combines these products into a suite is slated for release. Engineering hauled ass for 2 months and sales is super stoked to be able to pitch it to customers.
A few days before the feature release, a set of new servers is provisioned to buffer against the upcoming marketing push. Due diligence on various fronts was completed, mostly through static analysis of the current production stack by various individuals. At 1am PST on Saturday morning they deploy, during a window with a historically low transaction volume. Representatives of a few departments sign off on the release, although admittedly there are still dark corners and the OK is mostly based on a few QA passes. Champagne pops, drinks are being had and everyone calls it a day. But then…
At 9am PST various alerts flood the European operations team - only 25% of the platform is available, support is overwhelmed and stress levels go up across the board. Some public-facing pages load intermittently, MySQL read load is sky high and application log streams are blank. This deployment, as with most naive releases, was flying blind. A snapshot of a working system prior to release isn’t of much value if it can’t be easily reproduced after rollout for comparison.
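A reproducible snapshot of that kind can be as simple as recording a handful of key metrics just before the release and diffing them afterwards. A minimal sketch in Python (the metric names, readings and tolerance below are illustrative assumptions, not the actual stack described above):

```python
# Minimal sketch: diff pre- and post-release metric snapshots to surface
# drift. Metric names, readings and the 25% tolerance are illustrative.

def diff_snapshots(before, after, tolerance=0.25):
    """Return metrics that drifted more than `tolerance` after rollout."""
    drifted = {}
    for name, old in before.items():
        new = after.get(name)
        if new is None:
            drifted[name] = (old, None)   # metric disappeared entirely
        elif old and abs(new - old) / old > tolerance:
            drifted[name] = (old, new)    # significant relative drift
    return drifted

# Hypothetical readings taken just before and an hour after the deploy:
before = {"mysql_read_qps": 1200.0, "error_rate": 0.01, "log_lines_per_min": 900.0}
after = {"mysql_read_qps": 5400.0, "error_rate": 0.01, "log_lines_per_min": 0.0}

# The sky-high MySQL read load and the silent log stream both show up:
print(sorted(diff_snapshots(before, after)))  # ['log_lines_per_min', 'mysql_read_qps']
```

Even a crude comparison like this, run as part of the rollout checklist, turns “the logs are blank” from a 9am surprise into a 1am rollback trigger.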
Based on assumptions about time, space and other variables, there was a total lack of situation awareness and thus no visibility into the expected impact of these changes. Running the software that pays the bills is more important today than a flashy new feature. However, one must move forward, and there are processes and tools available for mitigating risk.
Situation awareness can be defined as an engineering team’s knowledge of both the internal and external states of their production systems, as well as the environment in which they are operating. Internal state refers to health checks, statistics and other monitoring info. The external environment refers to things we generally can’t directly control: humans and their reactions; hosting providers and their networks; acts of god and other environmental issues.
It’s thus a snapshot in time of system status that provides the primary basis for decision making and operation of complex systems. Experience with a given system gives team members the ability to remain aware of everything that is happening concurrently and to integrate that sense of awareness into what they’re doing at any moment.
The new feature created a dependency tree between 4 existing applications, a lightweight data synchronization service (Redis) and the new nodes that were spun up. Initial investigation and root cause analysis revealed that the following went wrong:
Without much runtime introspection in place, it was very difficult to predict what the release impact would be. Although not everything could be covered ahead of time for this release, even basic runtime analysis, monitoring and good logging would have made it possible to spot trends and avoid issues bubbling up systemically many hours later.
Another core issue here is the “low traffic” release window. It’s often considered good practice to release during such times to minimize worst-case fallout, but it’s somewhat akin to commercial Boeing pilots only training on Cessnas. Any residual and overlooked issues tend to surface only hours later, when traffic ramps up again. This divide between cause and effect complicates root cause analysis immensely. You’d want errors to be inferred from the system state, or at worst caught by QA or an employee – and most definitely not by customers interacting with your product at 9am.
One also cannot overlook the fact that each team suddenly had a direct link with at least 3 other applications, new (misconfigured) backends and Redis. Each team, however, still mostly had a mental model of a single isolated application.
We at Bear Metal have been through a few technology stacks in thriving businesses and noticed a recurring theme and problem. Three boxes become fifty, ad-hoc nodes are spun up for testing, special slaves are provisioned for data analysis, applications are careless with resources and a new service quickly becomes a platform-wide single point of failure. Moving parts increase exponentially and so do potential points of failure.
Engineering, operations and support teams often have no clue what runs where, or what the dependencies are between them. This is especially true for fast growing businesses that reach a critical mass - teams tend to become more specialized, information silos are common and thus total system visibility is also quite narrow. Having good knowledge of your runtime (or even just a perception) is instrumental in making informed decisions for releases, maintenance, capacity planning and discovering potential problems ahead of time. Prediction only makes sense once there’s a good perception of “current state” in place to minimize the rendering of fail whales.
Operations isn’t about individuals, but teams. The goal is to make information exchange between team members and other teams as passive as possible. Monitoring, alerting and other push-based systems help a lot with passive learning about deployments. They make it mostly effortless for individuals to build up knowledge and spot trends over time.
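The core of such a push-based check can be tiny. A sketch, assuming a hypothetical notify() sink (in practice that would be a pager, email or chat integration; the metric names and ceilings are made up):

```python
# Minimal sketch of a push-based threshold check. The metric names,
# ceilings and the notify() sink are illustrative assumptions.

def check_thresholds(metrics, ceilings, notify):
    """Push an alert for every metric above its configured ceiling."""
    alerts = []
    for name, ceiling in ceilings.items():
        value = metrics.get(name, 0.0)
        if value > ceiling:
            message = f"{name} is {value}, above ceiling {ceiling}"
            notify(message)  # pushed to the team, not pulled on demand
            alerts.append(message)
    return alerts

sent = []
check_thresholds(
    {"error_rate": 0.25, "load_avg": 9.8, "queue_depth": 3},
    {"error_rate": 0.05, "load_avg": 4.0, "queue_depth": 100},
    notify=sent.append,
)
print(len(sent))  # 2
```

The point is the direction of flow: the system tells the team, so nobody has to remember to go looking.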
However, when we actively need to search for information, we can only search for what we already know exists. It’s impossible to find anything we’re not aware of. Given the primary goal of an operations team is platform stability in the face of changes, time to resolution (TTR) is always critical and actively seeking out information when under pressure is a luxury.
Historically, a system-wide view has always been the territory of the CTO, the operations team and perhaps a handful of platform or integration engineers. In line with devops culture, we need to acknowledge this disconnect and explore solutions for raising situation awareness of critical systems for all concerned.
Take a minute and ponder the following:
In our next post, we’ll explore some common components, variables and events required for being “on top” of your stack. In the meantime, what causes you the most pain when trying to keep up with your production systems? What would you write a blank cheque for? :-)
Photo by Ron Cogswell
One of the saddest things to happen online was in 2007, when my all-time favorite author and presenter, Kathy Sierra, received death threats and retreated from the public web. It also meant that she stopped writing her Creating Passionate Users weblog, which had been a great inspiration for me for quite some time. Thank god she didn’t pull a _why on it.
While it’s more than six years since Kathy’s last blog post (is it really that long?), there is no reason we shouldn’t apply her lessons even in today’s online world.
Maybe the most famous mantra of Sierra was that in order to create passionate users you should make them kick ass. Sure, it’s nice if your UI boasts übercool 3D CSS transformations but if it doesn’t help your users shine, no one (well, except for some web geeks) will give a flying fuck.
She demonstrated this with the observation that companies very often spend a huge amount of effort and money honing the living daylights out of their marketing materials, but don’t put nearly as much time into what actually helps their users: tutorials and user manuals. Of course, this insight had also helped her immensely, by creating a market for the visual Head First book series at O’Reilly that she curated.
Apple has for a long time been a good example of helping its users kick ass. The user manual of the old Final Cut Pro 7 was also a great introduction to the art of video editing1. Likewise, most of Apple’s ads show things you can do and create with their products, not just random people dancing around a pool.
People care about how they can kick ass themselves, and they need to be able to learn how in order to capitalize on it. Nowadays it seems that companies are much more interested in giving people free apps and then using psychological tricks to milk money out of them than in helping them shine. Which, coincidentally, brings us back to Kathy Sierra.
To my pleasant surprise, I last week learned that Kathy is back with the pseudonym Serious Pony, and a new blog of the same name. The first article, Your app makes me fat, is of the same awesome quality as her old pieces. In it, she tackles head-on the aforementioned gamification trend and the ego depletion tax it puts on us as app users.
To honor Kathy, we wanted to start this blog off by not talking about ourselves, because Bear Metal isn’t really about us, but you. And – assuming you are a developer, entrepreneur or content provider – not really about you either. It’s about who we (you and us) serve. Because without them there is no market, no audience, no need, no problems to solve, no pains to relieve. Your customers should be the ones that matter to you. And they don’t care about you or us. They care about whether your product can make them shine.
Can your product help them kick ass? Does it? Are you communicating that effectively to your current and potential customers? That is all that should matter.
Unfortunately this can’t be said about the manual of the new version, Final Cut Pro X.↩