Why the advice to have a connection pool the same size as your Puma threads is (probably) wrong for you

The standard advice goes: set your Rails database’s connection pool to have as many connections as you have Puma threads. The idea being that you should only need as many connections as you have concurrent threads. This advice comes from a good place, as most of the time you will be constrained by the number of connections available to your database.

The nuance missing from that advice is that it assumes no additional threads are ever spawned during your app’s operation!

Now although you might know your code inside and out and be 100% certain that you don’t create additional threads anywhere in it, chances are Rails is creating additional threads without you realising it…

An aside on threads and ActiveRecord’s connection pool

On a basic level, the way ActiveRecord’s connection pool works is that it assigns each thread that asks for a connection its own separate connection, which the thread then releases once it’s finished. Using a pool like this means that many threads can be querying the database at the same time, so many more requests can be processed in parallel.
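To make that concrete, here’s a minimal sketch of a thread explicitly checking a connection out of the pool and handing it back (the query is just a placeholder):

    threads = 3.times.map do
      Thread.new do
        # Each thread checks its own connection out of the pool...
        ActiveRecord::Base.connection_pool.with_connection do |conn|
          # ...runs its queries on that connection exclusively...
          conn.execute("SELECT 1")
        end
        # ...and here the connection is back in the pool, free for other threads.
      end
    end
    threads.each(&:join)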

If your app’s Puma config sets the maximum number of threads to 5, then you can normally expect there to be at most 5 threads asking for their own database connection, hence the advice to set the pool size to the same as the number of threads.
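That’s exactly what the default Rails configuration encodes, with both values driven by the same RAILS_MAX_THREADS environment variable (the exact templates vary a little between Rails versions, so treat this as a sketch):

    # config/puma.rb
    max_threads_count = ENV.fetch("RAILS_MAX_THREADS") { 5 }
    min_threads_count = ENV.fetch("RAILS_MIN_THREADS") { max_threads_count }
    threads min_threads_count, max_threads_count

    # config/database.yml
    default: &default
      adapter: postgresql
      pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>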

If however one of those threads creates another thread of its own, and that new thread needs to hit the database, it will need its own connection. The spawned thread does not share the connection of the thread that spawned it.

So if we have threads that spawn their own threads, it’s possible to end up in a situation where there are many more threads wanting a connection than there are available connections, and you could end up seeing ActiveRecord::ConnectionTimeoutError exceptions being raised.
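Here’s a deliberately contrived sketch of that failure mode, assuming a pool of 5 and the default 5 second checkout timeout: five “request” threads grab all the connections and hold on to them, so the helper threads they spawn time out waiting for a checkout.

    request_threads = 5.times.map do
      Thread.new do
        ActiveRecord::Base.connection_pool.with_connection do |conn|
          conn.execute("SELECT 1")

          # Each "request" thread spawns a worker that wants its own connection...
          worker = Thread.new do
            begin
              ActiveRecord::Base.connection_pool.with_connection do |inner_conn|
                inner_conn.execute("SELECT 1")
              end
            rescue ActiveRecord::ConnectionTimeoutError => e
              # ...but with all 5 connections held by the request threads,
              # the workers give up after the checkout timeout.
              puts e.message
            end
          end

          sleep 10 # keep hold of this connection while the worker waits
          worker.join
        end
      end
    end
    request_threads.each(&:join)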

Where your Rails app might be spinning up additional threads without you realising it

So, back to your app. Even if you are not explicitly spawning your own threads, Rails itself could be spinning up additional threads without you realising. The main culprit here is ActiveStorage, or more specifically, ActiveStorage configured in proxy mode (commonly used if you’re serving assets via a CDN).
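If you’ve opted into proxy mode the usual way, it’ll be a single config line along these lines:

    # config/environments/production.rb (or application.rb)
    # Serve ActiveStorage files through the Rails app (and so through your CDN)
    # instead of redirecting to the storage service.
    config.active_storage.resolve_model_to_route = :rails_storage_proxy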

Threads created by ActiveStorage

ActiveStorage’s two proxy controllers return streamed responses, which (you guessed it) are processed in their own threads! So to handle a request for an ActiveStorage file via one of the proxy controllers, two threads will be called into action, both of which need their own connection from the ActiveRecord connection pool.
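Under the hood the proxy controllers get their streaming behaviour from ActionController::Live, which runs the action body in a thread of its own. This hypothetical controller (the Attachment model and each_chunk helper are made up for illustration) shows the shape of it: everything inside the action, database queries included, happens on that spawned thread, so it needs its own connection from the pool.

    class DownloadsController < ApplicationController
      include ActionController::Live

      def show
        # This runs on a thread spawned for the streamed response, so the
        # query below checks out its own connection from the pool.
        attachment = Attachment.find(params[:id]) # hypothetical model
        response.headers["Content-Type"] = attachment.content_type

        attachment.each_chunk do |chunk| # hypothetical streaming helper
          response.stream.write(chunk)
        end
      ensure
        response.stream.close
      end
    end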

Normally (hopefully?) your app is processing requests fast enough for you to expect connections to be freed up from one of the other threads and become available for the streaming thread before a timeout occurs. And in this way, most of the time, the limited connections available in the pool can be shared between more threads than there are connections.

The problems start if your app receives many successive/concurrent ActiveStorage proxy requests, each of which spins up an additional thread to stream a response, and they take a long time to complete their work and free up their connections; either because they’re doing expensive/slow work server-side (downloading a large file or processing an image representation for the first time), or because they’re streaming the response to a slow client. At the time of writing at least, both threads hang on to their connection until the entire response is complete.

Threads created by ActiveRecord’s load_async

Another place Rails will spin up additional threads is with the new load_async method. This was introduced in Rails 7 as a way to parallelise expensive database queries. It’s less likely that these are going to cause connection timeout errors as (again, hopefully?) the expensive queries aren’t so expensive that they hang onto connections for an excessive amount of time. That said, if you’re making heavy use of load_async, you could again end up in a situation where you have much higher demand for connections than there are available connections.
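For reference, this is roughly what that looks like in use (Post and Comment are stand-in models). Note that load_async only actually runs queries in the background once you’ve configured an async query executor, and each in-flight async query holds a connection from the pool on its background thread.

    # config/application.rb
    config.active_record.async_query_executor = :global_thread_pool

    # In a controller or query object:
    posts    = Post.order(created_at: :desc).limit(10).load_async    # kicked off on a background thread
    comments = Comment.order(created_at: :desc).limit(10).load_async # likewise

    # The background threads hold their pool connections while the queries run;
    # we only block here, when the results are actually used.
    render json: { posts: posts.to_a, comments: comments.to_a }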

So what should you do about all this?

The upshot of all this is: if you’re making use of either ActiveStorage in proxy mode, or calling load_async in your app, you probably want a connection pool size that is higher than the number of configured Puma threads. The theoretical maximum number of connections you’ll need is twice Puma’s configured thread count, on the assumption that each thread can potentially spawn one additional thread, but you might be able to get away with a smaller multiple depending on the performance and load characteristics of your app.
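In database.yml terms that might look something like this (the multiplier of 2 matches the theoretical worst case above; tune it to taste):

    production:
      <<: *default
      # One connection per Puma thread, plus headroom for each thread
      # potentially spawning one extra (streaming or async query) thread.
      pool: <%= Integer(ENV.fetch("RAILS_MAX_THREADS", 5)) * 2 %>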

And if you’re close to reaching or exceeding the connection limit offered by your database, consider putting a connection pooler like pgbouncer (available as a Heroku buildpack) in front of it so you can configure more app-side connections than the database itself would otherwise allow.
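To give a flavour of what that involves, here’s a minimal, purely illustrative pgbouncer.ini: your app connects to pgbouncer on port 6432, which multiplexes those client connections over a much smaller number of real Postgres connections.

    [databases]
    myapp = host=127.0.0.1 port=5432 dbname=myapp_production

    [pgbouncer]
    listen_port = 6432
    pool_mode = transaction
    max_client_conn = 200    ; connections your Rails processes can open
    default_pool_size = 20   ; actual connections held against Postgres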

Update

There’s now a follow-up post outlining one of the techniques I employed to help figure out why our connection pool was being exhausted in our Rails app.
