Imagine walking into a restaurant, sitting down, and waiting five minutes before a waiter even acknowledges you. You would probably leave. The same thing happens on the web. A widely cited Google study found that 53% of mobile visits are abandoned when a page takes longer than 3 seconds to load, and a commonly quoted industry estimate puts the cost of each additional second of delay at roughly 7% of conversions. Performance is not a luxury; it is a fundamental feature.
Rails gives us a powerful framework for building web applications quickly, but that convenience comes with overhead. ActiveRecord, view rendering, middleware, and the Ruby runtime all add up. A poorly optimized Rails app can feel sluggish even on fast hardware. The good news: Rails also gives us excellent tools to diagnose and fix performance problems.
Most Rails performance issues fall into one of three categories:
We will tackle each of these throughout this post, starting with how to identify exactly where your app is slow.
Think of debugging a slow app like diagnosing a patient. You would not prescribe medicine based on a guess — you would run tests, look at the results, and form a diagnosis. Rails gives us several “diagnostic tools” to pinpoint exactly where time is being spent.
The first place to look is your development log. Every request prints timing information:
Started GET "/posts" for 127.0.0.1 at 2026-04-05 10:00:00
Processing by PostsController#index as HTML
Post Load (45.2ms) SELECT "posts".* FROM "posts"
Rendered posts/index.html.erb within layouts/application (120.5ms)
Completed 200 OK in 185ms (Views: 120.3ms | ActiveRecord: 47.1ms)
That log line tells you exactly how much time was spent in the database vs view rendering. If you see ActiveRecord: 500ms, you have a database problem. If Views: 800ms, your templates need attention.
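To make that split concrete, here is a minimal sketch that pulls the numbers out of a Completed line with a regular expression. The request_timing helper and its field names are hypothetical, and the pattern assumes the log format shown above (newer Rails versions append extra fields, which this pattern simply ignores).

```ruby
# Hypothetical helper (not part of Rails): parse a "Completed" log line.
COMPLETED_LINE = /Completed (?<status>\d{3}) .*? in (?<total>\d+)ms \(Views: (?<views>[\d.]+)ms \| ActiveRecord: (?<db>[\d.]+)ms/

def request_timing(line)
  match = COMPLETED_LINE.match(line)
  return nil unless match

  {
    status: match[:status].to_i,
    total_ms: match[:total].to_f,
    views_ms: match[:views].to_f,
    db_ms: match[:db].to_f
  }
end

line = 'Completed 200 OK in 185ms (Views: 120.3ms | ActiveRecord: 47.1ms)'
timing = request_timing(line)
slowest = timing[:views_ms] > timing[:db_ms] ? "views" : "database"
puts "#{timing[:total_ms]}ms total, most time spent in #{slowest}"
```

Run over a whole log file, a helper like this gives a rough first answer to "is this endpoint slow because of SQL or because of templates?" before reaching for heavier tools.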
rack-mini-profiler adds a small timing badge to every page in development. Click it and you get a detailed breakdown of every SQL query, partial render, and middleware call — all without switching to the terminal. It is the fastest way to spot N+1 queries.
# Gemfile
gem 'rack-mini-profiler'
# It auto-injects in development — no configuration needed
The Bullet gem specifically watches for N+1 queries and unused eager loading. We will cover it in detail later, but the idea is simple: it alerts you the moment you fire more queries than necessary.
In production, you need continuous monitoring. Tools like New Relic, Skylight, Datadog, and Scout APM track response times, throughput, error rates, and database time across your entire fleet. They show you which endpoints are slow, which queries are called most often, and whether performance is degrading over time.
Here is a practical approach to finding and fixing performance issues:
Guessing is expensive. Measuring is cheap. Always profile before optimizing.
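As a small illustration of measuring instead of guessing, the sketch below times two interchangeable implementations with Ruby's standard Benchmark module. The two methods and the input size are made up for the example; the point is only the habit of checking the numbers before choosing.

```ruby
require "benchmark"

# Two hypothetical implementations of the same task: summing squares.
def sum_squares_loop(numbers)
  total = 0
  numbers.each { |n| total += n * n }
  total
end

def sum_squares_builtin(numbers)
  numbers.sum { |n| n * n }
end

numbers = (1..200_000).to_a
results = {}
timings = {}

{ loop: :sum_squares_loop, builtin: :sum_squares_builtin }.each do |label, method_name|
  # Benchmark.realtime returns the elapsed wall-clock time in seconds.
  timings[label] = Benchmark.realtime { results[label] = send(method_name, numbers) }
end

timings.each { |label, seconds| puts format("%-8s %.2f ms", label, seconds * 1000) }
```

The absolute numbers vary by machine, which is exactly why they are worth measuring rather than assuming.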
Imagine you are a barista at a busy coffee shop. Every time someone orders a latte, you have two choices: grind the beans, steam the milk, and pull the shot from scratch, or keep a batch of espresso ready and pour it instantly. Caching is the batch of espresso — it lets you serve the same result without redoing the work.
Rails provides several caching strategies, each suited to different situations.
Fragment caching stores a rendered piece of a view (a “fragment”) so it does not need to be re-rendered on subsequent requests. This is the most commonly used caching strategy in Rails.
<% cache @post do %>
  <h2><%= @post.title %></h2>
  <p><%= @post.body %></p>
  <span>Author: <%= @post.author.name %></span>
<% end %>
The first request renders the HTML and stores it in the cache. Subsequent requests skip the rendering entirely and return the cached HTML. Rails generates a cache key automatically based on the model’s class, ID, and updated_at timestamp — so when the post is updated, the cache key changes and the fragment is re-rendered automatically.
Russian doll caching nests fragment caches inside one another. The outer cache wraps the inner cache, and each layer can be invalidated independently.
<% cache @post do %>
  <article>
    <h2><%= @post.title %></h2>
    <% @post.comments.each do |comment| %>
      <% cache comment do %>
        <div class="comment">
          <%= comment.body %>
          <span>by <%= comment.author.name %></span>
        </div>
      <% end %>
    <% end %>
  </article>
<% end %>
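If a single comment changes, only that comment's fragment needs to be re-rendered. For the change to show up, the outer fragment must be invalidated too, which is why the comment model typically declares belongs_to :post, touch: true: saving a comment bumps the post's updated_at, so the post's cache key changes. When the outer fragment is rebuilt, every unchanged comment is still a cache hit, so only the small piece that changed is actually rendered again.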
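The invalidation flow can be simulated in plain Ruby without Rails. In this sketch, Record, cached_fragment, and render_post are hypothetical stand-ins: version plays the role of updated_at, and touch! mimics what belongs_to with touch: true does when a child record is saved.

```ruby
# Plain-Ruby simulation, no Rails required. `version` stands in for updated_at.
Record = Struct.new(:kind, :id, :version, :parent) do
  def cache_key
    "#{kind}/#{id}-#{version}"
  end

  # Bump our own version and, like `touch: true`, the parent's as well.
  def touch!
    self.version += 1
    parent.touch! if parent
  end
end

CACHE = {}
RENDER_COUNT = Hash.new(0)

# Return the cached fragment for a record, rendering it only on a miss.
def cached_fragment(record)
  CACHE[record.cache_key] ||= begin
    RENDER_COUNT[record.kind] += 1
    yield
  end
end

def render_post(post, comments)
  cached_fragment(post) do
    comments.map { |c| cached_fragment(c) { "<div>comment #{c.id} v#{c.version}</div>" } }.join
  end
end

post = Record.new("posts", 1, 1, nil)
comments = [1, 2, 3].map { |id| Record.new("comments", id, 1, post) }

render_post(post, comments)        # cold cache: the post and all three comments render
comments.first.touch!              # editing one comment also bumps the post's version
html = render_post(post, comments) # post re-renders; two comments are cache hits

puts RENDER_COUNT.inspect
```

Across both renders the post fragment is built twice but the comments only four times in total, not six: the two untouched comments are served from the cache when the outer fragment is rebuilt.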
Rails supports multiple cache backends:
| Store | Best For | Where Data Lives | Speed |
|---|---|---|---|
| MemoryStore | Development, tests | In-process memory (lost on restart) | Fastest |
| FileStore | Single-server deploys | Local disk | Moderate |
| RedisCacheStore | Production, multi-server | Redis server (over the network) | Very fast |
| MemCacheStore | Production, multi-server | Memcached server (over the network) | Very fast |
# config/environments/production.rb
config.cache_store = :redis_cache_store, {
  url: ENV['REDIS_URL'],
  namespace: 'myapp',
  expires_in: 12.hours
}
Redis is the most popular choice for production because it is fast, supports eviction policies, and can be shared across multiple application servers.
Every cached fragment has a key. Rails auto-generates keys like posts/42-20260405120000 (model class, ID, and timestamp). When the record changes, the timestamp changes, the key changes, and the cache is invalidated. You rarely need to manage keys manually.
For custom caching, you can set explicit expiration:
Rails.cache.fetch("popular_posts", expires_in: 5.minutes) do
  Post.order(views: :desc).limit(10).to_a
end
The block runs once, stores the result, and returns the cached value for the next 5 minutes.
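The fetch-or-compute pattern is simple enough to sketch in plain Ruby. TinyCache below is a toy stand-in written for this post, not how Rails.cache is implemented; the injectable clock is an assumption added so that expiry can be shown without actually waiting five minutes.

```ruby
# A toy stand-in for Rails.cache.fetch, for illustration only.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  # `clock` is injectable so expiry can be demonstrated without sleeping.
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @entries = {}
  end

  def fetch(key, expires_in:)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > now

    value = yield # cache miss or expired entry: do the expensive work
    @entries[key] = Entry.new(value, now + expires_in)
    value
  end
end

fake_time = 0
cache = TinyCache.new(clock: -> { fake_time })
computations = 0
popular = -> { cache.fetch("popular_posts", expires_in: 300) { computations += 1; %w[a b c] } }

popular.call    # miss: the block runs
popular.call    # hit: the block is skipped
fake_time = 301 # a little over five minutes later
popular.call    # expired: the block runs again

puts "block ran #{computations} times for 3 calls"
```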
Think of a restaurant. The waiter takes your order, writes it down, and moves on to the next table. They do not stand in the kitchen watching your food cook. The chef (a background worker) prepares your meal independently, and the waiter brings it out when it is ready. Your web application should work the same way.
When a user signs up, you might need to send a welcome email, create a default profile, and generate a thumbnail of their avatar. If you do all of that synchronously inside the controller action, the user waits 3-4 seconds staring at a spinner. Instead, you enqueue those tasks as background jobs. The web request finishes in 200ms, and the jobs run in a separate process.
Active Job is Rails’ abstraction layer for background processing. You define a job class, and Rails handles the rest:
class WelcomeEmailJob < ApplicationJob
  queue_as :mailers

  def perform(user_id)
    user = User.find(user_id)
    UserMailer.welcome(user).deliver_now
  end
end

# Enqueue from a controller
WelcomeEmailJob.perform_later(user.id)

# Or enqueue with a delay
WelcomeEmailJob.set(wait: 24.hours).perform_later(user.id)
Active Job is backend-agnostic. You choose which adapter handles the actual queue:
| Backend | Concurrency | Persistence | Best For |
|---|---|---|---|
| Sidekiq | Multi-threaded | Redis | High-throughput production |
| Resque | Multi-process | Redis | Simpler setups |
| DelayedJob | Multi-process | Database | Small apps, no Redis |
| Solid Queue | Multi-threaded | Database | Rails 8 built-in default |
Sidekiq is the most popular choice because it is fast (uses threads instead of processes), reliable (persistent Redis queue), and feature-rich (scheduled jobs, retries, dead letter queues).
# config/application.rb
config.active_job.queue_adapter = :sidekiq
Jobs are organized into queues, and workers pull from queues in priority order:
class ProcessImageJob < ApplicationJob
  queue_as :low_priority

  def perform(image_id)
    # thumbnail generation, compression, etc.
  end
end

class PaymentJob < ApplicationJob
  queue_as :critical

  def perform(payment_id)
    # charge credit card, update balance
  end
end
# config/sidekiq.yml
:queues:
- [critical, 5]
- [mailers, 3]
- [default, 2]
- [low_priority, 1]
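The numbers are weights: a queue with weight 5 is checked about five times as often as a queue with weight 1, so critical jobs are usually picked up first without starving the low-priority queue.
When a job fails, Sidekiq retries it automatically with exponential backoff. Active Job also lets you declare retry behavior per exception: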
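To get a feel for what the weights mean, here is a simplified model in plain Ruby. It assumes each fetch picks one queue at random in proportion to its weight; this is an illustration of weighted selection, not Sidekiq's actual fetch code, and pick_queue is a hypothetical helper.

```ruby
# Simplified model: each fetch picks one queue at random, in proportion to its weight.
WEIGHTS = { "critical" => 5, "mailers" => 3, "default" => 2, "low_priority" => 1 }.freeze

def pick_queue(weights, rng)
  ticket = rng.rand(weights.values.sum) # integer in 0...total_weight
  weights.each do |name, weight|
    return name if ticket < weight

    ticket -= weight
  end
end

rng = Random.new(42)
picks = Hash.new(0)
11_000.times { picks[pick_queue(WEIGHTS, rng)] += 1 }

WEIGHTS.each_key { |name| puts format("%-13s %d", name, picks[name]) }
```

Over many fetches the critical queue is chosen roughly five times as often as low_priority, yet low_priority still gets regular turns.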
class ProcessVideoJob < ApplicationJob
  retry_on TranscodingError, wait: 30.seconds, attempts: 3
  discard_on InvalidVideoError

  def perform(video_id)
    video = Video.find(video_id)
    video.transcode!
  end
end
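retry_on is an Active Job feature: it re-enqueues the job after the given delay, up to the attempt limit. discard_on drops the job without retrying, which suits unrecoverable errors. Once retry_on has used up its attempts, the exception bubbles up to the queue backend. With Sidekiq, its own retry mechanism then takes over, and a job that exhausts those retries lands in the “dead” set, where you can inspect it and retry it manually.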
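The retry-then-give-up control flow looks roughly like this in plain Ruby. This is a generic sketch with doubling waits, not Sidekiq's or Active Job's exact backoff formula; run_with_retries, TransientError, and the injectable sleeper are all hypothetical names introduced for the example.

```ruby
class TransientError < StandardError; end

# Hypothetical helper: run the block, retrying TransientError with doubling waits.
def run_with_retries(attempts:, base_wait:, sleeper: ->(seconds) { sleep(seconds) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue TransientError
    raise if tries >= attempts # out of attempts: let the error bubble up

    sleeper.call(base_wait * 2**(tries - 1)) # 1x, 2x, 4x, ...
    retry
  end
end

waits = []
result = run_with_retries(attempts: 4, base_wait: 30, sleeper: ->(s) { waits << s }) do |try|
  raise TransientError, "flaky" if try < 3

  "succeeded on try #{try}"
end

puts result
puts "waited: #{waits.inspect}"
```

The sleeper is passed in so the example records the waits instead of actually sleeping; a real job backend schedules the retry for later instead of blocking a worker.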
Imagine you are loading a photo album on your phone. You could tap each thumbnail one by one and wait for the full image to download (lazy loading). Or you could tap “download all” and wait once while every image loads in the background (eager loading). Both approaches get you the same images, but eager loading is dramatically faster when you need all of them.
ActiveRecord uses lazy loading by default. When you access an association for the first time, it fires a SQL query. This leads to the notorious N+1 query problem:
posts = Post.limit(10)
posts.each do |post|
  puts post.author.name # Fires a query for EACH post
end
# Result: 1 query for posts + 10 queries for authors = 11 queries
That is 1 query to load posts, plus N queries (one per post) to load each author. With 100 posts, that is 101 queries. With 1,000, it is 1,001. The database becomes the bottleneck.
The includes method tells Rails to load associations in advance using a few large queries instead of many small ones:
posts = Post.includes(:author).limit(10)
posts.each do |post|
  puts post.author.name # No additional query; already loaded
end
# Result: 1 query for posts + 1 query for all authors = 2 queries
Rails is smart about this. It loads all posts, collects the author IDs, and runs a single SELECT * FROM authors WHERE id IN (1, 2, 3, ...). Two queries, regardless of how many posts you have.
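The difference in query counts can be reproduced without a database. The sketch below uses a hypothetical FakeDB class that simply counts every "query" it receives, then compares per-record loading with the collect-the-IDs-and-batch approach described above.

```ruby
# A fake database that counts every "query" it receives (no Rails involved).
class FakeDB
  attr_reader :query_count

  def initialize
    @authors = { 1 => "Ann", 2 => "Bob", 3 => "Cy" }
    @posts = (1..10).map { |id| { id: id, author_id: (id % 3) + 1 } }
    @query_count = 0
  end

  def posts
    @query_count += 1
    @posts
  end

  def author(id) # like SELECT ... WHERE id = ?
    @query_count += 1
    @authors.fetch(id)
  end

  def authors_where(ids) # like SELECT ... WHERE id IN (...)
    @query_count += 1
    @authors.slice(*ids)
  end
end

# Lazy: one lookup per post.
lazy_db = FakeDB.new
lazy_names = lazy_db.posts.map { |post| lazy_db.author(post[:author_id]) }

# Eager: collect the IDs, fetch all authors at once, then match in memory.
eager_db = FakeDB.new
posts = eager_db.posts
authors = eager_db.authors_where(posts.map { |post| post[:author_id] }.uniq)
eager_names = posts.map { |post| authors.fetch(post[:author_id]) }

puts "lazy: #{lazy_db.query_count} queries, eager: #{eager_db.query_count} queries"
```

Both approaches produce the same list of names; only the number of round trips differs.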
preload works similarly but always uses separate queries (one per association):
Post.preload(:author, :comments)
# 3 queries: posts, authors, comments
eager_load forces a single query using LEFT OUTER JOIN:
Post.eager_load(:author)
# 1 query: SELECT posts.*, authors.* FROM posts LEFT OUTER JOIN authors ON ...
This can be useful when you need to filter or sort on the association, but it produces wider result sets and duplicate rows.
| Method | Queries | Strategy | Best When |
|---|---|---|---|
| includes | 1 or 2+ | Separate queries or JOIN | General purpose: Rails picks the strategy |
| preload | 2+ | Always separate queries | You specifically want separate queries |
| eager_load | 1 | LEFT OUTER JOIN | Filtering/sorting on associations |
In practice, includes covers most cases. Use preload when you know separate queries are faster or a JOIN is not possible (e.g., polymorphic associations). Use eager_load when you need to filter or sort on the association.
You can eager load multiple levels deep:
Post.includes(comments: [:author, :replies])
# Loads posts, comments, comment authors, and replies
Be careful not to over-eager-load. Loading an entire object graph into memory can be worse than a few extra queries. Profile with rack-mini-profiler and load only what you actually use.
The Bullet gem is like a security camera for your database queries. It watches every request and alerts you when it detects an N+1 query pattern or unnecessary eager loading. Think of it as a linter for your database access.
# Gemfile
group :development, :test do
  gem 'bullet'
end

# config/environments/development.rb
config.after_initialize do
  Bullet.enable = true
  Bullet.bullet_logger = true
  Bullet.rails_logger = true
  Bullet.add_footer = true
end
Bullet catches two types of issues:
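- N+1 queries: an association is loaded one record at a time when it could have been eager loaded
- Unused eager loading: you called includes but never accessed the association
When Bullet detects a problem, it logs a warning and adds a visible footer to the page: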
GET /posts
USE eager loading detected
Post => [:author]
Add to your query: Post.includes(:author)
This is invaluable during development. It catches N+1 problems the moment they appear, before they reach production. You should never deploy code that Bullet has warned you about.
You can also run Bullet in your test suite to catch regressions:
# spec/rails_helper.rb
Bullet.enable = true
Bullet.bullet_logger = true
Bullet.raise = true # Fails the test on N+1 detection
With Bullet.raise = true, any test that triggers an N+1 query will fail with a clear error message telling you exactly which association needs eager loading.
Once your app is in production, you need continuous visibility into its performance. Monitoring tools track metrics over time, alert you when things degrade, and help you understand how real users experience your application.
The four key metrics for any web application:
| Metric | What It Measures | Target |
|---|---|---|
| Response time | How long requests take | p95 < 200ms |
| Throughput | Requests per second | Trend upward |
| Error rate | Percentage of failed requests | < 0.5% |
| Database time | Time spent in SQL queries | < 30% of total |
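Targets like "p95 < 200ms" are percentile-based because averages hide outliers. The sketch below computes a mean, p50, p95, and error rate over a made-up sample of ten requests; the percentile helper uses the nearest-rank definition, which is one of several conventions an APM tool might use.

```ruby
# Nearest-rank percentile: the smallest value with at least pct% of samples at or below it.
def percentile(values, pct)
  sorted = values.sort
  rank = (pct * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Hypothetical sample: response times in milliseconds and HTTP statuses.
durations_ms = [120, 80, 95, 300, 150, 110, 90, 2000, 130, 100]
statuses     = [200, 200, 200, 500, 200, 200, 200, 200, 200, 200]

mean       = durations_ms.sum / durations_ms.length.to_f
p50        = percentile(durations_ms, 50)
p95        = percentile(durations_ms, 95)
error_rate = 100.0 * statuses.count { |s| s >= 500 } / statuses.length

puts "mean=#{mean}ms p50=#{p50}ms p95=#{p95}ms errors=#{error_rate}%"
```

In this sample a single 2-second request drags the mean to 317.5ms even though the typical request (p50) takes 110ms, and the p95 exposes the slow tail directly.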
rack-mini-profiler: Adds a timing badge to every page in development. Click it to see SQL queries and partial render times. It works in development without extra configuration.
bullet: Catches N+1 queries and unused eager loading in real time, as we covered in the previous section.
New Relic: A full-featured APM. Tracks response times, database queries, external service calls, and background jobs. Provides dashboards, alerts, and distributed tracing for microservice architectures.
Skylight: Built by Tilde specifically for Rails. Minimal setup, a clean interface, and automatic detection of slow endpoints and queries. A good choice if you want something Rails-native.
Datadog: An all-in-one observability platform. APM, log management, infrastructure monitoring, and alerting in a single tool. Powerful but can be expensive at scale.
Scout APM: Lightweight and affordable. Good for smaller teams that need the essentials without the complexity of New Relic or Datadog.
When you open your APM dashboard, start with these questions:
The goal is not to optimize everything — it is to find the highest-impact changes. Fixing one endpoint that handles 40% of your traffic will have far more impact than shaving 10ms off a rarely-used admin page.
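One rough way to rank endpoints by impact is to multiply request volume by average duration and look at each endpoint's share of the total. The figures below are invented for illustration; a real APM export would supply them.

```ruby
# Hypothetical per-endpoint stats, as an APM might report them.
endpoints = [
  { name: "GET /posts",        requests: 40_000, avg_ms: 180 },
  { name: "GET /posts/:id",    requests: 55_000, avg_ms: 60 },
  { name: "GET /admin/report", requests: 200,    avg_ms: 2_500 }
]

total_ms = endpoints.sum { |e| e[:requests] * e[:avg_ms] }

# Share of total server time = requests x average duration, as a percentage.
ranked = endpoints
  .map { |e| e.merge(share: 100.0 * e[:requests] * e[:avg_ms] / total_ms) }
  .sort_by { |e| -e[:share] }

ranked.each { |e| puts format("%-18s %5.1f%% of total time", e[:name], e[:share]) }
```

The admin report is by far the slowest single request here, yet it accounts for the smallest share of total time, so it is the last place to start.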
HTTP/2 introduced a feature called Server Push, which allows the server to proactively send resources to the browser before the browser even asks for them. In a traditional HTTP/1.1 flow, the browser downloads the HTML, parses it, finds references to CSS and JavaScript files, and then requests those files. With Server Push, the server can send the CSS and JS alongside the HTML, saving a round trip.
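Rails does not push assets itself. What it provides is preload hints, sent as Link headers, which HTTP/2-capable proxies and CDNs could translate into pushes:
# Since Rails 6.1, stylesheet_link_tag and javascript_include_tag add a
# Link: <...>; rel=preload header for each asset by default
# (controlled by config.action_view.preload_links_header)
Rails can also send the same hints ahead of the response as HTTP 103 Early Hints, if the application server supports them:
class ApplicationController < ActionController::Base
  before_action :hint_assets

  private

  def hint_assets
    # No-op unless the server supports Early Hints (for example, bin/rails server --early-hints)
    request.send_early_hints("link" => "</assets/application.css>; rel=preload; as=style")
  end
end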
Server Push sounds great in theory, but in practice it has limited benefits:
In most modern Rails applications, the asset pipeline already generates fingerprinted URLs and sets long cache headers. Combined with HTTP/2 multiplexing (which allows multiple requests over a single connection), the need for Server Push is reduced. Browsers have since moved away from it: Chrome removed Server Push support in version 106, and <link rel="preload"> is the usual replacement.
The <link rel="preload"> hint tells the browser to fetch a resource with high priority, but lets the browser decide whether to actually use it:
<link rel="preload" href="<%= asset_path('critical.css') %>" as="style">
<link rel="preload" href="<%= asset_path('hero-image.webp') %>" as="image">
This is generally preferred over Server Push because the browser can check its cache first and skip the download if the resource is already stored. It gives you the performance benefit without the waste.
Imagine searching for a specific name in a phone book. If the names are in alphabetical order, you flip to the right section and find the entry in seconds. If the names are in random order, you have to read every single page until you find it. A database index is the alphabetical ordering — it lets the database find rows without scanning the entire table.
An index is a separate data structure (typically a B-tree) that stores a sorted copy of one or more columns, along with pointers to the actual table rows. When you query an indexed column, the database traverses the B-tree instead of scanning every row.
Without an index, a query like SELECT * FROM users WHERE email = 'alice@example.com' must check every single row — an operation called a sequential scan or full table scan. With an index on the email column, the database jumps directly to the matching row, like looking up a word in a dictionary.
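The gap between the two access paths can be approximated in plain Ruby. This sketch uses a sorted array with binary search as a stand-in for a B-tree (a real B-tree is a different structure, but the lookup cost grows logarithmically in both), and counts how many entries each approach inspects. The table contents and helper names are made up.

```ruby
# Hypothetical table: 10,000 users in arbitrary (insertion) order.
users = (1..10_000).map { |i| { id: i, email: "user#{(i * 7919) % 10_000}@example.com" } }

# The "index": (email, row position) pairs kept in sorted order.
email_index = users.each_with_index.map { |user, pos| [user[:email], pos] }.sort

# Check rows one by one until the email matches.
def sequential_scan(users, email)
  checked = 0
  row = users.find { |user| checked += 1; user[:email] == email }
  [row, checked]
end

# Binary-search the sorted index, then jump straight to the row.
def index_lookup(users, email_index, email)
  checked = 0
  entry = email_index.bsearch { |(indexed_email, _)| checked += 1; email <=> indexed_email }
  [entry && users[entry[1]], checked]
end

target = users.last[:email]
scan_row, scan_checks = sequential_scan(users, target)
index_row, index_checks = index_lookup(users, email_index, target)

puts "scan: #{scan_checks} rows checked, index: #{index_checks} entries checked"
```

Both lookups return the same row, but the scan inspects every row in the worst case while the indexed lookup needs only a dozen or so comparisons.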
Indexes make reads faster but make writes slower. Every INSERT, UPDATE, or DELETE must update not just the table but also every index on that table. A table with 10 indexes pays the write penalty 10 times. The rule of thumb: add indexes for columns you query frequently, and avoid indexing columns that change constantly.
class AddIndexToUsersEmail < ActiveRecord::Migration[7.0]
  def change
    add_index :users, :email, unique: true
  end
end

# Multiple indexes in one migration
class AddPerformanceIndexes < ActiveRecord::Migration[7.0]
  def change
    add_index :posts, :user_id
    add_index :posts, :published_at
    add_index :comments, [:post_id, :created_at]
  end
end
A composite (multi-column) index covers queries that filter on multiple columns. The column order matters — the index is useful when queries filter on the leftmost columns first:
add_index :orders, [:user_id, :status, :created_at]
This index helps with:
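- WHERE user_id = 1
- WHERE user_id = 1 AND status = 'pending'
- WHERE user_id = 1 AND status = 'pending' AND created_at > '2026-01-01'
But it does NOT help with:
- WHERE status = 'pending' (skips the leftmost column)
Add an index when:
- A column appears in WHERE clauses frequently
- A column is used in ORDER BY clauses and the sort is slow
- A column is used in JOIN conditions
Check which indexes your database actually uses with EXPLAIN. In Rails, call explain on a relation (on Rails 7.1+ with PostgreSQL, explain(:analyze) runs EXPLAIN ANALYZE instead):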
# In the Rails console
User.where(email: 'alice@example.com').explain
The output shows whether the query used an index or fell back to a sequential scan.
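The leftmost-prefix rule for composite indexes can be written down as a few lines of Ruby. This is a deliberately simplified model with a hypothetical usable_prefix helper: it only considers which columns a query filters on, and real query planners have additional strategies, so treat EXPLAIN as the final word.

```ruby
# Which leading columns of a composite index can serve a set of filtered columns?
def usable_prefix(index_columns, filtered_columns)
  index_columns.take_while { |column| filtered_columns.include?(column) }
end

INDEX = %i[user_id status created_at].freeze

[
  %i[user_id],
  %i[user_id status],
  %i[status],
  %i[user_id created_at]
].each do |filters|
  prefix = usable_prefix(INDEX, filters)
  puts "WHERE on #{filters.inspect}: index can use #{prefix.inspect}"
end
```

A query that filters only on status gets an empty prefix, and one that filters on user_id and created_at can use the index for user_id alone because status is skipped in between.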
Think of a taxi stand with five taxis. When people (web requests) need a ride (database connection), they take a taxi. If all five taxis are busy, the next person has to wait until one returns. If you have 100 people waiting and only 5 taxis, most of them will wait a long time. Adding more taxis (connections) helps, but each taxi costs money (database resources). Connection pooling is the art of finding the right number.
A connection pool maintains a set of reusable database connections. Instead of creating a new connection for every request (which is expensive — TCP handshake, authentication, session setup), the pool hands out an existing idle connection. When the request finishes, the connection returns to the pool for the next request.
Rails manages connection pooling through ActiveRecord:
# config/database.yml
development:
  adapter: postgresql
  database: myapp_development
  pool: 5
  checkout_timeout: 5
- pool: the maximum number of connections the pool will open
- checkout_timeout: how long (in seconds) a thread waits for a free connection before an error is raised; the default is 5
Every open connection consumes memory on the database server. PostgreSQL uses roughly 10MB per connection. With 200 connections, that is 2GB of memory just for connection overhead, memory that could otherwise be used for caching data. MySQL has similar costs.
Most databases also have a maximum connection limit. PostgreSQL defaults to 100, and many cloud providers set even lower limits on managed databases. If your app tries to open more connections than the database allows, new requests will fail.
In a multi-threaded server (like Puma), multiple threads handle requests concurrently. Each thread needs its own database connection because sharing a connection across threads would cause queries to interleave and produce incorrect results.
Rails handles this automatically: each thread checks out a connection from the pool at the start of a request and returns it at the end. As long as your pool size is at least as large as your thread count, every thread gets a connection immediately.
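The checkout-and-return cycle, including the wait when the pool is empty, can be sketched with a mutex and a condition variable. TinyPool is a toy written for this post, not ActiveRecord's implementation; the string "connections" and the very short timeout are assumptions chosen to keep the demo fast.

```ruby
class TinyPool
  class CheckoutTimeout < StandardError; end

  def initialize(size:, checkout_timeout:)
    @idle = Array.new(size) { |i| "conn-#{i}" } # stand-ins for real connections
    @checkout_timeout = checkout_timeout
    @lock = Mutex.new
    @returned = ConditionVariable.new
  end

  # Check out a connection, yield it, and always return it to the pool.
  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end

  private

  def checkout
    deadline = monotonic_now + @checkout_timeout
    @lock.synchronize do
      while @idle.empty?
        remaining = deadline - monotonic_now
        raise CheckoutTimeout, "no free connection within #{@checkout_timeout}s" if remaining <= 0

        @returned.wait(@lock, remaining) # sleep until a checkin or the deadline
      end
      @idle.pop
    end
  end

  def checkin(conn)
    @lock.synchronize do
      @idle.push(conn)
      @returned.signal
    end
  end

  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

pool = TinyPool.new(size: 1, checkout_timeout: 0.05)
outcome = nil

pool.with_connection do |conn|
  # The only connection is busy, so a second thread times out waiting for it.
  outcome = Thread.new do
    pool.with_connection { |other| "got #{other}" }
  rescue TinyPool::CheckoutTimeout
    "timed out"
  end.value
  puts "#{conn} in use, second thread #{outcome}"
end

after = pool.with_connection { |conn| conn }
puts "after release, checkout returns #{after}"
```

With a pool of one, the second thread cannot get a connection while the first holds it; once the block finishes, the connection is back in the pool and the next checkout succeeds immediately.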
The pool size should match your web server’s thread count:
Pool size >= Puma threads per worker
If Puma runs 5 threads per worker and you have 2 workers, each worker needs at least 5 connections:
production:
  pool: 5 # matches Puma's threads setting
# config/puma.rb
threads 0, 5 # min 0, max 5 threads per worker
workers 2 # 2 worker processes
Two workers each with 5 threads = 10 concurrent requests maximum, each needing a connection. With a pool of 5 per worker, you use 10 connections total. Add a few extra for background jobs and Rails console sessions.
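The same arithmetic as a tiny calculator, useful for sanity-checking a deployment against the database's connection limit. The helper name, the Sidekiq thread count, and the number of spare connections are assumptions for the example.

```ruby
# Hypothetical helper: worst-case number of connections an app can hold open.
def connection_budget(web_workers:, threads_per_worker:, job_threads: 0, spare: 0)
  web_workers * threads_per_worker + job_threads + spare
end

web_only = connection_budget(web_workers: 2, threads_per_worker: 5)
needed = connection_budget(web_workers: 2, threads_per_worker: 5, job_threads: 10, spare: 3)
max_connections = 100 # PostgreSQL's default limit

puts "web only: #{web_only}, with jobs and spare: #{needed} of #{max_connections}"
puts "over budget!" if needed > max_connections
```

Scaling to more workers or more servers multiplies the first term, which is usually how applications run into the database's limit.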
When all connections in the pool are busy and a thread tries to check one out, it waits up to checkout_timeout seconds (5 by default). If no connection becomes available, Rails raises:
ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds
This usually means:
For high-traffic applications, you can use PgBouncer — an external connection pooler that sits between your app and PostgreSQL. It maintains hundreds of connections to your application but only a small number to the database:
App (200 connections) -> PgBouncer -> PostgreSQL (25 connections)
PgBouncer handles the multiplexing, so even with 200 app connections, PostgreSQL only sees 25. This is essential when running many Puma workers or when you have multiple services connecting to the same database.
We have covered a lot of ground. Let’s map each performance technique to the problem it solves:
| Problem | Technique | Tool/Pattern |
|---|---|---|
| Repeated view rendering | Caching | Fragment, Russian doll |
| Slow work inside requests | Background jobs | Active Job + Sidekiq |
| N+1 queries | Eager loading | includes, preload |
| Detecting N+1 early | Bullet gem | Development + test suite |
| Slow queries on WHERE | Database indexing | B-tree indexes |
| Too many open conns | Connection pooling | Pool config + PgBouncer |
| No visibility in prod | Monitoring | New Relic, Skylight |
| Resource pre-fetching | Preload hints | <link rel="preload"> |
Before moving on, verify you can answer these:
- Why is <link rel="preload"> generally preferred over HTTP/2 Server Push?
If you can answer all of these, you have a solid foundation in Rails performance optimization. The next step is applying these techniques to your own application: profile first, optimize second, and measure the results.