RUBY ON RAILS · 16 MIN READ ·

Rails Counter Cache: Eliminate N+1 COUNT Queries Without the Production Gotchas

Rails counter cache kills N+1 COUNT queries on has_many associations. Set it up properly, reset stale counters, and dodge the gotchas that bite in production.

Rails Counter Cache: Eliminate N+1 COUNT Queries Without the Production Gotchas

A B2B marketplace I advise hit Postgres at 94 percent CPU on a Tuesday afternoon, and pgHero showed the worst offender by a wide margin was a single query: SELECT COUNT(*) FROM messages WHERE messages.conversation_id = $1, executed 41,000 times an hour. The inbox view rendered each conversation in a list with the message count next to it, the developer had used conversation.messages.count in the partial, and Rails had cheerfully fired one COUNT query per row, every render, for every user, for two years. The fix was a four-line migration and a single counter_cache: true on the association. The next morning Postgres was at 31 percent. The Rails counter cache is one of the highest-leverage three-letter optimizations in the framework, and one of the most quietly bug-prone if you do not understand what it actually is.

After nineteen years of Rails I have shipped counter caches that saved companies, and I have debugged counter caches that drifted out of sync and made an entire admin dashboard lie for six months. This post is everything I wish a junior engineer knew before adding their first counter_cache: true.

What a Rails Counter Cache Actually Is

A Rails counter cache is a denormalized integer column on the parent model that Rails keeps in sync with the count of its children. When you write belongs_to :conversation, counter_cache: true on Message, Rails takes responsibility for incrementing conversations.messages_count on every Message.create and decrementing it on every Message.destroy. The read becomes a single column lookup. The write becomes an extra UPDATE.

The trade is exactly what every denormalization trade is. You pay a small write cost to eliminate a potentially enormous read cost. On a marketplace inbox, a feed page, a forum index, or any list view where you render a count next to a parent, that trade is almost always correct. A count query is O(n) on the children. A column read is O(1). At 18,000 conversations rendered per minute, the math is not subtle.

The counter cache is also one of the few denormalizations in Rails where the framework does most of the bookkeeping for you. That makes it easy to add, and easy to misuse. Almost every counter cache bug I have shipped or seen shipped traces back to engineers thinking the framework does more than it does.

Setting Up a Rails Counter Cache The Right Way

A clean counter cache setup is a migration, a model change, and a one-time backfill. The order matters.

# 1. Migration — add the column with a safe default.
class AddMessagesCountToConversations < ActiveRecord::Migration[7.1]
  def change
    add_column :conversations, :messages_count, :integer, default: 0, null: false
  end
end

A null: false with a default of zero is non-negotiable. A NULL counter cache poisons every read site that does arithmetic with it, and Rails will not coerce nil to 0 for you in views.

# 2. Model — declare the counter cache on the child.
class Message < ApplicationRecord
  belongs_to :conversation, counter_cache: true
end

# 3. Parent for completeness — readable, not strictly required.
class Conversation < ApplicationRecord
  has_many :messages
end

The declaration goes on the belongs_to, not the has_many. This is the single most common mistake I see in code review. Rails uses the belongs_to callbacks to do the bookkeeping, because that is the side that sees inserts and deletes for the child.

# 4. Backfill — one-time data correction.
Conversation.find_each do |conversation|
  Conversation.reset_counters(conversation.id, :messages)
end

For tables with millions of rows, do the backfill in a background job with batching and a small sleep between batches. A correlated update via raw SQL is faster but harder to throttle, and on a hot production database “faster” is not always the right answer. We have a Rails strong migrations post that goes into the unsafe-migration patterns to avoid here.

How Rails Actually Updates The Counter

Under the hood, Rails issues an UPDATE ... SET messages_count = COALESCE(messages_count, 0) + 1 WHERE id = ? after every successful insert, and the matching decrement after every successful destroy. This is not atomic with your business write unless they share the same transaction. By default, ActiveRecord wraps the create and the counter update in one transaction, so they are. The moment you reach outside that transaction, the guarantee evaporates.

Two cases where the guarantee silently evaporates. First, anything that uses update_all, delete_all, insert_all, or upsert_all bypasses callbacks and the counter cache entirely. Second, any custom SQL that touches the children table directly does the same. We have an entire post on Rails insert_all and upsert_all that calls this out, but it is worth repeating in the counter cache context: bulk inserts do not increment the counter. If you bulk-insert 4,000 messages into a conversation, your messages_count will drift by 4,000.

# Wrong — drifts the counter cache silently.
Message.insert_all(rows)

# Right — bulk insert, then reset the counters for affected parents.
Message.insert_all(rows)
Conversation.where(id: rows.map { |r| r[:conversation_id] }.uniq).find_each do |c|
  Conversation.reset_counters(c.id, :messages)
end

The Gotchas That Bite in Production

There are exactly five counter cache failure modes I see in real Rails apps. In order of how often they detonate.

Drift from raw SQL or bulk operations. Covered above. The remedy is to either avoid bulk operations on counter-cached associations or schedule a periodic reconciliation job that runs reset_counters for the affected parents.

Touch behaviour and stale caches. When a counter cache fires, Rails also touches the updated_at on the parent by default. This is usually what you want — your fragment caches expire correctly — but it can murder you if your parent has a heavy after_update callback. I have seen counter cache increments trigger a webhook cascade that fanned out to 80 downstream calls per message. Set touch: false on the belongs_to if you do not want the touch, or move the callback to an explicit code path.

class Message < ApplicationRecord
  belongs_to :conversation, counter_cache: true, touch: false
end

Conditional creates with validation that fail silently. A counter cache only increments when the child is actually persisted. Message.create returns the unpersisted object on validation failure, and developers sometimes ignore the return value. The counter stays correct, but the user experience is wrong. Use create! or check persisted? if your application logic depends on the count moving.

Race conditions on concurrent destroys. Two background jobs that each destroy the last message in a conversation, racing, can both decrement and leave the counter at -1. Rails has used COALESCE and concurrent-safe SQL for a long time, but I still occasionally see negative counters in legacy apps. If you see them, the cause is usually code that updated the counter manually with increment! instead of letting Rails do it.

Polymorphic associations and STI. Rails has supported counter_cache on polymorphic belongs_to since 5.0, but with a sharp edge. The cache column lives on a single concrete parent table, so the polymorphic counter only makes sense when you scope it properly. Read the source if you are going there. In ten years I have only used a polymorphic counter cache twice and regretted both.

Resetting a Stale Counter Cache

When the cache drifts — and it will, eventually, in any sufficiently large app — reset_counters is the canonical fix. It runs a SELECT COUNT(*) against the children and overwrites the parent column. Run it from a Rake task, a background job, or a one-off console session.

namespace :counters do
  desc "Reconcile conversation message counters"
  task reconcile_messages: :environment do
    drift = []
    Conversation.find_each(batch_size: 500) do |conversation|
      actual = conversation.messages.count
      next if actual == conversation.messages_count
      drift << { id: conversation.id, was: conversation.messages_count, is: actual }
      Conversation.reset_counters(conversation.id, :messages)
    end
    Rails.logger.info("Counter drift reconciled: #{drift.size} rows")
    drift.each { |row| Rails.logger.info(row.inspect) }
  end
end

I run this Rake task weekly on apps where counter accuracy matters and where the data volume is small enough that a full scan is reasonable. On apps with hundreds of millions of rows, I run a sampled version nightly and alert on drift above a threshold. The alert is the point. A counter that is one off does not matter for an inbox header. A counter that is two thousand off means you have a bug somewhere upstream and the cache is the canary.

Counter Cache With Custom Conditions

Sometimes you do not want the count of all children. You want the count of active children, or unread children, or children created in the last seven days. Rails supports custom counter cache columns out of the box. The trick is that Rails will only maintain them for inserts and destroys, not for state changes.

class Message < ApplicationRecord
  belongs_to :conversation, counter_cache: :unread_messages_count
end

class Conversation < ApplicationRecord
  has_many :messages
end

If the count depends on a mutable column — say, messages.read_at IS NULL — counter cache will not help you. The child can transition from unread to read without an insert or a destroy, and Rails will not know. You need a custom callback or a model concern that maintains the cache yourself, and at that point you should ask whether a small materialized view or a SUM(CASE WHEN read_at IS NULL THEN 1 ELSE 0 END) query against a covering index is a better answer.

# A maintained custom counter, used judiciously.
class Message < ApplicationRecord
  belongs_to :conversation

  after_update_commit :bump_unread_counter, if: :saved_change_to_read_at?

  private

  def bump_unread_counter
    if read_at_before_last_save.nil? && read_at.present?
      Conversation.where(id: conversation_id).update_all("unread_messages_count = GREATEST(unread_messages_count - 1, 0)")
    elsif read_at_before_last_save.present? && read_at.nil?
      Conversation.where(id: conversation_id).update_all("unread_messages_count = unread_messages_count + 1")
    end
  end
end

That code is fine. It is also the kind of code I review carefully. Every line of hand-rolled counter maintenance is a future drift bug.

When NOT to Use a Rails Counter Cache

Counter cache is the wrong tool when the count is mostly write-side and rarely read, when the children change category often, or when you need totals across a date range. For “messages in the last 24 hours” you want a query with a partial index, not a counter cache. For totals across many parents — “show me the top ten conversations by message count” — a counter cache is a beautiful fit, because the sort becomes a normal indexed sort on messages_count DESC.

Also resist the temptation to add a counter cache on every has_many reflexively. Each one is an extra UPDATE per write to the child table. On a heavy write path with five counter caches, that is six writes per logical insert. Profile with pg_stat_statements before and after. Sometimes the read you were trying to optimize is fired three times a day from an admin page and the counter cache costs more than it saves.

A Production Counter Cache Checklist

Before shipping a counter cache to production, I walk through this list. It is short on purpose.

The column has null: false and a default of zero. The declaration is on the belongs_to. There is a backfill that ran successfully and was verified by spot-checking ten parents. Bulk operations (insert_all, upsert_all, update_all, delete_all) that touch the children either do not exist or are paired with reset_counters. A weekly or nightly reconciliation job exists for cases where drift matters. Any after_update callbacks on the parent are reviewed for the touch cascade. The new column is indexed only if it is used in ORDER BY or WHERE, not by reflex.

That is the entire list. A counter cache that follows it will run quietly for years. A counter cache that skips even two of those items will drift, lie to your users, and turn up in a Slack thread at 11 p.m. on a Friday.

Frequently Asked Questions

Does Rails counter cache work with destroy_all and delete_all?

destroy_all runs callbacks per record and does update the counter cache correctly, at the cost of N queries. delete_all skips callbacks entirely and does not update the counter cache. After any delete_all against a counter-cached children table, run reset_counters for the affected parents or expect drift.

How do I reset a Rails counter cache for all records?

Use Model.reset_counters(id, :association_name) per parent. For a full reconciliation, iterate with find_each in batches. Avoid building a single transaction over millions of rows. On large tables, schedule the reconciliation as a background job that runs in chunks and yields between batches so the database stays responsive.

Why is my counter_cache showing negative numbers?

Almost always because something decremented the counter manually with increment! or update_columns without a matching increment elsewhere, or because callbacks were bypassed on the create side but ran on the destroy side. Run reset_counters to recover the true value, then audit your codebase for any direct writes to the counter column.

Should I add a database index on the counter_cache column?

Only if you read or sort by it. Indexing every counter cache column by reflex wastes write throughput. If your homepage sorts conversations by messages_count DESC, index it. If you only render the value in a row that is already keyed by conversation_id, do not.

Does Rails counter cache work with soft-deletes?

Not out of the box. A gem like paranoia or discard marks a row as deleted without firing destroy, so the counter does not move. You need a callback on the soft-delete column change, or you need to run reconciliation. This is one of the strongest arguments against soft-deletes in counter-cached tables.

Need help untangling Rails performance bottlenecks or a counter cache that has been quietly lying for months? TTB Software specializes in Rails performance audits and pragmatic fixes. We have been doing this for nineteen years.

#rails-counter-cache #counter-cache-rails #rails-n-plus-one #rails-association-count #counter-cache-reset #rails-performance-tuning #has-many-counter-cache

Related Articles

Last section. Then please call.

It's a phone call. That's the worst it can get.

No discovery deck. No 45-minute "qualification" call. 30 minutes, your problem, my opinion. If we're a fit, you'll know by minute 12.

Direct line — answered by Roger
+31 6 5123 6132
Mon–Fri, 09:00–18:00 CET · Currently available

OR
info@ttb.software