Rails Strong Migrations: Catch Unsafe Database Changes Before They Lock Production
Rails Strong Migrations: catch unsafe Postgres changes — NOT NULL adds, renames, non-CONCURRENTLY indexes — before they lock production tables.
The migration looked harmless. Twelve lines, one new column, a default value of false, a NOT NULL constraint. The author ran it locally on a hundred-row dev database in under a second and merged. The CI pipeline ran it against a fresh schema and took under a second. The staging deploy ran it against a million-row staging database and took two seconds. Everyone went home.
Production had eighty million rows. The migration acquired an ACCESS EXCLUSIVE lock on the orders table and held it for nine minutes while it rewrote every row. Every checkout in the EU timed out. Customer support got a hundred and forty tickets in eleven minutes. The CTO got a phone call. The migration finished, the locks released, the site recovered, and the post-mortem concluded with the sentence “we need better review on migrations.”
After nineteen years of Rails, I can tell you exactly what “better review on migrations” looks like. It is a gem called Strong Migrations, one line in your Gemfile, and a CI check that fails the build before the dangerous migration ever reaches a reviewer’s attention. This post is about why that particular tooling matters, what it catches, what it does not, and how to wire it into a Rails app properly.
What Rails Strong Migrations Actually Does
Strong Migrations is a Rails gem that hooks into your migrations as they run and refuses to execute operations that are known to cause downtime on a busy production database. It is not a linter and it is not a runtime monitor. It is a guard rail that turns a class of “this works locally but kills production” bugs into a hard, loud, actionable failure on the developer’s machine and in CI.
The unsafe patterns it catches are not theoretical. They are the patterns that take down real Rails apps every week. Adding a NOT NULL column with a default on Postgres 10. Removing a column that still has live readers. Renaming a column the application still selects. Creating an index without CONCURRENTLY on a table large enough that the resulting SHARE lock blocks every write for the duration of the build. Each of these is a one-line migration. Each of these can take a site down for minutes to hours depending on table size and traffic shape.
Strong Migrations does not fix the migration. It refuses to run it and prints a message that includes a safe alternative, often with the exact code to use instead. The error message is part of the value. A junior engineer who has never seen the issue before reads the message, learns the pattern, and writes the safer version the first time around.
Installing Rails Strong Migrations in a Rails 8 App
Add the gem and a small initializer.
# Gemfile
gem "strong_migrations"
bundle install
bin/rails generate strong_migrations:install
The generator creates config/initializers/strong_migrations.rb. You will see something like this in the initializer:
StrongMigrations.start_after = 20260619000000
StrongMigrations.target_version = 17 # Postgres major version
StrongMigrations.lock_timeout = 10.seconds
StrongMigrations.statement_timeout = 1.hour
StrongMigrations.auto_analyze = true
start_after is critical and almost always wrong by default. It tells Strong Migrations to ignore migrations whose timestamp is older than that value. Set it to the timestamp of the most recent migration before you adopted the gem so historical migrations do not retroactively fail. New migrations are checked; old ones are grandfathered in.
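As a rough sketch of how to pick that value, the helper below takes a list of migration filenames and returns the highest numeric prefix. The method name latest_migration_version and the sample filenames are invented for illustration and are not part of the gem; in a real app you would pass it something like Dir.children("db/migrate").

```ruby
# Hypothetical helper (not part of strong_migrations): returns the highest
# timestamp prefix found in a list of migration filenames, or nil if none match.
def latest_migration_version(filenames)
  filenames
    .filter_map { |name| name[/\A(\d+)_/, 1] } # keep only "<digits>_..." names
    .map(&:to_i)
    .max
end

# Sample filenames, invented for this example.
files = [
  "20240102030405_create_users.rb",
  "20260619000000_add_status_to_orders.rb",
  "20250101000000_create_orders.rb",
  "README.md"
]

puts "StrongMigrations.start_after = #{latest_migration_version(files)}"
```

Taking the maximum rather than the last filename keeps the result independent of how the directory listing happens to be sorted.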
target_version should match the major version of Postgres you actually run in production. Strong Migrations adjusts its rules based on this. Adding a NOT NULL column with a default is unsafe on Postgres 10 and safe on Postgres 11+. If you tell the gem the wrong version, you get either false positives or false negatives.
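To make the version dependence concrete, here is a deliberately simplified sketch of such a rule. It is not the gem's implementation; the method name and keyword arguments are assumptions made for this example. The volatile-default branch goes slightly beyond the text above: Postgres 11 and later still rewrite the table when the default is a volatile expression.

```ruby
# Simplified illustration of a version-dependent rule. This is NOT the gem's
# implementation; the method name and keywords are assumptions for this sketch.
def add_column_default_rewrites_table?(postgres_major:, volatile_default: false)
  # Before Postgres 11, any default on a new column rewrites every row.
  return true if postgres_major < 11

  # From 11 onward only volatile defaults (for example random()) still rewrite.
  volatile_default
end

[10, 11, 17].each do |major|
  risky = add_column_default_rewrites_table?(postgres_major: major)
  puts "Postgres #{major}: #{risky ? 'table rewrite' : 'metadata-only'}"
end
```

Telling the function the wrong major version flips the answer, which is the false-positive and false-negative risk described above.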
lock_timeout and statement_timeout are the most important two settings in the file and the ones teams skip. They tell Postgres to give up on a migration that has been waiting for a lock longer than lock_timeout, or running longer than statement_timeout. Without these, a migration that gets stuck behind a long-running transaction will sit there forever, accumulating a queue of blocked queries behind it. With them, the migration fails fast, the team gets a clear error, and production stays up.
The Unsafe Patterns You Will Actually Hit
Strong Migrations catches a long list of patterns. In practice, five of them account for almost every production incident I have seen. Knowing them by heart is more useful than reading the full gem README.
Adding a NOT NULL column with a default on older Postgres
add_column :users, :receives_marketing, :boolean, default: false, null: false
On Postgres 10 and earlier, this rewrites every row in the table to fill in the default. On a hundred million row table that is hours, and the table is locked the entire time. On Postgres 11+, the default is stored as a constant in catalog metadata and applied virtually on read, so the operation is instant. Strong Migrations knows your Postgres version and only complains when it matters.
The safe pattern on older Postgres is three migrations:
# 1. Add the column nullable, no default.
add_column :users, :receives_marketing, :boolean
# 2. Backfill in batches in a separate migration or rake task.
User.in_batches.update_all(receives_marketing: false)
# 3. Once application code that always sets the column is deployed, add the constraint and default.
change_column_null :users, :receives_marketing, false
change_column_default :users, :receives_marketing, false
Removing a column the app still reads
remove_column :orders, :legacy_status
ActiveRecord caches the column list when a model is first used. A running web process that started before the migration still believes legacy_status exists, keeps referencing it in the queries it builds, and raises errors until it restarts. Strong Migrations forces you to acknowledge this by adding an ignored_columns declaration in the model first, deploying that, and only then running the removal in a follow-up deploy.
# Deploy 1
class Order < ApplicationRecord
  # += keeps any columns that are already being ignored.
  self.ignored_columns += ["legacy_status"]
end
# Deploy 2 (after Deploy 1 is fully rolled out)
class RemoveLegacyStatusFromOrders < ActiveRecord::Migration[8.0]
  def change
    # Passing the type keeps the migration reversible.
    safety_assured { remove_column :orders, :legacy_status, :string }
  end
end
safety_assured is the explicit override. Use it when you understand the risk and have done the prep work. Never use it because the gem is “annoying.”
Renaming a column
rename_column :products, :stock, :inventory_count
This is the same problem as remove_column plus a write side. Old code reads from stock, new code writes to inventory_count, neither sees the other’s data. The safe pattern is to add the new column, sync writes via a callback or trigger, backfill, switch reads, switch writes, then drop the old column. Five deploys. Strong Migrations does not let you skip steps.
For most apps the right answer to “I want to rename this column” is “do not.” Add a method alias in the model, leave the column alone, and revisit in three years when the table is half its current size and the rename actually pays back.
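The alias approach can be sketched in plain Ruby. The Product class below is a stand-in written for this example, not a real model; in a Rails app the equivalent would be a single alias_attribute :inventory_count, :stock line inside the model.

```ruby
# Plain-Ruby stand-in for a model; in a Rails app the equivalent one-liner
# would be `alias_attribute :inventory_count, :stock` inside the model class.
class Product
  attr_accessor :stock

  def initialize(stock:)
    @stock = stock
  end

  # New name, same underlying data: reads and writes go through to `stock`.
  alias_method :inventory_count, :stock
  alias_method :inventory_count=, :stock=
end

product = Product.new(stock: 12)
product.inventory_count = 9
puts product.stock
```

Because both names resolve to the same storage, old and new callers can coexist without any schema change or deploy ordering.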
Creating an index without CONCURRENTLY
add_index :events, :user_id
A standard CREATE INDEX takes a SHARE lock on the table, which still allows reads but blocks every insert, update, and delete for the duration of the build. On a small table this is invisible. On a fifty million row table on a busy production database, it is a five-minute outage of every write to that table.
The fix is one keyword:
class AddIndexOnEventsUserId < ActiveRecord::Migration[8.0]
  disable_ddl_transaction!

  def change
    add_index :events, :user_id, algorithm: :concurrently
  end
end
disable_ddl_transaction! is required because CREATE INDEX CONCURRENTLY cannot run inside a transaction. Strong Migrations refuses the migration unless you include both the algorithm and the transaction disable, which is exactly the discipline you want.
Pair this with pg_stat_statements once the index is in place — you will usually see the query that motivated the index drop to a fraction of its previous cost, and occasionally you will discover the index did not help and you need a different one.
Backfilling data inside a schema migration
def change
  add_column :invoices, :tax_cents, :integer
  Invoice.find_each { |i| i.update!(tax_cents: i.subtotal_cents * 0.21) }
end
This combines two anti-patterns in one migration. The schema change holds a lock the entire time the backfill runs, and the backfill is slow because it instantiates every row through ActiveRecord. On a million-row table the migration locks the table for fifteen minutes while the backfill grinds through. Strong Migrations rejects it.
The safe pattern is to split the schema change from the backfill, do the backfill in a separate Rake task or background job that runs outside the migration runner, and use update_all in batches rather than loading objects:
# Migration: schema only, instant.
class AddTaxCentsToInvoices < ActiveRecord::Migration[8.0]
  def change
    add_column :invoices, :tax_cents, :integer
  end
end
# Backfill: separate rake task, run after the deploy.
namespace :backfill do
  task tax_cents: :environment do
    Invoice.in_batches(of: 1_000) do |batch|
      batch.update_all("tax_cents = (subtotal_cents * 21 / 100)")
    end
  end
end
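To show what batching buys you, here is a database-free sketch that walks an id range in fixed-size chunks and applies the same integer arithmetic as the SQL expression above. The helper each_id_range and the sample data are assumptions for illustration; in_batches itself pages by primary key with a LIMIT rather than using fixed ranges, so treat this as an approximation of the idea.

```ruby
# Hypothetical helper: yields inclusive id ranges covering first_id..last_id
# in chunks of batch_size, roughly how a batched backfill walks a table.
def each_id_range(first_id, last_id, batch_size)
  return enum_for(:each_id_range, first_id, last_id, batch_size) unless block_given?

  start = first_id
  while start <= last_id
    finish = [start + batch_size - 1, last_id].min
    yield start..finish
    start = finish + 1
  end
end

# Stand-in for the invoices table: id => subtotal_cents (invented numbers).
subtotals = { 1 => 1_000, 2 => 2_550, 3 => 199, 4 => 10_000, 5 => 1 }
tax_cents = {}

each_id_range(1, 5, 2) do |ids|
  # In the real backfill each batch is one short UPDATE statement.
  ids.each { |id| tax_cents[id] = subtotals[id] * 21 / 100 }
end

p tax_cents
```

Each batch touches a bounded number of rows, so no single statement holds row locks for long. Note that the integer division truncates, so a subtotal of 199 cents yields 41 cents of tax, not 42.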
This is the single biggest “Strong Migrations changed how I write code” moment for most teams. Separate schema from data once and you stop conflating them forever.
Wiring Strong Migrations Into CI
Strong Migrations runs at migration time, which means it catches the issue when the developer runs bin/rails db:migrate. That is good but not sufficient. You also want CI to fail loudly so a migration that escaped local review still does not reach production.
Run the migrations as part of the CI setup, not just a schema load. bin/rails db:test:prepare, which most Rails apps already run, loads the schema file instead of executing the migrations, so it does not trigger the checks by itself. Make sure the CI database is at the same Postgres major version as production; otherwise Strong Migrations cannot enforce the version-specific rules correctly.
For pull request reviews, add a job that runs only the new migrations against a copy of the production schema:
# .github/workflows/migrations.yml
name: Migration safety
on: [pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:17
        env:
          POSTGRES_PASSWORD: postgres
        ports: ['5432:5432']
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true
      - run: bin/rails db:create db:schema:load
        env:
          RAILS_ENV: test
      - run: bin/rails db:migrate
        env:
          RAILS_ENV: test
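The schema load mirrors the current production schema (or close enough), the migrate runs the new migrations on top, and Strong Migrations refuses any unsafe pattern. One caveat: the db/schema.rb on a PR branch usually already records the new migration versions, in which case db:migrate finds nothing pending. For the check to have work to do, load the schema from the base branch or migrate from an empty database. When a migration is unsafe, the build fails, the PR cannot merge, and the issue is caught before review even starts.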
Statement Timeout and Lock Timeout in Practice
The two timeouts in the initializer deserve their own section because they are how Strong Migrations integrates with the rest of Postgres’ safety story.
lock_timeout is the maximum time Postgres will wait to acquire a lock. Default Postgres behavior is to wait forever. With lock_timeout = 10.seconds, a migration that cannot get an ACCESS EXCLUSIVE lock within ten seconds fails with PG::LockNotAvailable and rolls back. The team retries during a quieter window. Crucially, this prevents the queue-behind-a-blocked-DDL pattern where a stalled migration also blocks every subsequent write to the table, which is how a slow migration turns into a full site outage.
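The fail-fast-then-retry behavior can be sketched without a database. Everything here is an assumption for illustration: LockTimeout stands in for the error a real migration would raise (PG::LockNotAvailable, which Rails surfaces as ActiveRecord::LockWaitTimeout), and with_lock_retries is a hypothetical helper, not a Strong Migrations API.

```ruby
# Stand-in for the error a real migration would see (PG::LockNotAvailable,
# surfaced by Rails as ActiveRecord::LockWaitTimeout). Defined here only so
# the sketch runs without a database.
class LockTimeout < StandardError; end

# Hypothetical helper: runs the block, retrying with a growing delay when the
# lock could not be acquired. Re-raises after the final attempt.
def with_lock_retries(attempts: 3, base_delay: 0.01)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue LockTimeout
    raise if tries >= attempts

    sleep(base_delay * tries)
    retry
  end
end

# Simulated migration step that only gets the lock on the third try.
result = with_lock_retries(attempts: 5) do |attempt|
  raise LockTimeout, "lock not available" if attempt < 3

  "migrated on attempt #{attempt}"
end

puts result
```

The important property is that each failed attempt gives up quickly and releases its place in the lock queue, so ordinary traffic keeps flowing between attempts.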
statement_timeout is the maximum time a single statement will run. A CREATE INDEX CONCURRENTLY on a huge table can legitimately take hours, so set this higher than your worst expected migration. An hour is a reasonable default; lower it temporarily for known-fast migrations if you want extra safety.
These pair well with advisory locks for coordinating long-running data backfills across multiple runners, and with logical replication for the truly large schema changes where even a safe migration is too risky on the primary.
When Strong Migrations Is Wrong
Strong Migrations is conservative on purpose. It will occasionally refuse a migration that you know is safe — for example, a remove_column on a table whose model has been deleted entirely, or an add_index without CONCURRENTLY on a table that you just created in the same deploy. The safety_assured block is the escape hatch:
class DropAbandonedExperimentsTable < ActiveRecord::Migration[8.0]
  def change
    safety_assured { drop_table :abandoned_experiments }
  end
end
Use it deliberately. The wrong response to “Strong Migrations rejects my migration” is to wrap everything in safety_assured. The right response is to read the error message, understand which pattern triggered it, decide whether the production risk is real for your table size and traffic shape, and either rewrite the migration or annotate it with the explicit acknowledgement that you accept the risk.
Pair this with a code review rule: any PR that introduces safety_assured requires a one-line comment explaining why. That is enough friction to keep teams honest without slowing them down.
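One way to back that rule with automation is a small text scan over migration source. The sketch below is a hypothetical review aid, not something the gem provides: it flags any safety_assured that has no comment on the same line or on the line directly above. It is a plain string check, so it will miss cases and should be treated as a starting point.

```ruby
# Hypothetical review aid: returns the 1-based line numbers where
# `safety_assured` appears without a comment on the same line or the line
# directly above. A plain text scan, not how strong_migrations works.
def unexplained_safety_assured(source)
  lines = source.lines.map(&:chomp)
  lines.each_index.select do |i|
    next false unless lines[i].match?(/\bsafety_assured\b/)
    next false if lines[i].strip.start_with?("#") # the mention is itself a comment

    same_line = lines[i].include?("#")
    line_above = i.positive? && lines[i - 1].strip.start_with?("#")
    !(same_line || line_above)
  end.map { |i| i + 1 }
end

explained = <<~RUBY
  def change
    # Model was deleted in an earlier release; nothing reads this table.
    safety_assured { drop_table :abandoned_experiments }
  end
RUBY

unexplained = <<~RUBY
  def change
    safety_assured { remove_column :orders, :legacy_status, :string }
  end
RUBY

p unexplained_safety_assured(explained)
p unexplained_safety_assured(unexplained)
```

Run over the files changed in a PR, a non-empty result is a prompt for the reviewer to ask why the override is there.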
Frequently Asked Questions
How does Rails Strong Migrations differ from manual code review of migrations?
Code review catches what the reviewer remembers to look for. Strong Migrations catches every instance of every known unsafe pattern, every time, without exception, in the developer’s local environment before the PR is even opened. It also encodes Postgres version-specific knowledge that most reviewers do not carry in their head — for example, that add_column with a default became safe in Postgres 11 but not before. The gem complements review; it does not replace it. The reviewer is still the right place to catch logic bugs and architectural problems. The gem is the right place to catch “this will lock production for nine minutes.”
Does Rails Strong Migrations work with Rails 8 and Solid Queue?
Yes. Strong Migrations operates at the ActiveRecord migration level and is agnostic to which Rails version or which background job framework you use. It works identically on Rails 6.1, 7.x, and 8.0, including with the new Solid Queue and Solid Cache schema migrations. If you are on Rails 8 with bundled Solid Queue, the gem will inspect those migrations the same as any other and apply the same rules.
Can I use Rails Strong Migrations with MySQL or only Postgres?
Strong Migrations supports both Postgres and MySQL, but the depth of analysis is significantly better on Postgres. MySQL has a different set of unsafe patterns and a different model of online DDL (some MySQL versions can perform certain schema changes without a full lock via ALGORITHM=INPLACE), so the gem’s rules and recommended fixes are tailored per adapter. Set StrongMigrations.target_version to your MySQL major version on a MySQL app and the gem adjusts its rules accordingly.
What is the right value for StrongMigrations.start_after when adopting the gem on a legacy app?
Set it to the timestamp of the most recent migration in your repository at the moment you adopt the gem. Find it with ls -1 db/migrate | tail -1 and use the numeric prefix. This grandfathers in every historical migration (they have already run in production, so checking them now is pointless) and ensures that every new migration written from this point onward is inspected. The wrong values are 0 (which makes the gem check thousands of irrelevant historical migrations) or a future timestamp (which silently disables the gem for everything currently in flight).