Sidekiq to Solid Queue Migration: A Zero-Downtime Guide for Production Rails Apps
A SaaS client called me in February with a problem most Rails teams will face in the next two years: their Sidekiq Pro license renewal was a five-figure bill, their Redis node had failed over twice the previous quarter, and Rails 8 had shipped with Solid Queue as the default. Their CTO wanted to know if a Sidekiq to Solid Queue migration was realistic for a system processing thirty million jobs a month — and if so, how to do it without losing a single payment-processing job along the way. We finished the migration in six weeks. Their Redis bill went to zero. They have not had a queue-related incident since.
After nineteen years of Rails I have moved a lot of background-job systems. Resque to Sidekiq. Delayed Job to Sidekiq. Sidekiq to GoodJob. And now, repeatedly, Sidekiq to Solid Queue. This post is the playbook I use with clients: the migration steps, the dual-running pattern, what breaks, how to migrate scheduled and recurring jobs, and the rollback plan you should write before you delete a single line of Sidekiq configuration.
If you have not read my earlier piece on Solid Queue fundamentals, start there. This post assumes you already know what Solid Queue is and want to know how to get there from Sidekiq.
Why a Sidekiq to Solid Queue Migration Is Worth Doing
Before you commit, be honest about the trade-offs. Solid Queue is a worthy replacement, not a strict upgrade.
You gain operational simplicity: one fewer service to run, monitor, back up, and patch. You gain transactional enqueue — with the queue tables in your application's database, Order.create! and ChargeOrderJob.perform_later(order) commit atomically in the same Postgres transaction, which kills an entire class of “job ran before the row existed” bugs. You shed a Sidekiq Pro or Enterprise bill that grows with your worker fleet.
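Here is the shape of that guarantee — a minimal sketch, assuming an Order model and a ChargeOrderJob (names illustrative) and queue tables living in the same database:

ApplicationRecord.transaction do
  # The job INSERT joins this transaction: roll back the order and the job
  # is never enqueued, and the job can never be claimed by a worker before
  # the order row is committed.
  order = Order.create!(customer: customer, total_cents: total_cents)
  ChargeOrderJob.perform_later(order)
end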
You lose raw throughput. Sidekiq workers pull jobs from Redis lists in microseconds; Solid Queue workers run a SELECT FOR UPDATE SKIP LOCKED against Postgres every poll interval, which is millisecond-class. For most apps the difference is invisible. For high-fanout systems pushing fifty thousand jobs per minute on a single queue, you will feel it.
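For intuition, the claim query is shaped roughly like this — an illustrative sketch, not Solid Queue's literal SQL:

-- SKIP LOCKED lets many workers poll concurrently without blocking each
-- other; each poll is still a full Postgres round trip.
SELECT id
FROM solid_queue_ready_executions
WHERE queue_name = 'default'
ORDER BY priority, job_id
LIMIT 5
FOR UPDATE SKIP LOCKED;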
The Sidekiq to Solid Queue migration is right when: Redis is a meaningful chunk of your ops complexity, your job volume is under roughly fifty million jobs a month, your Postgres has headroom (or you are willing to add a dedicated queue database), and you do not depend heavily on Sidekiq Pro features like batches or unique jobs that have no direct Solid Queue analog.
It is wrong when: you are at hyperscale, your team is two people and Sidekiq is fine, or you depend on Sidekiq batches and would have to rewrite that orchestration from scratch.
Pre-Migration Audit: What Are You Actually Running?
The single biggest mistake teams make is starting the migration before they understand what they have. Spend a day on this.
Open a rails console on production and inventory your jobs:
Sidekiq::Stats.new.queues
# => {"default"=>0, "mailers"=>3, "critical"=>0, "low"=>421}
Sidekiq::Cron::Job.all.map(&:name)
# Every recurring job you have ever set up
Sidekiq::ScheduledSet.new.size
# Jobs scheduled for the future
Sidekiq::RetrySet.new.size
# Jobs currently in the retry queue
Walk the codebase for every Sidekiq-specific call site:
git grep -n "sidekiq_options"
git grep -n "Sidekiq::Batch"
git grep -n "sidekiq_retry_in"
git grep -n "include Sidekiq::Job"
Make a spreadsheet. For each job class, record: the queue, retry settings, any uniqueness constraints, any batch membership, and any Sidekiq-specific middleware. This list is your migration plan. Anything in it that does not have a Solid Queue equivalent has to be rewritten before you can switch.
The ones that bite: unique jobs (Sidekiq Enterprise) — Solid Queue has different concurrency primitives. Sidekiq::Batch — no direct equivalent; you will model the fan-out/fan-in in your own tables. sidekiq_options lock: :until_executed — replace with Solid Queue's concurrency controls (limits_concurrency, sketched below) or an advisory lock in the job. I wrote about advisory locks for exactly this kind of pattern.
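A sketch of that last replacement using Solid Queue's concurrency controls, with a hypothetical SyncAccountJob:

class SyncAccountJob < ApplicationJob
  # At most one execution per account at a time -- a rough stand-in for
  # lock: :until_executed. Semantics differ at the edges: this limits
  # concurrent execution, it does not deduplicate enqueues.
  limits_concurrency to: 1, key: ->(account) { account.id }

  def perform(account)
    # ...sync work...
  end
end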
Setting Up Solid Queue Alongside Sidekiq
The migration runs both systems at once. New jobs route to Solid Queue gradually while Sidekiq drains its queues. This is the safest path I have found.
Add the gem and install Solid Queue into its own database:
# Gemfile
gem "solid_queue", "~> 1.1"
gem "mission_control-jobs", "~> 1.0" # Web UI
bundle install
bin/rails solid_queue:install
The installer wants to install Solid Queue into your primary database. Do not do that on a busy app. Configure a separate database in config/database.yml:
# *primary_config is a YAML anchor you define on your shared connection settings
production:
  primary:
    <<: *primary_config
  queue:
    <<: *primary_config
    database: my_app_production_queue
    migrations_paths: db/queue_migrate
Then in config/application.rb:
config.solid_queue.connects_to = { database: { writing: :queue } }
Why a separate database? Job tables churn hard. Inserts, updates, deletes, every second. You do not want autovacuum on your solid_queue_jobs table fighting for I/O with your real application traffic. I wrote about tuning autovacuum for high-churn tables — apply those settings to your queue database day one. One trade-off to note: a separate queue database gives up the same-transaction enqueue described earlier; look at Active Job's enqueue_after_transaction_commit (Rails 7.2+) to keep jobs from firing before the records they depend on have committed.
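Illustrative per-table settings in that spirit — the values are starting points to tune, not drop-in recommendations:

-- Vacuum the hottest Solid Queue table far more aggressively than the
-- default 20% dead-tuple threshold, and do not throttle the vacuum worker.
ALTER TABLE solid_queue_jobs SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_vacuum_cost_delay = 0
);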
Run the Solid Queue migrations into the new database:
bin/rails db:create:queue
bin/rails db:migrate:queue
You now have Solid Queue tables but nothing using them yet. Sidekiq is still doing all the work.
The Dual-Running Pattern
This is the heart of the Sidekiq to Solid Queue migration. You want every job class to be able to run on either system, controlled by a runtime flag, so you can migrate a few jobs at a time and roll back instantly if anything goes wrong.
Create a base class that routes itself:
class ApplicationJob < ActiveJob::Base
  # Active Job resolves the adapter through this class-level reader at
  # enqueue time, so overriding it here routes every subclass dynamically.
  # (queue_adapter is class-level state -- there is no per-instance setter,
  # which is why this cannot be a before_enqueue callback.)
  def self.queue_adapter
    if JobRouting.backend_for(name) == :solid_queue
      ActiveJob::QueueAdapters.lookup(:solid_queue).new
    else
      ActiveJob::QueueAdapters.lookup(:sidekiq).new
    end
  end
end
And a routing module that reads from a feature flag store — Flipper, a database table, or environment variables. We covered feature flag patterns in detail here.
module JobRouting
  def self.backend_for(job_class_name)
    return :solid_queue if Flipper.enabled?(:solid_queue_global)
    return :solid_queue if Flipper.enabled?(:"solid_queue_#{job_class_name.underscore}")

    :sidekiq
  end
end
Set the fleet-wide default in config/application.rb. The queue_adapter override above wins for every ApplicationJob subclass; anything inheriting straight from ActiveJob::Base — Action Mailer's delivery job, for example — still uses the default:
config.active_job.queue_adapter = :sidekiq # default during migration
Now you can flip a single job class to Solid Queue:
Flipper.enable :solid_queue_send_invoice_email_job
Monitor for an hour. If it misbehaves, flip it off. The job goes right back to Sidekiq with no code change.
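A quick console spot-check that the flipped class is actually landing in Solid Queue (class_name is a column on solid_queue_jobs):

SolidQueue::Job.where(class_name: "SendInvoiceEmailJob")
               .where("created_at > ?", 1.hour.ago)
               .count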
Migrating Recurring and Scheduled Jobs
This is the part that most migration guides skim and most teams underestimate.
Sidekiq-cron stores its schedule in Redis. Solid Queue uses a YAML config file. You cannot dual-run a recurring job — exactly one scheduler must own it, or you will get duplicate executions.
Inventory your sidekiq-cron jobs:
Sidekiq::Cron::Job.all.each do |job|
  puts [job.name, job.cron, job.klass].join(" | ")
end
Translate each to config/recurring.yml:
production:
  expire_trial_accounts:
    class: ExpireTrialAccountsJob
    queue: maintenance
    schedule: every day at 3am UTC
  reconcile_stripe_charges:
    class: ReconcileStripeChargesJob
    queue: critical
    schedule: every hour at minute 17
Plan a cutover window. In one deploy: remove the job from sidekiq-cron, add it to recurring.yml, ensure Solid Queue workers are running. There is a small window where neither scheduler owns the job — pick a window where one missed run is harmless. Anything where a missed run matters (billing, compliance) should be migrated during a quiet hour with a human watching.
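The sidekiq-cron half of that deploy can be a one-off release task — Sidekiq::Cron::Job.destroy removes the schedule entry from Redis (task name illustrative):

# lib/tasks/cron_cutover.rake
task cutover_expire_trials: :environment do
  Sidekiq::Cron::Job.destroy("expire_trial_accounts")
end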
Scheduled jobs in flight (PaymentJob.set(wait: 24.hours).perform_later(payment)) are different. Those live in Sidekiq’s ScheduledSet and will fire when their perform_at arrives. Leave Sidekiq running until its ScheduledSet is empty. For a normal app this takes one to two weeks.
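A console check for the drain, using Sidekiq's own API:

scheduled = Sidekiq::ScheduledSet.new
puts "still scheduled: #{scheduled.size}"
# Each entry's #at is the Time it fires; the max tells you when it is
# finally safe to shut Sidekiq down.
puts "last fires at: #{scheduled.map(&:at).max}" if scheduled.size > 0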
Running Solid Queue Workers in Production
Sidekiq runs as one process per machine with a thread pool. Solid Queue runs a supervisor over several process types: workers (execute jobs), dispatchers (move due scheduled jobs onto the ready queue), and the scheduler (enqueues recurring jobs). Workers and dispatchers are configured in config/queue.yml; the scheduler reads config/recurring.yml:
production:
  dispatchers:
    - polling_interval: 1
      batch_size: 500
  workers:
    - queues: [critical, default]
      threads: 5
      processes: 2
      polling_interval: 0.1
    - queues: [low, maintenance]
      threads: 3
      processes: 1
      polling_interval: 1
  # The scheduler is started by the supervisor automatically and picks up
  # config/recurring.yml; it takes no block here.
Polling interval is the knob people get wrong. On the critical queue you want 0.1s — workers will wake every hundred milliseconds to claim work. On a low-priority queue 1s or more is fine and reduces database load.
If you deploy with Kamal — and you should, see my Kamal 2 production guide — run Solid Queue as a separate accessory or role:
servers:
  web:
    hosts:
      - 10.0.0.10
  jobs:
    hosts:
      - 10.0.0.20
    cmd: bin/jobs start
    options:
      memory: 2g
bin/jobs is the supervisor that came with solid_queue:install. It handles graceful shutdown on SIGTERM — in-flight jobs get up to SolidQueue.shutdown_timeout to finish before the worker exits. Set this to longer than your longest job. I default to two minutes.
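That setting lives in an initializer; a minimal sketch:

# config/initializers/solid_queue.rb
# Longer than your longest job. Jobs still running when this expires are
# forcibly interrupted and surface as failed executions.
SolidQueue.shutdown_timeout = 2.minutes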
Connection Pool Math
This is the silent killer of Solid Queue migrations. Each worker thread holds a Postgres connection. With two worker processes of five threads each, you have ten connections per machine just for Solid Queue. Add dispatcher, scheduler, and your web Pumas, and you can blow through your Postgres max_connections fast.
The arithmetic:
total_connections = (web_processes * web_threads)
                  + (worker_processes * (worker_threads + 1))
                  + dispatcher_connections
                  + scheduler_connections
                  + admin_and_migration_slack
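Worked through for the queue.yml above, plus an assumed two Puma processes of five threads each (your web numbers will differ):

  (2 * 5)        # web                       = 10
+ 2 * (5 + 1)    # critical/default workers  = 12
+ 1 * (3 + 1)    # low/maintenance worker    =  4
+ 1 + 1          # dispatcher + scheduler    =  2
+ 5              # console/migration slack   =  5
                 # total per machine         = 33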
Most teams need PgBouncer in transaction-pooling mode in front of Postgres once they switch to Solid Queue. If you are already at the edge of your connection pool with Sidekiq, the migration will push you over. Plan for it before you cut over your first job.
Observability and the Web UI
Sidekiq Web is excellent. Mission Control Jobs — the official Solid Queue UI — is good but newer. Mount it:
# config/routes.rb
authenticate :user, ->(u) { u.admin? } do
  mount MissionControl::Jobs::Engine, at: "/jobs"
end
It gives you live job counts, failed jobs with stack traces, retries, and a button to manually re-enqueue. For production observability you want more. Hook into Active Job’s notifications:
ActiveSupport::Notifications.subscribe "perform.active_job" do |*, payload|
  StatsD.increment(
    "jobs.processed",
    tags: ["job:#{payload[:job].class.name}", "queue:#{payload[:job].queue_name}"]
  )
end

ActiveSupport::Notifications.subscribe "enqueue.active_job" do |*, payload|
  StatsD.increment("jobs.enqueued", tags: ["job:#{payload[:job].class.name}"])
end
Add Postgres queries to your dashboard for queue depth and oldest unstarted job:
SELECT queue_name,
       COUNT(*) AS depth,
       EXTRACT(EPOCH FROM (now() - MIN(created_at))) AS oldest_seconds
FROM solid_queue_ready_executions
GROUP BY queue_name;
If oldest_seconds on critical ever exceeds your SLO (mine is usually 30 seconds), you page someone. This is the same mental model as Sidekiq's queue latency metric, just expressed in SQL.
The Cutover
After two weeks of dual-running, every job class is on Solid Queue and Sidekiq has nothing left. Walk through the verification checklist:
# Sidekiq queues are empty
Sidekiq::Queue.all.map { |q| [q.name, q.size] }.to_h
# => all zeros
# Sidekiq scheduled set is empty
Sidekiq::ScheduledSet.new.size == 0
# Sidekiq retry set is empty (or you have decided to abandon those)
Sidekiq::RetrySet.new.size == 0
# Solid Queue is processing
SolidQueue::Job.where("created_at > ?", 5.minutes.ago).count > 0
Then deploy the final cleanup: remove sidekiq from the Gemfile, delete config/sidekiq.yml, remove the Sidekiq Web mount, change config.active_job.queue_adapter = :solid_queue, delete the JobRouting module, and remove the Flipper flags. One small PR, easy to revert.
Shut down the Redis instance last. Wait a week. Take a final RDB snapshot before terminating. You may not need Redis ever again — or you may keep it for caching (though Solid Cache applies the same database-backed idea there, incidentally).
The Rollback Plan
Write this before you start. The dual-running pattern means rollback is mostly free: flip every Flipper flag off and the system is back on Sidekiq inside thirty seconds. The danger window is after you have removed Sidekiq from the Gemfile and deleted JobRouting. Until then you are always one toggle away from safety.
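The thirty-second rollback as a console one-liner, assuming the flag-naming convention from the routing module:

# Disables the global flag and every per-class flag in one pass.
Flipper.features.map(&:key).grep(/\Asolid_queue_/).each { |key| Flipper.disable(key) }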
If a job class corrupts data on Solid Queue specifically — usually because of a transactional-enqueue assumption that did not exist on Sidekiq — flip that one class back, investigate, fix, redeploy, re-enable. Do not panic-roll-back the whole migration over one job class.
FAQ
How long does a typical Sidekiq to Solid Queue migration take?
For a mid-sized Rails app with twenty to fifty job classes and moderate volume, plan four to eight weeks end to end: one week of audit and setup, two to four weeks of gradual cutover with dual-running, one week of monitoring with Solid Queue handling everything, and a final cleanup deploy. Skip the audit at your peril.
Can Solid Queue handle the same throughput as Sidekiq?
For most apps yes — anything under ten thousand jobs per minute is comfortable on a properly tuned Postgres. Above fifty thousand jobs per minute on a single queue you will start to feel the polling overhead and may need to shard queues across worker pools or stay on Sidekiq. Benchmark with your real workload before committing.
What happens to Sidekiq Pro batches in a Solid Queue migration?
There is no direct equivalent. You have three options: keep Sidekiq Pro for batch workflows and run both indefinitely, rewrite batches as parent-child job records in your own database with completion tracking, or model the fan-out/fan-in as a state machine. I usually recommend the second — a JobBatch table with pending_count and completed_count columns is straightforward and gives you better observability than Sidekiq batches anyway.
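A sketch of that second option, with hypothetical names throughout:

class JobBatch < ApplicationRecord
  # Columns: pending_count :integer, completed_count :integer.
  # Each child job calls JobBatch.job_finished!(batch_id) at the end of
  # its perform.
  def self.job_finished!(batch_id)
    transaction do
      batch = lock.find(batch_id) # row lock serializes concurrent updates
      batch.update!(pending_count: batch.pending_count - 1,
                    completed_count: batch.completed_count + 1)
      BatchDoneJob.perform_later(batch) if batch.pending_count.zero?
    end
  end
end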
Do I need a separate database for Solid Queue?
Strongly recommended for any production app processing more than a few hundred thousand jobs a day. The job tables have very different access patterns from your application tables — high insert and delete rates, lots of index churn — and you do not want autovacuum on those tables affecting your user-facing queries. A dedicated database also lets you tune fsync and replication separately if you want to trade a little durability for performance on the queue side.
Planning a Sidekiq to Solid Queue migration and want a second pair of eyes? TTB Software helps Rails teams plan and execute background-job migrations without losing jobs in production. We have been doing this for nineteen years.
About the Author
Roger Heykoop is a senior Ruby on Rails developer with 19+ years of Rails experience and 35+ years in software development. He specializes in Rails modernization, performance optimization, and AI-assisted development.