Debug Memory Leaks in Ruby on Rails: A Production Hunting Guide
Memory leaks in Ruby on Rails apps almost never come from actual C-extension leaks. In eight years of running Rails in production, I’ve traced maybe two issues to genuine memory leaks in native code. The rest? Unbounded growth — hashes that never get pruned, strings retained by accident, callbacks that accumulate references the GC can never reclaim.
The distinction matters because the tools and approach differ. A real leak requires a gem update or patch. Unbounded growth requires you to find the code doing the accumulating and put a limit on it.
Recognizing the Symptoms
Your Rails app has a memory problem when worker RSS grows monotonically across requests. Healthy Ruby processes stabilize after a warm-up period — typically 50-200 requests depending on your app’s complexity. The GC reclaims objects, RSS flattens, and life goes on.
When RSS climbs without stabilizing, you have unbounded growth. A quick diagnostic:
# Watch Puma worker RSS over time (Linux); the rss column is in KB
while true; do
  date '+%F %T'
  ps -o pid,rss,command -p "$(pgrep -d, -f 'puma.*worker')" | tail -n +2
  sleep 30
done
If RSS increases by 10-50 MB per hour under steady traffic, that’s your signal. Anything under 5 MB/hour might just be fragmentation.
Step 1: Measure Before You Hunt
Install derailed_benchmarks (gem version 2.2+, Ruby 3.2+):
# Gemfile
group :development, :test do
gem 'derailed_benchmarks'
gem 'stackprof'
end
Run the static memory analysis first — it catches the easy wins:
bundle exec derailed bundle:mem
This shows memory consumed at boot by each gem. I’ve seen apps where a single unused gem pulled in 40 MB of dependencies. On a recent Rails 8 project, removing mini_magick (replaced by ActiveStorage’s built-in processing) dropped boot memory by 28 MB across 4 Puma workers.
For request-level analysis:
bundle exec derailed exec perf:mem_over_time
This hits your app repeatedly and tracks memory growth. A flat line means no leak. An upward slope tells you to keep digging.
Step 2: Heap Dumps with ObjectSpace
Ruby’s ObjectSpace module is your primary investigation tool. Enable heap dump support in your Rails app:
# config/initializers/memory_debug.rb (temporary — remove after investigation)
if ENV['MEMORY_DEBUG']
require 'objspace'
ObjectSpace.trace_object_allocations_start
end
Trigger a heap dump from a running process:
# Via rails console attached to a production worker, or via a debug endpoint
GC.start(full_mark: true, immediate_sweep: true)
GC.start # Run twice to clear weak references
file = "/tmp/heap_dump_#{Process.pid}_#{Time.now.to_i}.json"
File.open(file, 'w') { |f| ObjectSpace.dump_all(output: f) } # block form ensures the file is closed
puts "Heap dump written to #{file} (#{File.size(file) / 1024 / 1024} MB)"
The dump is a JSON-lines file where each line represents one live Ruby object. The fields that matter:
- type: Object type (STRING, HASH, ARRAY, OBJECT, etc.)
- file: Source file where the object was allocated
- line: Line number within that file
- memsize: Memory consumed in bytes
- generation: GC generation when allocated (lower = older = more suspicious)
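Before pausing a production worker, it is worth rehearsing the whole dump-and-inspect loop locally. The throwaway script below (the file name and /tmp path are mine, purely illustrative) manufactures a retained pile of strings, dumps the heap, and confirms the allocation site is visible in the dump:

```ruby
# rehearse_dump.rb: a local rehearsal of the dump workflow. Allocate a
# deliberately retained pile of strings, dump the heap, then count how
# many dumped objects were allocated by this very script.
require 'objspace'
require 'json'

ObjectSpace.trace_object_allocations_start

RETAINED = Array.new(5_000) { |i| "leaky-payload-#{i}" } # simulated leak

GC.start(full_mark: true, immediate_sweep: true)
GC.start # second pass clears weak references

path = "/tmp/heap_rehearsal_#{Process.pid}.json"
File.open(path, 'w') { |f| ObjectSpace.dump_all(output: f) }

# Objects traced after trace_object_allocations_start carry a 'file' field:
from_here = File.foreach(path).count do |line|
  JSON.parse(line)['file'] == __FILE__
rescue JSON::ParserError
  false
end
puts "dump at #{path}: #{from_here} objects allocated in #{__FILE__}"
```

Run it with plain `ruby`; if the final count is at least the 5,000 retained strings, allocation tracing and dumping are wired up correctly.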
Step 3: Analyze the Heap Dump
Parse the dump to find accumulation patterns:
# analyze_heap.rb
require 'json'
counts = Hash.new(0)
sizes = Hash.new(0)
locations = Hash.new(0)
File.foreach(ARGV[0]) do |line|
obj = JSON.parse(line)
type = obj['type']
counts[type] += 1
sizes[type] += obj['memsize'].to_i
if obj['file']
loc = "#{obj['file']}:#{obj['line']}"
locations[loc] += 1
end
end
puts "=== Object counts by type ==="
counts.sort_by { |_, v| -v }.first(10).each { |k, v| puts " #{k}: #{v}" }
puts "\n=== Memory by type (MB) ==="
sizes.sort_by { |_, v| -v }.first(10).each { |k, v| puts " #{k}: #{(v / 1024.0 / 1024).round(2)} MB" }
puts "\n=== Top allocation sites ==="
locations.sort_by { |_, v| -v }.first(20).each { |k, v| puts " #{k}: #{v} objects" }
Run it:
ruby analyze_heap.rb /tmp/heap_dump_12345_1711353600.json
The “Top allocation sites” output is where the investigation gets real. When you see 500,000 strings allocated from one line in your codebase, you’ve found your culprit.
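A single dump shows what is live; two dumps taken some minutes apart show what is accumulating. Here is a sketch of that diff. The diff_heaps helper is my own naming, not an ObjectSpace API, and GC compaction in Ruby 3+ can move or reuse addresses, so treat the output as a lead rather than proof:

```ruby
# diff_heap.rb: report allocation sites of objects present in a later heap
# dump but absent from an earlier one (i.e., growth retained between dumps).
require 'json'

def diff_heaps(before_path, after_path)
  # Record every object address seen in the first dump.
  before_addrs = {}
  File.foreach(before_path) do |line|
    obj = JSON.parse(line)
    before_addrs[obj['address']] = true if obj['address']
  end

  # Count allocation sites of objects that only exist in the second dump.
  new_locations = Hash.new(0)
  File.foreach(after_path) do |line|
    obj = JSON.parse(line)
    next unless obj['address'] && obj['file'] && !before_addrs[obj['address']]
    new_locations["#{obj['file']}:#{obj['line']}"] += 1
  end
  new_locations.sort_by { |_, count| -count }
end

if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  diff_heaps(ARGV[0], ARGV[1]).first(20).each do |loc, count|
    puts "#{loc}: #{count} new retained objects"
  end
end
```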
The Usual Suspects
After analyzing dozens of Rails memory issues across client projects, these patterns account for roughly 80% of cases:
Unbounded Memoization
# The classic leak
class ProductService
def self.lookup(sku)
@cache ||= {}
@cache[sku] ||= Product.find_by(sku: sku)
end
end
This class-level hash grows with every unique SKU looked up and never shrinks. With 100,000 products, you’re holding 100,000 ActiveRecord objects in memory permanently.
Fix: Use Rails.cache with TTL, or use an LRU cache like lru_redux:
class ProductService
@cache = LruRedux::TTL::ThreadSafeCache.new(1000, 15 * 60) # 1000 items, 15 min TTL
def self.lookup(sku)
@cache.getset(sku) { Product.find_by(sku: sku) }
end
end
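If adding a gem is off the table, the same idea, a hard cap on the memo, takes only a few lines of plain Ruby. CappedMemo is a hypothetical helper of mine (FIFO eviction rather than true LRU), not a Rails or lru_redux API:

```ruby
# A no-dependency, size-capped memo. When the cap is hit, the oldest entry
# is evicted (Ruby hashes preserve insertion order, so Hash#shift removes
# the first-inserted pair). FIFO eviction: simpler than true LRU, but it
# still guarantees the cache can never grow without bound.
class CappedMemo
  def initialize(max_size = 1_000)
    @max_size = max_size
    @store = {}
    @mutex = Mutex.new # class-level caches are shared across threads
  end

  def fetch(key)
    @mutex.synchronize do
      if @store.key?(key)
        @store[key]
      else
        @store.shift if @store.size >= @max_size # evict oldest insertion
        @store[key] = yield
      end
    end
  end
end
```

Usage mirrors the memoized lookup above: `CACHE.fetch(sku) { Product.find_by(sku: sku) }`.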
ActiveRecord Callback Accumulation
class Order < ApplicationRecord
after_commit :notify_warehouse
def notify_warehouse
WarehouseNotifier.perform_later(self) # Holds reference to `self`
end
end
This isn’t a leak by itself, but when combined with bulk operations that load thousands of records, the callback chain holds references to all of them until the transaction completes. For batch processing, use find_each with smaller batch sizes or bypass callbacks entirely:
Order.where(status: :pending).find_each(batch_size: 100) do |order|
WarehouseNotifier.perform_later(order.id) # Pass ID, not the object
end
String Retention from Logging
Rails.logger.info "Processing order #{order.inspect} with items #{order.items.map(&:inspect)}"
inspect on ActiveRecord objects generates massive strings. In production with debug-level logging accidentally enabled, I’ve seen this consume 2 GB in under an hour. The strings survive longer than you’d expect because the logger may buffer them.
Fix: Use structured logging and lazy evaluation:
Rails.logger.info { "Processing order #{order.id} with #{order.items.count} items" }
The block form means the string is never built if the log level filters it out.
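You can convince yourself of this with the stdlib Logger, which Rails.logger wraps and which has the same block semantics:

```ruby
require 'logger'

logger = Logger.new($stdout)
logger.level = Logger::WARN # info is below the threshold, so it is filtered

evaluated = false
logger.info { evaluated = true; "huge #{'x' * 1_000_000} payload" }

puts evaluated # => false: the block, and the megabyte string, never ran
```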
Global Event Subscribers
ActiveSupport::Notifications.subscribe('process_action.action_controller') do |*args|
event = ActiveSupport::Notifications::Event.new(*args)
MetricsCollector.record(event) # If MetricsCollector accumulates without flushing...
end
Check that any metrics collectors, event subscribers, or instrumentation hooks flush their buffers periodically.
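As one possible shape, a collector can enforce that flushing itself by capping its buffer. Everything below (class name, cap, flusher callback) is a hypothetical sketch, not an API from any instrumentation library:

```ruby
# A bounded collector: buffers events, but flushes as soon as a cap is
# reached, so the buffer can never grow without limit between flushes.
class BoundedMetricsCollector
  MAX_BUFFER = 500

  def initialize(&flusher)
    @buffer = []
    @mutex = Mutex.new # subscribers can fire from multiple request threads
    @flusher = flusher || ->(events) {} # e.g., ship the batch to StatsD
  end

  def record(event)
    @mutex.synchronize do
      @buffer << event
      flush_locked if @buffer.size >= MAX_BUFFER
    end
  end

  def flush
    @mutex.synchronize { flush_locked }
  end

  private

  def flush_locked
    return if @buffer.empty?
    @flusher.call(@buffer)
    @buffer.clear
  end
end
```

Pair this with a periodic `flush` (a timer thread or an at_exit hook) so partial batches are not held indefinitely.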
Step 4: Confirm the Fix
After applying a fix, verify with a controlled test. The derailed_benchmarks memory-over-time test works, but for production confirmation, I prefer tracking RSS per worker with a simple Prometheus metric:
# config/initializers/memory_metrics.rb
if defined?(Prometheus)
MEMORY_GAUGE = Prometheus::Client::Gauge.new(
:ruby_process_rss_bytes,
docstring: 'RSS memory of the Ruby process',
labels: [:worker]
)
Thread.new do
loop do
rss = File.read("/proc/#{Process.pid}/statm").split[1].to_i * 4096 # resident pages * page size (4096 bytes on most Linux builds)
MEMORY_GAUGE.set(rss, labels: { worker: Process.pid.to_s })
sleep 60
end
end
end
Deploy, watch the graph for 24-48 hours under production traffic. RSS should stabilize after warm-up. If it does, you’re done.
When It Actually Is a Native Leak
If heap dumps show stable Ruby object counts but RSS keeps growing, the leak is in a C extension. The approach changes:
- Check gem changelogs for known memory fixes — nokogiri, mysql2, and image processing gems are common offenders
- Use valgrind on a staging server (not production — the overhead is 10-30x):
valgrind --tool=massif bundle exec rails runner "1000.times { YourSuspiciousCode.call }"
ms_print massif.out.*
- Try upgrading the suspect gem. If that fixes it, you’re done. If not, file an issue with a minimal reproduction.
In my experience, upgrading nokogiri fixes about half of all native memory issues in Rails apps. The Nokogiri team is responsive and their recent releases (1.16+) have addressed several memory management issues.
Production-Safe Memory Monitoring
For ongoing protection, configure Puma’s worker killer. It’s a band-aid, not a fix, but it prevents OOM kills while you investigate:
# config/puma.rb
plugin :tmp_restart
before_fork do
require 'puma_worker_killer'
PumaWorkerKiller.config do |config|
config.ram = 2048 # MB total for all workers
config.frequency = 30 # Check every 30 seconds
config.percent_usage = 0.90 # Kill at 90% of ram limit
config.rolling_restart_frequency = 6 * 3600 # Rolling restart every 6 hours
end
PumaWorkerKiller.start
end
This buys you time. The rolling restart every 6 hours keeps RSS in check while you track down the root cause with the techniques above.
Frequently Asked Questions
How much memory should a Rails 8 app use per Puma worker?
A typical Rails 8 app uses 150-300 MB per Puma worker after warm-up, depending on gem count and application complexity. Apps with heavy image processing or large ActiveRecord result sets can hit 500 MB+. If a single worker exceeds 1 GB under normal traffic, you likely have unbounded growth somewhere.
Does Ruby’s garbage collector cause memory bloat?
Ruby’s GC (particularly with YJIT enabled) manages object memory well, but it rarely shrinks the process heap. Once Ruby requests memory from the OS via malloc, that RSS is seldom returned for the life of the process, even after the objects inside it are freed. This is memory fragmentation, not a leak. The fix is controlling peak memory usage rather than expecting RSS to decrease. Jemalloc as a malloc replacement (LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2) reduces fragmentation by 10-30% in most Rails apps.
Can I use ObjectSpace.dump_all in production safely?
Yes, but with caveats. The dump pauses the Ruby process for 1-10 seconds depending on heap size (a 1 GB process takes about 3-5 seconds). Run it on a single worker during low traffic, not during peak hours. The trace_object_allocations_start call adds 5-10% overhead, so enable it temporarily and disable after collecting your dump. Never leave allocation tracing on permanently.
What’s the difference between RSS growth and a memory leak?
RSS (Resident Set Size) growth after warm-up can mean three things: a genuine C-extension leak, unbounded Ruby object accumulation, or memory fragmentation. Check Ruby heap object counts first — if they’re stable but RSS grows, it’s either fragmentation (try jemalloc) or a native leak. If object counts grow proportionally with RSS, you have Ruby-level accumulation. The heap dump analysis technique in this guide distinguishes between these cases.
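A quick way to run that first check from a console is to log Ruby heap slot counts next to RSS: if live slots stay flat while RSS climbs, the growth is happening outside the Ruby object heap. A minimal helper (the /proc read is Linux-only, so it falls back to nil elsewhere):

```ruby
# Snapshot Ruby-heap object counts alongside process RSS. Flat live_slots
# with rising rss_mb points at fragmentation or a native leak rather than
# Ruby-level object accumulation.
def memory_snapshot
  stat = GC.stat
  rss_kb = begin
    File.read("/proc/#{Process.pid}/status")[/VmRSS:\s+(\d+)/, 1].to_i
  rescue Errno::ENOENT
    nil # not on Linux; no /proc available
  end
  {
    live_slots: stat[:heap_live_slots],           # live Ruby objects
    available_slots: stat[:heap_available_slots], # total heap capacity
    rss_mb: rss_kb && rss_kb / 1024
  }
end

puts memory_snapshot.inspect
```

Call it periodically (or before and after a suspect workload) and compare the two columns over time.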
Should I use puma_worker_killer or just increase server RAM?
Use both, but treat puma_worker_killer as a safety net, not a solution. Adding RAM masks the problem and costs scale linearly — doubling RAM doubles your hosting bill. Worker killers keep your app stable while you fix the root cause. Set a generous RAM limit, enable rolling restarts, and use the monitoring in this guide to find and eliminate the underlying growth pattern.
About the Author
Roger Heykoop is a senior Ruby on Rails developer with 19+ years of Rails experience and 35+ years in software development. He specializes in Rails modernization, performance optimization, and AI-assisted development.