What I Do First When I Inherit a Legacy Rails Codebase

Roger Heykoop
Fractional CTO, Ruby on Rails
A fractional CTO's practical framework for the first 30 days taking over someone else's Rails app. Where to look, what to fix, how to write an honest assessment.

Three weeks ago I walked into a codebase that had been built by four different agencies over seven years. The startup had burned through two CTOs, never established a tech lead, and was running an e-commerce platform doing two million euros a month on infrastructure nobody fully understood. The founder called it “a bit rough around the edges.” He was being generous.

The User model was 1,400 lines long. There were 23 active Sidekiq queues, three of which nobody could explain. A migration from 2021 had added a column called temp_status. It was being read in fourteen places.

This is what I get called in for.

After nineteen years of Rails, I’ve inherited enough of these codebases to have developed a system. Not a playbook I follow mechanically, but a set of questions in a specific order that tells me where the real problems are — and, crucially, which ones actually matter.

The Gemfile Is the First Chapter

Before I look at a single line of application code, I run:

bundle outdated
bundle exec ruby -v

The Gemfile is a historical document. It shows you what era the app was built in, which decisions were made under pressure and never revisited, and whether the team ever invested in routine maintenance.

A Rails 6.1 app with gems last updated in 2022 isn’t just technically outdated. It tells you that nobody has been doing routine maintenance. Security patches have been missed. The upgrade path compounds with each skipped version.

Look specifically for:

Gems with pinned versions and no comment explaining why. gem 'rails', '6.0.4' with no context is a red flag. Someone pinned it for a reason they didn’t document. That reason is now your problem.

Abandoned gems. Check the GitHub repo for anything in your Gemfile that touches core infrastructure. If the last commit was three years ago and there are 200 open issues, you’re carrying a dependency that will eventually strand you.

The authentication gem. Devise is everywhere. That’s fine. But I’ve walked into apps still running the clearance gem on a version from 2019, or custom authentication rolled during a panic with no tests. Your auth stack deserves thirty minutes of careful reading before anything else.
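
That first check, the undocumented pin, is mechanical enough to script. A rough heuristic sketch in Ruby; it only understands simple one-line `gem 'name', 'x.y.z'` declarations, and `unpinned_mysteries` is just an illustrative name:

```ruby
# Flag exactly-pinned gems that carry no explanatory comment.
# A quick audit heuristic, not a linter; it skips loose pins like '~> 6.0'
# (those at least allow patch updates) and multi-line gem declarations.
PINNED = /\Agem\s+['"]([\w-]+)['"]\s*,\s*['"](\d[\w.]*)['"]\s*(#.*)?\z/

def unpinned_mysteries(gemfile_text)
  gemfile_text.each_line.filter_map do |line|
    m = PINNED.match(line.strip)
    next unless m
    name, version, comment = m.captures
    "#{name} #{version}" if comment.nil? # pinned, no reason given
  end
end
```

Run it over the Gemfile and every hit is a question for whoever is still around: why is this version frozen, and is the reason still true?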

The Schema Is Archaeology

wc -l db/schema.rb

The raw line count in schema.rb isn’t a quality metric, but the content of the schema tells you everything about the data model’s history.

Things I look for immediately:

temp_ or old_ columns still in production. In seven years of auditing Rails apps, I have found at least one in almost every long-running codebase. These are columns added during a crisis as a stopgap and never cleaned up. They are often load-bearing in ways nobody planned.

Tables with more than 60 columns. This is a God table. It started as a clean model and absorbed responsibilities over time because it was easier to add a column than to model the problem correctly. A users table with 80 columns usually has half of those belonging to three or four different concerns.

Indexes — or the absence of them. schema.rb shows you exactly what’s indexed and what isn’t. A foreign-key column (a user_id, an order_id) with no index on a table with millions of rows is an active production performance problem you can point to on day one.

Polymorphic associations. They’re not inherently wrong, but they signal complexity. Every imageable_type and imageable_id pair means you have a generic association that might be hiding messy domain logic.
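
The first three of those checks can be scripted against the text of db/schema.rb. A heuristic sketch that handles the common Rails dump format, not every edge case; `schema_smells` is an illustrative name:

```ruby
# Heuristic pass over db/schema.rb: leftover temp_/old_ columns, plus
# *_id columns that never appear in any t.index line. Assumes the modern
# schema dump format (t.index inside create_table blocks).
def schema_smells(schema_text)
  # Column declarations like `t.bigint "user_id"`, excluding t.index lines.
  columns = schema_text.scan(/t\.(?!index)\w+\s+"([a-z0-9_]+)"/).flatten
  # Everything named inside t.index [...] declarations, as one big string.
  indexed = schema_text.scan(/t\.index\s+\[?([^\]\n]+)/).flatten.join
  {
    leftovers: columns.grep(/\A(temp_|old_)/),
    unindexed: columns.grep(/_id\z/).reject { |c| indexed.include?(%("#{c}")) }
  }
end
```

It will miss composite indexes dumped in unusual ways, but as a day-one conversation starter ("these four foreign keys have no index") it earns its ten lines.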

Find the God Objects

Every long-running Rails app has them. Models that grew because it was convenient to put logic there. I run:

find app/models -name "*.rb" | xargs wc -l | sort -rn | head -10
find app/controllers -name "*.rb" | xargs wc -l | sort -rn | head -10

The top of that list is where I spend time first. A 1,400-line User model isn’t just a code smell. It’s a coordination problem. Every developer touching that file risks conflicts. Every change requires understanding too much context. It becomes load-bearing in ways nobody intended.

What I look for inside a God model:

Callbacks that modify other models. An after_save in User that touches Order records is a trap. It makes User aware of Order domain logic, creates invisible coupling, and makes debugging hell when orders behave unexpectedly. The symptom is usually a bug report that starts with “it only happens sometimes.”

# The kind of thing that will ruin your week
class User < ApplicationRecord
  after_save :sync_order_statuses  # why is this here???

  private

  def sync_order_statuses
    # Bonus trap: on Rails 5.2+, email_changed? inside after_save reflects
    # the post-save state and is false; the working check would be
    # saved_change_to_email?. This callback may silently never fire.
    orders.pending.update_all(notified: false) if email_changed?
  end
end

Scopes that encode complex business logic. A named scope spanning eight lines with three joins and a subquery should be a query object. As a scope, it gets composed with other scopes in unpredictable ways and nobody can reason about it in isolation.
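
To make the extraction concrete, here is its general shape. The Order/Invoice domain and the `OverdueUnpaidOrdersQuery` name are hypothetical; the point is that the query now lives in one named place and can be reasoned about, and tested, in isolation:

```ruby
require "date"

# A sprawling scope lifted into a query object. It takes a relation in and
# returns a relation out, so it still composes, but on your terms.
class OverdueUnpaidOrdersQuery
  def initialize(relation)
    @relation = relation
  end

  # Unpaid orders past their due date as of the given day.
  def call(as_of: Date.today)
    @relation
      .joins(:invoice)
      .where(invoices: { paid_at: nil })
      .where("orders.due_on < ?", as_of)
  end
end

# Usage (in an app): OverdueUnpaidOrdersQuery.new(Order.all).call
```

Because the object only chains relation methods, you can even unit-test it against a stub relation without touching the database.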

Methods that obviously don’t belong here. An invoice_pdf method on User that renders a PDF is a domain logic leak. The model has become a dumping ground because it was the path of least resistance.

I’m not going to refactor these immediately. But I need to know where they are.

The Tests Tell You What Anyone Trusted

bundle exec rails test 2>&1 | tail -5
# or
bundle exec rspec --format progress 2>&1 | tail -5

Does the test suite pass? Can I even run it? In roughly 30% of inherited codebases I’ve worked with, the test suite doesn’t run cleanly on a fresh checkout. That’s your first data point, and it’s a significant one.

After that, coverage:

grep -r "simplecov" Gemfile Gemfile.lock

If there’s no coverage tool configured, estimate coverage by comparing test file count to application file count. It won’t be precise, but “there are 15 test files for 80 model, service, and controller files” tells you something real and immediate.
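
That estimate is scriptable too. A minimal sketch, assuming the conventional app/ and test/ (or spec/) layout:

```ruby
# Rough coverage proxy: ratio of test files to application files.
# Not precise, but enough to say "15 specs for 80 classes" with a number.
def coverage_ratio(app_files, test_files)
  return 0.0 if app_files.empty?
  (test_files.size.to_f / app_files.size).round(2)
end

app_files  = Dir.glob("app/{models,controllers,services}/**/*.rb")
test_files = Dir.glob("{test,spec}/**/*_{test,spec}.rb")
puts format("%d tests / %d app files (%.0f%%)",
            test_files.size, app_files.size,
            coverage_ratio(app_files, test_files) * 100)
```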

The tests also reveal which parts of the codebase the team trusted versus which parts they were afraid of. Heavy test coverage on payment processing and light coverage on notifications — that’s a team that understood their risk surface. Zero coverage anywhere — that’s a team that was always sprinting.

The Deployment Pipeline Is a Culture Signal

How does code get to production?

If the answer is “we SSH into the server and run git pull,” I know a lot about this organization without asking another question. It tells me about risk tolerance, about whether deployments are feared events, and about what the rollback story is when something goes wrong at 6 PM on a Friday.

What I want to see:

  • Automated tests running on every pull request
  • A branch-to-deploy workflow where main deploys automatically
  • Zero-downtime deployment — Kamal, Heroku, Fly.io, or equivalent
  • A way to roll back without requiring manual database intervention

That last one is the test most teams fail. If rolling back means SSHing into production to manually reverse a migration, you don’t have a deployment pipeline. You have a deployment ceremony.

Writing the Honest Assessment

This is where being a fractional CTO diverges from being a consultant who tells clients what they want to hear.

After a week of investigation, I write a document structured like this:

What’s working well. Always start here, and be genuine. Every codebase has parts the team got right. Identifying these matters because it tells you which patterns to reinforce, and it shows respect for the people who built this under real constraints.

Critical risks. Things that could cause data loss, a security breach, or extended downtime. These go first regardless of effort required. A hard-coded API key in the codebase — yes, I’ve found them — is a one-day fix that protects the business. It goes to the top of the list.

Performance problems with measurable business impact. Slow queries on high-traffic endpoints. Memory bloat causing periodic process restarts. These have measurable costs and usually have measurable solutions. They’re easier to prioritize because you can show the number.

Technical debt ranked by how much it slows current work. Not everything broken needs fixing. A God model in a part of the system that hasn’t changed in two years is lower priority than a God model in the middle of active feature development. Rank debt by its friction cost, not by its aesthetic offense.

What I’m not going to fix. The hardest paragraph to write. Every codebase has legacy patterns that are technically wrong but functionally stable. Rewriting the polymorphic notification system might be the right call in six months. It’s not the right call this week when you have a roadmap to ship. Say this explicitly — if you don’t, stakeholders will wonder why it isn’t in the plan.

The First Three Fixes

After the assessment comes prioritization. In seven years of this work, the first fixes are almost always the same three things:

Add error tracking if it’s missing. Honeybadger, Sentry, or Rollbar — pick one. You cannot fix what you cannot see. This takes an afternoon and immediately changes what conversations you can have about production behavior. Before error tracking, every incident starts with “we’re getting reports that something is broken.” After, it starts with a stack trace.
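
For reference, the setup really is about an afternoon. A minimal sketch using the sentry-ruby and sentry-rails gems; the sample rate is a placeholder you would tune to your traffic:

```ruby
# Gemfile
gem "sentry-ruby"
gem "sentry-rails"

# config/initializers/sentry.rb
Sentry.init do |config|
  config.dsn = ENV["SENTRY_DSN"]     # keep the DSN out of the repo
  config.traces_sample_rate = 0.1    # placeholder; tune per traffic volume
end
```

Honeybadger and Rollbar are comparably small to wire up; which one you pick matters far less than having one.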

Establish a deployment pipeline if there isn’t one. Even a minimal GitHub Actions workflow that runs the test suite and deploys to staging on merge to main is a step change in team confidence. It doesn’t need to be sophisticated. It needs to exist. The moment deployments stop being manual events, the team starts making more frequent, smaller changes — which is how you actually reduce risk.

Write down what you know. A docs/architecture.md that explains what the major components do, what the key data flows are, and where the known landmines are. This is not glamorous work. It’s the work that makes everything else possible. Six months from now, you’ll be glad it exists. So will the next engineer who needs to onboard.


The codebase with 23 Sidekiq queues, a 1,400-line User model, and a temp_status column read in fourteen places is not hopeless. I’ve seen worse in production, serving real customers, making real money. These systems are messy because real people with real constraints built them under pressure with incomplete information. The job isn’t to judge how it got here. The job is to understand it clearly, communicate that understanding honestly, and then make it incrementally better until the risk comes down to a level you can live with.


Frequently Asked Questions

How long does a proper codebase audit take?

A surface-level audit — enough to write a meaningful written assessment — takes three to five days. A deep audit of a complex application, including tracing critical code paths and running performance analysis against production-scale data, takes one to two weeks. The difference matters when you’re scoping an engagement with a client.

Should I rewrite or refactor?

Almost always refactor. The “big rewrite” is one of the most reliably bad ideas in software engineering. You spend months rebuilding functionality that already exists, without the institutional knowledge of why the original code worked the way it did, and you often ship something with new bugs that the original system had learned to work around. The exception is when the original system is so fundamentally misaligned with current needs that there is nothing worth preserving. This is rarer than teams think.

How do I handle a team that’s defensive about the existing code?

Acknowledge the constraints they worked under. The 1,400-line User model didn’t happen because the team was careless — it happened because they were moving fast with limited resources and made locally-rational decisions. Name that explicitly. Then focus the conversation on what the code needs to do going forward, not on how it got here. People defend code they feel is being judged. They collaborate on code they feel is being understood.

Inheriting a Rails codebase and not sure where to start? TTB Software does technical due diligence and legacy Rails assessments. After nineteen years in this industry, we’ve seen — and dug out of — most of the situations you’re facing.

#rails #fractional-cto #legacy-code #technical-debt #code-review #architecture

About the Author

Roger Heykoop is a senior Ruby on Rails developer with 19+ years of Rails experience and 35+ years in software development. He specializes in Rails modernization, performance optimization, and AI-assisted development.
