Build vs Buy Software: A Fractional CTO's Framework for Engineering Decisions
Build vs buy software decisions can sink a startup. A fractional CTO's framework for evaluating vendors, total cost, and what to keep in-house.
A Series A founder pulled me into a Friday afternoon meeting last October with a question that felt urgent: should they build their own analytics pipeline? His head of growth wanted Mixpanel for product analytics, his head of finance wanted Looker for revenue dashboards, and his lead engineer had a half-finished internal tool that “would do both and only takes another sprint to finish.” The combined SaaS bill was projected at €74,000 a year. The internal tool, on paper, would cost nothing. The founder was leaning toward the build. I asked him one question: who is going to be on call for that pipeline at 3 AM during a product launch? The room went quiet. We bought Mixpanel and Looker that afternoon.
After nineteen years of building Rails systems and the last six advising startups as a fractional CTO, I have seen the build vs buy software decision wreck more roadmaps than almost any other category of mistake. Engineers want to build because building is interesting. Founders want to build because vendor bills look big and direct salaries feel like investments. CFOs want to buy because SaaS contracts are predictable. None of those instincts is correct on its own. This post is the framework I use with every client, in the order I actually apply it.
Why Build vs Buy Software Is a CTO-Level Decision
Every “should we build this” conversation eventually reaches a fork: the engineers will tell you it’s two weeks of work, the vendor will quote you a number that feels expensive, and someone in the room will say “well, if it’s only two weeks…” That is the moment the decision goes wrong. The two-week estimate is almost always honest about the first version. It is almost never honest about the next five years of maintenance, the on-call rotation, the security patches, the compliance audits, the integrations, the docs, and the engineer-hours that get pulled away from the actual product.
A fractional CTO earns their fee in these conversations. The job is not to say no to building. The job is to make the true cost of building visible, compare it to the true cost of buying, and route the decision back to the strategic question: does this thing differentiate us, or is it plumbing? Most companies build the plumbing and buy the differentiator. They have it exactly backwards.
The build vs buy software question shows up in obvious places — analytics, CRM, billing, auth — and in non-obvious ones: feature flags, search infrastructure, image processing, email sending, observability, customer support, document generation, scheduled job orchestration. Each of those is a category where mature vendors exist, and each is a category where Rails teams routinely roll their own and regret it within eighteen months.
The Five Questions I Ask Before Anyone Writes Code
Before I let a team start a serious build, I make them answer five questions in writing. If they cannot answer all five clearly, the answer is buy. If they can answer all five well, the answer is probably still buy — but at least we are having a real conversation.
1. Is this our core differentiator? If a customer would choose us over a competitor specifically because we built this thing in-house, build it. If they would not notice or care which vendor we use, buy it. Stripe did not build a custom Rails application server. Linear did not build a custom Postgres. They built the things that made them different.
2. What is the five-year total cost of ownership? Not the build cost alone: the build cost plus maintenance, on-call, security, and the attrition of engineers who quit because they hate maintaining the internal billing tool. For maintenance I assume at least 30 percent of the initial build cost per year for five years, and that floor is conservative.
3. Who owns it when the person who built it leaves? Internal tools accumulate tribal knowledge faster than any other code in the codebase. Two years after launch, exactly one person understands the message queue retry logic. They are now your single point of failure, and they know it.
4. What does the failure mode cost us? If this system goes down, what happens? If our internal feature-flag service breaks at 2 AM, every product surface goes dark. If LaunchDarkly goes down, we wake up to a status page and a postmortem they wrote.
5. Are we sure we understand the problem? Most “we should build this” pitches come from someone who has used one or two vendor products and found them lacking. They have not used the other six. They do not know about the integrations, the audit logs, the SOC 2 reports, or the seventh edge case the vendor handled three years ago.
A Real Example: Feature Flags
Two years ago a Series B fintech client decided to build their own feature-flag system in Rails. Their lead engineer had used Flipper and thought he could do better with a custom Redis-backed implementation. The first version shipped in three weeks and looked clean.
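require "json"

# First version of the client's flag store: one JSON blob per flag in Redis.
# Assumes a global $redis client (e.g. Redis.new from the redis gem).
class FeatureFlag
  def self.enabled?(name, user)
    raw = $redis.get("flag:#{name}")
    return false unless raw

    flag_data = JSON.parse(raw)
    return true if flag_data["enabled_for_all"]

    # Guard against flags that have no per-user list yet.
    Array(flag_data["enabled_users"]).include?(user.id)
  end

  def self.enable_for(name, user)
    # Read-modify-write without a lock: concurrent calls can lose updates.
    flag_data = JSON.parse($redis.get("flag:#{name}") || "{}")
    flag_data["enabled_users"] = Array(flag_data["enabled_users"]) | [user.id]
    $redis.set("flag:#{name}", flag_data.to_json)
  end
end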
Six months later it had grown to 400 lines and counting. They needed percentage rollouts, then organization-level rollouts, then time-windowed rollouts, then geographic targeting, then an audit log because compliance asked, then a UI because PM was tired of editing Redis directly, then SDK clients for the React frontend, then a feature flag for the feature flag system because they were terrified of changing it. Total invested: about €180,000 in engineer time across eighteen months.
LaunchDarkly would have cost them €18,000 a year and shipped every one of those capabilities on day one. They eventually migrated. The custom system survives only as a dependency they cannot easily remove.
The lesson is not that feature flags are too hard to build. They are not. The lesson is that the problem they were actually solving was “make it easy for product to ship safely,” and a vendor had spent ten years solving exactly that problem. Building solved 20 percent of the real problem and called it done.
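To give a sense of why the line count climbed so quickly, consider only the first item on that list, percentage rollouts. The sketch below is a minimal illustration, not the client's code: `in_rollout?` is a hypothetical helper, and CRC32 bucketing is assumed here as one common way to get a stable per-user bucket.

```ruby
require "zlib"

# Hypothetical helper: true when the user falls inside the rollout percentage.
# Hashing flag name and user id together gives each user a stable bucket
# (0..99) per flag, so the same user gets the same answer on every request.
def in_rollout?(flag_name, user_id, percentage)
  bucket = Zlib.crc32("#{flag_name}:#{user_id}") % 100
  bucket < percentage
end

# A user who is inside a 30 percent rollout stays inside when it widens to 60.
puts in_rollout?("new_checkout", 42, 30)
puts in_rollout?("new_checkout", 42, 60)
```

Even this small piece raises questions the Redis version never had to answer: where the percentage is stored, who may change it, and how a change is audited. Each later requirement on the list adds a similar set.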
When Building Actually Wins
Buying is the default in my framework, but it is not the answer to every question. There are scenarios where building is clearly correct, and I want to be specific about them.
The vendor does not exist at your scale. If you are processing 50 billion events a month and the SaaS pricing model breaks above one billion, you have outgrown the market. This is rare and you will know when it happens.
The integration cost dominates the build cost. Sometimes the vendor exists, but integrating it into your specific data model is harder than just writing the feature yourself. This is most common with legacy systems that have non-standard data shapes.
It is genuinely a competitive moat. If your AI inference pipeline, your matching algorithm, or your fraud detection model is what customers pay for, you build. You do not let a vendor own your moat.
Compliance or data residency makes vendors impossible. Healthcare and EU public-sector work sometimes have residency requirements that no SaaS can meet. In that case you build, but you build deliberately and you staff for maintenance.
You are vendor-replacing for strategic reasons. Sometimes you start on a vendor, you learn the problem deeply, and then you build a tailored version. This is the right path: buy to learn, build to optimize.
The Hybrid Pattern: Buy the Engine, Build the Interface
The most resilient pattern I deploy with clients is hybrid. Use vendors for the engine — the hard infrastructure work — and build a thin Rails layer that wraps them. Your customers see your interface. Your team owns the customer-facing logic. The vendor handles the parts that are commodity.
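# Thin wrapper: Stripe runs the subscription, the app keeps its own record.
# Assumes the stripe gem is loaded and Stripe.api_key is configured.
class BillingService
  def self.create_subscription(user:, plan:, trial_days: 14)
    stripe_subscription = Stripe::Subscription.create(
      customer: user.stripe_customer_id,
      items: [{ price: plan.stripe_price_id }],
      trial_period_days: trial_days,
      metadata: { user_id: user.id, plan_code: plan.code }
    )

    trial_end = stripe_subscription.trial_end # Unix timestamp, nil without a trial

    user.subscriptions.create!(
      stripe_subscription_id: stripe_subscription.id,
      plan: plan,
      status: stripe_subscription.status,
      # Store nil rather than the 1970 epoch when there is no trial.
      trial_ends_at: trial_end && Time.at(trial_end)
    )
  end
end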
Stripe owns the credit-card form, the PCI scope, the bank integrations, the dunning logic, the tax calculations, and the failed-payment retries. The Rails app owns the subscription model, the entitlement logic, the customer-facing dashboard, and the business rules that make the company specific. If Stripe ever becomes the wrong vendor, the surface area you have to migrate is intentional and limited.
This is also the pattern I recommend for AI workloads. Anthropic owns the model. Your Rails app owns the prompt engineering, the conversation state, the retrieval pipeline, and the integration with your domain data. You do not train your own foundation model. You also do not let a vendor own your conversation history schema.
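One way to keep that migration surface intentional is to route every vendor call through a single small interface. The sketch below assumes hypothetical names throughout (`BillingGateway`, `FakeGateway`, `SubscriptionCreator`); none of them come from Stripe or from a client codebase. The in-memory gateway exists only so the example runs without a vendor SDK.

```ruby
# Hypothetical port: the only billing interface the rest of the app sees.
class BillingGateway
  Result = Struct.new(:id, :status, :trial_end, keyword_init: true)

  def create_subscription(customer_id:, price_id:, trial_days:)
    raise NotImplementedError, "#{self.class} must implement create_subscription"
  end
end

# In-memory stand-in. A real adapter would call the vendor SDK here and
# translate its response into a Result.
class FakeGateway < BillingGateway
  def create_subscription(customer_id:, price_id:, trial_days:)
    trial_end = trial_days.positive? ? Time.now.to_i + trial_days * 86_400 : nil
    Result.new(
      id: "sub_fake_#{customer_id}",
      status: trial_end ? "trialing" : "active",
      trial_end: trial_end
    )
  end
end

# Application-side logic depends on the port, never on a vendor class.
class SubscriptionCreator
  def initialize(gateway:)
    @gateway = gateway
  end

  def call(customer_id:, price_id:, trial_days: 14)
    result = @gateway.create_subscription(
      customer_id: customer_id, price_id: price_id, trial_days: trial_days
    )
    {
      vendor_id: result.id,
      status: result.status,
      trial_ends_at: result.trial_end && Time.at(result.trial_end)
    }
  end
end

creator = SubscriptionCreator.new(gateway: FakeGateway.new)
p creator.call(customer_id: "cus_1", price_id: "price_basic")
```

Swapping vendors then means writing one new gateway class. The entitlement logic and the dashboard never learn which vendor sits behind it.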
The Build vs Buy Software Decision Matrix
When I present the framework to a board, I use a simple matrix with two axes: strategic value to the business (low/high) and switching cost (low/high). The four quadrants give clear default answers.
Low value, low switching cost — buy whatever is cheapest. Status pages, error tracking, marketing email. The choice does not matter much; the cost of changing later is low.
Low value, high switching cost — buy carefully and negotiate hard. Billing, CRM, HRIS. These are commodities, but they will be embedded forever, so vendor stability and pricing escalation clauses matter.
High value, low switching cost — buy the best, even if it costs more. Customer support, observability, product analytics. The quality of the tool affects how your team operates daily, but you can change vendors if you outgrow them.
High value, high switching cost — this is the build candidate. Your core product, your differentiating algorithms, your competitive moat. Build deliberately, staff appropriately, and never let it become someone’s side project.
Most “should we build this” debates resolve themselves when you place the thing on this matrix honestly. The trap is that engineers see everything as high-value and high-switching-cost because they want to build it. The fractional CTO’s job is to push back on the value claim with evidence.
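The matrix is small enough to write down as a lookup table. This is a minimal sketch, assuming each axis is reduced to `:low` or `:high`; the names `DECISION_MATRIX` and `default_decision` are illustrative, not part of any tool.

```ruby
# Default recommendation per quadrant, keyed by [strategic_value, switching_cost].
DECISION_MATRIX = {
  %i[low low]   => "Buy whatever is cheapest",
  %i[low high]  => "Buy carefully and negotiate hard",
  %i[high low]  => "Buy the best, even if it costs more",
  %i[high high] => "Build candidate: build deliberately and staff for it"
}.freeze

def default_decision(strategic_value:, switching_cost:)
  DECISION_MATRIX.fetch([strategic_value, switching_cost]) do
    raise ArgumentError, "both axes must be :low or :high"
  end
end

puts default_decision(strategic_value: :low, switching_cost: :high)
# prints "Buy carefully and negotiate hard"
```

The lookup is trivial on purpose. The hard part is the honest placement of a system on each axis, which is where the evidence-based pushback happens.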
The Cost Conversation Founders Get Wrong
I once watched a founder reject a €30,000-per-year SaaS contract and approve €240,000 of engineering salary to build the equivalent in-house, because “salary is an investment and SaaS is an expense.” The accounting was clean. The strategy was disastrous. He paid eight times the first-year SaaS price for the same outcome, and then had to maintain the result indefinitely.
The cost framing that actually works:
Direct vendor cost is the SaaS bill plus integration time. Include the engineer-weeks to set up the vendor properly and the recurring maintenance for SDK upgrades and API changes.
Direct build cost is the engineer-weeks to ship the first usable version, multiplied by 1.5 because the estimate is wrong, plus 30 percent of that annually for maintenance, plus the opportunity cost of what those engineers would have shipped instead.
Indirect costs include on-call burden, security review overhead, compliance audit work, hiring difficulty (experience with a mainstream vendor is easier to hire for than experience with your bespoke internal tool), and the cognitive load on the entire engineering team.
Risk-adjusted cost is the probability the build fails to deliver multiplied by the cost of the failure. Internal tools fail to deliver more often than founders admit. The most common failure is not catastrophic — it is a half-finished tool that nobody dares to throw away.
When I present these four numbers side by side, the “expensive” SaaS contract is usually two to four times cheaper than the “free” internal build over five years. Pair this with the technical debt prioritization framework you should already be applying — internal tools become technical debt the day they ship.
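The arithmetic behind those four numbers fits in a few lines. The sketch below is a rough illustration with assumptions made explicit: maintenance is taken as 30 percent of the slippage-adjusted build cost per year, the module and method names are hypothetical, and every euro figure in the usage lines is made up for the example.

```ruby
module CostModel
  SLIPPAGE_FACTOR  = 1.5   # the first-version estimate is assumed to be too low
  MAINTENANCE_RATE = 0.30  # share of adjusted build cost spent per year on upkeep

  module_function

  # Direct build cost over the horizon, plus whatever opportunity cost is supplied.
  def build_cost(first_version_estimate:, opportunity_cost: 0, years: 5)
    adjusted    = (first_version_estimate * SLIPPAGE_FACTOR).round
    maintenance = (adjusted * MAINTENANCE_RATE).round * years
    adjusted + maintenance + opportunity_cost
  end

  # Direct vendor cost: one-off integration plus recurring fee and SDK upkeep.
  def vendor_cost(annual_fee:, integration_cost:, annual_upkeep: 0, years: 5)
    integration_cost + (annual_fee + annual_upkeep) * years
  end

  # Risk adjustment: probability the build fails times what the failure costs.
  def expected_failure_cost(probability:, failure_cost:)
    (probability * failure_cost).round
  end
end

# Illustrative figures only, in euros.
build  = CostModel.build_cost(first_version_estimate: 60_000)
vendor = CostModel.vendor_cost(annual_fee: 18_000, integration_cost: 8_000, annual_upkeep: 2_000)
risk   = CostModel.expected_failure_cost(probability: 0.25, failure_cost: 100_000)

puts "Build over five years:  #{build}"   # 225000
puts "Vendor over five years: #{vendor}"  # 108000
puts "Expected failure cost:  #{risk}"    # 25000
```

With these inputs the build costs roughly twice the vendor before opportunity cost or risk is added. The indirect costs are left out because they rarely reduce to one number, so treat the build figure as a lower bound.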
What I Tell Engineers Who Want to Build
The hardest conversation is not with the founder. It is with the senior engineer who genuinely wants to build the thing, has a good design in mind, and is technically capable of executing it. They are not wrong about being able to do it. They are wrong about it being a good use of their time.
The framing that lands: every hour you spend on the internal feature-flag system is an hour you do not spend on the product feature that wins us the next customer. Vendors exist precisely to free up your time for the work that only you can do. Building plumbing is not why we hired senior engineers. We hired senior engineers to ship product.
This usually works. When it does not, I let them prototype on a Friday and then we look at the prototype together on Monday with the five questions in hand. The framework is its own argument.
FAQ
When does it make sense to build instead of buy software?
Build when the system is a competitive differentiator, when vendors cannot meet your scale or compliance requirements, or when the integration cost of using a vendor exceeds the build cost. For commodity infrastructure — billing, auth, feature flags, analytics, observability — buy.
How do I calculate the true cost of building software in-house?
Start with the engineer-weeks to ship version one, multiply by 1.5 to account for estimate slippage, add 30 percent of that annually for maintenance for five years, then add opportunity cost (what those engineers would have shipped for paying customers instead). Compare that fully-loaded number to the five-year SaaS contract, not the first year.
What’s the biggest mistake startups make with build vs buy decisions?
Treating engineering salaries as fixed costs and SaaS as variable costs. Engineering time is the most expensive and most constrained resource in any startup. Every hour spent maintaining an internal tool is an hour not spent on the product. The accounting framing makes building look free; the strategic framing shows it almost never is.
How does a fractional CTO help with build vs buy decisions?
A fractional CTO has seen the same decision play out across dozens of companies and knows the long-tail costs that internal advocates always underestimate. They also have the seniority to overrule a senior engineer’s enthusiasm without the political cost a full-time CTO would face. The framework in this post is what they bring to the table.
Need help framing a build vs buy decision before your next sprint? TTB Software advises startups and scale-ups as a fractional CTO partner, applying this exact framework to Rails architectures and vendor selection. We’ve been doing this for nineteen years.