— 02 / AI engineering

AI inside your product. Done right.

RAG systems, agentic workflows, on-prem LLM serving, function-calling pipelines. We build AI features that hold up in production — not demos that hold up on Twitter.

Where AI actually pays

If the AI breaks, your product still works.

— 01

RAG over your own data

pgvector or Qdrant, chunking that actually respects the document, citations on every answer, evaluation harnesses so you know when retrieval regresses. No vendor lock-in to a managed vector DB you can't audit.
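What "chunking that respects the document" means in practice, as a minimal sketch (illustrative only; `chunk_document` and the character budget are assumptions, not our production pipeline): split on paragraph boundaries first, then pack whole paragraphs into chunks, instead of cutting mid-sentence every N tokens.

```python
# Structure-aware chunking, minimal version: never split inside a paragraph.
# (A sketch; a real pipeline would also handle headings, tables, and overlap.)

def chunk_document(text: str, max_chars: int = 1000) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_chars each."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would blow the budget.
        if current and size + len(para) > max_chars:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(para)
        size += len(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Every chunk boundary is a paragraph boundary, so retrieval never surfaces half a thought.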

— 02

Agentic workflows that don't loop forever

Tool-using agents with budgets, timeouts, escape hatches, and an audit log of every call. We use Anthropic's Claude with MCP servers, OpenAI's tool-calling, or whatever the right shape is for the job.
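The bones of a bounded agent loop, sketched (every name here is hypothetical: `run_agent`, `decide`, and the dict shapes are assumptions, not a real SDK). The point is the shape: a step budget, a wall-clock deadline, and an audit log entry for every tool call, so a misbehaving agent stops instead of spinning.

```python
# A bounded agent loop: step budget, deadline, audit log, escape hatch.
# (Illustrative sketch; the decide/tools interfaces are assumed, not a real API.)

import time
from typing import Callable

def run_agent(
    decide: Callable[[list[dict]], dict],  # returns {"tool": ..., "args": ...} or {"done": ...}
    tools: dict[str, Callable],
    max_steps: int = 10,
    deadline_s: float = 30.0,
) -> tuple[object, list[dict]]:
    audit_log: list[dict] = []
    start = time.monotonic()
    for step in range(max_steps):
        # Wall-clock escape hatch: bail out even if steps remain.
        if time.monotonic() - start > deadline_s:
            return "aborted: deadline exceeded", audit_log
        action = decide(audit_log)
        if "done" in action:
            return action["done"], audit_log
        result = tools[action["tool"]](**action["args"])
        # Every call is recorded before the loop continues.
        audit_log.append({"step": step, "tool": action["tool"],
                          "args": action["args"], "result": result})
    # Step-budget escape hatch: spent the budget, stop cleanly.
    return "aborted: step budget exhausted", audit_log
```

Whether the model behind `decide` is Claude via MCP or OpenAI tool-calling, the guard rails live in the loop, not in the prompt.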

— 03

On-prem LLM serving

vLLM on your GPUs: model selection, quantisation, batching, observability. For regulated industries that can't ship customer text to OpenAI, and for high-volume workloads where per-token API pricing stops making sense.

— 04

The boring parts of AI ops

Prompt versioning, eval harnesses with real production traffic, regression detection, prompt-injection defences, PII redaction. The stuff that's the difference between "demoed it" and "running it for two years."
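Regression detection can be this small at its core (a sketch with hypothetical data shapes, not the real harness): score the same eval set against the old and new pipeline, and gate the deploy on the drop.

```python
# Retrieval regression gate: compare hit-rate before and after a change.
# (Data shapes are assumed for illustration: query -> retrieved doc ids.)

def hit_rate(results: dict[str, list[str]], expected: dict[str, str]) -> float:
    """Fraction of queries whose expected doc id was retrieved."""
    hits = sum(1 for q, doc in expected.items() if doc in results.get(q, []))
    return hits / len(expected)

def check_regression(old: dict, new: dict, expected: dict,
                     tolerance: float = 0.02) -> bool:
    """True if the new pipeline's hit-rate dropped more than tolerance."""
    return hit_rate(old, expected) - hit_rate(new, expected) > tolerance
```

The eval set comes from real production traffic; the gate runs on every prompt or index change, which is how "demoed it" becomes "running it for two years."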

Last section. Then please call.

It's a phone call. That's the worst it can get.

No discovery deck. No 45-minute "qualification" call. 30 minutes, your problem, my opinion. If we're a fit, you'll know by minute 12.

Direct line — answered by Roger
+31 6 5123 6132
Mon–Fri, 09:00–18:00 CET

OR
info@ttb.software