OpenTelemetry in Rails 8: Set Up Production Observability in 30 Minutes

By Roger Heykoop · devops
A step-by-step guide to instrumenting a Rails 8 application with OpenTelemetry for traces, metrics, and logs. Includes gem configuration, auto-instrumentation, custom spans, and exporter setup.

Add the opentelemetry-sdk gem to your Rails 8 app, configure auto-instrumentation, and you’ll have distributed traces flowing to your backend within 30 minutes. Here’s exactly how.

Why OpenTelemetry Instead of APM Vendor Lock-in

Traditional APM tools (New Relic, Datadog, Scout) require vendor-specific agents. When you want to switch providers — or send telemetry to multiple backends — you’re stuck rewriting instrumentation code. OpenTelemetry (OTel) is the CNCF standard that decouples instrumentation from export. You instrument once, then route signals wherever you want.

I switched a production Rails 8.0.1 app from Scout APM to OpenTelemetry last month. The migration took an afternoon, and the flexibility gain was immediate: traces go to Jaeger for development and Grafana Tempo in production, using the same instrumentation code.

Step 1: Add the Gems

In your Gemfile:

# OpenTelemetry core
gem "opentelemetry-sdk", "~> 1.4"
gem "opentelemetry-exporter-otlp", "~> 0.29"

# Auto-instrumentation (picks up Rails, ActiveRecord, etc.)
gem "opentelemetry-instrumentation-all", "~> 0.68"

Run bundle install. The opentelemetry-instrumentation-all meta-gem pulls in instrumentors for Rails, ActiveRecord, Action Pack, Action View, Net::HTTP, Faraday, Redis, Sidekiq, and about 30 others. If you prefer granular control, pick individual gems like opentelemetry-instrumentation-rails and opentelemetry-instrumentation-active_record.
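If you take the granular route, the slice most Rails apps need looks roughly like this. The gem names below are the real per-library instrumentors; I've left version constraints off the individual ones, so pin them to whatever is current:

```ruby
# Gemfile — granular alternative to the -all meta-gem (sketch)
gem "opentelemetry-sdk", "~> 1.4"
gem "opentelemetry-exporter-otlp", "~> 0.29"

# One instrumentor per library; each activates only if its target is loaded
gem "opentelemetry-instrumentation-rails"
gem "opentelemetry-instrumentation-active_record"
gem "opentelemetry-instrumentation-net_http"
```

With individual gems you call c.use "OpenTelemetry::Instrumentation::Rails" (and friends) in the initializer instead of c.use_all.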

Step 2: Configure the SDK

Create config/initializers/opentelemetry.rb:

require "opentelemetry/sdk"
require "opentelemetry/exporter/otlp"
require "opentelemetry/instrumentation/all"

OpenTelemetry::SDK.configure do |c|
  c.service_name = "my-rails-app"
  c.service_version = ENV.fetch("GIT_SHA", "unknown")

  c.use_all(
    "OpenTelemetry::Instrumentation::Rack" => {
      untraced_endpoints: ["/up", "/healthz"]
    },
    "OpenTelemetry::Instrumentation::ActiveRecord" => {
      db_statement: :obfuscate
    }
  )
end

A few things worth calling out:

db_statement: :obfuscate replaces query parameter values with ? in trace spans. Without this, your traces contain raw SQL with user data — a security and compliance problem. Don’t skip it.

untraced_endpoints filters out health check noise. Kubernetes probes hitting /up every 10 seconds generate thousands of useless spans per day. Exclude them.

service_version tied to GIT_SHA lets you correlate performance regressions to specific deploys. If you’re deploying with Kamal 2, pass the SHA as an environment variable in your deploy config.
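With Kamal 2, one way to pass the SHA is ERB interpolation, since deploy.yml is rendered through ERB before parsing. The otel-collector hostname here is an assumption about your setup:

```yaml
# config/deploy.yml — illustrative fragment
env:
  clear:
    OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4318
    GIT_SHA: <%= `git rev-parse HEAD`.strip %>
```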

Step 3: Set Environment Variables

The OTLP exporter reads configuration from environment variables by default:

# Where to send telemetry
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318

# Resource attributes (shows up in your backend UI)
OTEL_RESOURCE_ATTRIBUTES=deployment.environment=production,host.name=web-1

If you’re running a local collector during development:

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_TRACES_EXPORTER=otlp
OTEL_LOG_LEVEL=debug

The debug log level prints every exported span to stdout — useful for verifying instrumentation works, noisy enough that you’ll want to turn it off fast.

Step 4: Add Custom Spans Where It Matters

Auto-instrumentation covers HTTP requests, database queries, cache operations, and template rendering. But your most interesting performance data lives in application-specific code: payment processing, PDF generation, external API calls to services without HTTP instrumentation.

class InvoiceGenerator
  def generate(order)
    tracer = OpenTelemetry.tracer_provider.tracer("invoice-generator")

    tracer.in_span("generate_invoice", attributes: {
      "order.id" => order.id,
      "order.line_items" => order.line_items.count
    }) do |span|
      pdf = render_pdf(order)

      span.set_attribute("invoice.pages", pdf.page_count)
      span.set_attribute("invoice.size_bytes", pdf.bytesize)

      upload_to_s3(pdf)
    end
  end
end

Keep custom span names short and consistent. Use dot notation for attributes (order.id, not orderId or order_id). The OpenTelemetry semantic conventions define standard attribute names for common operations — follow them where applicable.
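Child spans follow the same rules. Because in_span parents new spans on the current context, giving render_pdf its own span nests it under generate_invoice automatically. A sketch, where build_pdf_document is a hypothetical stand-in for your actual PDF code:

```ruby
class InvoiceGenerator
  private

  # Child span: in_span picks up the current span from context,
  # so this nests under the generate_invoice span created above.
  def render_pdf(order)
    tracer = OpenTelemetry.tracer_provider.tracer("invoice-generator")
    tracer.in_span("render_pdf") do |span|
      pdf = build_pdf_document(order) # hypothetical helper
      span.set_attribute("invoice.render.line_items", order.line_items.count)
      pdf
    end
  end
end
```

A useful side effect: if render_pdf raises, in_span records the exception on the span and marks it as an error before re-raising, so failed invoices show up flagged in Jaeger without any extra code.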

Step 5: Run a Local Collector for Development

Don’t point your development Rails server directly at a production Grafana instance. Run a local OpenTelemetry Collector that exports to Jaeger:

# docker-compose.otel.yml
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.96.0
    ports:
      - "4318:4318"   # OTLP HTTP receiver
      - "4317:4317"   # OTLP gRPC receiver
    volumes:
      - ./otel-collector-config.yml:/etc/otelcol-contrib/config.yaml

  jaeger:
    image: jaegertracing/all-in-one:1.54
    ports:
      - "16686:16686" # Jaeger UI
      - "4317"        # Receives from collector

Collector config:

# otel-collector-config.yml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:
    timeout: 5s
    send_batch_size: 1024

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger]

Run docker compose -f docker-compose.otel.yml up, start your Rails server, make some requests, and open Jaeger at http://localhost:16686. You’ll see traces with spans for every controller action, database query, and view render.

Step 6: Production Collector Architecture

In production, the collector sits between your app and your observability backend. This gives you:

  • Buffering — the collector batches and retries, so your app never blocks on telemetry export
  • Sampling — drop low-value traces before they reach your paid backend
  • Routing — send traces to Tempo, metrics to Prometheus, logs to Loki, all from one pipeline

For a Rails app handling 500 requests/second, I use tail-based sampling in the collector to keep 100% of error traces and slow requests (>2s) while sampling 10% of successful fast requests. This cuts storage costs by roughly 80% without losing the data that actually matters.

processors:
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: slow-requests
        type: latency
        latency: {threshold_ms: 2000}
      - name: baseline
        type: probabilistic
        probabilistic: {sampling_percentage: 10}

Connecting to Your Existing Stack

If you’ve already set up GitHub Actions CI/CD for your Rails app, you can add a step to verify OpenTelemetry initialization in your test suite:

# test/test_helper.rb
require "opentelemetry/sdk"

# Use an in-memory exporter for tests; keep a handle to it
# so tests can inspect finished spans
EXPORTER = OpenTelemetry::SDK::Trace::Export::InMemorySpanExporter.new

OpenTelemetry::SDK.configure do |c|
  c.service_name = "my-rails-app-test"
  c.add_span_processor(
    OpenTelemetry::SDK::Trace::Export::SimpleSpanProcessor.new(EXPORTER)
  )
end

This lets you assert that specific operations create the expected spans without hitting any external service.
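Assuming the in-memory exporter is held in a constant (EXPORTER here), an assertion looks like this. The test class and fixture name are illustrative:

```ruby
# test/unit/invoice_generator_test.rb (illustrative)
require "test_helper"

class InvoiceGeneratorTest < ActiveSupport::TestCase
  setup { EXPORTER.reset } # clear spans left over from earlier tests

  test "generating an invoice emits a span carrying the order id" do
    order = orders(:one) # hypothetical fixture
    InvoiceGenerator.new.generate(order)

    span = EXPORTER.finished_spans.find { |s| s.name == "generate_invoice" }
    assert span, "expected a generate_invoice span"
    assert_equal order.id, span.attributes["order.id"]
  end
end
```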

For background job monitoring, the Sidekiq and Active Job auto-instrumentors (the latter covers Solid Queue) automatically create spans for each job execution, linking them back to the web request that enqueued the job. If a Turbo Frames request triggers a background job, you’ll see the full chain from browser click to job completion in a single trace.

Performance Impact

On the production app I mentioned earlier (Rails 8.0.1, Ruby 3.3.6, ~500 req/s): OpenTelemetry auto-instrumentation adds roughly 1-2ms of overhead per request. Memory footprint increased by about 15MB per Puma worker. The OTLP exporter batches spans and sends them asynchronously, so export latency doesn’t affect request response times.

If you’re running GC-tuned Puma workers, account for the additional object allocations from span creation. In my benchmarks, OTel added ~200 allocations per request with auto-instrumentation enabled. Not zero, but well within acceptable bounds for the visibility you gain.
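If you want to sanity-check that number against your own app, a rough allocation counter needs only the Ruby standard library. allocations_during is a name I'm introducing here, not an OTel API:

```ruby
# Counts Ruby objects allocated while the block runs. GC is disabled
# during measurement so a collection pass doesn't skew the counter.
def allocations_during
  GC.disable
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end
```

Wrap a representative request (via an integration test, say) once with instrumentation enabled and once with it disabled, and compare the two counts.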

FAQ

Does OpenTelemetry work with Solid Queue in Rails 8?

Yes. The opentelemetry-instrumentation-active_job gem instruments any Active Job backend, including Solid Queue. Each job execution gets its own trace span linked to the parent request. Install it via the opentelemetry-instrumentation-all meta-gem or add it individually.

Can I use OpenTelemetry alongside an existing APM agent?

You can, but I wouldn’t recommend it long-term. Running both means double the instrumentation overhead and potentially conflicting monkey-patches. A better approach: migrate to OpenTelemetry and use the OTLP exporter to send data to your existing APM vendor (most now accept OTLP natively — Datadog, New Relic, Honeycomb, and Grafana Cloud all do).

How much does OpenTelemetry slow down my Rails app?

In my production measurements: 1-2ms per request with full auto-instrumentation and about 15MB additional memory per Puma worker. The async batch exporter ensures telemetry export doesn’t block request processing. If you need to reduce overhead further, disable instrumentors you don’t need or increase the batch export interval.

What’s the difference between opentelemetry-instrumentation-all and picking individual gems?

The -all meta-gem installs every available instrumentor (~30 gems). At boot, each instrumentor checks whether its target library is loaded and activates only if present. The downside: more gems in your bundle, slightly longer bundle install times. The upside: you never forget to instrument a new dependency. For most Rails apps, the meta-gem is the pragmatic choice.

Should I use the OTLP HTTP or gRPC exporter?

Use HTTP (opentelemetry-exporter-otlp gem, port 4318) unless you have a specific reason to use gRPC. HTTP is simpler to debug, works through more proxies and load balancers, and the performance difference is negligible for most Rails applications. The gRPC exporter exists for environments where you’re already running gRPC infrastructure and want connection multiplexing.

#rails #opentelemetry #observability #monitoring #devops #production

About the Author

Roger Heykoop is a senior Ruby on Rails developer with 19+ years of Rails experience and 35+ years in software development. He specializes in Rails modernization, performance optimization, and AI-assisted development.
