Building Pixevo: Engineering Challenges Behind an AI Image Platform

Roger Heykoop
case-studies, rails, ai
How we built a multi-model AI image generation platform with Rails — workflow engines, real-time processing, and the architecture decisions that made it work.

When someone types “a cat wearing a space suit on Mars” into Pixevo and gets a photorealistic image back in three seconds, they don’t think about what’s happening underneath. They shouldn’t have to. But building the platform that makes that happen? That required solving problems we hadn’t encountered in fifteen years of Rails development.

Pixevo is an AI image and video generation platform we built from scratch. It integrates over 50 AI models, processes thousands of generations daily, and ships features like visual workflow builders, batch processing, and a marketplace where users sell their creations. Here’s what we learned building it.

The Multi-Model Problem

Most AI image platforms pick one model and build around it. Midjourney uses their own model. DALL-E uses OpenAI’s. We wanted Pixevo to offer every major model — Flux, Imagen, Stable Diffusion, Kling for video, plus our own Nano Banana Pro — and let users switch between them seamlessly.

The challenge isn’t calling different APIs. That’s just HTTP. The challenge is that every model has different input formats, resolution constraints, generation times, pricing structures, and failure modes. Flux wants aspect ratios. Stable Diffusion wants exact pixel dimensions. Some models accept negative prompts. Others ignore them. Response times range from two seconds to two minutes.

We built a model adapter layer that normalizes all of this behind a consistent interface:

class Generation::Orchestrator
  def generate(prompt:, model:, params: {})
    adapter = ModelRegistry.adapter_for(model)
    normalized = adapter.normalize_params(params)
    
    adapter.validate!(normalized)
    
    result = with_retry(adapter.retry_policy) do
      adapter.generate(prompt: prompt, **normalized)
    end
    
    PostProcessor.new(result, target: params[:output_format]).process
  end
end

Each adapter handles its model’s quirks. The orchestrator doesn’t care whether it’s talking to a local GPU cluster or a remote API. New models get added by writing an adapter — we’ve shipped twelve since launch without touching the core generation code.
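To make the adapter idea concrete, here is a minimal sketch of what one adapter might look like. The class name `FluxAdapter`, the supported ratios, and the parameter names are illustrative assumptions, not Pixevo's actual code:

```ruby
# Hypothetical adapter sketch -- class name, ratios, and params are
# illustrative, not Pixevo's production code.
class FluxAdapter
  ASPECT_RATIOS = %w[1:1 16:9 9:16 4:3 3:4].freeze

  # Translate generic width/height params into the aspect-ratio format
  # this model expects, dropping options it doesn't support.
  def normalize_params(params)
    out = { aspect_ratio: nearest_aspect_ratio(params[:width], params[:height]) }
    out[:seed] = params[:seed] if params[:seed]
    out # negative_prompt is silently dropped: this model ignores it
  end

  def validate!(params)
    raise ArgumentError, "unsupported aspect ratio" unless ASPECT_RATIOS.include?(params[:aspect_ratio])
  end

  private

  def nearest_aspect_ratio(width, height)
    return "1:1" unless width && height
    target = width.to_f / height
    ASPECT_RATIOS.min_by do |ratio|
      w, h = ratio.split(":").map(&:to_f)
      (w / h - target).abs
    end
  end
end
```

The orchestrator only ever sees `normalize_params` and `validate!`, so each model's quirks stay contained in one file.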

Real-Time Generation Updates

Users expect to see their image appear, not stare at a spinner for thirty seconds. Some models stream intermediate results. Others return nothing until they’re done. We needed a unified real-time experience regardless of the model.

We use Action Cable with a generation-specific channel that handles both streaming and polling models:

class GenerationChannel < ApplicationCable::Channel
  def subscribed
    stream_for current_user
  end

  def self.broadcast_progress(user, generation)
    broadcast_to(user, {
      id: generation.id,
      status: generation.status,
      progress: generation.progress_percentage,
      preview_url: generation.preview_url,
      final_url: generation.completed? ? generation.result_url : nil
    })
  end
end

For streaming models, we push preview frames as they arrive. For non-streaming models, the background job polls the provider and broadcasts progress estimates based on historical timing data. The frontend doesn’t know the difference — it just renders whatever comes down the socket.

The tricky part was progress estimation. A model might typically take twelve seconds, but if the provider is under load, it could take forty. We track P50 and P95 generation times per model and use a logarithmic curve for the progress bar. It starts fast, slows down, and never hits 100% until the image is actually done. Nobody notices, but it feels responsive.
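One way to shape such a curve, sketched below. The constants and the exact formula are assumptions for illustration; the real estimator is tuned per model:

```ruby
# Hypothetical progress estimator -- the curve shape and CAP constant
# are illustrative, not Pixevo's exact formula.
class ProgressEstimator
  CAP = 99 # never report 100% until the result actually arrives

  def initialize(p50:, p95:)
    @p95 = p95.to_f
  end

  # Logarithmic curve: climbs quickly at first, slows as elapsed time
  # grows, and clamps at CAP so the bar never lies about completion.
  def percent(elapsed_seconds)
    raw = CAP * Math.log(1 + elapsed_seconds) / Math.log(1 + @p95)
    [raw, CAP].min.round
  end
end
```

With `p50: 12, p95: 40`, the bar is near 30% after two seconds and around two-thirds at the median generation time, then crawls toward 99% for slow outliers.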

The Workflow Engine

Pixevo’s visual workflow builder lets users chain AI operations: generate an image, upscale it, swap a face, remove the background, apply color correction. Each node in the workflow is an independent operation with its own inputs, outputs, and error handling.

Building a reliable DAG (Directed Acyclic Graph) execution engine inside a Rails app was the most complex piece. Each workflow step might take anywhere from one to sixty seconds. Steps can run in parallel if they don’t depend on each other. Any step can fail, and you need to be able to retry individual nodes without re-running the entire pipeline.

We modeled it as a state machine with persistent state:

class Workflow::Execution < ApplicationRecord
  has_many :node_executions, dependent: :destroy
  
  state_machine :status, initial: :pending do
    event :start do
      transition pending: :running
    end
    
    event :complete do
      transition running: :completed
    end
    
    event :fail do
      transition running: :failed
    end
  end

  def execute!
    start!
    ready_nodes.each { |node| WorkflowNodeJob.perform_later(self, node) }
  end

  def node_completed(node)
    node.dependents.each do |dependent|
      if dependent.dependencies_met?
        WorkflowNodeJob.perform_later(self, dependent)
      end
    end
    
    complete! if all_nodes_completed?
  end
end

Each node execution runs as an independent background job. When a node finishes, it checks which downstream nodes are now unblocked and enqueues them. This gives us automatic parallelization — if a workflow has three independent branches, they all run simultaneously.
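The unblocking logic can be sketched without any Rails machinery. This is a framework-free toy version under assumed names (`DagScheduler`, a dependency hash); in the real app, each newly ready node would be enqueued as a `WorkflowNodeJob`:

```ruby
# Framework-free sketch of the node-unblocking logic. Names and the
# dependency-hash shape are illustrative.
class DagScheduler
  # deps maps each node to the array of nodes it depends on.
  def initialize(deps)
    @deps = deps
    @completed = []
  end

  # Nodes whose dependencies are all satisfied and aren't done yet.
  def ready_nodes
    @deps.keys.select do |node|
      !@completed.include?(node) && (@deps[node] - @completed).empty?
    end
  end

  # Completing a node returns the nodes it just unblocked -- these are
  # the ones the real app would hand to the job queue.
  def complete(node)
    before = ready_nodes
    @completed << node
    ready_nodes - before
  end

  def finished?
    @completed.sort == @deps.keys.sort
  end
end
```

Because `ready_nodes` is derived purely from persisted completion state, a crashed job can be retried and the graph picks up exactly where it left off.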

The marketplace adds another dimension. Users publish workflows as templates, and other users can install and customize them. We had to separate the workflow definition (the template) from the execution (a specific run), handle versioning when creators update their workflows, and manage the economic model where creators earn credits from installs.

Image Processing at Scale

Generated images need post-processing before they reach users. We resize for thumbnails, strip metadata for privacy, convert formats, generate blurhash placeholders for lazy loading, and run quality checks. A single generation can produce six derivative images.

Processing these synchronously would kill response times. We built a pipeline using Active Storage with custom analyzers:

class ImageAnalyzer::Generation < ActiveStorage::Analyzer
  def metadata
    {
      width: image.width,
      height: image.height,
      format: image.format,
      blurhash: generate_blurhash,
      nsfw_score: content_safety_check,
      quality_score: assess_quality
    }
  end
end

The content safety check deserves mention. Every generated image runs through a moderation pipeline before it’s visible. This has to be fast (users are waiting) and accurate (false positives frustrate users, false negatives create legal exposure). We use a combination of model-level safety settings and our own classification layer.

Handling Provider Failures Gracefully

When you depend on external AI model providers, things break. Rate limits hit. Providers go down for maintenance. GPU clusters run out of capacity. One model might reject a prompt that another handles fine.

Our circuit breaker implementation tracks failure rates per provider and automatically reroutes:

class ProviderCircuitBreaker
  FAILURE_THRESHOLD = 5
  RESET_TIMEOUT = 30.seconds

  def call(provider, &block)
    if circuit_open?(provider)
      raise CircuitOpenError if no_fallback?(provider)
      return fallback_provider(provider).call(&block)
    end

    begin
      result = yield
      record_success(provider)
      result
    rescue ProviderError => e
      record_failure(provider)
      retry_with_fallback(provider, e, &block)
    end
  end
end

When Flux’s API starts returning errors, we stop hammering it and either queue the request for later or offer the user an alternative model. This happens transparently — the user sees “generation taking longer than usual” rather than an error page.
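The snippet above leaves `circuit_open?` and `record_failure` undefined. One possible backing for them is sketched here as an in-memory failure window; this is an assumption for illustration, and a production setup would more likely keep this state in Redis so it is shared across processes:

```ruby
# Illustrative in-memory backing for circuit_open? / record_failure.
# Production would likely use Redis so all workers share the state.
class FailureTracker
  def initialize(threshold:, reset_timeout:)
    @threshold = threshold
    @reset_timeout = reset_timeout
    @failures = Hash.new { |h, k| h[k] = [] }
  end

  def record_failure(provider, now = Time.now)
    @failures[provider] << now
  end

  def record_success(provider)
    @failures[provider].clear
  end

  # Open when enough failures have landed inside the reset window;
  # old entries age out, which is what lets the circuit close again.
  def circuit_open?(provider, now = Time.now)
    recent = @failures[provider].select { |t| now - t < @reset_timeout }
    @failures[provider] = recent
    recent.size >= @threshold
  end
end
```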

Database Design for Generations

Pixevo stores millions of generation records, each with metadata about the prompt, model, parameters, timing, and results. The naive approach — one fat table — would collapse under query pressure.

We partition generations by month and use a combination of PostgreSQL’s JSONB columns for flexible model-specific parameters and indexed columns for things we query frequently:

class CreateGenerations < ActiveRecord::Migration[8.0]
  def change
    create_table :generations, id: :uuid do |t|
      t.references :user, null: false
      t.string :model_key, null: false
      t.string :status, null: false, default: 'pending'
      t.text :prompt
      t.jsonb :params, default: {}
      t.jsonb :result_metadata, default: {}
      t.integer :generation_time_ms
      t.decimal :cost_credits, precision: 10, scale: 4
      
      t.timestamps
      t.index [:user_id, :created_at]
      t.index [:model_key, :status]
      t.index :created_at
    end
  end
end

The JSONB params column stores model-specific settings without requiring schema changes when we add models. Need to store LoRA weights for Stable Diffusion? It goes in params. Aspect ratio for Flux? Params. Camera settings for Nano Banana Pro’s JSON mode? Params. The database doesn’t care.

What We’d Do Differently

Three things stand out in retrospect.

First, we should have built the workflow engine as a separate service from day one. It grew organically inside the monolith, and extracting it later was painful. The execution state, the node graph, the marketplace — it’s complex enough to warrant its own bounded context.

Second, we underestimated the importance of generation caching. Identical prompts with identical parameters produce identical results from deterministic models. We now cache aggressively, but retrofitting a cache layer onto an existing generation pipeline is messier than building it in from the start.
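The core of such a cache is a deterministic key. A minimal sketch, with an assumed key layout rather than Pixevo's actual scheme, would hash the prompt, model, and canonicalized parameters so that identical requests collide on the same entry:

```ruby
# Sketch of a deterministic cache key -- the layout is illustrative.
# Only valid for deterministic models (fixed seed, no sampling drift).
require "digest"
require "json"

module GenerationCache
  # Sort params so hash insertion order doesn't change the key.
  def self.key_for(prompt:, model:, params: {})
    canonical = params.sort.to_h
    digest = Digest::SHA256.hexdigest(JSON.generate([prompt, model, canonical]))
    "generation:#{model}:#{digest}"
  end
end
```

Normalizing the params *before* hashing matters: `{seed: 42, width: 1024}` and `{width: 1024, seed: 42}` must produce the same key, or the cache silently stops hitting.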

Third, we should have invested in observability earlier. When a generation fails at 2 AM, you need to know whether it’s your code, the provider, or the user’s prompt. We added structured logging, distributed tracing, and provider-specific health dashboards — but only after spending too many mornings debugging production issues by reading raw logs.

The Stack

For anyone curious about the technology choices:

  • Backend: Ruby on Rails 8, PostgreSQL 16, Redis
  • Real-time: Action Cable with AnyCable for production
  • Background jobs: Solid Queue (migrated from Sidekiq)
  • Frontend: Hotwire with Stimulus controllers for the interactive bits
  • Image processing: libvips via ImageProcessing gem
  • Deployment: Kamal on Hetzner (EU data centers for GDPR compliance)
  • CDN: Cloudflare with R2 for image storage

Rails might seem like an unusual choice for an AI platform. It isn’t. The AI models run elsewhere — on GPU clusters behind APIs. Our job is orchestration, user management, billing, marketplace logic, and real-time communication. That’s exactly what Rails excels at.

Wrapping Up

Building Pixevo taught us that the hard part of an AI product isn’t the AI. It’s everything around it: the orchestration layer that makes multiple models feel like one platform, the real-time infrastructure that makes thirty-second generations feel instant, the failure handling that keeps users happy when providers go down, and the data model that stays flexible as the AI landscape shifts underneath you.

If you’re building something similar and want to talk architecture, get in touch. We’ve made enough mistakes to save you from the expensive ones.

#rails #ai #image-generation #architecture #real-time #workflows #pixevo

About the Author

Roger Heykoop is a senior Ruby on Rails developer with 19+ years of Rails experience and 35+ years in software development. He specializes in Rails modernization, performance optimization, and AI-assisted development.
