Welcome to ModelRelay

ModelRelay is production infrastructure for AI agent systems. Build durable agent loops, orchestrate multi-agent workflows, and track usage per step—all through a single API.

What is ModelRelay?

ModelRelay is the infrastructure layer for production AI agents. It handles reliability, observability, and billing so you can focus on your product logic.

  • Durable execution - Agent runs survive disconnects and transient failures.
  • Observability - Per-step usage, cost, and tool I/O for every agent loop.
  • Orchestration - Coordinate multiple agents with explicit DAG workflows.
  • Provider flexibility - Run the same agents across Anthropic, OpenAI, Google, and xAI.
  • Production-ready - Built-in billing, rate limiting, audit logs, and auth.
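
To make the orchestration point concrete, here is a minimal sketch of how an explicit DAG of agent steps can be ordered so that each agent runs only after the steps it depends on. The `Workflow` type, the step names, and `executionOrder` are hypothetical illustrations and are not part of the ModelRelay API.

```typescript
// Hypothetical workflow shape: each step name maps to the steps it depends on.
type Workflow = Record<string, string[]>;

// Kahn's algorithm: return step names in an order that respects dependencies.
// Throws if a dependency is undeclared or the graph contains a cycle.
function executionOrder(workflow: Workflow): string[] {
  const names = Object.keys(workflow);
  const remaining = new Map<string, number>(); // unmet dependency count per step
  const dependents = new Map<string, string[]>(); // step -> steps waiting on it

  for (const name of names) {
    dependents.set(name, []);
  }
  for (const name of names) {
    const deps = workflow[name];
    remaining.set(name, deps.length);
    for (const dep of deps) {
      const waiting = dependents.get(dep);
      if (waiting === undefined) {
        throw new Error(`Step "${name}" depends on unknown step "${dep}"`);
      }
      waiting.push(name);
    }
  }

  const ready = names.filter((name) => remaining.get(name) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const step = ready.shift() as string;
    order.push(step);
    for (const next of dependents.get(step) ?? []) {
      const left = (remaining.get(next) ?? 0) - 1;
      remaining.set(next, left);
      if (left === 0) ready.push(next);
    }
  }

  if (order.length !== names.length) {
    throw new Error("Workflow contains a cycle");
  }
  return order;
}

// Two research agents fan out from a planner, then a writer aggregates them.
const pipeline: Workflow = {
  planner: [],
  webResearch: ["planner"],
  docResearch: ["planner"],
  writer: ["webResearch", "docResearch"],
};

const order = executionOrder(pipeline);
console.log(order.join(" -> "));
// planner -> webResearch -> docResearch -> writer
```

Declaring dependencies explicitly, rather than chaining calls in code, is what lets independent steps such as the two research agents be scheduled without waiting on each other.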

The Problem

Building agent systems into your product means solving multiple hard problems at once:

  • Provider complexity - Each AI provider has a different API, auth model, and pricing structure.
  • Usage tracking - You need to know exactly how much each customer costs you.
  • Billing integration - Turning AI usage into revenue requires Stripe integration, metering, and invoicing.
  • Access control - Free users need limits, paid users need higher quotas, and enterprise customers need custom terms.

Most teams cobble together agent frameworks, custom billing code, and provider SDKs. It works, but it’s months of engineering that isn’t your core product.

The Solution

ModelRelay handles the infrastructure so you can focus on your product:

graph LR
    A[Your App] --> B[ModelRelay]
    B --> C[Anthropic]
    B --> D[OpenAI]
    B --> E[Google AI]
    B --> F[xAI]
    B --> G[Customers & Billing]

Unified Provider Access

One API for all major AI providers. Switch models by changing the model name, not your integration code. Use Claude Opus 4 for complex reasoning, GPT-5.2 for vision, Gemini 3 for long context, or Grok 4 for speed, whichever fits your use case.

Built-in Customer Management

Create customers, assign them to tiers, and track their usage automatically. Know exactly what each customer costs you before you send them an invoice.
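
As a rough illustration of what per-customer cost tracking involves, the sketch below sums token costs by customer. The `UsageRecord` fields, the customer IDs, and the prices are all assumptions made for this example; they do not reflect the ModelRelay schema or any provider's actual pricing.

```typescript
// Hypothetical usage record; field names are illustrative only.
interface UsageRecord {
  customerId: string;
  inputTokens: number;
  outputTokens: number;
  // Assumed per-million-token prices in USD for the model that served the request.
  inputPricePerMTok: number;
  outputPricePerMTok: number;
}

// Sum the provider cost of every request, grouped by customer.
function costPerCustomer(records: UsageRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    const cost =
      (r.inputTokens / 1000000) * r.inputPricePerMTok +
      (r.outputTokens / 1000000) * r.outputPricePerMTok;
    totals.set(r.customerId, (totals.get(r.customerId) ?? 0) + cost);
  }
  return totals;
}

// Made-up sample data: two requests for one customer, one for another.
const records: UsageRecord[] = [
  { customerId: "cust_a", inputTokens: 200000, outputTokens: 50000, inputPricePerMTok: 3, outputPricePerMTok: 15 },
  { customerId: "cust_a", inputTokens: 100000, outputTokens: 10000, inputPricePerMTok: 3, outputPricePerMTok: 15 },
  { customerId: "cust_b", inputTokens: 1000000, outputTokens: 0, inputPricePerMTok: 1, outputPricePerMTok: 5 },
];

const totals = costPerCustomer(records);
totals.forEach((usd, customerId) => {
  console.log(`${customerId}: $${usd.toFixed(2)}`);
});
// cust_a: $1.80
// cust_b: $1.00
```

Comparing these totals with what each customer pays is the margin check the paragraph above describes.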

Stripe-Native Billing

Checkout sessions, subscriptions, and usage-based billing out of the box. Connect your Stripe account and start monetizing AI features immediately.

Tiered Access Control

Define tiers with different limits and pricing. For example, a Free tier might allow 100 requests/month, Pro 10,000, and Enterprise unlimited. ModelRelay enforces the limits automatically.
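
The kind of check that tier enforcement implies can be sketched as follows, using the request limits from the paragraph above. The type names, the `checkQuota` function, and the use of `null` for "unlimited" are assumptions for illustration; ModelRelay performs this enforcement for you, and its internal logic may differ.

```typescript
// Hypothetical tier table using the monthly limits from the text above.
// `null` means the tier has no monthly cap.
type TierName = "free" | "pro" | "enterprise";

const MONTHLY_REQUEST_LIMITS: Record<TierName, number | null> = {
  free: 100,
  pro: 10000,
  enterprise: null,
};

interface QuotaDecision {
  allowed: boolean;
  remaining: number | null; // null when the tier is uncapped
}

// Decide whether one more request fits in the customer's monthly quota.
function checkQuota(tier: TierName, requestsThisMonth: number): QuotaDecision {
  const limit = MONTHLY_REQUEST_LIMITS[tier];
  if (limit === null) {
    return { allowed: true, remaining: null };
  }
  const remaining = Math.max(limit - requestsThisMonth, 0);
  return { allowed: remaining > 0, remaining };
}

console.log(checkQuota("free", 99)); // allowed, 1 request remaining
console.log(checkQuota("free", 100)); // blocked, 0 remaining
console.log(checkQuota("enterprise", 5000000)); // allowed, uncapped
```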

Durable Agent Execution

Agentic loops run server-side with per-step logs, tool I/O capture, and automatic retries. Runs survive disconnects and provide full traceability.
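
The following sketch shows the general idea behind retries plus a per-step log, under simplifying assumptions: steps are synchronous, every error is treated as transient, and the log lives in memory. The `Step`, `StepLog`, and `runSteps` names are hypothetical; this is not ModelRelay's implementation, which runs server-side.

```typescript
// Hypothetical step and log shapes; a real run would persist the log server-side.
interface Step {
  name: string;
  run: () => string; // synchronous here to keep the sketch short
}

interface StepLog {
  name: string;
  attempts: number;
  output: string;
}

// Run steps in order, retrying each up to `maxAttempts` times.
// Steps already present in `log` are skipped, so a run can resume after an interruption.
function runSteps(steps: Step[], log: StepLog[], maxAttempts = 3): StepLog[] {
  const done = new Set(log.map((entry) => entry.name));
  for (const step of steps) {
    if (done.has(step.name)) continue; // completed in an earlier run
    let lastError: unknown;
    let succeeded = false;
    for (let attempt = 1; attempt <= maxAttempts && !succeeded; attempt++) {
      try {
        const output = step.run();
        log.push({ name: step.name, attempts: attempt, output });
        succeeded = true;
      } catch (err) {
        lastError = err; // every error counts as transient in this sketch
      }
    }
    if (!succeeded) {
      throw new Error(
        `Step "${step.name}" failed after ${maxAttempts} attempts: ${String(lastError)}`
      );
    }
  }
  return log;
}

// A simulated tool call that fails once before succeeding.
let searchCalls = 0;
const steps: Step[] = [
  { name: "plan", run: () => "search for sources" },
  {
    name: "search",
    run: () => {
      searchCalls += 1;
      if (searchCalls === 1) throw new Error("transient network error");
      return "3 sources found";
    },
  },
];

const runLog = runSteps(steps, []);
for (const entry of runLog) {
  console.log(`${entry.name}: ${entry.output} (attempts: ${entry.attempts})`);
}
// plan: search for sources (attempts: 1)
// search: 3 sources found (attempts: 2)
```

Passing the same log back into `runSteps` skips the completed steps, which is the property that lets a run pick up where it left off instead of starting over.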

Use Cases

  • AI copilots that must stay reliable under network instability or client disconnects
  • Research assistants that call tools and need per-step audit trails
  • Customer support agents with per-user billing and quota enforcement
  • Multi-agent pipelines that aggregate outputs into a final answer

Credential Types

  • Secret keys (mr_sk_*) - Full access for your backend
  • Customer tokens - Scoped to a single customer; use them for requests made on that customer's behalf
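
One way to act on this distinction is a small guard that stops a secret key from being passed to code that runs outside your backend. The `mr_sk_` prefix comes from the list above; the function names and the sample credential strings are hypothetical, and this sketch assumes that any credential without that prefix is safe to scope to a client.

```typescript
// The mr_sk_ prefix comes from the credential list above; the rest is illustrative.
const SECRET_KEY_PREFIX = "mr_sk_";

function isSecretKey(credential: string): boolean {
  return credential.startsWith(SECRET_KEY_PREFIX);
}

// Refuse to hand a secret key to code that runs outside the backend.
function assertSafeForClient(credential: string): string {
  if (isSecretKey(credential)) {
    throw new Error("Secret keys must stay on the backend; use a customer token instead.");
  }
  return credential;
}

// Placeholder values for demonstration only; neither is a real credential.
console.log(isSecretKey("mr_sk_example_placeholder")); // true
console.log(isSecretKey("customer-token-placeholder")); // false
```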

Quick Start

import { ModelRelay } from '@modelrelay/sdk';

const mr = new ModelRelay({ apiKey: process.env.MODELRELAY_API_KEY });

// Make an AI request
const answer = await mr.responses.text(
  'claude-sonnet-4-5',
  'You are a helpful assistant.',
  'What is the capital of France?'
);

console.log(answer);
// "The capital of France is Paris."

Get Started: Set up your account and make your first request in 5 minutes.

Official SDKs

ModelRelay provides official SDKs with consistent APIs across languages.

Next Steps

  • Getting Started - Create your first project and make a request
  • Guides - Learn streaming, tool use, and structured output
  • API Reference - Complete endpoint documentation