TypeScript SDK
The official TypeScript SDK for ModelRelay. Works with Node.js, Bun, Deno, and modern browsers.
Installation
npm install @modelrelay/sdk
Or with other package managers:
bun add @modelrelay/sdk
pnpm add @modelrelay/sdk
yarn add @modelrelay/sdk
Quick Start
import { ModelRelay } from "@modelrelay/sdk";
const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_API_KEY!);
const answer = await mr.responses.text(
"claude-sonnet-4-5",
"You are a helpful assistant.",
"What is the capital of France?"
);
console.log(answer);
// "The capital of France is Paris."
Convenience API
The simplest way to get started. Three methods cover the most common use cases:
Ask — Get a Quick Answer
import { ModelRelay } from "@modelrelay/sdk";
const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_API_KEY!);
const answer = await mr.ask("claude-sonnet-4-5", "What is 2 + 2?");
console.log(answer); // "4"
Chat — Full Response with Metadata
const response = await mr.chat("claude-sonnet-4-5", "Explain quantum computing", {
system: "You are a physics professor",
});
console.log(response.output);
console.log("Tokens:", response.usage.totalTokens);
Agent — Agentic Tool Loops
Run an agent that automatically executes tools until completion:
import { z } from "zod";
import fs from "node:fs/promises";
const tools = mr
.tools()
.add(
"read_file",
"Read a file from the filesystem",
z.object({ path: z.string().describe("File path to read") }),
async (args) => {
const content = await fs.readFile(args.path, "utf-8");
return content;
}
);
const result = await mr.agent("claude-sonnet-4-5", {
tools,
prompt: "Read config.json and summarize it",
system: "You are a helpful file assistant",
});
console.log(result.output);
console.log("Tool calls:", result.usage.toolCalls);
Configuration
From API Key
The simplest way to create a client:
import { ModelRelay } from "@modelrelay/sdk";
// From a secret key (backend use)
const mr = ModelRelay.fromSecretKey("mr_sk_...");
// Or from an API key string
const mrFromEnv = ModelRelay.fromApiKey(process.env.MODELRELAY_API_KEY!);
Full Configuration
For more control over client behavior:
import { ModelRelay, parseSecretKey } from "@modelrelay/sdk";
const mr = new ModelRelay({
key: parseSecretKey("mr_sk_..."),
baseUrl: "https://api.modelrelay.ai", // Optional, defaults to production
timeoutMs: 30_000, // Request timeout
connectTimeoutMs: 10_000, // Connection timeout
retry: {
maxAttempts: 3, // Retry failed requests
},
});
Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| key | string | — | Secret API key (mr_sk_*) |
| baseUrl | string | https://api.modelrelay.ai | API base URL |
| timeoutMs | number | 120000 | Request timeout in milliseconds |
| connectTimeoutMs | number | 10000 | Connection timeout in milliseconds |
| retry.maxAttempts | number | 0 | Number of retry attempts |
Making Requests
Simple Text Response
For the common “system prompt + user message → text” pattern:
const answer = await mr.responses.text(
"claude-sonnet-4-5", // model
"You are a helpful assistant.", // system prompt
"What is 2 + 2?" // user message
);
Request Builder
For more control over request parameters:
const response = await mr.responses.create(
mr.responses
.new()
.model("claude-sonnet-4-5")
.system("You are a helpful assistant.")
.user("What is 2 + 2?")
.maxOutputTokens(256)
.temperature(0.7)
.build()
);
console.log(response.output); // Array of output items
console.log(response.usage); // { input_tokens, output_tokens }
Customer-Attributed Requests
For metered billing, attribute requests to customers:
// Option 1: Use customerId in the request
const req = mr.responses
.new()
.customerId("cust_abc123")
.system("You are a helpful assistant.")
.user("Hello!")
.build();
const response = await mr.responses.create(req);
// Option 2: Use the convenience method
const answer = await mr.responses.textForCustomer({
customerId: "cust_abc123",
system: "You are a helpful assistant.",
user: "Hello!",
});
// Option 3: Create a customer-scoped client
const customer = mr.forCustomer("cust_abc123");
const scopedAnswer = await customer.responses.text(
"You are a helpful assistant.",
"Hello!"
);
Streaming
Stream Text Deltas
For real-time response streaming:
const stream = await mr.responses.streamTextDeltas(
"claude-sonnet-4-5",
"You are a helpful assistant.",
"Write a haiku about programming."
);
for await (const delta of stream) {
process.stdout.write(delta);
}
Full Event Stream
For access to all streaming events:
const req = mr.responses
.new()
.model("claude-sonnet-4-5")
.user("Hello!")
.build();
const stream = await mr.responses.stream(req);
for await (const event of stream) {
switch (event.type) {
case "message_start":
console.log("Started:", event.message.id);
break;
case "message_delta":
if (event.textDelta) {
process.stdout.write(event.textDelta);
}
break;
case "message_complete":
console.log("\nUsage:", event.message.usage);
break;
}
}
Collect Stream to Response
To aggregate an entire stream into a final response object:
const stream = await mr.responses.stream(req);
const response = await stream.collect();
// response is now a complete Response object
console.log(response.output);
console.log(response.usage);
This is useful when you want streaming progress indicators but need the complete response for further processing.
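When you only need text rather than the full event stream, the same pattern works with streamTextDeltas: accumulate chunks yourself while reporting progress. A minimal sketch; collectDeltas is a hypothetical helper, not an SDK export:

```typescript
// Hypothetical helper: drain an async iterable of text deltas, invoking a
// progress callback per chunk, and return the fully accumulated text.
async function collectDeltas(
  deltas: AsyncIterable<string>,
  onDelta?: (chunk: string) => void,
): Promise<string> {
  let text = "";
  for await (const chunk of deltas) {
    onDelta?.(chunk); // e.g. write the chunk to stdout as it arrives
    text += chunk;
  }
  return text;
}
```

Usage with the SDK might look like `await collectDeltas(stream, (d) => process.stdout.write(d))`, where `stream` comes from `mr.responses.streamTextDeltas(...)`.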
Sessions
Sessions manage multi-turn conversations with automatic tool handling.
import { ModelRelay } from "@modelrelay/sdk";
const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_API_KEY!);
const session = mr.sessions.createLocal({
defaultModel: "gpt-5.2",
});
const result = await session.run("Summarize the last meeting.", {
contextManagement: "truncate",
});
console.log(result.output);
Context Management
When contextManagement is "truncate", the SDK trims older messages to fit
within the model’s context window. It fetches model metadata from /models to
derive a default history budget. You can override the budget with
maxHistoryTokens. A model is required (set defaultModel or options.model).
Use reserveOutputTokens to override the output-token reservation when
computing the history budget. If maxHistoryTokens is set, it takes precedence.
const result = await session.run("Continue the plan.", {
contextManagement: "truncate",
maxHistoryTokens: 8_000,
});
"summarize" is reserved for a future release and currently throws an error.
You can also observe truncation decisions:
await session.run("Continue the plan.", {
contextManagement: "truncate",
maxHistoryTokens: 8_000,
onContextTruncate: ({ originalMessages, keptMessages }) => {
console.log(`Trimmed ${originalMessages} → ${keptMessages} messages`);
},
});
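Conceptually, truncation keeps the most recent messages that fit within the budget. A rough illustration only, using a naive ~4-characters-per-token estimate; the SDK derives its budget from model metadata and real token accounting:

```typescript
type Msg = { role: "user" | "assistant" | "system"; content: string };

// Illustrative sketch: walk the history from newest to oldest, keeping
// messages while their estimated token cost fits within the budget.
function truncateHistory(messages: Msg[], maxHistoryTokens: number): Msg[] {
  const kept: Msg[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const estimate = Math.ceil(messages[i].content.length / 4); // ~4 chars/token
    if (used + estimate > maxHistoryTokens) break;
    kept.unshift(messages[i]); // preserve chronological order
    used += estimate;
  }
  return kept;
}
```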
Syncing Local to Remote
You can sync a local session to a remote session for cross-device access,
team collaboration, or server-side backup:
// Work locally for privacy
const local = mr.sessions.createLocal({
defaultModel: "gpt-5.2",
});
await local.run("Implement the authentication feature");
await local.run("Now add the logout button");
// Later, sync to remote for sharing/backup
const remote = await mr.sessions.createRemote();
const result = await local.syncTo(remote, {
onProgress: (synced, total) => {
console.log(`Syncing: ${synced}/${total} messages`);
},
});
console.log(`Synced ${result.messagesSynced} messages to ${result.remoteSessionId}`);
Structured Output
With Zod Schemas
Parse responses into typed objects:
import { ModelRelay } from "@modelrelay/sdk";
import { z } from "zod";
const mr = ModelRelay.fromSecretKey("mr_sk_...");
const Person = z.object({
name: z.string(),
age: z.number(),
});
// Simple one-call API (recommended)
const person = await mr.responses.object<z.infer<typeof Person>>({
model: "claude-sonnet-4-5",
schema: Person,
prompt: "Extract: John Doe is 30 years old",
});
console.log(person.name); // "John Doe"
console.log(person.age); // 30
For parallel structured output:
const [security, performance] = await Promise.all([
mr.responses.object<SecurityReview>({
model: "claude-sonnet-4-5",
schema: SecuritySchema,
system: "You are a security expert.",
prompt: code,
}),
mr.responses.object<PerformanceReview>({
model: "claude-sonnet-4-5",
schema: PerformanceSchema,
system: "You are a performance expert.",
prompt: code,
}),
]);
For metadata (attempts, request ID) or more control:
const result = await mr.responses.objectWithMetadata<z.infer<typeof Person>>({
model: "claude-sonnet-4-5",
schema: Person,
prompt: "Extract: John Doe is 30 years old",
maxRetries: 2,
});
console.log(result.value); // { name: "John Doe", age: 30 }
console.log(result.attempts); // 1 (first try succeeded)
console.log(result.requestId); // Server request ID
Streaming Structured Output
Build progressive UIs that render fields as they complete:
import { z } from "zod";
const Article = z.object({
title: z.string(),
summary: z.string(),
body: z.string(),
});
const stream = await mr.responses.streamStructured(
Article,
mr.responses
.new()
.model("claude-sonnet-4-5")
.user("Write an article about TypeScript")
.build()
);
for await (const event of stream) {
// Render fields as soon as they're complete
if (event.completeFields.has("title")) {
renderTitle(event.payload.title);
}
if (event.completeFields.has("summary")) {
renderSummary(event.payload.summary);
}
// Show streaming preview of incomplete fields
if (!event.completeFields.has("body")) {
renderBodyPreview(event.payload.body + "▋");
}
}
Error Handling
Error Types
The SDK provides typed errors for different failure modes:
import {
ModelRelay,
ModelRelayError,
APIError,
ConfigError,
TransportError,
ErrorCodes,
} from "@modelrelay/sdk";
try {
const answer = await mr.responses.text(
"claude-sonnet-4-5",
"You are helpful.",
"Hello!"
);
} catch (error) {
if (error instanceof APIError) {
// Server returned an error response
console.error(`API error ${error.status}: ${error.message}`);
console.error(`Code: ${error.code}`);
console.error(`Request ID: ${error.requestId}`);
// Check specific error types
if (error.isRateLimit()) {
console.error("Rate limited, retry later");
}
if (error.isUnauthorized()) {
console.error("Invalid API key");
}
} else if (error instanceof TransportError) {
// Network or connection error
console.error(`Transport error: ${error.message}`);
} else if (error instanceof ConfigError) {
// Invalid configuration
console.error(`Config error: ${error.message}`);
} else {
throw error;
}
}
Error Codes
Use error codes for programmatic handling:
import { ErrorCodes } from "@modelrelay/sdk";
if (error instanceof APIError) {
switch (error.code) {
case ErrorCodes.RATE_LIMIT:
// Back off and retry
break;
case ErrorCodes.UNAUTHORIZED:
// Re-authenticate
break;
case ErrorCodes.NOT_FOUND:
// Resource doesn't exist
break;
case ErrorCodes.VALIDATION_ERROR:
// Check request parameters
console.error("Field errors:", error.fields);
break;
}
}
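A common pattern is to wrap calls in a small backoff helper keyed off a retryable check such as isRateLimit(). This is a sketch, not an SDK export; the retryable predicate is up to you:

```typescript
// Hypothetical helper: retry an async operation with exponential backoff
// when the thrown error passes the caller's retryable predicate.
async function withBackoff<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxAttempts || !isRetryable(err)) throw err;
      // Wait 500ms, 1000ms, 2000ms, ... between attempts
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}
```

Used with the SDK, this might look like `await withBackoff(() => mr.ask("claude-sonnet-4-5", "Hi"), (e) => e instanceof APIError && e.isRateLimit())`.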
Customer Management
For customer CRUD operations, use the REST API directly. See Customer API Reference for details.
// Create or update a customer using fetch
const response = await fetch("https://api.modelrelay.ai/api/v1/customers/upsert", {
method: "POST",
headers: {
"Content-Type": "application/json",
"X-ModelRelay-Api-Key": process.env.MODELRELAY_SECRET_KEY!,
},
body: JSON.stringify({
external_id: "your-user-id",
email: "user@example.com",
}),
});
const customer = await response.json();
Tool Use
Define tools that models can call:
import {
ModelRelay,
createFunctionTool,
hasToolCalls,
firstToolCall,
toolResultMessage,
zodToJsonSchema,
} from "@modelrelay/sdk";
import { z } from "zod";
const weatherSchema = z.object({
location: z.string().describe("City name"),
});
const weatherTool = createFunctionTool({
name: "get_weather",
description: "Get current weather for a location",
parameters: zodToJsonSchema(weatherSchema),
});
const response = await mr.responses.create(
mr.responses
.new()
.model("claude-sonnet-4-5")
.tools([weatherTool])
.user("What's the weather in Paris?")
.build()
);
if (hasToolCalls(response)) {
const call = firstToolCall(response);
console.log(call.name); // "get_weather"
console.log(call.arguments); // { location: "Paris" }
// Execute tool and continue conversation
const weatherData = await getWeather(call.arguments.location);
const followUp = await mr.responses.create(
mr.responses
.new()
.model("claude-sonnet-4-5")
.tools([weatherTool])
.fromResponse(response)
.message(toolResultMessage(call.id, JSON.stringify(weatherData)))
.build()
);
}
Handling Multiple Tool Calls
Models may return multiple tool calls in a single response. Iterate over all calls:
import {
hasToolCalls,
toolResultMessage,
assistantMessageWithToolCalls,
asModelId,
} from "@modelrelay/sdk";
import type { ToolCall } from "@modelrelay/sdk";
if (hasToolCalls(response)) {
// Collect all tool calls from the response
const allToolCalls: ToolCall[] = [];
for (const item of response.output || []) {
for (const call of item?.toolCalls || []) {
allToolCalls.push(call);
}
}
// Execute each tool and collect results
const toolResultItems = [];
for (const call of allToolCalls) {
const result = await executeMyTool(call);
toolResultItems.push(toolResultMessage(call.id, JSON.stringify(result)));
}
// Build follow-up request with conversation history
let builder = mr.responses
.new()
.model(asModelId("claude-sonnet-4-5"))
.tools(myTools)
.user("Original user message")
.item(assistantMessageWithToolCalls("", allToolCalls));
// Add all tool results
for (const resultItem of toolResultItems) {
builder = builder.item(resultItem);
}
const followUp = await mr.responses.create(builder.build());
}
For parallel execution, use the ToolRegistry:
import { ToolRegistry, asModelId } from "@modelrelay/sdk";
const registry = new ToolRegistry();
// Handler receives (args, call) - args are already parsed from JSON
registry.register<{ location: string }, WeatherData>(
"get_weather",
async (args) => {
return await getWeather(args.location);
}
);
registry.register<{ timezone: string }, string>(
"get_time",
async (args) => {
return new Date().toLocaleTimeString(args.timezone);
}
);
// Execute all tool calls in parallel
const results = await registry.executeAll(response.output[0].toolCalls);
const messages = registry.resultsToMessages(results);
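Conceptually, parallel execution amounts to mapping each call to its registered handler and awaiting Promise.all. A simplified sketch with hypothetical types, not the registry's actual implementation:

```typescript
type Call = { id: string; name: string; arguments: unknown };
type Handler = (args: unknown) => Promise<unknown>;

// Conceptual sketch: resolve each tool call's handler by name and run all
// handlers concurrently, pairing each result with the originating call id.
async function executeAllSketch(
  handlers: Map<string, Handler>,
  calls: Call[],
): Promise<{ id: string; result: unknown }[]> {
  return Promise.all(
    calls.map(async (call) => {
      const handler = handlers.get(call.name);
      if (!handler) throw new Error(`No handler for tool: ${call.name}`);
      return { id: call.id, result: await handler(call.arguments) };
    }),
  );
}
```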
Tool Loops
For agentic workflows where the model may call tools multiple times, you can use the pure tool loop helper or LocalSession:
import { runToolLoop, createUserMessage, createSystemMessage, asModelId } from "@modelrelay/sdk";
import { z } from "zod";
import fs from "node:fs/promises";
const tools = mr.tools()
.add("read_file", "Read a file", z.object({ path: z.string() }), async (args) => {
return fs.readFile(args.path, "utf-8");
})
.add("write_file", "Write a file", z.object({ path: z.string(), content: z.string() }), async (args) => {
await fs.writeFile(args.path, args.content);
return "File written";
});
const { definitions: toolDefs, registry } = tools.build();
const input = [
createSystemMessage("You are a careful refactor bot."),
createUserMessage("Read config.json and add a version field."),
];
const outcome = await runToolLoop({
client: mr.responses,
input,
tools: toolDefs,
registry,
maxTurns: 25,
buildRequest: (builder) => builder.model(asModelId("claude-sonnet-4-5")),
});
if (outcome.status === "complete") {
console.log(outcome.output);
}
For multi-turn conversations with persistence, use LocalSession with a tool registry:
import {
ModelRelay,
ToolRegistry,
asModelId,
ContextManager,
createModelContextResolver,
} from "@modelrelay/sdk";
import fs from "node:fs/promises";
const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_API_KEY!);
// Define your tools
const registry = new ToolRegistry();
// Handler receives (args, call) - args are already parsed from JSON
registry.register<{ path: string }, string>(
"read_file",
async (args) => {
return await fs.readFile(args.path, "utf-8");
}
);
registry.register<{ path: string; content: string }, string>(
"write_file",
async (args) => {
await fs.writeFile(args.path, args.content);
return "File written successfully";
}
);
// Create a session with automatic tool execution
const session = mr.sessions.createLocal({
defaultModel: asModelId("claude-sonnet-4-5"),
toolRegistry: registry,
persistence: "sqlite", // or "file", "memory"
});
const context = new ContextManager(createModelContextResolver(mr), {
strategy: "truncate",
maxHistoryTokens: 4000,
});
const bounded = mr.sessions.createLocal({
defaultModel: asModelId("claude-sonnet-4-5"),
toolRegistry: registry,
contextManager: context,
persistence: "file",
});
// The session handles the tool loop automatically
const result = await session.run("Read config.json and add a new field 'version': '1.0.0'");
console.log(result.output); // Final text response
console.log(result.usage); // Total token usage
LocalSession will:
- Send your prompt to the model via /responses
- Execute any tool calls the model makes
- Send tool results back to the model
- Repeat until the model responds with text (no more tool calls)
Note: file and sqlite persistence require a Node.js-compatible runtime. SQLite persistence also requires installing the optional better-sqlite3 dependency.
Type Exports
The SDK exports types for all request and response objects:
import type {
// Client types
ModelRelayOptions,
Response,
ModelId,
// Customer types
Customer,
CustomerCreateRequest,
CheckoutSession,
SubscriptionStatus,
// Tier types
Tier,
PriceInterval,
// Error types
FieldError,
ErrorCode,
} from "@modelrelay/sdk";
// Generated types from OpenAPI spec
import { generated } from "@modelrelay/sdk";
type ResponsesResponse = generated.components["schemas"]["ResponsesResponse"];
Next Steps
- First Request — Make your first API call
- Streaming — Real-time response streaming
- Tool Use — Let models call functions
- Structured Output — Get typed JSON responses