TypeScript SDK
The official TypeScript SDK for ModelRelay. Works with Node.js, Bun, Deno, and modern browsers.
Installation
npm install @modelrelay/sdk
Or with other package managers:
bun add @modelrelay/sdk
pnpm add @modelrelay/sdk
yarn add @modelrelay/sdk
Quick Start
import { ModelRelay } from "@modelrelay/sdk";
const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_API_KEY!);
const answer = await mr.responses.text(
"claude-sonnet-4-5",
"You are a helpful assistant.",
"What is the capital of France?"
);
console.log(answer);
// "The capital of France is Paris."
Convenience API
The simplest way to get started. Three methods cover the most common use cases:
Ask — Get a Quick Answer
import { ModelRelay } from "@modelrelay/sdk";
const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_API_KEY!);
const answer = await mr.ask("claude-sonnet-4-5", "What is 2 + 2?");
console.log(answer); // "4"
Chat — Full Response with Metadata
const response = await mr.chat("claude-sonnet-4-5", "Explain quantum computing", {
system: "You are a physics professor",
});
console.log(response.output);
console.log("Tokens:", response.usage.totalTokens);
Agent — Agentic Tool Loops
Run an agent that automatically executes tools until completion:
import { z } from "zod";
import fs from "node:fs/promises";
const tools = mr
.tools()
.add(
"read_file",
"Read a file from the filesystem",
z.object({ path: z.string().describe("File path to read") }),
async (args) => {
const content = await fs.readFile(args.path, "utf-8");
return content;
}
);
const result = await mr.agent("claude-sonnet-4-5", {
tools,
prompt: "Read config.json and summarize it",
system: "You are a helpful file assistant",
});
console.log(result.output);
console.log("Tool calls:", result.usage.toolCalls);
Configuration
From API Key
The simplest way to create a client:
import { ModelRelay } from "@modelrelay/sdk";
// From a secret key (backend use)
const mr = ModelRelay.fromSecretKey("mr_sk_...");
// Or from an API key string
const mrFromEnv = ModelRelay.fromApiKey(process.env.MODELRELAY_API_KEY!);
Full Configuration
For more control over client behavior:
import { ModelRelay, parseSecretKey } from "@modelrelay/sdk";
const mr = new ModelRelay({
key: parseSecretKey("mr_sk_..."),
baseUrl: "https://api.modelrelay.ai", // Optional, defaults to production
timeoutMs: 30_000, // Request timeout
connectTimeoutMs: 10_000, // Connection timeout
retry: {
maxAttempts: 3, // Retry failed requests
},
});
Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| key | string | — | Secret API key (mr_sk_*) |
| baseUrl | string | https://api.modelrelay.ai | API base URL |
| timeoutMs | number | 120000 | Request timeout in milliseconds |
| connectTimeoutMs | number | 10000 | Connection timeout in milliseconds |
| retry.maxAttempts | number | 0 | Number of retry attempts |
Making Requests
Simple Text Response
For the common “system prompt + user message → text” pattern:
const answer = await mr.responses.text(
"claude-sonnet-4-5", // model
"You are a helpful assistant.", // system prompt
"What is 2 + 2?" // user message
);
Request Builder
For more control over request parameters:
const response = await mr.responses.create(
mr.responses
.new()
.model("claude-sonnet-4-5")
.system("You are a helpful assistant.")
.user("What is 2 + 2?")
.maxOutputTokens(256)
.temperature(0.7)
.build()
);
console.log(response.output); // Array of output items
console.log(response.usage); // { input_tokens, output_tokens }
Customer-Attributed Requests
For metered billing, attribute requests to customers:
// Option 1: Use customerId in the request
const req = mr.responses
.new()
.customerId("cust_abc123")
.system("You are a helpful assistant.")
.user("Hello!")
.build();
const response = await mr.responses.create(req);
// Option 2: Use the convenience method
const answer = await mr.responses.textForCustomer({
customerId: "cust_abc123",
system: "You are a helpful assistant.",
user: "Hello!",
});
// Option 3: Create a customer-scoped client
const customer = mr.forCustomer("cust_abc123");
const scopedAnswer = await customer.responses.text(
"You are a helpful assistant.",
"Hello!"
);
Streaming
Stream Text Deltas
For real-time response streaming:
const stream = await mr.responses.streamTextDeltas(
"claude-sonnet-4-5",
"You are a helpful assistant.",
"Write a haiku about programming."
);
for await (const delta of stream) {
process.stdout.write(delta);
}
Full Event Stream
For access to all streaming events:
const req = mr.responses
.new()
.model("claude-sonnet-4-5")
.user("Hello!")
.build();
const stream = await mr.responses.stream(req);
for await (const event of stream) {
switch (event.type) {
case "message_start":
console.log("Started:", event.message.id);
break;
case "message_delta":
if (event.textDelta) {
process.stdout.write(event.textDelta);
}
break;
case "message_complete":
console.log("\nUsage:", event.message.usage);
break;
}
}
Collect Stream to Response
To aggregate an entire stream into a final response object:
const stream = await mr.responses.stream(req);
const response = await stream.collect();
// response is now a complete Response object
console.log(response.output);
console.log(response.usage);
This is useful when you want streaming progress indicators but need the complete response for further processing.
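When you only need text rather than the full event stream, the same pattern works with streamTextDeltas: accumulate chunks yourself while reporting progress. A minimal sketch; collectDeltas is a hypothetical helper, not an SDK export:

```typescript
// Hypothetical helper: drain an async iterable of text deltas, invoking a
// progress callback per chunk, and return the fully accumulated text.
async function collectDeltas(
  deltas: AsyncIterable<string>,
  onDelta?: (chunk: string) => void,
): Promise<string> {
  let text = "";
  for await (const chunk of deltas) {
    onDelta?.(chunk); // e.g. write the chunk to stdout as it arrives
    text += chunk;
  }
  return text;
}
```

Usage with the SDK might look like `await collectDeltas(stream, (d) => process.stdout.write(d))`, where `stream` comes from `mr.responses.streamTextDeltas(...)`.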
Sessions
Sessions manage multi-turn conversations with automatic tool handling.
import { ModelRelay } from "@modelrelay/sdk";
const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_API_KEY!);
const session = mr.sessions.createLocal({
defaultModel: "gpt-5.2",
});
const result = await session.run("Summarize the last meeting.", {
contextManagement: "truncate",
});
console.log(result.output);
Context Management
When contextManagement is "truncate", the SDK trims older messages to fit
within the model’s context window. It fetches model metadata from /models to
derive a default history budget. You can override the budget with
maxHistoryTokens. A model is required (set defaultModel or options.model).
Use reserveOutputTokens to override the output-token reservation when
computing the history budget. If maxHistoryTokens is set, it takes precedence.
const result = await session.run("Continue the plan.", {
contextManagement: "truncate",
maxHistoryTokens: 8_000,
});
"summarize" is reserved for a future release and currently throws an error.
You can also observe truncation decisions:
await session.run("Continue the plan.", {
contextManagement: "truncate",
maxHistoryTokens: 8_000,
onContextTruncate: ({ originalMessages, keptMessages }) => {
console.log(`Trimmed ${originalMessages} → ${keptMessages} messages`);
},
});
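Conceptually, truncation keeps the most recent messages that fit within the budget. A rough illustration only, using a naive ~4-characters-per-token estimate; the SDK derives its budget from model metadata and real token accounting:

```typescript
type Msg = { role: "user" | "assistant" | "system"; content: string };

// Illustrative sketch: walk the history from newest to oldest, keeping
// messages while their estimated token cost fits within the budget.
function truncateHistory(messages: Msg[], maxHistoryTokens: number): Msg[] {
  const kept: Msg[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const estimate = Math.ceil(messages[i].content.length / 4); // ~4 chars/token
    if (used + estimate > maxHistoryTokens) break;
    kept.unshift(messages[i]); // preserve chronological order
    used += estimate;
  }
  return kept;
}
```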
Syncing Local to Remote
You can sync a local session to a remote session for cross-device access,
team collaboration, or server-side backup:
// Work locally for privacy
const local = mr.sessions.createLocal({
defaultModel: "gpt-5.2",
});
await local.run("Implement the authentication feature");
await local.run("Now add the logout button");
// Later, sync to remote for sharing/backup
const remote = await mr.sessions.createRemote();
const result = await local.syncTo(remote, {
onProgress: (synced, total) => {
console.log(`Syncing: ${synced}/${total} messages`);
},
});
console.log(`Synced ${result.messagesSynced} messages to ${result.remoteSessionId}`);
Structured Output
With Zod Schemas
Parse responses into typed objects:
import { ModelRelay } from "@modelrelay/sdk";
import { z } from "zod";
const mr = ModelRelay.fromSecretKey("mr_sk_...");
const Person = z.object({
name: z.string(),
age: z.number(),
});
// Simple one-call API (recommended)
const person = await mr.responses.object<z.infer<typeof Person>>({
model: "claude-sonnet-4-5",
schema: Person,
prompt: "Extract: John Doe is 30 years old",
});
console.log(person.name); // "John Doe"
console.log(person.age); // 30
For parallel structured output:
const [security, performance] = await Promise.all([
mr.responses.object<SecurityReview>({
model: "claude-sonnet-4-5",
schema: SecuritySchema,
system: "You are a security expert.",
prompt: code,
}),
mr.responses.object<PerformanceReview>({
model: "claude-sonnet-4-5",
schema: PerformanceSchema,
system: "You are a performance expert.",
prompt: code,
}),
]);
For metadata (attempts, request ID) or more control:
const result = await mr.responses.objectWithMetadata<z.infer<typeof Person>>({
model: "claude-sonnet-4-5",
schema: Person,
prompt: "Extract: John Doe is 30 years old",
maxRetries: 2,
});
console.log(result.value); // { name: "John Doe", age: 30 }
console.log(result.attempts); // 1 (first try succeeded)
console.log(result.requestId); // Server request ID
Streaming Structured Output
Build progressive UIs that render fields as they complete:
import { z } from "zod";
const Article = z.object({
title: z.string(),
summary: z.string(),
body: z.string(),
});
const stream = await mr.responses.streamStructured(
Article,
mr.responses
.new()
.model("claude-sonnet-4-5")
.user("Write an article about TypeScript")
.build()
);
for await (const event of stream) {
// Render fields as soon as they're complete
if (event.completeFields.has("title")) {
renderTitle(event.payload.title);
}
if (event.completeFields.has("summary")) {
renderSummary(event.payload.summary);
}
// Show streaming preview of incomplete fields
if (!event.completeFields.has("body")) {
renderBodyPreview(event.payload.body + "▋");
}
}
Error Handling
Error Types
The SDK provides typed errors for different failure modes:
import {
ModelRelay,
ModelRelayError,
APIError,
ConfigError,
TransportError,
ErrorCodes,
} from "@modelrelay/sdk";
try {
const answer = await mr.responses.text(
"claude-sonnet-4-5",
"You are helpful.",
"Hello!"
);
} catch (error) {
if (error instanceof APIError) {
// Server returned an error response
console.error(`API error ${error.status}: ${error.message}`);
console.error(`Code: ${error.code}`);
console.error(`Request ID: ${error.requestId}`);
// Check specific error types
if (error.isRateLimit()) {
console.error("Rate limited, retry later");
}
if (error.isUnauthorized()) {
console.error("Invalid API key");
}
} else if (error instanceof TransportError) {
// Network or connection error
console.error(`Transport error: ${error.message}`);
} else if (error instanceof ConfigError) {
// Invalid configuration
console.error(`Config error: ${error.message}`);
} else {
throw error;
}
}
Error Codes
Use error codes for programmatic handling:
import { ErrorCodes } from "@modelrelay/sdk";
if (error instanceof APIError) {
switch (error.code) {
case ErrorCodes.RATE_LIMIT:
// Back off and retry
break;
case ErrorCodes.UNAUTHORIZED:
// Re-authenticate
break;
case ErrorCodes.NOT_FOUND:
// Resource doesn't exist
break;
case ErrorCodes.VALIDATION_ERROR:
// Check request parameters
console.error("Field errors:", error.fields);
break;
}
}
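A common pattern is to wrap calls in a small backoff helper keyed off a retryable check such as isRateLimit(). This is a sketch, not an SDK export; the retryable predicate is up to you:

```typescript
// Hypothetical helper: retry an async operation with exponential backoff
// when the thrown error passes the caller's retryable predicate.
async function withBackoff<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxAttempts || !isRetryable(err)) throw err;
      // Wait 500ms, 1000ms, 2000ms, ... between attempts
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}
```

Used with the SDK, this might look like `await withBackoff(() => mr.ask("claude-sonnet-4-5", "Hi"), (e) => e instanceof APIError && e.isRateLimit())`.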
Customer Management
For customer CRUD operations, use the REST API directly. See Customer API Reference for details.
// Create or update a customer using fetch
const response = await fetch("https://api.modelrelay.ai/api/v1/customers/upsert", {
method: "POST",
headers: {
"Content-Type": "application/json",
"X-ModelRelay-Api-Key": process.env.MODELRELAY_SECRET_KEY!,
},
body: JSON.stringify({
external_id: "your-user-id",
email: "user@example.com",
}),
});
const customer = await response.json();
Tool Use
Define tools that models can call:
import {
ModelRelay,
createFunctionTool,
hasToolCalls,
firstToolCall,
toolResultMessage,
zodToJsonSchema,
} from "@modelrelay/sdk";
import { z } from "zod";
const weatherSchema = z.object({
location: z.string().describe("City name"),
});
const weatherTool = createFunctionTool({
name: "get_weather",
description: "Get current weather for a location",
parameters: zodToJsonSchema(weatherSchema),
});
const response = await mr.responses.create(
mr.responses
.new()
.model("claude-sonnet-4-5")
.tools([weatherTool])
.user("What's the weather in Paris?")
.build()
);
if (hasToolCalls(response)) {
const call = firstToolCall(response);
console.log(call.name); // "get_weather"
console.log(call.arguments); // { location: "Paris" }
// Execute tool and continue conversation
const weatherData = await getWeather(call.arguments.location);
const followUp = await mr.responses.create(
mr.responses
.new()
.model("claude-sonnet-4-5")
.tools([weatherTool])
.fromResponse(response)
.message(toolResultMessage(call.id, JSON.stringify(weatherData)))
.build()
);
}
Handling Multiple Tool Calls
Models may return multiple tool calls in a single response. Iterate over all calls:
import {
hasToolCalls,
toolResultMessage,
assistantMessageWithToolCalls,
asModelId,
} from "@modelrelay/sdk";
import type { ToolCall } from "@modelrelay/sdk";
if (hasToolCalls(response)) {
// Collect all tool calls from the response
const allToolCalls: ToolCall[] = [];
for (const item of response.output || []) {
for (const call of item?.toolCalls || []) {
allToolCalls.push(call);
}
}
// Execute each tool and collect results
const toolResultItems = [];
for (const call of allToolCalls) {
const result = await executeMyTool(call);
toolResultItems.push(toolResultMessage(call.id, JSON.stringify(result)));
}
// Build follow-up request with conversation history
let builder = mr.responses
.new()
.model(asModelId("claude-sonnet-4-5"))
.tools(myTools)
.user("Original user message")
.item(assistantMessageWithToolCalls("", allToolCalls));
// Add all tool results
for (const resultItem of toolResultItems) {
builder = builder.item(resultItem);
}
const followUp = await mr.responses.create(builder.build());
}
For parallel execution, use the ToolRegistry:
import { ToolRegistry, asModelId } from "@modelrelay/sdk";
const registry = new ToolRegistry();
// Handler receives (args, call) - args are already parsed from JSON
registry.register<{ location: string }, WeatherData>(
"get_weather",
async (args) => {
return await getWeather(args.location);
}
);
registry.register<{ timezone: string }, string>(
"get_time",
async (args) => {
return new Date().toLocaleTimeString(args.timezone);
}
);
// Execute all tool calls in parallel
const results = await registry.executeAll(response.output[0].toolCalls);
const messages = registry.resultsToMessages(results);
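Conceptually, parallel execution amounts to mapping each call to its registered handler and awaiting Promise.all. A simplified sketch with hypothetical types, not the registry's actual implementation:

```typescript
type Call = { id: string; name: string; arguments: unknown };
type Handler = (args: unknown) => Promise<unknown>;

// Conceptual sketch: resolve each tool call's handler by name and run all
// handlers concurrently, pairing each result with the originating call id.
async function executeAllSketch(
  handlers: Map<string, Handler>,
  calls: Call[],
): Promise<{ id: string; result: unknown }[]> {
  return Promise.all(
    calls.map(async (call) => {
      const handler = handlers.get(call.name);
      if (!handler) throw new Error(`No handler for tool: ${call.name}`);
      return { id: call.id, result: await handler(call.arguments) };
    }),
  );
}
```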
Tool Loops
For agentic workflows where the model may call tools multiple times, you can use the pure tool loop helper or LocalSession:
import { runToolLoop, createUserMessage, createSystemMessage, asModelId } from "@modelrelay/sdk";
import { z } from "zod";
import fs from "node:fs/promises";
const tools = mr.tools()
.add("read_file", "Read a file", z.object({ path: z.string() }), async (args) => {
return fs.readFile(args.path, "utf-8");
})
.add("write_file", "Write a file", z.object({ path: z.string(), content: z.string() }), async (args) => {
await fs.writeFile(args.path, args.content);
return "File written";
});
const { definitions: toolDefs, registry } = tools.build();
const input = [
createSystemMessage("You are a careful refactor bot."),
createUserMessage("Read config.json and add a version field."),
];
const outcome = await runToolLoop({
client: mr.responses,
input,
tools: toolDefs,
registry,
maxTurns: 25,
buildRequest: (builder) => builder.model(asModelId("claude-sonnet-4-5")),
});
if (outcome.status === "complete") {
console.log(outcome.output);
}
For multi-turn conversations with persistence, use LocalSession with a tool registry:
import {
ModelRelay,
ToolRegistry,
asModelId,
ContextManager,
createModelContextResolver,
} from "@modelrelay/sdk";
import fs from "node:fs/promises";
const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_API_KEY!);
// Define your tools
const registry = new ToolRegistry();
// Handler receives (args, call) - args are already parsed from JSON
registry.register<{ path: string }, string>(
"read_file",
async (args) => {
return await fs.readFile(args.path, "utf-8");
}
);
registry.register<{ path: string; content: string }, string>(
"write_file",
async (args) => {
await fs.writeFile(args.path, args.content);
return "File written successfully";
}
);
// Create a session with automatic tool execution
const session = mr.sessions.createLocal({
defaultModel: asModelId("claude-sonnet-4-5"),
toolRegistry: registry,
persistence: "sqlite", // or "file", "memory"
});
const context = new ContextManager(createModelContextResolver(mr), {
strategy: "truncate",
maxHistoryTokens: 4000,
});
const bounded = mr.sessions.createLocal({
defaultModel: asModelId("claude-sonnet-4-5"),
toolRegistry: registry,
contextManager: context,
persistence: "file",
});
// The session handles the tool loop automatically
const result = await session.run("Read config.json and add a new field 'version': '1.0.0'");
console.log(result.output); // Final text response
console.log(result.usage); // Total token usage
LocalSession will:
- Send your prompt to the model via /responses
- Execute any tool calls the model makes
- Send tool results back to the model
- Repeat until the model responds with text (no more tool calls)
Note: file and sqlite persistence require a Node.js-compatible runtime. SQLite persistence also requires installing the optional better-sqlite3 dependency.
Type Exports
The SDK exports types for all request and response objects:
import type {
// Client types
ModelRelayOptions,
Response,
ModelId,
// Customer types
Customer,
CustomerCreateRequest,
CheckoutSession,
SubscriptionStatus,
// Tier types
Tier,
PriceInterval,
// Error types
FieldError,
ErrorCode,
} from "@modelrelay/sdk";
// Generated types from OpenAPI spec
import { generated } from "@modelrelay/sdk";
type ResponsesResponse = generated.components["schemas"]["ResponsesResponse"];
Next Steps
- First Request — Make your first API call
- Streaming — Real-time response streaming
- Tool Use — Let models call functions
- Structured Output — Get typed JSON responses