Rust SDK

The official Rust SDK for ModelRelay. Provides idiomatic Rust patterns with strong typing, async/await support, and zero-copy streaming.

Installation

Add to your Cargo.toml:

[dependencies]
modelrelay = "5.12.0"
tokio = { version = "1", features = ["full"] }

For blocking (synchronous) usage:

[dependencies]
modelrelay = { version = "5.12.0", features = ["blocking"] }

Quick Start

use modelrelay::{Client, ResponseBuilder};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?
        .build()?;

    let response = ResponseBuilder::new()
        .model("claude-sonnet-4-5")
        .system("You are a helpful assistant.")
        .user("What is the capital of France?")
        .send(&client.responses())
        .await?;

    println!("{}", response.text());
    Ok(())
}

Convenience API

The convenience API is the simplest way to get started. Three methods cover the most common use cases:

Ask — Get a Quick Answer

use modelrelay::Client;

let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?.build()?;

let answer = client.ask("claude-sonnet-4-5", "What is 2 + 2?", None).await?;
println!("{}", answer); // "4"

Chat — Full Response with Metadata

use modelrelay::{Client, ChatOptions};

let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?.build()?;

let response = client.chat(
    "claude-sonnet-4-5",
    "Explain quantum computing",
    Some(ChatOptions::new().with_system("You are a physics professor")),
).await?;

println!("{}", response.text());
println!("Tokens: {}", response.usage.total_tokens);

Agent — Agentic Tool Loops

Run an agent that automatically executes tools until completion:

use modelrelay::{Client, AgentOptions, ToolBuilder};
use schemars::JsonSchema;
use serde::Deserialize;

#[derive(JsonSchema, Deserialize)]
struct ReadFileArgs {
    /// File path to read
    path: String,
}

let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?.build()?;

let tools = ToolBuilder::new()
    .add_sync::<ReadFileArgs, _>("read_file", "Read a file", |args, _call| {
        let content = std::fs::read_to_string(&args.path)
            .map_err(|e| e.to_string())?;
        Ok(serde_json::json!({ "content": content }))
    });

let result = client.agent(
    "claude-sonnet-4-5",
    AgentOptions::new(tools, "Read config.json and summarize it")
        .with_system("You are a helpful file assistant"),
).await?;

println!("{}", result.output);
println!("Tool calls: {}", result.usage.tool_calls);

Configuration

From API Key

use modelrelay::Client;

// From secret key (backend use)
let client = Client::from_secret_key("mr_sk_...")?
    .build()?;

// Auto-detect key type from environment
let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?
    .build()?;

From Bearer Token

For customer-scoped access using a minted token:

let client = Client::from_token("eyJ...")?
    .build()?;

Configuration Options

Use the builder pattern to configure the client:

use std::time::Duration;

let client = Client::from_api_key(api_key)?
    .base_url("https://custom.api.com")
    .connect_timeout(Duration::from_secs(5))
    .request_timeout(Duration::from_secs(30))
    .build()?;

Method               Default                            Description
base_url(url)        https://api.modelrelay.ai/api/v1   API base URL
connect_timeout(d)   5s                                 Connection timeout
request_timeout(d)   60s                                Request timeout

Making Requests

ResponseBuilder

The ResponseBuilder provides a fluent API for constructing requests:

use modelrelay::ResponseBuilder;

let response = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .system("You are a helpful assistant.")
    .user("What is 2 + 2?")
    .max_output_tokens(256)
    .temperature(0.7)
    .send(&client.responses())
    .await?;

println!("{}", response.text());
println!("Tokens: {}", response.usage.total_tokens);

Multi-Turn Conversations

Build conversations with multiple messages:

let response = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .system("You are a helpful assistant.")
    .user("My name is Alice.")
    .assistant("Hello Alice! How can I help you today?")
    .user("What's my name?")
    .send(&client.responses())
    .await?;

Customer-Attributed Requests

For metered billing, attribute requests to customers:

let response = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .customer_id("customer-123")
    .system("You are helpful.")
    .user("Hello!")
    .send(&client.responses())
    .await?;

Streaming

Stream Text Deltas

For real-time response streaming with just the text:

use futures_util::StreamExt;

let mut stream = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .system("You are a helpful assistant.")
    .user("Write a haiku about programming.")
    .stream_deltas(&client.responses())
    .await?;

while let Some(delta) = stream.next().await {
    let delta = delta?;
    print!("{}", delta);
}
println!();

Full Event Stream

For access to all streaming events:

use futures_util::StreamExt;
use modelrelay::StreamEvent;

let mut stream = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .system("You are helpful.")
    .user("Hello!")
    .stream(&client.responses())
    .await?;

while let Some(event) = stream.next().await {
    let event = event?;
    match event {
        StreamEvent::Delta { text, .. } => print!("{}", text),
        StreamEvent::Done { usage, .. } => {
            println!("\nTokens used: {}", usage.total_tokens);
        }
        _ => {}
    }
}

Structured Output

Parse to Typed Struct

Use schemars to generate JSON schemas and parse responses into typed Rust structs:

use modelrelay::ResponseBuilder;
use schemars::JsonSchema;
use serde::Deserialize;

#[derive(Debug, Deserialize, JsonSchema)]
struct Person {
    name: String,
    age: u32,
}

let result: Person = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Extract: John Doe is 30 years old")
    .structured(&client.responses())
    .await?;

println!("Name: {}, Age: {}", result.name, result.age);

The SDK automatically:

  1. Generates a JSON schema from your struct’s JsonSchema derive
  2. Instructs the model to output valid JSON
  3. Parses and validates the response
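
To make those steps concrete, here is a rough hand-rolled equivalent. This is a sketch, not the SDK's actual internals: the schemars schema_for! macro and the serde_json parse are standard, the exact prompt wording the SDK injects is an assumption, and a built client is assumed to be in scope.

use modelrelay::ResponseBuilder;
use schemars::{schema_for, JsonSchema};
use serde::Deserialize;

#[derive(Debug, Deserialize, JsonSchema)]
struct Person {
    name: String,
    age: u32,
}

// 1. Generate a JSON schema from the JsonSchema derive.
let schema = serde_json::to_string_pretty(&schema_for!(Person))?;

// 2. Instruct the model to reply with JSON matching that schema
//    (the SDK's actual prompt wording may differ).
let prompt = format!("Reply with JSON only, matching this schema:\n{}", schema);
let response = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .system(prompt.as_str())
    .user("Extract: John Doe is 30 years old")
    .send(&client.responses())
    .await?;

// 3. Parse and validate the reply into the typed struct.
let person: Person = serde_json::from_str(&response.text())?;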

Complex Types

Structured output works with enums, nested structs, and optional fields:

#[derive(Debug, Deserialize, JsonSchema)]
struct Analysis {
    sentiment: Sentiment,
    confidence: f64,
    keywords: Vec<String>,
    summary: Option<String>,
}

#[derive(Debug, Deserialize, JsonSchema)]
enum Sentiment {
    Positive,
    Negative,
    Neutral,
}

let analysis: Analysis = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Analyze: I love this product!")
    .structured(&client.responses())
    .await?;

Tool Use

Defining Tools

Use the #[tool] attribute macro to define tools from functions:

use modelrelay::{tool, ToolRegistry};

#[tool(description = "Get the current weather for a location")]
fn get_weather(
    #[doc = "City name"] city: String,
    #[doc = "Temperature unit"] unit: Option<String>,
) -> String {
    format!("Weather in {}: 72°F, sunny", city)
}

#[tool(description = "Search the web for information")]
fn web_search(
    #[doc = "Search query"] query: String,
) -> String {
    format!("Results for '{}': ...", query)
}

Tool Registry

Register tools and let the model call them:

let mut registry = ToolRegistry::new();
registry.register(get_weather);
registry.register(web_search);

let response = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .system("You have access to weather and search tools.")
    .user("What's the weather in Tokyo?")
    .tools(&registry)
    .send(&client.responses())
    .await?;

// Check if the model wants to call a tool
if let Some(tool_call) = response.tool_call() {
    let result = registry.call(&tool_call)?;

    // Continue conversation with tool result
    let final_response = ResponseBuilder::new()
        .model("claude-sonnet-4-5")
        .messages(response.messages())
        .tool_result(tool_call.id(), &result)
        .send(&client.responses())
        .await?;

    println!("{}", final_response.text());
}
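
The agent convenience method shown earlier automates this loop. For manual control, a multi-round version might look like the sketch below; it assumes response.messages() carries the conversation so far, including the assistant's tool-call turn.

let mut response = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .system("You have access to weather and search tools.")
    .user("What's the weather in Tokyo?")
    .tools(&registry)
    .send(&client.responses())
    .await?;

// Keep executing tools until the model stops asking for them.
while let Some(tool_call) = response.tool_call() {
    let result = registry.call(&tool_call)?;

    response = ResponseBuilder::new()
        .model("claude-sonnet-4-5")
        .messages(response.messages())
        .tool_result(tool_call.id(), &result)
        // Keep the tools attached so the model can request further calls.
        .tools(&registry)
        .send(&client.responses())
        .await?;
}

println!("{}", response.text());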

Error Handling

Result Types

All SDK methods return Result with typed errors:

use modelrelay::{Client, ResponseBuilder, Error};

async fn make_request() -> Result<(), Error> {
    let api_key = std::env::var("MODELRELAY_API_KEY")
        .expect("MODELRELAY_API_KEY must be set");
    let client = Client::from_api_key(api_key)?
        .build()?;

    let response = ResponseBuilder::new()
        .model("claude-sonnet-4-5")
        .user("Hello!")
        .send(&client.responses())
        .await?;

    Ok(())
}

Error Variants

Handle specific error types:

use modelrelay::Error;

match result {
    Ok(response) => println!("{}", response.text()),
    Err(Error::Api { status, code, message, .. }) => {
        eprintln!("API error {}: {} - {}", status, code, message);

        match code.as_str() {
            "rate_limit_exceeded" => {
                // Back off and retry
            }
            "unauthorized" => {
                // Re-authenticate
            }
            "payment_required" => {
                // Customer quota exceeded
            }
            _ => {}
        }
    }
    Err(Error::Transport { message, .. }) => {
        eprintln!("Network error: {}", message);
    }
    Err(Error::Config { reason, .. }) => {
        eprintln!("Configuration error: {}", reason);
    }
    Err(e) => eprintln!("Other error: {}", e),
}

Retry with Backoff

Implement exponential backoff for transient errors:

use modelrelay::Error;
use std::time::Duration;
use tokio::time::sleep;

async fn with_retry<F, T, Fut>(mut f: F, max_attempts: u32) -> Result<T, Error>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, Error>>,
{
    let mut attempt = 0;
    loop {
        match f().await {
            Ok(result) => return Ok(result),
            Err(Error::Api { code, retry_after, .. })
                if code == "rate_limit_exceeded" && attempt < max_attempts =>
            {
                let delay = retry_after.unwrap_or(Duration::from_secs(1 << attempt));
                sleep(delay).await;
                attempt += 1;
            }
            Err(e) => return Err(e),
        }
    }
}
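
A possible call site, assuming a built client is in scope and that ask returns a future resolving to Result<String, Error>:

// Each invocation of the closure produces a fresh request future for the retry helper.
let answer = with_retry(
    || client.ask("claude-sonnet-4-5", "What is 2 + 2?", None),
    3,
).await?;
println!("{}", answer);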

Blocking API

For synchronous code, enable the blocking feature:

[dependencies]
modelrelay = { version = "5.12.0", features = ["blocking"] }

Then use the blocking client:

use modelrelay::blocking::{Client, ResponseBuilder};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?
        .build()?;

    let response = ResponseBuilder::new()
        .model("claude-sonnet-4-5")
        .system("You are helpful.")
        .user("Hello!")
        .send(&client.responses())?;

    println!("{}", response.text());
    Ok(())
}

Blocking Streaming

use modelrelay::blocking::{Client, ResponseBuilder};

let client = Client::from_api_key(api_key)?.build()?;

let stream = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Write a story.")
    .stream_deltas(&client.responses())?;

for delta in stream {
    print!("{}", delta?);
}

Customer Management

Manage customers for billing and usage tracking:

use modelrelay::{CustomerSubscribeRequest, CustomerUpsertRequest};

// Create or update a customer
let customer = client.customers()
    .upsert(CustomerUpsertRequest {
        external_id: Some("your-user-id".into()),
        email: Some("user@example.com".into()),
        ..Default::default()
    })
    .await?;

// Create a Stripe checkout session
let session = client.customers()
    .subscribe(customer.customer.id, CustomerSubscribeRequest {
        tier_id: tier_uuid,
        success_url: "https://myapp.com/success".to_string(),
        cancel_url: "https://myapp.com/cancel".to_string(),
    })
    .await?;

// Redirect user to session.url for payment

// Check subscription status
let status = client.customers()
    .get_subscription(customer.customer.id)
    .await?;

println!("Active: {}", status.active);

Feature Flags

Feature      Description
blocking     Enable synchronous/blocking API
rustls       Use rustls instead of native TLS (default)
native-tls   Use platform native TLS

Example with multiple features:

[dependencies]
modelrelay = { version = "5.12.0", features = ["blocking", "native-tls"] }
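
To opt out of the default rustls backend entirely, Cargo's default-features = false can be combined with an explicit backend. Whether this is required depends on how the crate wires its default features, so treat this as a sketch:

[dependencies]
modelrelay = { version = "5.12.0", default-features = false, features = ["blocking", "native-tls"] }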

Next Steps