Responses

The /responses endpoint is the primary way to generate AI model responses through ModelRelay.

Create a Response

POST /api/v1/responses

Authentication

Requires either:

  • Secret key (mr_sk_*): for backend use, with full project access
  • Customer bearer token: a customer-scoped token for data-plane access

Publishable keys (mr_pk_*) cannot call this endpoint directly.

Request Body

Field Type Required Description
input array Yes Array of input messages
model string No Model identifier (e.g., claude-sonnet-4-20250514)
provider string No Provider override (anthropic, openai, xai, google-ai-studio)
output_format object No Response format (text or json_schema)
max_output_tokens integer No Maximum tokens to generate
temperature number No Sampling temperature (0-2)
stop array No Stop sequences (max 8, each max 128 chars)
tools array No Available tools for the model
tool_choice object No Tool selection behavior
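
The limits on `stop` (at most 8 sequences, each at most 128 characters) can be checked client-side before sending. A minimal sketch assembling a request body with the optional sampling fields; the `validateStop` helper is illustrative, not part of any SDK:

```typescript
// Illustrative client-side check mirroring the documented limits:
// at most 8 stop sequences, each at most 128 characters.
function validateStop(stop: string[]): void {
  if (stop.length > 8) {
    throw new Error(`too many stop sequences: ${stop.length} (max 8)`);
  }
  for (const s of stop) {
    if (s.length > 128) {
      throw new Error(`stop sequence too long: ${s.length} chars (max 128)`);
    }
  }
}

// A request body using the optional fields from the table above.
const body = {
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "Summarize this in one line." }],
    },
  ],
  max_output_tokens: 256,
  temperature: 0.7,
  stop: ["\n\n"],
};

validateStop(body.stop);
```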

Input Messages

Each input message has the following structure:

Field Type Required Description
type string Yes Always "message"
role string Yes system, user, assistant, or tool
content array No Array of content parts
tool_calls array No Tool calls (for assistant messages)
tool_call_id string No Tool call ID (for tool result messages)
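
The `tool_calls` and `tool_call_id` fields tie a tool round trip together across turns. A hedged sketch of a multi-turn `input` array (the tool-call ID and the weather payload are illustrative values, not API output):

```typescript
// Illustrative multi-turn input: a user question, an assistant turn
// that called a tool, and the tool result keyed by tool_call_id.
const input = [
  {
    type: "message",
    role: "user",
    content: [{ type: "text", text: "What is the weather in London?" }],
  },
  {
    type: "message",
    role: "assistant",
    tool_calls: [
      {
        id: "call_123", // illustrative ID
        type: "function",
        function: {
          name: "get_weather",
          arguments: JSON.stringify({ location: "London" }),
        },
      },
    ],
  },
  {
    type: "message",
    role: "tool",
    tool_call_id: "call_123", // must match the assistant's tool call
    content: [{ type: "text", text: JSON.stringify({ temp_c: 14 }) }],
  },
];
```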

Content Parts

Field Type Required Description
type string Yes Currently "text"
text string No Text content

Output Format

Field Type Required Description
type string Yes text or json_schema
json_schema object No JSON schema definition (required when type is json_schema)

JSON Schema Format

Field Type Required Description
name string Yes Schema name
description string No Schema description
schema object Yes JSON Schema object
strict boolean No Enable strict mode

Tools

Field Type Required Description
type string Yes function, web, x_search, code_execution, or image_generation
function object No Function definition
web object No Web search configuration
x_search object No X search configuration (Grok only)
code_execution object No Code execution configuration
image_generation object No Image generation configuration

Function Tool

Field Type Required Description
name string Yes Function name (dot-separated lowercase, e.g., fs.search)
description string No Function description
parameters object No JSON Schema for parameters
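
The exact server-side validation rule for function names is not spelled out here; a sketch that matches the stated convention (dot-separated lowercase, as in `fs.search`) using an assumed regex:

```typescript
// Assumed pattern for the documented convention: lowercase,
// dot-separated segments, e.g. "fs.search" or "get_weather".
// The server's actual rule may be stricter or looser.
const FUNCTION_NAME = /^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)*$/;

function isValidFunctionName(name: string): boolean {
  return FUNCTION_NAME.test(name);
}
```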

Image Generation Tool

The image_generation tool lets the model generate images mid-conversation: when the model calls the tool, the server generates the image and returns the result.

Field Type Required Description
model string Yes Image generation model ID (e.g., gemini-2.5-flash-image)

How it works:

  1. Include an image_generation tool in your request with the target image model
  2. The model may call the tool with a prompt and optional response_format
  3. The server generates the image and returns the result as a tool response
  4. The model can then reference or describe the generated image

Example:

{
  "tools": [
    {
      "type": "image_generation",
      "image_generation": {
        "model": "gemini-2.5-flash-image"
      }
    }
  ]
}

Limitations:

  • Streaming is not supported when using image generation tools
  • Only one image_generation tool is allowed per request
  • Image generation tools cannot be combined with other tool calls in the same model turn

Tool Choice

Field Type Required Description
type string Yes auto, required, or none
function string No Specific function name to call

Response

Field Type Description
id string Response identifier from the provider
output array Array of output messages
model string Model that generated the response
provider string Provider that handled the request
stop_reason string Why generation stopped (stop, max_tokens, tool_use, etc.)
usage object Token usage statistics
citations array Sources from web search results (if applicable)

Usage

Field Type Description
input_tokens integer Tokens in the input
output_tokens integer Tokens in the output
total_tokens integer Total tokens used

Tool Calls in Output

When the model calls tools, the output message includes:

Field Type Description
id string Unique identifier for the tool call
type string function, web, x_search, code_execution, or image_generation
function.name string Function name
function.arguments string JSON string of function arguments
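
Because `function.arguments` arrives as a JSON string rather than a parsed object, it must be decoded before use. A small sketch following the field shapes in the table above (the `ToolCall` interface is illustrative, not an SDK type):

```typescript
// Illustrative shape of a tool call in the output, per the table above.
interface ToolCall {
  id: string;
  type: string;
  function?: { name: string; arguments: string };
}

// Parse the JSON-encoded arguments of a function tool call.
// Returns undefined for non-function calls or malformed JSON.
function parseToolArguments(call: ToolCall): Record<string, unknown> | undefined {
  if (call.type !== "function" || !call.function) return undefined;
  try {
    return JSON.parse(call.function.arguments);
  } catch {
    return undefined; // arguments were not valid JSON
  }
}
```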

Examples

Basic Text Request

cURL:

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "What is 2 + 2?"}]
      }
    ]
  }'

TypeScript:

import { ModelRelay } from "@modelrelay/sdk";

const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_SECRET_KEY!);

const response = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "What is 2 + 2?" }],
    },
  ],
});

console.log(response.output);

Go:

response, err := client.Responses.Create(ctx, sdk.ResponseRequest{
    Model: "claude-sonnet-4-20250514",
    Input: []sdk.InputItem{
        {
            Type: "message",
            Role: "user",
            Content: []sdk.ContentPart{
                {Type: "text", Text: "What is 2 + 2?"},
            },
        },
    },
})
if err != nil {
    return err
}

fmt.Println(response.Output)

Rust:

let response = client.responses().create(ResponseRequest {
    model: Some("claude-sonnet-4-20250514".to_string()),
    input: vec![
        InputItem {
            r#type: "message".to_string(),
            role: Some("user".to_string()),
            content: Some(vec![
                ContentPart { r#type: "text".to_string(), text: Some("What is 2 + 2?".to_string()) }
            ]),
            ..Default::default()
        }
    ],
    ..Default::default()
}).await?;

println!("{:?}", response.output);

Response

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "2 + 2 equals 4."
        }
      ]
    }
  ],
  "model": "claude-sonnet-4-20250514",
  "provider": "anthropic",
  "stop_reason": "stop",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 8,
    "total_tokens": 20
  }
}

Structured Output (JSON Schema)

cURL:

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "Extract: John is 30 years old"}]
      }
    ],
    "output_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"}
          },
          "required": ["name", "age"]
        }
      }
    }
  }'

TypeScript:

const response = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "Extract: John is 30 years old" }],
    },
  ],
  output_format: {
    type: "json_schema",
    json_schema: {
      name: "person",
      schema: {
        type: "object",
        properties: {
          name: { type: "string" },
          age: { type: "integer" },
        },
        required: ["name", "age"],
      },
    },
  },
});

Go:

response, err := client.Responses.Create(ctx, sdk.ResponseRequest{
    Model: "claude-sonnet-4-20250514",
    Input: []sdk.InputItem{
        {
            Type: "message",
            Role: "user",
            Content: []sdk.ContentPart{
                {Type: "text", Text: "Extract: John is 30 years old"},
            },
        },
    },
    OutputFormat: &sdk.OutputFormat{
        Type: "json_schema",
        JSONSchema: &sdk.JSONSchemaFormat{
            Name: "person",
            Schema: map[string]any{
                "type": "object",
                "properties": map[string]any{
                    "name": map[string]any{"type": "string"},
                    "age":  map[string]any{"type": "integer"},
                },
                "required": []string{"name", "age"},
            },
        },
    },
})

Rust:

use serde_json::json;

let response = client.responses().create(ResponseRequest {
    model: Some("claude-sonnet-4-20250514".to_string()),
    input: vec![
        InputItem {
            r#type: "message".to_string(),
            role: Some("user".to_string()),
            content: Some(vec![
                ContentPart { r#type: "text".to_string(), text: Some("Extract: John is 30 years old".to_string()) }
            ]),
            ..Default::default()
        }
    ],
    output_format: Some(OutputFormat {
        r#type: "json_schema".to_string(),
        json_schema: Some(JsonSchemaFormat {
            name: "person".to_string(),
            schema: Some(json!({
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"}
                },
                "required": ["name", "age"]
            })),
            ..Default::default()
        }),
    }),
    ..Default::default()
}).await?;

With Tools

cURL:

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "What is the weather in London?"}]
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'

TypeScript:

const response = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "What is the weather in London?" }],
    },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get current weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" },
          },
          required: ["location"],
        },
      },
    },
  ],
});

// Check for tool calls
for (const output of response.output) {
  if (output.tool_calls) {
    for (const call of output.tool_calls) {
      console.log(`Tool: ${call.function?.name}`);
      console.log(`Args: ${call.function?.arguments}`);
    }
  }
}

Streaming

To receive streaming responses, set the Accept header to application/x-ndjson; profile="responses-stream/v2" (NDJSON) or text/event-stream (SSE).

Stream Request

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -H "Accept: application/x-ndjson; profile=\"responses-stream/v2\"" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "Tell me a story"}]
      }
    ]
  }'

Stream Events

The stream returns newline-delimited JSON objects (or SSE data: payloads) with the following event types:

Event Type Description
start Stream initialization with model/provider info
update Text delta or content update
completion Final response with usage stats
error Error during generation
keepalive Heartbeat to keep connection alive
tool_use_start Tool call initiated
tool_use_delta Tool call argument delta
tool_use_stop Tool call completed

Stream Event Structure

Field Type Description
type string Event type
stream_mode string text-delta or structured-patch
stream_version string Currently v2
delta string Text delta (for update events)
content string Full content so far
patch array JSON Patch operations (for structured output)
request_id string Request identifier
provider string Provider handling the request
model string Model generating the response
stop_reason string Why generation stopped (on completion)
usage object Token usage (on completion)
code string Error code (on error)
message string Error message (on error)
status integer HTTP status (on error)
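
Putting the tables above together, a raw NDJSON stream can be consumed without the SDK by splitting the body on newlines, parsing each line as JSON, and concatenating the `delta` of each update event. A minimal sketch (the `StreamEvent` interface mirrors the field table above and is not an SDK type):

```typescript
// Illustrative subset of the stream event shape from the table above.
interface StreamEvent {
  type: string;
  delta?: string;
  stop_reason?: string;
  usage?: { input_tokens: number; output_tokens: number; total_tokens: number };
}

// Accumulate the full text from a raw NDJSON stream body,
// ignoring keepalive and other non-update events.
function accumulateText(ndjson: string): string {
  let text = "";
  for (const line of ndjson.split("\n")) {
    if (!line.trim()) continue; // skip blank lines between records
    const event: StreamEvent = JSON.parse(line);
    if (event.type === "update" && event.delta) {
      text += event.delta;
    }
  }
  return text;
}
```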

SDK Streaming

TypeScript:

const stream = await mr.responses.stream({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "Tell me a story" }],
    },
  ],
});

for await (const event of stream) {
  if (event.type === "update" && event.delta) {
    process.stdout.write(event.delta);
  }
}

Go:

stream, err := client.Responses.Stream(ctx, sdk.ResponseRequest{
    Model: "claude-sonnet-4-20250514",
    Input: []sdk.InputItem{
        {
            Type: "message",
            Role: "user",
            Content: []sdk.ContentPart{
                {Type: "text", Text: "Tell me a story"},
            },
        },
    },
})
if err != nil {
    return err
}
defer stream.Close()

for event := range stream.Events() {
    if event.Type == "update" && event.Delta != "" {
        fmt.Print(event.Delta)
    }
}

Rust:

use futures_util::StreamExt;

let mut stream = client.responses().stream(ResponseRequest {
    model: Some("claude-sonnet-4-20250514".to_string()),
    input: vec![
        InputItem {
            r#type: "message".to_string(),
            role: Some("user".to_string()),
            content: Some(vec![
                ContentPart { r#type: "text".to_string(), text: Some("Tell me a story".to_string()) }
            ]),
            ..Default::default()
        }
    ],
    ..Default::default()
}).await?;

while let Some(event) = stream.next().await {
    let event = event?;
    if event.r#type == "update" {
        if let Some(delta) = &event.delta {
            print!("{}", delta);
        }
    }
}

Error Codes

Status Code Description
400 bad_request Invalid request format or parameters
401 unauthorized Missing or invalid authentication
402 payment_required Customer spend limit exceeded
403 forbidden Publishable keys cannot access this endpoint
404 not_found Model or resource not found
429 rate_limited Too many requests
500 internal_error Server error
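
When handling these errors programmatically, a common convention (an assumption, not something the API prescribes) is to retry only transient failures:

```typescript
// Illustrative retry classification for the error table above:
// rate limiting (429) and server errors (5xx) are worth retrying;
// bad requests, auth failures, and spend-limit errors are not.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}
```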

Next Steps