Responses

The /responses endpoint is the primary way to generate AI model responses through ModelRelay.

Create a Response

POST /api/v1/responses

Authentication

Requires either:

  • Secret key (mr_sk_*): for backend use, with full project access
  • Customer bearer token: a customer-scoped token for data-plane access

Publishable keys (mr_pk_*) cannot call this endpoint directly.

Request Body

Field Type Required Description
input array Yes Array of input messages
model string No Model identifier (e.g., claude-sonnet-4-20250514)
provider string No Provider override (anthropic, openai, xai, google-ai-studio)
output_format object No Response format (text or json_schema)
max_output_tokens integer No Maximum tokens to generate
temperature number No Sampling temperature (0-2)
stop array No Stop sequences (max 8, each max 128 chars)
tools array No Available tools for the model
tool_choice object No Tool selection behavior
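
The limits on `stop` (at most 8 sequences, each at most 128 characters) can be checked client-side before sending. A minimal sketch assembling a request body with the optional sampling fields; the `validateStop` helper is illustrative, not part of any SDK:

```typescript
// Illustrative client-side check mirroring the documented limits:
// at most 8 stop sequences, each at most 128 characters.
function validateStop(stop: string[]): void {
  if (stop.length > 8) {
    throw new Error(`too many stop sequences: ${stop.length} (max 8)`);
  }
  for (const s of stop) {
    if (s.length > 128) {
      throw new Error(`stop sequence too long: ${s.length} chars (max 128)`);
    }
  }
}

// A request body using the optional fields from the table above.
const body = {
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "Summarize this in one line." }],
    },
  ],
  max_output_tokens: 256,
  temperature: 0.7,
  stop: ["\n\n"],
};

validateStop(body.stop);
```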

Input Messages

Each input message has the following structure:

Field Type Required Description
type string Yes Always "message"
role string Yes system, user, assistant, or tool
content array No Array of content parts
tool_calls array No Tool calls (for assistant messages)
tool_call_id string No Tool call ID (for tool result messages)
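
The `tool_calls` and `tool_call_id` fields tie a tool round trip together across turns. A hedged sketch of a multi-turn `input` array (the tool-call ID and the weather payload are illustrative values, not API output):

```typescript
// Illustrative multi-turn input: a user question, an assistant turn
// that called a tool, and the tool result keyed by tool_call_id.
const input = [
  {
    type: "message",
    role: "user",
    content: [{ type: "text", text: "What is the weather in London?" }],
  },
  {
    type: "message",
    role: "assistant",
    tool_calls: [
      {
        id: "call_123", // illustrative ID
        type: "function",
        function: {
          name: "get_weather",
          arguments: JSON.stringify({ location: "London" }),
        },
      },
    ],
  },
  {
    type: "message",
    role: "tool",
    tool_call_id: "call_123", // must match the assistant's tool call
    content: [{ type: "text", text: JSON.stringify({ temp_c: 14 }) }],
  },
];
```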

Content Parts

Field Type Required Description
type string Yes Currently "text"
text string No Text content

Output Format

Field Type Required Description
type string Yes text or json_schema
json_schema object No JSON schema definition (required when type is json_schema)

JSON Schema Format

Field Type Required Description
name string Yes Schema name
description string No Schema description
schema object Yes JSON Schema object
strict boolean No Enable strict mode

Tools

Field Type Required Description
type string Yes function, web, x_search, code_execution, or image_generation
function object No Function definition
web object No Web search configuration
x_search object No X search configuration (Grok only)
code_execution object No Code execution configuration
image_generation object No Image generation configuration

Function Tool

Field Type Required Description
name string Yes Function name (dot-separated lowercase, e.g., fs.search)
description string No Function description
parameters object No JSON Schema for parameters
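
The exact server-side validation rule for function names is not spelled out here; a sketch that matches the stated convention (dot-separated lowercase, as in `fs.search`) using an assumed regex:

```typescript
// Assumed pattern for the documented convention: lowercase,
// dot-separated segments, e.g. "fs.search" or "get_weather".
// The server's actual rule may be stricter or looser.
const FUNCTION_NAME = /^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)*$/;

function isValidFunctionName(name: string): boolean {
  return FUNCTION_NAME.test(name);
}
```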

Image Generation Tool

The image_generation tool lets the model generate images mid-conversation: when the model calls the tool, the server generates the image and returns the result.

Field Type Required Description
model string Yes Image generation model ID (e.g., gemini-2.5-flash-image)

How it works:

  1. Include an image_generation tool in your request with the target image model
  2. The model may call the tool with a prompt and optional response_format
  3. The server generates the image and returns the result as a tool response
  4. The model can then reference or describe the generated image

Example:

{
  "tools": [
    {
      "type": "image_generation",
      "image_generation": {
        "model": "gemini-2.5-flash-image"
      }
    }
  ]
}

Limitations:

  • Streaming is not supported when using image generation tools
  • Only one image_generation tool is allowed per request
  • Image generation tools cannot be combined with other tool calls in the same model turn

Tool Choice

Field Type Required Description
type string Yes auto, required, or none
function string No Specific function name to call

Response

Field Type Description
id string Response identifier from the provider
output array Array of output messages
model string Model that generated the response
provider string Provider that handled the request
stop_reason string Why generation stopped (stop, max_tokens, tool_use, etc.)
usage object Token usage statistics
citations array Sources from web search results (if applicable)

Usage

Field Type Description
input_tokens integer Tokens in the input
output_tokens integer Tokens in the output
total_tokens integer Total tokens used

Tool Calls in Output

When the model calls tools, the output message includes:

Field Type Description
id string Unique identifier for the tool call
type string function, web, x_search, code_execution, or image_generation
function.name string Function name
function.arguments string JSON string of function arguments
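
Because `function.arguments` arrives as a JSON string rather than a parsed object, it must be decoded before use. A small sketch following the field shapes in the table above (the `ToolCall` interface is illustrative, not an SDK type):

```typescript
// Illustrative shape of a tool call in the output, per the table above.
interface ToolCall {
  id: string;
  type: string;
  function?: { name: string; arguments: string };
}

// Parse the JSON-encoded arguments of a function tool call.
// Returns undefined for non-function calls or malformed JSON.
function parseToolArguments(call: ToolCall): Record<string, unknown> | undefined {
  if (call.type !== "function" || !call.function) return undefined;
  try {
    return JSON.parse(call.function.arguments);
  } catch {
    return undefined; // arguments were not valid JSON
  }
}
```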

Examples

Basic Text Request

cURL:

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "What is 2 + 2?"}]
      }
    ]
  }'

TypeScript:

import { ModelRelay } from "@modelrelay/sdk";

const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_SECRET_KEY!);

const response = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "What is 2 + 2?" }],
    },
  ],
});

console.log(response.output);

Go:

response, err := client.Responses.Create(ctx, sdk.ResponseRequest{
    Model: "claude-sonnet-4-20250514",
    Input: []sdk.InputItem{
        {
            Type: "message",
            Role: "user",
            Content: []sdk.ContentPart{
                {Type: "text", Text: "What is 2 + 2?"},
            },
        },
    },
})
if err != nil {
    return err
}

fmt.Println(response.Output)

Rust:

let response = client.responses().create(ResponseRequest {
    model: Some("claude-sonnet-4-20250514".to_string()),
    input: vec![
        InputItem {
            r#type: "message".to_string(),
            role: Some("user".to_string()),
            content: Some(vec![
                ContentPart { r#type: "text".to_string(), text: Some("What is 2 + 2?".to_string()) }
            ]),
            ..Default::default()
        }
    ],
    ..Default::default()
}).await?;

println!("{:?}", response.output);

Response

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "2 + 2 equals 4."
        }
      ]
    }
  ],
  "model": "claude-sonnet-4-20250514",
  "provider": "anthropic",
  "stop_reason": "stop",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 8,
    "total_tokens": 20
  }
}

Structured Output (JSON Schema)

cURL:

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "Extract: John is 30 years old"}]
      }
    ],
    "output_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"}
          },
          "required": ["name", "age"]
        }
      }
    }
  }'

TypeScript:

const response = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "Extract: John is 30 years old" }],
    },
  ],
  output_format: {
    type: "json_schema",
    json_schema: {
      name: "person",
      schema: {
        type: "object",
        properties: {
          name: { type: "string" },
          age: { type: "integer" },
        },
        required: ["name", "age"],
      },
    },
  },
});

Go:

response, err := client.Responses.Create(ctx, sdk.ResponseRequest{
    Model: "claude-sonnet-4-20250514",
    Input: []sdk.InputItem{
        {
            Type: "message",
            Role: "user",
            Content: []sdk.ContentPart{
                {Type: "text", Text: "Extract: John is 30 years old"},
            },
        },
    },
    OutputFormat: &sdk.OutputFormat{
        Type: "json_schema",
        JSONSchema: &sdk.JSONSchemaFormat{
            Name: "person",
            Schema: map[string]any{
                "type": "object",
                "properties": map[string]any{
                    "name": map[string]any{"type": "string"},
                    "age":  map[string]any{"type": "integer"},
                },
                "required": []string{"name", "age"},
            },
        },
    },
})

Rust:

use serde_json::json;

let response = client.responses().create(ResponseRequest {
    model: Some("claude-sonnet-4-20250514".to_string()),
    input: vec![
        InputItem {
            r#type: "message".to_string(),
            role: Some("user".to_string()),
            content: Some(vec![
                ContentPart { r#type: "text".to_string(), text: Some("Extract: John is 30 years old".to_string()) }
            ]),
            ..Default::default()
        }
    ],
    output_format: Some(OutputFormat {
        r#type: "json_schema".to_string(),
        json_schema: Some(JsonSchemaFormat {
            name: "person".to_string(),
            schema: Some(json!({
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"}
                },
                "required": ["name", "age"]
            })),
            ..Default::default()
        }),
    }),
    ..Default::default()
}).await?;

With Tools

cURL:

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "What is the weather in London?"}]
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'

TypeScript:

const response = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "What is the weather in London?" }],
    },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get current weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" },
          },
          required: ["location"],
        },
      },
    },
  ],
});

// Check for tool calls
for (const output of response.output) {
  if (output.tool_calls) {
    for (const call of output.tool_calls) {
      console.log(`Tool: ${call.function?.name}`);
      console.log(`Args: ${call.function?.arguments}`);
    }
  }
}

Streaming

To receive streaming responses, set the Accept header to application/x-ndjson; profile="responses-stream/v2" (NDJSON) or text/event-stream (SSE).

Stream Request

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -H "Accept: application/x-ndjson; profile=\"responses-stream/v2\"" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "Tell me a story"}]
      }
    ]
  }'

Stream Events

The stream returns newline-delimited JSON objects (or SSE data: payloads) with the following event types:

Event Type Description
start Stream initialization with model/provider info
update Text delta or content update
completion Final response with usage stats
error Error during generation
keepalive Heartbeat to keep connection alive
tool_use_start Tool call initiated
tool_use_delta Tool call argument delta
tool_use_stop Tool call completed

Stream Event Structure

Field Type Description
type string Event type
stream_mode string text-delta or structured-patch
stream_version string Currently v2
delta string Text delta (for update events)
content string Full content so far
patch array JSON Patch operations (for structured output)
request_id string Request identifier
provider string Provider handling the request
model string Model generating the response
stop_reason string Why generation stopped (on completion)
usage object Token usage (on completion)
code string Error code (on error)
message string Error message (on error)
status integer HTTP status (on error)
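
Putting the tables above together, a raw NDJSON stream can be consumed without the SDK by splitting the body on newlines, parsing each line as JSON, and concatenating the `delta` of each update event. A minimal sketch (the `StreamEvent` interface mirrors the field table above and is not an SDK type):

```typescript
// Illustrative subset of the stream event shape from the table above.
interface StreamEvent {
  type: string;
  delta?: string;
  stop_reason?: string;
  usage?: { input_tokens: number; output_tokens: number; total_tokens: number };
}

// Accumulate the full text from a raw NDJSON stream body,
// ignoring keepalive and other non-update events.
function accumulateText(ndjson: string): string {
  let text = "";
  for (const line of ndjson.split("\n")) {
    if (!line.trim()) continue; // skip blank lines between records
    const event: StreamEvent = JSON.parse(line);
    if (event.type === "update" && event.delta) {
      text += event.delta;
    }
  }
  return text;
}
```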

SDK Streaming

TypeScript:

const stream = await mr.responses.stream({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "Tell me a story" }],
    },
  ],
});

for await (const event of stream) {
  if (event.type === "update" && event.delta) {
    process.stdout.write(event.delta);
  }
}

Go:

stream, err := client.Responses.Stream(ctx, sdk.ResponseRequest{
    Model: "claude-sonnet-4-20250514",
    Input: []sdk.InputItem{
        {
            Type: "message",
            Role: "user",
            Content: []sdk.ContentPart{
                {Type: "text", Text: "Tell me a story"},
            },
        },
    },
})
if err != nil {
    return err
}
defer stream.Close()

for event := range stream.Events() {
    if event.Type == "update" && event.Delta != "" {
        fmt.Print(event.Delta)
    }
}

Rust:

use futures_util::StreamExt;

let mut stream = client.responses().stream(ResponseRequest {
    model: Some("claude-sonnet-4-20250514".to_string()),
    input: vec![
        InputItem {
            r#type: "message".to_string(),
            role: Some("user".to_string()),
            content: Some(vec![
                ContentPart { r#type: "text".to_string(), text: Some("Tell me a story".to_string()) }
            ]),
            ..Default::default()
        }
    ],
    ..Default::default()
}).await?;

while let Some(event) = stream.next().await {
    let event = event?;
    if event.r#type == "update" {
        if let Some(delta) = &event.delta {
            print!("{}", delta);
        }
    }
}

Error Codes

Status Code Description
400 bad_request Invalid request format or parameters
401 unauthorized Missing or invalid authentication
402 payment_required Customer spend limit exceeded
403 forbidden Publishable keys cannot access this endpoint
404 not_found Model or resource not found
429 rate_limited Too many requests
500 internal_error Server error
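
When handling these errors programmatically, a common convention (an assumption, not something the API prescribes) is to retry only transient failures:

```typescript
// Illustrative retry classification for the error table above:
// rate limiting (429) and server errors (5xx) are worth retrying;
// bad requests, auth failures, and spend-limit errors are not.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}
```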

Next Steps