Responses
The /responses endpoint is the primary way to generate AI model responses through ModelRelay.
Create a Response
POST /api/v1/responses
Authentication
Requires either:
- Secret key (mr_sk_*): Backend use with full project access
- Customer bearer token: Customer-scoped token for data-plane access

Publishable keys (mr_pk_*) cannot call this endpoint directly.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| input | array | Yes | Array of input messages |
| model | string | No | Model identifier (e.g., claude-sonnet-4-20250514) |
| provider | string | No | Provider override (anthropic, openai, xai, google-ai-studio) |
| output_format | object | No | Response format (text or json_schema) |
| max_output_tokens | integer | No | Maximum tokens to generate |
| temperature | number | No | Sampling temperature (0-2) |
| stop | array | No | Stop sequences (max 8, each max 128 chars) |
| tools | array | No | Available tools for the model |
| tool_choice | object | No | Tool selection behavior |
Input Messages
Each input message has the following structure:
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Always "message" |
| role | string | Yes | system, user, assistant, or tool |
| content | array | No | Array of content parts |
| tool_calls | array | No | Tool calls (for assistant messages) |
| tool_call_id | string | No | Tool call ID (for tool result messages) |
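For example, a multi-turn conversation is expressed by including earlier assistant turns in the input array. A minimal sketch using the fields above (the conversation content is illustrative):

```typescript
// A minimal multi-turn input array built from the message fields above.
const input = [
  {
    type: "message",
    role: "system",
    content: [{ type: "text", text: "You are a concise assistant." }],
  },
  {
    type: "message",
    role: "user",
    content: [{ type: "text", text: "What is the capital of France?" }],
  },
  {
    type: "message",
    role: "assistant",
    content: [{ type: "text", text: "Paris." }],
  },
  {
    type: "message",
    role: "user",
    content: [{ type: "text", text: "What is its population?" }],
  },
];
```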
Content Parts
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Currently "text" |
| text | string | No | Text content |
Output Format
The output_format object has the following structure:

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | text or json_schema |
| json_schema | object | No | JSON schema definition (required when type is json_schema) |
When type is json_schema, the json_schema object has the following structure:

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Schema name |
| description | string | No | Schema description |
| schema | object | Yes | JSON Schema object |
| strict | boolean | No | Enable strict mode |
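For instance, an output_format object using strict mode might look like the sketch below (the schema itself is illustrative):

```typescript
// A sketch of an output_format object with a named schema and strict mode enabled.
// The "sentiment" schema is illustrative, not part of the API.
const output_format = {
  type: "json_schema",
  json_schema: {
    name: "sentiment",
    description: "Sentiment classification of the input text",
    schema: {
      type: "object",
      properties: {
        label: { type: "string", enum: ["positive", "neutral", "negative"] },
      },
      required: ["label"],
    },
    strict: true,
  },
};
```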
Tools
Each entry in the tools array has the following structure:

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | function, web, x_search, code_execution, or image_generation |
| function | object | No | Function definition |
| web | object | No | Web search configuration |
| x_search | object | No | X search configuration (Grok only) |
| code_execution | object | No | Code execution configuration |
| image_generation | object | No | Image generation configuration |
When type is function, the function object has the following structure:

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Function name (dot-separated lowercase, e.g., fs.search) |
| description | string | No | Function description |
| parameters | object | No | JSON Schema for parameters |
Image Generation
The image_generation tool enables models to generate images during a conversation. When the model decides to generate an image, the server handles the image generation and returns the result.
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Image generation model ID (e.g., gemini-2.5-flash-image) |
How it works:
1. Include an image_generation tool in your request with the target image model.
2. The model may call the tool with a prompt and optional response_format.
3. The server generates the image and returns the result as a tool response.
4. The model can then reference or describe the generated image.
Example:

{
  "tools": [
    {
      "type": "image_generation",
      "image_generation": {
        "model": "gemini-2.5-flash-image"
      }
    }
  ]
}
Limitations:
- Streaming is not supported when using image generation tools.
- Only one image_generation tool is allowed per request.
- Image generation tools cannot be combined with other tool calls in the same model turn.
Tool Choice
The tool_choice object controls how the model selects tools:

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | auto, required, or none |
| function | string | No | Specific function name to call |
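For example, tool selection can be left to the model or constrained; a brief sketch (the forced-function form assumes the named function is declared in tools):

```typescript
// Let the model decide whether to call a tool.
const autoChoice = { type: "auto" };

// A sketch of requiring a specific function; this assumes the `function` field
// names an entry declared in `tools` (the hypothetical "get_weather" here).
const forcedChoice = { type: "required", function: "get_weather" };
```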
Response
| Field | Type | Description |
|---|---|---|
| id | string | Response identifier from the provider |
| output | array | Array of output messages |
| model | string | Model that generated the response |
| provider | string | Provider that handled the request |
| stop_reason | string | Why generation stopped (stop, max_tokens, tool_use, etc.) |
| usage | object | Token usage statistics |
| citations | array | Sources from web search results (if applicable) |
Usage
| Field | Type | Description |
|---|---|---|
| input_tokens | integer | Tokens in the input |
| output_tokens | integer | Tokens in the output |
| total_tokens | integer | Total tokens used |
Tool Calls
When the model calls tools, the output message includes:

| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier for the tool call |
| type | string | function, web, x_search, code_execution, or image_generation |
| function.name | string | Function name |
| function.arguments | string | JSON string of function arguments |
Examples
Basic Text Request
cURL

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "What is 2 + 2?"}]
      }
    ]
  }'
TypeScript

import { ModelRelay } from "@modelrelay/sdk";

const mr = ModelRelay.fromSecretKey(process.env.MODELRELAY_SECRET_KEY!);

const response = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "What is 2 + 2?" }],
    },
  ],
});

console.log(response.output);
Go

response, err := client.Responses.Create(ctx, sdk.ResponseRequest{
    Model: "claude-sonnet-4-20250514",
    Input: []sdk.InputItem{
        {
            Type: "message",
            Role: "user",
            Content: []sdk.ContentPart{
                {Type: "text", Text: "What is 2 + 2?"},
            },
        },
    },
})
if err != nil {
    return err
}
fmt.Println(response.Output)
Rust

let response = client.responses().create(ResponseRequest {
    model: Some("claude-sonnet-4-20250514".to_string()),
    input: vec![
        InputItem {
            r#type: "message".to_string(),
            role: Some("user".to_string()),
            content: Some(vec![
                ContentPart { r#type: "text".to_string(), text: Some("What is 2 + 2?".to_string()) }
            ]),
            ..Default::default()
        }
    ],
    ..Default::default()
}).await?;

println!("{:?}", response.output);
Response
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "2 + 2 equals 4."
        }
      ]
    }
  ],
  "model": "claude-sonnet-4-20250514",
  "provider": "anthropic",
  "stop_reason": "stop",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 8,
    "total_tokens": 20
  }
}
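To read the assistant's text from a response shaped like the one above, you can collect the text parts from the output messages; a brief TypeScript sketch:

```typescript
// Concatenate the text parts of every output message in the response above.
const text = response.output
  .flatMap((message) => message.content ?? [])
  .filter((part) => part.type === "text")
  .map((part) => part.text)
  .join("");

console.log(text); // "2 + 2 equals 4."
```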
Structured Output (JSON Schema)
cURL

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "Extract: John is 30 years old"}]
      }
    ],
    "output_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"}
          },
          "required": ["name", "age"]
        }
      }
    }
  }'
TypeScript

const response = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "Extract: John is 30 years old" }],
    },
  ],
  output_format: {
    type: "json_schema",
    json_schema: {
      name: "person",
      schema: {
        type: "object",
        properties: {
          name: { type: "string" },
          age: { type: "integer" },
        },
        required: ["name", "age"],
      },
    },
  },
});
Go

response, err := client.Responses.Create(ctx, sdk.ResponseRequest{
    Model: "claude-sonnet-4-20250514",
    Input: []sdk.InputItem{
        {
            Type: "message",
            Role: "user",
            Content: []sdk.ContentPart{
                {Type: "text", Text: "Extract: John is 30 years old"},
            },
        },
    },
    OutputFormat: &sdk.OutputFormat{
        Type: "json_schema",
        JSONSchema: &sdk.JSONSchemaFormat{
            Name: "person",
            Schema: map[string]any{
                "type": "object",
                "properties": map[string]any{
                    "name": map[string]any{"type": "string"},
                    "age":  map[string]any{"type": "integer"},
                },
                "required": []string{"name", "age"},
            },
        },
    },
})
Rust

use serde_json::json;

let response = client.responses().create(ResponseRequest {
    model: Some("claude-sonnet-4-20250514".to_string()),
    input: vec![
        InputItem {
            r#type: "message".to_string(),
            role: Some("user".to_string()),
            content: Some(vec![
                ContentPart { r#type: "text".to_string(), text: Some("Extract: John is 30 years old".to_string()) }
            ]),
            ..Default::default()
        }
    ],
    output_format: Some(OutputFormat {
        r#type: "json_schema".to_string(),
        json_schema: Some(JsonSchemaFormat {
            name: "person".to_string(),
            schema: Some(json!({
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"}
                },
                "required": ["name", "age"]
            })),
            ..Default::default()
        }),
    }),
    ..Default::default()
}).await?;
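The structured result is returned in the response's output. Assuming it arrives as a text content part containing the JSON document (as with plain text responses), it can be parsed directly; a hedged TypeScript sketch:

```typescript
// A sketch of parsing the structured output, assuming the JSON is delivered
// as a text content part on the first output message.
const jsonText =
  response.output[0]?.content?.find((part) => part.type === "text")?.text ?? "";
const person = JSON.parse(jsonText) as { name: string; age: number };

console.log(person.name, person.age); // e.g. "John 30"
```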
Function Calling
cURL

curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "What is the weather in London?"}]
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'
TypeScript

const response = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "What is the weather in London?" }],
    },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get current weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" },
          },
          required: ["location"],
        },
      },
    },
  ],
});

// Check for tool calls
for (const output of response.output) {
  if (output.tool_calls) {
    for (const call of output.tool_calls) {
      console.log(`Tool: ${call.function?.name}`);
      console.log(`Args: ${call.function?.arguments}`);
    }
  }
}
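After running the tool yourself, you can send its result back as a tool message that references the call's ID, along with the earlier turns, so the model can produce a final answer. A hedged TypeScript sketch (the weather payload and the reused weatherTools variable are illustrative):

```typescript
// A sketch of returning a tool result. It assumes the assistant message that
// contains tool_calls is echoed back, followed by a "tool" role message whose
// tool_call_id references the call's id. The weather payload is illustrative.
const assistantMessage = response.output.find((m) => m.tool_calls?.length);
const call = assistantMessage!.tool_calls![0];

const followUp = await mr.responses.create({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "What is the weather in London?" }],
    },
    assistantMessage!, // the assistant turn containing the tool call
    {
      type: "message",
      role: "tool",
      tool_call_id: call.id,
      content: [
        { type: "text", text: JSON.stringify({ temperature_c: 14, condition: "cloudy" }) },
      ],
    },
  ],
  tools: weatherTools, // the same tool definitions used in the first request
});

console.log(followUp.output);
```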
Streaming
To receive streaming responses, set the Accept header to application/x-ndjson; profile="responses-stream/v2" (NDJSON) or text/event-stream (SSE).
Stream Request
curl -X POST https://api.modelrelay.ai/api/v1/responses \
  -H "Authorization: Bearer mr_sk_..." \
  -H "Content-Type: application/json" \
  -H "Accept: application/x-ndjson; profile=\"responses-stream/v2\"" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "Tell me a story"}]
      }
    ]
  }'
Stream Events
The stream returns newline-delimited JSON objects (or SSE data: payloads) with the following event types:
| Event Type | Description |
|---|---|
| start | Stream initialization with model/provider info |
| update | Text delta or content update |
| completion | Final response with usage stats |
| error | Error during generation |
| keepalive | Heartbeat to keep connection alive |
| tool_use_start | Tool call initiated |
| tool_use_delta | Tool call argument delta |
| tool_use_stop | Tool call completed |
Stream Event Structure
| Field | Type | Description |
|---|---|---|
| type | string | Event type |
| stream_mode | string | text-delta or structured-patch |
| stream_version | string | Currently v2 |
| delta | string | Text delta (for update events) |
| content | string | Full content so far |
| patch | array | JSON Patch operations (for structured output) |
| request_id | string | Request identifier |
| provider | string | Provider handling the request |
| model | string | Model generating the response |
| stop_reason | string | Why generation stopped (on completion) |
| usage | object | Token usage (on completion) |
| code | string | Error code (on error) |
| message | string | Error message (on error) |
| status | integer | HTTP status (on error) |
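If you are not using an SDK, the NDJSON stream can be consumed with a plain fetch call by splitting the body on newlines and parsing each line as an event; a minimal TypeScript sketch (error handling omitted):

```typescript
// A minimal sketch of consuming the NDJSON stream without an SDK.
// Each non-empty line is one JSON event as described in the tables above.
const res = await fetch("https://api.modelrelay.ai/api/v1/responses", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.MODELRELAY_SECRET_KEY}`,
    "Content-Type": "application/json",
    Accept: 'application/x-ndjson; profile="responses-stream/v2"',
  },
  body: JSON.stringify({
    model: "claude-sonnet-4-20250514",
    input: [
      { type: "message", role: "user", content: [{ type: "text", text: "Tell me a story" }] },
    ],
  }),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? ""; // keep any partial line for the next chunk
  for (const line of lines) {
    if (!line.trim()) continue;
    const event = JSON.parse(line);
    if (event.type === "update" && event.delta) process.stdout.write(event.delta);
    if (event.type === "completion") console.log("\nusage:", event.usage);
  }
}
```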
SDK Streaming
TypeScript

const stream = await mr.responses.stream({
  model: "claude-sonnet-4-20250514",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "Tell me a story" }],
    },
  ],
});

for await (const event of stream) {
  if (event.type === "update" && event.delta) {
    process.stdout.write(event.delta);
  }
}
Go

stream, err := client.Responses.Stream(ctx, sdk.ResponseRequest{
    Model: "claude-sonnet-4-20250514",
    Input: []sdk.InputItem{
        {
            Type: "message",
            Role: "user",
            Content: []sdk.ContentPart{
                {Type: "text", Text: "Tell me a story"},
            },
        },
    },
})
if err != nil {
    return err
}
defer stream.Close()

for event := range stream.Events() {
    if event.Type == "update" && event.Delta != "" {
        fmt.Print(event.Delta)
    }
}
Rust

use futures_util::StreamExt;

let mut stream = client.responses().stream(ResponseRequest {
    model: Some("claude-sonnet-4-20250514".to_string()),
    input: vec![
        InputItem {
            r#type: "message".to_string(),
            role: Some("user".to_string()),
            content: Some(vec![
                ContentPart { r#type: "text".to_string(), text: Some("Tell me a story".to_string()) }
            ]),
            ..Default::default()
        }
    ],
    ..Default::default()
}).await?;

while let Some(event) = stream.next().await {
    let event = event?;
    if event.r#type == "update" {
        if let Some(delta) = &event.delta {
            print!("{}", delta);
        }
    }
}
Error Codes
| Status | Code | Description |
|---|---|---|
| 400 | bad_request | Invalid request format or parameters |
| 401 | unauthorized | Missing or invalid authentication |
| 402 | payment_required | Customer spend limit exceeded |
| 403 | forbidden | Publishable keys cannot access this endpoint |
| 404 | not_found | Model or resource not found |
| 429 | rate_limited | Too many requests |
| 500 | internal_error | Server error |
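A common pattern is to retry rate-limited requests with a short backoff and surface the other codes to the caller. A hedged TypeScript sketch using raw fetch (the SDKs may expose their own error types; the backoff values are illustrative):

```typescript
// A sketch of handling the error codes above, retrying only on 429.
async function createWithRetry(body: unknown, maxAttempts = 3): Promise<unknown> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetch("https://api.modelrelay.ai/api/v1/responses", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.MODELRELAY_SECRET_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    });

    if (res.ok) return res.json();

    if (res.status === 429 && attempt < maxAttempts) {
      // rate_limited: wait before retrying
      await new Promise((resolve) => setTimeout(resolve, attempt * 1000));
      continue;
    }

    const err = await res.json().catch(() => ({}));
    throw new Error(`Request failed (${res.status}): ${err.message ?? res.statusText}`);
  }
  throw new Error("unreachable");
}
```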
Next Steps