Official documentation: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
Create model responses for a given list of input messages using Anthropic Claude models.
Request Parameters
model: The model to use. Available models include:
- claude-haiku-4-5-20251001
- claude-sonnet-4-20250514
- claude-opus-4-5-20251101
max_tokens: The maximum number of tokens to generate before stopping.
messages: Input messages. Each message has a role and content. The content can be:
- A string for simple text messages
- An array of content blocks for multimodal input (text, images, documents)
temperature: Amount of randomness injected into the response. Ranges from 0.0 to 1.0; values closer to 0.0 produce more deterministic output.
top_p: Use nucleus sampling: sample only from the smallest set of tokens whose cumulative probability exceeds top_p.
top_k: Only sample from the top K most likely tokens at each step.
stream: Whether to incrementally stream the response using server-sent events.
system: System prompt that sets context and instructions for the model.
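These parameters map directly onto the request body. As a minimal sketch, assuming the generic https://api.example.com endpoint and Bearer authentication used in the curl example below (substitute your real endpoint and key), a request in Python might look like:

import os
import requests

API_URL = "https://api.example.com/v1/messages"  # assumed endpoint, as in the curl example
API_KEY = os.environ["API_KEY"]                  # hypothetical environment variable

payload = {
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 1024,
    "temperature": 0.7,  # closer to 0.0 = more deterministic, closer to 1.0 = more random
    "system": "You are a concise technical assistant.",
    "messages": [
        {"role": "user", "content": "Hello, Claude!"}
    ],
}

resp = requests.post(
    API_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    },
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["content"][0]["text"])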
Multimodal Content
Claude supports various content types in messages:
Text Content
{
  "type": "text",
  "text": "Your message here"
}
Image Content
{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/jpeg",
    "data": "base64_encoded_image_data"
  }
}
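In practice a block like this is built by base64-encoding the raw image bytes. A minimal Python sketch (the file name is hypothetical; other image types follow the same pattern with media_type set accordingly):

import base64

with open("photo.jpg", "rb") as f:  # hypothetical local file
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

message = {
    "role": "user",
    "content": [
        {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/jpeg",  # match the actual file type
                "data": image_b64,
            },
        },
        {"type": "text", "text": "What is in this image?"},
    ],
}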
Document Content (PDF)
{
  "type": "document",
  "source": {
    "type": "base64",
    "media_type": "application/pdf",
    "data": "base64_encoded_pdf_data"
  },
  "cache_control": {
    "type": "ephemeral"
  }
}
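The optional cache_control field marks the block for ephemeral prompt caching, so a large document can be reused across closely spaced requests instead of being re-processed each time. Building the block mirrors the image case (the file name is hypothetical):

import base64

with open("report.pdf", "rb") as f:  # hypothetical local file
    pdf_b64 = base64.b64encode(f.read()).decode("utf-8")

document_block = {
    "type": "document",
    "source": {
        "type": "base64",
        "media_type": "application/pdf",
        "data": pdf_b64,
    },
    # Mark the (potentially large) document for caching.
    "cache_control": {"type": "ephemeral"},
}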
Response
id: Unique identifier for the message.
type: Object type, which is always "message".
role: Role of the generated message, which is always "assistant".
content: Content generated by the model, as an array of content blocks.
model: The model that processed the request.
stop_reason: The reason the model stopped generating (for example "end_turn", "max_tokens", or "stop_sequence").
usage: Token counts for the request and response (input_tokens and output_tokens).
Example Request
curl -X POST https://api.example.com/v1/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Hello, Claude!"
      }
    ]
  }'
Example Response
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you today?"
    }
  ],
  "model": "claude-haiku-4-5-20251001",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 12
  }
}
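Given a response shaped like the example above, the generated text lives in the content array. A small, hypothetical helper for pulling out the common fields:

def summarize_response(resp: dict) -> str:
    # Join the text of all text-type content blocks.
    text = "".join(
        block["text"] for block in resp["content"] if block["type"] == "text"
    )
    usage = resp.get("usage", {})
    header = (
        f"[{resp['id']}] stop_reason={resp['stop_reason']} "
        f"input_tokens={usage.get('input_tokens')} "
        f"output_tokens={usage.get('output_tokens')}"
    )
    return header + "\n" + text

Applied to the example response above, this returns the message id, end_turn, the 10/12 token counts, and the greeting text.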