POST /v1/chat/completions

Example request:
curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful AI assistant."
      },
      {
        "role": "user",
        "content": "Hello, Claude!"
      }
    ],
    "max_tokens": 1024,
    "temperature": 0.7
  }'

Example response:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "claude-sonnet-4-20250514",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 10,
    "total_tokens": 30
  }
}
Official documentation: https://docs.anthropic.com/en/api/messages
Create model responses for chat conversations with Anthropic Claude models. This endpoint follows OpenAI's chat completions request and response format, so OpenAI-compatible clients and tools can use Claude models without changing their request structure.

Request Parameters

model (string, required)
The model to use. Available Claude models include:
  • claude-haiku-4-5-20251001
  • claude-sonnet-4-20250514
  • claude-opus-4-5-20251101

messages (array, required)
Array of message objects comprising the conversation. Each message has:
  • role: "system", "user", or "assistant"
  • content: The message content (a string, or an array for multimodal input)

max_tokens (integer)
The maximum number of tokens to generate. Defaults to the model's maximum.

temperature (number, default: 1)
Sampling temperature between 0 and 2. Higher values make the output more random.

top_p (number)
Nucleus sampling parameter: the model samples only from the tokens making up the top P probability mass.

stream (boolean, default: false)
Whether to stream back partial progress using server-sent events.

stop (string | array)
Up to 4 sequences at which the API will stop generating further tokens.

presence_penalty (number, default: 0)
Number between -2.0 and 2.0. Positive values penalize new tokens that already appear in the text so far.

frequency_penalty (number, default: 0)
Number between -2.0 and 2.0. Positive values penalize new tokens based on how frequently they appear in the text so far.
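
As a sketch of how the optional parameters combine, the request below asks for a streamed reply with custom sampling and a stop sequence. The endpoint URL, API key, prompt, and stop string are placeholders, as in the example above:

# request a streamed completion with temperature, top_p, and a stop sequence
curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "messages": [
      {"role": "user", "content": "Write a haiku about the ocean."}
    ],
    "max_tokens": 256,
    "temperature": 0.5,
    "top_p": 0.9,
    "stream": true,
    "stop": ["\n\n"]
  }'

With stream set to true, partial progress is returned as server-sent events rather than a single JSON body.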

Vision Support

Claude supports image understanding through multimodal content. Include images in the content array:
{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What's in this image?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "https://example.com/image.jpg"
      }
    }
  ]
}
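
For context, a complete request carrying this multimodal message might look like the sketch below; the endpoint, API key, and image URL are placeholders, mirroring the earlier examples:

# send a text prompt together with an image URL in a single user message
curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image."},
          {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]
      }
    ],
    "max_tokens": 1024
  }'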

Response

id (string)
Unique identifier for the completion.

object (string)
The object type, which is always chat.completion.

created (integer)
Unix timestamp of when the completion was created.

model (string)
The model used for the completion.

choices (array)
List of completion choices. Each choice contains an index, a message with the assistant's role and content, and a finish_reason.

usage (object)
Token usage information: prompt_tokens, completion_tokens, and total_tokens.
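
To pull just the assistant's reply out of the choices array, the response can be piped through a JSON tool such as jq; this is a sketch assuming jq is installed, and any JSON parser works equally well:

# print only choices[0].message.content from the response
curl -s -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello, Claude!"}],
    "max_tokens": 1024
  }' | jq -r '.choices[0].message.content'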