POST /v1/chat/completions
Chat Completions (Stream)
curl --request POST \
  --url https://api.example.com/v1/chat/completions \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "messages": [
    {"role": "<string>", "content": "<string>"}
  ],
  "stream": true,
  "temperature": <number>,
  "max_tokens": <integer>,
  "top_p": <number>,
  "frequency_penalty": <number>,
  "presence_penalty": <number>
}
'
Create a model response for the given chat conversation with streaming enabled.

Overview

The streaming Chat Completions API allows you to receive partial responses as they are generated, providing a more responsive user experience. When streaming is enabled, the API returns Server-Sent Events (SSE) that contain incremental content.

Authentication

All requests require a Bearer token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
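In code, the header can be assembled like this (a minimal sketch; reading the key from an `API_KEY` environment variable is an assumption, not something the API mandates):

```python
import os

# Assumption: the key is stored in an API_KEY environment variable;
# "YOUR_API_KEY" is only a placeholder fallback.
api_key = os.environ.get("API_KEY", "YOUR_API_KEY")

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
```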

Request Parameters

model
string
required
ID of the model to use. Examples: gpt-4, gpt-4-turbo, gpt-4o
messages
array
required
A list of messages comprising the conversation so far.
stream
boolean
required
Must be set to true for streaming responses.
temperature
number
default:"1"
Sampling temperature between 0 and 2. Higher values make output more random.
max_tokens
integer
Maximum number of tokens to generate in the response.
top_p
number
default:"1"
Nucleus sampling parameter. Altering both temperature and top_p is not recommended; change one or the other.
frequency_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency.
presence_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
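The parameters above can be assembled and range-checked before sending. The helper below is a hypothetical sketch (the function name and validation are not part of the API); the defaults mirror the documented ones, and `max_tokens` is omitted unless set:

```python
def build_chat_request(model, messages, *, temperature=1.0, max_tokens=None,
                       top_p=1.0, frequency_penalty=0.0, presence_penalty=0.0):
    """Assemble a streaming chat completions payload (hypothetical helper)."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not -2.0 <= frequency_penalty <= 2.0:
        raise ValueError("frequency_penalty must be between -2.0 and 2.0")
    if not -2.0 <= presence_penalty <= 2.0:
        raise ValueError("presence_penalty must be between -2.0 and 2.0")

    payload = {
        "model": model,
        "messages": messages,
        "stream": True,  # this endpoint requires stream to be true
        "temperature": temperature,
        "top_p": top_p,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```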

Request Example

curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "stream": true,
    "temperature": 0.7
  }'

Response Example (SSE Stream)

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
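A client reassembles the full reply by concatenating the `delta.content` fragments and stopping at the `[DONE]` sentinel. A minimal sketch of that loop, operating on already-received SSE lines (transport and authentication are out of scope here):

```python
import json

def accumulate_stream(sse_lines):
    """Reassemble assistant text from SSE 'data:' lines (minimal sketch).

    Skips non-data lines, ignores chunks whose delta carries no content
    (e.g. the initial role-only chunk), and stops at the [DONE] sentinel.
    """
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

Fed the example stream above, this yields the assembled string "Hello!".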

Response Fields

Field     Type      Description
id        string    Unique identifier for the completion chunk
object    string    Object type, always chat.completion.chunk
created   integer   Unix timestamp of when the chunk was created
model     string    The model used for completion
choices   array     List of completion choices with delta content
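For stricter handling, each chunk can be mapped onto a typed structure with the fields from the table above. This dataclass is illustrative, not part of any SDK:

```python
import json
from dataclasses import dataclass

@dataclass
class CompletionChunk:
    """Typed view of one streamed chunk; field names follow the table above."""
    id: str
    object: str
    created: int
    model: str
    choices: list

    @classmethod
    def from_json(cls, raw: str) -> "CompletionChunk":
        d = json.loads(raw)
        return cls(id=d["id"], object=d["object"], created=d["created"],
                   model=d["model"], choices=d["choices"])
```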