POST /v1/chat/completions
Chat Completions (Stream)
curl --request POST \
  --url https://api.example.com/v1/chat/completions \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "messages": [
    {"role": "<string>", "content": "<string>"}
  ],
  "stream": true,
  "temperature": <number>,
  "max_tokens": <integer>,
  "top_p": <number>,
  "frequency_penalty": <number>,
  "presence_penalty": <number>
}
'
Create a model response for the given chat conversation with streaming enabled.

Overview

The streaming Chat Completions API allows you to receive partial responses as they are generated, providing a more responsive user experience. When streaming is enabled, the API returns Server-Sent Events (SSE) that contain incremental content.

Authentication

All requests require a Bearer token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
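In code, the header can be assembled like this (a minimal sketch; reading the key from an `API_KEY` environment variable is an assumption, not something the API mandates):

```python
import os

# Assumption: the key is stored in an API_KEY environment variable;
# "YOUR_API_KEY" is only a placeholder fallback.
api_key = os.environ.get("API_KEY", "YOUR_API_KEY")

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
```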

Request Parameters

model
string
required
ID of the model to use. Examples: gpt-4, gpt-4-turbo, gpt-4o
messages
array
required
A list of messages comprising the conversation so far.
stream
boolean
required
Must be set to true for streaming responses.
temperature
number
default:"1"
Sampling temperature between 0 and 2. Higher values make output more random.
max_tokens
integer
Maximum number of tokens to generate in the response.
top_p
number
default:"1"
Nucleus sampling parameter. Altering both temperature and top_p is not recommended; change one or the other.
frequency_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency.
presence_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
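The parameters above can be assembled and range-checked before sending. The helper below is a hypothetical sketch (the function name and validation are not part of the API); the defaults mirror the documented ones, and `max_tokens` is omitted unless set:

```python
def build_chat_request(model, messages, *, temperature=1.0, max_tokens=None,
                       top_p=1.0, frequency_penalty=0.0, presence_penalty=0.0):
    """Assemble a streaming chat completions payload (hypothetical helper)."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not -2.0 <= frequency_penalty <= 2.0:
        raise ValueError("frequency_penalty must be between -2.0 and 2.0")
    if not -2.0 <= presence_penalty <= 2.0:
        raise ValueError("presence_penalty must be between -2.0 and 2.0")

    payload = {
        "model": model,
        "messages": messages,
        "stream": True,  # this endpoint requires stream to be true
        "temperature": temperature,
        "top_p": top_p,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```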

Request Example

curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "stream": true,
    "temperature": 0.7
  }'

Response Example (SSE Stream)

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
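A client reassembles the full reply by concatenating the `delta.content` fragments and stopping at the `[DONE]` sentinel. A minimal sketch of that loop, operating on already-received SSE lines (transport and authentication are out of scope here):

```python
import json

def accumulate_stream(sse_lines):
    """Reassemble assistant text from SSE 'data:' lines (minimal sketch).

    Skips non-data lines, ignores chunks whose delta carries no content
    (e.g. the initial role-only chunk), and stops at the [DONE] sentinel.
    """
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

Fed the example stream above, this yields the assembled string "Hello!".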

Response Fields

Field     Type      Description
id        string    Unique identifier for the completion chunk
object    string    Object type, always chat.completion.chunk
created   integer   Unix timestamp of when the chunk was created
model     string    The model used for completion
choices   array     List of completion choices with delta content
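For stricter handling, each chunk can be mapped onto a typed structure with the fields from the table above. This dataclass is illustrative, not part of any SDK:

```python
import json
from dataclasses import dataclass

@dataclass
class CompletionChunk:
    """Typed view of one streamed chunk; field names follow the table above."""
    id: str
    object: str
    created: int
    model: str
    choices: list

    @classmethod
    def from_json(cls, raw: str) -> "CompletionChunk":
        d = json.loads(raw)
        return cls(id=d["id"], object=d["object"], created=d["created"],
                   model=d["model"], choices=d["choices"])
```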