POST /v1beta/models/{model}:generateContent
curl -X POST "https://api.example.com/v1beta/models/gemini-2.5-pro:generateContent?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "text": "Transcribe this audio"
          },
          {
            "inline_data": {
              "mime_type": "audio/mp3",
              "data": "BASE64_ENCODED_AUDIO_DATA"
            }
          }
        ]
      }
    ]
  }'
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Transcription: Hello, this is a test audio recording. The speaker is discussing the benefits of artificial intelligence in modern technology."
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP"
    }
  ]
}
Official documentation: https://ai.google.dev/gemini-api/docs/audio
Analyze and understand audio content using Google Gemini models. The model can transcribe speech, answer questions about audio, and extract information from audio files.

Request Parameters

key (string, required): API key, passed as a query parameter.
contents (array, required): Content array containing text and audio data. Each content object contains:
  • role (string): Role, either user or model
  • parts (array): Content parts array, can include:
    • text (string): Text prompt or question about the audio
    • inline_data (object): Audio data
      • mime_type (string): Audio MIME type (e.g., “audio/mp3”, “audio/wav”)
      • data (string): Base64-encoded audio data
generationConfig (object, optional): Generation configuration.
  • temperature (number): Sampling temperature
  • topP (number): Nucleus sampling parameter
  • maxOutputTokens (integer): Maximum output tokens
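
The parameters above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names build_body and build_request are hypothetical, and BASE_URL is the placeholder host from the curl example, so substitute your real endpoint. It base64-encodes raw audio bytes into inline_data and puts the API key in the key query parameter.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumption: placeholder host copied from the curl example; replace with your endpoint.
BASE_URL = "https://api.example.com/v1beta"


def build_body(audio_bytes, prompt, mime_type="audio/mp3", generation_config=None):
    """Build the JSON body: one user turn with a text part and an inline audio part."""
    body = {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": prompt},
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            # The data field carries base64 text, not raw bytes.
                            "data": base64.b64encode(audio_bytes).decode("ascii"),
                        }
                    },
                ],
            }
        ]
    }
    # generationConfig is only attached when the caller supplies one.
    if generation_config:
        body["generationConfig"] = generation_config
    return body


def build_request(model, api_key, body):
    """Wrap the body in a POST request; the API key travels as the key query parameter."""
    query = urllib.parse.urlencode({"key": api_key})
    url = f"{BASE_URL}/models/{model}:generateContent?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the result of build_request to urllib.request.urlopen and decode the response body with json.load. Keeping request construction separate from sending makes the payload easy to inspect before any audio leaves the machine.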

Response

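Returns a candidates array. Each candidate holds the generated text (the transcription or analysis of the provided audio) in content.parts, along with a finishReason such as STOP.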
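
As an illustration of reading that response, here is a small Python sketch. It assumes the response has already been parsed into a dict shaped like the example response on this page; extract_text is a hypothetical helper name, and the choice to read only the first candidate is an assumption of this sketch.

```python
def extract_text(response):
    """Join the text parts of the first candidate; return "" if there is none."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # Parts without a text field are skipped rather than treated as errors.
    return "".join(part.get("text", "") for part in parts)
```

Returning an empty string for a missing candidate keeps callers simple. Code that needs to distinguish an empty result from a stopped generation can check finishReason on the candidate separately.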