Process and understand audio content using Gemini models
Official documentation: https://ai.google.dev/gemini-api/docs/audioAnalyze and understand audio content using Google Gemini models. The model can transcribe speech, answer questions about audio, and extract information from audio files.
role (string): Role (user or model)parts (array): Content parts array, can include:
text (string): Text prompt or question about the audioinline_data (object): Audio data
mime_type (string): Audio MIME type (e.g., “audio/mp3”, “audio/wav”)data (string): Base64-encoded audio datatemperature (number): Sampling temperaturetopP (number): Nucleus sampling parametermaxOutputTokens (integer): Maximum output tokens