Transcribe - From audio file
POST /transcribe
Generate a transcript from an audio file. Only audio/* MIME types are supported. The maximum duration is 10 minutes. For longer files, use the asynchronous equivalent.
Request
- multipart/form-data
Body
required
request_parameters
object required
The spoken or written locale of the transcript, representing both the language and its specific regional variant.
Possible values: [ENGLISH_US, ENGLISH_UK, SPANISH_ES, SPANISH_MX, FRENCH_FR, ARABIC_EG, ARABIC_LB, ARABIC_MA, ARABIC_SA, ARMENIAN_AM, BENGALI_IN, CANTONESE_CN, CROATIAN_HR, FILIPINO_PH, GERMAN_DE, GREEK_GR, GUJARATI_IN, HEBREW_IL, HINDI_IN, ITALIAN_IT, JAPANESE_JP, KHMER_KH, KOREAN_KR, MANDARIN_CN, PERSIAN_IR, POLISH_PL, PORTUGUESE_PT, PUNJABI_IN, RUSSIAN_RU, SERBIAN_RS, TAMIL_IN, TELUGU_IN, THAI_TH, URDU_IN, VIETNAMESE_VN]
ENGLISH_US
Indicates whether to segment transcription results at sentence boundaries. Defaults to false, in which case a single transcript item may span multiple sentences as long as they are not separated by pauses (silence) in the audio.
false
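The constraints stated above (audio/* MIME types only, a 10-minute maximum, and a fixed set of locale values) can be checked on the client before uploading. The following is a minimal sketch, not part of the API: the function name `check_transcribe_input` and the constant names are hypothetical, and the audio duration is assumed to be known already, since this page does not say how to measure it.

```python
# Hypothetical client-side pre-flight check; none of these names come from the API.

# Locale values copied from the "Possible values" list on this page.
SUPPORTED_LOCALES = frozenset({
    "ENGLISH_US", "ENGLISH_UK", "SPANISH_ES", "SPANISH_MX", "FRENCH_FR",
    "ARABIC_EG", "ARABIC_LB", "ARABIC_MA", "ARABIC_SA", "ARMENIAN_AM",
    "BENGALI_IN", "CANTONESE_CN", "CROATIAN_HR", "FILIPINO_PH", "GERMAN_DE",
    "GREEK_GR", "GUJARATI_IN", "HEBREW_IL", "HINDI_IN", "ITALIAN_IT",
    "JAPANESE_JP", "KHMER_KH", "KOREAN_KR", "MANDARIN_CN", "PERSIAN_IR",
    "POLISH_PL", "PORTUGUESE_PT", "PUNJABI_IN", "RUSSIAN_RU", "SERBIAN_RS",
    "TAMIL_IN", "TELUGU_IN", "THAI_TH", "URDU_IN", "VIETNAMESE_VN",
})

# The synchronous endpoint accepts at most 10 minutes of audio.
MAX_DURATION_SECONDS = 10 * 60


def check_transcribe_input(mime_type: str, duration_seconds: float, locale: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the input looks acceptable."""
    problems = []
    if not mime_type.lower().startswith("audio/"):
        problems.append(f"unsupported MIME type {mime_type!r}: only audio/* is accepted")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than 10 minutes: use the asynchronous endpoint")
    if locale not in SUPPORTED_LOCALES:
        problems.append(f"unknown locale {locale!r}")
    return problems


if __name__ == "__main__":
    for problem in check_transcribe_input("video/mp4", 700.0, "ENGLISH_US"):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once; the server remains the authority on what it accepts.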
Responses
- 200
- application/json
- Schema
- Example (from schema)
Schema
transcript
object[]
required
text
The transcribed text.
Also, I’m allergic to peanuts.
speaker
Who said the text in this transcript item.
Possible values: [doctor, patient, unspecified]
doctor
start_offset_ms
Start time of this transcript item, as an offset in milliseconds from the start of the audio file.
65100
end_offset_ms
End time of this transcript item, as an offset in milliseconds from the start of the audio file. Equals start_offset_ms plus the duration of the corresponding transcribed audio.
69300
{
  "transcript": [
    {
      "text": "Also, I’m allergic to peanuts.",
      "speaker": "doctor",
      "start_offset_ms": 65100,
      "end_offset_ms": 69300
    }
  ]
}
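As an illustration of consuming this response, the sketch below parses the JSON into typed items and derives each item's duration from its two offsets. The field names come from the example above; the class `TranscriptItem` and the helpers `parse_transcript` and `format_offset` are hypothetical names chosen for this sketch, and it assumes every item carries all four fields.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptItem:
    text: str
    speaker: str          # one of: doctor, patient, unspecified
    start_offset_ms: int  # offset from the start of the audio file
    end_offset_ms: int

    @property
    def duration_ms(self) -> int:
        # end_offset_ms is the start offset plus the duration of the spoken portion.
        return self.end_offset_ms - self.start_offset_ms


def parse_transcript(payload: str) -> list[TranscriptItem]:
    """Parse a 200 response body into a list of transcript items."""
    data = json.loads(payload)
    return [
        TranscriptItem(
            text=item["text"],
            speaker=item["speaker"],
            start_offset_ms=item["start_offset_ms"],
            end_offset_ms=item["end_offset_ms"],
        )
        for item in data["transcript"]
    ]


def format_offset(ms: int) -> str:
    """Render a millisecond offset as MM:SS.mmm."""
    minutes, remainder = divmod(ms, 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


# Response body modeled on the example from this page.
EXAMPLE_RESPONSE = """
{
  "transcript": [
    {
      "text": "Also, I'm allergic to peanuts.",
      "speaker": "doctor",
      "start_offset_ms": 65100,
      "end_offset_ms": 69300
    }
  ]
}
"""

if __name__ == "__main__":
    for item in parse_transcript(EXAMPLE_RESPONSE):
        start = format_offset(item.start_offset_ms)
        end = format_offset(item.end_offset_ms)
        print(f"[{start} - {end}] {item.speaker}: {item.text}")
```

For the example item, the offsets 65100 and 69300 give a duration of 4200 ms.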