Listen — From audio file



Generate a transcript and a structured clinical note from an audio file. Only audio/* MIME types are supported. The maximum duration is 10 minutes. For longer files, use the asynchronous equivalent.
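The two constraints above (an audio/* MIME type and a 10-minute limit) can be checked on the client before uploading. The following is a minimal sketch, not part of the API: the function name and the constant are hypothetical, and it assumes the 10-minute limit is inclusive and that the caller already knows the file's MIME type and duration.

```python
MAX_SYNC_DURATION_S = 10 * 60  # documented 10-minute limit (assumed inclusive)


def can_use_sync_endpoint(mime_type: str, duration_s: float) -> bool:
    """Return True if the file fits this endpoint, False if it is too long.

    Raises ValueError when the MIME type is not audio/*, since neither the
    synchronous nor (presumably) the asynchronous variant would accept it.
    """
    if not mime_type.lower().startswith("audio/"):
        raise ValueError(f"unsupported MIME type: {mime_type!r}")
    return duration_s <= MAX_SYNC_DURATION_S
```

A caller could route files for which this returns False to the asynchronous equivalent.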



    request_parameters object (required)

    Object containing the information needed, alongside the audio file, to transcribe the audio and generate a note.

    output_objects listen_output_object[] (required)

    Possible values: [transcript_item, note]

    Specifies which items to return, that is, which features to use: transcription, note generation, or both.

    language copilot_language (required)

    Possible values: [fr, en, en-US, en-GB, fr-FR, es-ES, es-MX]

    Language spoken in the audio ('fr' and 'en' are deprecated, and correspond to 'fr-FR' and 'en-US' respectively).
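    Since 'fr' and 'en' are deprecated aliases, a client may prefer to normalize them before sending a request. A minimal sketch, assuming client-side normalization is desired (the function and constant names are hypothetical; the mapping and the list of values come from the description above):

```python
# Deprecated codes and their documented replacements.
DEPRECATED_LANGUAGES = {"fr": "fr-FR", "en": "en-US"}
# Non-deprecated values listed for copilot_language.
SUPPORTED_LANGUAGES = {"en-US", "en-GB", "fr-FR", "es-ES", "es-MX"}


def normalize_language(code: str) -> str:
    """Map a deprecated language code to its replacement and validate it."""
    code = DEPRECATED_LANGUAGES.get(code, code)
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {code!r}")
    return code
```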

    split_by_sentence boolean (required)

    Default value: false

    Indicates whether to segment transcription results at sentence boundaries. Default is false, meaning that a single transcript item may encompass multiple sentences, provided they are not delineated by pauses (silence) in the audio.

    note_generation_mode copilot_note_generation_mode

    Possible values: [highest_quality, fastest]

    Choose a generation mode:

    highest_quality: generates the highest-quality notes, but may take up to one minute;

    fastest: generates notes more quickly (within a few seconds), but may not give the best possible output.

    Default is highest_quality.

    section_style copilot_section_style

    Possible values: [auto, paragraphs, bullet_points]

    Choose a desired style for note sections:

    auto: automatically picks the most natural formatting option;

    paragraphs: prioritizes generating paragraphs;

    bullet_points: prioritizes structuring content using bullet points.

    Default is auto.

    note_template copilot_note_template


    The desired template of the generated note. Default is GENERAL_MEDICINE. Check Note template for details.

    file binary (required)
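    The request fields above can be assembled and validated before sending. The sketch below only builds the request_parameters value; it does not call the API. It assumes that request_parameters is serialized as JSON and sent next to the file part (for example in a multipart/form-data body), which this page does not state explicitly. The function name is hypothetical; the field names and allowed values are taken from the list above.

```python
import json
from typing import Optional, Sequence

OUTPUT_OBJECTS = {"transcript_item", "note"}
LANGUAGES = {"fr", "en", "en-US", "en-GB", "fr-FR", "es-ES", "es-MX"}
NOTE_GENERATION_MODES = {"highest_quality", "fastest"}
SECTION_STYLES = {"auto", "paragraphs", "bullet_points"}


def build_request_parameters(
    language: str,
    output_objects: Sequence[str] = ("transcript_item", "note"),
    split_by_sentence: bool = False,
    note_generation_mode: Optional[str] = None,
    section_style: Optional[str] = None,
    note_template: Optional[str] = None,
) -> str:
    """Return request_parameters as a JSON string (assumed wire format)."""
    if language not in LANGUAGES:
        raise ValueError(f"unknown language: {language!r}")
    if not output_objects or set(output_objects) - OUTPUT_OBJECTS:
        raise ValueError(f"invalid output_objects: {list(output_objects)!r}")
    if note_generation_mode is not None and note_generation_mode not in NOTE_GENERATION_MODES:
        raise ValueError(f"unknown note_generation_mode: {note_generation_mode!r}")
    if section_style is not None and section_style not in SECTION_STYLES:
        raise ValueError(f"unknown section_style: {section_style!r}")

    # Required fields are always present.
    params = {
        "output_objects": list(output_objects),
        "language": language,
        "split_by_sentence": split_by_sentence,
    }
    # Optional fields are omitted when unset so the documented defaults apply.
    if note_generation_mode is not None:
        params["note_generation_mode"] = note_generation_mode
    if section_style is not None:
        params["section_style"] = section_style
    if note_template is not None:
        params["note_template"] = note_template
    return json.dumps(params)
```

    For instance, build_request_parameters("en-US", ["note"], section_style="bullet_points") requests only a note with bullet-point sections and leaves note_generation_mode and note_template at their defaults.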


Results of processing the audio file.

    transcript object[]

    Transcript items from the audio file.

  • Array [
  • text string (required)

    The transcribed text.

    speaker copilot_speaker (required)

    Possible values: [doctor, patient, unspecified]

    Who said the text in this transcript item.

    start_offset_ms integer (required)

    Start time of this transcription item as the offset, in milliseconds, from the start of the audio file.

    end_offset_ms integer (required)

    End time of this transcription item as the offset, in milliseconds, from the start of the audio file. Equals start_offset_ms plus the duration of the related transcribed audio portion.

  • ]
    note object

    The generated note.

    sections object[] (required)

    Content of the note structured in multiple sections.

  • Array [
  • key string (required)

    A key identifying a section of a note. The set of possible keys depends on the template that is used. Check Note template for possible values.

    title string (required)

    The section title.

    text string (required)

    Content of the note section.

  • ]
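The response structure described above can be turned into readable text as follows. This is a minimal sketch that operates on an already-decoded JSON response (a Python dict); the helper names are hypothetical. It treats both transcript and note as optional, since neither is marked required and each depends on the requested output_objects.

```python
from typing import Any, Dict, List


def format_timestamp(offset_ms: int) -> str:
    """Format a millisecond offset as MM:SS.mmm."""
    minutes, remainder_ms = divmod(offset_ms, 60_000)
    return f"{minutes:02d}:{remainder_ms / 1000:06.3f}"


def render_transcript(items: List[Dict[str, Any]]) -> str:
    """One line per transcript item: time range, speaker, text."""
    lines = []
    for item in items:
        start = format_timestamp(item["start_offset_ms"])
        end = format_timestamp(item["end_offset_ms"])
        lines.append(f"[{start} - {end}] {item['speaker']}: {item['text']}")
    return "\n".join(lines)


def render_note(note: Dict[str, Any]) -> str:
    """Each section as its title followed by its text."""
    return "\n\n".join(f"{s['title']}\n{s['text']}" for s in note["sections"])


def render_response(response: Dict[str, Any]) -> str:
    """Render whichever of transcript and note the response contains."""
    parts = []
    if response.get("transcript"):
        parts.append(render_transcript(response["transcript"]))
    if response.get("note"):
        parts.append(render_note(response["note"]))
    return "\n\n".join(parts)
```

The section key is not used for display here; it is more useful for looking up a specific section programmatically, with the valid keys determined by the note template.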