8.8.6. OPENAI-06 — Audio (Speech and Transcription)

This tutorial covers text-to-speech (TTS) and speech-to-text (STT).

8.8.6.1. Text to speech

speak synthesizes text into audio and returns the raw bytes as array<uint8>; the array is empty on error. Write the bytes to a file to play later, or stream them onward. The byte format is whatever the server produced, so name the file to match the format your server actually returns:

require openai/openai_audio
require daslib/fio

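// speak returns an empty array on error, so check before writing the file
var audio <- speak(client, "tts-1", "alloy", "Hello from daslang.")
print("received {length(audio)} audio bytes\n")
if (length(audio) > 0) {
    fopen("out.wav", "wb") $(f) {
        fwrite(f, audio)
    }
}
delete audio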

For full control over the request (response format, speed), build a SpeechRequest and call speech(client, req), which returns a SpeechResult.

8.8.6.2. Speech to text

transcribe uploads an audio file as a multipart request and returns the recognized text. translate works the same way but returns the text translated into English:

let res = transcribe(client, "whisper-1", "recording.wav")
if (res.ok) {
    print("transcript: {res.text}\n")
} else {
    print("error [{res.error.kind}/{res.error.status}]: {res.error.message}\n")
}
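Since translate is described as the same call with English output, a minimal sketch of it mirrors the transcribe example. It assumes client is already configured as in the earlier snippets and that translate returns the same result shape as transcribe (ok, text, error); that shape is an assumption based on the description here, not something this section confirms.

```daslang
require openai/openai_audio

// Assumption: `client` was created earlier in the tutorial.
// Assumption: translate returns the same result fields as transcribe.
let en = translate(client, "whisper-1", "recording.wav")
if (en.ok) {
    // Text is in English regardless of the language spoken in the file.
    print("english: {en.text}\n")
} else {
    print("error [{en.error.kind}/{en.error.status}]: {en.error.message}\n")
}
```

Use transcribe when the text should stay in the spoken language, and translate when English output is wanted.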

8.8.6.3. Quick Reference

Function	Description
`speak(client, model, voice, text)`	TTS → audio bytes (`array<uint8>`)
`speech(client, req)`	Full TTS request (`SpeechResult`)
`transcribe(client, model, path)`	STT: audio file → text
`translate(client, model, path)`	STT: audio file → English text