7.7.6. OPENAI-06 — Audio (Speech and Transcription)
This tutorial covers text-to-speech (TTS) and speech-to-text (STT).
7.7.6.1. Text to speech
speak synthesizes text into audio and returns the raw bytes
(array<uint8>, empty on error). Write them to a file to play later, or
stream them onward — the byte format is whatever the server produced:
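require openai/openai_audio
require daslib/fio
// speak returns the raw audio bytes; the array is empty on error
var audio <- speak(client, "tts-1", "alloy", "Hello from daslang.")
print("received {length(audio)} audio bytes\n")
// write the bytes to disk unchanged, in whatever format the server sent
fopen("out.wav", "wb") $(f) {
    fwrite(f, audio)
}
delete audio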
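Because the byte format depends on the server, it can help to check what actually came back before choosing a file extension. The following is a minimal sketch, not part of openai/openai_audio: has_magic and sniff_audio_ext are hypothetical helper names, and the sketch assumes that character_at from the strings module returns the character code as an int. It only inspects a few well-known leading signatures (WAV, Ogg, FLAC, MP3) and falls back to "bin" for anything else, including the empty array returned on error.

```daslang
options gen2

require strings

// Hypothetical helper (not part of openai/openai_audio):
// true when the bytes starting at `offset` match the ASCII text in `magic`.
def has_magic(data : array<uint8>; offset : int; magic : string) : bool {
    let n = length(magic)
    if (length(data) < offset + n) {
        return false
    }
    for (i in range(n)) {
        if (int(data[offset + i]) != character_at(magic, i)) {
            return false
        }
    }
    return true
}

// Hypothetical helper: guess a file extension from the leading bytes.
def sniff_audio_ext(data : array<uint8>) : string {
    // WAV: "RIFF" at offset 0 and "WAVE" at offset 8
    if (has_magic(data, 0, "RIFF") && has_magic(data, 8, "WAVE")) {
        return "wav"
    }
    // Ogg container (used for Opus and Vorbis)
    if (has_magic(data, 0, "OggS")) {
        return "ogg"
    }
    // FLAC stream marker
    if (has_magic(data, 0, "fLaC")) {
        return "flac"
    }
    // MP3 that starts with an ID3 tag
    if (has_magic(data, 0, "ID3")) {
        return "mp3"
    }
    // MP3 without a tag: MPEG frame sync, 11 set bits (255, then top 3 bits of the next byte)
    if (length(data) >= 2 && int(data[0]) == 255 && (int(data[1]) & 224) == 224) {
        return "mp3"
    }
    // unknown format, or the empty array that signals an error
    return "bin"
}

[export]
def main() {
    // Build a 12-byte WAV-style header to exercise the helper without a server.
    let header = "RIFF0000WAVE"
    var sample : array<uint8>
    for (i in range(length(header))) {
        push(sample, uint8(character_at(header, i)))
    }
    print("detected: {sniff_audio_ext(sample)}\n")
    delete sample
}
```

With the bytes returned by speak, the result could be used to build the output name, for example "out.{sniff_audio_ext(audio)}", before passing it to fopen.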
For the full request (response format, speed) use speech(client, req) with a
SpeechRequest.
7.7.6.2. Speech to text
transcribe uploads an audio file as a multipart request and returns the
recognized text. translate works the same way but returns the text translated
into English:
let res = transcribe(client, "whisper-1", "recording.wav")
if (res.ok) {
    print("transcript: {res.text}\n")
} else {
    print("error [{res.error.kind}/{res.error.status}]: {res.error.message}\n")
}
7.7.6.3. Quick Reference
| Function | Description |
|---|---|
| speak | TTS → audio bytes (array<uint8>) |
| speech | Full TTS request (SpeechRequest) |
| transcribe | STT: audio file → text |
| translate | STT into English |