7.7.7. OPENAI-07 — Streaming Chat
Streaming delivers the assistant’s reply token-by-token over Server-Sent Events
(SSE) instead of waiting for the whole response. chat_stream invokes your
on_delta block for each text increment, accumulates the full text, and
reassembles any streamed tool calls into result.tool_calls.
7.7.7.1. Streaming with on_delta
chat_stream sets stream = true for you. Each text increment is passed to
the trailing block; the return value carries the full accumulated content and
the finish_reason:
var req = ChatCompletionRequest(model = "gpt-4o-mini",
    messages <- [ChatMessage(role = "user", content = "Stream a short greeting.")])
let result = chat_stream(client, req) $(delta : string) {
    print(delta) // each increment, as it arrives
}
if (result.ok) {
    print("\naccumulated: {result.content}\n")
    print("finish_reason: {result.finish_reason}\n")
}
7.7.7.2. How it works
Under the hood, chat_stream reads the SSE response with request_cb,
splits it into data: lines, decodes each one as a ChatCompletionChunk, appends
choices[0].delta.content to the accumulated text, and stops at data: [DONE].
It tolerates CRLF line endings and SSE comment lines, and it accumulates
increments efficiently by joining them once at the end.
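The line-handling part of those steps can be sketched on its own. This is a minimal illustration and not the library's implementation: sse_data_payloads is a hypothetical helper, it takes the whole body as one string instead of reading incrementally through request_cb, and it leaves out JSON decoding. It assumes strip trims the trailing carriage return that a CRLF ending leaves behind.

```daslang
options gen2
require strings
require daslib/strings_boost

// Hypothetical helper (not part of the library): returns the payload of each
// "data:" line in a raw SSE body, stopping at the [DONE] sentinel.
def sse_data_payloads(body : string) : array<string> {
    var payloads : array<string>
    var lines <- split(body, "\n")
    for (raw in lines) {
        // Trim surrounding whitespace; assumed to include the '\r' of a CRLF ending.
        let line = strip(raw)
        if (empty(line) || starts_with(line, ":")) {
            continue                        // blank event separator or SSE comment line
        }
        if (!starts_with(line, "data:")) {
            continue                        // other SSE fields are ignored in this sketch
        }
        let payload = strip(slice(line, 5)) // text after the "data:" prefix
        if (payload == "[DONE]") {
            break                           // end-of-stream sentinel
        }
        payloads |> push(payload)
    }
    return <- payloads
}

[export]
def main() {
    // Placeholder payloads; a real stream carries one JSON ChatCompletionChunk per data: line.
    let body = ": keep-alive\r\ndata: chunk-1\r\n\r\ndata: chunk-2\r\n\r\ndata: [DONE]\r\ndata: ignored\r\n"
    var payloads <- sse_data_payloads(body)
    for (p in payloads) {
        print("{p}\n")
    }
}
```

In the sketch, anything after the [DONE] line is never examined, which mirrors the stop condition described above.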
7.7.7.3. Streamed tool calls
When a model streams a tool call, the id and function name arrive in the first
frame and the JSON arguments accumulate across later frames. chat_stream
reassembles them by index into result.tool_calls. on_delta fires for text
only, so a pure tool-call stream produces no text deltas and ends with
finish_reason == "tool_calls":
var req = ChatCompletionRequest(model = "gpt-4o-mini",
    messages <- [ChatMessage(role = "user", content = "What's the weather in Paris?")],
    tools <- [Tool(_type = "function",
        _function = FunctionDef(name = "get_weather", description = "Look up the weather."))])
let result = chat_stream(client, req) $(delta : string) {}
for (tc in result.tool_calls) {
    print("{tc._function.name}({tc._function.arguments})\n")
    // → get_weather({"location":"Paris"})
}
The reassembled result.tool_calls are ordinary ToolCall values, identical
to what the non-streaming chat() returns, so you feed the result back the
same way.
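To make "reassembled by index" concrete, here is a small standalone sketch of the idea. It is not the library's code: ToolCallFragment, AssembledCall, and reassemble are hypothetical names, and the fragment layout (an index plus optional id, name, and a piece of the argument string) is an assumption based on the description above.

```daslang
options gen2
require strings

// Hypothetical stand-in for one streamed tool-call delta; field names are assumptions.
struct ToolCallFragment {
    index : int
    id : string
    name : string
    arguments : string
}

// Hypothetical stand-in for a fully reassembled call.
struct AssembledCall {
    id : string
    name : string
    arguments : string
}

def reassemble(fragments : array<ToolCallFragment>) : array<AssembledCall> {
    var calls : array<AssembledCall>
    for (f in fragments) {
        if (length(calls) <= f.index) {
            calls |> resize(f.index + 1)    // grow so that calls[f.index] exists
        }
        if (!empty(f.id)) {
            calls[f.index].id = f.id        // the id arrives in the first frame
        }
        if (!empty(f.name)) {
            calls[f.index].name = f.name    // so does the function name
        }
        // Later frames carry pieces of the JSON argument string; append in arrival order.
        calls[f.index].arguments = "{calls[f.index].arguments}{f.arguments}"
    }
    return <- calls
}

[export]
def main() {
    var frames : array<ToolCallFragment>
    frames |> push(ToolCallFragment(index = 0, id = "call_1", name = "get_weather"))
    frames |> push(ToolCallFragment(index = 0, arguments = "\{\"location\":"))
    frames |> push(ToolCallFragment(index = 0, arguments = "\"Paris\"\}"))
    var calls <- reassemble(frames)
    for (c in calls) {
        print("{c.name}({c.arguments})\n")
    }
}
```

Keying on the index rather than the id matters in this sketch because only the first frame of a call carries the id; later argument pieces can be matched to their call by position alone.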
7.7.7.4. Quick Reference
| Function / value | Description |
|---|---|
| chat_stream | Stream a reply; block runs per increment |
| result.content | Full accumulated assistant text |
| result.finish_reason | Why generation stopped (e.g. "stop", "tool_calls") |
| result.tool_calls | Reassembled streamed tool calls (empty if none) |
| result.ok | Success flag / unified error |
See also
Full source: tutorials/dasOPENAI/07_streaming_chat.das
For the SSE protocol and request_cb see HV-07 — SSE and Streaming.