7.7.7. OPENAI-07 — Streaming Chat

Streaming delivers the assistant’s reply token-by-token over Server-Sent Events (SSE) instead of waiting for the whole response. chat_stream invokes your on_delta block for each text increment, accumulates the full text, and reassembles any streamed tool calls into result.tool_calls.

7.7.7.1. Streaming with on_delta

chat_stream sets stream = true for you. Each text increment is passed to the trailing block; the return value carries the full accumulated content and the finish_reason:

var req = ChatCompletionRequest(model = "gpt-4o-mini",
    messages <- [ChatMessage(role = "user", content = "Stream a short greeting.")])

let result = chat_stream(client, req) $(delta : string) {
    print(delta)          // each increment, as it arrives
}

if (result.ok) {
    print("\naccumulated: {result.content}\n")
    print("finish_reason: {result.finish_reason}\n")
}

7.7.7.2. How it works

Under the hood chat_stream reads the SSE response with request_cb, splits it into data: lines, decodes each ChatCompletionChunk, appends choices[0].delta.content, and stops at data: [DONE]. It tolerates CRLF line endings and SSE comment lines, and accumulates increments efficiently (joining once at the end).

7.7.7.3. Streamed tool calls

When a model streams a tool call, the id and function name arrive in the first frame and the JSON arguments accrete across later frames. chat_stream reassembles them by index into result.tool_callson_delta fires for text only, so a pure tool-call stream produces no text deltas and ends with finish_reason == "tool_calls":

var req = ChatCompletionRequest(model = "gpt-4o-mini",
    messages <- [ChatMessage(role = "user", content = "What's the weather in Paris?")],
    tools <- [Tool(_type = "function",
        _function = FunctionDef(name = "get_weather", description = "Look up the weather."))])

let result = chat_stream(client, req) $(delta : string) {}

for (tc in result.tool_calls) {
    print("{tc._function.name}({tc._function.arguments})\n")
    // → get_weather({"location":"Paris"})
}

The reassembled result.tool_calls are ordinary ToolCall values — identical to what the non-streaming chat() returns, so you feed the result back the same way.

7.7.7.4. Quick Reference

Function / value

Description

chat_stream(client, req) $(delta) { }

Stream a reply; block runs per increment

result.content

Full accumulated assistant text

result.finish_reason

Why generation stopped ("stop", "tool_calls")

result.tool_calls

Reassembled streamed tool calls (empty if none)

result.ok / result.error

Success flag / unified error

See also

Full source: tutorials/dasOPENAI/07_streaming_chat.das

For the SSE protocol and request_cb see HV-07 — SSE and Streaming.