
fix: emit response.output_text.done streaming event per OpenAI spec#5308

Open
robinnarsinghranabhat wants to merge 2 commits into llamastack:main from robinnarsinghranabhat:fix/output-text-done-event

Conversation


@robinnarsinghranabhat robinnarsinghranabhat commented Mar 26, 2026

Summary

The LlamaStack server was missing the response.output_text.done streaming event, which the OpenAI Responses API spec requires between output_text.delta events and content_part.done.

Discovered by comparing streaming event sequences between OpenAI's gpt-5.1 (ground truth) and LlamaStack server output using the OpenAI Python client.

Fixes #5309

Changes

  • streaming.py: Import and emit OutputTextDone with final accumulated text and logprobs, before content_part.done
  • openai_responses.py: Add logprobs field to OutputTextDone type definition (per OpenAI spec)
  • test_openai_responses.py: Verify output_text.done is emitted with correct fields and ordering
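
The ordering change above can be illustrated with a minimal, self-contained sketch (this is not the actual streaming.py code; the Event class and stream_text generator are illustrative stand-ins): accumulate the delta text, then emit a single done event carrying the full text before closing the content part.

```python
# Hedged sketch of the event ordering this PR enforces. The names below
# (Event, stream_text) are illustrative, not LlamaStack internals.
from dataclasses import dataclass, field


@dataclass
class Event:
    type: str
    text: str = ""
    logprobs: list = field(default_factory=list)


def stream_text(chunks):
    """Yield delta events, then output_text.done, then content_part.done."""
    accumulated = []
    for chunk in chunks:
        accumulated.append(chunk)
        yield Event("response.output_text.delta", text=chunk)
    # The fix: emit the done event with the final accumulated text
    # (logprobs omitted here for brevity) before closing the content part.
    yield Event("response.output_text.done", text="".join(accumulated))
    yield Event("response.content_part.done")


events = list(stream_text(["2 + 2", " = 4."]))
print([e.type for e in events])
```

Running this prints two delta events followed by output_text.done and then content_part.done, mirroring the ground-truth OpenAI sequence below.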

Test plan

Reproduce with this snippet

Prerequisites: LlamaStack server running at localhost:8321 with a registered model.

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8321/v1", api_key="fake")

events = list(client.responses.create(
    model="ollama/gpt-oss:20b",  # or any registered model
    input="What is 2 + 2?",
    stream=True,
))

for e in events:
    item = getattr(e, 'item', None)
    if item and hasattr(item, 'type'):
        print(f"{e.type:<50} item.type={item.type}")
    elif e.type == "response.output_text.done":
        print(f"{e.type:<50} text={e.text!r}")
    elif e.type == "response.content_part.done":
        print(f"{e.type:<50} part.type={e.part.type}")
    else:
        print(e.type)

has_done = any(e.type == "response.output_text.done" for e in events)
print(f"\nHas response.output_text.done: {has_done}")

Ground truth (OpenAI gpt-5.1 directly)

response.created
response.in_progress
response.output_item.added                         item.type=message
response.content_part.added
response.output_text.delta                         (x N)
response.output_text.done                          text='2 + 2 = 4.'    <-- present
response.content_part.done
response.output_item.done                          item.type=message
response.completed

Before this PR

response.created
response.in_progress
response.content_part.added
response.reasoning_text.delta                      (x N)
response.output_item.added
response.content_part.added
response.output_text.delta                         (x N)
                                                   <-- output_text.done MISSING
response.content_part.done                         part.type=output_text
response.reasoning_text.done
response.content_part.done                         part.type=reasoning_text
response.output_item.done
response.completed

After this PR

response.created
response.in_progress
response.content_part.added
response.reasoning_text.delta                      (x N)
response.output_item.added
response.content_part.added
response.output_text.delta                         (x N)
response.output_text.done                          <-- NOW PRESENT
response.content_part.done                         part.type=output_text
response.reasoning_text.done
response.content_part.done                         part.type=reasoning_text
response.output_item.done
response.completed

Note: Reasoning streaming events are not fully spec-compliant yet (e.g. incorrect ordering, missing output_item.added/done for reasoning items). That will be addressed in a follow-up PR. This PR focuses solely on the missing output_text.done event.

Unit tests: 223 passing (uv run pytest tests/unit/providers/responses/builtin/ -q)
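
The ordering property the new test verifies can be sketched like this (illustrative only, not the actual test_openai_responses.py code): output_text.done must appear exactly once and must precede the content_part.done that closes the output_text part.

```python
# Hedged sketch of the ordering assertion; the helper name is hypothetical.
def assert_done_before_part_done(event_types):
    """output_text.done appears exactly once, before content_part.done."""
    assert event_types.count("response.output_text.done") == 1
    done_idx = event_types.index("response.output_text.done")
    # Search for content_part.done only after the done event; raises
    # ValueError if the part is closed without it having been emitted.
    part_done_idx = event_types.index("response.content_part.done", done_idx)
    assert done_idx < part_done_idx


assert_done_before_part_done([
    "response.output_text.delta",
    "response.output_text.delta",
    "response.output_text.done",
    "response.content_part.done",
])
print("ordering ok")
```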

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 26, 2026
@robinnarsinghranabhat robinnarsinghranabhat force-pushed the fix/output-text-done-event branch from 8b6ac1c to f76c9c8 Compare March 26, 2026 01:39

github-actions bot commented Mar 26, 2026

✱ Stainless preview builds

This PR will update the llama-stack-client SDKs with the following commit message.

fix: emit response.output_text.done streaming event per OpenAI spec

Edit this comment to update it. It will appear in the SDK's changelogs.

llama-stack-client-node studio · conflict

Your SDK build resulted in a merge conflict between your custom code and the newly generated changes, but this did not represent a regression.

llama-stack-client-openapi studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️

llama-stack-client-go studio · conflict

Your SDK build resulted in a merge conflict between your custom code and the newly generated changes, but this did not represent a regression.

llama-stack-client-python studio · conflict

Your SDK build resulted in a merge conflict between your custom code and the newly generated changes, but this did not represent a regression.


This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-03-26 10:24:29 UTC
