
fix: emit response.output_text.done streaming event per OpenAI spec#5308

Open
robinnarsinghranabhat wants to merge 2 commits into llamastack:main from robinnarsinghranabhat:fix/output-text-done-event

Conversation


@robinnarsinghranabhat robinnarsinghranabhat commented Mar 26, 2026

Summary

The LlamaStack server was missing the response.output_text.done streaming event, which the OpenAI Responses API spec requires between output_text.delta events and content_part.done.

Discovered by comparing streaming event sequences between OpenAI's gpt-5.1 (ground truth) and LlamaStack server output using the OpenAI Python client.

Fixes #5309

Changes

  • streaming.py: Import and emit OutputTextDone with final accumulated text and logprobs, before content_part.done
  • openai_responses.py: Add logprobs field to OutputTextDone type definition (per OpenAI spec)
  • test_openai_responses.py: Verify output_text.done is emitted with correct fields and ordering
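
The ordering change above can be illustrated with a minimal, self-contained sketch (this is not the actual streaming.py code; the Event class and stream_text generator are illustrative stand-ins): accumulate the delta text, then emit a single done event carrying the full text before closing the content part.

```python
# Hedged sketch of the event ordering this PR enforces. The names below
# (Event, stream_text) are illustrative, not LlamaStack internals.
from dataclasses import dataclass, field


@dataclass
class Event:
    type: str
    text: str = ""
    logprobs: list = field(default_factory=list)


def stream_text(chunks):
    """Yield delta events, then output_text.done, then content_part.done."""
    accumulated = []
    for chunk in chunks:
        accumulated.append(chunk)
        yield Event("response.output_text.delta", text=chunk)
    # The fix: emit the done event with the final accumulated text
    # (logprobs omitted here for brevity) before closing the content part.
    yield Event("response.output_text.done", text="".join(accumulated))
    yield Event("response.content_part.done")


events = list(stream_text(["2 + 2", " = 4."]))
print([e.type for e in events])
```

Running this prints two delta events followed by output_text.done and then content_part.done, mirroring the ground-truth OpenAI sequence below.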

Test plan

Reproduce with this snippet

Prerequisites: LlamaStack server running at localhost:8321 with a registered model.

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8321/v1", api_key="fake")

events = list(client.responses.create(
    model="ollama/gpt-oss:20b",  # or any registered model
    input="What is 2 + 2?",
    stream=True,
))

for e in events:
    item = getattr(e, 'item', None)
    if item and hasattr(item, 'type'):
        print(f"{e.type:<50} item.type={item.type}")
    elif e.type == "response.output_text.done":
        print(f"{e.type:<50} text={e.text!r}")
    elif e.type == "response.content_part.done":
        print(f"{e.type:<50} part.type={e.part.type}")
    else:
        print(e.type)

has_done = any(e.type == "response.output_text.done" for e in events)
print(f"\nHas response.output_text.done: {has_done}")

Ground truth (OpenAI gpt-5.1 directly)

response.created
response.in_progress
response.output_item.added                         item.type=message
response.content_part.added
response.output_text.delta                         (x N)
response.output_text.done                          text='2 + 2 = 4.'    <-- present
response.content_part.done
response.output_item.done                          item.type=message
response.completed

Before this PR

response.created
response.in_progress
response.content_part.added
response.reasoning_text.delta                      (x N)
response.output_item.added
response.content_part.added
response.output_text.delta                         (x N)
                                                   <-- output_text.done MISSING
response.content_part.done                         part.type=output_text
response.reasoning_text.done
response.content_part.done                         part.type=reasoning_text
response.output_item.done
response.completed

After this PR

response.created
response.in_progress
response.content_part.added
response.reasoning_text.delta                      (x N)
response.output_item.added
response.content_part.added
response.output_text.delta                         (x N)
response.output_text.done                          <-- NOW PRESENT
response.content_part.done                         part.type=output_text
response.reasoning_text.done
response.content_part.done                         part.type=reasoning_text
response.output_item.done
response.completed

Note: Reasoning streaming events are not fully spec-compliant yet (e.g. incorrect ordering, missing output_item.added/done for reasoning items). That will be addressed in a follow-up PR. This PR focuses solely on the missing output_text.done event.

Unit tests: 223 passing (uv run pytest tests/unit/providers/responses/builtin/ -q)
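
The ordering property the new test verifies can be sketched like this (illustrative only, not the actual test_openai_responses.py code): output_text.done must appear exactly once and must precede the content_part.done that closes the output_text part.

```python
# Hedged sketch of the ordering assertion; the helper name is hypothetical.
def assert_done_before_part_done(event_types):
    """output_text.done appears exactly once, before content_part.done."""
    assert event_types.count("response.output_text.done") == 1
    done_idx = event_types.index("response.output_text.done")
    # Search for content_part.done only after the done event; raises
    # ValueError if the part is closed without it having been emitted.
    part_done_idx = event_types.index("response.content_part.done", done_idx)
    assert done_idx < part_done_idx


assert_done_before_part_done([
    "response.output_text.delta",
    "response.output_text.delta",
    "response.output_text.done",
    "response.content_part.done",
])
print("ordering ok")
```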

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 26, 2026
@robinnarsinghranabhat robinnarsinghranabhat force-pushed the fix/output-text-done-event branch from 8b6ac1c to f76c9c8 Compare March 26, 2026 01:39

github-actions bot commented Mar 26, 2026

✱ Stainless preview builds

This PR will update the llama-stack-client SDKs with the following commit message.

fix: emit response.output_text.done streaming event per OpenAI spec

Edit this comment to update it. It will appear in the SDK's changelogs.

llama-stack-client-node studio · conflict

Your SDK build resulted in a merge conflict between your custom code and the newly generated changes, but this did not represent a regression.

llama-stack-client-openapi studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️

llama-stack-client-go studio · conflict

Your SDK build resulted in a merge conflict between your custom code and the newly generated changes, but this did not represent a regression.

llama-stack-client-python studio · conflict

Your SDK build resulted in a merge conflict between your custom code and the newly generated changes, but this did not represent a regression.


This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-03-26 10:24:29 UTC
