Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions client-sdks/stainless/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -175,6 +175,10 @@ resources:
models:
response_object_stream: OpenAIResponseObjectStream
response_object: OpenAIResponseObject
compacted_response: OpenAICompactedResponse
response_input: OpenAIResponseInput
response_message: OpenAIResponseMessage
response_output: OpenAIResponseOutput
methods:
create:
type: http
Expand All @@ -189,6 +193,9 @@ resources:
delete:
type: http
endpoint: delete /v1/responses/{response_id}
compact:
type: http
endpoint: post /v1/responses/compact
subresources:
input_items:
methods:
Expand Down
300 changes: 228 additions & 72 deletions client-sdks/stainless/openapi.yml

Large diffs are not rendered by default.

45 changes: 35 additions & 10 deletions docs/docs/api-openai/conformance.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -18,20 +18,20 @@ This documentation is auto-generated from the OpenAI API specification compariso

| Metric | Value |
|--------|-------|
| **Overall Conformance Score** | 87.8% |
| **Endpoints Implemented** | 28/146 |
| **Overall Conformance Score** | 87.6% |
| **Endpoints Implemented** | 29/146 |
| **Total Properties Checked** | 3441 |
| **Schema/Type Issues** | 288 |
| **Missing Properties** | 131 |
| **Total Issues to Fix** | 419 |
| **Schema/Type Issues** | 295 |
| **Missing Properties** | 132 |
| **Total Issues to Fix** | 427 |

## Integration Test Coverage

**Overall Test Coverage Score: 44.1%**
**Overall Test Coverage Score: 43.8%**

| Category | Covered | Total | Score |
|----------|---------|-------|-------|
| CRUD Operations | 5 | 6 | 83.3% |
| CRUD Operations | 5 | 7 | 71.4% |
| Conversations | 5 | 9 | 55.6% |
| Request Parameters | 21 | 25 | 84.0% |
| Streaming Events | 16 | 53 | 30.2% |
Expand All @@ -51,7 +51,7 @@ Categories are sorted by conformance score (lowest first, needing most attention
| Embeddings | 64.3% | 14 | 5 | 0 |
| Files | 66.7% | 42 | 8 | 6 |
| Models | 66.7% | 15 | 0 | 5 |
| Responses | 86.7% | 225 | 29 | 1 |
| Responses | 83.1% | 225 | 36 | 2 |
| Chat | 87.1% | 403 | 33 | 19 |
| Conversations | 98.8% | 2165 | 22 | 4 |

Expand Down Expand Up @@ -175,7 +175,6 @@ The following OpenAI API endpoints are not yet implemented in Llama Stack:

### /responses

- `/responses/compact`
- `/responses/input_tokens`

### /skills
Expand Down Expand Up @@ -964,7 +963,7 @@ Below is a detailed breakdown of conformance issues and missing properties for e

### Responses

**Score:** 86.7% Β· **Issues:** 29 Β· **Missing:** 1
**Score:** 83.1% Β· **Issues:** 36 Β· **Missing:** 2

#### `/responses`

Expand Down Expand Up @@ -1014,6 +1013,32 @@ Below is a detailed breakdown of conformance issues and missing properties for e

</details>

#### `/responses/compact`

**POST**

<details>
<summary>Missing Properties (1)</summary>

- `requestBody.content.application/x-www-form-urlencoded`

</details>

<details>
<summary>Schema Issues (7)</summary>

| Property | Issues | Tested |
|----------|--------|--------|
| `requestBody.content.application/json.properties.input` | Union variants added: 2; Union variants removed: 1 | Yes |
| `requestBody.content.application/json.properties.model` | Type added: ['string']; Union variants removed: 3 | Yes |
| `responses.200.content.application/json.properties.object` | Default changed: response.compaction -> None | No |
| `responses.200.content.application/json.properties.output.items` | Union variants added: 4 | Yes |
| `responses.200.content.application/json.properties.usage` | Type removed: ['object'] | Yes |
| `responses.200.content.application/json.properties.usage.properties.input_tokens_details` | Type removed: ['object'] | No |
| `responses.200.content.application/json.properties.usage.properties.output_tokens_details` | Type removed: ['object'] | No |

</details>

### Chat

**Score:** 87.1% Β· **Issues:** 33 Β· **Missing:** 19
Expand Down
26 changes: 21 additions & 5 deletions docs/docs/api-openai/provider_matrix.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,19 +19,19 @@ inference provider, based on integration test results.

| Provider | Tested | Passing | Failing | Coverage |
|----------|--------|---------|---------|----------|
| azure | 102 | 102 | 0 | 86% |
| bedrock | 25 | 25 | 0 | 21% |
| openai | 119 | 119 | 0 | 100% |
| azure | 113 | 113 | 0 | 87% |
| bedrock | 25 | 25 | 0 | 19% |
| openai | 130 | 130 | 0 | 100% |
| vllm | 1 | 1 | 0 | 1% |
| watsonx | 56 | 56 | 0 | 47% |
| watsonx | 56 | 56 | 0 | 43% |

## Provider Details

Models, endpoints, and versions used during test recordings.

| Provider | Model(s) | Endpoint | Version Info |
|----------|----------|----------|--------------|
| azure | gpt-4o | llama-stack-test.openai.azure.com, lls-test.openai.azure.com | openai sdk: 2.5.0 |
| azure | gpt-4o | llama-stack-test.openai.azure.com, lls-test.openai.azure.com | openai sdk: 2.30.0 |
| bedrock | openai.gpt-oss-20b | bedrock-mantle.us-east-2.api.aws | openai sdk: 2.5.0 |
| openai | gpt-4o, o4-mini, text-embedding-3-small | api.openai.com | openai sdk: 2.5.0 |
| vllm | Qwen/Qwen3-0.6B | β€” | β€” |
Expand All @@ -53,6 +53,22 @@ Models, endpoints, and versions used during test recordings.
| streaming basic | βœ… | βœ… | βœ… | β€” | βœ… |
| streaming incremental content | βœ… | βœ… | βœ… | β€” | βœ… |

## Compact Responses

| Feature | azure | bedrock | openai | vllm | watsonx |
| --- | --- | --- | --- | --- | --- |
| compact basic conversation | βœ… | β€” | βœ… | β€” | β€” |
| compact chain through compaction | βœ… | β€” | βœ… | β€” | β€” |
| compact double compaction | βœ… | β€” | βœ… | β€” | β€” |
| compact input items hides compaction | βœ… | β€” | βœ… | β€” | β€” |
| compact roundtrip | βœ… | β€” | βœ… | β€” | β€” |
| compact single message | βœ… | β€” | βœ… | β€” | β€” |
| compact with previous response id | βœ… | β€” | βœ… | β€” | β€” |
| compact with tool calls dropped | βœ… | β€” | βœ… | β€” | β€” |
| context management auto compacts large input | βœ… | β€” | βœ… | β€” | β€” |
| context management no compact below threshold | βœ… | β€” | βœ… | β€” | β€” |
| context management none does not compact | βœ… | β€” | βœ… | β€” | β€” |

## Conversation Responses

| Feature | azure | bedrock | openai | vllm | watsonx |
Expand Down
Loading
Loading