docs: rewrite README and docs to lead with OpenAI API compatibility #5323

leseb merged 9 commits into llamastack:main
Conversation
Rewrite the README, docs landing page, API overview, and OpenAI compatibility page to reflect the current state of the project. The project has evolved from a "standardized Gen AI API" to an OpenAI-compatible API server with pluggable providers. The new messaging leads with what users care about: drop-in compatibility with the OpenAI API, any model, any infrastructure.

Key changes:

- README leads with "OpenAI-compatible API server" and a code snippet
- APIs are described by their actual endpoints, not internal categories
- Responses API (agentic orchestration, MCP, file_search) is featured
- Provider architecture shown as a local-to-production concept, not a table
- Open Responses conformance mentioned
- OpenAI compat page trimmed from 230 lines of filler to focused content
- API overview page now lists actual endpoints
- pyproject.toml description updated for PyPI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
@raghotham @franciscojavierarceo @cdoern @mattf as discussed in today's community call.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sébastien Han <seb@redhat.com>
> - OpenAI API compatibility
> - Cloud-based execution
> - Scalable infrastructure
>
> ## Implemented endpoints
We should also mention the non-OpenAI APIs that are API adjacent like Prompts and File Processor. Eventually we would add Memory to that list too.
docs/docs/index.mdx (outdated)
> # Welcome to Llama Stack
>
> Llama Stack is the open-source framework for building generative AI applications.
> **Open-source API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.**
Suggested change:

> **Open-source API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.**
> **Open-source Agentic API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.**
docs/docs/api-openai/index.mdx (outdated)
> | Embeddings | `/v1/embeddings` | Text embeddings |
> | Models | `/v1/models` | Model listing and management |
> | Files | `/v1/files` | File upload and management |
> | Vector Stores | `/v1/vector_stores` | Document storage and semantic search |
actually we support more than just semantic search
> - **Safety** — content moderation via Llama Guard
> - **[Open Responses](https://www.openresponses.org/) conformant** — the Responses API implementation passes the Open Responses conformance test suite
>
> ## Use any model, use any infrastructure
can we outline RAG in this diagram a little more? Right now it only shows inference provider plugin
franciscojavierarceo left a comment:

some small suggestions but otherwise lgtm
> **Open-source API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.**
>
> ## Overview
>
> Llama Stack is a drop-in replacement for the OpenAI API that you can run anywhere — your laptop, your datacenter, or the cloud. Use any OpenAI-compatible client or agentic framework. Swap between Llama, GPT, Gemini, Mistral, or any model without changing your application code.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello"}],
)
```
Here are more demos for the openai responses api with the openai client: opendatahub-io/llama-stack-demos#324
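For context, a minimal sketch of what a Responses API request body can look like when it uses the built-in file_search tool the PR highlights; the model name, input text, and vector store id below are illustrative placeholders, not values taken from this PR or the linked demos.

```python
import json

# Illustrative request body for the /v1/responses endpoint of a local
# Llama Stack server; all identifiers here are placeholders.
BASE_URL = "http://localhost:8321/v1"

payload = {
    "model": "llama-3.3-70b",
    "input": "What changed in the latest release?",
    # The built-in file_search tool performs retrieval (RAG) against a
    # managed vector store in the same API call.
    "tools": [{"type": "file_search", "vector_store_ids": ["vs_example"]}],
}

request_body = json.dumps(payload)
print(f"POST {BASE_URL}/responses")
print(request_body)
```

Sending this body to a running server with any HTTP client (or via the OpenAI SDK's `client.responses.create`) is what the "single API call" framing in the new docs refers to.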
README.md (outdated)
> - **Responses API** — server-side agentic orchestration with tool calling, MCP server integration, and built-in file search (RAG) in a single API call ([learn more](https://llamastack.github.io/docs/api-openai))
> - **Vector Stores & Files** — `/v1/vector_stores` and `/v1/files` for managed document storage and retrieval
> - **Batches** — `/v1/batches` for offline batch processing
> - **Safety** — content moderation via Llama Guard
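As a hedged sketch of the two-step setup behind the Vector Stores & Files bullets — upload a file via `/v1/files`, then reference its id when creating a vector store via `/v1/vector_stores` — the request bodies might look like this; every identifier (`file-abc123`, `docs-store`) is a placeholder for illustration only.

```python
import json

# Step 1: upload a document. The file bytes travel as multipart form
# data; the JSON fields just declare the upload's purpose.
file_request = {"purpose": "assistants"}

# Step 2: create a vector store that indexes the uploaded file, using
# the id the /v1/files endpoint returned (placeholder shown here).
store_request = {
    "name": "docs-store",
    "file_ids": ["file-abc123"],
}

print("POST /v1/files         ->", json.dumps(file_request))
print("POST /v1/vector_stores ->", json.dumps(store_request))
```

Once the store exists, the Responses API's file_search tool can retrieve from it, which is the managed RAG flow the README bullets describe.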
- Add "agentic" to tagline per franciscojavierarceo suggestion
- Remove Safety/Moderations (being removed in llamastack#5291)
- Use uv instead of pip in install instructions
- Remove Swift and Kotlin from SDK table
- Fix "semantic search" to just "search" for vector stores
- Mention non-OpenAI APIs (Prompts, File Processors)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Replace the flow diagram with a server architecture view that shows the API endpoints alongside both inference and vector store providers. This addresses the feedback that RAG/vector stores were missing from the diagram.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>

Add a third column showing tools & connectors (MCP servers, web search, file search/RAG) and file storage (local filesystem, S3). Add /v1/connectors to the API endpoints row.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>

Remind agents to update the ASCII architecture diagram in README.md when adding or removing providers, APIs, or backend integrations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
All comments addressed, merging now, thanks!
Summary
The project's public-facing docs haven't kept up with how the project has evolved. The README opens with "standardizes the core building blocks" — a year-old message that could describe any AI framework. The word "OpenAI" appears zero times. The Responses API is invisible.
This PR rewrites the README, docs landing page, API overview, and OpenAI compatibility page to reflect what the project actually is today: an OpenAI-compatible API server with pluggable providers.
What changed
README.md — full rewrite
- `base_url` swap with the OpenAI client
- APIs described by their actual endpoints (`/v1/chat/completions`, `/v1/responses`, etc.)

docs/docs/index.mdx — same messaging applied to the docs landing page
docs/docs/api-openai/index.mdx — trimmed from 230 lines to ~80
docs/docs/api-overview.md — now lists actual endpoints
pyproject.toml — description updated from "Llama Stack" to a real PyPI description
What was removed
Test plan
🤖 Generated with Claude Code