
docs: rewrite README and docs to lead with OpenAI API compatibility #5323

Merged
leseb merged 9 commits into llamastack:main from leseb:readme-rewrite
Mar 27, 2026

Conversation

Collaborator

leseb commented Mar 26, 2026

Summary

The project's public-facing docs haven't kept up with how the project has evolved. The README opens with "standardizes the core building blocks" — a year-old message that could describe any AI framework. The word "OpenAI" appears zero times. The Responses API is invisible.

This PR rewrites the README, docs landing page, API overview, and OpenAI compatibility page to reflect what the project actually is today: an OpenAI-compatible API server with pluggable providers.

What changed

README.md — full rewrite

  • Leads with "Open-source API server. OpenAI-compatible. Any model, any infrastructure."
  • 3-line code snippet showing the base_url swap with the OpenAI client
  • "What you get" section naming actual endpoints (/v1/chat/completions, /v1/responses, etc.)
  • Responses API featured (agentic orchestration, MCP, file_search)
  • Open Responses conformance mentioned
  • Provider architecture as concept (local → production → managed), not a giant table
  • Concise — net fewer lines than before
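The base_url swap the new README leads with can be sketched without the OpenAI package at all. As a stdlib-only illustration (the port and model name come from the PR's own snippet and are illustrative, not confirmed defaults), this builds the request body an OpenAI-compatible client would POST to /v1/chat/completions on a local server:

```python
import json

# Hypothetical local Llama Stack endpoint; the only difference from a
# stock OpenAI setup is the base URL. Port 8321 and the model name are
# illustrative values taken from the PR's snippet.
BASE_URL = "http://localhost:8321/v1"

payload = {
    "model": "llama-3.3-70b",
    "messages": [{"role": "user", "content": "Hello"}],
}

# This JSON body is what any OpenAI-compatible client sends to
# POST {BASE_URL}/chat/completions.
body = json.dumps(payload)
print(body)
```

Pointing the official OpenAI client at the same BASE_URL produces an equivalent request, which is the drop-in property the README highlights.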

docs/docs/index.mdx — same messaging applied to the docs landing page

  • API table with actual endpoints and descriptions
  • Provider lists (inference + vector stores)
  • Links to conformance report and provider matrix

docs/docs/api-openai/index.mdx — trimmed from 230 lines to ~80

  • Removed generic filler (troubleshooting, best practices, roadmap, monitoring)
  • Added Responses API section with code example
  • Kept the endpoint table, conformance link, and provider matrix link
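As a hedged sketch of what the new Responses API section describes — server-side agentic orchestration with built-in file search (RAG) and MCP in a single call — here is an illustrative request body for POST /v1/responses. The tool shapes follow the OpenAI Responses API; the vector store ID, MCP URL, and model name are placeholders, not values confirmed by this PR:

```python
import json

# Illustrative /v1/responses request combining built-in file_search (RAG)
# and an MCP server in one call. IDs, URLs, and the model name are
# placeholders for whatever the running server actually hosts.
payload = {
    "model": "llama-3.3-70b",
    "input": "Summarize the uploaded design doc.",
    "tools": [
        {"type": "file_search", "vector_store_ids": ["vs_placeholder"]},
        {
            "type": "mcp",
            "server_label": "docs",
            "server_url": "http://localhost:3001/mcp",
        },
    ],
}
body = json.dumps(payload)
print(body)
```

The point of the example is that one request carries both retrieval and tool connectivity, which is the "single API call" claim the page now makes.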

docs/docs/api-overview.md — now lists actual endpoints

  • Was: three sections saying "stable/experimental/deprecated APIs exist" with no specifics
  • Now: table of endpoints with descriptions

pyproject.toml — description updated from "Llama Stack" to a real PyPI description

What was removed

  • The big provider table (26 rows) — replaced by concept + link to full docs
  • The distribution table — replaced by install instructions
  • Bullet points about "unified API layer" and "standardized building blocks"
  • ~170 lines of generic filler from the OpenAI compat page

Test plan

  • Content reviewed for accuracy against actual API surface
  • All links point to existing docs pages
  • pyproject.toml description is valid

🤖 Generated with Claude Code

Rewrite the README, docs landing page, API overview, and OpenAI
compatibility page to reflect the current state of the project.

The project has evolved from a "standardized Gen AI API" to an
OpenAI-compatible API server with pluggable providers. The new
messaging leads with what users care about: drop-in compatibility
with the OpenAI API, any model, any infrastructure.

Key changes:
- README leads with "OpenAI-compatible API server" and a code snippet
- APIs are described by their actual endpoints, not internal categories
- Responses API (agentic orchestration, MCP, file_search) is featured
- Provider architecture shown as local-to-production concept, not a table
- Open Responses conformance mentioned
- OpenAI compat page trimmed from 230 lines of filler to focused content
- API overview page now lists actual endpoints
- pyproject.toml description updated for PyPI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
@meta-cla bot added the CLA Signed label on Mar 26, 2026
Collaborator Author

leseb commented Mar 26, 2026

@raghotham @franciscojavierarceo @cdoern @mattf as discussed in today's community call.

- OpenAI API compatibility
- Cloud-based execution
- Scalable infrastructure
## Implemented endpoints
Collaborator


We should also mention the non-OpenAI APIs that are API-adjacent, like Prompts and File Processor. Eventually we would add Memory to that list too.

# Welcome to Llama Stack

Llama Stack is the open-source framework for building generative AI applications.
**Open-source API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.**
Collaborator


Suggested change
**Open-source API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.**
**Open-source Agentic API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.**

| Embeddings | `/v1/embeddings` | Text embeddings |
| Models | `/v1/models` | Model listing and management |
| Files | `/v1/files` | File upload and management |
| Vector Stores | `/v1/vector_stores` | Document storage and semantic search |
Collaborator


actually we support more than just semantic search

- **Safety** — content moderation via Llama Guard
- **[Open Responses](https://www.openresponses.org/) conformant** — the Responses API implementation passes the Open Responses conformance test suite

## Use any model, use any infrastructure
Collaborator


can we outline RAG in this diagram a little more? Right now it only shows inference provider plugin

Collaborator

@franciscojavierarceo left a comment


some small suggestions but otherwise lgtm

Collaborator

@mattf left a comment


looks good

**Open-source API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.**

## Overview
Llama Stack is a drop-in replacement for the OpenAI API that you can run anywhere — your laptop, your datacenter, or the cloud. Use any OpenAI-compatible client or agentic framework. Swap between Llama, GPT, Gemini, Mistral, or any model without changing your application code.
Contributor


👍

Comment on lines +18 to +24
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")
response = client.chat.completions.create(
model="llama-3.3-70b",
messages=[{"role": "user", "content": "Hello"}],
)
Contributor


Here are more demos of the OpenAI Responses API with the OpenAI client: opendatahub-io/llama-stack-demos#324

README.md Outdated
- **Responses API** — server-side agentic orchestration with tool calling, MCP server integration, and built-in file search (RAG) in a single API call ([learn more](https://llamastack.github.io/docs/api-openai))
- **Vector Stores & Files** — `/v1/vector_stores` and `/v1/files` for managed document storage and retrieval
- **Batches** — `/v1/batches` for offline batch processing
- **Safety** — content moderation via Llama Guard
Contributor


Safety APIs are being removed in #5291

leseb and others added 7 commits March 27, 2026 10:18
- Add "agentic" to tagline per franciscojavierarceo suggestion
- Remove Safety/Moderations (being removed in llamastack#5291)
- Use uv instead of pip in install instructions
- Remove Swift and Kotlin from SDK table
- Fix "semantic search" to just "search" for vector stores
- Mention non-OpenAI APIs (Prompts, File Processors)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Replace the flow diagram with a server architecture view that shows
the API endpoints alongside both inference and vector store providers.
This addresses the feedback that RAG/vector stores were missing from
the diagram.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Add a third column showing tools & connectors (MCP servers, web search,
file search/RAG) and file storage (local filesystem, S3). Add
/v1/connectors to the API endpoints row.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Remind agents to update the ASCII architecture diagram in README.md
when adding or removing providers, APIs, or backend integrations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Collaborator Author

leseb commented Mar 27, 2026

All comments addressed, merging now, thanks!

leseb merged commit c41932e into llamastack:main Mar 27, 2026
93 of 94 checks passed
leseb deleted the readme-rewrite branch March 27, 2026 09:48