
Concurrent prefix caching support for vLLM #274

Open
qinganrice wants to merge 1 commit into ovg-project:main from qinganrice:apc

Conversation

@qinganrice (Contributor)

This PR is step 1 of vLLM Prefix Caching support: prefix caching for concurrent requests.
All of the implementation lives in a patch file; KVCached and the vLLM core design are left untouched, which keeps the change compatible with other serving engines.
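For context, here is a rough sketch of the idea behind block-level prefix caching. This is not the PR's patch code: it is a minimal illustration, and every name in it (`PrefixBlockCache`, `BLOCK_SIZE`, `map_prompt`) is hypothetical rather than taken from vLLM or KVCached.

```python
import hashlib

BLOCK_SIZE = 16  # tokens per KV-cache block (assumed value, for illustration)


class PrefixBlockCache:
    """Map a hash of a token prefix to an allocated KV block ID, so that
    concurrent requests sharing a prompt prefix reuse the same blocks."""

    def __init__(self) -> None:
        self._blocks: dict[str, int] = {}    # prefix hash -> block id
        self._refcount: dict[int, int] = {}  # block id -> active readers
        self._next_id = 0

    @staticmethod
    def _hash(prefix_tokens: tuple) -> str:
        # Hash the *entire* prefix up to and including this block, so two
        # blocks only match when everything before them matches as well.
        return hashlib.sha256(repr(prefix_tokens).encode()).hexdigest()

    def get_or_allocate(self, prefix_tokens: tuple) -> tuple:
        """Return (block_id, cache_hit) for one full block of tokens."""
        key = self._hash(prefix_tokens)
        if key in self._blocks:
            block_id = self._blocks[key]
            self._refcount[block_id] += 1
            return block_id, True   # hit: skip recomputing this block's KV
        block_id = self._next_id
        self._next_id += 1
        self._blocks[key] = block_id
        self._refcount[block_id] = 1
        return block_id, False      # miss: this block's KV must be computed

    def map_prompt(self, prompt: list) -> list:
        """Map every full block of a prompt to a (block_id, hit) pair."""
        return [
            self.get_or_allocate(tuple(prompt[:end]))
            for end in range(BLOCK_SIZE, len(prompt) + 1, BLOCK_SIZE)
        ]


cache = PrefixBlockCache()
shared = list(range(32))  # a 32-token system prompt shared by both requests
req_a = cache.map_prompt(shared + [100, 101] * 8)
req_b = cache.map_prompt(shared + [200, 201] * 8)
# Both requests hit on the two shared prefix blocks, then diverge.
assert [hit for _, hit in req_b] == [True, True, False]
```

Because the change is confined to a patch file, the real implementation presumably hooks or wraps the relevant vLLM allocator paths at import time rather than editing them in place; the sketch above only illustrates the caching behavior itself.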

@qinganrice changed the title from "Concurrent prefix caching" to "Concurrent prefix caching support for vLLM" on Mar 18, 2026
@cui36 self-requested a review on March 18, 2026 at 19:24