feat: support gpt-oss in sglang-0.5.9 with clean allocator#268

Merged
cui36 merged 3 commits into ovg-project:main from ztang2370:feat/support-gpt-oss-in-sglang-0.5.9-simplify-allocator-remove-kvgroup
Mar 17, 2026

Conversation

@ztang2370
Contributor

@ztang2370 ztang2370 commented Mar 6, 2026

Summary

Unlike the KVGroup approach proposed in #263, this PR replaces the internal KVGroup struct with a multiton pattern: one FTensorAllocator instance per group_id, lazily created via global_allocator(group_id).

What changed

allocator.hpp: Replaced the single g_allocator_ with a g_allocators_ map.
allocator.cpp: global_allocator(group_id) lazily creates per-group allocators.
torch_bindings.cpp: Routes group_id to the correct allocator at the binding layer.
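
A minimal sketch of the multiton lookup described above, not the code from this PR. The names g_allocators_ and global_allocator come from the description; the FTensorAllocator body, the int64_t key type, the std::unique_ptr storage, and the mutex are assumptions made for illustration.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for the real FTensorAllocator; only the group id is modeled.
class FTensorAllocator {
 public:
  explicit FTensorAllocator(int64_t group_id) : group_id_(group_id) {}
  int64_t group_id() const { return group_id_; }

 private:
  int64_t group_id_;
};

// One allocator per group_id instead of a single global instance.
static std::unordered_map<int64_t, std::unique_ptr<FTensorAllocator>> g_allocators_;
// Assumption: the real code may synchronize differently, or not at all.
static std::mutex g_allocators_mutex_;

// Returns the allocator for group_id, creating it on first use.
FTensorAllocator& global_allocator(int64_t group_id) {
  std::lock_guard<std::mutex> lock(g_allocators_mutex_);
  auto it = g_allocators_.find(group_id);
  if (it == g_allocators_.end()) {
    it = g_allocators_
             .emplace(group_id, std::make_unique<FTensorAllocator>(group_id))
             .first;
  }
  return *it->second;
}
```

Storing the instances behind std::unique_ptr keeps each allocator's address stable when the map rehashes, so repeated calls with the same group_id return the same object.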

Tested gpt-oss-20b on sglang-0.5.9.

@ivanium
Collaborator

ivanium commented Mar 6, 2026

Thanks for the PR! This one does look simpler. QQ: how do we set the kv cache config, such as tensor size, number of layers, etc., into the cpp extension?

@ztang2370
Contributor Author

> Thanks for the PR! This one does look simpler. QQ: how do we set the kv cache config, such as tensor size, number of layers, etc., into the cpp extension?

The kv cache config is set when the Python side calls create_kv_tensors(size, dtype_size, dev_str, num_layers, num_kv_buffers, group_id), which is the same entry point as before.
torch_bindings.cpp receives all config + group_id, and calls global_allocator(group_id) to get or lazily create the right allocator instance.
Then in allocator.cpp, create_kv_tensors() stores the config into its own members, creates the zero page, and builds the FTensors.

So each allocator instance is configured the first time create_kv_tensors is called with its group_id.
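
The flow above could look roughly like the sketch below. It is an illustration under assumptions, not the actual binding code: the parameter types, the KVConfig struct, and the configured() and config() accessors are hypothetical, and the zero-page and FTensor construction steps are reduced to a comment.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical container for the config values passed in from Python.
struct KVConfig {
  std::size_t size = 0;
  std::size_t dtype_size = 0;
  std::string dev_str;
  int64_t num_layers = 0;
  int64_t num_kv_buffers = 0;
};

class FTensorAllocator {
 public:
  // Stores the config in this instance's own members.
  void create_kv_tensors(std::size_t size, std::size_t dtype_size,
                         const std::string& dev_str, int64_t num_layers,
                         int64_t num_kv_buffers) {
    config_.size = size;
    config_.dtype_size = dtype_size;
    config_.dev_str = dev_str;
    config_.num_layers = num_layers;
    config_.num_kv_buffers = num_kv_buffers;
    configured_ = true;
    // The real allocator would create the zero page and build the FTensors here.
  }
  bool configured() const { return configured_; }
  const KVConfig& config() const { return config_; }

 private:
  KVConfig config_;
  bool configured_ = false;
};

// Gets or lazily creates the allocator for group_id (not thread-safe in this sketch).
FTensorAllocator& global_allocator(int64_t group_id) {
  static std::map<int64_t, FTensorAllocator> allocators;
  return allocators[group_id];  // operator[] default-constructs on first use
}

// Binding-layer entry point: receives the config plus group_id and routes it.
void create_kv_tensors(std::size_t size, std::size_t dtype_size,
                       const std::string& dev_str, int64_t num_layers,
                       int64_t num_kv_buffers, int64_t group_id) {
  global_allocator(group_id).create_kv_tensors(size, dtype_size, dev_str,
                                               num_layers, num_kv_buffers);
}
```

In this sketch an allocator for a group that has never received a create_kv_tensors call exists only once something looks it up, and it stays unconfigured until that first call.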

@ivanium ivanium self-requested a review March 8, 2026 22:05
@ivanium
Collaborator

ivanium commented Mar 8, 2026

Oh right I missed that part before. But yeah that makes sense. I am okay with getting the PR in. Thanks for the work!

@cui36 cui36 merged commit 90222cd into ovg-project:main Mar 17, 2026
6 checks passed