[Issue #245] Add pipeline parallelism support#246
[Issue #245] Add pipeline parallelism support#246iMmAseu wants to merge 11 commits intoovg-project:mainfrom
Conversation
|
Just curious does this PR support enabling both TP and PP, say TP=2 and PP=2? In that case, how do we get the consistent |
|
@ivanium Thanks for pointing this out, your concern is absolutely valid. I’ve updated the implementation to correctly handle the TP and PP setup and compute a consistent global_rank across all processes. I’ve also run tests with configurations like TP=2 and PP=2, and the initialization now works seemly as expected without synchronization issues. |
|
Hi @iMmAseu, thanks for the update! Let's split it into two parts, and I will go through the vllm patch. |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for pipeline parallelism by refactoring the handling of distributed process ranks. The changes primarily involve renaming tp_size to world_size and adding pp_rank to correctly manage inter-process communication for KV cache synchronization across different pipeline stages. The implementation correctly namespaces IPC sockets by pp_rank to prevent conflicts.
My review has identified a few areas for improvement:
- There is some duplicated code for resolving distributed ranks in the SGLang integration patches.
- The use of broad
except Exceptionclauses could mask unexpected errors. - The docstrings for the
world_sizeparameter are inconsistent with its actual usage, which could lead to confusion.
Overall, the changes are well-structured and address the issue described. Addressing the feedback will improve the code's maintainability and clarity.
8cbc6ad to
709a5d2
Compare
Summary
This PR fixes a TimeoutError during vLLM initialization when Pipeline Parallelism is enabled.
Fixes #243
Root Cause
KVCacheManageronly usedtensor_parallel_size, which caused incorrect process detection in PP setups.Changes
world_sizeglobal_ranktp_size→world_sizeTest Plan