
docs: add comprehensive pipeline parallelism comparison guide #26

Open
yurekami wants to merge 1 commit into deepseek-ai:main from yurekami:docs/add-comparison-guide

@yurekami (Contributor) commented Jan 3, 2026

Summary

This PR adds a comprehensive comparison guide (docs/COMPARISON.md) for pipeline parallelism methods, directly addressing the questions raised in issues #15 and #20.

Motivation

Several users have asked for clarification on:

  • How VPP compares with DualPipe (issue #15)
  • What benefit DualPipeV provides (issue #20)

This documentation provides detailed answers with tables, formulas, and decision guidelines.

Documentation Contents

Methods Compared

  • 1F1B - Traditional pipeline parallelism
  • VPP - Megatron-LM's Virtual Pipeline Parallelism
  • ZB1P - Zero Bubble scheduling
  • DualPipe - DeepSeek's bidirectional pipeline
  • DualPipeV - V-shape half-device variant

Sections Included

  1. Overview Table - Quick comparison of all methods
  2. Method Descriptions - Detailed explanation of each approach
  3. Bubble Ratio Analysis - Formulas and example calculations
  4. Memory Trade-offs - Parameter and activation memory comparison
  5. Communication Patterns - Network requirements for each method
  6. When to Use Each Method - Practical guidance
  7. MoE with EP Considerations - Special notes for Expert Parallelism
  8. Decision Flowchart - Step-by-step guide for choosing

Key Insights Documented

  • When vpp > 3, VPP's bubble ratio becomes competitive with DualPipe
  • DualPipe achieves low bubble through overlap rather than interleaving
  • DualPipeV provides DualPipe-like benefits with half the devices
  • For MoE+EP, compare methods using the same number of PP stages
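
As a rough sanity check of the first insight, the bubble formulas can be evaluated numerically. The sketch below is illustrative only. It assumes the bubble expressions as given in the README comparison table and the DeepSeek-V3 paper for 1F1B, ZB1P, and DualPipe, plus the usual 1/v reduction for Megatron-LM's interleaved schedule for VPP. The function names, the parameter `v` (model chunks per device), and all timing values are hypothetical and not part of the DualPipe package.

```python
# Hypothetical helpers for comparing pipeline bubble sizes.
# Symbols (assumed to follow the README table):
#   pp : number of pipeline stages (even for DualPipe)
#   f  : time of one forward chunk
#   b  : time of one full backward chunk
#   w  : time of the "backward for weights" part of a chunk
#   fb : time of a mutually overlapped forward + backward chunk pair (F&B)
#   v  : VPP interleaving factor (model chunks per device), an assumed name


def bubble_1f1b(pp: int, f: float, b: float) -> float:
    """Bubble of plain 1F1B: (PP - 1)(F + B)."""
    return (pp - 1) * (f + b)


def bubble_vpp(pp: int, f: float, b: float, v: int) -> float:
    """Bubble of interleaved 1F1B (VPP): the 1F1B bubble divided by v."""
    return (pp - 1) * (f + b) / v


def bubble_zb1p(pp: int, f: float, b: float, w: float) -> float:
    """Bubble of ZB1P: (PP - 1)(F + B - 2W)."""
    return (pp - 1) * (f + b - 2 * w)


def bubble_dualpipe(pp: int, fb: float, b: float, w: float) -> float:
    """Bubble of DualPipe: (PP/2 - 1)(F&B + B - 3W)."""
    return (pp // 2 - 1) * (fb + b - 3 * w)


if __name__ == "__main__":
    # Made-up timings: B = 2F, W = F, and no overlap gain (F&B = F + B).
    pp, f, b, w = 8, 1.0, 2.0, 1.0
    fb = f + b

    print("1F1B    :", bubble_1f1b(pp, f, b))
    print("ZB1P    :", bubble_zb1p(pp, f, b, w))
    print("DualPipe:", bubble_dualpipe(pp, fb, b, w))
    for v in (2, 3, 4):
        print(f"VPP v={v} :", bubble_vpp(pp, f, b, v))
```

With these made-up timings, VPP at v=4 falls below DualPipe's bubble while v=3 does not, which is consistent with the "vpp > 3" insight. Real values of F, B, W and the amount of overlap inside F&B will move the crossover point.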

Test Plan

  • Verified markdown renders correctly
  • Cross-referenced with README comparison table
  • Validated formulas against DeepSeek-V3 paper
  • Checked references are accurate

Closes #15
Closes #20

🤖 Generated with Claude Code

Add COMPARISON.md that provides detailed comparison of pipeline
parallelism methods including 1F1B, VPP, ZB1P, DualPipe, and DualPipeV.

The document includes:
- Overview table comparing all methods
- Detailed description of each approach
- Bubble ratio analysis with formulas
- Memory trade-offs (parameter and activation)
- Communication patterns
- When-to-use guidelines for each method
- MoE with Expert Parallelism considerations
- Decision flowchart for choosing the right method

This addresses the questions raised in:
- Issue deepseek-ai#15: Comparison between VPP and DualPipe
- Issue deepseek-ai#20: What is the benefit of DualPipeV?

Closes deepseek-ai#15
Closes deepseek-ai#20

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>


Development

Successfully merging this pull request may close these issues.

  • What is the benefit of DualPipeV?
  • Comparison between VPP and DualPipe
