Refactor cross-mapping related structure #20

@gajagajago

cm_rank and pp_rank are extremely confusing.

Currently, cm rank is the rank used for cross-mapping pipeline ordering, while pp rank is the rank within the pipeline model parallel group.

However, the query APIs mix the two:

  • get_pipeline_model_parallel_rank -> cm rank
  • get_pipeline_model_parallel_first/last_rank -> pp rank
  • get_pipeline_model_parallel_prev/next_rank -> pp rank
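
To make the mismatch concrete, here is a toy sketch of what can go wrong when a value from a cm-rank getter is fed into a pp-rank query. It is not the project's code: the permutation `CM_TO_PP`, the helper `next_pp_rank`, and the body of `translate_cm_rank_to_pp_rank` are all made up for illustration; only the name `translate_cm_rank_to_pp_rank` comes from this issue.

```python
# Toy model only. The permutation below is hypothetical.
CM_TO_PP = [0, 2, 1, 3]  # assumption: cm rank i sits at pp rank CM_TO_PP[i]


def translate_cm_rank_to_pp_rank(cm_rank: int) -> int:
    """Stand-in for the real helper; only the name is taken from the issue."""
    return CM_TO_PP[cm_rank]


def next_pp_rank(pp_rank: int) -> int:
    """Hypothetical prev/next-style query that works in pp rank space."""
    return (pp_rank + 1) % len(CM_TO_PP)


cm_rank = 1  # what a cm-rank-returning getter would hand back

# Easy mistake: a cm rank is passed where a pp rank is expected.
wrong = next_pp_rank(cm_rank)

# Intended use: translate to pp rank space first, then query.
right = next_pp_rank(translate_cm_rank_to_pp_rank(cm_rank))

print(wrong, right)  # the two values differ whenever the mapping is not the identity here
```

Both calls type-check and run, which is why the mixed usage is hard to spot: both ranks are plain ints.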

We currently use translate_cm_rank_to_pp_rank to translate the cm rank (usually when issuing communication requests), but this should be refactored because it is super confusing. While refactoring, look carefully at all the places that use these ranks, including training_log in training.py, where the lm loss is printed by the last cm rank process.
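
One possible direction for the refactor, sketched under assumptions: give the two rank spaces distinct types so that a cm rank cannot be silently used as a pp rank. The names `CmRank`, `PpRank`, and `PipelineLayout` are hypothetical and do not exist in the codebase, and the sketch assumes the cm-to-pp relation is a permutation within one pipeline group.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CmRank:
    """Position in the cross-mapping pipeline order (hypothetical wrapper)."""
    value: int


@dataclass(frozen=True)
class PpRank:
    """Rank inside the pipeline model parallel group (hypothetical wrapper)."""
    value: int


class PipelineLayout:
    """Hypothetical holder for the cm-to-pp ordering of one pipeline group."""

    def __init__(self, cm_to_pp: Sequence[int]) -> None:
        if sorted(cm_to_pp) != list(range(len(cm_to_pp))):
            raise ValueError("cm_to_pp must be a permutation of 0..n-1")
        self._cm_to_pp = list(cm_to_pp)
        # Build the inverse mapping once so both directions are O(1).
        self._pp_to_cm = [0] * len(self._cm_to_pp)
        for cm, pp in enumerate(self._cm_to_pp):
            self._pp_to_cm[pp] = cm

    def to_pp(self, rank: CmRank) -> PpRank:
        if not isinstance(rank, CmRank):
            raise TypeError("expected a CmRank")
        return PpRank(self._cm_to_pp[rank.value])

    def to_cm(self, rank: PpRank) -> CmRank:
        if not isinstance(rank, PpRank):
            raise TypeError("expected a PpRank")
        return CmRank(self._pp_to_cm[rank.value])

    def is_last_cm_rank(self, rank: CmRank) -> bool:
        """E.g. the check that decides which process prints the loss."""
        if not isinstance(rank, CmRank):
            raise TypeError("expected a CmRank")
        return rank.value == len(self._cm_to_pp) - 1


layout = PipelineLayout([0, 2, 1, 3])  # made-up ordering
me = CmRank(1)
peer = layout.to_pp(me)  # explicit translation before any communication call
```

With wrappers like these, a logging check such as the one in training_log would have to state which rank space it means, and passing the wrong kind raises a TypeError instead of picking the wrong process.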
