Add tier-aware scheduling/weighting for heterogeneous training #2

@plugyawn

Description

Problem

The coordinator is tier-agnostic: data assignment is uniform across trainers and the centralized client ignores TrainingAssignment beyond logging. There is no enforcement of a target ratio (e.g., 50% prefix / 50% full) or tier-weighted contribution. Heterogeneous participation is therefore purely operator-managed rather than system-managed.

Refs:

  • shared/coordinator/src/data_selection.rs (uniform batch assignment)
  • architectures/centralized/server/src/app.rs (assignment = capabilities)
  • architectures/centralized/client/src/app.rs (assignment stored, unused)

Expected

It should be possible to use tier-aware scheduling or weighting to hit a target mix of prefix/full training.

Possible Approach

  • Add tier-aware weighting to assign_data_for_state (weighted by tier, token count, or a configured ratio).
  • Or, have the coordinator emit assignments that override the client's tier for the round.
  • Make clients actually apply TrainingAssignment to their effective tier (or to loss weighting).
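
A rough sketch of the second and third options, to make the idea concrete. Everything here is hypothetical: `Tier`, `Trainer`, `assign_tiers`, and `effective_tier` are illustrative names, not the existing `TrainingAssignment` type or anything in the coordinator today. It assumes two tiers, that a prefix-only trainer can never be promoted to full, and that the target is expressed as a fraction of trainers rather than tokens.

```rust
/// Hypothetical two-level tier; the real TrainingAssignment may look different.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Tier {
    Prefix,
    Full,
}

/// Hypothetical view of a trainer as seen by the coordinator.
#[derive(Clone, Debug)]
pub struct Trainer {
    pub id: u32,
    /// Highest tier this trainer is able to run.
    pub capability: Tier,
}

/// Coordinator side: pick a tier per trainer so that roughly
/// `target_full_fraction` of the trainers run the full tier this round.
/// Full-capable trainers may be downgraded to prefix; prefix-only trainers
/// are never upgraded, so the target can be missed if too few are capable.
pub fn assign_tiers(trainers: &[Trainer], target_full_fraction: f64) -> Vec<(u32, Tier)> {
    let fraction = target_full_fraction.clamp(0.0, 1.0);
    let wanted_full = (trainers.len() as f64 * fraction).round() as usize;
    let mut remaining_full = wanted_full;

    trainers
        .iter()
        .map(|t| {
            if t.capability == Tier::Full && remaining_full > 0 {
                remaining_full -= 1;
                (t.id, Tier::Full)
            } else {
                (t.id, Tier::Prefix)
            }
        })
        .collect()
}

/// Client side: combine local capability with the coordinator's assignment.
/// With no assignment the client falls back to its own capability; an
/// assignment can lower the tier but never raise it above capability.
pub fn effective_tier(capability: Tier, assigned: Option<Tier>) -> Tier {
    match (capability, assigned) {
        (Tier::Full, Some(Tier::Full)) | (Tier::Full, None) => Tier::Full,
        _ => Tier::Prefix,
    }
}
```

With three full-capable trainers and one prefix-only trainer, a target of 0.5 would downgrade one full-capable trainer to prefix for the round. Weighting by token count instead of trainer count would need a different `wanted_full` calculation.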

Acceptance Criteria

  • Configurable target ratio for tier participation.
  • The client uses its assignment to influence its effective tier or gradient weighting.
  • Metrics/logging reflect actual tier mix per round.
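
For the metrics criterion, something along these lines might be enough to start with. This is a sketch only: `TierMix`, `within_tolerance`, and `round_log_line` are made-up names, the log format is an assumption, and `Tier` is redefined here just to keep the snippet self-contained.

```rust
/// Hypothetical two-level tier, repeated so the snippet stands alone.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Tier {
    Prefix,
    Full,
}

/// Count of trainers per effective tier in a single round.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct TierMix {
    pub prefix: usize,
    pub full: usize,
}

impl TierMix {
    /// Tally the effective tiers reported by clients for one round.
    pub fn from_tiers(tiers: &[Tier]) -> Self {
        let mut mix = TierMix::default();
        for tier in tiers {
            match tier {
                Tier::Prefix => mix.prefix += 1,
                Tier::Full => mix.full += 1,
            }
        }
        mix
    }

    /// Fraction of trainers on the full tier, or None for an empty round.
    pub fn full_fraction(&self) -> Option<f64> {
        let total = self.prefix + self.full;
        if total == 0 {
            None
        } else {
            Some(self.full as f64 / total as f64)
        }
    }

    /// True if the observed full fraction is within `tolerance` of the target.
    /// An empty round is treated as out of tolerance.
    pub fn within_tolerance(&self, target_full_fraction: f64, tolerance: f64) -> bool {
        match self.full_fraction() {
            Some(actual) => (actual - target_full_fraction).abs() <= tolerance,
            None => false,
        }
    }
}

/// Build a one-line summary suitable for per-round logging.
pub fn round_log_line(round: u64, mix: &TierMix, target_full_fraction: f64) -> String {
    let actual = match mix.full_fraction() {
        Some(f) => format!("{:.2}", f),
        None => "n/a".to_string(),
    };
    format!(
        "round={} prefix={} full={} full_fraction={} target={:.2}",
        round, mix.prefix, mix.full, actual, target_full_fraction
    )
}
```

Counting effective tiers (after the client applies its assignment) rather than assigned tiers would make the log reflect what actually ran in the round.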
