@distributedstatemachine commented Oct 27, 2024

Description

This PR:

  • Adds FSDP to the miner.
  • Makes index operations FSDP-aware.
  • Moves support files to the boltz library.

To start the miner with 3 GPUs, run ./scripts/start_distributed

TODO:

  • Fix deadlock in the miner: it freezes after 4 steps.
  • Apply FSDP logic to the validator.
  • Set weights in the validator.
  • Modify script/run.sh to start the miner with FSDP.
  • Dataloader: send unique pages to each rank.
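
For the dataloader item, one possible approach is a strided split of the page list by rank. The sketch below is only an illustration: `pages_for_rank` and `drop_remainder` are hypothetical names, not part of this PR or the boltz library, and pages are assumed to be plain list items.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def pages_for_rank(
    pages: Sequence[T],
    rank: int,
    world_size: int,
    drop_remainder: bool = True,
) -> List[T]:
    """Return the disjoint slice of `pages` owned by `rank` (hypothetical helper)."""
    if world_size < 1:
        raise ValueError("world_size must be >= 1")
    if not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")

    usable = len(pages)
    if drop_remainder:
        # Trim the tail so every rank receives the same number of pages.
        usable -= usable % world_size

    # Strided slice: rank r takes items r, r + world_size, r + 2 * world_size, ...
    return list(pages[:usable][rank::world_size])
```

In a running job, `rank` and `world_size` would typically come from `torch.distributed.get_rank()` and `torch.distributed.get_world_size()`. Equal-length shards are the default here because FSDP collectives need every rank to take part in each step, so ranks that run out of pages at different times can hang. Whether that relates to the freeze listed above is not established.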

@distributedstatemachine changed the title from "Feat/re boltz" to "Feat/re-boltz" on Oct 28, 2024