Torch Distributed Elastic

Makes distributed PyTorch fault-tolerant and elastic.

Get Started

:caption: Usage
:maxdepth: 1

elastic/quickstart
elastic/train_script
elastic/examples

Documentation

:caption: API
:maxdepth: 1

elastic/run
elastic/agent
elastic/multiprocessing
elastic/errors
elastic/rendezvous
elastic/timer
elastic/metrics
elastic/events
elastic/subprocess_handler
elastic/control_plane
elastic/numa

:caption: Advanced
:maxdepth: 1

elastic/customization

:caption: Plugins
:maxdepth: 1

elastic/kubernetes