
Enhancement: Make it possible to rollback to a specific checkpoint and resume with modified parameters #41

@some-rando-rl

Description


At the moment, distrib-rl has no mechanism for resuming training from a checkpoint with a modified config.

Practitioners often want to modify environment details and/or hyperparameters during a run. Sometimes these changes are understood at the outset and scheduled in advance; sometimes they're discovered or decided only after training has begun.

To avoid sacrificing reproducibility, it's critical that a single configuration can be produced which, run unattended from start to finish, reproduces the full set of checkpoints, including those created before a mid-stream config change.

I'd therefore propose a change to the config format that supports arbitrary config "scheduling": any mid-stream config modification must reference the original configuration along with the checkpoint range over which that configuration was valid.

For example, suppose you are training a model with a config we'll call config-v1. To resume training from checkpoint 5, which was produced by config-v1, a hypothetical config-v2 must reference (in some TBD way) config-v1 as having been valid for checkpoints 0-5, inclusive. This makes it relatively trivial to compose a series of modified configs into a single snapshotted config at each checkpoint, and that snapshotted config is executable as-is to reproduce the checkpoint that contains it.
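To make the idea concrete, here's one possible shape for such a config, sketched as a Python dict. Every key name here (`history`, `valid_for_checkpoints`, `resume_from`, and so on) is a hypothetical placeholder for illustration, not an existing distrib-rl option:

```python
# Illustrative only: all keys below are hypothetical placeholders,
# not part of the current distrib-rl config format.
config_v2 = {
    # Configs that produced earlier checkpoints, with the inclusive
    # checkpoint range over which each was valid.
    "history": [
        {"config": "config-v1.json", "valid_for_checkpoints": [0, 5]},
    ],
    # Resume from the last checkpoint produced by config-v1.
    "resume_from": 5,
    # Settings changed relative to config-v1.
    "hyperparameters": {
        "policy_lr": 1e-4,
    },
}
```

Run from scratch, such a config would first replay config-v1 through checkpoint 5 and only then apply the modified hyperparameters, so checkpoints on both sides of the change stay reproducible from a single file.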

Having that composite config stored as part of each snapshot is a requirement for this feature.
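One way the composition could work, as a minimal sketch: assume each config version records the first checkpoint it applies to and its overrides relative to the previous version. All names here (`effective_config`, `valid_from`, `settings`) are hypothetical, not an existing distrib-rl API:

```python
def effective_config(versions, checkpoint):
    """Return the settings in force when `checkpoint` was produced.

    `versions` is a list of config versions, oldest first. Each entry
    has "valid_from" (the first checkpoint it applies to) and
    "settings" (overrides relative to the previous version).
    Hypothetical sketch, not an existing distrib-rl API.
    """
    result = {}
    for version in versions:
        if version["valid_from"] > checkpoint:
            break  # later versions don't apply to this checkpoint
        result.update(version["settings"])
    return result


# config-v1 covered checkpoints 0-5; config-v2 lowered the learning
# rate starting at checkpoint 6.
versions = [
    {"valid_from": 0, "settings": {"policy_lr": 3e-4, "gamma": 0.99}},
    {"valid_from": 6, "settings": {"policy_lr": 1e-4}},
]
```

The dict returned for a given checkpoint is exactly what would be snapshotted alongside it, so each snapshot carries a config that reproduces it when run from the start.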
