Open
Labels
enhancement: New feature or request
Description
Is your feature request related to a problem? Please describe
My feature does not relate directly to a problem in flashbax.
Describe the solution you'd like
I would like to propose my implementation of “replay across experiments”. The main idea is to load one or more offline datasets and mix offline and online data during online training. One of the main motivations is that this makes it possible to recycle data from old experiments, which is particularly useful when training RL policies on real hardware, where each experiment is costly.
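To make the idea concrete, here is a minimal JAX sketch of drawing one training batch that mixes offline and online transitions. It is not the proposed implementation and does not use the flashbax or Brax APIs: the helper name `sample_mixed_batch`, the `offline_fraction` parameter, and the assumption that both datasets are pytrees with a shared structure and a leading transition axis are all hypothetical, chosen only for illustration.

```python
import jax
import jax.numpy as jnp


def sample_mixed_batch(key, offline_data, online_data, batch_size, offline_fraction=0.5):
    """Hypothetical helper: draw a batch mixing offline and online transitions.

    Assumes both datasets are pytrees with identical structure whose leaves
    share a leading axis indexing transitions.
    """
    n_offline = int(round(batch_size * offline_fraction))
    n_online = batch_size - n_offline
    key_off, key_on = jax.random.split(key)

    def take(data, k, n):
        # Number of stored transitions, read from the first leaf's leading axis.
        size = jax.tree_util.tree_leaves(data)[0].shape[0]
        # Uniform sampling with replacement.
        idx = jax.random.randint(k, (n,), 0, size)
        return jax.tree_util.tree_map(lambda x: x[idx], data)

    offline_batch = take(offline_data, key_off, n_offline)
    online_batch = take(online_data, key_on, n_online)
    # Offline samples come first, online samples second, leaf by leaf.
    return jax.tree_util.tree_map(
        lambda a, b: jnp.concatenate([a, b], axis=0), offline_batch, online_batch
    )


# Toy data: nested observations (image + state) plus a reward leaf.
offline = {
    "obs": {"image": jnp.zeros((100, 8, 8, 3)), "state": jnp.zeros((100, 4))},
    "reward": jnp.zeros((100,)),
}
online = {
    "obs": {"image": jnp.ones((20, 8, 8, 3)), "state": jnp.ones((20, 4))},
    "reward": jnp.ones((20,)),
}
batch = sample_mixed_batch(
    jax.random.PRNGKey(0), offline, online, batch_size=32, offline_fraction=0.25
)
```

Because the sampling goes through `jax.tree_util.tree_map`, the same code handles image observations and nested observation structures without special cases. A real integration would presumably sample from buffer states rather than raw arrays.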
Describe alternatives you've considered
N/A
Additional context
I have an implementation already, based on Brax's replay buffers. The main features are:
- Support for arbitrary pytrees for observations (including images)
- Loading old buffers from W&B checkpoints
My understanding is that I will need to port the core logic of my implementation to flashbax so that the interfaces match.
Misc
- Check for duplicate requests.
I am happy to contribute this, but I might need some minor help, since my experience so far has been with Brax-style RL.