Description
The project currently does not support the new gymnasium 1.0 API.
Among a few minor changes, the reset behavior of environments has changed.
Previously, the step on which an episode terminated returned next_observation as the observation from after the automatic reset. To work around this, that observation currently gets overwritten with info["final_observation"].
In the new API, the terminating step correctly returns the final observation, and the next env.step returns the "invalid" transition from before the reset to after the reset. This step cannot be used for learning, as it crosses the reset boundary.
For more information, see the Gymnasium release notes or the short writeup in the CleanRL repo.
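The new autoreset contract can be sketched with a toy stand-in environment (FakeAutoresetEnv and all names below are hypothetical, not part of gymnasium or this project): after a step with terminated=True, the next env.step performs the reset and returns the boundary transition, which must be skipped when collecting data.

```python
class FakeAutoresetEnv:
    """Toy env mimicking gymnasium >= 1.0 autoreset: episodes last 3 steps,
    and the observation counts steps since the last reset."""

    def __init__(self):
        self.t = 0
        self.needs_reset = False

    def step(self, action):
        if self.needs_reset:
            # "Invalid" transition: crosses the reset boundary.
            self.t = 0
            self.needs_reset = False
            return self.t, 0.0, False, False, {}
        self.t += 1
        terminated = self.t >= 3
        self.needs_reset = terminated
        return self.t, 1.0, terminated, False, {}

env = FakeAutoresetEnv()
obs = 0  # pretend reset() returned observation 0
autoreset = False
transitions = []
for _ in range(8):
    next_obs, reward, terminated, truncated, info = env.step(0)
    if not autoreset:
        # Only store transitions that stay within one episode.
        transitions.append((obs, next_obs, reward, terminated))
    # The step *after* a terminated/truncated step is the reset step.
    autoreset = terminated or truncated
    obs = next_obs
```

Of the 8 steps above, the two reset steps are discarded, so only the 6 in-episode transitions reach the buffer.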
The MightyAgent class currently stores the replay buffer as a list of lists of e.g. returns. The second-level list signifies the different environments.
Since the environments may cross reset boundaries at different points in time, we cannot simply discard the boundary transitions within this structure.
We also cannot keep these transitions, as they would degrade learning performance.
A solution will probably require rewriting a fair amount of the buffer code, but I am not familiar enough with this repository to gauge the effort properly.
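One possible direction is a per-environment autoreset flag that masks boundary transitions before they enter the second-level lists. This is a minimal sketch under assumed names (`buffer`, `add_step`, and the field layout are illustrative, not MightyAgent's actual API):

```python
# Hypothetical list-of-lists buffer: buffer[field][env_idx] holds one
# environment's values, mirroring the structure described above.
num_envs = 2
buffer = {"obs": [[] for _ in range(num_envs)],
          "reward": [[] for _ in range(num_envs)]}
autoreset = [False] * num_envs  # True right after env i terminated/truncated

def add_step(obs_batch, reward_batch, done_batch):
    """Append only transitions that do not cross a reset boundary."""
    for i in range(num_envs):
        if not autoreset[i]:
            buffer["obs"][i].append(obs_batch[i])
            buffer["reward"][i].append(reward_batch[i])
        # The step *after* a done is that env's invalid boundary transition.
        autoreset[i] = done_batch[i]

# Two envs terminating at different times:
add_step([10, 20], [1.0, 1.0], [True, False])   # both stored
add_step([11, 21], [1.0, 1.0], [False, False])  # env 0 skipped (boundary)
add_step([12, 22], [1.0, 1.0], [False, True])   # both stored
add_step([13, 23], [1.0, 1.0], [False, False])  # env 1 skipped (boundary)
```

Because each environment carries its own flag, the boundary transition is dropped independently per env, so the second-level lists may end up with different contents without any global synchronization.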
There is a backwards-compatible API via the autoreset mode "Same Step", but it would break some wrappers that are currently used.
I have some code lying around that lets the project run, but it currently does not discard the "faulty" trajectories.
- Adapt the code to the new reset API
- (?) Restructure replay buffer
- Change some wrapper imports
- If compatibility with older versions of Gymnasium is desirable:
  - Keep old and new code by conditionally running based on Gymnasium's version
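The version-conditional branch in the last item could look roughly like this. The helper parses a version string directly so the sketch runs without gymnasium installed; in the project one would pass `gymnasium.__version__` instead. The function name and branch bodies are assumptions, not existing code:

```python
def uses_new_reset_api(version: str) -> bool:
    """True for Gymnasium >= 1.0, where the new autoreset semantics apply."""
    major = int(version.split(".")[0])
    return major >= 1

# In the project this would be: uses_new_reset_api(gymnasium.__version__)
if uses_new_reset_api("1.0.0"):
    pass  # new-API path: skip the transition after a terminated/truncated step
else:
    pass  # old-API path: overwrite next_obs with info["final_observation"]
```

Keeping the check in one helper means every call site branches on the same predicate rather than re-parsing the version string.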