Confusing training results #3

Hi, thanks for the great work!

I am writing to ask for your advice. I used your task file "imitation_learning" for my RL training, running the PPO algorithm on both the go1 and a1 robots. However, the results are poor (a GIF is attached below). Do you have any idea why the resulting motion looks so strange? Thank you!

Best
![reward_69_episode_300](https://user-images.githubusercontent.com/64319636/220424748-0f4826a8-9731-4c4a-b7ae-a60593404cff.gif)


