Difficulties when Training RL Policies

I tried running the RL training scripts for several tasks (Stabilize, Reach and Grasp, and Insert) with:
`python3 main/rl/train.py task=<task_name> sim_device=cuda:<gpu_id> rl_device=cuda:<gpu_id> graphics_device_id=<gpu_id>`
However, none of the RL agents learned to complete its task, even after training for a long time (an example for ReachAndGraspSingle is shown below). I used the default `num_envs` from each task's config file. Are there any hyperparameters I need to tune?
![Screenshot from 2024-07-22 13-12-59](https://github.com/user-attachments/assets/82a7257a-e861-44e0-822f-ed8552fd2103)

