Add Python 3.9 and Gymnasium 1.0 support #2

Merged
MarcDcls merged 8 commits into Rhoban:main from araffin:fix/py39
Nov 23, 2024

@araffin (Contributor) commented Nov 22, 2024

See #1
Also reformatted the code and fixed some warnings.

@araffin (Contributor, Author) commented Nov 22, 2024

Note for later:
I'm having good results (in less than 100 000 steps) using TQC with Simba:

python train_sbx.py --algo tqc --env frasa-standup-v0 -c hyperparams/simba_tqc.py \
-param n_envs:30 gradient_steps:300 policy_delay:10 --verbose 0 \
--eval-episodes 20 --n-eval-envs 5 -P --vec-env subproc -n 200000
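
The command passes several config overrides as `key:value` pairs (for example `n_envs:30`, which replaces the `n_envs=1` default in the config file). A rough sketch of how such pairs could be parsed and layered over the defaults; `parse_overrides` is a hypothetical helper written for illustration, not the training script's actual code:

```python
import ast


def parse_overrides(pairs):
    """Turn ["n_envs:30", ...] into {"n_envs": 30, ...} (hypothetical helper)."""
    overrides = {}
    for pair in pairs:
        key, _, raw_value = pair.partition(":")
        # literal_eval safely converts the string "30" to the int 30
        overrides[key] = ast.literal_eval(raw_value)
    return overrides


# Two defaults taken from the config file; the rest are omitted for brevity.
defaults = {"n_envs": 1, "learning_starts": 10_000}
overrides = parse_overrides(["n_envs:30", "gradient_steps:300", "policy_delay:10"])

# Later keys win, so command-line values replace the file defaults.
merged = {**defaults, **overrides}
print(merged)
```

Values given on the command line take precedence in this sketch, which matches the intent of the command: 30 environments instead of the single one in the file.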

simba_tqc.py (needs araffin/sbx#59)

import optax


default_hyperparams = dict(
    n_envs=1,
    n_timesteps=int(5e5),
    policy="SimbaPolicy",
    # policy="MlpPolicy",
    # learning_rate=3e-4,
    # qf_learning_rate=1e-3,
    policy_kwargs={
        "optimizer_class": optax.adamw,
        "net_arch": {"pi": [128], "qf": [256, 256]},
        # "dropout_rate": 0.01,
        "n_critics": 2,
    },
    learning_starts=10_000,
    normalize={"norm_obs": True, "norm_reward": False},
)

hyperparams = {}

for env_id in [
    "HalfCheetah-v4",
    "Humanoid-v4",
    "HalfCheetahBulletEnv-v0",
    "Ant-v4",
    "Hopper-v4",
    "Walker2d-v4",
    "Swimmer-v4",
    "AntBulletEnv-v0",
    "HopperBulletEnv-v0",
    "Walker2DBulletEnv-v0",
    "BipedalWalkerHardcore-v3",
    "Pendulum-v1",
    "frasa-standup-v0",
]:
    hyperparams[env_id] = default_hyperparams.copy()
    if "Bullet" in env_id:
        hyperparams[env_id].update({"gamma": 0.98})

    if "Swimmer" in env_id:
        hyperparams[env_id].update({"gamma": 0.999})
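
One thing to keep in mind if this config is extended: `dict.copy()` is shallow, so every environment entry shares the same `policy_kwargs` dict. The per-environment `gamma` updates are top-level and therefore safe. A minimal sketch, using a trimmed stand-in config (an assumption for illustration), of the difference and of `copy.deepcopy` as one option if nested values ever need per-environment changes:

```python
import copy

# Stand-in for the config; the optimizer entry is omitted so the
# sketch runs without optax installed.
default_hyperparams = dict(
    n_envs=1,
    policy_kwargs={"net_arch": {"pi": [128], "qf": [256, 256]}, "n_critics": 2},
)

# Shallow copy: a new top-level dict, but nested dicts are shared.
shallow = default_hyperparams.copy()
shallow.update({"gamma": 0.98})  # top-level key, defaults stay untouched
gamma_leaked = "gamma" in default_hyperparams
shares_nested = shallow["policy_kwargs"] is default_hyperparams["policy_kwargs"]

# Deep copy: nested edits stay local to the copy.
deep = copy.deepcopy(default_hyperparams)
deep["policy_kwargs"]["n_critics"] = 5

print(gamma_leaked, shares_nested, default_hyperparams["policy_kwargs"]["n_critics"])
```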

@MarcDcls MarcDcls merged commit d0b55d6 into Rhoban:main Nov 23, 2024