https://rchalyang.github.io/SoftModule/
"epoch_frames" : 200
"batch_size" : 1280
torchrl/algo/off_policy/twin_sac_q.py
get sparse loss here
"""
Policy Loss
"""
if not self.reparameterization:
    raise NotImplementedError
else:
    assert log_probs.shape == q_new_actions.shape
    policy_loss = (alpha * log_probs - q_new_actions).mean()

std_reg_loss = self.policy_std_reg_weight * (log_std**2).mean()
mean_reg_loss = self.policy_mean_reg_weight * (mean**2).mean()

policy_loss += std_reg_loss + mean_reg_loss
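
To check the arithmetic of the policy loss by hand, here is a minimal plain-Python sketch of the same formula. It is not the code from twin_sac_q.py: the function name policy_loss_sketch, the use of flat lists instead of tensors, and the default values for alpha and the two regularization weights are all assumptions made for illustration.

```python
from statistics import fmean


def policy_loss_sketch(log_probs, q_new_actions, log_std, mean,
                       alpha=0.2, std_reg_weight=1e-3, mean_reg_weight=1e-3):
    """Plain-Python stand-in for the tensor expression in the notes.

    All defaults are hypothetical; the notes do not give the real values.
    """
    # Mirrors the shape assertion: one log-prob per Q-value.
    if len(log_probs) != len(q_new_actions):
        raise ValueError("log_probs and q_new_actions must have the same length")

    # (alpha * log_probs - q_new_actions).mean()
    sac_term = fmean(alpha * lp - q for lp, q in zip(log_probs, q_new_actions))

    # weight * (x**2).mean() penalties on the policy's log-std and mean outputs
    std_reg_loss = std_reg_weight * fmean(s ** 2 for s in log_std)
    mean_reg_loss = mean_reg_weight * fmean(m ** 2 for m in mean)

    return sac_term + std_reg_loss + mean_reg_loss


if __name__ == "__main__":
    loss = policy_loss_sketch(
        log_probs=[-1.0, -3.0],
        q_new_actions=[2.0, 4.0],
        log_std=[1.0, -1.0],
        mean=[2.0, 0.0],
        alpha=0.5,
        std_reg_weight=0.1,
        mean_reg_weight=0.05,
    )
    print(loss)
```

With these toy inputs the entropy-weighted term averages to -4.0 and each regularizer contributes 0.1, so the two penalties only nudge the loss; in the real code their size depends on policy_std_reg_weight and policy_mean_reg_weight.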
mujoco210-linux-x86_64.tar.gz
local_debug_logger-master.zip