Question about student distillation #3

@1242433760

` #! Cloning obs directly like this seems problematic: the student policy's priv_self_info_latent then comes straight from the privileged state rather than from the estimator's prediction.
# obs_est = obs.clone()
# priv_state_estimated = self.alg.estimator(obs_est[: , :self.alg.num_prop])
# obs_est[:, self.alg.num_prop+self.alg.num_scandots:self.alg.num_prop+self.alg.num_scandots+self.alg.priv_states_dim] = priv_state_estimated
# obs_student = obs_est.clone()

            obs_student = obs.clone()
            
            # Feed the latent output by the depth encoder into the actor backbone. priv_self_info uses the privileged values; priv_env_info uses the history_encoder output.
            actions_student = self.alg.depth_actor(obs_student, use_historyestimate=True, terrain_scandots_latent=terrain_depth_latent, ceiling_scandots_latent=ceiling_depth_latent)
            actions_student_buffer.append(actions_student)`

Why does the student's priv_self_info use the ground-truth values during distillation, rather than the values predicted by the estimator?
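
To make the question concrete, here is a minimal standalone sketch of the commented-out alternative from the snippet above. It assumes the observation layout is [proprioception | scandots | privileged states | rest], which is only inferred from the slice indices in the snippet. The helper name `splice_estimated_priv`, the `Linear` stand-in for the estimator, and all sizes are hypothetical and not taken from the repository.

```python
import torch


def splice_estimated_priv(obs, estimator, num_prop, num_scandots, priv_states_dim):
    """Return a copy of obs whose privileged-state slice holds the estimator's prediction."""
    start = num_prop + num_scandots      # assumed start of the privileged-state slice
    end = start + priv_states_dim        # assumed end of the privileged-state slice
    obs_est = obs.clone()                # leave the original observation untouched
    # The estimator only sees the proprioceptive part of the observation.
    obs_est[:, start:end] = estimator(obs[:, :num_prop])
    return obs_est


# Hypothetical sizes and a stand-in estimator, for illustration only.
num_prop, num_scandots, priv_states_dim = 4, 3, 2
estimator = torch.nn.Linear(num_prop, priv_states_dim)
obs = torch.randn(5, num_prop + num_scandots + priv_states_dim + 6)

obs_student = splice_estimated_priv(obs, estimator, num_prop, num_scandots, priv_states_dim)
```

With this variant the student never reads the ground-truth privileged slice, which is what it would face at deployment. With a plain `obs.clone()`, that slice is passed through unchanged.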
