Problem with student distillation #3

```python
#! Cloning obs directly like this looks wrong. The priv_self_info_latent in the student policy
#! then comes straight from the privileged values rather than from the estimator's prediction.
# obs_est = obs.clone()
# priv_state_estimated = self.alg.estimator(obs_est[:, :self.alg.num_prop])
# obs_est[:, self.alg.num_prop+self.alg.num_scandots:self.alg.num_prop+self.alg.num_scandots+self.alg.priv_states_dim] = priv_state_estimated
# obs_student = obs_est.clone()

obs_student = obs.clone()

# Feed the latent produced by the depth encoder into the actor backbone.
# priv_self_info uses the privileged values; priv_env_info uses the history_encoder output.
actions_student = self.alg.depth_actor(obs_student, use_historyestimate=True, terrain_scandots_latent=terrain_depth_latent, ceiling_scandots_latent=ceiling_depth_latent)
actions_student_buffer.append(actions_student)
```

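During distillation, why does the student's priv_self_info use the ground-truth privileged values instead of the values predicted by the estimator?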
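To make the question concrete, here is a minimal sketch of the substitution that the commented-out lines in the snippet above seem to intend. It assumes the observation is laid out as `[proprioception | scandots | privileged states | ...]`, which I inferred from the slice indices. The function name `splice_estimated_priv_states` and the dummy estimator are hypothetical, and plain Python lists stand in for one row of the batched tensor, so this is an illustration of the indexing only, not the repository's implementation.

```python
from typing import Callable, List, Sequence


def splice_estimated_priv_states(
    obs_row: Sequence[float],
    estimator: Callable[[Sequence[float]], Sequence[float]],
    num_prop: int,
    num_scandots: int,
    priv_states_dim: int,
) -> List[float]:
    """Return a copy of obs_row whose privileged-state slice holds the estimate."""
    # Assumed layout: [prop | scandots | priv_states | rest]
    start = num_prop + num_scandots
    end = start + priv_states_dim

    # The estimator only sees the proprioceptive part of the observation.
    estimate = list(estimator(obs_row[:num_prop]))
    if len(estimate) != priv_states_dim:
        raise ValueError("estimator output must have priv_states_dim entries")

    obs_student = list(obs_row)  # copy, so the original observation is untouched
    obs_student[start:end] = estimate  # overwrite the ground-truth privileged states
    return obs_student


# Toy layout: 3 proprioceptive values, 2 scandots, 2 privileged states, 1 trailing value.
obs = [0.1, 0.2, 0.3, 9.0, 9.0, 1.5, -1.5, 7.0]
dummy_estimator = lambda prop: [sum(prop), 0.0]  # hypothetical stand-in for the estimator
obs_student = splice_estimated_priv_states(obs, dummy_estimator, 3, 2, 2)
```

With this substitution the student would never read the ground-truth privileged states, which I would expect to match the deployment setting where those values are unavailable.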
