How about RL agent training? #1
Open
Hi, thank you for your great work!
It seems that the provided code only trains the VAE model for each view.
What about the PPO agent that learns to maximize the cumulative reward using those trained models?
To produce evaluations like Figure 4 or Table 1 in the paper, I assume you have already implemented the PPO agent as well
(much like Algorithms 1 and 2 in Appendix A).
Would it be possible for you to provide the RL code as well?
Thank you.