Description
Hi, I am trying to visualize the skill space, similar to your Figure 5.
QueST's skill-prior transformer predicts 8 tokens for every 8 steps of a rollout (the action horizon).
I assume these 8 token indices index into the skill codebook (256-dimensional vectors), not the skill-GPT embedding (384-dimensional).
```
** rollouts 1 of KITCHEN_SCENE10_close_the_top_drawer_of_the_cabinet starting
** Step: 0 : tensor([[746, 639, 631, 614, 644, 459, 506, 466]], device='cuda:0')
** Step: 8 : tensor([[907, 639, 638, 812, 851, 659, 507, 467]], device='cuda:0')
** Step: 16 : tensor([[909, 835, 835, 827, 819, 706, 587, 588]], device='cuda:0')
** Step: 24 : tensor([[919, 833, 834, 826, 939, 980, 780, 388]], device='cuda:0')
** Step: 32 : tensor([[799, 824, 865, 948, 781, 781, 581, 388]], device='cuda:0')
** Step: 40 : tensor([[799, 906, 948, 781, 773, 773, 572, 396]], device='cuda:0')
** Step: 48 : tensor([[798, 906, 980, 774, 774, 573, 373, 148]], device='cuda:0')
** Step: 56 : tensor([[790, 779, 980, 774, 773, 181, 108, 68]], device='cuda:0')
** Step: 64 : tensor([[790, 979, 780, 773, 132, 109, 156, 108]], device='cuda:0')
```
Above is an example of the printed token indices.
Interestingly, consecutive steps seem to share some tokens at the same positions (e.g. 639, 388, 799, 980, 790, ...).
From my understanding, these 8 skill codes are not causal: they are cross-attended by the action decoder rather than fed in autoregressively.
So, for now, I have average-pooled the 8 vectors into one point per 8-step chunk, but I would like to ask you for details.
How should I visualize the 8 vectors (256-dimensional each) per 8-step chunk to reproduce a figure similar to yours?
- Does each dot in Figure 5 correspond to one 8-step chunk of a rollout episode?
- Which embedding is plotted with t-SNE: the skill-prior transformer's embedding for each token, or the skill codebook vectors from the autoencoder?
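For reference, here is a minimal sketch of my current average-pooling approach. The codebook shape, the random placeholder weights, and the token chunks below are illustrative assumptions (taken from the printout above), not your actual checkpoint; it just shows how I look up codebook vectors, pool each 8-token chunk, and project with scikit-learn's t-SNE:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Placeholder codebook: assumed (codebook_size, 256); replace with the
# real codebook weights from the trained autoencoder.
codebook = rng.standard_normal((1024, 256)).astype(np.float32)

# Token indices predicted every 8 steps (copied from the rollout log above).
token_chunks = np.array([
    [746, 639, 631, 614, 644, 459, 506, 466],
    [907, 639, 638, 812, 851, 659, 507, 467],
    [909, 835, 835, 827, 819, 706, 587, 588],
])

# Look up the 256-d codebook vector for each token index, then
# average-pool the 8 vectors so each 8-step chunk becomes one point.
chunk_embeddings = codebook[token_chunks].mean(axis=1)  # (num_chunks, 256)

# Project the pooled vectors to 2-D for a scatter plot; perplexity must
# stay below the number of samples, so it is tiny for this toy example.
tsne = TSNE(n_components=2, perplexity=2, random_state=0)
points = tsne.fit_transform(chunk_embeddings)  # (num_chunks, 2)
```

Each row of `points` would then be one dot in the scatter plot, which is why I am asking whether one dot in Figure 5 corresponds to one chunk or to one token.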
Thank you!