ViT training? #10

Hello, 

Thank you for your very interesting work! I'm currently trying to replicate your results with the provided codebase, and I was wondering whether you also tested a Vision Transformer (ViT) architecture as the encoder. You compare against DINO in the paper, so I wanted to know whether you were able to obtain properties close to what they report, in particular attention maps that act as a kind of saliency map around the object of interest.

Thank you in advance for your response!
