Hi!
While reading your source code, I noticed that you set vocab_size = self.codebook_size + 1000 + 1 in the token embedding stage. Why not directly set vocab_size = self.codebook_size? What do the extra 1001 embeddings represent? Are they the embeddings of the 1000 class labels plus the mask token? And is it correct to say that, when there is no class condition, vocab_size should instead be set to self.codebook_size + 1?
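To make my question concrete, here is a minimal sketch of how I currently read the vocabulary layout. All the names, the 16384 codebook size, and the index ranges below are my own assumptions for illustration, not taken from your code:

```python
import torch
import torch.nn as nn

codebook_size = 16384   # VQ codebook entries (placeholder value, my assumption)
num_classes = 1000      # my guess for the "+1000": ImageNet class labels
num_special = 1         # my guess for the "+1": one mask / unconditional token

vocab_size = codebook_size + num_classes + num_special
tok_emb = nn.Embedding(vocab_size, 768)

# Under this reading, the index ranges would be:
#   [0, codebook_size)                    -> image tokens from the VQ codebook
#   [codebook_size, codebook_size + 1000) -> class-condition tokens
#   codebook_size + 1000                  -> the single extra special token
class_id = 7
class_token = torch.tensor([codebook_size + class_id])
class_embedding = tok_emb(class_token)   # shape: (1, 768)
```

Is this the intended partition of the embedding table?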
Looking forward to your reply!