Request for using TopClus on different pretrained language models #4

@RobertoCorti

Description

Hi,

I've read your paper and I like this approach. Thank you for sharing the code. I have one question regarding the pretrained language models (PLMs) you use to obtain the contextualized word representations. I saw in the source code that the model is fixed to the classic 'bert-base-uncased':

pretrained_lm = 'bert-base-uncased'

Suppose I'm interested in using this method on a corpus of Italian texts. In that case, would it be possible to change this model and use 'bert-base-multilingual-uncased' instead?

If that's possible, can we make pretrained_lm a parameter of the TopClusTrainer?
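As a rough sketch of what I have in mind (the class and method names here are hypothetical and only illustrate the idea, not the actual TopClus code): the checkpoint name would become a constructor argument with the current value as its default, and any Hugging Face checkpoint, including a multilingual BERT, could then be passed in.

```python
class TopClusTrainer:
    """Hypothetical sketch: pretrained_lm as a constructor parameter
    instead of a hard-coded constant."""

    def __init__(self, pretrained_lm: str = "bert-base-uncased"):
        self.pretrained_lm = pretrained_lm

    def load_language_model(self):
        # Deferred import so the class can be constructed without the
        # transformers dependency installed. AutoTokenizer/AutoModel
        # resolve any Hugging Face checkpoint name.
        from transformers import AutoModel, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained(self.pretrained_lm)
        model = AutoModel.from_pretrained(self.pretrained_lm)
        return tokenizer, model

# For an Italian corpus, one would simply pass a multilingual checkpoint:
trainer = TopClusTrainer(pretrained_lm="bert-base-multilingual-uncased")
```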

Thank you.
