Request for using TopClus on different pretrained language models #4

@RobertoCorti

Description

Hi,

I've read your paper and I like this approach. Thank you for sharing the code. I have one question regarding the pretrained language models (PLMs) you use to obtain the contextualized word representations. I saw in the source code that the model is fixed to the classic 'bert-base-uncased':

pretrained_lm = 'bert-base-uncased'

Suppose I'm interested in using this method on a corpus of Italian texts. In that case, would it be possible to change this model and use 'bert-base-multilingual-uncased' instead?

If that's possible, can we make pretrained_lm a parameter of the TopClusTrainer?
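As a rough sketch of what I have in mind (the class and method names here are hypothetical and only illustrate the idea, not the actual TopClus code): the checkpoint name would become a constructor argument with the current value as its default, and any Hugging Face checkpoint, including a multilingual BERT, could then be passed in.

```python
class TopClusTrainer:
    """Hypothetical sketch: pretrained_lm as a constructor parameter
    instead of a hard-coded constant."""

    def __init__(self, pretrained_lm: str = "bert-base-uncased"):
        self.pretrained_lm = pretrained_lm

    def load_language_model(self):
        # Deferred import so the class can be constructed without the
        # transformers dependency installed. AutoTokenizer/AutoModel
        # resolve any Hugging Face checkpoint name.
        from transformers import AutoModel, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained(self.pretrained_lm)
        model = AutoModel.from_pretrained(self.pretrained_lm)
        return tokenizer, model

# For an Italian corpus, one would simply pass a multilingual checkpoint:
trainer = TopClusTrainer(pretrained_lm="bert-base-multilingual-uncased")
```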

Thank you.
