Remove language limits

Hi!

I came across your project and its goals, and I'm curious why the supported languages are so strictly limited (only en/ru). I see that you [used spaCy to tokenize text](https://github.com/mts-ai/OpenAutoNLU/blob/main/open_autonlu/data/utils.py#L23C5-L23C26), and this seems a bit off to me. I agree that spaCy tokenization can be beneficial for some use cases, but it seems out of place for a project that aims to simplify NER and language understanding pipelines, since it heavily limits others' ability to use your solution. I am not sure whether spaCy is a crucial step in your pipeline, but I want to highlight that it only supports 24 languages out of the box.

Are there any plans to expand the set of supported languages or to remove this limitation entirely?
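
To illustrate what I mean, here is a minimal sketch of a language-agnostic fallback tokenizer that could be used when no spaCy model is available for a language. The name `simple_tokenize` is hypothetical and not part of OpenAutoNLU or spaCy, and I am assuming the pipeline only needs a flat list of token strings, which I have not verified.

```python
import re
from typing import List

# Hypothetical fallback, not taken from OpenAutoNLU or spaCy.
# \w+ matches a run of Unicode letters, digits, or underscores;
# [^\w\s] matches a single punctuation or symbol character.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def simple_tokenize(text: str) -> List[str]:
    """Split text into word and punctuation tokens, independent of language."""
    return _TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(simple_tokenize("Hello, world!"))
    print(simple_tokenize("Привет, мир"))
```

This relies on whitespace and punctuation, so it would not segment languages written without spaces between words (for example Chinese or Japanese); those would still need a dedicated tokenizer.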

Remove language limits #1