Bad performance on single word #1

@ouyangliqi

The translation looks wrong when the input is a single word: the translated word is repeated, and sometimes stray markup appears in the output.

For example,
"Earth" -> "地球 地球 地球 地球 地球 地球 {\cHFFFFFF}{\3cH2F2F2F}{\4cH000000}Earth."
"Cars" -> "汽车汽车"
"Car" -> "车车 车"
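
The repetition in these outputs can be measured with a small check. The sketch below is only a crude heuristic and not part of the model or the `transformers` library: the helper name `max_consecutive_repeats` is hypothetical, and it counts how many back-to-back copies of the same unit appear in a string, optionally separated by whitespace.

```python
import re

# A shortest non-whitespace unit that is immediately repeated at least once,
# with optional whitespace between the copies.
_REPEAT = re.compile(r"(\S+?)(?:\s*\1)+")


def max_consecutive_repeats(text: str) -> int:
    """Return the largest number of back-to-back copies of any unit (1 if none)."""
    best = 1
    for match in _REPEAT.finditer(text):
        unit = match.group(1)
        # Count the copies of the unit inside the matched run.
        copies = len(re.findall(re.escape(unit), match.group(0)))
        best = max(best, copies)
    return best


# The three outputs reported above, copied verbatim.
reported = {
    "Earth": r"地球 地球 地球 地球 地球 地球 {\cHFFFFFF}{\3cH2F2F2F}{\4cH000000}Earth.",
    "Cars": "汽车汽车",
    "Car": "车车 车",
}

for source, translation in reported.items():
    print(source, max_consecutive_repeats(translation))
```

On the reported outputs this gives 6, 2 and 3 copies respectively. Legitimate doubled characters (for example the "oo" in "good") also score 2, so a threshold of 3 or more may be a safer signal of degenerate output.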

The model is used as shown below.

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

device = "cuda" if torch.cuda.is_available() else "cpu"

# One full sentence followed by three single-word inputs.
examples = ["Truth, good and beauty have always been considered as the three top pursuits of human beings",
            "Car",
            "Cars",
            "Earth."]

model_name_or_path = "Helsinki-NLP/opus-mt-en-zh"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
# AutoModelWithLMHead is deprecated; AutoModelForSeq2SeqLM is its seq2seq replacement.
model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
model.eval().to(device)

# Pad the batch to a common length and move it to the same device as the model.
inputs = tokenizer(examples, padding=True, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_length=128)
print("Helsinki-NLP/opus-mt-en-zh model outputs:")
print([tokenizer.decode(ids, skip_special_tokens=True) for ids in outputs])
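
One thing that might be worth trying is passing repetition-related decoding options to `generate`. The sketch below is an untested assumption, not a confirmed fix for this model: `num_beams`, `no_repeat_ngram_size` and `repetition_penalty` are standard `generate` arguments, but the values are illustrative, and the names `ANTI_REPETITION_KWARGS` and `translate` are hypothetical.

```python
# Illustrative decoding settings; the values are guesses and are not tuned.
ANTI_REPETITION_KWARGS = {
    "max_length": 128,
    "num_beams": 4,
    "no_repeat_ngram_size": 2,   # forbid repeating any 2-token sequence
    "repetition_penalty": 1.2,   # down-weight tokens that were already generated
}


def translate(texts, tokenizer, model, device="cpu", **overrides):
    """Translate a batch of strings with the settings above (overridable per call)."""
    gen_kwargs = {**ANTI_REPETITION_KWARGS, **overrides}
    inputs = tokenizer(texts, padding=True, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, **gen_kwargs)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the `tokenizer`, `model` and `device` from the snippet above, this would be called as `translate(["Car", "Cars", "Earth."], tokenizer, model, device)`. Note that `no_repeat_ngram_size` works on subword tokens, so it could also block legitimate repeats in longer sentences.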
