Question about data preprocessing for SFT #18

@Fu-Fu-Fu-Fu

Description

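I see that your SFT script tokenizes the data with 16 workers and a batch size of 32. In my own experiments, tokenization is very slow with these settings. Is that normal? Is there any way to speed it up?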
