
pad_token_id setting in generation_config #12

@hwq0726

Description


Hi,

I encountered a bug while using the "Generate watermarked output" section in a Colab notebook. After loading the model and immediately calling model.generate, the following error is raised:

ValueError: stopping_criteria is not empty, pad_token_id must be set in generation_config.

Here is the snippet from your Colab notebook that reproduces the issue:
model = load_model(MODEL_NAME, expected_device=DEVICE, enable_watermarking=True)
torch.manual_seed(0)
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,
    max_length=1024,
    top_k=40,
)

The pad_token_id in the config is set to 0, which evaluates as falsy in a check like `if not pad_token_id`, so it is treated as unset. I added this line to work around it:
model.generation_config.pad_token_id = model.generation_config.eos_token_id
This resolves the error, but the default value or the truthiness-based handling of pad_token_id might need review.
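A minimal sketch of the falsy-zero pitfall described above (standalone, not tied to any specific library version): a pad_token_id of 0 is a valid token id, but a plain truthiness check cannot distinguish it from None.

```python
pad_token_id = 0  # a valid token id, e.g. from the model's config

# Buggy check: true for both None and 0, so a legitimate
# pad_token_id of 0 is mistaken for "not set".
if not pad_token_id:
    print("treated as unset (incorrect for 0)")

# Safer check: only true when the value is genuinely missing.
if pad_token_id is None:
    print("truly unset")
else:
    print("pad_token_id is set to", pad_token_id)
```

If the library's validation used `is None` instead of a truthiness test, a config with pad_token_id = 0 would pass without the eos_token_id workaround.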

BTW, this error did not occur a few days ago; maybe something changed in the recent Python-compatibility fix?

Could you please look into this? Thanks!
