@Jameswlepage
What:

  • Bug Fix
  • New Feature

Description:

The stopping criteria in the generation loop were incorrectly receiving $generatedInputIds (only the newly generated token from the current step) instead of $allInputIds (the full sequence, including the prompt and all generated tokens).

This caused MaxLengthCriteria to never trigger based on sequence length, because it was always checking a sequence of length 1, which would never exceed max_length. As a result, text generation would only stop once the model emitted an EOS token or the process ran out of memory.
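To illustrate the failure mode, here is a minimal sketch of a length-based stopping check. MaxLengthSketch is a hypothetical stand-in, not the library's MaxLengthCriteria, and the `>=` comparison and the token ids are assumptions made for the example. It requires PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a max-length stopping criterion (not the library's class).
final class MaxLengthSketch
{
    public function __construct(private int $maxLength) {}

    /** @param int[] $inputIds Token ids of a single sequence. */
    public function __invoke(array $inputIds): bool
    {
        // Assumed rule: stop once the sequence has reached the configured length.
        return count($inputIds) >= $this->maxLength;
    }
}

$criterion = new MaxLengthSketch(8);

// Made-up token ids: the full sequence versus only the newest token.
$allInputIds = [101, 7, 8, 9, 10, 11, 12, 13];
$generatedInputIds = [13];

$stopWithFullSequence = $criterion($allInputIds);      // length 8 reaches the limit
$stopWithNewestToken = $criterion($generatedInputIds); // length 1 never reaches it
```

With the full sequence the check fires as soon as the limit is reached. With only the newest token the length stays at 1, so the check cannot fire for any max_length greater than 1.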

The Fix:

- $stop = $stoppingCriteria($generatedInputIds, $scores);
+ $stop = $stoppingCriteria($allInputIds, $scores);

Testing:

Verified that the maxNewTokens parameter now correctly limits generation length.
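A check along these lines can be reproduced without a model. The sketch below is a toy loop, not the library's generation code: countGenerationSteps, the dummy token, the step cap, and the assumption that max_length equals the prompt length plus maxNewTokens are all hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Toy generation loop (hypothetical helper). Appends a dummy token per step and
 * stops when the checked sequence reaches the length limit or the step cap is hit.
 *
 * @param int[] $promptIds
 */
function countGenerationSteps(
    array $promptIds,
    int $maxNewTokens,
    bool $checkFullSequence,
    int $stepLimit = 50
): int {
    // Assumption: the length limit is the prompt length plus maxNewTokens.
    $maxLength = count($promptIds) + $maxNewTokens;
    $allInputIds = $promptIds;
    $steps = 0;

    while ($steps < $stepLimit) {
        $nextToken = 42; // dummy token; no model is involved
        $allInputIds[] = $nextToken;
        $generatedInputIds = [$nextToken];
        $steps++;

        // Old behaviour checked only the newest token; fixed behaviour checks everything.
        $checked = $checkFullSequence ? $allInputIds : $generatedInputIds;
        if (count($checked) >= $maxLength) {
            break;
        }
    }

    return $steps;
}

$fixedSteps = countGenerationSteps([1, 2, 3], 5, true);  // stops after 5 new tokens
$buggySteps = countGenerationSteps([1, 2, 3], 5, false); // runs until the step cap of 50
```

The step cap exists only so that the buggy variant terminates in the example. In the real loop nothing comparable bounds it apart from an EOS token or available memory.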

@Jameswlepage closed this by deleting the head repository on Jan 26, 2026