Megatron-LM changes to make Hyena/Evo 2 inference usable, especially for 40B models #1727

Draft
antonvnv wants to merge 9 commits into NVIDIA:main from antonvnv:next

@antonvnv (Contributor) commented Aug 1, 2025

No description provided.

antonvnv added 8 commits July 22, 2025 17:54
This is needed to make Evo 2 40B work on A6000 Ada x2.

Needed to pass None as inference params when we do a cache-less forward
pass.

If the prompt length exceeds this value, the prompt is split into segments.

This feature makes it possible to process very large prompts that would
otherwise cause an out-of-memory (OOM) error during the forward pass.

Here's how it works. When the input prompt length exceeds this
threshold, the generation process is split into three phases:

1. One large forward pass over the input tokens up to the threshold value.

2. The rest of the prompt, beyond the threshold, is processed
token by token without sampling. This phase runs at token
generation speed (throughput).

3. Regular generation: once the input prompt is fully processed,
normal token generation with sampling resumes.
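
The three phases above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `generate_with_prompt_segmentation`, `toy_step`, `greedy`, and the `threshold` parameter name are all hypothetical, the toy model stands in for a real forward pass with a cache, and the prompt is assumed to be non-empty.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for a model step: takes new tokens plus an opaque
# cache and returns (logits for the last position, updated cache).
StepFn = Callable[[List[int], Optional[list]], Tuple[List[float], list]]


def generate_with_prompt_segmentation(
    step: StepFn,
    prompt: List[int],
    threshold: int,
    max_new_tokens: int,
    sample: Callable[[List[float]], int],
) -> Tuple[List[int], List[str]]:
    """Return the generated tokens and a log of the phase used for each step."""
    phases: List[str] = []

    # Phase 1: one large forward pass over the first `threshold` tokens.
    head, tail = prompt[:threshold], prompt[threshold:]
    logits, cache = step(head, None)
    phases.append(f"prefill:{len(head)}")

    # Phase 2: feed the remaining prompt tokens one at a time. Logits are
    # computed but not sampled, because the next token is already known.
    for tok in tail:
        logits, cache = step([tok], cache)
        phases.append("prompt_token")

    # Phase 3: regular generation with sampling.
    out: List[int] = []
    for i in range(max_new_tokens):
        nxt = sample(logits)
        out.append(nxt)
        phases.append("generate")
        if i + 1 < max_new_tokens:
            logits, cache = step([nxt], cache)
    return out, phases


def toy_step(tokens: List[int], cache: Optional[list]) -> Tuple[List[float], list]:
    """Toy model (assumption): predicts (last token + 1) mod 4."""
    cache = (cache or []) + list(tokens)
    target = (cache[-1] + 1) % 4
    return [1.0 if i == target else 0.0 for i in range(4)], cache


def greedy(logits: List[float]) -> int:
    """Pick the index of the largest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


tokens, phases = generate_with_prompt_segmentation(
    toy_step, [0, 1, 2, 3, 0], threshold=3, max_new_tokens=2, sample=greedy
)
print(tokens, phases)
```

In this sketch, a threshold at or above the prompt length leaves phase 2 with nothing to do, so the whole prompt goes through the single large forward pass and the sampled tokens are the same.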

Logits reporting is required for Evo 2 NIM.
@sbhavani sbhavani added the enhancement New feature or request label Aug 2, 2025
@ko3n1g ko3n1g requested review from a team as code owners February 18, 2026 09:18
@Phlip79 (Member) commented Mar 4, 2026

We are changing our review process and marking all open, unlabeled PRs as draft. This change will go into effect once #3659 is merged.

Moving forward, all PRs will be required to start as draft PRs. If you wish to get your PR merged, mark your PR as “Ready for review”. Read more about the new process at submit.md.

@Phlip79 Phlip79 marked this pull request as draft March 4, 2026 23:00