add stop+go tests to llama3 recipe, turn off async checkpointing for fp8 #1494
Signed-off-by: Peter St. John <pstjohn@nvidia.com>
cspades left a comment
Just NaN and infinity right? LGTM, will add golden value parity tests on TE side as well!
Async DCP checkpointing currently does not work with FP8 model init, so we need to detect this case and fall back to synchronous checkpointing. This PR also adds stop+go tests to ensure the DCP checkpoints are functional.
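
For reviewers, the fallback described above could look roughly like the sketch below. It is only an illustration of the detect-and-switch logic, not the code in this PR: `CheckpointConfig`, its field names, and `resolve_async_save` are hypothetical names made up for the example.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class CheckpointConfig:
    """Hypothetical config; the field names are illustrative, not the recipe's."""

    async_save: bool = True
    fp8_model_init: bool = False


def resolve_async_save(cfg: CheckpointConfig) -> bool:
    """Return True if checkpoints should be saved asynchronously.

    Falls back to synchronous saving when FP8 model init is enabled,
    since async DCP checkpointing does not currently work in that case.
    """
    if cfg.async_save and cfg.fp8_model_init:
        logger.warning(
            "Async DCP checkpointing is not supported with FP8 model init; "
            "falling back to synchronous checkpointing."
        )
        return False
    return cfg.async_save
```

In a training loop, the returned flag would then select between an asynchronous save (for example `torch.distributed.checkpoint.async_save`) and a synchronous one (`torch.distributed.checkpoint.save`); how the recipe actually wires this up is not shown here.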