Skip to content

fix cookbook#152

Open
tastelikefeet wants to merge 3 commits intomodelscope:mainfrom
tastelikefeet:fix/0411-1
Open

fix cookbook#152
tastelikefeet wants to merge 3 commits intomodelscope:mainfrom
tastelikefeet:fix/0411-1

Conversation

@tastelikefeet
Copy link
Copy Markdown
Collaborator

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Write the detail information belongs to this PR.

Experiment results

Paste your experiment result here(if needed).

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates several cookbook examples with revised GPU configurations and model IDs, and refactors the loss calculation logic to support separate training and evaluation statuses. It also includes bug fixes for variable references in the vLLM engine and sequence length calculations. Reviewers identified an inconsistency where some metrics still hardcode training status during evaluation and suggested more robust error handling when dynamically loading model architectures from configurations.

Comment on lines +479 to +482
if self.model.training:
status = optimizer_config.train_status
else:
status = optimizer_config.eval_status
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The introduction of the status variable to handle both training and evaluation metrics is a good improvement. However, the implementation appears incomplete as the accumulation of num_tokens (at line 506) still hardcodes optimizer_config.train_status.num_tokens. This inconsistency will cause evaluation tokens to be incorrectly added to training metrics when the model is in evaluation mode. Please ensure that all metric updates in this function use the status variable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants