Changing `hybrid_override_pattern` to `hybrid_layer_pattern` to mirror mcore change #2628
Base branch: main
```diff
@@ -624,7 +624,7 @@ def training_log(
             track_names.append("z_loss")

         if config.model.is_hybrid_model:
-            layers = config.model.hybrid_override_pattern.count("E")
+            layers = config.model.hybrid_layer_pattern.count("E")
```
Contributor commented:

Guard hybrid pattern access during the deprecation window. Line 627 assumes the new `hybrid_layer_pattern` attribute is always present on hybrid model configs; a config that still sets only the legacy `hybrid_override_pattern` will fail here with an `AttributeError`.

Proposed fix:

```diff
-    if config.model.is_hybrid_model:
-        layers = config.model.hybrid_layer_pattern.count("E")
+    if config.model.is_hybrid_model:
+        hybrid_pattern = getattr(config.model, "hybrid_layer_pattern", None) or getattr(
+            config.model, "hybrid_override_pattern", None
+        )
+        layers = hybrid_pattern.count("E") if hybrid_pattern else config.model.num_layers
     else:
         layers = config.model.num_layers
```
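To make the suggested fallback concrete, here is a minimal, self-contained sketch of the same `getattr` guard, using a `SimpleNamespace` stand-in for the project's real config object (the function name `count_attention_layers` is made up for illustration):

```python
# Minimal sketch of the reviewer's fallback idea, using SimpleNamespace
# as a stand-in for the real model config object.
from types import SimpleNamespace


def count_attention_layers(model) -> int:
    """Count "E" (attention) entries in the hybrid layer pattern."""
    if model.is_hybrid_model:
        # Prefer the new field name; fall back to the deprecated one,
        # so configs written against either name keep working.
        pattern = getattr(model, "hybrid_layer_pattern", None) or getattr(
            model, "hybrid_override_pattern", None
        )
        return pattern.count("E") if pattern else model.num_layers
    return model.num_layers


# New-style config: only hybrid_layer_pattern is set.
new_cfg = SimpleNamespace(is_hybrid_model=True, hybrid_layer_pattern="M*E*M*E*", num_layers=8)
# Legacy config: only the deprecated hybrid_override_pattern is set.
old_cfg = SimpleNamespace(is_hybrid_model=True, hybrid_override_pattern="M*E*M*E*", num_layers=8)

print(count_attention_layers(new_cfg))  # 2
print(count_attention_layers(old_cfg))  # 2
```

Both configs resolve to the same count, which is the point of the guard: the rename stops being an attribute-existence hazard during the deprecation window.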
```diff
         else:
             layers = config.model.num_layers
```
Contributor commented:

Deprecation is currently a hard break for provider constructor kwargs. These renames remove acceptance of legacy kwargs at dataclass init time: any caller still passing `hybrid_override_pattern` (or the other deprecated hybrid knobs mentioned in the PR) will now fail with a `TypeError` instead of getting a deprecation warning.

Proposed compatibility bridge for deprecated config fields:

```diff
 @dataclass
 class NemotronHModelProvider(MambaModelProvider):
     """Configuration for Nemotron-H models."""
+    hybrid_layer_pattern: str | None = None
+    hybrid_override_pattern: str | None = None
+    hybrid_attention_ratio: float | None = None
+    hybrid_mlp_ratio: float | None = None
+
+    def __post_init__(self) -> None:
+        if self.hybrid_override_pattern is not None:
+            warnings.warn(
+                "hybrid_override_pattern is deprecated; use hybrid_layer_pattern.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+            if (
+                self.hybrid_layer_pattern is not None
+                and self.hybrid_layer_pattern != self.hybrid_override_pattern
+            ):
+                raise ValueError(
+                    "Both hybrid_layer_pattern and hybrid_override_pattern were provided with different values."
+                )
+            self.hybrid_layer_pattern = self.hybrid_override_pattern
+
+        if self.hybrid_attention_ratio is not None or self.hybrid_mlp_ratio is not None:
+            warnings.warn(
+                "hybrid_attention_ratio and hybrid_mlp_ratio are deprecated; use hybrid_layer_pattern.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+
+        super().__post_init__()
```

Also applies to lines: 78, 91, 106, 124, 140, 158