-
Notifications
You must be signed in to change notification settings - Fork 2
Create mlflow defaults #164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Review Summary
Found several issues that need to be addressed:
Critical Issues
- Bare except Exception clauses that may hide bugs (lines 150, 156, 226)
- Mutation of input ListConfig without copying first (line 331)
- Magic string comparisons for default values
Code Quality Issues
- Missing type annotations
- Overly broad exception handling
Positive Aspects
- Comprehensive test coverage (807 unit tests + end-to-end tests)
- Well-structured implementation following Hydra patterns
- Good documentation in LOAD_SUBCONFIGS.md
Please address the inline comments below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
This PR is being reviewed by Cursor Bugbot
Details
You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
|
@claude I addressed your comments, the two I didn't implement are out of scope for this PR |
|
Claude finished @ealt's task in 54s —— View job Review of Addressed Comments
SummaryI've reviewed the current code against the previous review comments. Here's the status: ✅ Addressed Comments (from claude's review)
🔍 Comments Marked as Out of ScopeBased on the review, I believe the two "out of scope" comments are likely:
🤖 Cursor Bugbot IssuesThe 3 issues flagged by Cursor Bugbot appear to be:
|
Loading Subconfigs from MLflow runs
Goal:
The goal is to be able to compose a config taking parts/entire configs that have been logged as MLflow artifacts from previous runs
Design:
The proposed design is patterned after the Hydra defaults list:
TARGET: OmegaConf dot path within current config pointing to the MLFlowConfig nodePACKAGE: OmegaConf dot path where to place merged content (default: ".", i.e., root)ARTIFACT_PATH: path within the MLflow run’s artifact dir to the source YAML without system extension (default: "config")./, it is treated as anARTIFACT_PATH./, it is ambiguous and defaults toSELECT_PATH. To force it to be anARTIFACT_PATH, append#(e.g.,my_artifact#).SELECT_PATH: OmegaConf dot path within the source YAML to the subconfig to import (default: root)._self_,optional, andoverridefunction the same as their Hydra equivalents._self_determines the composition order. If missing, it is always appended to the end. If_self_is explicitly included, it is processed in the specified order._self_or an MLflow entry._self_is first: MLflow entries merged after_self_will override the original config._self_is last: Original config merged after MLflow entries will override MLflow content._self_, explicitly include_self_first, then useoverrideat root.optionalsuppresses errors if the artifact or selected subconfig is missing.overridecauses the loaded config to completely replace the value at the package path instead of merging. When override is used, any existing content at that path is completely replaced with the loaded content (no deep merge).Merge Semantics for Overlapping Keys
When multiple entries in
mlflow_defaults(or_self_) have overlapping keys, the merge behavior follows Hydra's deep merge semantics:Conflicting keys (same key path): The last entry wins. The value from the later entry completely replaces the value from earlier entries.
Non-conflicting keys at the same level: Both keys are preserved. Nested dictionaries are merged deeply.
Partially overlapping nested structures: Deep merge preserves non-conflicting nested keys.
This matches Hydra's defaults list behavior: entries are merged in order using
OmegaConf.merge(), which performs deep merging of nested dictionaries.OPTION Parsing Logic
To resolve ambiguity in
OPTION:OPTIONcontains#:ARTIFACT_PARTandSELECT_PART.ARTIFACT_PATH=ARTIFACT_PART(if empty, defaults to "config").SELECT_PATH=SELECT_PART(if empty, defaults to root).#):OPTIONcontains/:ARTIFACT_PATH=OPTION.SELECT_PATH= root.ARTIFACT_PATH= "config".SELECT_PATH=OPTION.Examples:
load entire
config.yamland merge at rootload
configs/model.yamland merge as old_modelload
persistencesubconfig fromconfig.yamlif it existsoverride
train.optimizerwith subconfig fromtrain.yamlNote: With
override, the entiretrain.optimizersection is replaced with the content from the artifact. Any existing keys intrain.optimizerthat are not in the artifact will be removed.load
modelandgenerative_processfrom previous runload models from multiple runs
Load a top-level artifact
custom.yamlat rootMerge behavior with overlapping keys
Deep merge semantics apply consistently everywhere, matching Hydra's defaults list behavior:
Loading at root (no package specified): Deep merge - multiple entries merge together, preserving non-conflicting keys.
Loading at a specific package path (e.g.,
@other_section): Deep merge - content merges with any existing content at that package path, preserving non-conflicting keys.Multiple entries at the same package path: Deep merge preserves non-conflicting keys:
Example of multiple entries with overlapping keys:
If
config1.yamlcontains:And
config2.yamlcontains:The final result will be:
If both configs had
other_section.foo, the value fromconfig2would win (last entry wins for conflicting keys).Note: The
overrideflag changes merge behavior: when specified, the loaded config completely replaces the value at the package path instead of merging. This means all existing keys at that path are removed and replaced with the loaded content.Override flag behavior with
_self_When
overrideis used, the order of entries still matters. If_self_is explicitly included in the list, it will be processed in the order specified:In this example:
config1is merged intoother_section(preserving existing keys, merging nested dicts)config2completely replacesother_section(removing all keys from step 1)_self_merges the original config, so any keys in the originalother_sectionwill override/replace what was loaded fromconfig2Important:
_self_is always auto-appended if not explicitly included, regardless of override usage. If you want to override everything from_self_, explicitly include_self_first, then useoverrideat root:Implementation:
The proposed implementation is as a stand alone function that dynamically resolves the config:
This could then be included in the
managed_rundecoratorOr used as by itself, such as in a notebook:
Testing
Test Cases
Entries
ENTRYENTRYitems for different runsENTRYitems for the same runENTRYitems with shared keys but differing valuesENTRYin the list with that key_self_omitted_self_before otherENTRYitem(s)Packages
PACKAGEomittedPACKAGEpathArtifact Paths
ARTIFACT_PATHomittedconfig.yamlloaded by defaultARTIFACT_PATHSelect Paths
SELECT_PATHomittedTARGETconfig loaded by defaultSELECT_PATHSELECT_PATHOptions
optionalomittedOPTIONisnullOPTIONoptionalspecifiednullvalue forOPTIONeffectively skipping thatENTRYOPTIONnormallyENTRYif there is an issue loading givenOPTIONoverrideomittedoverridespecifiedUnit Tests
MlflowClient.download_artifactsshould be mocked with the side effect of saving a yaml file with content needed by the test in a temp directory to avoid this dependency not under testIntegration Tests
sqlite://in a temp directory as thetracking_uri)