Fix PreCommit YAML Xlang job hanging #37685
Conversation
Summary of Changes

Hello @aIbrahiim, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the stability and performance of Python dependency resolution within the build system. By upgrading pip, integrating the uv package manager, and introducing a constraints.txt file with explicit dependency pinning, it resolves a critical issue causing the PreCommit YAML Xlang job to hang. These changes ensure more reliable and faster package installations, particularly for complex ML-related dependencies.
Activity
```python
    ] + ml_base,
    'p312_ml_test': [
        'datatable',
        'namex==0.0.9',
```
cc: @tvalentyn, is it fine to pin versions in the "_test" extra? We usually set only an upper bound.
We can set an upper/lower bound, but agreed that it would be better to at least keep a range of valid values rather than pinning to a specific version.
@aIbrahiim do these actually need to be constrained this tightly?
@damccorm Agreed, using ranges instead of exact pins. I updated to namex>=0.0.9,<0.2.0 and optree>=0.16.0,<0.19.0 so these constrain the resolver and avoid resolution-too-deep while still allowing minor updates. If we hit resolver or compatibility issues, we can narrow further.
+1 to use ranges, and constrain the ranges to help the resolver pick the right one. We can use uv locally to find what the compatible ranges are if pip doesn't solve this.
Overall, guidance is: https://cwiki.apache.org/confluence/display/BEAM/Dependency+management+guidelines+for+Beam+Python+SDK+maintainers.
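To illustrate why bounded ranges help, here is a small self-contained sketch (not Beam code; parse and in_range are hypothetical helpers that handle only simple dotted versions) of how a window like namex>=0.0.9,<0.2.0 shrinks the candidate set a resolver has to explore:

```python
# Illustrative sketch only: how a bounded range narrows the versions
# a resolver must consider, compared to an unbounded dependency.

def parse(version: str) -> tuple[int, ...]:
    """Turn '0.1.0' into (0, 1, 0) so versions compare lexicographically."""
    return tuple(int(part) for part in version.split("."))

def in_range(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper, i.e. the '>=lower,<upper' window."""
    return parse(lower) <= parse(version) < parse(upper)

# With no bounds the resolver may backtrack through every release;
# with bounds it only considers versions inside the window.
candidates = ["0.0.8", "0.0.9", "0.1.0", "0.2.0"]
allowed = [v for v in candidates if in_range(v, "0.0.9", "0.2.0")]
print(allowed)  # ['0.0.9', '0.1.0']
```

Real resolvers handle pre-releases, epochs, and wheel availability on top of this, but the basic effect of a `>=,<` window is the same.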
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment
```groovy
def pythonSdkDir = project.project(":sdks:python").projectDir
// Python 3.13 requires numpy>=2.1.0; constraints.txt pins numpy<2 for py310-312.
def constraintsName = "3.13".equals(project.ext.pythonVersion) ? "constraints_py313.txt" : "constraints.txt"
```
Does constraints_py313.txt exist? Also, you can do this in a single file like https://github.com/aIbrahiim/beam/blob/89b4b5e0b80731943149dd7596feb93768507af3/sdks/python/setup.py#L378
Ahh yes, that's a better approach. I refactored to use the single-file approach: constraints_py313.txt has been removed and numpy is now specified in setup.py (and pyproject.toml).
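For reference, the single-file pattern linked above relies on PEP 508 environment markers so one dependency list covers every interpreter. A hypothetical sketch of what the paired entries might look like (the exact bounds here are illustrative assumptions, not necessarily the PR's final values):

```python
'numpy>=1.14.3,<2;python_version<"3.13"',
'numpy>=2.1.0,<2.5.0;python_version>="3.13"',
```

pip and uv evaluate the marker against the running interpreter, so only one of the two requirements is active in any given environment.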
Tried switching namex/optree to version ranges (>=0.0.9,<0.2.0 and >=0.16.0,<0.19.0), but that caused installGcpTest to fail in the Yaml_Xlang_Direct workflow, so I'm reverting to exact pins (namex==0.0.9, optree==0.16.0) so the ML stack installs reliably. We can revisit ranges once keras pins these dependencies.
sdks/python/setup.py
Outdated
```python
    # tensorflow-transform requires dill, but doesn't set dill as a
    # hard requirement in setup.py.
    'dill',
    # keras deps namex/optree lack version bounds - pin to avoid resolver issues
```
What are the actual issues we're running into here?
I investigated the install failures and traced them to three main causes:

1. Resolver backtracking: pip 25.1/26.0.1 was hitting its resolver limit on the tensorflow -> keras -> namex/optree chain. Since keras doesn't pin namex/optree, adding version bounds reduces excessive backtracking.
2. NumPy/pandas ABI mismatch: when uv selected NumPy 2.x but pandas expected 1.x, the tests failed with "ValueError: numpy.dtype size changed". Constraining NumPy to 1.x for Python 3.10-3.12 fixed this.
3. Python 3.13 support: NumPy 1.x doesn't support Python 3.13, so we require numpy>=2.1.0 for 3.13 via python_version markers.

I also switched to uv, since its resolver handles this dependency tree more reliably, and the namex/optree version ranges (instead of exact pins) provide enough structure without triggering deep backtracking.
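The python_version-marker behavior described above can be sketched with a tiny illustrative helper (select_numpy is hypothetical, not Beam's actual setup.py logic; the bounds mirror the ones discussed in this thread):

```python
# Illustrative sketch of the marker logic: which numpy requirement line
# would be active on a given Python version. Not the actual Beam code.

def select_numpy(major: int, minor: int) -> str:
    """Return the numpy requirement that applies on Python major.minor."""
    if (major, minor) >= (3, 13):
        # numpy 1.x ships no support for Python 3.13, so require 2.1+.
        return 'numpy>=2.1.0,<2.5.0'
    # Stay on 1.x below 3.13 for pandas ABI compatibility.
    return 'numpy>=1.14.3,<2'

print(select_numpy(3, 12))  # numpy>=1.14.3,<2
print(select_numpy(3, 13))  # numpy>=2.1.0,<2.5.0
```

In real requirement lists this branching is expressed declaratively with markers like `; python_version >= "3.13"` rather than in code.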
```groovy
"pip install --pre --retries 10 --upgrade tox --no-cache-dir"
"pip install --pre --retries 10 --upgrade pip==26.0.1 --no-cache-dir && " +
"pip install --pre --retries 10 --upgrade tox --no-cache-dir && " +
"pip install uv"
```
I am wary of using uv in our tests, unless it were so common that our users were doing the same, and docs instructed users to do so.
```groovy
// Use uv instead of pip - pip was hitting resolution-too-deep on tensorflow->keras->namex/optree.
// Include namex/optree as explicit deps to constrain resolution.
// --prerelease allow: envoy-data-plane depends on betterproto==2.0.0b6 (beta).
def anchorPkgs = "namex==0.0.9 optree==0.16.0"
```
I think these deps are specific to a certain test suite, and thus configuring them is a matter for the test suite that needs them. BeamModulePlugin should concern itself with global configuration matters for the project; adding these deps doesn't feel like the right direction to me.
```toml
"distlib==0.3.9",
# Numpy headers
"numpy>=1.14.3,<2.5.0",  # Update setup.py as well.
# Numpy headers. py313 requires 2.1+; py<3.13 use 1.x for pandas ABI compat.
```
I don't follow why we need to cap numpy to <2 for Python 3.10-3.12. <2.5.0 seems like a wide enough bound, and I'd expect each Python version to pick the right version that is compatible.
```python
# Use a strict upper bound.
'numpy>=1.14.3,<2.5.0',  # Update pyproject.toml as well.
# numpy: py310-312 use 1.x; py313 needs 2.x (1.x unsupported; avoids # pylint: disable=line-too-long
# pandas ABI mismatch).
```
Are there some details on this mismatch?
```python
# hard requirement in setup.py.
'dill',
# namex/optree: pin to avoid resolver issues (lack version bounds) # pylint: disable=line-too-long
'namex==0.0.9',
```
Not an opinion, just wondering: should this be defined in ml_base instead? Will
Assigning reviewers: R: @shunping for label python.

Note: If you would like to opt out of this review, comment. Available commands:

The PR bot will only process comments in the main thread (not review comments).
Fixes: #34798
Successful run: https://github.com/aIbrahiim/beam/actions/runs/22308583106
Related to #32603
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

- Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
- Update CHANGES.md with noteworthy changes.

See the Contributor Guide for more tips on how to make the review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.