Include current project when subsetting --venv-repository with --requirements
#2979
Hello @jsirois! I have a somewhat complicated use case. I work in a data engineering team. Among other things, we write internal tooling for data scientists. I use PEX to package Python environments for use on a Spark cluster, and wrote a CLI tool around it.

Data scientists can use uv or poetry to manage their projects, so resolution is handled by those tools; I use

In order to package only the dependencies useful in a Spark session, I use poetry/uv to generate a lockfile, including the dependency groups specified by the users through the CLI. I then use

Now my issue: users can request that the current project be included in the PEX. As I subset the venv through the lockfile, I need to explicitly tell PEX that I need the project.
pex: Building pex :: Adding distributions built from local projects and collecting their requirements: /home/[...]/work/project :: Resolving requirements. :: Resolving for:
pex: Hashing pex: 104.5ms
pex: Isolating pex: 0.1ms
pex: Building pex :: Adding distributions built from local projects and collecting their requirements: /home/[...]/work/project :: Resolving requirements. :: Building distributions for:
BuildRequest(download_target=DownloadTarget(target=LocalInterpreter('/home/[...]/work/project/.venv/bin/python'), universal_target=None), source_path='/home/[...]/work/project', fin
pex: Building /home/[...]/work/project to /tmp/tmpzcwmw226/build
pex: Building pex :: Adding distributions built from local projects and collecting their requirements: /home/[...]/work/project :: Resolving requirements. :: Calculating project names for direct requirements:
pex: Building pex :: Resolving distributions for requirements: anywidget<0.10.0,>=0.9.14 corr_module>=1.0.0 marketdata<3.0.0,>=2.0.0 matplotlib<4.0.0,>=3.7.0 numpy<3.0.0,>=2.2.3 openpyxl<4.0.0,>=3.1.5 plotly<6.0.0,>=5.24.1 polars<2.0.0,>=1.32.3 pyyaml<7.0.0,>=6.0.2 scikit-learn<2.0.0,>=1.6.1 scipy<2.0.0,>=1.15.0 statsmodels<0.15.0,>=0.14.4 tsm<3.0.0,>=2.0.11 utils-internal==1.0.11 /home/[...]/work/project/archive-requirements.txt
e/[...]/work/project/.venv. :: Using 4 parallel jobs to process 178 items
Exception in thread Thread-3 (_handle_results):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 579, in _handle_results
    task = get()
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
  File "/home/[...]/work/project/.venv/lib/python3.10/site-packages/pex/sorted_tuple.py", line 63, in __new__
    sorted(iterable, key=key, reverse=reverse),  # type: ignore[arg-type, type-var]
  File "/home/[...]/work/project/.venv/lib/python3.10/site-packages/pex/vendor/_vendored/attrs/attr/_make.py", line 1848, in __lt__
    return attrs_to_tuple(self) < attrs_to_tuple(other)
TypeError: '<' not supported between instances of 'SpecifierSet' and 'SpecifierSet'

I'll open a proper issue for this if I manage to reproduce it in a bare-bones example.

Finally, I tried adding the project name to the lockfile directly after its generation, which seems to work. But I ran into issues with package names that use hyphens. Should I normalize the name to what's used for wheels (i.e., what is present inside the venv)?

I would gladly take any advice or comments regarding this setup. Thanks for working on PEX, it's great!
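On the hyphen question, here is a minimal sketch of the two name forms involved, assuming the standard Python packaging rules apply: the PEP 503 normalized name (lowercase, runs of `-`, `_` and `.` collapsed to `-`) and the form used in spec-compliant wheel filenames and `.dist-info` directories (the same, with `-` replaced by `_`). The function names are hypothetical helpers for illustration, and the thread does not confirm which form Pex expects when matching names against a venv, so treat this as a way to compare names, not as Pex's own logic.

```python
import re


def normalize_project_name(name: str) -> str:
    """PEP 503 normalization: lowercase, runs of '-', '_' and '.' become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def wheel_filename_name(name: str) -> str:
    """Name component as written in a spec-compliant wheel filename ('_' instead of '-')."""
    return normalize_project_name(name).replace("-", "_")


def same_project(a: str, b: str) -> bool:
    """Two spellings refer to the same project if their normalized forms match."""
    return normalize_project_name(a) == normalize_project_name(b)
```

For example, `utils-internal` and `utils_internal` compare equal under `same_project`, so either spelling in a requirements file refers to the same distribution. Wheels built by older tools may not follow the lowercase rule, so comparing normalized forms is safer than comparing raw strings.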
Replies: 1 comment 25 replies
Thanks @matthieucx, that's probably enough detail for me to experiment this afternoon and either suggest a workaround or introduce some Pex fixes to handle this case, which seems common enough to me.
Alright @matthieucx, with #2984 released as Pex 2.67.2 later this evening, Pex will be able to resolve editables from venvs. To do so, it will effectively convert the editable into a --project ... argument.

Since we're in different time zones (I assume), can you gather the following information?:
pex -v --venv-repository <venv dir> <project name> -o <project name>.pex. That should now be all you need to PEX up a project from its venv. The -v output will include high-level timings and separate the 3 major bits: 1. project build time, 2. venv resolve time, 3. PEX zipping time.
uv build or poetry ... (I…
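For a CLI wrapper like the one described in the original post, the first command above could be assembled roughly as follows. This is a sketch only: `build_pex_command` is a hypothetical helper, it uses just the flags quoted in the reply (`-v`, `--venv-repository`, `-o`), and it assumes a Pex release that can resolve the editable project from the venv (2.67.2 or later, per the reply).

```python
from pathlib import Path
from typing import List


def build_pex_command(venv_dir: Path, project_name: str) -> List[str]:
    """Build the argv for: pex -v --venv-repository <venv dir> <project name> -o <project name>.pex"""
    return [
        "pex",
        "-v",  # verbose output, which includes the high-level timings
        "--venv-repository",
        str(venv_dir),
        project_name,  # the project itself, resolved from the venv
        "-o",
        f"{project_name}.pex",
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; keeping it as a list avoids shell quoting problems with paths that contain spaces.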