Skip to content

Conversation

@kmurphy4
Copy link
Collaborator

@kmurphy4 kmurphy4 commented Mar 24, 2025

As discussed in https://podio.com/easyesicom/feature-roadmap/apps/postmortem/items/260, fasttext is currently using the deprecated

description-file: ...

syntax, which will eventually be upgraded to an error in setuptools.

By forking this (archived by upstream) repo and uploading our own wheel to S3, our internal pipelines can install a wheel that doesn't have this issue.

Getting this to all work together requires a few PRs into Everlaw/servers:

See also: https://podio.com/easyesicom/feature-roadmap/apps/security-impact-analysis/items/1120.

It seems like https://github.com/facebookresearch/fastText has no
`0.9.3` commit, even though the latest release on https://pypi.org/project/fasttext/
is on that version...

I manually downloaded 0.9.2 and 0.9.3 from PyPI and checked that
this was the only diff...
@kmurphy4 kmurphy4 force-pushed the feature/build-wheel branch from 6987b63 to 8e48e60 Compare March 24, 2025 20:18
In `setuptools==78.0.1`, they started enforcing that no dependencies
specify this legacy field (pypa/setuptools#4870).

They ended up reverting in pypa/setuptools#4911,
but presumably they'll turn this behavior back on in the future...
@kmurphy4 kmurphy4 force-pushed the feature/build-wheel branch from 8e48e60 to 642ac92 Compare March 30, 2025 18:48
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, this implementation is mostly copy-pasted from our existing CI for Everlaw/libpff: https://github.com/Everlaw/libpff/blob/master/.github/workflows/build_wheel.yml

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure ya'll considered this when implementing libpff, but I'm curious how we decided what to put in the servers repo vs what we put in the host repo?
For example, the - uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 line seems like it would be easier to find/replace if it were in servers.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with the python and pip versions

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, the - uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 line seems like it would be easier to find/replace if it were in servers.

This will get updated by Renovate like Everlaw/libpff#10... though I'm realizing we probably need to configure the bot to automatically assign reviewers 😆

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm realizing we probably need to configure the bot to automatically assign reviewers 😆

Everlaw/libpff#15

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the "checkout" thing needs to be in this repo, because that's how we actually perform the git checkout https://github.com/actions/checkout?tab=readme-ov-file#checkout-v4.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with the python

This one actually needs to be independent of what's in https://github.com/Everlaw/servers. If we were going to upgrade to, say, python3.13, we'd add a new item to

so that we end up pushing a cp313 wheel to S3. Then, https://github.com/Everlaw/servers would be able to download and install it.

and pip versions

These are just matching the original (pre-fork) constraints, so I don't expect them to ever change. In any case, the constraints for this library should be completely independent from the constraints for https://github.com/Everlaw/servers.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, the constraints for this library should be completely independent from the constraints for

Would we not want to change this python version along with our global python version whenever we update?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At any rate, sounds like you've got a much better understanding of this as an ecosystem than I do! Just looking out for a situation where, say, someone is updating our python version and forgets to do so here then has to get all the way through some CI build before it reports a certain artifact not being found, have to track down this repo, change it here as well etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

someone is updating our python version and forgets to do so here then has to get all the way through some CI build before it reports a certain artifact not being found, have to track down this repo, change it here as well etc.

Haha yes, that's likely how it will happen.

We just need to be able to support deploying multiple artifacts (one for each downstream Python version) at the same time so that the update PR into this repo can land independent of the "main" PR into https://github.com/Everlaw/servers.

matrix:
python-version:
- 3.12.6
runs-on: [self-hosted, fasttext]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fasttext runner is being added in https://github.com/Everlaw/servers/pull/39904.

@kmurphy4 kmurphy4 force-pushed the feature/build-wheel branch from 642ac92 to 875fb14 Compare March 30, 2025 18:58
Comment on lines +27 to +32
# These requirements were lifted from `install_requires` in `setup.py`.
uv pip install \
'pybind11>=2.2' \
'setuptools>=0.7.0' \
'numpy'
uv pip install --no-index --find-links dist fasttext
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal here is to install the fasttext wheel we just built. We use --no-index --find-links dist to ensure we don't accidentally pull anything unexpected from PyPI.

exit 0
fi

aws s3 cp --only-show-errors "$wheel" "$url"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runner has permission to do this because of https://github.com/Everlaw/servers/pull/39903.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kmurphy4 kmurphy4 force-pushed the feature/build-wheel branch from 875fb14 to de9faa0 Compare March 30, 2025 19:05
import io

__version__ = "0.9.3"
__version__ = "0.9.4.rc0"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we're satisfied with testing https://github.com/Everlaw/servers/pull/39905, we should remove the .rc0 suffix.

@kmurphy4 kmurphy4 marked this pull request as ready for review March 30, 2025 19:26
],
"timezone": "America/Los_Angeles",
"includePaths": [".github/**"]
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm not that familiar with renovate configuration -- what effect does this have? I assumed it mainly defined where to look for things to upgrade. Are there dependencies in this repo it'll manage?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If renovate will target this repo, does it need to be added here: https://github.com/Everlaw/renovate-bot-config/blob/main/.renovate/config.js ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm not that familiar with renovate configuration -- what effect does this have? I assumed it mainly defined where to look for things to upgrade. Are there dependencies in this repo it'll manage?

In this case it'll probably only update

- uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29

If renovate will target this repo, does it need to be added here: https://github.com/Everlaw/renovate-bot-config/blob/main/.renovate/config.js ?

Yep! https://github.com/Everlaw/renovate-bot-config/pull/120

Currently, we have Renovate configured to open PRs against this repo,
but nobody is actually ever reviewing them...

In the future, it would be nice to designate some other group of people
to review the "code" here, but we need *some* catch-all for now.
@jamie-everlaw
Copy link

LGTM! Sorry for the delay.

@jamie-everlaw jamie-everlaw merged commit 6cbc5c6 into main Apr 15, 2025
1 check failed
@jamie-everlaw jamie-everlaw deleted the feature/build-wheel branch April 15, 2025 22:11
@kmurphy4
Copy link
Collaborator Author

After deploying https://github.com/Everlaw/servers/pull/39903 and https://github.com/Everlaw/servers/pull/39904, we got a successful run 🎉 https://github.com/Everlaw/fastText/actions/runs/14480347793/job/40628697131

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants