Skip to content

Merge 20251016#2

Merged
frankmolenaar1986 merged 81 commits intomainfrom
merge/20251016
Oct 16, 2025
Merged

Merge 20251016#2
frankmolenaar1986 merged 81 commits intomainfrom
merge/20251016

Conversation

@pgaetani
Copy link
Member

No description provided.

quis and others added 30 commits January 24, 2025 12:30
 ## 94.0.1

* Add `ruff.toml` to `MANIFEST.in`

 ## 94.0.0

* `version_tools.copy_config` will now copy `ruff.toml` instead of `pyproject.toml`. Apps should maintain their own `pyproject.toml`, if required
* Replaces `black` formatter with `ruff format`
* Upgrades `ruff` to version 0.9.2

 ## 93.2.1

* Replaces symlink at `./pyproject.toml` with a duplicate file

 ## 93.2.0

* logging: add verbose eventlet context to app.request logs if request_time is over a threshold

 ## 93.1.0

* Introduce `NOTIFY_LOG_LEVEL_HANDLERS` config variable for separate control of handler log level

 ## 93.0.0

* BREAKING CHANGE: logging: all contents of `logging/__init__.py` have been moved to `logging/flask.py` because they all assume a flask(-like) environment and this way we don't implicitly import all of flask etc. every time anything under `logging` is imported.
* Adds `request_cpu_time` to `app.request` logs when available.
* Adds ability to track eventlet's switching of greenlets and annotate `app.request` logs with useful metrics when the flask config parameter `NOTIFY_EVENTLET_STATS` is set and the newly provided function `account_greenlet_times` is installed as a greenlet trace hook.

***

Complete changes: alphagov/notifications-utils@92.1.1...94.0.1
All our apps are continously deployed now, so we don’t need to
specifically call it out when making PRs against this app.
Copies https://github.com/alphagov/notifications-api/pull/4376/files

Also consolidates environment variables into a single block
 ## 95.1.1

* Add `RUF100` rule to linter config (checks for inapplicable uses of `# noqa`)

 ## 95.1.0

* Adds log message and statsd metric for retried celery tasks

 ## 95.0.0

* Reverts 92.0.0 to restore new validation code

***

Complete changes: alphagov/notifications-utils@94.0.1...95.1.1
Fixes GHSA-g92j-qhmh-64v2

Not something that likely affects us, but clears down another Dependabot
warning.

This fix was released in sentry-sdk==2.8.0, then also backported to
sentry-sdk==1.45.1.

I think we could probably upgrade to Sentry >= 2 without making any
changes, but more testing would be needed to validate this. So just
going with the backport for now.

Specifying an exact version because that’s our convention.
This makes sure that:
- all the imports in `gunicorn_config.py` exist
- it’s exporting the variables we expect it to
 ## 97.0.3

* Bump minimum jinja2 version to 3.1.6

 ## 97.0.2

* Fix external link redirect

 ## 97.0.1

* Fix QR code URL with '&' encoded as '&'

 ## 97.0.0

* for bilingual letters preview, only show barcodes on the first page (patch).
* rename `include_notify_tag` argument to `includes_first_page` (major).

 ## 96.0.0

* BREAKING CHANGE: the `gunicorn_defaults` module has been moved to `gunicorn.defaults` to make space for gunicorn-related utils that don't have the restricted-import constraints of the gunicorn defaults. Imports of `notifications_utils.gunicorn_defaults` should be changed to `notifications_utils.gunicorn.defaults`.
* Added `ContextRecyclingEventletWorker` custom gunicorn worker class.

 ## 95.2.0

* Implement `InsensitiveSet.__contains__`

 ## 95.1.2

* Fix to the number of arguments the `on_retry` of the `NotifyTask` class takes.

***

Complete changes: alphagov/notifications-utils@95.1.1...97.0.3
don't squash all store errors to 400 status code - if the intention
was to prevent information leak, it failed to do this because it
included the error string in the response. do this by making
DocumentStoreError a superclass of other store exception types so
exception handlers can continue to listen for the general
DocumentStoreError.

return 410 for keys that have a delete marker, 404 for documents
with no such key, 403 for documents that are blocked.

notably also catch NoSuchKey from check_for_blocked_document's
get_object_tagging calls and turn this into a DocumentNotFound
error, because in practise this will be the first problem a
request will encounter when trying to access a missing document.
…ation gracefully

instead of causing a crash this should simply return the metadata
with the expiration missing, allowing the client to make up its
own mind what to do. in most normal cases this is only used for
informational display, and in case of a maintenance issue or
misconfiguration it's better to have this information missing
rather than being unable to serve any documents to anyone (a
certain P1)
… app logic and Expiration metadata

this check was previously done in document-download-frontend, but
it really belongs in the document-download-api if anywhere, if
only because the api's download endpoints are publicly accessible
and a document denied access by just the frontend could always
just be downloaded via the api.

note this check still relies on the s3 bucket's lifecycle policies
being in place and implicitly the s3 object's last modified
timestamp. this will only "rescue" us in cases where a delete
marker has not yet been added by the lifecycle rile for some reason.
The version of Gunicorn we are using is more than 18 months out of
date[1] and has a high severity security vulnerabilites[2].

We have not updated the version on the API (and therefore the minimum
version in utils[3]) because last time we tried (while still on PaaS) it
had some performance issues, documented here[4]:

> We originally pinned this due to eventlet v0.33 compatibility issues.
> That was supposedly fixed in version v21.0.0 and we merged v21.2.0 for
> a while. Until we ran a load test again, and identified that the
> bumped version of gunicorn led to a 33%+ drop-off in
> performance/requests per second that the API was able to handle. If a
> version greater than 21.2.0 is released, and it either gives us
> something we need or we think it addresses said performance issues,
> make sure to run a load test in staging before releasing to
> production.

But document download does not serve anywhere near the same number of
requests per second as the API, so we have already upgraded to version
21.2.0.

This pull request just updates from 21.2.0 to 23.0.0 (the latest
version), which resolves the security vulnerability.

***

1. https://github.com/benoitc/gunicorn/tree/21.2.0
2. GHSA-w3h3-4rj7-4ph4
3. https://github.com/alphagov/notifications-utils/blob/main/setup.py#L31
4. https://github.com/alphagov/notifications-api/blob/aff08653d951d6f60dec8d701ae7cf1681b78a27/requirements.in#L10
Bump gunicorn to the latest version
 ## 99.5.1

* Bump minimum Flask version to 3.1.1

 ## 99.5.0

* Update economy letter transit date to 5 days

 ## 99.4.1

* Fix celery beat logging so that it will log in json

 ## 99.4.0

* Add celery logging configuration for json logging

 ## 99.3.4

* `celery.NotifyTask`: don't emit early log if called synchronously

 ## 99.3.3

* Fix bug with non-SMS templates considering international SMS limits

 ## 99.3.2

* Stops counting Crown Dependency phone numbers in CSV file as international

 ## 99.3.1

* Bump `python-json-logger` to version 3

 ## 99.3.0

* Adds `RecipientCSV.international_sms_count`
* Adds `RecipientCSV.more_international_sms_than_can_send` (initialise per `RecipientCSV(remaining_international_sms_messages=123)` to enable this check)

 ## 99.2.0

* Add s3 multipart upload lifecycle (create, upload, complete, abort)

 ## 99.1.2

* Normalise unusual dashes and spacing characters in phone numbers

 ## 99.1.1

* Bump ruff to latest version

 ## 99.1.0

* Adds support for economy mail in get_min_and_max_days_in_transit, with delivery estimated between 2 and 6 days

 ## 99.0.0

* NOTABLE CHANGE: Celery tasks emit an "early" log message when starting, with a customisable log level to allow this to be disabled for high-volume tasks. When upgrading past this version you may want to pass the extra argument `early_log_level=logging.DEBUG` to the task decorator of your most heavily-called tasks to reduce the effect on log volume
* Annotates automatic celery task log messages with `celery_task_id`

 ## 98.0.1

* Moves from `setup.py` to `pyproject.toml` for package configuation
* Changes ruff config from `exclude` to `extend-exclude`

 ## 98.0.0

* Update rate multipliers for international SMS to have values for 1 April 2025 onwards. When used in notifications-api and
  notifications-admin this change will affect the billing of text messages the pricing shown.

 ## 97.0.5

* Fix mailto: URLs in Markdown links

 ## 97.0.4

* Fix inset text block styling

***

Complete changes: alphagov/notifications-utils@97.0.3...99.5.1
to an extent this is a "backup" of the creation-time information
we usually refer to s3's native timestamp for in case some
maintenance-related reason causes us to lose it, but we will
probably add logic to the app further down the line to use
the data from this tag for additional access enforcement.

timestamp will probably not match s3's native timestamp all
the time.
DocumentStore: add `created-at` tag when uploading documents
The versions we are using do not install on newer versions of Python
because of the way they are packaged
including minimal test changes intending to "prove" that everything
still works more-or-less the same without tagging present
quis and others added 23 commits September 29, 2025 10:42
These can control the mime type so we shouldn’t cache under the same key
if they change.
It’s what we do elsewhere
We want to re-check files for viruses after 24 hours, in case the
definitions have changed.
We were a bit light on test coverage here
Either the format is `{"success": …}` or {`error`: …}.

`{"success": {"virus_free": False}}` is ambiguous so it shouldn’t be a
structure we allow to be possible.
This is a bit more robust than just relying on execution order.
This prevents timing attacks, where one service could see if any other
service has uploaded a document by seeing if the checks for that
document run more quickly (because they have been cached).
This is more robust because it shows that the difference in. behaviour
is entirely dependent on file size
The file is always marked as virus-free when antivirus is disabled.

This makes local development easier.

This behaviour was accidentally switched in the process of refactoring,
because
- the original code was hard to read
- there were no tests
Before, virus free could theoretically:
- return `True`
- return `False`
- raise

In practice the way it was structured meant it could never return False.

This means calling code had to account for the `raise`/`False`
possibilities, even though the latter would never happen.

Changing it to a method makes it clearer that it’s doing work. It also
means it less unexpected when we change it to either:
- return `None` (success)
- raise (failed)
This is a better separation of the control flow because it groups the
exception handler with the calls which can raise the exception.
Add a file check endpoint and cache results
Only the file extension can change what MIME type is detected for the
file. The rest of the filename doesn’t matter.

This probably won’t result in a speedup (the dict lookup from extention
to MIME type is cheap) but it could result in slightly fewer cache keys
being used.
The `is_csv` argument is ignored if a filename is provided. So we can
reduce the space of possible hash keys by not including it
Co-authored-by: Mervi Tyczynska <CrystalPea@users.noreply.github.com>
Co-authored-by: Mervi Tyczynska <CrystalPea@users.noreply.github.com>
…ilename

Optimise hash key for MIME type check
@pgaetani pgaetani requested a review from a team October 16, 2025 09:27
@pgaetani pgaetani self-assigned this Oct 16, 2025
@pgaetani pgaetani added enhancement New feature or request upstream merge Changes from alphagov labels Oct 16, 2025
@frankmolenaar1986 frankmolenaar1986 merged commit 3328457 into main Oct 16, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request upstream merge Changes from alphagov

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants