
Fix updateinfo v2 timeout for RL8 AppStream by filtering prefetch #78

Open
rockythorn wants to merge 1 commit into resf:main from rockythorn:fix/updateinfo-appstream-timeout

Conversation


rockythorn (Collaborator) commented Mar 18, 2026

The get_updateinfo_v2 endpoint was loading all advisory_packages rows for every matching advisory (all repos, arches, mirrors), then discarding ~95% of them in Python. For RL8 AppStream without a minor_version filter, this meant ~300k rows loaded to produce ~20k, consistently exceeding the 30-second server worker timeout.

Changes

  • Use Prefetch with a filtered queryset on advisory__packages so only packages matching the requested repo and supported_product are loaded from the DB
  • Add composite index on advisory_packages(advisory_id, repo_name, supported_product_id) to support the filtered prefetch query
  • Add composite index on advisory_affected_products(supported_product_id, major_version, arch) to speed up the initial filter query
  • Add migration 20260318000000_add_updateinfo_perf_indexes.sql to apply the new indexes to production
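The prefetch change follows the standard Django pattern of passing a filtered queryset to `Prefetch`, so the restriction is applied in SQL instead of in Python after the rows have already been loaded. A minimal sketch of the pattern (the model and field names here are illustrative, inferred from the table names in this PR rather than taken from the actual codebase):

```python
from django.db.models import Prefetch

def advisories_for_repo(repo_name: str, supported_product_id: int):
    # Before: prefetch_related("packages") loaded every advisory_packages
    # row for each matching advisory, and the view discarded ~95% in Python.
    # After: the filter runs in the database, so only matching rows are
    # fetched and hydrated into Python objects.
    filtered_packages = AdvisoryPackage.objects.filter(
        repo_name=repo_name,
        supported_product_id=supported_product_id,
    )
    return Advisory.objects.prefetch_related(
        Prefetch("packages", queryset=filtered_packages)
    )
```

Accessing `advisory.packages.all()` on the results then iterates only the prefiltered rows, with no extra queries.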

Closes #77

Testing

Confirmed the endpoint was timing out on production before this fix:

| Endpoint | Status | Time |
| --- | --- | --- |
| RL8 AppStream x86_64 (production, pre-fix) | ❌ 500 | 28.9s |
| RL8 AppStream aarch64 (production, pre-fix) | ❌ 500 | 29.4s |
| RL8 BaseOS x86_64 (production, pre-fix) | ✅ 200 | 8.4s |

Local benchmarks were run against a DB restored from the April 2025 production dump (6,905 advisories, 384,990 packages) across three scenarios:

| Scenario | RL8 AppStream x86_64 | RL8 AppStream aarch64 | RL8 BaseOS x86_64 |
| --- | --- | --- | --- |
| main, no indexes (current production state) | ✅ 200 @ 8.3s | ✅ 200 @ 8.3s | ✅ 200 @ 1.6s |
| main, with indexes only | ✅ 200 @ 8.2s | ✅ 200 @ 8.2s | ✅ 200 @ 1.4s |
| fix branch, Prefetch + indexes | ✅ 200 @ 8.1s | ✅ 200 @ 8.1s | ✅ 200 @ 1.4s |

Local benchmarks did not reproduce the same timing differences seen in production. We suspect this is primarily due to local hardware being significantly faster than the Kubernetes cluster production runs on — the local machine's NVMe SSD likely masks the I/O cost that would be more pronounced on production storage under concurrent load.

EXPLAIN ANALYZE gives a more hardware-independent view, showing the new indexes reduce buffer reads by 12x:

| | Buffers read | Buffers from cache |
| --- | --- | --- |
| Without indexes | 5,604 | 11,529 |
| With indexes | 468 | 3,768 |

On production storage where data is less likely to be fully cached in memory, that 12x I/O reduction should translate directly into wall-clock time savings.
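The mechanism behind that reduction can be shown with a small standalone sketch. This uses sqlite3 rather than the production Postgres, and a simplified schema borrowing only the column names from this PR: without the composite index, the filter has to scan every `advisory_packages` row; with it, the lookup becomes an index search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE advisory_packages (
        advisory_id INTEGER,
        repo_name TEXT,
        supported_product_id INTEGER,
        nevra TEXT
    )
""")
# One advisory with packages across several repos/products;
# only the AppStream/product-1 row should be loaded.
conn.executemany(
    "INSERT INTO advisory_packages VALUES (?, ?, ?, ?)",
    [
        (1, "AppStream", 1, "httpd-2.4.37"),
        (1, "BaseOS", 1, "kernel-4.18.0"),
        (1, "AppStream", 2, "httpd-2.4.37.aarch64"),
    ],
)

# Composite index mirroring the one added by the migration.
conn.execute("""
    CREATE INDEX advisory_packages_advisory_repo_product_idx
    ON advisory_packages (advisory_id, repo_name, supported_product_id)
""")

query = """
    SELECT nevra FROM advisory_packages
    WHERE advisory_id = ? AND repo_name = ? AND supported_product_id = ?
"""
matches = conn.execute(query, (1, "AppStream", 1)).fetchall()
print(matches)  # [('httpd-2.4.37',)]

# The planner satisfies the WHERE clause from the index instead of a
# full table scan; the plan detail reports "USING INDEX".
plan = conn.execute("EXPLAIN QUERY PLAN " + query, (1, "AppStream", 1)).fetchall()
print(plan[0][-1])
```

The same principle applies in Postgres, where `EXPLAIN (ANALYZE, BUFFERS)` additionally reports the buffer reads shown in the table above.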
