Skip to content

feat(preprod): Use sequential (n-1) comparisons for size analysis diff monitors#111482

Merged
mtopo27 merged 5 commits intomasterfrom
mtopo27/sequential-diff-monitors
Mar 30, 2026
Merged

feat(preprod): Use sequential (n-1) comparisons for size analysis diff monitors#111482
mtopo27 merged 5 commits intomasterfrom
mtopo27/sequential-diff-monitors

Conversation

@mtopo27
Copy link
Copy Markdown
Contributor

@mtopo27 mtopo27 commented Mar 24, 2026

Change size analysis diff monitors to use sequential (n-1) comparisons instead of git-based (merge-base) comparisons. When a new build arrives, each diff-threshold monitor (absolute_diff, relative_diff) now resolves its base by finding the most recent prior artifact matching the detector's query filters and structural identity (app_id, artifact_type, build_configuration), ordered by date_added.

Previously, monitors were evaluated inside the git-based comparison path (_run_size_analysis_comparisonmaybe_emit_issues), which meant:

  • Artifacts without git metadata (no commit_comparison) never triggered monitors
  • All diff detectors shared a single git-derived base, even if they had different query filters
  • Monitor evaluation was coupled to the status check comparison flow

Now the two paths are independent:

  • Git path — persists PreprodArtifactSizeComparison records and triggers status checks. No longer evaluates monitors.
  • Sequential pathmaybe_emit_issues_from_diff_size_results runs unconditionally after the git block (even without git metadata). Each detector resolves its own base via get_sequential_base_artifact, with batch optimization (one DB lookup per unique query string across detectors).

absolute threshold monitors are unaffected — they're already handled separately by maybe_emit_issues_from_absolute_size_results.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Mar 24, 2026
Comment thread src/sentry/preprod/size_analysis/tasks.py Outdated
@mtopo27 mtopo27 force-pushed the mtopo27/sequential-diff-monitors branch from 2b05752 to 32989ef Compare March 25, 2026 00:10
@mtopo27 mtopo27 changed the base branch from master to mtopo27/refactor-commit-comparison-conditional March 25, 2026 00:10
@mtopo27 mtopo27 marked this pull request as ready for review March 25, 2026 20:07
@mtopo27 mtopo27 requested a review from a team as a code owner March 25, 2026 20:07
Comment thread src/sentry/preprod/artifact_search.py
Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

search_filters = parse_search_query(
query, config=search_config, get_field_type=get_field_type
)
queryset = apply_filters(queryset, search_filters, organization)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing queryset annotations for annotation-dependent search fields

Medium Severity

get_sequential_base_artifact creates a bare queryset via PreprodArtifact.objects.get_queryset() without calling the annotation methods that several valid search filter fields depend on. The existing queryset_for_query applies annotate_download_count, annotate_installable, annotate_main_size_metrics, and annotate_platform_name before calling apply_filters. Without those annotations, any detector query using install_size, download_size, download_count, installable, or platform_name will raise a Django FieldError at query execution time. That exception isn't caught by the InvalidSearchQuery handler in the calling loop, so it propagates to the top-level catch-all, silently aborting evaluation for all monitors on that artifact.

Additional Locations (1)
Fix in Cursor Fix in Web


queryset = queryset.exclude(pk=artifact.pk)

return queryset.order_by("-date_added").first()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No date constraint allows returning newer artifacts as base

Low Severity

get_sequential_base_artifact excludes the current artifact by PK and picks the most recent remaining one, but never filters date_added__lt=artifact.date_added. If an older artifact is reprocessed or tasks run out of order, the function can return a newer artifact as the "base," inverting the comparison direction. This contradicts the docstring's guarantee of finding "the most recent prior artifact."

Fix in Cursor Fix in Web

Comment thread src/sentry/preprod/size_analysis/tasks.py
Flip the early-return guard `if not artifact.commit_comparison: return`
into an `if/else` block. This is a pure structural change with no
behavior difference, preparing for a follow-up that adds logic to the
else branch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mtopo27 mtopo27 force-pushed the mtopo27/refactor-commit-comparison-conditional branch from f20d53c to 21a3f8c Compare March 26, 2026 21:58
mtopo27 and others added 3 commits March 26, 2026 15:09
…f monitors

Change size analysis diff monitors to use sequential comparisons instead
of git-based (merge-base) comparisons. When a new build arrives, each
diff-threshold monitor now resolves its base by finding the most recent
prior artifact matching the detector's query filters and structural
identity (app_id, artifact_type, build_configuration).

- Add get_sequential_base_artifact() to artifact_search.py for n-1
  base lookup with batch optimization per unique query string
- Refactor maybe_emit_issues into maybe_emit_issues_from_diff_size_results
  which owns all diff monitor evaluation via sequential comparison
- Remove maybe_emit_issues call from git-based _run_size_analysis_comparison
  so the git path only persists comparison records for status checks
- Restructure compare_preprod_artifact_size_analysis so the commit_comparison
  check guards only the git block; sequential monitor eval runs unconditionally

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…uations

Wire _run_size_analysis_comparison into the sequential (n-1) monitor
path so that PreprodArtifactSizeComparison records are created and
file-level diffs are persisted, matching the git-based status check
path. Base metrics are now fetched once per unique base artifact and
cached, removing redundant per-detector DB queries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use `detector_base_metric` to avoid shadowing the outer
`base_metric` variable, which caused an incompatible assignment
and unreachable statement error in mypy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@chromy chromy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Base automatically changed from mtopo27/refactor-commit-comparison-conditional to master March 30, 2026 14:58
…diff-monitors

# Conflicts:
#	src/sentry/preprod/size_analysis/tasks.py
@mtopo27 mtopo27 merged commit 54d7c86 into master Mar 30, 2026
65 checks passed
@mtopo27 mtopo27 deleted the mtopo27/sequential-diff-monitors branch March 30, 2026 17:29
@github-actions github-actions bot locked and limited conversation to collaborators Apr 15, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants