fix(integrations): Cache missing GitHub repo tree lookups #113113

Draft
armenzg wants to merge 1 commit into master from more_endpoints

armenzg commented Apr 15, 2026

Cache GitHub repo tree 404 responses as empty results in repo tree fetching.

Auto source code config was repeatedly requesting the same missing repos/refs and
producing high-volume errors. Negative-caching these not-found responses for one day
reduces repeated API calls and issue noise while preserving current handling for other
API failures.
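A minimal sketch of the negative-caching idea, assuming a tiny in-memory cache with per-key expiry. `TTLCache`, `FakeClock`, and the key format are hypothetical stand-ins for illustration, not the cache backend or key scheme used in this PR. It shows that an empty list cached for a missing repo counts as a hit until the one-day TTL elapses, which is why a lookup like this has to compare against `None` rather than rely on truthiness.

```python
import time
from typing import Callable, Optional

NOT_FOUND_CACHE_SECONDS = 24 * 60 * 60  # one day, as stated in the description


class TTLCache:
    """Illustrative in-memory cache with per-key expiry (not Sentry's cache)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: dict[str, tuple[float, list[str]]] = {}

    def set(self, key: str, value: list[str], ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[list[str]]:
        entry = self._entries.get(key)
        if entry is None:
            return None  # never cached
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: behave like a miss
            return None
        return value


class FakeClock:
    """Manually advanced clock so the example does not need to sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
cache = TTLCache(clock)
key = "repo-tree:Test-Organization/baz"  # hypothetical key format

cache.set(key, [], NOT_FOUND_CACHE_SECONDS)
hit = cache.get(key)  # the cached "no files" marker, an empty list
clock.now += NOT_FOUND_CACHE_SECONDS
after_expiry = cache.get(key)  # the entry has aged out, so this is a miss
```

While the entry is live, repeated lookups for the same missing repo never reach the API. Once it expires, the next lookup retries, so a repo or ref that appears later is picked up within a day.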

I considered applying this to all API errors, but that would hide actionable failures.
The change is intentionally scoped to not-found responses and keeps other errors raising
as before.

Tests cover the new 404 cache behavior and confirm non-404 ApiError responses still raise.
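The scoping and the two tested behaviors can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `ApiError` here is a hypothetical stand-in exception with a `code` attribute, `get_repo_files` and the fetch callables are invented names, and a plain dict replaces the real cache (expiry is left out to keep the focus on the error branch).

```python
from typing import Callable


class ApiError(Exception):
    """Hypothetical stand-in for an API error that carries an HTTP status code."""

    def __init__(self, message: str, code: int) -> None:
        super().__init__(message)
        self.code = code


def get_repo_files(
    repo: str,
    fetch_tree: Callable[[str], list[str]],
    cache: dict[str, list[str]],
) -> list[str]:
    cached = cache.get(repo)
    if cached is not None:  # an empty list is still a valid cache hit
        return cached
    try:
        files = fetch_tree(repo)
    except ApiError as error:
        if error.code != 404:
            raise  # other failures stay visible and are not cached
        files = []  # missing repo or ref: remember it as "no files"
    cache[repo] = files
    return files


calls: list[str] = []


def fetch_missing(repo: str) -> list[str]:
    calls.append(repo)
    raise ApiError("Not Found", 404)


def fetch_broken(repo: str) -> list[str]:
    raise ApiError("Server Error", 500)


cache: dict[str, list[str]] = {}
first = get_repo_files("org/missing", fetch_missing, cache)
second = get_repo_files("org/missing", fetch_missing, cache)  # served from cache

try:
    get_repo_files("org/broken", fetch_broken, cache)
    raised_code = None
except ApiError as error:
    raised_code = error.code
```

The second lookup never reaches `fetch_missing`, while the 500 propagates and leaves nothing in the cache, so the next run retries it.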

Fixes SENTRY-5K7G

Made with Cursor

Cache GitHub repo tree 404 responses as empty results to avoid repeated
failing requests from auto source code config. Keep non-404 API errors
raising normally and cover both behaviors with tests.

Fixes SENTRY-5K7G
Co-Authored-By: Codex <noreply@openai.com>

Made-with: Cursor

github-actions bot added the Scope: Backend label (automatically applied to PRs that change backend components) Apr 15, 2026

cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: 404 cache missing staggered expiration unlike other cases
    • Added shifted_seconds to the NOT_FOUND_CACHE_SECONDS duration at line 221 to stagger cache expirations across repos, consistent with the ApiConflictError and success path handling.

Push these changes by commenting:

@cursor push 521812cc91
Preview (521812cc91)
diff --git a/src/sentry/integrations/source_code_management/repo_trees.py b/src/sentry/integrations/source_code_management/repo_trees.py
--- a/src/sentry/integrations/source_code_management/repo_trees.py
+++ b/src/sentry/integrations/source_code_management/repo_trees.py
@@ -218,7 +218,7 @@
                         "Caching empty files result for missing repo or ref",
                         extra={"repo": repo_full_name},
                     )
-                    cache.set(key, [], NOT_FOUND_CACHE_SECONDS)
+                    cache.set(key, [], NOT_FOUND_CACHE_SECONDS + shifted_seconds)
                     tree = None
                 else:
                     raise

Reviewed by Cursor Bugbot for commit c29b65f. Configure here.

"Caching empty files result for missing repo or ref",
extra={"repo": repo_full_name},
)
cache.set(key, [], NOT_FOUND_CACHE_SECONDS)

404 cache missing staggered expiration unlike other cases

Low Severity

The 404 cache uses a fixed NOT_FOUND_CACHE_SECONDS duration without adding shifted_seconds, while the ApiConflictError (409) and success paths both use self.CACHE_SECONDS + shifted_seconds to stagger cache expirations across repos. When _populate_trees processes many missing repos, all their negative-cache entries expire at the exact same time, causing a burst of repeated 404 API calls on the next run instead of spreading them out.
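To make the burst effect concrete, here is a small sketch comparing expiration times with and without a per-repo offset. How `shifted_seconds` is actually computed is not shown in this conversation, so `shifted_seconds_for`, the one-hour step, and the 24 buckets below are assumptions made purely for illustration.

```python
from collections import Counter

NOT_FOUND_CACHE_SECONDS = 24 * 60 * 60  # one day
STAGGER_STEP_SECONDS = 60 * 60  # assumed step; the real formula is not shown here


def shifted_seconds_for(index: int, buckets: int = 24) -> int:
    """Hypothetical offset: place each repo in one of `buckets` hourly slots."""
    return STAGGER_STEP_SECONDS * (index % buckets)


def expirations(repo_count: int, staggered: bool, now: int = 0) -> list[int]:
    """Return the absolute expiry time of each repo's negative-cache entry."""
    result = []
    for index in range(repo_count):
        ttl = NOT_FOUND_CACHE_SECONDS
        if staggered:
            ttl += shifted_seconds_for(index)
        result.append(now + ttl)
    return result


fixed = expirations(48, staggered=False)
spread = expirations(48, staggered=True)

# Largest number of entries that expire at the same instant in each scheme.
fixed_peak = max(Counter(fixed).values())
spread_peak = max(Counter(spread).values())
```

Under these assumptions, 48 missing repos cached in one run all expire together with the fixed TTL, whereas the offset version lets at most two expire at any one instant.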

github-actions bot commented Apr 15, 2026

Backend Test Failures

Failures on 4884e64 in this run:

tests/sentry/integrations/github/test_integration.py::GitHubIntegrationTest::test_get_trees_for_org_rate_limit_401
[gw1] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/github/test_integration.py:1124: in test_get_trees_for_org_rate_limit_401
    assert trees == self._expected_trees(
E   AssertionError: assert {'Test-Organi...src/xyz.py'])} == {'Test-Organi...src/xyz.py'])}
E     
E     Omitting 3 identical items, use -vv to show
E     Left contains 1 more item:
E     {'Test-Organization/baz': RepoTree(repo=RepoAndBranch(name='Test-Organization/baz', branch='master'), files=[])}
E     
E     Full diff:
E       {
E           'Test-Organization/bar': RepoTree(repo=RepoAndBranch(name='Test-Organization/bar', branch='main'), files=[]),
E     +     'Test-Organization/baz': RepoTree(repo=RepoAndBranch(name='Test-Organization/baz', branch='master'), files=[]),
E           'Test-Organization/foo': RepoTree(repo=RepoAndBranch(name='Test-Organization/foo', branch='master'), files=['src/sentry/api/endpoints/auth_login.py']),
E           'Test-Organization/xyz': RepoTree(repo=RepoAndBranch(name='Test-Organization/xyz', branch='master'), files=['src/xyz.py']),
E       }

tests/sentry/integrations/github/test_integration.py::GitHubIntegrationTest::test_get_trees_for_org_makes_API_requests_before_MAX_CONNECTION_ERRORS_is_hit
[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/github/test_integration.py:1170: in test_get_trees_for_org_makes_API_requests_before_MAX_CONNECTION_ERRORS_is_hit
    assert trees == self._expected_trees(
E   AssertionError: assert {'Test-Organi...h_login.py'])} == {'Test-Organi...h_login.py'])}
E     
E     Omitting 2 identical items, use -vv to show
E     Left contains 1 more item:
E     {'Test-Organization/baz': RepoTree(repo=RepoAndBranch(name='Test-Organization/baz', branch='master'), files=[])}
E     
E     Full diff:
E       {
E           'Test-Organization/bar': RepoTree(repo=RepoAndBranch(name='Test-Organization/bar', branch='main'), files=[]),
E     +     'Test-Organization/baz': RepoTree(repo=RepoAndBranch(name='Test-Organization/baz', branch='master'), files=[]),
E           'Test-Organization/foo': RepoTree(repo=RepoAndBranch(name='Test-Organization/foo', branch='master'), files=['src/sentry/api/endpoints/auth_login.py']),
E       }

tests/sentry/integrations/github/test_integration.py::GitHubIntegrationTest::test_get_trees_for_org_prevent_exhaustion_some_repos
[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/github/test_integration.py:1091: in test_get_trees_for_org_prevent_exhaustion_some_repos
    assert trees == self._expected_trees(
E   AssertionError: assert {'Test-Organi...src/xyz.py'])} == {'Test-Organi...src/xyz.py'])}
E     
E     Omitting 3 identical items, use -vv to show
E     Left contains 1 more item:
E     {'Test-Organization/baz': RepoTree(repo=RepoAndBranch(name='Test-Organization/baz', branch='master'), files=[])}
E     
E     Full diff:
E       {
E           'Test-Organization/bar': RepoTree(repo=RepoAndBranch(name='Test-Organization/bar', branch='main'), files=[]),
E     +     'Test-Organization/baz': RepoTree(repo=RepoAndBranch(name='Test-Organization/baz', branch='master'), files=[]),
E           'Test-Organization/foo': RepoTree(repo=RepoAndBranch(name='Test-Organization/foo', branch='master'), files=['src/sentry/api/endpoints/auth_login.py']),
E           'Test-Organization/xyz': RepoTree(repo=RepoAndBranch(name='Test-Organization/xyz', branch='master'), files=['src/xyz.py']),
E       }

tests/sentry/integrations/github/test_integration.py::GitHubIntegrationTest::test_get_trees_for_org_works
[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/github/test_integration.py:1048: in test_get_trees_for_org_works
    assert trees == expected_trees
E   AssertionError: assert {'Test-Organi...src/xyz.py'])} == {'Test-Organi...src/xyz.py'])}
E     
E     Omitting 3 identical items, use -vv to show
E     Left contains 1 more item:
E     {'Test-Organization/baz': RepoTree(repo=RepoAndBranch(name='Test-Organization/baz', branch='master'), files=[])}
E     
E     Full diff:
E       {
E           'Test-Organization/bar': RepoTree(repo=RepoAndBranch(name='Test-Organization/bar', branch='main'), files=[]),
E     +     'Test-Organization/baz': RepoTree(repo=RepoAndBranch(name='Test-Organization/baz', branch='master'), files=[]),
E           'Test-Organization/foo': RepoTree(repo=RepoAndBranch(name='Test-Organization/foo', branch='master'), files=['src/sentry/api/endpoints/auth_login.py']),
E           'Test-Organization/xyz': RepoTree(repo=RepoAndBranch(name='Test-Organization/xyz', branch='master'), files=['src/xyz.py']),
E       }
