fix(aws-serverless): Add timeout to _endSpan forceFlush to prevent Lambda hanging #20064

Merged
logaretm merged 1 commit into develop from awad/js-2033-aws-can-block-handler-indefinitely on Apr 1, 2026

@logaretm
Member

@logaretm logaretm commented Mar 31, 2026

The vendored AwsLambdaInstrumentation._endSpan calls tracerProvider.forceFlush() without any timeout.

Because _endSpan blocks the promise chain before wrapHandler's flush(2000) can run, a hung forceFlush() prevents the Lambda from ever returning, causing it to sit idle until the runtime kills it at its configured timeout.

This was reported by a user whose AppSync authorizer Lambda consistently timed out at 15s despite the handler completing in ~178ms.

The fix wraps the flush in a Promise.race with a 2s timeout, matching wrapHandler's existing flush timeout.
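The bounded wait described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the `Flusher` interface and the `flushWithTimeout` helper are hypothetical names introduced here, and only the 2s default and the use of `Promise.race` come from the description.

```typescript
// Hypothetical stand-in for whatever object exposes forceFlush().
interface Flusher {
  forceFlush(): Promise<void>;
}

// Race the flush against a timer so the callback fires within `timeoutMs`
// even if forceFlush() never settles. A rejected flush also releases the
// callback, so a failing exporter cannot block the handler either.
function flushWithTimeout(flusher: Flusher, callback: () => void, timeoutMs = 2000): void {
  let timeoutId: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<void>(resolve => {
    timeoutId = setTimeout(resolve, timeoutMs);
  });

  Promise.race([flusher.forceFlush(), timeout]).then(
    () => {
      clearTimeout(timeoutId);
      callback();
    },
    () => {
      clearTimeout(timeoutId);
      callback();
    },
  );
}

// Usage: a flush that never settles still releases the callback,
// here after a shortened 50 ms timeout.
const hangingFlusher: Flusher = { forceFlush: () => new Promise<void>(() => {}) };
flushWithTimeout(hangingFlusher, () => console.log('callback fired'), 50);
```

Clearing the timer once the race settles keeps a pending timeout from holding the event loop open after a fast flush.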

closes #20063

Copilot AI review requested due to automatic review settings March 31, 2026 20:46
@github-actions
Contributor

github-actions bot commented Mar 31, 2026

Semver Impact of This PR

🟢 Patch (bug fixes)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


New Features ✨

  • (core, node) Portable Express integration by isaacs in #19928
  • (deno) Add denoRuntimeMetricsIntegration by chargome in #20023
  • (deps) Bump @xmldom/xmldom from 0.8.3 to 0.8.12 by dependabot in #20066

Bug Fixes 🐛

  • (aws-serverless) Add timeout to _endSpan forceFlush to prevent Lambda hanging by logaretm in #20064
  • (cloudflare) Ensure every request instruments functions by JPeer264 in #20044
  • (gatsby) Fix errorHandler signature to match bundler-plugin-core API by JPeer264 in #20048

Internal Changes 🔧

Core

  • Extract shared endStreamSpan for AI integrations by nicohrubec in #20021
  • Remove provider-specific AI span attributes in favor of gen_ai attributes in sentry conventions by nicohrubec in #20011

Other

  • Update validate-pr workflow by stephanie-anderson in #20072
  • Remove unused tsconfig-template folder by mydea in #20067

🤖 This preview updates automatically when you update the PR.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Contributor

Copilot AI left a comment

Pull request overview

Adds a bounded timeout around OpenTelemetry forceFlush() in the vendored AwsLambdaInstrumentation._endSpan to prevent AWS Lambda invocations from hanging indefinitely when flush never resolves, and introduces regression tests to verify callback behavior in both hanging and normal flush scenarios.

Changes:

  • Wrap _endSpan flusher waiting in a Promise.race with a 2s timeout.
  • Add new AwsLambdaInstrumentation._endSpan unit tests covering hanging flush, normal flush, and error-to-span metadata behavior.
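The hanging-flush and normal-flush scenarios listed above could be exercised along these lines. This is a framework-free sketch under stated assumptions: `waitForFlush`, `Outcome`, and `runScenarios` are hypothetical names, the timeout is shortened to 20 ms, and the PR's actual tests in `instrumentation.test.ts` may be structured quite differently.

```typescript
type Outcome = 'flushed' | 'timed-out';

// Hypothetical guard: reports how the wait ended and never rejects.
function waitForFlush(flush: () => Promise<void>, timeoutMs: number): Promise<Outcome> {
  let timeoutId: ReturnType<typeof setTimeout> | undefined;

  const timer = new Promise<Outcome>(resolve => {
    timeoutId = setTimeout(() => resolve('timed-out'), timeoutMs);
  });

  // Treat fulfilment and rejection alike: either way the flush is over.
  const flushed = flush().then(
    (): Outcome => 'flushed',
    (): Outcome => 'flushed',
  );

  return Promise.race([flushed, timer]).then(outcome => {
    clearTimeout(timeoutId);
    return outcome;
  });
}

async function runScenarios(): Promise<Outcome[]> {
  // Scenario 1: a flush that never settles must be cut off by the timer.
  const hanging = await waitForFlush(() => new Promise<void>(() => {}), 20);
  // Scenario 2: a flush that resolves promptly must win the race.
  const normal = await waitForFlush(() => Promise.resolve(), 20);
  return [hanging, normal];
}

runScenarios().then(outcomes => console.log(outcomes)); // prints [ 'timed-out', 'flushed' ]
```

A real test suite would more likely use fake timers so that the full 2s timeout can be asserted without waiting for it.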

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/aws-serverless/src/integration/instrumentation-aws-lambda/instrumentation.ts | Adds a 2s timeout guard around OTel forceFlush waiting in _endSpan. |
| packages/aws-serverless/test/instrumentation.test.ts | New tests validating _endSpan callback timing and error span annotations. |

@logaretm logaretm force-pushed the awad/js-2033-aws-can-block-handler-indefinitely branch from 7eb6fef to b289419 Compare March 31, 2026 20:49
@logaretm logaretm requested review from JPeer264 and chargome March 31, 2026 20:50
@github-actions
Contributor

github-actions bot commented Mar 31, 2026

size-limit report 📦

⚠️ Warning: Base artifact is not the latest one, because the latest workflow run is not done yet. This may lead to incorrect results. Try to re-run all tests to get up to date results.

| Path | Size | % Change | Change |
| --- | --- | --- | --- |
| @sentry/browser | 25.65 kB | +0.02% | +5 B 🔺 |
| @sentry/browser - with treeshaking flags | 24.14 kB | +0.03% | +5 B 🔺 |
| @sentry/browser (incl. Tracing) | 42.16 kB | +0.02% | +7 B 🔺 |
| @sentry/browser (incl. Tracing, Profiling) | 46.77 kB | +0.02% | +9 B 🔺 |
| @sentry/browser (incl. Tracing, Replay) | 80.94 kB | +0.01% | +5 B 🔺 |
| @sentry/browser (incl. Tracing, Replay) - with treeshaking flags | 70.56 kB | +0.01% | +5 B 🔺 |
| @sentry/browser (incl. Tracing, Replay with Canvas) | 85.66 kB | +0.01% | +8 B 🔺 |
| @sentry/browser (incl. Tracing, Replay, Feedback) | 97.92 kB | +0.01% | +5 B 🔺 |
| @sentry/browser (incl. Feedback) | 42.42 kB | +0.02% | +6 B 🔺 |
| @sentry/browser (incl. sendFeedback) | 30.31 kB | +0.02% | +6 B 🔺 |
| @sentry/browser (incl. FeedbackAsync) | 35.29 kB | +0.02% | +6 B 🔺 |
| @sentry/browser (incl. Metrics) | 26.96 kB | +0.03% | +7 B 🔺 |
| @sentry/browser (incl. Logs) | 27.11 kB | +0.03% | +7 B 🔺 |
| @sentry/browser (incl. Metrics & Logs) | 27.78 kB | +0.03% | +7 B 🔺 |
| @sentry/react | 27.41 kB | +0.03% | +6 B 🔺 |
| @sentry/react (incl. Tracing) | 44.48 kB | +0.02% | +5 B 🔺 |
| @sentry/vue | 30.08 kB | +0.02% | +5 B 🔺 |
| @sentry/vue (incl. Tracing) | 44.05 kB | +0.02% | +8 B 🔺 |
| @sentry/svelte | 25.67 kB | +0.02% | +5 B 🔺 |
| CDN Bundle | 28.32 kB | +0.03% | +6 B 🔺 |
| CDN Bundle (incl. Tracing) | 43.11 kB | +0.02% | +6 B 🔺 |
| CDN Bundle (incl. Logs, Metrics) | 29.68 kB | +0.02% | +4 B 🔺 |
| CDN Bundle (incl. Tracing, Logs, Metrics) | 44.16 kB | +0.02% | +7 B 🔺 |
| CDN Bundle (incl. Replay, Logs, Metrics) | 68.48 kB | +0.01% | +6 B 🔺 |
| CDN Bundle (incl. Tracing, Replay) | 80.01 kB | +0.01% | +7 B 🔺 |
| CDN Bundle (incl. Tracing, Replay, Logs, Metrics) | 81.05 kB | +0.01% | +6 B 🔺 |
| CDN Bundle (incl. Tracing, Replay, Feedback) | 85.55 kB | +0.01% | +5 B 🔺 |
| CDN Bundle (incl. Tracing, Replay, Feedback, Logs, Metrics) | 86.58 kB | +0.01% | +6 B 🔺 |
| CDN Bundle - uncompressed | 82.68 kB | +0.03% | +22 B 🔺 |
| CDN Bundle (incl. Tracing) - uncompressed | 127.83 kB | +0.02% | +22 B 🔺 |
| CDN Bundle (incl. Logs, Metrics) - uncompressed | 86.83 kB | +0.03% | +22 B 🔺 |
| CDN Bundle (incl. Tracing, Logs, Metrics) - uncompressed | 131.24 kB | +0.02% | +22 B 🔺 |
| CDN Bundle (incl. Replay, Logs, Metrics) - uncompressed | 209.81 kB | +0.02% | +22 B 🔺 |
| CDN Bundle (incl. Tracing, Replay) - uncompressed | 244.7 kB | +0.01% | +22 B 🔺 |
| CDN Bundle (incl. Tracing, Replay, Logs, Metrics) - uncompressed | 248.1 kB | +0.01% | +22 B 🔺 |
| CDN Bundle (incl. Tracing, Replay, Feedback) - uncompressed | 257.62 kB | +0.01% | +22 B 🔺 |
| CDN Bundle (incl. Tracing, Replay, Feedback, Logs, Metrics) - uncompressed | 261 kB | +0.01% | +22 B 🔺 |
| @sentry/nextjs (client) | 46.9 kB | +0.02% | +7 B 🔺 |
| @sentry/sveltekit (client) | 42.62 kB | +0.02% | +7 B 🔺 |
| @sentry/node-core | 55.77 kB | +0.04% | +17 B 🔺 |
| @sentry/node | 172.13 kB | -0.37% | -625 B 🔽 |
| @sentry/node - without tracing | 96.05 kB | +0.05% | +40 B 🔺 |
| @sentry/aws-serverless | 112.85 kB | +0.07% | +78 B 🔺 |

Comment on lines +514 to +522
    () => {
      clearTimeout(timeoutId);
      callback();
    },
    () => {
      clearTimeout(timeoutId);
      callback();
    },
  );
Member

l: I think we could replace that with just a .finally?

Member Author

We still need the catch to handle rejections, though I'm not sure one will ever happen here. But finally can at least reduce the duplication. 👍
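The variant discussed in this thread might look like the sketch below. It is an illustration only, not necessarily what was merged: `flushThenCallback` is a hypothetical helper name. The `.catch` matters because `.finally` alone passes a rejection through, which would surface as an unhandled rejection if the flush failed.

```typescript
// Hypothetical helper showing the catch + finally shape from the review thread.
function flushThenCallback(
  flush: () => Promise<void>,
  callback: () => void,
  timeoutMs = 2000,
): void {
  let timeoutId: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<void>(resolve => {
    timeoutId = setTimeout(resolve, timeoutMs);
  });

  Promise.race([flush(), timeout])
    // Swallow a rejected flush; finally() would otherwise re-reject the chain.
    .catch(() => undefined)
    // One cleanup path replaces the two identical then() callbacks.
    .finally(() => {
      clearTimeout(timeoutId);
      callback();
    });
}

// Usage: a rejected flush still reaches the callback without an unhandled rejection.
flushThenCallback(
  () => Promise.reject(new Error('export failed')),
  () => console.log('callback fired'),
);
```

Placing `.catch` before `.finally` means the callback runs exactly once on every path: fulfilment, rejection, or timeout.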

…mbda hanging

The vendored AwsLambdaInstrumentation._endSpan calls tracerProvider.forceFlush()
without any timeout. Because _endSpan blocks the promise chain before wrapHandler's
flush(2000), a hung forceFlush() prevents the Lambda from ever returning — causing
it to sit idle until the runtime kills it at its configured timeout.

Wrap the flush in a Promise.race with a 2s timeout to match wrapHandler's flush
timeout, ensuring the callback always fires within a bounded time.

Co-Authored-By: Claude <noreply@anthropic.com>
Made-with: Cursor
@logaretm logaretm force-pushed the awad/js-2033-aws-can-block-handler-indefinitely branch from b289419 to b517c7e Compare April 1, 2026 13:12
@logaretm logaretm merged commit 6a397a3 into develop Apr 1, 2026
45 checks passed
@logaretm logaretm deleted the awad/js-2033-aws-can-block-handler-indefinitely branch April 1, 2026 13:44
Development

Successfully merging this pull request may close these issues.

AWS Lambda _endSpan/forceFlush() can block handler indefinitely

3 participants