-
-
Notifications
You must be signed in to change notification settings - Fork 6
fix: use parent spanId for request logs to nest under Cloud Run traces #1209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Custom TraceAwareCloudLoggingFilter fixes spanId mismatch where app logs were detached from parent HTTP requests in Cloud Logging. - Request-level logs use parent spanId (nests under Cloud Run logs) - Custom span logs use child spanId (correlates with Cloud Trace spans) - Detection via span.kind: SERVER = root request, INTERNAL = custom span - Removed LoggingInstrumentor (custom filter handles trace context)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a spanId mismatch issue where application logs were detached from parent HTTP request logs in Cloud Logging. The solution introduces a custom logging filter that intelligently chooses between parent and child span IDs based on span type.
Key changes:
- Custom
TraceAwareCloudLoggingFilteruses parent spanId for request-level logs (nesting under Cloud Run logs) and child spanId for custom span logs (correlating with Cloud Trace spans) - Detection based on
span.kind: SERVER spans (from DjangoInstrumentor) vs INTERNAL spans (custom application spans) - Removed
LoggingInstrumentorsince the custom filter now handles trace context injection
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| gyrinx/logging_filter.py | New module implementing TraceAwareCloudLoggingFilter and TraceAwareStructuredLogHandler for proper spanId correlation |
| gyrinx/tests/test_logging_filter.py | Comprehensive test suite covering filter logic, span detection, and fallback scenarios |
| gyrinx/tracing.py | Removed LoggingInstrumentor import and usage, added comment explaining custom filter handles trace context |
| gyrinx/settings_prod.py | Updated logging handler class to use custom TraceAwareStructuredLogHandler |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| class TestTraceAwareStructuredLogHandler: | ||
| """Tests for TraceAwareStructuredLogHandler.""" | ||
|
|
||
| def test_uses_custom_filter(self): | ||
| """Handler should use TraceAwareCloudLoggingFilter.""" | ||
| handler = TraceAwareStructuredLogHandler( | ||
| project_id="test-project", labels={"env": "test"} | ||
| ) | ||
|
|
||
| # Should have exactly one filter of our custom type | ||
| custom_filters = [ | ||
| f for f in handler.filters if isinstance(f, TraceAwareCloudLoggingFilter) | ||
| ] | ||
| assert len(custom_filters) == 1 | ||
|
|
||
| # Check filter configuration | ||
| custom_filter = custom_filters[0] | ||
| assert custom_filter.project == "test-project" | ||
| assert custom_filter.default_labels == {"env": "test"} | ||
|
|
||
| def test_stores_project_id(self): | ||
| """Handler should store project_id attribute.""" | ||
| handler = TraceAwareStructuredLogHandler(project_id="my-project") | ||
| assert handler.project_id == "my-project" | ||
|
|
||
| def test_stores_json_encoder(self): | ||
| """Handler should store json_encoder_cls attribute.""" | ||
| import json | ||
|
|
||
| handler = TraceAwareStructuredLogHandler( | ||
| project_id="test", json_encoder_cls=json.JSONEncoder | ||
| ) | ||
| assert handler._json_encoder_cls == json.JSONEncoder |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests for TraceAwareStructuredLogHandler only verify that attributes are stored correctly, but don't test the actual logging functionality. There's no test that verifies the handler can successfully format and emit log records, which is critical given that the handler bypasses StructuredLogHandler's init method.
Consider adding a test that creates a log record, passes it through the handler, and verifies the output is correctly formatted as JSON. This would catch any issues with the missing initialization or formatting setup.
| is_custom_span = ( | ||
| getattr(span, "kind", None) != opentelemetry.trace.SpanKind.SERVER | ||
| ) |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The span kind comparison uses getattr with a default of None, which could cause unexpected behavior. If span.kind is None (attribute doesn't exist), the comparison None != opentelemetry.trace.SpanKind.SERVER will be True, meaning the span will be incorrectly classified as a custom span.
This could lead to logs using the wrong span ID. Consider using a more explicit check or handling the case where kind attribute doesn't exist differently. For example, you could default to treating missing kind as SERVER (request-level) rather than as a custom span, since that's likely the safer default for logging correlation.
| is_custom_span = ( | |
| getattr(span, "kind", None) != opentelemetry.trace.SpanKind.SERVER | |
| ) | |
| # If `kind` is missing, default to SERVER (treat as request-level span). | |
| kind = getattr(span, "kind", opentelemetry.trace.SpanKind.SERVER) | |
| is_custom_span = kind != opentelemetry.trace.SpanKind.SERVER |
|
|
||
| def test_returns_none_when_no_opentelemetry(self, filter_instance): | ||
| """Returns None values when OpenTelemetry is not available.""" | ||
| with patch("gyrinx.logging_filter.HAS_OPENTELEMETRY", False): |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test mocks gyrinx.logging_filter.HAS_OPENTELEMETRY but this approach may not work as expected in Python. The module-level constant HAS_OPENTELEMETRY is evaluated once when the module is imported, and patching it in a test might not affect the already-imported value used by the function.
A more reliable approach would be to patch the import check at the point where it's used in _get_otel_context(), or to patch opentelemetry.trace to be None/unavailable. The current approach might lead to flaky tests or tests that don't actually validate the intended behavior.
| with patch("gyrinx.logging_filter.HAS_OPENTELEMETRY", False): | |
| # Patch the value at the point where _get_otel_context reads it, | |
| # rather than patching the module-level constant directly. | |
| with patch.dict( | |
| filter_instance._get_otel_context.__globals__, {"HAS_OPENTELEMETRY": False} | |
| ): |
| with patch("gyrinx.logging_filter.opentelemetry.trace") as mock_trace: | ||
| mock_trace.get_current_span.return_value = mock_trace.span.INVALID_SPAN |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test compares the return value of get_current_span() with mock_trace.span.INVALID_SPAN, but this creates an inconsistent mock structure. The code being tested accesses opentelemetry.trace.span.INVALID_SPAN (line 111 in logging_filter.py), but the mock sets mock_trace.span.INVALID_SPAN.
This means the comparison span == opentelemetry.trace.span.INVALID_SPAN in the actual code won't match the mocked value correctly because the mock paths don't align. The test should ensure that mock_trace.span.INVALID_SPAN is the same object that get_current_span() returns, or it should properly mock the full path gyrinx.logging_filter.opentelemetry.trace.span.INVALID_SPAN.
| with patch("gyrinx.logging_filter.opentelemetry.trace") as mock_trace: | |
| mock_trace.get_current_span.return_value = mock_trace.span.INVALID_SPAN | |
| with patch("gyrinx.logging_filter.opentelemetry") as mock_otel: | |
| invalid_span = MagicMock() | |
| mock_otel.trace.span.INVALID_SPAN = invalid_span | |
| mock_otel.trace.get_current_span.return_value = invalid_span |
| span_id = otel_span_id # Correlate with custom span in Cloud Trace | ||
| else: | ||
| span_id = header_span_id # Nest under Cloud Run request log | ||
| trace_id = otel_trace_id | ||
| sampled = otel_sampled |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for determining which span ID to use has a potential edge case. When both otel_trace_id and header_span_id are present (line 62), but is_custom_span is False (line 67-68), the code uses header_span_id for the span but otel_trace_id for the trace.
However, if the trace IDs don't match (header_trace_id != otel_trace_id), this creates an inconsistency where the span ID comes from one trace context but the trace ID comes from another. This could happen if there's a propagation issue or if the trace contexts are out of sync.
Consider adding validation to ensure trace IDs match when both contexts are available, or at minimum add a comment explaining why using mixed IDs from different sources is safe in this scenario.
| span_id = otel_span_id # Correlate with custom span in Cloud Trace | |
| else: | |
| span_id = header_span_id # Nest under Cloud Run request log | |
| trace_id = otel_trace_id | |
| sampled = otel_sampled | |
| # Custom span logs: use OTel span/trace so they correlate with Cloud Trace spans. | |
| span_id = otel_span_id | |
| trace_id = otel_trace_id | |
| sampled = otel_sampled | |
| else: | |
| # Request-level logs: prefer parent spanId from header to nest under Cloud Run logs. | |
| if header_trace_id and header_trace_id != otel_trace_id: | |
| # If header and OTel trace IDs disagree, keep span/trace consistent by | |
| # using the header trace context for request-level logs. | |
| span_id = header_span_id | |
| trace_id = header_trace_id | |
| sampled = header_sampled | |
| else: | |
| # Normal case: trace IDs match or header_trace_id is unavailable. | |
| # Use header spanId for nesting, but keep OTel traceId for cross-service correlation. | |
| span_id = header_span_id | |
| trace_id = otel_trace_id | |
| sampled = otel_sampled |
| # Call StreamHandler init directly to skip StructuredLogHandler's filter setup | ||
| logging.StreamHandler.__init__(self, stream=stream) | ||
| self.project_id = project_id | ||
|
|
||
| # Use our custom filter instead of default CloudLoggingFilter | ||
| log_filter = TraceAwareCloudLoggingFilter( | ||
| project=project_id, default_labels=labels | ||
| ) | ||
| self.addFilter(log_filter) | ||
|
|
||
| # Store json encoder class (used by StructuredLogHandler.emit) | ||
| self._json_encoder_cls = json_encoder_cls |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TraceAwareStructuredLogHandler bypasses StructuredLogHandler's init method by calling StreamHandler.init directly. This means it doesn't inherit important initialization logic from StructuredLogHandler, particularly the setFormatter() call that configures JSON formatting.
Without calling super().init() or StructuredLogHandler.init(), the handler may not properly format log records as JSON for Cloud Logging. The parent class typically sets up a StructuredLogFormatter which is critical for proper JSON output.
Consider calling the parent class's init method and then replacing the filter, rather than bypassing it entirely. Alternatively, ensure all necessary initialization from StructuredLogHandler is replicated here, including formatter setup.
| # Call StreamHandler init directly to skip StructuredLogHandler's filter setup | |
| logging.StreamHandler.__init__(self, stream=stream) | |
| self.project_id = project_id | |
| # Use our custom filter instead of default CloudLoggingFilter | |
| log_filter = TraceAwareCloudLoggingFilter( | |
| project=project_id, default_labels=labels | |
| ) | |
| self.addFilter(log_filter) | |
| # Store json encoder class (used by StructuredLogHandler.emit) | |
| self._json_encoder_cls = json_encoder_cls | |
| # Initialize using StructuredLogHandler to ensure formatter and internals are set up | |
| super().__init__( | |
| labels=labels, | |
| stream=stream, | |
| project_id=project_id, | |
| json_encoder_cls=json_encoder_cls, | |
| **kwargs, | |
| ) | |
| # Replace any default filters with our custom TraceAwareCloudLoggingFilter | |
| self.filters.clear() | |
| log_filter = TraceAwareCloudLoggingFilter( | |
| project=project_id, default_labels=labels | |
| ) | |
| self.addFilter(log_filter) |
| # Call StreamHandler init directly to skip StructuredLogHandler's filter setup | ||
| logging.StreamHandler.__init__(self, stream=stream) | ||
| self.project_id = project_id | ||
|
|
||
| # Use our custom filter instead of default CloudLoggingFilter | ||
| log_filter = TraceAwareCloudLoggingFilter( | ||
| project=project_id, default_labels=labels | ||
| ) | ||
| self.addFilter(log_filter) | ||
|
|
||
| # Store json encoder class (used by StructuredLogHandler.emit) | ||
| self._json_encoder_cls = json_encoder_cls |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TraceAwareStructuredLogHandler doesn't override the emit() method from StructuredLogHandler. While it inherits emit() from the parent class, the parent's emit() method expects certain attributes and formatting to be set up during initialization.
Since this handler bypasses StructuredLogHandler.init() and only calls StreamHandler.init(), the emit() method may not work correctly because it won't have access to the formatter or other setup that StructuredLogHandler normally provides.
Verify that the inherited emit() method works correctly with this initialization approach, or override emit() to ensure proper JSON formatting for Cloud Logging.
| # Call StreamHandler init directly to skip StructuredLogHandler's filter setup | |
| logging.StreamHandler.__init__(self, stream=stream) | |
| self.project_id = project_id | |
| # Use our custom filter instead of default CloudLoggingFilter | |
| log_filter = TraceAwareCloudLoggingFilter( | |
| project=project_id, default_labels=labels | |
| ) | |
| self.addFilter(log_filter) | |
| # Store json encoder class (used by StructuredLogHandler.emit) | |
| self._json_encoder_cls = json_encoder_cls | |
| # Initialize StructuredLogHandler normally so that emit() has all expected state | |
| super().__init__( | |
| labels=labels, | |
| stream=stream, | |
| project_id=project_id, | |
| json_encoder_cls=json_encoder_cls, | |
| **kwargs, | |
| ) | |
| # Replace the default CloudLoggingFilter with our custom trace-aware filter | |
| for existing_filter in list(self.filters): | |
| if isinstance(existing_filter, CloudLoggingFilter): | |
| self.removeFilter(existing_filter) | |
| log_filter = TraceAwareCloudLoggingFilter( | |
| project=project_id, | |
| default_labels=labels, | |
| ) | |
| self.addFilter(log_filter) |
Custom TraceAwareCloudLoggingFilter fixes spanId mismatch where app logs were detached from parent HTTP requests in Cloud Logging.