Description
Problem
When using apalis with OpenTelemetry/tracing, background jobs executed by workers appear as separate traces rather than being linked to the trace of the HTTP request (or other context) that enqueued them.
Current behavior:
- HTTP request creates a trace with spans for the request handler
- When a job is enqueued, a new trace is created for the worker's task span
- These traces are disconnected in Jaeger/other tracing backends
Expected behavior:
- The task span from the worker should be a child (or linked via span link) of the span that was active when the job was enqueued
- This would allow tracing the full flow from HTTP request → job enqueue → job execution as a single trace
Root Cause
Looking at src/layers/tracing/make_span.rs, the DefaultMakeSpan implementation uses Span::current() at job execution time:
impl<B, Ctx> MakeSpan<B, Ctx> for DefaultMakeSpan {
    fn make_span(&mut self, req: &Request<B, Ctx>) -> Span {
        let span = Span::current(); // This is the current span when the WORKER executes, not when the job was ENQUEUED
        // ...
        tracing::span!(
            parent: span,
            // ...
        )
    }
}

Since the worker executes in a different async context than the original HTTP request handler, Span::current() returns either no span or a different root span, breaking the trace continuity.
Proposed Solution
Support OpenTelemetry context propagation by:
- Capture trace context at enqueue time: Extract trace_id and span_id from the current span when calling enqueue()
- Store with the request: Include trace context in Request<T, Ctx>, either via the Parts.context field or a new dedicated field
- Restore at execution time: Use OpenTelemetrySpanExt::set_parent() to link the task span to the original context
This is the standard pattern for distributed tracing across async boundaries, as described in the OpenTelemetry documentation.
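The capture/store/restore steps above can be sketched with the standard library alone. This is only an illustration of the data flow, not apalis code: TraceContext, Job, enqueue and parent_for are hypothetical names, and the string format is assumed to be the W3C traceparent layout (version-traceid-spanid-flags). A real implementation would more likely go through an OpenTelemetry propagator and hand the restored context to OpenTelemetrySpanExt::set_parent().

```rust
/// Trace context captured at enqueue time (hypothetical type, not part of apalis).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TraceContext {
    pub trace_id: u128,
    pub span_id: u64,
    pub sampled: bool,
}

impl TraceContext {
    /// Serialize in the W3C `traceparent` layout: version-traceid-spanid-flags.
    pub fn to_traceparent(&self) -> String {
        format!(
            "00-{:032x}-{:016x}-{:02x}",
            self.trace_id,
            self.span_id,
            u8::from(self.sampled)
        )
    }

    /// Parse a stored `traceparent` value; returns `None` if it is malformed.
    pub fn from_traceparent(s: &str) -> Option<Self> {
        // Only hex digits and the '-' separator are allowed.
        if !s.bytes().all(|b| b == b'-' || b.is_ascii_hexdigit()) {
            return None;
        }
        let mut parts = s.split('-');
        let version = parts.next()?;
        let trace_id = parts.next()?;
        let span_id = parts.next()?;
        let flags = parts.next()?;
        if parts.next().is_some()
            || version != "00"
            || trace_id.len() != 32
            || span_id.len() != 16
            || flags.len() != 2
        {
            return None;
        }
        let trace_id = u128::from_str_radix(trace_id, 16).ok()?;
        let span_id = u64::from_str_radix(span_id, 16).ok()?;
        let flags = u8::from_str_radix(flags, 16).ok()?;
        // All-zero ids are invalid per the W3C format.
        if trace_id == 0 || span_id == 0 {
            return None;
        }
        Some(Self { trace_id, span_id, sampled: flags & 0x01 == 1 })
    }
}

/// Hypothetical job envelope: the payload plus the serialized trace context.
pub struct Job<T> {
    pub payload: T,
    pub traceparent: Option<String>,
}

/// Step 1 and 2: capture the caller's context and store it with the job.
pub fn enqueue<T>(payload: T, current: Option<&TraceContext>) -> Job<T> {
    Job { payload, traceparent: current.map(TraceContext::to_traceparent) }
}

/// Step 3: on the worker, recover the context to use as the task span's parent.
pub fn parent_for<T>(job: &Job<T>) -> Option<TraceContext> {
    job.traceparent.as_deref().and_then(TraceContext::from_traceparent)
}
```

Storing the context as a plain string keeps it independent of the storage backend, since the job has to survive serialization into Redis or a SQL table before the worker picks it up.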
Minimal Reproduction
See the attached apalis-trace-context-repro.zip.
To reproduce:
- Start Jaeger: docker run -d -p 16686:16686 -p 4317:4317 jaegertracing/all-in-one:latest
- Extract and run: unzip apalis-trace-context-repro.zip && cd apalis-trace-context-repro && cargo run
- Make request: curl -X POST http://localhost:3000/send-email
- View traces in Jaeger at http://localhost:16686
You'll see two separate traces instead of one connected trace.
Alternative: Span Links
For truly async operations where the job might execute long after the request completes, Span Links could be used instead of parent-child relationships. This creates an association without implying the request duration includes the job duration.
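To make the difference concrete, here is a small standard-library model of the two options. It is a sketch of the semantics only, with hypothetical names (SpanRef, Relation, TaskSpan, start_task_span) that belong to neither apalis nor OpenTelemetry: a child span joins the enqueuing trace and records a parent, while a linked span starts its own trace and only keeps a reference back.

```rust
/// Identifies the span that was active when the job was enqueued.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SpanRef {
    pub trace_id: u128,
    pub span_id: u64,
}

/// How the worker's task span relates to the enqueuing span.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Relation {
    ChildOf(SpanRef),
    LinkedTo(SpanRef),
}

/// Simplified view of the span the worker creates for a task.
#[derive(Debug, PartialEq, Eq)]
pub struct TaskSpan {
    pub trace_id: u128,
    pub span_id: u64,
    pub parent_span_id: Option<u64>,
    pub links: Vec<SpanRef>,
}

/// `new_trace_id` is only used for the linked case, where the task is a new root.
pub fn start_task_span(relation: Relation, new_trace_id: u128, new_span_id: u64) -> TaskSpan {
    match relation {
        // Parent-child: same trace, so the job shows up inside the request's trace.
        Relation::ChildOf(parent) => TaskSpan {
            trace_id: parent.trace_id,
            span_id: new_span_id,
            parent_span_id: Some(parent.span_id),
            links: Vec::new(),
        },
        // Link: separate trace with a pointer back to the enqueuing span.
        Relation::LinkedTo(origin) => TaskSpan {
            trace_id: new_trace_id,
            span_id: new_span_id,
            parent_span_id: None,
            links: vec![origin],
        },
    }
}
```

Either way the same captured context is needed at execution time, so the choice between the two could be a configuration option on the tracing layer rather than a change to what is stored with the job.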
Environment
- apalis version: 0.7.4
- Rust version: 1.84
- OS: macOS