Support trace context propagation from job producer to consumer #664

@IgnisDa

Problem

When using apalis with OpenTelemetry/tracing, background jobs executed by workers appear as separate traces rather than being linked to the trace of the HTTP request (or other context) that enqueued them.

Current behavior:

  • HTTP request creates a trace with spans for the request handler
  • When a job is enqueued, a new trace is created for the worker's task span
  • These traces are disconnected in Jaeger/other tracing backends

Expected behavior:

  • The task span from the worker should be a child (or linked via span link) of the span that was active when the job was enqueued
  • This would allow tracing the full flow from HTTP request → job enqueue → job execution as a single trace

Root Cause

Looking at src/layers/tracing/make_span.rs, the DefaultMakeSpan implementation uses Span::current() at job execution time:

impl<B, Ctx> MakeSpan<B, Ctx> for DefaultMakeSpan {
    fn make_span(&mut self, req: &Request<B, Ctx>) -> Span {
        let span = Span::current();  // This is the current span when the WORKER executes, not when the job was ENQUEUED
        // ...
        tracing::span!(
            parent: span,
            // ...
        )
    }
}

Since the worker executes in a different async context than the original HTTP request handler, Span::current() returns either no span or a different root span, breaking the trace continuity.

Proposed Solution

Support OpenTelemetry context propagation by:

  1. Capture trace context at enqueue time: Extract trace_id and span_id from the current span when calling enqueue()
  2. Store with the request: Include trace context in Request<T, Ctx> - either via the Parts.context field or a new dedicated field
  3. Restore at execution time: Use OpenTelemetrySpanExt::set_parent() to link the task span to the original context
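The three steps above can be sketched without any tracing dependency. The example below is only an illustration under stated assumptions: `TraceContext`, `inject`, `extract`, and the `metadata` map are hypothetical names, not apalis types, and the map stands in for wherever the request context is persisted. It serializes the enqueuing span's identity in the W3C `traceparent` format and recovers it on the worker side. In a real integration, the recovered context is what would be handed to `OpenTelemetrySpanExt::set_parent()`.

```rust
use std::collections::HashMap;

/// Hypothetical carrier for the enqueuing span's identity (not an apalis type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct TraceContext {
    trace_id: u128,
    span_id: u64,
    sampled: bool,
}

fn is_hex(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| b.is_ascii_hexdigit())
}

impl TraceContext {
    /// Serialize as a W3C `traceparent` value: version-traceid-spanid-flags.
    fn to_traceparent(&self) -> String {
        format!("00-{:032x}-{:016x}-{:02x}", self.trace_id, self.span_id, self.sampled as u8)
    }

    /// Parse a version-00 `traceparent` value; returns None on malformed input.
    /// (Lenient about hex case, which the W3C spec requires to be lowercase.)
    fn from_traceparent(header: &str) -> Option<Self> {
        let parts: Vec<&str> = header.split('-').collect();
        if parts.len() != 4 || parts[0] != "00" {
            return None;
        }
        if !is_hex(parts[1], 32) || !is_hex(parts[2], 16) || !is_hex(parts[3], 2) {
            return None;
        }
        let trace_id = u128::from_str_radix(parts[1], 16).ok()?;
        let span_id = u64::from_str_radix(parts[2], 16).ok()?;
        let flags = u8::from_str_radix(parts[3], 16).ok()?;
        // All-zero IDs are invalid per the W3C Trace Context spec.
        if trace_id == 0 || span_id == 0 {
            return None;
        }
        Some(Self { trace_id, span_id, sampled: flags & 1 == 1 })
    }
}

/// Steps 1 and 2: at enqueue time, store the context next to the job payload.
fn inject(ctx: &TraceContext, metadata: &mut HashMap<String, String>) {
    metadata.insert("traceparent".to_string(), ctx.to_traceparent());
}

/// Step 3: at execution time, recover the context before creating the task span.
fn extract(metadata: &HashMap<String, String>) -> Option<TraceContext> {
    metadata.get("traceparent").and_then(|h| TraceContext::from_traceparent(h))
}

fn main() {
    let enqueued = TraceContext {
        trace_id: 0x4bf92f3577b34da6a3ce929d0e0e4736,
        span_id: 0x00f067aa0ba902b7,
        sampled: true,
    };
    let mut metadata = HashMap::new();
    inject(&enqueued, &mut metadata);
    println!("stored: {}", metadata["traceparent"]);

    // On the worker side, the same context comes back out of the stored metadata.
    assert_eq!(extract(&metadata), Some(enqueued));
}
```

Storing the context as a single string keeps it backend-agnostic: any storage that can persist the job payload can persist one extra text field, and a job enqueued without a context simply yields `None` and falls back to today's behavior.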

This is the standard pattern for distributed tracing across async boundaries, as described in the OpenTelemetry documentation.

Minimal Reproduction

See the attached apalis-trace-context-repro.zip.

To reproduce:

  1. Start Jaeger: docker run -d -p 16686:16686 -p 4317:4317 jaegertracing/all-in-one:latest
  2. Extract and run: unzip apalis-trace-context-repro.zip && cd apalis-trace-context-repro && cargo run
  3. Make request: curl -X POST http://localhost:3000/send-email
  4. View traces in Jaeger at http://localhost:16686

You'll see two separate traces instead of one connected trace.

Alternative: Span Links

For truly async operations where the job might execute long after the request completes, span links could be used instead of a parent-child relationship. A link associates the task span with the enqueuing span without implying that the request's duration includes the job's duration.
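To make the difference concrete, here is a toy model of the two options. `SpanRecord`, `child_of`, and `linked_to` are hypothetical names for illustration only, not OpenTelemetry or apalis APIs. With a parent-child relationship the task span joins the enqueuing trace. With a link it roots its own trace and only carries a reference back to the enqueuing span.

```rust
/// Toy span record; field names are illustrative, not an OpenTelemetry API.
#[derive(Debug)]
struct SpanRecord {
    trace_id: u128,
    span_id: u64,
    parent_span_id: Option<u64>,
    /// (trace_id, span_id) pairs of spans this span is linked to.
    links: Vec<(u128, u64)>,
}

/// Parent-child: the task span reuses the enqueuing trace ID and names the
/// enqueuing span as its parent.
fn child_of(enqueue: (u128, u64), new_span_id: u64) -> SpanRecord {
    SpanRecord {
        trace_id: enqueue.0,
        span_id: new_span_id,
        parent_span_id: Some(enqueue.1),
        links: Vec::new(),
    }
}

/// Span link: the task span starts a new trace with no parent and records
/// the enqueuing span only as a link.
fn linked_to(enqueue: (u128, u64), new_trace_id: u128, new_span_id: u64) -> SpanRecord {
    SpanRecord {
        trace_id: new_trace_id,
        span_id: new_span_id,
        parent_span_id: None,
        links: vec![enqueue],
    }
}

fn main() {
    let enqueue_span = (0xabc_u128, 0x1_u64);

    let child = child_of(enqueue_span, 0x2);
    assert_eq!(child.trace_id, enqueue_span.0); // same trace as the request

    let linked = linked_to(enqueue_span, 0xdef, 0x3);
    assert_ne!(linked.trace_id, enqueue_span.0); // separate trace
    assert_eq!(linked.links, vec![enqueue_span]); // still navigable from the job

    println!("{child:?}\n{linked:?}");
}
```

Either option needs the same input, namely the enqueuing span's trace and span IDs stored with the job, so the choice between them could plausibly be left as a configuration option on the tracing layer.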

Environment

  • apalis version: 0.7.4
  • Rust version: 1.84
  • OS: macOS

Labels: enhancement