Describe the bug
A severe Unintentional ThreadLocal Leak (UTL) - specifically Type III: Context/Identity Pollution - exists in com.nike.wingtips.Tracer. The tracing context is bound to a ThreadLocal but lacks a fail-safe cleanup mechanism during abnormal execution flows.
Root Cause & Location
- Location:
private static final ThreadLocal<Deque<Span>> currentSpanStackThreadLocal = new ThreadLocal<>();
- Mechanism: The system relies heavily on developers manually invoking completeRequestSpan() or completeSubSpan() (ideally in finally blocks). If an unhandled exception occurs in the application logic and bypasses these cleanup calls, the Deque<Span> remains indefinitely bound to the thread.
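The mechanism can be reproduced without Wingtips at all. The following is a minimal sketch using only the JDK: `SPAN_STACK`, `startSpan`, `completeSpan`, and `observeLeak` are hypothetical stand-ins (spans are plain strings), not the library's actual API. A single-thread pool plays the role of a reused Tomcat or Netty worker.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SpanStackLeakDemo {
    // Stand-in for Tracer's currentSpanStackThreadLocal (assumption: spans modeled as strings).
    private static final ThreadLocal<Deque<String>> SPAN_STACK = new ThreadLocal<>();

    static void startSpan(String name) {
        Deque<String> stack = SPAN_STACK.get();
        if (stack == null) {
            stack = new ArrayDeque<>();
            SPAN_STACK.set(stack);
        }
        stack.push(name);
    }

    static void completeSpan() {
        Deque<String> stack = SPAN_STACK.get();
        if (stack != null) {
            stack.pop();
            if (stack.isEmpty()) {
                SPAN_STACK.remove();
            }
        }
    }

    /** Returns the span stack that a second task sees on the reused worker thread. */
    static String observeLeak() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Request 1: starts a span, then fails before completeSpan() is reached.
            Runnable failingHandler = () -> {
                startSpan("request-1");
                throw new IllegalStateException("handler failed");
            };
            try {
                pool.submit(failingHandler).get();
            } catch (ExecutionException expected) {
                // The failure is captured by the Future; the worker thread stays alive.
            }
            // Request 2: runs on the same thread and inspects the inherited context.
            Callable<String> nextHandler = () -> String.valueOf(SPAN_STACK.get());
            return pool.submit(nextHandler).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(observeLeak()); // prints "[request-1]"
    }
}
```

The second task never started a span, yet it sees `request-1` on its stack, which is the stale-context inheritance described above.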
Impact
In thread-pool environments (like Tomcat or Netty), this leads to catastrophic observability failures:
- Trace Topology Corruption: When a "dirty" thread is reused for a new incoming request, the subsequent request incorrectly inherits the stale trace context.
- Unbounded Garbage Data: Unrelated requests keep getting attached to a single leaked Trace ID, creating massive, deeply nested "garbage traces".
- APM Backend Crash: These artificially bloated trace trees can quickly overwhelm and crash downstream observability platforms (e.g., memory exhaustion in collector nodes or storage backends).
Suggested Fix
Provide a guaranteed, framework-level interceptor/filter (e.g., a Servlet Filter or Netty ChannelHandler) that strictly enforces currentSpanStackThreadLocal.remove() in a finally block at the absolute boundary of the request lifecycle, rather than relying exclusively on application-level developer compliance.
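A rough sketch of the boundary cleanup, again using only the JDK. `withCleanup` is a hypothetical wrapper standing in for a Servlet Filter or Netty ChannelHandler, and `SPAN_STACK` is a stand-in for the tracer's ThreadLocal. A real filter would live inside the library, where the private field is accessible.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BoundaryCleanupDemo {
    // Stand-in for Tracer's currentSpanStackThreadLocal (assumption: spans modeled as strings).
    private static final ThreadLocal<Deque<String>> SPAN_STACK = new ThreadLocal<>();

    static void startSpan(String name) {
        Deque<String> stack = SPAN_STACK.get();
        if (stack == null) {
            stack = new ArrayDeque<>();
            SPAN_STACK.set(stack);
        }
        stack.push(name);
    }

    /** Hypothetical request-boundary wrapper: always clears the ThreadLocal, even on failure. */
    static Runnable withCleanup(Runnable handler) {
        return () -> {
            try {
                handler.run();
            } finally {
                SPAN_STACK.remove();
            }
        };
    }

    /** Returns true if a task following a failed request sees no inherited span stack. */
    static boolean nextRequestIsClean() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Request 1: fails mid-span, but the wrapper's finally block still runs.
            Runnable failingHandler = withCleanup(() -> {
                startSpan("request-1");
                throw new IllegalStateException("handler failed");
            });
            try {
                pool.submit(failingHandler).get();
            } catch (ExecutionException expected) {
                // The handler's exception still propagates to the caller after cleanup.
            }
            // Request 2: same worker thread, checks that nothing was left behind.
            return pool.submit(() -> SPAN_STACK.get() == null).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(nextRequestIsClean()); // prints "true"
    }
}
```

Placing `remove()` at the outermost boundary means the cleanup does not depend on every handler pairing its start and complete calls correctly, and the original exception is still rethrown for normal error handling.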