Open
Description
What needs to happen?
In ReadFromKafkaDoFn.java, a Guava Stopwatch is currently used to measure the latency of consumer.poll() and report it to the RpcLatency metric.
As noted in an existing TODO comment in the codebase, this timer uses System.nanoTime(). When the consumer already holds prefetched records that poll() can return immediately, the overhead of calling System.nanoTime() can exceed the latency it is measuring (see nanotrusting-nanotime).
To fix this, we should:
- Remove the pollTimer (Stopwatch) from the while loop in ReadFromKafkaDoFn.
- Stop manually reporting to updateSuccessfulRpcMetrics in this hot path, and instead rely on Kafka's native fetch-latency-avg JMX metric, which users can already monitor.
- Replace the remainingTimeout calculation with a lower-overhead System.currentTimeMillis() check, so that we still respect consumerPollingTimeout without the nanoTime penalty.
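A rough sketch of what the deadline-based loop in the last step could look like is shown below. It is not the actual ReadFromKafkaDoFn code: the class PollDeadlineSketch, the method pollUntilDeadline, and the poll function (a stand-in for consumer.poll(Duration) that returns a list of strings instead of ConsumerRecords) are all hypothetical names chosen for illustration, and the real loop has additional exit conditions that are omitted here.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; not part of Beam's KafkaIO.
class PollDeadlineSketch {

  /**
   * Polls until records arrive or the overall timeout elapses.
   *
   * @param poll hypothetical stand-in for consumer.poll(Duration)
   * @param consumerPollingTimeout total time budget across all poll attempts
   */
  static List<String> pollUntilDeadline(
      Function<Duration, List<String>> poll, Duration consumerPollingTimeout) {
    // Compute the deadline once, instead of starting a Stopwatch
    // (System.nanoTime()) around every poll call.
    final long deadlineMillis =
        System.currentTimeMillis() + consumerPollingTimeout.toMillis();
    long remainingMillis = consumerPollingTimeout.toMillis();

    List<String> records = new ArrayList<>();
    while (records.isEmpty() && remainingMillis > 0) {
      // Each attempt only gets the time that is left in the budget.
      records = poll.apply(Duration.ofMillis(remainingMillis));
      // One wall-clock read per iteration; no latency metric is reported here.
      remainingMillis = deadlineMillis - System.currentTimeMillis();
    }
    return records;
  }

  public static void main(String[] args) {
    // Fake poller: empty on the first two calls, then one record.
    int[] calls = {0};
    List<String> records =
        pollUntilDeadline(
            timeout -> ++calls[0] < 3 ? List.<String>of() : List.<String>of("record-1"),
            Duration.ofSeconds(2));
    System.out.println(records + " after " + calls[0] + " poll attempts");
  }
}
```

One trade-off to keep in mind: System.currentTimeMillis() reads the wall clock, which is not guaranteed to be monotonic, so a clock adjustment during the loop could shorten or lengthen the effective timeout. For a coarse polling budget this is likely acceptable, but it is a behavioral difference from the Stopwatch-based version.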
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam YAML
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Infrastructure
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner