The rocketmq_group_diff metric collected by rocketmq-exporter has a large number of gaps in its data; some series get only one data point over several days. Grafana screenshot:

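Possibly related errors from the rocketmq-exporter startup log:
(The error message roughly points to a failure communicating with the broker. If there were a real network problem, the business side would have been affected long ago, but so far only the monitoring data is incomplete, so I don't know how to follow up.)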
[2025-11-03 10:39:55.140] ERROR get topic's(paas_oplog_****) consumer-stats(oplog-****-***) exception
org.apache.rocketmq.remoting.exception.RemotingSendRequestException: send request to <172.17.41.89:10911> failed
at org.apache.rocketmq.remoting.netty.NettyRemotingAbstract.invokeSyncImpl(NettyRemotingAbstract.java:441)
at org.apache.rocketmq.remoting.netty.NettyRemotingClient.invokeSync(NettyRemotingClient.java:390)
at org.apache.rocketmq.client.impl.MQClientAPIImpl.getConsumeStats(MQClientAPIImpl.java:1220)
at org.apache.rocketmq.tools.admin.DefaultMQAdminExtImpl.examineConsumeStats(DefaultMQAdminExtImpl.java:315)
at org.apache.rocketmq.tools.admin.DefaultMQAdminExt.examineConsumeStats(DefaultMQAdminExt.java:258)
at org.apache.rocketmq.exporter.service.client.MQAdminExtImpl.examineConsumeStats(MQAdminExtImpl.java:232)
at org.apache.rocketmq.exporter.task.MetricsCollectTask.collectConsumerOffset(MetricsCollectTask.java:336)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:95)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
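To double-check the network assumption, a plain TCP connect to the broker address in the trace can be tried from the exporter host. This is a minimal sketch, assuming Python 3 is available on that host; `probe` is a hypothetical helper and not part of rocketmq-exporter. A successful connect only shows that the port is reachable. It does not exercise the RocketMQ remoting protocol, so it cannot rule out request-level failures.

```python
import socket
import time


def probe(host: str, port: int, timeout: float = 3.0):
    """Attempt one TCP connect and return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        # create_connection resolves the address and connects within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers refused connections, timeouts and unreachable hosts.
        return False, time.monotonic() - start, str(exc)
```

Calling `probe("172.17.41.89", 10911)` several times, ideally around the moments when the exception is logged, would show whether connects fail or become slow from the exporter host specifically.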
Besides this, there are many other errors:
[2025-11-03 11:09:20.003] WARN ClientMetricTask-exception.ignore. group=paas-****-*****-consumer,client id=10.128.217.24@172.17.41.79:9876;172.17.41.80:9876, client addr=172.17.45.10:55377, language=JAVA,version=477
org.apache.rocketmq.remoting.exception.RemotingSendRequestException: send request to <172.17.41.99:10911> failed
at org.apache.rocketmq.remoting.netty.NettyRemotingAbstract.invokeSyncImpl(NettyRemotingAbstract.java:441)
at org.apache.rocketmq.remoting.netty.NettyRemotingClient.invokeSync(NettyRemotingClient.java:390)
at org.apache.rocketmq.client.impl.MQClientAPIImpl.getConsumerRunningInfo(MQClientAPIImpl.java:1917)
at org.apache.rocketmq.tools.admin.DefaultMQAdminExtImpl.getConsumerRunningInfo(DefaultMQAdminExtImpl.java:842)
at org.apache.rocketmq.tools.admin.DefaultMQAdminExt.getConsumerRunningInfo(DefaultMQAdminExt.java:469)
at org.apache.rocketmq.exporter.service.client.MQAdminExtImpl.getConsumerRunningInfo(MQAdminExtImpl.java:407)
at org.apache.rocketmq.exporter.task.ClientMetricTaskRunnable.run(ClientMetricTaskRunnable.java:64)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
[2025-11-03 11:09:20.006] INFO closeChannel: close the connection to remote address[172.17.41.99:10911] result: true
[2025-11-03 11:09:20.007] WARN ClientMetricTask-exception.ignore. group=oplog-***-***,client id=10.120.241.231@172.17.41.79:9876;172.17.41.80:9876, client addr=172.17.5.14:58456, language=JAVA,version=477
org.apache.rocketmq.remoting.exception.RemotingSendRequestException: send request to <172.17.41.99:10911> failed
at org.apache.rocketmq.remoting.netty.NettyRemotingAbstract.invokeSyncImpl(NettyRemotingAbstract.java:441)
at org.apache.rocketmq.remoting.netty.NettyRemotingClient.invokeSync(NettyRemotingClient.java:390)
at org.apache.rocketmq.client.impl.MQClientAPIImpl.getConsumerRunningInfo(MQClientAPIImpl.java:1917)
at org.apache.rocketmq.tools.admin.DefaultMQAdminExtImpl.getConsumerRunningInfo(DefaultMQAdminExtImpl.java:842)
at org.apache.rocketmq.tools.admin.DefaultMQAdminExt.getConsumerRunningInfo(DefaultMQAdminExt.java:469)
at org.apache.rocketmq.exporter.service.client.MQAdminExtImpl.getConsumerRunningInfo(MQAdminExtImpl.java:407)
at org.apache.rocketmq.exporter.task.ClientMetricTaskRunnable.run(ClientMetricTaskRunnable.java:64)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
[2025-11-03 11:09:25.003] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-k, name srv= ["172.17.41.80:9876"]
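Each stack trace above names the broker that the failed request was sent to (172.17.41.89:10911 and 172.17.41.99:10911 in these excerpts). Counting those addresses over the full log might show whether the failures cluster on a few brokers or are spread across all of them. A rough sketch, assuming the log keeps the `send request to <addr> failed` wording shown above; `count_failed_brokers` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the message of RemotingSendRequestException as it appears in the log.
FAILED = re.compile(r"send request to <([^>]+)> failed")


def count_failed_brokers(lines):
    """Count failed send requests per broker address."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example use (the log path is an assumption):
# with open("exporter.log", encoding="utf-8") as f:
#     for addr, n in count_failed_brokers(f).most_common():
#         print(addr, n)
```

If one or two addresses dominate the counts, those brokers would be the first place to look; an even spread would point more toward the exporter side.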
[2025-11-03 10:24:23.454] INFO Completed initialization in 1 ms
[2025-11-03 10:25:15.000] INFO broker stats collection task starting....
[2025-11-03 10:25:15.000] INFO broker runtime stats collection task starting....
[2025-11-03 10:25:15.000] INFO consumer offset collection task starting....
[2025-11-03 10:25:15.001] INFO broker topic stats collection task starting....
[2025-11-03 10:25:15.001] INFO producer metric collection task starting....
[2025-11-03 10:25:15.639] INFO broker runtime stats collection task finished....639
[2025-11-03 10:25:15.639] INFO topic offset collection task starting....
[2025-11-03 10:25:15.644] INFO broker stats collection task finished....644
[2025-11-03 10:25:16.554] WARN collectTopicOffset-getting topic(%RETRY%oplog-object-change) stats error. the namesrv address is ["172.17.41.80:9876"]
[2025-11-03 10:25:20.079] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-j, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:20.079] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-j, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:20.081] INFO closeChannel: close the connection to remote address[172.17.41.99:10911] result: true
[2025-11-03 10:25:25.085] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-k, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:25.085] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-k, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:25.085] INFO closeChannel: close the connection to remote address[172.17.41.101:10911] result: true
[2025-11-03 10:25:30.086] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-h, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:30.086] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-h, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:30.089] INFO closeChannel: close the connection to remote address[172.17.41.95:10911] result: true
[2025-11-03 10:25:30.089] INFO closeChannel: close the connection to remote address[172.17.41.95:10911] result: true
[2025-11-03 10:25:35.088] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-i, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:35.088] INFO closeChannel: close the connection to remote address[172.17.41.97:10911] result: true
[2025-11-03 10:25:35.088] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-i, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:40.089] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-f, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:40.089] INFO closeChannel: close the connection to remote address[172.17.41.91:10911] result: true
[2025-11-03 10:25:40.089] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-f, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:45.089] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-g, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:45.090] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-g, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:45.096] INFO closeChannel: close the connection to remote address[172.17.41.93:10911] result: true
[2025-11-03 10:25:45.495] INFO topic offset collection task finished....29856
[2025-11-03 10:25:50.090] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-d, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:50.090] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-d, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:50.091] INFO closeChannel: close the connection to remote address[172.17.41.87:10911] result: true
[2025-11-03 10:25:55.092] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-e, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:55.092] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-e, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:25:55.094] INFO closeChannel: close the connection to remote address[172.17.41.89:10911] result: true
[2025-11-03 10:26:00.092] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-b, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:26:00.093] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-b, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:26:00.093] INFO closeChannel: close the connection to remote address[172.17.41.83:10911] result: true
[2025-11-03 10:26:05.096] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-c, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:26:05.096] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-c, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:26:05.285] INFO closeChannel: close the connection to remote address[172.17.41.85:10911] result: true
[2025-11-03 10:26:10.096] WARN collectProducer. should not be here. cluster=**-****, brokerName=broker-a, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:26:10.097] WARN collectProducer. there are no producers in cluster=**-****, brokerName=broker-a, name srv= ["172.17.41.80:9876"]
[2025-11-03 10:26:10.097] INFO closeChannel: close the connection to remote address[172.17.41.81:10911] result: true
[2025-11-03 10:26:15.000] INFO topic offset collection task starting....
[2025-11-03 10:26:15.000] INFO broker runtime stats collection task starting....
[2025-11-03 10:26:15.045] INFO broker runtime stats collection task finished....44
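In the startup excerpt above, the number after `finished....` appears to be the elapsed time in milliseconds (the topic offset task starts at 10:25:15.639 and finishes at 10:25:45.495, logged as 29856). The consumer offset task that starts at 10:25:15.000 has no matching `finished` line in the excerpt, although the excerpt may simply be cut short. Since the first stack trace comes from `MetricsCollectTask.collectConsumerOffset`, it may be worth checking over the full log whether that task completes at all. A sketch of such a check, assuming the `starting`/`finished` line format shown above; `unfinished_tasks` is a hypothetical helper name.

```python
import re

# Matches lines such as:
# [2025-11-03 10:25:15.639] INFO broker runtime stats collection task finished....639
TASK_LINE = re.compile(
    r"^\[(?P<ts>[\d-]+ [\d:.]+)\] INFO (?P<task>.+?) collection task "
    r"(?P<state>starting|finished)\.{4}(?P<ms>\d*)"
)


def unfinished_tasks(lines):
    """Return {task: start_timestamp} for tasks whose latest start has no later finish."""
    pending = {}
    for line in lines:
        match = TASK_LINE.match(line)
        if not match:
            continue
        if match["state"] == "starting":
            pending[match["task"]] = match["ts"]
        else:
            pending.pop(match["task"], None)
    return pending
```

A task that stays in the result for a long stretch of the log would be one whose run takes much longer than the interval at which it is started, which could leave gaps in the metrics it is responsible for.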
How can we solve this problem? Or which directions should we investigate?