Currently, we only track when an Infura RPC endpoint becomes degraded or
unavailable. Now, we would like to have similar insights about custom
RPC endpoints so that we can make better-informed decisions to improve
reliability for other chains. We'd also like to improve the tracking for
Infura endpoints so that we can understand failures better.
This commit updates the handlers for the
`NetworkController:rpcEndpointDegraded` and
`NetworkController:rpcEndpointUnavailable` messenger events so that they
create a Segment event regardless of the type of endpoint. The event now
includes the HTTP status code.
While making these changes, we noticed that the sampling rate for
these Segment events was incorrect: it was 10% when it should have been
1%. This has also been corrected, so that we don't store more data in
Segment and our downstream services than necessary.
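For illustration only, the behaviour described above could be sketched roughly as follows. This is not the code in this PR: the function names, the `track` callback, the injected `random` parameter, and the use of a simple random draw for the 1% sampling are all assumptions. The property names (`chain_id_caip`, `rpc_endpoint_url`, `http_status`) and the host-only endpoint value are taken from the expected log output in the manual testing steps.

```typescript
// Hypothetical sketch; names and structure are assumptions, not the PR's code.

type RpcEndpointEventName = "RPC Service Degraded" | "RPC Service Unavailable";

interface RpcEndpointEventProperties {
  chain_id_caip: string;
  rpc_endpoint_url: string;
  http_status?: number;
}

// Assumed sampling rate: 1%, as stated in the description.
const SAMPLING_RATE = 0.01;

// Builds the event properties from a hex chain ID and a full endpoint URL.
// Only the host is kept, matching the sample output in the testing steps.
function buildRpcEndpointEventProperties(
  chainIdHex: string,
  endpointUrl: string,
  httpStatus?: number,
): RpcEndpointEventProperties {
  const properties: RpcEndpointEventProperties = {
    chain_id_caip: `eip155:${parseInt(chainIdHex, 16)}`,
    rpc_endpoint_url: new URL(endpointUrl).hostname,
  };
  if (httpStatus !== undefined) {
    properties.http_status = httpStatus;
  }
  return properties;
}

// Creates the event for any endpoint type (Infura or custom), but only for
// roughly SAMPLING_RATE of calls. `random` is injectable so that the
// sampling decision can be made deterministic in tests.
function handleRpcEndpointEvent(
  name: RpcEndpointEventName,
  input: { chainIdHex: string; endpointUrl: string; httpStatus?: number },
  track: (name: string, properties: RpcEndpointEventProperties) => void,
  random: () => number = Math.random,
): boolean {
  if (random() >= SAMPLING_RATE) {
    return false; // not sampled, nothing is sent
  }
  track(
    name,
    buildRpcEndpointEventProperties(
      input.chainIdHex,
      input.endpointUrl,
      input.httpStatus,
    ),
  );
  return true;
}
```

In this sketch the handler does not branch on whether the endpoint is Infura or custom, which mirrors the "regardless of the type of endpoint" behaviour. The real handlers may select the sample differently (for example, per user rather than per event).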
Changelog
CHANGELOG entry: null
Related issues
Closes #17089.
Manual testing steps
1. Run `yarn setup:expo`, then run `yarn watch:clean`.
2. Open `node_modules/@metamask/network-controller/dist/rpc-service/rpc-service.cjs`, look for `async function _RpcService_processRequest` and make these changes:

   ```diff
    async function _RpcService_processRequest(fetchOptions) {
        let response;
        try {
            return await __classPrivateFieldGet(this, _RpcService_policy, "f").execute(async () => {
   +            console.log('[REQUEST]', this.endpointUrl.toString(), 'with', fetchOptions);
   +            if (
   +                this.endpointUrl.toString().includes("linea-mainnet.infura.io") ||
   +                this.endpointUrl.toString().includes("mainnet.era.zksync.io")
   +            ) {
   +                console.log('[RESPONSE]', this.endpointUrl.toString(), '=> 502');
   +                throw new controller_utils_1.HttpError(502);
   +            }
                response = await __classPrivateFieldGet(this, _RpcService_fetch, "f").call(this, this.endpointUrl, fetchOptions);
   +            console.log('[RESPONSE]', this.endpointUrl.toString(), '=>', response.status);
                if (!response.ok) {
                    throw new controller_utils_1.HttpError(response.status);
                }
                return await response.json();
            });
        }
   ```

3. Open `app/core/Engine/Engine.ts`, look for `new NetworkController`, and make these changes:

   ```diff
            return {
              ...commonOptions,
              policyOptions: {
                maxRetries,
   -            maxConsecutiveFailures: (maxRetries + 1) * 7,
   +            maxConsecutiveFailures: (maxRetries + 1) * 4,
              },
            };
          },
          additionalDefaultNetworks,
        };
        const networkController = new NetworkController(networkControllerOptions);
   ```

4. You should see `Creating Segment event "RPC Service Degraded" with {"chain_id_caip":"eip155:59144","rpc_endpoint_url":"linea-mainnet.infura.io","http_status":500}`. After about 10 seconds or so, you should see `Creating Segment event "RPC Service Unavailable" with {"chain_id_caip":"eip155:59144","rpc_endpoint_url":"linea-mainnet.infura.io","http_status":500}`.
5. The same events should appear with `{"chain_id_caip":"eip155:324","rpc_endpoint_url":"mainnet.era.zksync.io","http_status":500}`.
6. Add a network with `https://flare-api.flare.network/ext/C/rpc` as the RPC endpoint.
7. Update `node_modules/@metamask/network-controller/dist/rpc-service/rpc-service.cjs` again with:

   ```diff
    async function _RpcService_processRequest(fetchOptions) {
        let response;
        try {
            return await __classPrivateFieldGet(this, _RpcService_policy, "f").execute(async () => {
                console.log('[REQUEST]', this.endpointUrl.toString(), 'with', fetchOptions);
                if (
                    this.endpointUrl.toString().includes("linea-mainnet.infura.io") ||
   -                this.endpointUrl.toString().includes("mainnet.era.zksync.io")
   +                this.endpointUrl.toString().includes("mainnet.era.zksync.io") ||
   +                this.endpointUrl.toString().includes("flare-api.flare.network")
                ) {
                    console.log('[RESPONSE]', this.endpointUrl.toString(), '=> 502');
                    throw new controller_utils_1.HttpError(502);
                }
                response = await __classPrivateFieldGet(this, _RpcService_fetch, "f").call(this, this.endpointUrl, fetchOptions);
                console.log('[RESPONSE]', this.endpointUrl.toString(), '=>', response.status);
                if (!response.ok) {
                    throw new controller_utils_1.HttpError(response.status);
                }
                return await response.json();
            });
        }
   ```

8. The same events should now also appear with `{"chain_id_caip":"eip155:14","rpc_endpoint_url":"flare-api.flare.network","http_status":500}`.

Screenshots/Recordings
(N/A)
Before
After
Pre-merge author checklist
Pre-merge reviewer checklist