Discovery splitout #595

@Savid

The problem

The current discovery command does a few things that are conflated and complicated:

  • p2p static mode:
    • discv4/v5 on set of static nodes
    • execution service blindly dials node records discovered to get status records
    • consensus service dials node records matching the upstream beacon node to get status records
  • p2p xatu mode:
    • execution service gets a list of previously discovered status records matching network ids and fork id hashes
      • discv4/v5 on the records
      • blindly dials node records found to get new status records
    • consensus service dials node records matching the upstream beacon node to get status records

As you can see, there is duplicated work and confusing logic about what is actually happening.

Ideally, we want to break this down and simplify:

node discovery:

  • discv4/v5 on set of static nodes
  • discv4/v5 on previously found execution/consensus records

execution service:

  • gets a list of node records to dial
  • only generates events based on optional filters:
    • network id(s)
    • fork id hash(es)

consensus service:

  • requires upstream beacon node(s) to target a network/fork digest
  • gets a list of node records to dial
  • generates events if successfully connected and fork digest matches

Role of the coordinator

The coordinator currently has a few jobs:

  • persistence in postgres:
    • stores all discovered node records (node_record table)
    • stores status records from dialed nodes (node_record_execution/node_record_consensus tables)
  • provides status records to the current discovery command to do discovery and status checking against previously dialed nodes
  • provides records for mimicry clients to dial, while recording which mimicry clients are dialing which nodes (node_record_activity table)
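
For illustration, the records these tables hold can be sketched as Go types. The field names here are hypothetical, not the coordinator's actual schema:

```go
package main

import (
	"fmt"
	"time"
)

// NodeRecord mirrors the node_record table: every record found by discovery.
type NodeRecord struct {
	Enr       string
	FirstSeen time.Time
	LastSeen  time.Time
}

// NodeRecordExecution mirrors node_record_execution: status from a dialed
// execution node.
type NodeRecordExecution struct {
	Enr        string
	NetworkID  uint64
	ForkIDHash string
	DialedAt   time.Time
}

// NodeRecordActivity mirrors node_record_activity: which mimicry client is
// dialing which node.
type NodeRecordActivity struct {
	Enr      string
	ClientID string
	DialedAt time.Time
}

func main() {
	rec := NodeRecordExecution{Enr: "enr:abc", NetworkID: 1, ForkIDHash: "0x12345678"}
	fmt.Println(rec.NetworkID, rec.ForkIDHash)
}
```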

Solution

Split the discovery command into multiple commands:

discovery

This command will do node discovery only. Its job is solely to find records and add them to the node_record table in the coordinator database; it will make no attempt to connect to nodes or verify their status. The command has two jobs:

Static discovery

Continuously iterate over a list of node records, in enode or ENR format, for discovery.

Related config:

```yaml
bootNodes:
  - enode:123
  - enr:abc
```
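
The "continuously iterate" behaviour can be sketched as a simple round-robin iterator over the configured boot nodes (a minimal sketch; the real command drives discv4/discv5 sessions rather than just yielding strings):

```go
package main

import "fmt"

// bootNodeCycle returns a function that yields the configured boot nodes
// round-robin, so discovery can loop over them indefinitely.
func bootNodeCycle(bootNodes []string) func() string {
	i := 0
	return func() string {
		n := bootNodes[i%len(bootNodes)]
		i++
		return n
	}
}

func main() {
	next := bootNodeCycle([]string{"enode:123", "enr:abc"})
	// Argument evaluation in Go is left to right, so this prints the
	// first three nodes in cycle order.
	fmt.Println(next(), next(), next()) // enode:123 enr:abc enode:123
}
```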

Dynamic discovery

It is beneficial to run discovery against known nodes to have greater success in finding similar nodes on the same network.

This job will get records from the node_record_execution/node_record_consensus tables filtered by the configured execution and beacon node upstream.

  • node_record_execution records will be filtered by the execution node fork ID hash (see Fork ID hash)
  • node_record_consensus records will be filtered by the beacon node fork digest
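
A minimal sketch of these two filters, using hypothetical record types in place of the coordinator's actual row types:

```go
package main

import "fmt"

// ExecutionRecord and ConsensusRecord are illustrative stand-ins for rows
// from node_record_execution and node_record_consensus.
type ExecutionRecord struct {
	Enr        string
	ForkIDHash string
}

type ConsensusRecord struct {
	Enr        string
	ForkDigest string
}

// filterExecution keeps only records whose fork ID hash matches the
// upstream execution node's fork ID hash.
func filterExecution(records []ExecutionRecord, upstreamForkIDHash string) []ExecutionRecord {
	var out []ExecutionRecord
	for _, r := range records {
		if r.ForkIDHash == upstreamForkIDHash {
			out = append(out, r)
		}
	}
	return out
}

// filterConsensus keeps only records whose fork digest matches the
// upstream beacon node's fork digest.
func filterConsensus(records []ConsensusRecord, upstreamForkDigest string) []ConsensusRecord {
	var out []ConsensusRecord
	for _, r := range records {
		if r.ForkDigest == upstreamForkDigest {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	recs := []ExecutionRecord{
		{Enr: "enr:aaa", ForkIDHash: "0x12345678"},
		{Enr: "enr:bbb", ForkIDHash: "0x87654321"},
	}
	fmt.Println(len(filterExecution(recs, "0x12345678"))) // 1
}
```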

Related config:

```yaml
ethereum:
  beaconNodeAddress: http://localhost:5052
  executionNodeAddress: http://localhost:8545 # must be erigon node for "erigon_forks" method
  # networkOverride: fusaka-devnet-2 # optional
```

Full config

```yaml
logging: "info" # panic,fatal,warn,info,debug,trace
metricsAddr: ":9090"
# pprofAddr: ":6060" # optional. if supplied it enables pprof server

coordinator:
  address: localhost:8080
  tls: false
  headers:
    authorization: Someb64Value
  maxQueueSize: 51200
  batchTimeout: 5s
  exportTimeout: 30s
  maxExportBatchSize: 512
  concurrentExecutionPeers: 100

# Note: both Node Discovery Protocol v4 and v5 can be enabled at the same time
# enable Node Discovery Protocol v4
discV4: true
# enable Node Discovery Protocol v5
discV5: true
# time between initiating discovery scans, will generate a fresh private key each time
restart: 2m

bootNodes:
  - enode:123
  - enr:abc

ethereum:
  beaconNodeAddress: http://localhost:5052
  executionNodeAddress: http://localhost:8545 # must be erigon node for "erigon_forks" method
  # networkOverride: fusaka-devnet-2 # optional
```

Changes needed from current discovery

We need to add a column to the node_record table to store the fork ID hash of the dynamic discovery upstream execution node (if it was used to find the record). This will be used later in the status command to better filter records to dial: records with a matching fork ID hash take priority over those without one.

status

This command will do only status checking against a node. Its job is to dial nodes, both execution and beacon nodes, to get their status.

It will get records from the node_record table and dial the nodes to get their status. It will then update the coordinator database (node_record_execution/node_record_consensus) with the status and also output the status to the configured outputs.

There are some differences between how execution and beacon nodes are handled:

Beacon nodes

Fortunately, ENRs contain the fork digest of the node, so we can instantly filter the node_record table for beacon nodes to dial.
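
Concretely, the fork digest is the first 4 bytes of the ENR's eth2 entry, which holds an SSZ-encoded ENRForkID (fork_digest, next_fork_version, next_fork_epoch). A minimal sketch of extracting it from the raw entry bytes (ENR decoding itself is omitted, and the example bytes are illustrative):

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// forkDigestFromEth2Entry extracts the 4-byte fork digest from the raw value
// of an ENR's "eth2" key. The value is an SSZ-encoded ENRForkID:
// fork_digest (4 bytes) || next_fork_version (4 bytes) || next_fork_epoch (8 bytes).
func forkDigestFromEth2Entry(entry []byte) (string, error) {
	if len(entry) < 4 {
		return "", fmt.Errorf("eth2 entry too short: %d bytes", len(entry))
	}
	return "0x" + hex.EncodeToString(entry[:4]), nil
}

func main() {
	// Illustrative ENRForkID bytes: digest 6a95a1a9, then version and epoch.
	entry, _ := hex.DecodeString("6a95a1a9040000000000000000000000")
	digest, err := forkDigestFromEth2Entry(entry)
	if err != nil {
		panic(err)
	}
	fmt.Println(digest) // 0x6a95a1a9
}
```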

Once we successfully dial a node, we confirm the fork digest matches and then:

  • update the node_record_consensus coordinator database table
  • output the status event to the configured outputs

Execution nodes

Unfortunately, execution nodes require the node to be dialed to get the fork ID hash (and network ID).

In the discovery command, we've added an additional column to the node_record table to store the fork ID hash of the dynamic discovery upstream execution node (if it was used to find the record). When this command gets a list of records to dial from the coordinator, it will prioritize records that have a fork ID hash match to what is configured.

Once we successfully dial a node, we:

  • update the node_record_execution coordinator database table, even if the fork ID hash doesn't match the configured one, as this can be used to filter future records to dial
  • if the fork ID hash matches, output the status event to the configured outputs
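
The prioritization above can be sketched as a stable sort that moves fork-ID-hash matches to the front (type and field names are illustrative, not the coordinator's actual API):

```go
package main

import (
	"fmt"
	"sort"
)

// DialCandidate is an illustrative stand-in for a node_record row, carrying
// the fork ID hash recorded at discovery time (empty if unknown).
type DialCandidate struct {
	Enr        string
	ForkIDHash string
}

// prioritize orders candidates so that records whose stored fork ID hash
// matches the configured one are dialed first; unknown or mismatched
// records follow in their original order.
func prioritize(candidates []DialCandidate, configuredForkIDHash string) []DialCandidate {
	out := make([]DialCandidate, len(candidates))
	copy(out, candidates)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].ForkIDHash == configuredForkIDHash && out[j].ForkIDHash != configuredForkIDHash
	})
	return out
}

func main() {
	cands := []DialCandidate{
		{Enr: "enr:aaa"},
		{Enr: "enr:bbb", ForkIDHash: "0x12345678"},
		{Enr: "enr:ccc", ForkIDHash: "0x87654321"},
	}
	for _, c := range prioritize(cands, "0x12345678") {
		fmt.Println(c.Enr) // enr:bbb first, then enr:aaa, enr:ccc
	}
}
```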

Full config

```yaml
logging: "info" # panic,fatal,warn,info,debug,trace
metricsAddr: ":9090"
# pprofAddr: ":6060" # optional. if supplied it enables pprof server

coordinator:
  address: localhost:8080
  tls: false
  headers:
    authorization: Someb64Value
  maxQueueSize: 51200
  batchTimeout: 5s
  exportTimeout: 30s
  maxExportBatchSize: 512
  concurrentExecutionPeers: 100

ethereum:
  beaconNodeAddress: http://localhost:5052
  executionNodeAddress: http://localhost:8545 # must be erigon node for "erigon_forks" method
  # networkOverride: fusaka-devnet-2 # optional

outputs:
# - name: local-stdout
#   type: stdout
- name: xatu-server
  type: xatu
  config:
    address: localhost:8080
    tls: false
    headers:
      authorization: Someb64Value
    maxQueueSize: 51200
    batchTimeout: 5s
    exportTimeout: 30s
```

Fork ID hash

To calculate the correct fork ID hash (per EIP-2124), you need the genesis hash and all past fork block heights and timestamps. Erigon provides the erigon_forks method to return this information. Here is a Go example that calculates the fork ID hash:

Golang example:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/hex"
	"encoding/json"
	"flag"
	"fmt"
	"hash/crc32"
	"io"
	"net/http"
	"strings"
)

type JSONRPCRequest struct {
	JSONRPC string        `json:"jsonrpc"`
	Method  string        `json:"method"`
	Params  []interface{} `json:"params"`
	ID      int           `json:"id"`
}

type ForksResult struct {
	Genesis     string `json:"genesis"`
	HeightForks []int  `json:"heightForks"`
	TimeForks   []int  `json:"timeForks"`
}

type JSONRPCResponse struct {
	JSONRPC string      `json:"jsonrpc"`
	ID      int         `json:"id"`
	Result  ForksResult `json:"result"`
	Error   *struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

func checksumUpdate(hash uint32, fork uint64) uint32 {
	var blob [8]byte
	binary.BigEndian.PutUint64(blob[:], fork)
	return crc32.Update(hash, crc32.IEEETable, blob[:])
}

func main() {
	elURL := flag.String("el-url", "http://localhost:8545", "Execution layer URL")
	flag.Parse()

	request := JSONRPCRequest{
		JSONRPC: "2.0",
		Method:  "erigon_forks",
		Params:  []interface{}{},
		ID:      1,
	}

	jsonData, err := json.Marshal(request)
	if err != nil {
		fmt.Printf("Error marshaling request: %v\n", err)
		return
	}

	resp, err := http.Post(*elURL, "application/json", bytes.NewBuffer(jsonData))
	if err != nil {
		fmt.Printf("Error making request: %v\n", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Printf("Error reading response: %v\n", err)
		return
	}

	var rpcResponse JSONRPCResponse
	err = json.Unmarshal(body, &rpcResponse)
	if err != nil {
		fmt.Printf("Error unmarshaling response: %v\n", err)
		return
	}

	if rpcResponse.Error != nil {
		fmt.Printf("RPC Error: %s (code: %d)\n", rpcResponse.Error.Message, rpcResponse.Error.Code)
		return
	}

	// Calculate CRC32 hash of genesis
	genesisHex := strings.TrimPrefix(rpcResponse.Result.Genesis, "0x")
	genesisBytes, err := hex.DecodeString(genesisHex)
	if err != nil {
		fmt.Printf("Error decoding genesis hex: %v\n", err)
		return
	}

	// Start with genesis hash
	hash := crc32.ChecksumIEEE(genesisBytes)

	// Iterate through all heightForks
	for _, fork := range rpcResponse.Result.HeightForks {
		hash = checksumUpdate(hash, uint64(fork))
	}

	// Iterate through all timeForks
	for _, fork := range rpcResponse.Result.TimeForks {
		hash = checksumUpdate(hash, uint64(fork))
	}

	// Output final hash
	fmt.Printf("0x%x\n", hash)
}
```
