Mesh Node Reset: Network DB and Library State out of sync when Peer disconnects before ConfigNodeResetStatus #635

@mpiffari

Description

We are observing a race condition during the Node Reset procedure. While the command is correctly sent, some devices terminate the GATT connection (GATT CONN TERMINATE PEER USER) before the SDK can receive or process the expected ConfigNodeResetStatus (Opcode 804A).

The Core Problem: Mesh State Alignment

The primary issue is the synchronization between the physical device state and the internal Library/Database state.

When the connection is lost immediately after sending the reset command:

  1. The SDK does not trigger onMeshMessageReceived for the Reset Status.
  2. The Node remains present in the Mesh Network Database, even though the physical hardware has already performed a factory reset and is no longer part of the network.
  3. This leads to an inconsistent network topology where the App believes a node is still provisioned, but the node is actually in an unprovisioned state.

We are seeking guidance on the best practice for manually realigning the Mesh Network and the DB when the connection drops during the Reset step and the status PDU is never received.


Comparative Logs

1. Correct Flow (Successful Alignment)

The node sends the status PDU (804A) before resetting. The library updates the DB and notifies the application layer.

// Send Reset Command (Opcode 8049)
[MeshTransport] Access message opcode: 8049
[AccessLayer] Created Access PDU 8049
[MeshManagerApi] MeshNetwork pdu sent: 0x0063D200863EE0B15F6D613A98A984A22D57DB3CA7
[MESH] MeshStatusCallbacks - onMeshMessageProcessed - Destination address: 2

// Receive Response (Opcode 804A) - DB Alignment happens here
[BLE] onCharacteristicChanged: [DEVICE_ID] - charact: 00002ADE-...
[MeshManagerApi] Received network pdu: 0x00633E42E0715E776EC3FF08BF0CCB72525A34AE49
[AccessLayer] Received Access PDU 804A
[MESH] MeshStatusCallbacks - onMeshMessageReceived - Source address: 2
[APP] onProcedureChanged - STEP: Reset - State Completed
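
For anyone matching the log lines to the opcodes: both 8049 and 804A are two-octet access opcodes, because the first octet (0x80) has its top two bits set to 10. The sketch below shows that classification on a decrypted access payload. The class and method names are ours and are not part of the SDK; the network PDUs printed in the logs are encrypted, so this cannot be applied to them directly.

```java
// Hypothetical helper (not an SDK class): classifies Bluetooth Mesh access opcodes.
final class AccessOpcode {
    static final int CONFIG_NODE_RESET = 0x8049;
    static final int CONFIG_NODE_RESET_STATUS = 0x804A;

    // Number of octets the opcode occupies, decided by the top bits of the first octet.
    static int lengthOf(int firstOctet) {
        int b = firstOctet & 0xFF;
        if ((b & 0x80) == 0) return 1; // 0b0xxxxxxx (0x7F is reserved and not handled specially here)
        if ((b & 0x40) == 0) return 2; // 0b10xxxxxx, e.g. 0x8049 and 0x804A
        return 3;                      // 0b11xxxxxx, vendor-specific opcode
    }

    // Reads the raw opcode octets from the start of a decrypted access payload,
    // packed big-endian into an int.
    static int read(byte[] accessPayload) {
        if (accessPayload.length == 0) {
            throw new IllegalArgumentException("empty access payload");
        }
        int length = lengthOf(accessPayload[0]);
        if (accessPayload.length < length) {
            throw new IllegalArgumentException("payload shorter than opcode");
        }
        int opcode = 0;
        for (int i = 0; i < length; i++) {
            opcode = (opcode << 8) | (accessPayload[i] & 0xFF);
        }
        return opcode;
    }
}
```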

2. Error Flow

The node disconnects at 13:20:26.533, roughly 212 ms after the write. The SDK never receives the 804A PDU, so the node is left as a "ghost" entry in the database.

// Send Reset Command (Opcode 8049)
[MeshTransport] Access message opcode: 8049
[BLE] Send (#21 bytes) --> 00-68-2E-9B-D1-F2-A0-47-66-4F-40-15-21-05-FD-B5-80-AA-0D-9E-C5
[BLE] writeCharacteristic status SUCCESS (0)
[MeshManagerApi] MeshNetwork pdu sent: 0x00682E9BD1F2A047664F40152105FDB580AA0D9EC5
[MESH] MeshStatusCallbacks - onMeshMessageProcessed - Destination address: 2

// Race Condition: Peer disconnects before sending Status PDU
[BLE] onConnectionStateChange - status 19 (GATT CONN TERMINATE PEER USER) --> DISCONNECTED
[MESH] onChannelClosed - channel type Ble closed

RESULT: The library/DB still considers the node as part of the mesh.
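
To make the event ordering concrete, here is a minimal sketch of the workaround we are considering on the app side. It assumes the reset target is the node we are directly connected to over GATT, and that a peer-initiated disconnect (status 19) within a short window after the reset was sent can be read as an implicit success. `PendingResetTracker`, its methods, and the window length are all hypothetical names and values of ours, not SDK API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not an SDK class): tracks Config Node Reset requests (8049)
// that are still waiting for a Config Node Reset Status (804A).
final class PendingResetTracker {
    enum Outcome { NONE, CONFIRMED, IMPLICIT }

    // GATT status 19: connection terminated by the peer, as seen in the error log.
    static final int GATT_CONN_TERMINATE_PEER_USER = 19;

    private final long windowMs; // assumed grace window, chosen by the app
    private final Map<Integer, Long> sentAt = new HashMap<>();

    PendingResetTracker(long windowMs) {
        this.windowMs = windowMs;
    }

    // Call once the reset message has been handed to the transport.
    void onResetSent(int unicastAddress, long nowMs) {
        sentAt.put(unicastAddress, nowMs);
    }

    // Call when the status message arrives from the node.
    Outcome onStatusReceived(int unicastAddress) {
        return sentAt.remove(unicastAddress) != null ? Outcome.CONFIRMED : Outcome.NONE;
    }

    // Call when the GATT link to the directly connected node drops.
    Outcome onDisconnected(int connectedAddress, int gattStatus, long nowMs) {
        Long sent = sentAt.get(connectedAddress);
        if (sent == null) {
            return Outcome.NONE; // no reset pending for this node
        }
        boolean peerClosed = gattStatus == GATT_CONN_TERMINATE_PEER_USER;
        boolean inWindow = nowMs - sent <= windowMs;
        if (peerClosed && inWindow) {
            sentAt.remove(connectedAddress);
            return Outcome.IMPLICIT; // treat as "reset happened, status lost"
        }
        return Outcome.NONE; // leave the entry pending; the app decides what to do next
    }
}
```

With a 1000 ms window, calling `onResetSent(2, t)` followed by `onDisconnected(2, 19, t + 212)` yields `IMPLICIT`, which matches the error flow above. A disconnect with any other status, or one arriving after the window, leaves the reset pending.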

Technical Questions

  1. Manual DB Cleanup: What is the recommended sequence to manually remove the node from the MeshNetwork and persist this change to the Database if the ConfigNodeResetStatus is never received?

  2. Implicit Success Handling: Is there a standard way within the SDK to treat a GATT disconnection (Status 19) as a "Reset Success" if it occurs immediately after a ConfigNodeReset message was processed?

  3. Internal State Integrity: Are there risks to calling meshNetwork.deleteNode(node) manually during a disconnection event regarding sequence numbers or other network parameters?
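
To illustrate what we mean by manual cleanup in questions 1 and 3, here is a minimal in-memory sketch. `NodeRegistry` is a hypothetical stand-in for our app-side view of the database, and `persist()` is a placeholder for whatever save call the SDK actually expects. The idea is that both the normal path (status received) and the fallback path (disconnect treated as success) go through one idempotent removal, so a status PDU that arrives late cannot remove or persist twice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the app's view of the mesh database.
final class NodeRegistry {
    private final Map<Integer, String> nodesByAddress = new LinkedHashMap<>();
    private int persistCount = 0;

    void add(int unicastAddress, String name) {
        nodesByAddress.put(unicastAddress, name);
    }

    boolean contains(int unicastAddress) {
        return nodesByAddress.containsKey(unicastAddress);
    }

    // Removes the node at most once; returns true only when something changed.
    boolean removeAfterReset(int unicastAddress) {
        if (nodesByAddress.remove(unicastAddress) == null) {
            return false; // already removed by the other path, nothing to persist
        }
        persist();
        return true;
    }

    // Placeholder for the real save call; here it only counts invocations.
    private void persist() {
        persistCount++;
    }

    int persistCount() {
        return persistCount;
    }
}
```

In this model the removal touches only the node entry. Whether the library's own `deleteNode` has further side effects on sequence numbers or other network parameters is exactly what question 3 is asking.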
