Add ChaChaPoly AEAD-4 encryption with nonce persistence#12
Open
prefs is 5 char length :nerd:
Add ChaCha20-Poly1305 AEAD decryption with 4-byte auth tag for peer messages and group channels, falling back to ECB for backward compatibility. Sending remains ECB-only in this phase.
- Per-message key derivation: HMAC-SHA256(secret, nonce||dest||src)
- Direction-dependent keys prevent bidirectional keystream reuse
- 12-byte IV from nonce + dest_hash + src_hash
- Advertise AEAD capability via feat1 bit 0 in adverts
- Track peer AEAD support in ContactInfo.flags
- Seed aead_nonce from HW RNG on contact creation and load
Send ChaChaPoly-encrypted messages to peers with CONTACT_FLAG_AEAD set, and try AEAD decode first for those peers (avoiding 1/65536 ECB false-positive). Legacy peers continue to use ECB in both directions.
- Add aead_nonce parameter to createDatagram/createPathReturn (default 0 = ECB)
- Add getPeerFlags/getPeerNextAeadNonce virtual methods for decode-order selection
- Add ContactInfo::nextAeadNonce() helper (returns nonce++ if AEAD, 0 otherwise)
- Update all BaseChatMesh send paths to pass nonce for AEAD-capable peers
- Adaptive decode order: AEAD-first for known AEAD peers, ECB-first for others
The header's route type bits (PH_ROUTE_MASK) are zero when createDatagram/createPathReturn encrypt with AEAD, but get changed to ROUTE_TYPE_FLOOD (1) or ROUTE_TYPE_DIRECT (2) by sendFlood/sendDirect afterwards. The receiver builds assoc from the received header (with route bits set), so the tag check always fails and every AEAD packet is silently dropped. Mask out route type bits in assoc data on all 5 encrypt/decrypt sites. Also track AEAD decode success to enable peer capability auto-detection.
- Fix potential unsigned overflow in createDatagram size check by subtracting constants from MAX_PACKET_PAYLOAD instead of adding to data_len
- Add upper-bound validation on src_len and assoc_len in aeadEncrypt and aeadDecrypt
- Log peer name on AEAD nonce wraparound for debug builds
Prevent nonce reuse after reboots by persisting per-peer nonce counters to a dedicated /nonces (companion) or /s_nonces (server) file. On dirty reset (power-on, watchdog, brownout), nonces are bumped by NONCE_BOOT_BUMP (100) to cover any unpersisted messages. Clean wakes (deep sleep, software restart) load nonces as-is.
- Add nonce persistence to BaseChatMesh (companion) and ClientACL (server)
- Add wasDirtyReset() helper to ArduinoHelpers.h for platform-specific reset reason detection (ESP32/NRF52)
- Add onBeforeReboot() callback to CommonCLI for pre-reboot nonce flush
- Wire nonce persistence into all firmware variants: companion radio, repeater, room server, and sensor
- Only clear dirty flag on successful file write
Co-authored-by: J.C. Jones <james.jc.jones@gmail.com>
Build firmware: Build from this branch
Testing
Summary
Adds ChaCha20-Poly1305 (AEAD-4) encryption alongside the existing AES-128-ECB + HMAC-2 scheme, plus session key negotiation for Perfect Forward Secrecy. Updated nodes send AEAD-4 to peers that advertise support and fall back to ECB for legacy peers. All nodes can decode both formats. Old nodes continue to work unchanged.
Nonces are persisted to flash so they survive reboots without risk of reuse. Session keys are negotiated via ephemeral X25519 Diffie-Hellman and persisted immediately on establishment.
Relates to meshcore-dev#259.
What This Means in Practical Terms
The current encryption has a few weaknesses that this PR addresses:
Message tampering is too easy to attempt. The existing 2-byte authentication code means an attacker needs at most 65,536 guesses to forge a valid-looking message. At LoRa speeds that's roughly 9 hours of continuous attempts. The new 4-byte tag raises this to about 4.3 billion guesses, which is 65,536 times more work: at the same rate, roughly 67 years of continuous attempts.
Identical messages look identical on the air. The current block cipher (ECB mode) produces the same ciphertext for the same plaintext, which can reveal patterns — for example, you could tell when someone sends the same message twice. The new scheme produces completely different ciphertext every time, even for identical messages.
Addressing fields are now protected. Currently, only the message body is authenticated. With AEAD, the payload type and addressing hashes (which identify sender and recipient) are included in the authentication check, so an attacker cannot swap or modify them without detection. Outer routing fields like TTL and hop path are intentionally left unauthenticated so repeaters can still forward packets through the mesh.
Messages get slightly smaller. ECB pads every message up to a 16-byte boundary, wasting airtime. The new scheme has no padding, so most messages shrink by a few bytes on the wire.
Compromise of a node doesn't reveal past messages. Session key negotiation establishes fresh shared secrets via ephemeral key exchange. Even if a node's long-term private key is later compromised, previously recorded traffic cannot be decrypted (Perfect Forward Secrecy).
Nothing breaks. Updated nodes send AEAD-4 to peers that advertise support, and fall back to ECB for legacy peers. Old nodes are completely unaffected — they never receive AEAD-4 messages because the sender checks their capability first.
Nodes advertise their capabilities. Updated nodes include a flag in their advertisements saying "I understand the new encryption." When two updated nodes discover each other, they automatically start using AEAD-4 for their communication.
Nonces survive reboots. Per-peer nonce counters are saved to flash periodically and before clean reboots. After a dirty reset (power loss, watchdog, brownout), nonces are bumped forward by a safety margin to guarantee no reuse.
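The dirty-reset safety margin can be sketched as below. The constants are the values quoted elsewhere in this PR (NONCE_PERSIST_INTERVAL = 50, NONCE_BOOT_BUMP = 100); the helper name `nonceAfterBoot` is hypothetical, and saturating at 65535 is the behaviour this PR describes for session-key nonces, assumed here for illustration.

```cpp
#include <cstdint>

// Values quoted in this PR's description.
constexpr uint16_t NONCE_PERSIST_INTERVAL = 50;
constexpr uint16_t NONCE_BOOT_BUMP = 100;

// The bump must exceed the number of nonces that can be used between two saves.
static_assert(NONCE_BOOT_BUMP >= 2 * NONCE_PERSIST_INTERVAL,
              "boot bump must cover the unpersisted window");

// Hypothetical helper: the nonce to resume from after a boot.
uint16_t nonceAfterBoot(uint16_t persisted, bool dirty_reset) {
  if (!dirty_reset) return persisted;  // clean wake: nonces were flushed before reboot
  uint32_t bumped = static_cast<uint32_t>(persisted) + NONCE_BOOT_BUMP;
  // Saturate instead of wrapping, so a wrapped counter is never reused silently.
  return bumped > 0xFFFF ? uint16_t{0xFFFF} : static_cast<uint16_t>(bumped);
}
```

Because at most one persist interval of nonces can be lost, resuming 100 ahead of the saved value always lands past anything already sent.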
Wire Format
Current ECB:
New AEAD-4 (same position in payload):
Average overhead: ~6 bytes (AEAD) vs ~9.5 bytes (ECB). Most messages get smaller.
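The averages above can be reproduced with a short calculation. This is a sketch under assumed layouts, not taken from the diff: ECB is modelled as a 2-byte MAC plus zero padding to the next 16-byte boundary (no extra block when already aligned), and AEAD-4 as a 2-byte nonce plus a 4-byte tag with no padding. The function names are illustrative.

```cpp
#include <cstddef>

// Assumed ECB layout: 2-byte MAC + padding up to a 16-byte boundary.
constexpr std::size_t ecbOverhead(std::size_t plaintext_len) {
  return 2 + (16 - plaintext_len % 16) % 16;
}

// Assumed AEAD-4 layout: 2-byte nonce + 4-byte tag, no padding.
constexpr std::size_t aeadOverhead(std::size_t /*plaintext_len*/) {
  return 2 + 4;
}

// Average ECB overhead over one full cycle of length residues (1..16 bytes).
double averageEcbOverhead() {
  std::size_t total = 0;
  for (std::size_t len = 1; len <= 16; ++len) total += ecbOverhead(len);
  return static_cast<double>(total) / 16.0;
}
```

Under these assumptions AEAD-4 is larger only when the plaintext length falls within 3 bytes of a 16-byte boundary.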
Cryptographic Design
Per-message key derivation (eliminates nonce-reuse catastrophe):
The shared_secret is either the static ECDH secret or a session key (see Session Key Negotiation below).
Including dest_hash || src_hash makes keys direction-dependent: Alice→Bob and Bob→Alice derive different keys even with the same nonce value (for 255/256 peer pairs; the 1/256 where dest_hash == src_hash is a residual limitation of 1-byte hashes).
IV construction (12 bytes, from on-wire fields):
Associated data (authenticated but not encrypted):
- header || dest_hash || src_hash
- header || dest_hash
- header || channel_hash
Route type bits are masked out of the header in associated data (header & ~PH_ROUTE_MASK), since routing mode changes per hop as repeaters forward packets.
Nonce management: 16-bit counter per peer, persisted to flash. See "Nonce Persistence" section below.
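A minimal sketch of the two byte strings described above: the HMAC input that makes keys direction-dependent, and the associated data with route bits masked. The mask value 0x03 is an assumption (the commit notes only say ROUTE_TYPE_FLOOD is 1 and ROUTE_TYPE_DIRECT is 2), the little-endian nonce order is an assumption, and `kdfInput` / `assocData` are illustrative names rather than functions from the diff.

```cpp
#include <array>
#include <cstdint>

// Assumption: the route type occupies the low two bits of the header byte.
constexpr uint8_t PH_ROUTE_MASK = 0x03;

// Message fed to HMAC-SHA256(shared_secret, nonce || dest_hash || src_hash).
// The nonce byte order is assumed (little-endian here).
std::array<uint8_t, 4> kdfInput(uint16_t nonce, uint8_t dest_hash, uint8_t src_hash) {
  std::array<uint8_t, 4> out = {{static_cast<uint8_t>(nonce & 0xFF),
                                 static_cast<uint8_t>(nonce >> 8),
                                 dest_hash, src_hash}};
  return out;
}

// Associated data for a peer message: masked header || dest_hash || src_hash.
std::array<uint8_t, 3> assocData(uint8_t header, uint8_t dest_hash, uint8_t src_hash) {
  std::array<uint8_t, 3> out = {{static_cast<uint8_t>(header & ~PH_ROUTE_MASK),
                                 dest_hash, src_hash}};
  return out;
}
```

Swapping dest_hash and src_hash changes the HMAC input unless the two hashes are equal, which is the 1/256 case noted above. Masking means the sender (route bits still zero) and the receiver (route bits set by a repeater) authenticate the same header byte.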
Session Key Negotiation (Perfect Forward Secrecy)
Session keys provide Perfect Forward Secrecy by establishing fresh shared secrets via ephemeral X25519 Diffie-Hellman. Compromise of either node's long-term private key cannot recover traffic encrypted with a session key.
Protocol (2 messages + implicit confirmation)
The INIT is encrypted with AEAD-4 (static ECDH or existing session key). The ACCEPT is always encrypted with the static ECDH secret, because the initiator hasn't derived the session key yet.
Key Derivation
Uses existing ed25519_key_exchange() (X25519 Montgomery ladder) from lib/ed25519. No new dependencies.
Who Initiates
Repeaters, room servers, and sensors only implement the responder role — they never initiate session key negotiation.
Automatic Triggers
Session key negotiation is triggered automatically based on message count. The trigger check runs inside getEncryptionNonceFor(), the single funnel all encrypted sends pass through, so no send path can silently skip it. Negotiation is deferred to the next loop() tick to avoid re-entrancy.
3 INIT attempts per negotiation (3-minute timeout each).
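The funnel-plus-deferral pattern can be sketched as follows. This is an illustration, not the code in BaseChatMesh: the `Peer` and `Node` types are hypothetical, and the threshold value is a placeholder because the value of NONCE_REKEY_THRESHOLD is not shown on this page.

```cpp
#include <cstdint>

// Placeholder value; the real NONCE_REKEY_THRESHOLD is defined in src/MeshCore.h.
constexpr uint16_t NONCE_REKEY_THRESHOLD = 60000;

struct Peer {
  uint16_t nonce = 1;
  bool negotiating = false;
};

struct Node {
  Peer peers[4];
  int pending_rekey_idx = -1;  // plays the role of _pending_rekey_idx
  int inits_sent = 0;          // stand-in for actually transmitting an INIT

  // Single funnel: every encrypted send obtains its nonce here.
  uint16_t getEncryptionNonceFor(int idx) {
    Peer& p = peers[idx];
    if (p.nonce >= NONCE_REKEY_THRESHOLD && !p.negotiating && pending_rekey_idx < 0) {
      pending_rekey_idx = idx;  // only record the request; never send from inside a send
    }
    return p.nonce++;
  }

  // Called once per main-loop iteration; performs the deferred negotiation.
  void loop() {
    if (pending_rekey_idx < 0) return;
    peers[pending_rekey_idx].negotiating = true;
    ++inits_sent;
    pending_rekey_idx = -1;
  }
};
```

Recording only an index inside the funnel keeps the send path free of nested sends; the INIT goes out from loop() once the current send has finished.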
Nonce Lifecycle
Encryption Key Selection
All node types use paired getEncryptionKey() / getEncryptionNonce() functions that return the correct key and nonce based on current session state:
Decode Order
Dual-Decode Window
When the responder accepts a session key INIT, it enters DUAL_DECODE state: the new session key is active for sending, but both old and new keys are accepted for decoding. Once the initiator sends a message encrypted with the new session key (message 3), the responder confirms the transition and drops the old key.
This makes ACCEPT packet loss safe — the responder stays in dual-decode, the initiator times out and retries, and no messages are lost.
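A sketch of the responder-side transitions described above. INIT_SENT and DUAL_DECODE are state names used in this PR; NONE, ACTIVE, and the `ResponderSession` type are assumed names for illustration only.

```cpp
#include <cstdint>

// INIT_SENT and DUAL_DECODE appear in this PR; NONE and ACTIVE are assumed names.
enum class SessionState : uint8_t { NONE, INIT_SENT, DUAL_DECODE, ACTIVE };
enum class DecodedWith : uint8_t { NEW_SESSION_KEY, OLD_KEY };

struct ResponderSession {
  SessionState state = SessionState::NONE;
  bool keep_old_key = false;

  // An INIT was accepted (possibly a retry after a lost ACCEPT).
  void onInitAccepted() {
    state = SessionState::DUAL_DECODE;
    keep_old_key = true;
  }

  // While in the window, both the old and the new key are tried on receive.
  bool acceptsOldKey() const {
    return state == SessionState::DUAL_DECODE && keep_old_key;
  }

  // Only traffic under the new key confirms the transition and drops the old key.
  void onDecodeSuccess(DecodedWith which) {
    if (state == SessionState::DUAL_DECODE && which == DecodedWith::NEW_SESSION_KEY) {
      state = SessionState::ACTIVE;
      keep_old_key = false;
    }
  }
};
```

A message decoded with the old key leaves the state untouched, which is why a lost ACCEPT followed by an initiator retry loses no traffic.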
Stale Session Detection
If a node sends 50 consecutive messages without receiving any session-key-encrypted reply, it falls back to static ECDH for sending (the peer may have lost the session key). At 100 unanswered sends, it falls back to ECB. At 255, it clears the AEAD capability flag and removes the session key entirely. The counter resets to 0 on any successful session-key-encrypted message from the peer.
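The fallback ladder maps directly onto the unanswered-send counter. A minimal sketch, using the thresholds stated above; the enum and function names are illustrative and not taken from the diff.

```cpp
#include <cstdint>

enum class SendMode { SESSION_KEY, STATIC_ECDH, ECB, CLEAR_AEAD };

// Chooses how to encrypt the next message from the count of consecutive
// sends that received no session-key-encrypted reply.
SendMode sendModeFor(uint8_t unanswered) {
  if (unanswered >= 255) return SendMode::CLEAR_AEAD;  // drop AEAD flag and session key
  if (unanswered >= 100) return SendMode::ECB;
  if (unanswered >= 50) return SendMode::STATIC_ECDH;
  return SendMode::SESSION_KEY;
}
```

A counter that fits in one byte saturates exactly at the final threshold, so no separate overflow handling is needed in this sketch.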
Session Key Persistence
Session keys use a two-tier storage model: a small RAM pool for active sessions and a larger flash-backed store for less recently used entries.
RAM pool: 8 slots (MAX_SESSION_KEYS_RAM), managed as an LRU cache. Each access touches a counter so the least-recently-used entry can be evicted when the pool is full. Entries in INIT_SENT state (ephemeral keys only) are never evicted; they must complete or time out.
Flash store: Up to 48 entries (MAX_SESSION_KEYS_FLASH), persisted to /sess_keys (companion) or /s_sess_keys (server firmware).
Variable-length records: Entries without a previous session key (no dual-decode) use 39 bytes (SESSION_KEY_RECORD_MIN_SIZE); entries with a previous key use 71 bytes (SESSION_KEY_RECORD_SIZE). The SESSION_FLAG_PREV_VALID flag bit distinguishes the two.
On-demand flash lookup: When findSessionKey() misses the RAM pool, it reads the flash file to look for a matching entry. If found, the entry is loaded into RAM (evicting LRU if needed) and returned.
Merge-save strategy: When persisting, the code reads existing flash entries, filters out any that are already in the RAM pool or have been explicitly removed, then writes the merged result (RAM entries + surviving flash-only entries). This prevents flash from resurrecting deleted entries while preserving entries that were evicted from RAM.
Removed-entry tracking: When a session key is explicitly removed (e.g., invalidation after static ECDH fallback), its prefix is recorded in a small tracking array. The merge-save step skips these prefixes so the deleted entry doesn't reappear from stale flash data. The tracking array is cleared after each successful save.
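The merge-save and removed-entry rules can be sketched together as below. The types are simplified stand-ins: the 4-byte prefix width, the `Entry` fields, and the function names are assumptions for illustration, and the 48-entry flash cap is not modelled.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

using Prefix = std::array<uint8_t, 4>;  // hypothetical prefix width

struct Entry {
  Prefix prefix;
  uint16_t nonce;  // stand-in for the rest of the session key record
};

static bool containsPrefix(const std::vector<Prefix>& list, const Prefix& p) {
  return std::find(list.begin(), list.end(), p) != list.end();
}

// RAM entries win; flash-only entries survive unless they were explicitly removed.
std::vector<Entry> mergeForSave(const std::vector<Entry>& ram,
                                const std::vector<Entry>& flash,
                                const std::vector<Prefix>& removed) {
  std::vector<Entry> out = ram;
  for (const Entry& f : flash) {
    bool in_ram = std::any_of(ram.begin(), ram.end(),
                              [&](const Entry& r) { return r.prefix == f.prefix; });
    if (!in_ram && !containsPrefix(removed, f.prefix)) out.push_back(f);
  }
  return out;
}
```

After a successful write the caller would clear the removed list, matching the tracking-array behaviour described above.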
Nonce Persistence
Nonces are persisted to a dedicated file on flash (/nonces for companion radios, /s_nonces for server firmware).
Periodic saves: After every NONCE_PERSIST_INTERVAL (50) messages to a given peer, the nonce file is written. A dirty flag tracks whether any nonce has advanced since the last save.
Clean reboot: Software restarts and deep sleep wakes load the persisted nonces as-is. An onBeforeReboot() callback in CommonCLI flushes any dirty nonces before the restart.
Dirty reboot: Power-on, watchdog, and brownout resets are detected via wasDirtyReset() (platform-specific: esp_reset_reason() on ESP32, RESETREAS register on NRF52). After a dirty reset, all loaded nonces are bumped forward by NONCE_BOOT_BUMP (100), which is at least 2× the persist interval, guaranteeing that even the worst-case unpersisted nonce is safely skipped. Session key nonces also receive the boot bump; if the bump causes a wrap, the nonce is forced to 65535 to trigger renegotiation.
Format: Simple array of {pub_key_prefix[6], nonce[2]} entries, matched to in-memory contacts/clients on load.
Security Comparison
memcmp (timing side-channel)
secure_compare (constant-time)
Scope
All node types (companion radio, repeater, room server, sensor) support AEAD-4 decode, AEAD-4 send, and session key negotiation (companion initiates or responds; server firmware responds only).
Group Message Considerations
Group channels share a single key among all members. With a 2-byte nonce and multiple senders, cross-sender nonce collisions follow the birthday bound (~300 messages for 50% probability on an active channel). A collision leaks P1 ⊕ P2 for that specific message pair via crib-dragging, but:
This is mainly beneficial for public/hashtag channels where the PSK is already widely known and the ECB pattern leakage and weak MAC are a greater concern than the bounded nonce collision risk.
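The ~300-message figure can be checked numerically. The sketch below assumes each message's nonce behaves like an independent uniform draw from the 16-bit space, which is the model behind the birthday bound; real per-sender counters are sequential, so this is an approximation. The function names are illustrative.

```cpp
#include <cstdint>

// Probability that n nonces drawn uniformly from a 16-bit space contain a repeat.
double collisionProbability(unsigned n) {
  const double space = 65536.0;
  double p_all_unique = 1.0;
  for (unsigned i = 1; i < n; ++i) p_all_unique *= 1.0 - i / space;
  return 1.0 - p_all_unique;
}

// Smallest message count whose collision probability reaches the target.
unsigned messagesForProbability(double target) {
  unsigned n = 1;
  while (collisionProbability(n) < target) ++n;
  return n;
}
```

Calling messagesForProbability(0.5) lands close to the familiar 1.18 × √65536 estimate, consistent with the figure quoted above.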
Potential future mitigations explored and deferred:
HMAC(channel_secret, sender_pub_key)): eliminates cross-sender collisions but requires receivers to know all senders' public keys, changing the group security model from "know the PSK = full access" to "know the PSK + sender discovery = access." Ruled out as a usability regression.
Decode Order
Adaptive per-peer: for peers with CONTACT_FLAG_AEAD set, try AEAD-4 first then ECB fallback. For unknown/legacy peers, try ECB first then AEAD-4 fallback. When a session key exists, decode order is: session key → prev session key (dual-decode window) → static ECDH → ECB. This avoids the 1/65536 ECB false-positive rate on AEAD packets (nonce bytes matching truncated HMAC) for known AEAD peers, while minimizing wasted CPU for legacy peers.
Capability Advertisement
- feat1 bit 0 (FEAT1_AEAD_SUPPORT) is set in adverts for all node types (chat, repeater, room, sensor)
- ContactInfo.flags bit 1 (CONTACT_FLAG_AEAD)
- feat1 but ignore the value (forward-compatible via existing AdvertDataParser)
Files Changed
Core Library
- src/MeshCore.h: AEAD constants, session key constants (SESSION_KEY_SIZE, REQ_TYPE_SESSION_KEY_INIT, RESP_TYPE_SESSION_KEY_ACCEPT, NONCE_REKEY_THRESHOLD, SESSION_KEY_* thresholds and limits), two-tier pool sizing (MAX_SESSION_KEYS_RAM=8, MAX_SESSION_KEYS_FLASH=48), variable-length record sizes (SESSION_KEY_RECORD_SIZE, SESSION_KEY_RECORD_MIN_SIZE), SESSION_FLAG_PREV_VALID
- src/Utils.h / src/Utils.cpp: aeadEncrypt() and aeadDecrypt() using ChaChaPoly
- src/Mesh.h: getPeerFlags(), getPeerNextAeadNonce(), getPeerSessionKey(), getPeerPrevSessionKey(), onSessionKeyDecryptSuccess(), getPeerEncryptionKey(), getPeerEncryptionNonce() virtuals; aead_nonce param on createDatagram/createPathReturn
- src/Mesh.cpp: AEAD send path in createDatagram/createPathReturn; session key → prev session key → static ECDH → ECB adaptive decode order
- src/helpers/ContactInfo.h: uint16_t aead_nonce field, nextAeadNonce() helper
- src/helpers/SessionKeyPool.h: SessionKeyEntry struct and SessionKeyPool class (LRU-managed RAM pool with last_used tracking, eviction that skips INIT_SENT entries, removed-entry tracking for merge-save safety)
Companion Radio (BaseChatMesh)
- src/helpers/BaseChatMesh.h / BaseChatMesh.cpp: Advertise AEAD, track peer capability, AEAD send for all peer message types, nonce persistence, session key negotiation (both initiator and responder roles), encryption key/nonce funnel (getEncryptionKeyFor/getEncryptionNonceFor), deferred rekey trigger via _pending_rekey_idx
Server-Side (ClientACL + examples)
- src/helpers/ClientACL.h / ClientACL.cpp: Server-side AEAD nonce tracking and persistence, session key responder (handleSessionKeyInit), paired encryption key/nonce selection (getEncryptionKey/getEncryptionNonce), flash-backed session key wrappers with merge-save, peer-index forwarding helpers
- src/helpers/CommonCLI.h / CommonCLI.cpp: Advertise AEAD for repeaters/rooms/sensors; onBeforeReboot() callback for nonce/session key flush
- examples/simple_repeater/MyMesh.h / MyMesh.cpp: AEAD + session key support, nonce persistence, session key INIT handling in onPeerDataRecv
- examples/simple_room_server/MyMesh.h / MyMesh.cpp: Same
- examples/simple_sensor/SensorMesh.h / SensorMesh.cpp: Same
Platform Support
- src/helpers/ArduinoHelpers.h: wasDirtyReset() helper (ESP32/NRF52 reset reason detection)
- examples/companion_radio/DataStore.h / DataStore.cpp: Nonce and session key file I/O, variable-length session key records, merge-save with flash-backed lookup (loadSessionKeyByPrefix)
- examples/companion_radio/MyMesh.h / MyMesh.cpp: Wire up nonce/session key persistence and reboot callback, flash-backed session key overrides (loadSessionKeyRecordFromFlash, mergeAndSaveSessionKeys)
Build Verification
- Heltec_v3_companion_radio_ble: builds successfully
- Heltec_v3_repeater: builds successfully
- Heltec_v3_room_server: builds successfully
- Xiao_nrf52_companion_radio_ble: builds successfully
Future Work
- rekey <peer> CLI command for manual session key renegotiation
Build firmware: Build from this branch
Mirror of meshcore-dev#1677