Small device that can fit in palm and pocket capable of serving company goals of: #32
-
|
Based on your description, here's how the device could function if it uses the T-PicoC3, which combines the RP2040 and ESP32-C3 microcontrollers:
Functional Description
1. Work Process Control via Terminal:
2. Enhanced File Security—RP2040 for File Scanning and Tagging:
3. Credential Safety—ESP32 for API Keys and Access Brokering:
4. Potential for Secure Communication:
Summary of Advantages
|
-
|
Lilygo Tpico |
-
|
Good question. These two chips aren't redundant; they're complementary in a way that matters for a power-conscious security device. Let me explain what each one is actually good at and why the C3 is doing what it's doing.
The RP2040: Deterministic Workhorse
The RP2040 is a dual-core ARM Cortex-M0+ running at up to 133MHz. Its strengths are very specific:
Native USB controller. This is the big one. The RP2040 has a built-in USB 1.1 PHY, the actual analog circuitry that talks USB protocol on the wire. It doesn't need an external USB-to-serial chip like an FTDI or CH340. It is the USB device. This means it can present itself as any USB device class: CDC serial, HID keyboard, mass storage, MIDI, or a composite device that's several of these at once. For the BABB dongle, this is foundational: the dongle's entire relationship with the host computer flows through USB, and the RP2040 handles that natively with microsecond-level timing precision.
Dual cores with no OS overhead. The M0+ cores run bare metal. There's no scheduler, no context switching, no kernel. Core 0 runs the USB stack in a tight loop. Core 1 runs the monitoring loop. They never compete for scheduling time. When the tamper GPIO fires an interrupt on Core 1, the response time is measured in nanoseconds, not milliseconds. For a security device, that determinism matters: you can't have a tamper event waiting in a queue behind a WiFi callback.
PIO (Programmable I/O). The RP2040 has eight PIO state machines, tiny programmable cores dedicated entirely to I/O timing. We're not using them in v1, but they're available for future features like driving NeoPixel status LEDs, implementing custom serial protocols, or bit-banging interfaces that would normally require dedicated hardware. Each PIO state machine runs independently of the main cores and can handle precise timing down to individual clock cycles.
264KB SRAM, no flash. The RP2040 has no built-in flash; it boots from an external SPI flash chip (4MB on the T-PicoC3). But it has a generous 264KB of SRAM split across six banks that the two cores and DMA can access simultaneously. For our firmware, which uses static buffers and no heap allocation, this is more than enough. The separate memory banks mean Core 0 can read USB data while Core 1 writes UART data without bus contention.
What it can't do: No wireless radio of any kind. No hardware crypto accelerator. No secure key storage. Limited to USB 1.1 speeds (12 Mbps). These are exactly the gaps the ESP32-C3 fills.
The ESP32-C3: The Brains Behind the Muscle
This is where your question really points. The C3 looks like the junior partner (single-core, no USB host capability, sitting behind a UART link). But it's carrying the most security-critical and network-critical responsibilities on the dongle. Here's why.
Hardware Cryptography
The ESP32-C3 has a dedicated hardware AES accelerator. This isn't software AES running on the CPU; it's a silicon block that accepts plaintext in one register and outputs ciphertext from another. The CPU feeds it data and collects results, but the actual mathematical operations (the SubBytes, ShiftRows, MixColumns, AddRoundKey transformations that make AES work) happen in dedicated logic gates.
Why this matters for the BABB dongle: when the host sends an
The C3 also has hardware SHA-256 acceleration, which is how auth tokens get verified. The token arrives, gets hashed in hardware, and the hash is compared against the stored value using constant-time comparison (every byte is checked regardless of whether an early mismatch is found, preventing timing side-channel attacks). The plaintext token is immediately discarded from memory.
The hardware crypto also matters for power. Software AES on the RP2040's Cortex-M0+ would take thousands of clock cycles per block, with the CPU fully active and drawing current the whole time.
The C3's hardware accelerator completes a block in a fraction of that time and at lower power draw, because dedicated silicon is generally more efficient than general-purpose computation for a fixed operation.
eFuse Secure Storage
The ESP32-C3 has one-time-programmable eFuse memory: physical fuses on the silicon that are burned permanently. You can write a 256-bit key into the eFuse block and then set a read-protection bit that makes the key accessible only to the hardware crypto engine. After that, no software on any processor (not even code running on the C3 itself) can read the raw key bytes back. The AES accelerator can use the key, but it can't be extracted. This is the foundation of the dongle's security model.
In production, the BABB dongle would have its device-unique AES key burned into eFuse during manufacturing. The
The RP2040 has no equivalent feature. Its flash is readable, its SRAM is readable, and it has no hardware key storage. It's the wrong chip to trust with secrets.
WiFi and BLE Radio
The ESP32-C3 has an integrated 2.4GHz radio supporting WiFi 4 (802.11b/g/n) and Bluetooth 5.0 LE. This gives the BABB dongle capabilities that go well beyond what a dumb USB stick can do:
Remote alerting. When the RP2040 detects something (tamper event, suspicious host process, auth lockout), it sends a frame to the C3, which can relay that alert over WiFi to a remote server, MQTT broker, or push notification service. The operator doesn't need to be sitting at the laptop to know their dongle just got tampered with. For tradesmen who might leave a laptop at a job site, this is a meaningful security feature.
BLE proximity. The C3 can run a BLE beacon or connect to a phone app. Use cases: the dongle could auto-lock when the operator's phone moves out of BLE range (maybe 10 meters). Or it could require BLE proximity plus the auth token for a two-factor unlock: something you have (the dongle plugged in) plus something near you (your phone). This doesn't require WiFi infrastructure at all, which matters on construction sites or in field environments without reliable internet.
OTA firmware updates. The C3 can download new firmware over WiFi and flash itself. More importantly, it can also relay firmware updates to the RP2040. The update flow would be: C3 downloads the new RP2040 UF2 image over WiFi, verifies its signature using the hardware crypto engine, then streams it to the RP2040 over UART with a command that triggers the RP2040 to enter its bootloader. Both chips stay current without the operator needing to manually re-flash anything.
WiFi power management is where the C3's role intersects with your core concern about electricity. The ESP32-C3 supports several power saving modes that the firmware coordinates with the RP2040:
The firmware already coordinates this: when the RP2040's power state machine transitions to IDLE (after 30 seconds of no USB activity), it sends
RISC-V Architecture
The C3 runs a single-core 32-bit RISC-V processor at up to 160MHz. Compared to the RP2040's ARM Cortex-M0+, the RISC-V core has a more capable instruction set: it has hardware multiply and divide instructions (the M0+ instruction set has no divide) and a deeper pipeline. For cryptographic operations that involve a lot of integer math (even when hardware-accelerated, there's setup and data marshaling), the RISC-V core handles it more efficiently per clock cycle than the M0+.
The C3 also runs FreeRTOS, which is appropriate here because its tasks have different priorities and timing requirements. The UART RX task runs at priority 10 (the highest of the application tasks), so incoming frames from the RP2040 aren't stuck behind lower-priority application work such as LED blinking. The TX task runs at 9, and the LED task at 5. FreeRTOS adds some overhead compared to bare metal, but it lets the C3 juggle wireless events, crypto operations, and UART communication without any of them starving.
The C3 as a Security Boundary
Architecturally, the most important thing about the C3's role is that it creates a trust boundary. The RP2040 faces the host computer, so it's exposed to whatever the USB bus sends it. A malicious host could send crafted USB packets, inject commands, try to overflow buffers. The RP2040 is the attack surface.
The C3 sits behind the UART bridge. It only accepts well-formed BABB protocol frames. It never touches USB. It holds the keys. Even if an attacker fully compromised the RP2040 and took control of its firmware, the most they could do is send UART frames to the C3, and the C3 validates every frame, checks every CRC, and makes its own decisions about whether to encrypt, authenticate, or transmit. The attacker couldn't extract the AES key because it's in eFuse. They couldn't bypass authentication because the SHA-256 comparison happens on the C3. They couldn't send fake alerts that bypass the C3's validation.
This is the same architectural pattern used in hardware security modules in banking and military applications: an untrusted front-end processor handles the messy real-world interface, and a trusted back-end processor handles secrets and makes security decisions, with a narrow, well-defined protocol between them.
The Split Summarized
The RP2040 is the hands and ears. The ESP32-C3 is the vault and the voice.
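The trust boundary described above rests on the C3 refusing anything that isn't a well-formed frame. Here is a minimal sketch of that check. The actual BABB wire format isn't shown in this thread, so the frame layout, the `BABB_SOF` marker, the payload ceiling, the function names, and the choice of CRC-16/CCITT-FALSE are all illustrative assumptions rather than the firmware's real protocol.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame layout (not taken from the BABB firmware):
 *   [SOF][type][len][payload ...][crc_hi][crc_lo]               */
#define BABB_SOF          0xBBu  /* assumed start-of-frame marker   */
#define BABB_MAX_PAYLOAD  64u    /* assumed payload ceiling          */
#define BABB_OVERHEAD     5u     /* SOF + type + len + two CRC bytes */

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF,
 * no bit reflection, no final XOR. */
uint16_t babb_crc16(const uint8_t *data, size_t len) {
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u) crc = (uint16_t)((crc << 1) ^ 0x1021u);
            else               crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Sender side: writes a frame into out, returns its size (0 on error). */
size_t babb_frame_build(uint8_t type, const uint8_t *payload,
                        size_t payload_len, uint8_t *out, size_t out_cap) {
    size_t total = payload_len + BABB_OVERHEAD;
    if (payload_len > BABB_MAX_PAYLOAD || out_cap < total) return 0;
    out[0] = BABB_SOF;
    out[1] = type;
    out[2] = (uint8_t)payload_len;
    if (payload_len > 0) memcpy(&out[3], payload, payload_len);
    uint16_t crc = babb_crc16(out, total - 2);
    out[total - 2] = (uint8_t)(crc >> 8);
    out[total - 1] = (uint8_t)(crc & 0xFFu);
    return total;
}

/* Receiver side: every structural check must pass before the
 * frame is acted on. */
bool babb_frame_valid(const uint8_t *frame, size_t frame_len) {
    if (frame == NULL || frame_len < BABB_OVERHEAD) return false;
    if (frame[0] != BABB_SOF) return false;
    size_t payload_len = frame[2];
    if (payload_len > BABB_MAX_PAYLOAD) return false;
    if (frame_len != payload_len + BABB_OVERHEAD) return false;
    uint16_t received = (uint16_t)(((uint16_t)frame[frame_len - 2] << 8)
                                   | frame[frame_len - 1]);
    return babb_crc16(frame, frame_len - 2) == received;
}
```

A CRC only catches corruption and malformed frames; it does not prove who sent the frame. That is why the design above keeps the security decisions (authentication, key use) on the C3 rather than trusting anything the RP2040 forwards.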
-
|
This is one of the most important questions for the whole BABB product. API keys are the hardest security problem on an embedded device because, unlike a server where keys sit behind firewalls and access controls, a dongle is a physical object someone can hold, probe, freeze, and disassemble. Let me walk through every layer.
The Fundamental Problem
An API key is a secret string. Your dongle needs to use it: to authenticate with a remote server, sign a request, decrypt a response. But "using" a key means it has to exist somewhere in silicon at some point. The question is: where does it live, who can read it, and what happens if someone tries to extract it?
On a normal computer, API keys sit in environment variables, config files, or keychains. All of these are software constructs stored on readable storage. Anyone with root access can dump them. On the BABB dongle, you have something better available: hardware isolation. But only on the right chip.
Where Keys Cannot Live
Not on the RP2040. This is the critical rule. The RP2040 has no secure storage of any kind. Its 4MB external SPI flash can be read by desoldering the chip and putting it in a programmer. Its 264KB SRAM can be dumped via the SWD debug port (two pins on the board), and the RP2040 has no fuse or lock bit that permanently disables SWD. Even with the debug pins physically blocked, cold boot attacks can recover SRAM contents by chilling the chip so the memory cells hold their state after power is removed. The RP2040 is the public-facing side of the dongle. It touches USB. It parses host commands. It's the attack surface. No secret should ever exist in its memory, not even transiently.
Not in firmware source code. The placeholder
Not in NVS unencrypted. The ESP32-C3 has NVS (Non-Volatile Storage), a key-value store in flash. You can write API keys there, but flash is readable with physical access. NVS alone is not secure storage. It's a convenience layer, not a security layer.
Where Keys Should Live
Tier 1: eFuse (Highest Security)
The ESP32-C3 has 4096 bits of one-time-programmable eFuse memory organized into blocks. Here's how it works at the physical level:
Each eFuse bit is literally a tiny fuse on the silicon die. In its manufactured state, it reads as 0. You can "burn" it to 1 by applying a higher-than-normal voltage that physically breaks the fuse conductor. Once burned, it cannot be unburned. This is not flash memory that can be erased; it's a permanent physical change to the chip.
The relevant eFuse blocks for key storage:
After burning a key into an eFuse block, you then burn the read-protection bit for that block. Once set, the key is usable by hardware but no longer readable by software: the hardware crypto engine can access the key to perform AES operations, but no software running on the RISC-V core can read the raw key bytes back. The CPU literally cannot see the key. It can say "encrypt this block using Key 1" and get ciphertext back, but it cannot say "give me the bytes of Key 1."
This is where your master device key goes. The one that derives or decrypts everything else.
Tier 2: NVS with Flash Encryption
The ESP32-C3 supports transparent flash encryption. When enabled, a flash encryption key is stored in eFuse, and all reads/writes to external flash pass through a hardware encryption layer. The CPU sees plaintext; the physical flash contains ciphertext. Someone desoldering the flash chip gets encrypted garbage.
API keys that change over the device's lifetime (like tokens that rotate, OAuth credentials, per-service API keys) go in NVS with flash encryption enabled. They're protected at rest, and the flash encryption key that protects them is itself locked in eFuse.
Tier 3: RAM (Transient Only)
When an API key is actively being used (say, it's been decrypted from NVS and is being included in an HTTPS request header), it exists briefly in SRAM.
The discipline here is:
/* Correct pattern: minimize key exposure in RAM */
#include "mbedtls/platform_util.h"   /* mbedtls_platform_zeroize */
/* decrypt_from_nvs() and https_request() are helpers assumed to be
   defined elsewhere in the firmware. */
void use_api_key(const char *endpoint) {
    char api_key[64];
    /* Decrypt from NVS into RAM */
    decrypt_from_nvs("service_api_key", api_key, sizeof(api_key));
    /* Use immediately */
    https_request(endpoint, api_key);
    /* Wipe from RAM as soon as possible */
    mbedtls_platform_zeroize(api_key, sizeof(api_key));
    /* ^ This is NOT memset. Compilers can optimize away memset
       on memory that isn't read afterward. mbedtls_platform_zeroize
       is written so that the compiler cannot remove the wipe. */
}
The key exists in RAM for milliseconds. It's still a theoretical exposure window, but exploiting it requires either SWD debug access (which should be disabled in production) or a cold boot attack with precise timing.
The Key Hierarchy
Production BABB dongles wouldn't store raw API keys directly. They'd use a key hierarchy where one master key derives everything else:
The derivation uses HKDF (HMAC-based Key Derivation Function). You feed the master key and a context string like "babb-api-key-encryption-v1" into HKDF, and it produces a derived key. Different context strings produce different derived keys, but all trace back to the single master in eFuse. Compromise of one derived key doesn't expose the others, or the master, because HKDF is a one-way function.
Here's what the provisioning and runtime flow looks like in C:
/* ── Manufacturing provisioning (runs ONCE per device) ──── */
/*
* This runs on a secure provisioning station, not on the
* dongle itself. The station generates the key and burns
* it via the espefuse tool or JTAG.
*
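* Command lines (illustrative; the valid key-purpose names and
* required arguments depend on the chip and espefuse.py version):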
* espefuse.py burn_key BLOCK_KEY0 device_master.bin AES_256_KEY
* espefuse.py burn_key_digest BLOCK_KEY1 signing_key.pem
* espefuse.py --port /dev/ttyUSB0 burn_efuse WR_DIS
*
* After this, the key exists ONLY in silicon. The
* device_master.bin file is securely destroyed.
*/
/* ── Runtime key derivation on ESP32-C3 ─────────────────── */
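#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include "esp_err.h"
#include "esp_efuse.h"
#include "esp_flash_encrypt.h"
#include "nvs.h"
#include "mbedtls/gcm.h"
#include "mbedtls/hkdf.h"
#include "mbedtls/md.h"
#include "mbedtls/platform_util.h"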
/*
* The master key never appears in RAM. Instead, we use the
* hardware AES engine to perform HKDF indirectly.
*
* For keys where we CAN use eFuse-based hardware AES
* (encrypting/decrypting data blocks), the key never
* leaves the crypto engine.
*
* For keys where we need the derived bytes in RAM
* (like HMAC for auth verification), we derive them
* from a secondary key stored in flash-encrypted NVS.
*/
typedef struct {
    uint8_t api_enc_key[32];   /* Decrypted into RAM only when needed */
    uint8_t auth_hmac_key[32];
    bool loaded;
} babb_key_ring_t;
static babb_key_ring_t g_keys = { .loaded = false };
void babb_wipe_key_ring(void) {
    mbedtls_platform_zeroize(&g_keys, sizeof(g_keys));
    g_keys.loaded = false;
}
esp_err_t babb_load_key_ring(void) {
    nvs_handle_t nvs;
    esp_err_t err;
    /* NVS is flash-encrypted; reads go through HW decrypt */
    err = nvs_open("babb_keys", NVS_READONLY, &nvs);
    if (err != ESP_OK) return err;
    size_t len = sizeof(g_keys.api_enc_key);
    err = nvs_get_blob(nvs, "api_enc", g_keys.api_enc_key, &len);
    if (err == ESP_OK && len != sizeof(g_keys.api_enc_key))
        err = ESP_ERR_INVALID_SIZE;
    if (err == ESP_OK) {
        len = sizeof(g_keys.auth_hmac_key);
        err = nvs_get_blob(nvs, "auth_hmac", g_keys.auth_hmac_key, &len);
        if (err == ESP_OK && len != sizeof(g_keys.auth_hmac_key))
            err = ESP_ERR_INVALID_SIZE;
    }
    nvs_close(nvs);
    if (err != ESP_OK) {
        babb_wipe_key_ring();   /* Don't leave a half-loaded ring in RAM */
        return err;
    }
    g_keys.loaded = true;
    return ESP_OK;
}
/* ── API key decrypt and use ────────────────────────────── */
esp_err_t babb_get_api_key(const char *service_name,
                           char *out_key, size_t out_len) {
    if (!g_keys.loaded) return ESP_ERR_INVALID_STATE;
    nvs_handle_t nvs;
    esp_err_t err = nvs_open("babb_api", NVS_READONLY, &nvs);
    if (err != ESP_OK) return err;
    /* Read the encrypted API key blob from NVS */
    uint8_t encrypted[128];
    size_t enc_len = sizeof(encrypted);
    err = nvs_get_blob(nvs, service_name, encrypted, &enc_len);
    nvs_close(nvs);
    if (err != ESP_OK) return err;
    /*
     * Decrypt using the API encryption key (which is itself
     * stored in flash-encrypted NVS, derived from master).
     *
     * encrypted blob format:
     *   [16-byte IV][ciphertext][16-byte auth tag]
     *
     * Using AES-256-GCM (32-byte key) for authenticated encryption.
     */
    /* Reject blobs too short to hold IV + tag (the subtraction below
     * would underflow) or too long for out_key; one byte of out_key
     * is reserved for the NUL terminator. */
    if (enc_len < 32) return ESP_ERR_INVALID_SIZE;
    size_t ct_len = enc_len - 32;   /* minus IV and tag */
    if (ct_len + 1 > out_len) return ESP_ERR_INVALID_SIZE;
    uint8_t *iv = encrypted;
    uint8_t *ct = encrypted + 16;
    uint8_t *tag = encrypted + enc_len - 16;
    mbedtls_gcm_context gcm;
    mbedtls_gcm_init(&gcm);
    int rc = mbedtls_gcm_setkey(&gcm, MBEDTLS_CIPHER_ID_AES,
                                g_keys.api_enc_key, 256);
    if (rc == 0) {
        rc = mbedtls_gcm_auth_decrypt(&gcm, ct_len,
                                      iv, 16,
                                      NULL, 0,   /* no AAD */
                                      tag, 16,
                                      ct,
                                      (uint8_t *)out_key);
    }
    mbedtls_gcm_free(&gcm);
    if (rc != 0) {
        mbedtls_platform_zeroize(out_key, out_len);
        return ESP_ERR_INVALID_RESPONSE;   /* Tag mismatch = tampered */
    }
    out_key[ct_len] = '\0';   /* Callers treat the key as a C string */
    return ESP_OK;
}
/* ── Auth token verification using HMAC ─────────────────── */
bool babb_verify_auth(const uint8_t *token, size_t token_len) {
    if (!g_keys.loaded) return false;
    uint8_t hmac[32];
    const mbedtls_md_info_t *md = mbedtls_md_info_from_type(MBEDTLS_MD_SHA256);
    if (md == NULL) return false;
    if (mbedtls_md_hmac(md,
                        g_keys.auth_hmac_key, 32,   /* key from key ring */
                        token, token_len,           /* user's token */
                        hmac) != 0)                 /* output */
        return false;
    /* Load expected HMAC from NVS (flash-encrypted) */
    uint8_t expected[32];
    size_t len = sizeof(expected);
    nvs_handle_t nvs;
    esp_err_t err = nvs_open("babb_auth", NVS_READONLY, &nvs);
    if (err == ESP_OK) {
        err = nvs_get_blob(nvs, "token_hmac", expected, &len);
        nvs_close(nvs);
    }
    if (err != ESP_OK || len != sizeof(expected)) {
        mbedtls_platform_zeroize(hmac, sizeof(hmac));
        return false;
    }
    /* Constant-time comparison: always walks all 32 bytes */
    uint8_t diff = 0;
    for (size_t i = 0; i < sizeof(hmac); i++) {
        diff |= (uint8_t)(hmac[i] ^ expected[i]);
    }
    mbedtls_platform_zeroize(hmac, sizeof(hmac));
    mbedtls_platform_zeroize(expected, sizeof(expected));
    return (diff == 0);
}
How the RP2040 Uses Keys It Never Sees
This is the architectural trick. The RP2040 never handles any key material. It handles the intent to use keys. Here's the flow when the host needs to make an authenticated API call:
The RP2040 is a relay for encrypted payloads. It sees plaintext go in and ciphertext come out (or vice versa), but the transformation happens on the other side of the UART bridge, inside a chip with hardware key isolation. The RP2040 is like a bank teller window: you hand documents through the slot, things happen in the vault behind the wall, and results come back through the slot. The teller never enters the vault.
The Tamper Response and Key Wipe
When the conductive mesh breaks and the RP2040's GPIO interrupt fires, the sequence for key protection is:
The eFuse master key surviving isn't a problem: without the NVS data that contained the encrypted API keys, the master key has nothing to decrypt. And the eFuse read-protection means the master key can't be extracted anyway. The device is effectively bricked for secret operations until it's re-provisioned by a trusted provisioning station that has the authority to write new encrypted API key blobs to NVS.
For Your Use Case: Field Tradesmen
For the operators carrying these dongles on job sites, the practical API key flow would look like this:
Initial setup (at the office): Operator plugs the BABB dongle into a provisioning station. The station authenticates with your cloud service, generates device-specific API credentials, encrypts them with the dongle's public key (derived from the eFuse master), and writes the encrypted blobs to NVS. The dongle now has credentials for your logging service, alert endpoint, and OTA update server, all encrypted at rest and all decryptable only by that specific dongle's hardware.
Daily use (on site): Operator plugs dongle into their field laptop. Authenticates with their token. The laptop's tools can request encrypted operations and send/receive data through the dongle, but the API keys themselves never appear on the laptop. If the laptop is compromised (malware, stolen, whatever), the attacker gets access to the dongle's CDC interface but not to the keys inside the C3. They'd need to also steal the physical dongle AND know the auth token AND not trigger tamper detection.
Key rotation: Your cloud service pushes a key rotation command. The C3 receives it over WiFi, verifies the command is signed by your server's key (using a public key stored in eFuse), decrypts the new API credentials using the master key, writes them to flash-encrypted NVS, and wipes the old ones. The operator doesn't do anything. The RP2040 doesn't know it happened.
Loss or theft: Operator reports dongle lost. Your server revokes that device's API credentials server-side. Even if someone finds the dongle and somehow bypasses the auth token, the revoked credentials won't authenticate with your services. And the eFuse master key is unique per device, so compromising one dongle tells an attacker nothing about any other dongle.
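One ordering property of the key rotation flow is worth making explicit: the stored credential should only be replaced after the command's signature has been verified and the new credential has been decrypted successfully. Below is a minimal sketch of that ordering. The hook functions, type names, and result codes are hypothetical and do not come from the BABB firmware; on the C3 the hooks would wrap real signature verification, decryption, and flash-encrypted NVS writes. The `demo_*` functions are trivial stand-ins so the flow can be tried on a host machine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CRED_MAX 64u

/* Hypothetical hooks for the three externally provided steps. */
typedef struct {
    bool (*verify_signature)(const uint8_t *cmd, size_t len);
    bool (*decrypt_credential)(const uint8_t *cmd, size_t len,
                               uint8_t *out, size_t *out_len);
    bool (*store_credential)(const uint8_t *cred, size_t len);
} rotation_hooks_t;

typedef enum {
    ROTATE_OK = 0,
    ROTATE_BAD_SIGNATURE,
    ROTATE_DECRYPT_FAILED,
    ROTATE_STORE_FAILED
} rotate_result_t;

/* Overwrite through a volatile pointer so the stores are not elided. */
static void wipe(void *p, size_t n) {
    volatile uint8_t *v = (volatile uint8_t *)p;
    while (n--) *v++ = 0;
}

rotate_result_t rotate_credential(const rotation_hooks_t *h,
                                  const uint8_t *cmd, size_t cmd_len) {
    uint8_t fresh[CRED_MAX];
    size_t fresh_len = sizeof(fresh);
    rotate_result_t result = ROTATE_OK;

    /* 1. Reject unsigned or mis-signed commands before touching anything. */
    if (!h->verify_signature(cmd, cmd_len)) return ROTATE_BAD_SIGNATURE;
    /* 2. Decrypt into a stack buffer, 3. then replace the stored value. */
    if (!h->decrypt_credential(cmd, cmd_len, fresh, &fresh_len))
        result = ROTATE_DECRYPT_FAILED;
    else if (!h->store_credential(fresh, fresh_len))
        result = ROTATE_STORE_FAILED;
    /* 4. The plaintext copy never outlives this function. */
    wipe(fresh, sizeof(fresh));
    return result;
}

/* ---- Stand-in hooks for trying the flow on a host machine ---- */
static uint8_t demo_slot[CRED_MAX];
static size_t  demo_slot_len = 0;

static bool demo_verify(const uint8_t *cmd, size_t len) {
    return len > 1 && cmd[0] == 0xA5;  /* marker byte stands in for a signature */
}
static bool demo_decrypt(const uint8_t *cmd, size_t len,
                         uint8_t *out, size_t *out_len) {
    if (len < 1 || len - 1 > *out_len) return false;
    memcpy(out, cmd + 1, len - 1);     /* identity "decryption" for the demo */
    *out_len = len - 1;
    return true;
}
static bool demo_store(const uint8_t *cred, size_t len) {
    if (len > sizeof(demo_slot)) return false;
    memcpy(demo_slot, cred, len);      /* overwrites the old credential */
    demo_slot_len = len;
    return true;
}
```

With the stand-in hooks, a command that fails the signature check returns `ROTATE_BAD_SIGNATURE` and leaves `demo_slot` untouched, while a command starting with the marker byte replaces it. In a real build the store step would also need to be atomic with respect to power loss, which this sketch does not address.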
-
|
Let me break down the complete NVS (Non-Volatile Storage) layout for the BABB dongle: every partition, every namespace, every key-value pair, what it stores, how it's protected, and the total byte budget.
ESP32-C3 Flash Memory Map
The ESP32-C3 on the T-PicoC3 has 4MB of external SPI flash. Here's how it's partitioned:
The NVS partition is 640 KB. That's our entire runtime storage budget. Every API key, certificate, log buffer, configuration setting, and operational state lives in that space.
How NVS Actually Works Internally
NVS isn't a filesystem. It's a wear-leveled key-value store organized into pages and entries. Understanding the structure matters because it determines how much data you can actually store:
Each NVS entry is 32 bytes. A simple integer or small string fits in one entry. Larger blobs span multiple entries. The effective storage capacity after overhead is roughly 500-550 KB of actual data, with the rest consumed by page headers, wear leveling reserves, and entry metadata.
The Complete NVS Namespace Layout
Here's every namespace and every key the BABB dongle uses:
Total NVS Budget
The log ring buffer is the biggest consumer by far. At maximum capacity (500 entries × 128 bytes), it takes 62.5 KB. Everything else combined is about 6 KB. That leaves roughly 570 KB of the raw partition as headroom for future features.
NVS Entry Overhead
The raw data sizes above don't account for NVS internal overhead. Each key-value pair costs:
With NVS overhead factored in:
Flash Wear Analysis
NVS uses wear leveling, so it doesn't overwrite pages in place. When a value changes, the new value is written to the next free entry slot. When a page fills up, it's compacted (live entries moved, dead entries discarded) and erased. Each flash page can endure roughly 100,000 erase cycles.
The values that change most frequently:
The log buffer is the heaviest writer. At 50,000 writes per year across a 640 KB NVS with 160 pages, even in the pessimistic case where every write costs a full page erase, wear leveling spreads the load so each page sees roughly 312 erases per year. At 100,000 cycle endurance, that's a 320-year theoretical flash lifetime. Even with a 10x safety margin, flash wear is not a concern.
What Gets Wiped on Tamper
After a tamper wipe, the dongle still knows who it is (device ID, certificates) and where to go (server URLs), but it's lost everything needed to do anything (keys, credentials, auth data). It can re-attest with the cloud server using its certificate and eFuse-derived private key, receive fresh credentials, and become operational again, but only after being physically inspected and cleared by an administrator who resets the tamper flag.
That's the complete NVS picture. 640 KB of flash, using about 1% for keys and configuration and roughly 11% with the log ring buffer full (before NVS overhead), capable of storing everything the dongle needs to operate autonomously, all transparently encrypted by hardware using a key that exists nowhere except in silicon.
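The budget and wear figures above can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers quoted in this comment (640 KB partition, 500 × 128-byte log entries, 50,000 writes per year, 100,000 erase cycles) plus the 4 KB NVS page size. The function names are made up for illustration. `nvs_entries_for_blob` is a deliberately simplified model (one header entry plus 32-byte data entries), and the wear estimate assumes the pessimistic case of one page erase per write.

```c
#include <stdint.h>

#define NVS_PARTITION_BYTES  (640u * 1024u)
#define NVS_PAGE_BYTES       4096u     /* one NVS page per 4 KB flash sector */
#define NVS_ENTRY_BYTES      32u
#define LOG_ENTRIES          500u
#define LOG_ENTRY_BYTES      128u
#define OTHER_DATA_BYTES     (6u * 1024u)   /* "everything else" from the text */
#define WRITES_PER_YEAR      50000u
#define ERASE_ENDURANCE      100000u

/* Number of pages wear leveling can rotate through. */
uint32_t nvs_page_count(void) {
    return NVS_PARTITION_BYTES / NVS_PAGE_BYTES;
}

/* Raw size of the log ring buffer at maximum capacity. */
uint32_t log_buffer_bytes(void) {
    return LOG_ENTRIES * LOG_ENTRY_BYTES;
}

/* Simplified entry count for a blob: one header entry plus enough
 * 32-byte entries to hold the data. Real NVS adds further bookkeeping. */
uint32_t nvs_entries_for_blob(uint32_t data_bytes) {
    return 1u + (data_bytes + NVS_ENTRY_BYTES - 1u) / NVS_ENTRY_BYTES;
}

/* Share of the raw partition used when the log buffer is full. */
double raw_usage_percent(void) {
    return 100.0 * (double)(log_buffer_bytes() + OTHER_DATA_BYTES)
                 / (double)NVS_PARTITION_BYTES;
}

/* Pessimistic wear: every write is charged one page erase, spread
 * evenly over all pages. */
double worst_case_erases_per_page_per_year(void) {
    return (double)WRITES_PER_YEAR / (double)nvs_page_count();
}

double worst_case_lifetime_years(void) {
    return (double)ERASE_ENDURANCE / worst_case_erases_per_page_per_year();
}
```

Because a page only needs erasing once its entries are exhausted, the real erase rate should be well below this worst case, so the 320-year figure is a floor under these assumptions rather than a prediction.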
-
|
This is a really interesting architectural question: you're essentially asking whether the RP2040 can become an on-device inference engine that monitors LLM output flowing through the dongle's USB connection, using the microSD for model storage. Let me think through what's actually feasible and where the C3's capabilities extend this.
The Reality Check First
The RP2040 has 264KB of SRAM. TensorFlow Lite Micro (TFLM) needs a minimum arena allocation to run: the runtime itself consumes about 20-40KB, and then the model and its intermediate tensors need the rest. That leaves you roughly 200KB for model + working memory. This is tight but not impossible. People have run small TFLM models on Cortex-M0+ cores, but we're talking about models with tens of thousands of parameters, not millions. You won't be running anything resembling a language model on the RP2040. What you can run is a classifier: something that takes a feature vector in and outputs a category.
The microSD changes the storage constraint dramatically but doesn't change the RAM constraint. You can store a 32GB library of models, logs, reference data, and cached results on the card, but at inference time, the active model still has to fit in SRAM (or be streamed in chunks, which I'll get to).
What the Architecture Looks Like
The RP2040 Side: What It Can Actually Classify
You're not going to do NLP on a Cortex-M0+. But you can do feature-based text classification if you restructure the problem. Instead of feeding raw text into a neural network, you extract numerical features from the text on Core 0 and feed those features into a small classifier on Core 1. Here's the approach that fits in 264KB:
/*
* Feature extraction runs on Core 0.
* This converts LLM text output into a fixed-size
* numerical vector that the classifier can process.
*
* The features are statistical/structural, not semantic.
* We're detecting PATTERNS in output, not understanding MEANING.
*/
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint8_t, uint32_t */
#define NUM_FEATURES 32
#define WINDOW_SIZE 512 /* Analyze 512 bytes at a time */
typedef struct {
float features[NUM_FEATURES];
} feature_vector_t;
typedef enum {
CAT_NORMAL = 0, /* Ordinary informational response */
CAT_CODE = 1, /* Contains code/scripts */
CAT_SENSITIVE = 2, /* Contains PII, credentials, secrets */
CAT_HARMFUL = 3, /* Matches harmful content patterns */
CAT_STRUCTURED = 4, /* JSON, XML, structured data */
CAT_FINANCIAL = 5, /* Financial data, numbers, transactions */
CAT_MEDICAL = 6, /* Medical terminology detected */
CAT_UNCERTAIN = 7, /* Low confidence, needs C3 review */
} content_category_t;
/*
* Extract features from a text window.
* These are cheap to compute — no floating point math
* until the final normalization.
*/
void extract_features(const char *text, size_t len,
feature_vector_t *out) {
uint32_t char_freq[128] = {0}; /* ASCII frequency table */
uint32_t digit_count = 0;
uint32_t upper_count = 0;
uint32_t special_count = 0;
uint32_t whitespace_count = 0;
uint32_t line_count = 1;
uint32_t max_line_len = 0;
uint32_t cur_line_len = 0;
uint32_t word_count = 0;
uint32_t in_word = 0;
/* Bigram entropy estimation */
uint32_t bigram_unique = 0;
uint8_t bigram_seen[256] = {0}; /* Hash table for common bigrams */
/* Pattern markers */
uint32_t brace_depth = 0;
uint32_t max_brace_depth = 0;
uint32_t bracket_count = 0;
uint32_t at_signs = 0;
uint32_t dollar_signs = 0;
uint32_t semicolons = 0;
uint32_t equals_signs = 0;
uint32_t colons = 0;
uint32_t quotes = 0;
uint32_t angle_brackets = 0;
uint32_t has_http = 0;
uint32_t has_email_pattern = 0;
uint32_t consecutive_digits_max = 0;
uint32_t consecutive_digits_cur = 0;
uint32_t has_ssn_pattern = 0; /* NNN-NN-NNNN */
uint32_t has_cc_pattern = 0; /* 16+ consecutive digits */
uint32_t indentation_variance = 0;
for (size_t i = 0; i < len && text[i]; i++) {
char c = text[i];
if (c >= 0 && c < 128) char_freq[(int)c]++;
if (c >= '0' && c <= '9') {
digit_count++;
consecutive_digits_cur++;
if (consecutive_digits_cur > consecutive_digits_max)
consecutive_digits_max = consecutive_digits_cur;
} else {
if (consecutive_digits_cur >= 16) has_cc_pattern = 1;
consecutive_digits_cur = 0;
}
if (c >= 'A' && c <= 'Z') upper_count++;
if (c == '\n') {
line_count++;
if (cur_line_len > max_line_len) max_line_len = cur_line_len;
cur_line_len = 0;
} else {
cur_line_len++;
}
if (c == ' ' || c == '\t' || c == '\n') {
whitespace_count++;
if (in_word) { word_count++; in_word = 0; }
} else {
in_word = 1;
}
if (c == '{') { brace_depth++; if (brace_depth > max_brace_depth) max_brace_depth = brace_depth; }
if (c == '}' && brace_depth > 0) brace_depth--;
if (c == '[' || c == ']') bracket_count++;
if (c == '@') at_signs++;
if (c == '$') dollar_signs++;
if (c == ';') semicolons++;
if (c == '=') equals_signs++;
if (c == ':') colons++;
if (c == '"' || c == '\'') quotes++;
if (c == '<' || c == '>') angle_brackets++;
/* Simple bigram hash for entropy estimation */
if (i > 0) {
uint8_t h = ((uint8_t)text[i-1] * 31 + (uint8_t)c) & 0xFF;
if (!bigram_seen[h]) { bigram_unique++; bigram_seen[h] = 1; }
}
/* SSN pattern: NNN-NN-NNNN */
if (i >= 10) {
const char *p = &text[i - 10];
if (p[0] >= '0' && p[0] <= '9' &&
p[1] >= '0' && p[1] <= '9' &&
p[2] >= '0' && p[2] <= '9' &&
p[3] == '-' &&
p[4] >= '0' && p[4] <= '9' &&
p[5] >= '0' && p[5] <= '9' &&
p[6] == '-' &&
p[7] >= '0' && p[7] <= '9' &&
p[8] >= '0' && p[8] <= '9' &&
p[9] >= '0' && p[9] <= '9' &&
p[10] >= '0' && p[10] <= '9') {
has_ssn_pattern = 1;
}
}
/* HTTP pattern */
if (i >= 3 && text[i-3] == 'h' && text[i-2] == 't' &&
text[i-1] == 't' && text[i] == 'p') {
has_http = 1;
}
}
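if (in_word) word_count++;
/* Account for a final line or digit run that ends at the window
 * boundary without a trailing newline or non-digit character. */
if (cur_line_len > max_line_len) max_line_len = cur_line_len;
if (consecutive_digits_cur >= 16) has_cc_pattern = 1;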
/* Email heuristic: has @ with dots nearby */
has_email_pattern = (at_signs > 0 && char_freq['.'] > 0) ? 1 : 0;
/* Now build the normalized feature vector.
* Each feature is scaled to roughly the [0.0, 1.0] range
* (f[4], f[5] and f[7] are not clamped and can exceed 1.0).
* These are carefully chosen to be discriminative
* for the categories we care about. */
float flen = (float)(len > 0 ? len : 1);
/* f[0]: Digit density — high for financial, CC numbers */
out->features[0] = (float)digit_count / flen;
/* f[1]: Uppercase density — high for constants, acronyms */
out->features[1] = (float)upper_count / flen;
/* f[2]: Special char density — high for code */
out->features[2] = (float)(semicolons + equals_signs + dollar_signs
+ angle_brackets) / flen;
/* f[3]: Whitespace ratio — high for prose, low for minified code */
out->features[3] = (float)whitespace_count / flen;
/* f[4]: Average word length — technical text has longer words */
out->features[4] = word_count > 0 ?
(float)(len - whitespace_count) / (float)word_count / 20.0f : 0;
/* f[5]: Lines per character — high for code (many short lines) */
out->features[5] = (float)line_count / flen * 10.0f;
/* f[6]: Max line length / window — long lines = prose */
out->features[6] = (float)max_line_len / (float)WINDOW_SIZE;
/* f[7]: Brace depth — high for JSON, code blocks */
out->features[7] = (float)max_brace_depth / 10.0f;
/* f[8]: Bracket density — high for arrays, JSON, markdown */
out->features[8] = (float)bracket_count / flen;
/* f[9]: Quote density — high for strings in code, JSON */
out->features[9] = (float)quotes / flen;
/* f[10]: Colon density — high for JSON, YAML, dicts */
out->features[10] = (float)colons / flen;
/* f[11]: Bigram entropy (normalized) — low for repetitive text */
out->features[11] = (float)bigram_unique / 256.0f;
/* f[12]: Has HTTP URLs */
out->features[12] = (float)has_http;
/* f[13]: Has email pattern */
out->features[13] = (float)has_email_pattern;
/* f[14]: Has SSN pattern */
out->features[14] = (float)has_ssn_pattern;
/* f[15]: Has credit card pattern (16+ consecutive digits) */
out->features[15] = (float)has_cc_pattern;
/* f[16]: Max consecutive digits / window — CC, phone, SSN */
out->features[16] = (float)consecutive_digits_max / (float)WINDOW_SIZE;
/* f[17]: At-sign density — email, decorators, social handles */
out->features[17] = (float)at_signs / flen;
/* f[18]: Dollar sign density — shell scripts, financial */
out->features[18] = (float)dollar_signs / flen;
/* f[19]: Semicolon density — C/Java/SQL code */
out->features[19] = (float)semicolons / flen;
/* f[20]: Equals density — assignments, config files */
out->features[20] = (float)equals_signs / flen;
/* f[21]: Angle bracket density — HTML, XML */
out->features[21] = (float)angle_brackets / flen;
/* f[22-31]: Top 10 ASCII character frequencies
* (normalized by total length).
* These capture the "texture" of different content types.
* Code has high '{', '}', ';'. Prose has high 'e', 't', 'a'.
* Financial data has high '0'-'9', '.', ','. */
{
/* Find top 10 most frequent printable chars */
uint8_t top[10] = {0};
for (int t = 0; t < 10; t++) {
uint32_t max_f = 0;
for (int c = 32; c < 127; c++) {
if (char_freq[c] > max_f) {
int already = 0;
for (int p = 0; p < t; p++)
if (top[p] == c) { already = 1; break; }
if (!already) { max_f = char_freq[c]; top[t] = c; }
}
}
/* If fewer than 10 distinct chars occurred, top[t] was never set: emit 0 */
out->features[22 + t] = (max_f > 0) ? (float)char_freq[top[t]] / flen : 0.0f;
}
}
}

That feature extraction runs in about 50-100 microseconds for a 512-byte window on the RP2040 at 133MHz. No heap allocation, no floating point until the final normalization (and even that could be replaced with fixed-point if needed). The result is a 32-element float vector that captures the structural fingerprint of the text.

The Classifier Model
The model that runs on Core 1 is a small dense neural network. For 32 input features and 8 output categories:

That's about 52KB out of 264KB. Core 0 has the USB stack (~30KB), feature extraction buffers, and microSD SPI driver. Plenty of room.

But 2.6KB is a tiny model. It catches structural patterns: it can tell code from prose, spot digit-heavy content, flag things that look like PII formats. It cannot understand meaning. It doesn't know if the text says "here's how to build a bomb" versus "here's how to build a birdhouse." For semantic understanding, you need the C3.

Where the ESP32-C3 Extends This
The C3 has 400KB of SRAM and, critically, the hardware crypto engine and WiFi. Here's how it adds semantic depth:

The C3 maintains a policy engine, a set of rules that determine what happens when specific categories are detected:
/*
* Policy engine on ESP32-C3.
* Maps classification results to actions.
* Policies are stored on microSD (loaded at boot)
* and cached in C3 RAM.
*/
typedef struct {
content_category_t category;
float min_confidence; /* Threshold to act */
uint8_t action; /* What to do */
uint8_t alert_level; /* 0=log, 1=warn, 2=critical */
char alert_msg[64]; /* Template for notifications */
} policy_rule_t;
/*
* Actions:
* 0x00 LOG_ONLY — Record on microSD, no interruption
* 0x01 FLAG — Log + mark in stream for host review
* 0x02 ALERT_LOCAL — Flash LED pattern, buzz (if piezo)
* 0x03 ALERT_REMOTE — Send WiFi alert to cloud/phone
* 0x04 BLOCK — Tell RP2040 to halt USB data stream
* 0x05 ENCRYPT_LOG — Encrypt the flagged content before
* writing to microSD (for compliance)
* 0x06 QUARANTINE — Copy to encrypted quarantine folder
* on microSD for later review
* 0x07 C3_REVIEW — Run secondary analysis on C3
* (keyword matching, more complex model)
*/
static const policy_rule_t default_policies[] = {
/* Normal content — just log statistics */
{ CAT_NORMAL, 0.0, 0x00, 0, "" },
/* Code detected — log it, useful for audit trail */
{ CAT_CODE, 0.70, 0x00, 0, "" },
/* Sensitive content (PII patterns) — encrypt and alert */
{ CAT_SENSITIVE, 0.80, 0x05, 2, "PII pattern detected in LLM output" },
/* Harmful content patterns — block and alert */
{ CAT_HARMFUL, 0.60, 0x04, 2, "Harmful content flagged" },
/* Structured data — log, often benign but track it */
{ CAT_STRUCTURED, 0.75, 0x00, 0, "" },
/* Financial data — encrypt log, warn */
{ CAT_FINANCIAL, 0.80, 0x05, 1, "Financial data in LLM output" },
/* Medical terminology — encrypt, compliance requirement */
{ CAT_MEDICAL, 0.75, 0x05, 1, "Medical data detected" },
/* Low confidence — send to C3 for secondary analysis */
{ CAT_UNCERTAIN, 0.0, 0x07, 0, "" },
};

The C3's Secondary Analysis
When the RP2040 flags something as
/*
* C3 secondary analysis.
* Runs when RP2040 is uncertain or policy demands deeper review.
*
* The C3 doesn't get the raw text — it gets the feature vector
* and a hash. If the C3 needs the raw text, it requests it
* from the RP2040, which reads it back from microSD.
*
* This keeps the UART bandwidth low during normal operation.
*/
/* Keyword lists stored in flash-encrypted NVS or loaded from
* encrypted file on microSD. These are the "watch lists." */
typedef struct {
const char *keyword;
content_category_t override_category;
uint8_t severity;
} keyword_entry_t;
/*
* These would be loaded from microSD at boot:
* /config/keywords_sensitive.txt
* /config/keywords_medical.txt
* /config/keywords_financial.txt
* /config/keywords_harmful.txt
*
* Stored encrypted, decrypted into RAM by C3.
*/
/* When C3 needs raw text for keyword scan: */
static esp_err_t request_text_from_rp(uint32_t window_id,
char *buf, size_t buflen) {
/*
* Send CMD_REQUEST_TEXT(window_id) to RP2040 over UART.
* RP2040 reads the window back from microSD log file
* and sends it over UART in chunks.
*
* This is the slow path — only triggered on uncertain
* classifications, maybe 5-10% of windows.
*/
uint8_t payload[4];
payload[0] = (window_id >> 24) & 0xFF;
payload[1] = (window_id >> 16) & 0xFF;
payload[2] = (window_id >> 8) & 0xFF;
payload[3] = window_id & 0xFF;
babb_send_frame(CMD_REQUEST_TEXT, payload, 4);
/* Wait for response chunks, reassemble */
size_t received = 0;
while (received < buflen) {
babb_frame_t frame;
if (babb_receive_frame(&frame, 1000) != 0) break;
if (frame.cmd != CMD_TEXT_CHUNK) break;
size_t chunk_len = frame.length;
if (received + chunk_len > buflen) chunk_len = buflen - received;
memcpy(buf + received, frame.payload, chunk_len);
received += chunk_len;
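/* EOT marker: assumed here to be the last byte of the final chunk.
* Indexing payload[chunk_len] would read one byte past the data. */
if (chunk_len > 0 && frame.payload[chunk_len - 1] == 0x04) break;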
}
return (received > 0) ? ESP_OK : ESP_ERR_TIMEOUT;
}

The MicroSD Layout
The microSD card serves as the RP2040's extended memory: model storage, log persistence, and configuration:

SPI Bus Sharing: MicroSD and the Challenge
The RP2040 has two SPI peripherals (SPI0 and SPI1). The external flash that holds the RP2040's own firmware sits on a separate, dedicated QSPI interface, so it does not occupy either of them. SPI1 is used here for the microSD card. The pin assignment:

The RP2040's SPI master tops out at 62.5MHz with a 125MHz peripheral clock, but most microSD cards run at 25MHz or less in SPI mode, which gives you about 3MB/s read throughput. That's fast enough to stream model chunks but you wouldn't want to do it during real-time inference. The pattern is: load model into SRAM at boot, run inference from SRAM, write logs to SD asynchronously.
/*
* MicroSD access patterns optimized for power.
*
* The SD card draws 50-100mA during active read/write
* and ~0.2mA in sleep. We want to minimize active time.
*/
/* Batch writes: accumulate log entries in RAM buffer,
* flush to SD every N entries or every M seconds */
#define SD_FLUSH_ENTRIES 16
#define SD_FLUSH_INTERVAL 30000 /* 30 seconds */
static struct {
char buffer[SD_FLUSH_ENTRIES * 128]; /* 2KB write buffer */
uint16_t count;
uint32_t last_flush;
} sd_log_buf;
/*
* Writing one byte to SD costs almost as much power as
* writing 512 bytes (one sector). So we always write
* in sector-aligned chunks. 16 log entries × 128 bytes
* = 2048 bytes = 4 sectors. One SPI transaction.
* Card active for ~1ms instead of 16 × 1ms.
*/
void sd_flush_logs(void) {
if (sd_log_buf.count == 0) return;
/* Wake SD card from sleep */
sd_card_wake();
/* Write entire buffer as one operation */
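/* FatFs writes the byte count through this pointer; NULL is not allowed */
UINT written = 0;
FRESULT fr = f_write(&log_file,
sd_log_buf.buffer,
sd_log_buf.count * 128,
&written);
/* Only sync if the write itself succeeded */
if (fr == FR_OK) fr = f_sync(&log_file);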
/* Put SD card back to sleep */
sd_card_sleep();
sd_log_buf.count = 0;
sd_log_buf.last_flush = time_ms();
}

The Full Data Flow
Here's what happens when an operator is using Claude on their laptop and the BABB dongle is monitoring:

The round trip from text window to alert takes about 50-100ms. Fast enough that the operator barely notices any latency in the text stream.

Model Training and Updates
The 2.7KB classifier model gets trained offline on a real computer, not on the dongle. The workflow:

What This Buys You
For tradesmen and industrial operators, this on-device scanning means:

Compliance without cloud dependency. If your operator is using an LLM to draft reports, look up codes, or analyze data, and the LLM accidentally includes client PII, SSNs, or financial data in its response, the dongle catches it before it gets logged, copied, or shared. The operator doesn't need internet for this, because the RP2040 classifier runs entirely locally. The WiFi alert is a bonus, not a requirement.

Audit trail on microSD. Every piece of LLM output that passed through the dongle is logged. The classified results are timestamped. If there's ever a compliance question ("did we accidentally process client medical data through an AI?"), the answer is on the microSD card, encrypted, with classification metadata.

Content-aware power management. When the classifier sees long stretches of

The C3 as network policy enforcer. The RP2040 does the fast local detection. The C3 decides what to do about it, and the C3 has the keys, the WiFi, and the policy engine. This separation means an operator can't just disable the monitoring by patching the RP2040's firmware, because the C3 independently verifies that classifications are arriving at expected frequency and raises an alert if the RP2040 goes silent.

The limitation is clear: this is pattern detection, not comprehension. The dongle catches structural red flags: digit patterns that look like SSNs, code blocks, high-entropy strings that might be API keys. It doesn't understand meaning. If an LLM outputs "the patient has diabetes" in a sentence that looks structurally like ordinary prose, the RP2040 classifier might miss it. The C3's keyword scan catches some of these, but true semantic understanding would require a much larger model than either chip can run. That's an acceptable tradeoff for a device that draws 100mA and fits on a keychain.
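The silence check described above, where the C3 raises an alert if the RP2040 stops sending classifications, could be sketched roughly like this. This is only an illustration under assumptions: the `heartbeat_t` type, the function names, and the 2-second `HEARTBEAT_TIMEOUT_MS` limit are all hypothetical and not taken from the BABB firmware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit: longest acceptable gap between classification frames. */
#define HEARTBEAT_TIMEOUT_MS 2000u

typedef struct {
    uint32_t last_seen_ms;  /* timestamp of the most recent classification */
    bool     alerted;       /* true once a silence alert has been raised   */
} heartbeat_t;

/* Start (or restart) monitoring from the given timestamp. */
static void heartbeat_init(heartbeat_t *hb, uint32_t now_ms) {
    hb->last_seen_ms = now_ms;
    hb->alerted = false;
}

/* Call whenever a classification frame arrives from the RP2040. */
static void heartbeat_feed(heartbeat_t *hb, uint32_t now_ms) {
    hb->last_seen_ms = now_ms;
    hb->alerted = false;
}

/* Call periodically on the C3. Returns true once per silent period,
 * at the first check after the timeout has been exceeded. */
static bool heartbeat_check(heartbeat_t *hb, uint32_t now_ms) {
    /* Unsigned subtraction stays correct across millisecond-counter wraparound. */
    uint32_t elapsed = now_ms - hb->last_seen_ms;
    if (elapsed > HEARTBEAT_TIMEOUT_MS && !hb->alerted) {
        hb->alerted = true;
        return true;
    }
    return false;
}
```

The `alerted` flag is there so a long outage produces one alert rather than one per polling tick. The next classification frame re-arms it.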
-
|
This is a really good find. PicoClaw fits into the BABB architecture in a very specific and powerful way. Let me break down what it is and where it slots in.

What PicoClaw Actually Is
PicoClaw is an ultra-lightweight personal AI assistant that runs on less than 10MB of RAM, designed for resource-constrained embedded boards (cnx-software). It's a Go binary: single executable, no interpreter, no runtime dependencies. It boots in about 1 second on a 600MHz core and cross-compiles to RISC-V, ARM, and x86 from a single codebase (cnx-software).

The key thing to understand: PicoClaw is not an LLM itself. It's an AI agent: it clears your inbox, sends emails, manages your calendar, checks you in for flights (cnx-software). It connects to an external LLM provider via API (Claude, OpenAI, etc.) and orchestrates actions based on the responses. You configure your LLM provider API key in

It evolved from the nanobot project (about 4,000 lines of Python), and was refactored from the ground up in Go through a self-bootstrapping process where the AI agent itself drove the architectural migration (cnx-software). The result is roughly 99% smaller than the full OpenClaw it's derived from.

Why This Matters for BABB
PicoClaw doesn't run on the RP2040: 264KB of SRAM is far below even a 10MB footprint. But it runs beautifully on the ESP32-C3's big sibling, or more practically, on the class of hardware your operators already carry. Here's where it fits in the BABB ecosystem:

Scenario 1: PicoClaw on the Operator's Phone (via Termux)
You've already been exploring Termux for mobile AI tooling. PicoClaw's ARM64 Linux binary would run in Termux on the operator's Android phone. The BABB dongle becomes the security layer between PicoClaw and the outside world:

This is the most immediate integration. PicoClaw needs an API key to function. Instead of storing that key in

Scenario 2: PicoClaw on a Cheap Field SBC
PicoClaw runs on $10 hardware like the Sipeed LicheeRV Nano with a SOPHGO SG2002 RISC-V SoC and 256MB DDR3 (cnx-software). Picture a small Linux SBC mounted in the operator's tool bag or clipped to a belt, running PicoClaw as an always-on field assistant. The BABB dongle plugs into it via USB:

The SBC runs PicoClaw. The dongle provides the security layer (key storage, crypto) and the network layer (WiFi via the ESP32-C3). The SBC itself doesn't need WiFi hardware. It communicates through the dongle's USB CDC interface, and the dongle's C3 handles the actual HTTPS call to the LLM provider, injecting the API key that only the C3 knows.

Scenario 3: PicoClaw as the Orchestration Layer for the Scanner
This is where it gets interesting for what we discussed about TF-Lite scanning. PicoClaw is an agent framework: it takes natural language commands and executes multi-step tool workflows. You could extend PicoClaw with a custom tool that interfaces with the BABB dongle's scanning pipeline:

PicoClaw becomes the natural language interface to the dongle's scanning and security features. The operator doesn't need to remember shell commands like

The Modified Architecture Stack
With PicoClaw in the picture, the full BABB field system looks like this:

The Critical Integration Point: API Key Proxy
The most important architectural decision is how PicoClaw gets the API key from the dongle. There are two approaches:

Approach A: Key injection. PicoClaw asks the dongle for the decrypted API key, holds it in memory, makes the API call itself, then wipes it. Simple, but the key exists in PicoClaw's process memory briefly. If the phone/SBC is rooted with a memory dumper running, the key could be captured.

Approach B: Request proxy. PicoClaw builds the API request body but sends it to the dongle's C3 instead of the API directly. The C3 injects the Authorization header and makes the HTTPS call itself, returning only the response body. The key never leaves the dongle. This is more secure but adds latency (the C3's WiFi stack handles the HTTP) and makes the dongle a network bottleneck.

For your use case (tradesmen in the field, moderate threat model, not defending against nation-states), Approach A with a short exposure window is probably the right tradeoff. The key lives in PicoClaw's Go process memory for the duration of an API call (milliseconds). Go's garbage collector will eventually reclaim that memory but does not zero it, so you'd want to explicitly zero the byte slice for defense in depth. Approach B is better for high-security deployments where you absolutely cannot tolerate key exposure outside the dongle.

What You'd Need to Build
The PicoClaw-to-BABB integration is a surprisingly small amount of code:

On the PicoClaw side (Go):
On the dongle side (C firmware):

The heavy lifting (the security architecture, the NVS layout, the TF-Lite scanning, the USB stack) is all already done. PicoClaw sits on top as the user-facing layer that makes all of it accessible through natural language instead of shell commands. It's a really clean fit. The fact that it's a single self-contained binary across RISC-V, ARM, and x86 (cnx-software) means you don't care what the operator's host device is: Android phone, Raspberry Pi, x86 laptop, even one of those $15 RISC-V boards. Same PicoClaw binary, same dongle interface, same security model everywhere.
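To make Approach B a bit more concrete, here is a minimal sketch of the dongle-side step where the C3 wraps a host-supplied request body in HTTP headers and adds the key itself. Everything named here is an assumption for illustration: `build_proxied_request` and `secure_zero` are hypothetical helpers, and the `Bearer` scheme is just one common convention (some providers expect a different header for the key). The TLS connection and UART framing are left out entirely.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build an HTTP/1.1 POST around a body supplied by
 * the host, adding the Authorization header on the dongle so the key
 * never crosses USB. Returns bytes written, or -1 if out is too small. */
static int build_proxied_request(char *out, size_t out_len,
                                 const char *host, const char *path,
                                 const char *api_key, const char *body) {
    int n = snprintf(out, out_len,
                     "POST %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Authorization: Bearer %s\r\n" /* assumed auth scheme */
                     "Content-Type: application/json\r\n"
                     "Content-Length: %zu\r\n"
                     "Connection: close\r\n"
                     "\r\n"
                     "%s",
                     path, host, api_key, strlen(body), body);
    if (n < 0 || (size_t)n >= out_len) return -1; /* error or truncated */
    return n;
}

/* Wipe a buffer that held key material. The volatile pointer keeps the
 * compiler from optimizing the stores away as dead writes. */
static void secure_zero(void *p, size_t len) {
    volatile unsigned char *v = (volatile unsigned char *)p;
    while (len--) *v++ = 0;
}
```

After the request has gone out over TLS, the C3 would call `secure_zero` on the request buffer, since the buffer contains the key in the header line. The same wipe idea applies to Approach A on the host side, just in Go rather than C.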
-
Critical data portability, autonomy to design work processes and sovereignty to end them.