Hi TinyMaix folks,
I wanted to share a small MCU language-runtime experiment and ask whether systems like this feel adjacent to the kind of tiny inference TinyMaix represents.
We built a public demo line called Engram and deployed it on a commodity ESP32-C3.
Current public numbers:
Important scope note:
This is not presented as unrestricted, open-input native LLM generation on an MCU.
The board-side path is closer to a flash-resident, table-driven runtime with:
- packed token weights
- hashed lookup structures
- fixed compiled probe batches
- streaming fold / checksum style execution over precompiled structures
So this is not a standard tiny dense model path. It is closer to a task-specialized language runtime whose behavior has been pushed into a compact lookup-heavy execution form.
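To make the shape of that path concrete, here is a minimal C sketch of what a table-driven runtime of this kind might look like. It is illustrative only and is not taken from the Engram repo: the entry layout, the 4-bit weight packing, the mask-based hash, the probe batch contents, and every name (`entry_t`, `lookup_packed`, `run_probe_batch`, and so on) are assumptions made up for this example. The table is `const`, which on an ESP32-C3 under ESP-IDF would typically keep it in flash-mapped read-only memory rather than RAM.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: hashed token key plus two signed 4-bit weights in one byte. */
typedef struct {
    uint16_t key;
    uint8_t  packed;   /* high nibble = w0, low nibble = w1 */
} entry_t;

#define TABLE_SIZE 8u          /* power of two so the hash can be a mask */
#define EMPTY_KEY  0xFFFFu

/* Precompiled open-addressing table (linear probing), laid out offline. */
static const entry_t TABLE[TABLE_SIZE] = {
    {8u, 0x12u},        {EMPTY_KEY, 0u}, {EMPTY_KEY, 0u}, {3u, 0x7Fu},
    {11u, 0x2Bu},       {4u, 0x80u},     {EMPTY_KEY, 0u}, {EMPTY_KEY, 0u},
};

/* Fixed probe batch compiled into the image; token 5 is a deliberate miss. */
static const uint16_t PROBE_BATCH[] = {3u, 11u, 4u, 8u, 5u};
#define PROBE_BATCH_LEN (sizeof PROBE_BATCH / sizeof PROBE_BATCH[0])

typedef struct {
    int32_t  score;     /* running sum of unpacked weights */
    uint32_t checksum;  /* FNV-1a fold over the packed bytes that were hit */
    uint32_t misses;    /* probes that found no entry */
} fold_result_t;

/* Sign-extend a 4-bit two's-complement value to int (range -8..7). */
static int unpack_nibble(unsigned n)
{
    return (int)((n & 0xFu) ^ 8u) - 8;
}

/* Look up a token; returns 1 and writes the packed byte on a hit, 0 on a miss. */
static int lookup_packed(uint16_t token, uint8_t *out)
{
    unsigned slot = token & (TABLE_SIZE - 1u);   /* toy hash: low bits */
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        const entry_t *e = &TABLE[(slot + i) & (TABLE_SIZE - 1u)];
        if (e->key == EMPTY_KEY) {
            return 0;                            /* empty slot ends the probe chain */
        }
        if (e->key == token) {
            *out = e->packed;
            return 1;
        }
    }
    return 0;
}

/* Stream over a probe batch, folding each hit into a score and a checksum. */
static fold_result_t run_probe_batch(const uint16_t *batch, size_t len)
{
    fold_result_t r = {0, 2166136261u, 0u};      /* FNV-1a 32-bit offset basis */
    for (size_t i = 0; i < len; i++) {
        uint8_t p;
        if (!lookup_packed(batch[i], &p)) {
            r.misses++;
            continue;
        }
        r.score += unpack_nibble((unsigned)p >> 4) + unpack_nibble(p);
        r.checksum = (r.checksum ^ p) * 16777619u;   /* FNV-1a 32-bit prime */
    }
    return r;
}
```

Calling `run_probe_batch(PROBE_BATCH, PROBE_BATCH_LEN)` walks the five probes in order, keeps only a few words of state in RAM, and never allocates. The point of the sketch is that the per-probe work is a lookup and a fold rather than a dense matrix multiply, which is the distinction I am asking about.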
Repo:
https://github.com/Alpha-Guardian/Engram
I’d be curious whether you would see this as:
- adjacent to TinyML-style MCU inference
- a different compression endpoint
- or a separate class of language-runtime specialization