NanoMamba

Noise-Robust Ultra-Compact Keyword Spotting via Dual-Expert Normalization and Adaptive State Space Dynamics

IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2026

Highlights

4,957 parameters — 4.8x smaller than DS-CNN-S (23.7K), yet more noise-robust
DualPCEN: Two complementary PCEN experts with signal-based routing (Spectral Flatness + Tilt + TMI)
SA-SSM: SNR-conditioned Mamba — delta-modulation + B-gating adapt temporal dynamics to noise level
Zero-overhead noise robustness: No separate enhancement module, no extra inference cost
FPGA-ready: Complete Verilog RTL for Xilinx Artix-7, INT8 datapath, ~15mW

Architecture

Raw Audio (16kHz, 1s)
  --> STFT (512-FFT, 160-hop)
  --> SNR Estimator (per-mel-band, running EMA)
  --> DualPCEN (Expert1: babble | Expert2: stationary, SF+Tilt+TMI routing)
  --> Instance Norm
  --> Patch Projection (n_mels -> d_model)
  --> N x SA-SSM Block
  |     LayerNorm -> in_proj -> DWConv1d -> SA-SSM -> Gate -> out_proj + Residual
  |     SA-SSM: SNR modulates delta (step size) and B (input gate)
  --> Global Average Pooling
  --> 12-class Classifier
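
The DualPCEN stage in the pipeline above can be illustrated with a small NumPy sketch. This is not the code in `nanomamba.py`: the expert parameters, the sigmoid gate, the choice to send high-flatness frames to the stationary expert, and the names `pcen`, `spectral_flatness`, and `dual_pcen` are all assumptions made for illustration, and only Spectral Flatness is used for routing (Tilt and TMI are omitted).

```python
import numpy as np

def pcen(E, s, alpha, delta, r, eps=1e-6):
    """Per-channel energy normalization of E with shape (frames, bands)."""
    E = np.asarray(E, dtype=float)
    M = np.empty_like(E)
    M[0] = E[0]
    for t in range(1, len(E)):
        # First-order IIR smoother (running EMA) of the band energies.
        M[t] = (1.0 - s) * M[t - 1] + s * E[t]
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r

def spectral_flatness(E, eps=1e-10):
    """Geometric mean / arithmetic mean per frame; close to 1 for flat spectra."""
    E = np.asarray(E, dtype=float) + eps
    return np.exp(np.mean(np.log(E), axis=1)) / np.mean(E, axis=1)

def dual_pcen(E, sf_threshold=0.5, sharpness=10.0):
    """Blend two PCEN experts with a soft gate driven by spectral flatness."""
    # Hypothetical expert settings; the trained values may differ.
    babble = pcen(E, s=0.025, alpha=0.98, delta=2.0, r=0.5)
    stationary = pcen(E, s=0.15, alpha=0.60, delta=2.0, r=0.5)
    sf = spectral_flatness(E)
    # Assumed routing: noise-like (flat) frames lean on the stationary expert.
    gate = 1.0 / (1.0 + np.exp(-sharpness * (sf - sf_threshold)))
    gate = gate[:, None]
    return gate * stationary + (1.0 - gate) * babble, gate
```

A soft gate keeps the blend differentiable, so the routing threshold could in principle be trained end to end along with the rest of the model.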

Key Results

Model	Params	Clean	Factory 0dB	White 0dB	Babble 0dB
NanoMamba-Tiny-DualPCEN (ours)	4,957	-	-	-	-
NanoMamba-Matched-DualPCEN (ours)	7,402	-	-	-	-
BC-ResNet-1	7,464	-	-	-	-
DS-CNN-S	23,700	-	-	-	-

Results will be filled in after full evaluation. Run the Colab notebook to reproduce.

Quick Start

Google Colab (Recommended)

Open NanoMamba_Train.ipynb in Google Colab. The notebook handles everything: dataset download, training, and noise evaluation.

Local

git clone https://github.com/DrJinHoChoi/NanoMamba-TASLP.git
cd NanoMamba-TASLP
pip install -r requirements.txt

# Train NanoMamba-Tiny-DualPCEN (4,957 params, ~8min on GPU)
python train_colab.py \
    --models NanoMamba-Tiny-DualPCEN \
    --epochs 30 --noise_aug --calibrate

# Evaluate on 5 noise types x 7 SNR levels
python train_colab.py \
    --models NanoMamba-Tiny-DualPCEN \
    --eval_only --calibrate \
    --noise_types factory,white,babble,street,pink \
    --snr_range=-15,-10,-5,0,5,10,15
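
Evaluating at a given SNR means scaling the noise so that the clean-to-noise power ratio hits the target before mixing. The sketch below shows the standard calculation. The function name `mix_at_snr` is hypothetical, and the evaluation code in `train_colab.py` may differ in details such as how noise is cropped or how power is measured.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db, eps=1e-12):
    """Return clean + scaled noise so the mixture has the requested SNR in dB."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Repeat the noise if it is shorter than the utterance, then crop to length.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2) + eps
    # SNR_dB = 10 * log10(p_clean / (scale^2 * p_noise)), solved for scale.
    scale = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise
```

For the grid used above, this would be called once per utterance for each of the 5 noise types and each SNR in -15, -10, -5, 0, 5, 10, 15 dB.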

Project Structure

NanoMamba-TASLP/
  nanomamba.py             # Core: SA-SSM + DualPCEN + MultiPCEN + SNR Estimator
  train_colab.py           # Training pipeline, noise evaluation, calibration
  model.py                 # CNN baselines (DS-CNN-S, BC-ResNet-1)
  paper_models.py          # Additional model variants for ablation
  proposed_model.py        # DualPCEN proposed architecture
  train_all_models.py      # Multi-model training orchestrator
  measure_efficiency.py    # Latency & memory benchmarks
  arm_analysis.py          # ARM Cortex-M deployment analysis
  NanoMamba_Train.ipynb    # Colab notebook (one-click training)
  requirements.txt
  checkpoints_full/        # Pre-trained model weights
  rtl/                     # FPGA/ASIC Verilog implementation
    src/                   #   10 RTL modules (SSM, PCEN, STFT, classifier, ...)
    tb/                    #   Testbench
    mem/                   #   LUT & weight memory files
    fpga/                  #   Xilinx Artix-7 constraints & wrapper
    Makefile               #   Simulation & synthesis automation
  scripts/                 # Weight export & LUT generation utilities

Model Variants

Model	Params	Description
`NanoMamba-Tiny`	4,634	SA-SSM baseline (d=16, s=4, 2 layers)
`NanoMamba-Tiny-DualPCEN`	4,957	+ DualPCEN with SF+Tilt routing
`NanoMamba-Tiny-TriPCEN`	5,118	+ 3-expert PCEN (factory/street specialist)
`NanoMamba-Matched-DualPCEN`	7,402	Param-matched to BC-ResNet-1 (d=21, s=5)
`NanoMamba-*-v2`	same	+ TMI + SNR-conditioned temperature + temporal smoothing
`NanoMamba-*-v2-SSMv2`	same	+ SA-SSM v2 (Michaelis-Menten + PCEN gate conditioning)

SA-SSM: How It Works

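Standard Mamba computes its selection parameters (step size delta, B, C) from the input features alone, with no explicit notion of how noisy a frame is. SA-SSM additionally conditions the selection mechanism on per-mel-band SNR: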

Delta modulation: Low SNR -> smaller step size -> longer temporal memory (average out noise)
B-gating: Low SNR -> attenuate noisy input -> preserve state from cleaner frames
Runtime calibration: Noise profile estimated during silence -> adaptive buffer parameters
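
The first two mechanisms above can be sketched as a single-channel diagonal state space scan. This is a simplified illustration and not the repository's SA-SSM: it uses one SNR value per frame instead of per-mel-band SNR, keeps B and C fixed instead of input-dependent, and the sigmoid mapping in `snr_gate` (with its `midpoint_db`, `slope`, and `floor` parameters) is an assumption.

```python
import numpy as np

def snr_gate(snr_db, midpoint_db=0.0, slope=0.3, floor=0.1):
    """Map SNR in dB to a factor in (floor, 1); low SNR gives a value near floor."""
    return floor + (1.0 - floor) / (1.0 + np.exp(-slope * (snr_db - midpoint_db)))

def sa_ssm_scan(x, snr_db, A, B, C, base_delta=0.5):
    """Run a diagonal SSM over x (T,) with SNR-modulated step size and input gate.

    A, B, C have shape (N,), and A must be negative for a stable recurrence.
    """
    h = np.zeros_like(A)
    y = np.empty(len(x))
    for t in range(len(x)):
        g = snr_gate(snr_db[t])
        delta = base_delta * g   # Delta modulation: low SNR -> smaller step.
        B_t = B * g              # B-gating: low SNR -> attenuated input.
        # exp(delta * A) approaches 1 as delta shrinks, so the state decays
        # more slowly and the model averages over a longer window.
        h = np.exp(delta * A) * h + delta * B_t * x[t]
        y[t] = C @ h
    return y
```

With an impulse input, the low-SNR scan admits less of the impulse but retains a larger fraction of it ten frames later, which is the trade-off the two bullets describe.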

FPGA Implementation

Spec	Value
Target	Xilinx Artix-7 (XC7A35T)
Resources	~2,500 LUTs, 4 DSP48, 3 BRAM
Power	~15mW (FPGA), ~0.08mW (ASIC 28nm estimate)
Datapath	INT8 weights, 32-bit accumulator
Clock	50MHz, real-time processing
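
The INT8 datapath with a 32-bit accumulator can be modeled in software as follows. This is a minimal sketch assuming symmetric per-tensor quantization and assuming activations are also quantized to INT8. The table above only states INT8 weights, and the scheme used by the export scripts in `scripts/` may differ. Both function names are hypothetical.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: w is approximated by scale * q."""
    w = np.asarray(w, dtype=float)
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matvec(q_w, scale_w, q_x, scale_x):
    """Multiply INT8 operands, accumulate in int32, then rescale to float."""
    # Widen before multiplying so products and sums do not overflow int8.
    acc = q_w.astype(np.int32) @ q_x.astype(np.int32)
    return acc.astype(np.float64) * (scale_w * scale_x)
```

Each product is at most 127 x 127 = 16,129 in magnitude, so a 32-bit accumulator has ample headroom for the short dot products in a model of this size.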

Dataset

Google Speech Commands V2 (12-class: yes, no, up, down, left, right, on, off, stop, go + silence + unknown). Automatically downloaded by the training script.
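
The 12-class setup collapses every Speech Commands word outside the ten keywords into a single "unknown" class. A minimal sketch of that mapping follows. The index order and the helper name `to_12_class` are assumptions, and the training script may order its labels differently.

```python
KEYWORDS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]
CLASSES = KEYWORDS + ["silence", "unknown"]
CLASS_TO_INDEX = {name: index for index, name in enumerate(CLASSES)}

def to_12_class(word):
    """Map a Speech Commands word (or the tag "silence") to a 12-class index."""
    if word in CLASS_TO_INDEX and word != "unknown":
        return CLASS_TO_INDEX[word]
    # Every other recorded word in the dataset falls into "unknown".
    return CLASS_TO_INDEX["unknown"]
```

For example, `to_12_class("stop")` returns the keyword's own index, while a non-keyword such as "marvin" maps to the shared "unknown" index.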

Citation

@article{choi2026nanomamba,
  author  = {Choi, Jin Ho},
  title   = {{NanoMamba}: Noise-Robust Ultra-Compact Keyword Spotting via
             Dual-Expert Normalization and Adaptive State Space Dynamics},
  journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  year    = {2026},
  volume  = {},
  number  = {},
  pages   = {},
  doi     = {}
}

License

Dual license: Free for academic/research use. Commercial use requires a separate license. See LICENSE for details.
