This repository is released under a custom Academic Evaluation License to facilitate peer review and reproducibility. Commercial deployment, hardware synthesis integration, or utilization in proprietary architectures requires a separate Commercial IP license. See the LICENSE file for details. Algorithms and architectures described herein are patent pending (U.S. App. No. 63/987,398 and supplemental filings).
Companion code for the paper:
The AetherFloat Family: Block-Scale-Free Quad-Radix Floating-Point Architectures for AI Accelerators
This repository contains all scripts needed to reproduce the figures, tables, and validations reported in the paper.
uv sync
docker build -t aetherfloat-synth -f Dockerfile .
g++ -O2 -o aether_core src/aether_core.cpp

Compares quantization noise (SQNR) between bfloat16 and AetherFloat-16.
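As a rough illustration of the SQNR metric, the sketch below measures the signal-to-quantization-noise ratio of bfloat16 rounding on uniformly distributed samples. It is not taken from src/wobble_plot.py: the helper names `to_bfloat16` and `sqnr_db`, the sample range, and the sample count are assumptions made for this example, and the AetherFloat-16 side is omitted because its encoding is not described in this README.

```python
import math
import random
import struct


def to_bfloat16(x: float) -> float:
    """Round a finite value to bfloat16 precision (round-to-nearest-even).

    Hypothetical helper for illustration; NaN/inf and overflow are not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add half of the discarded low 16 bits, breaking ties toward an even result.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000  # keep sign, exponent, and the top 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def sqnr_db(signal, quantized) -> float:
    """SQNR in dB: 10 * log10(signal power / quantization-noise power)."""
    signal_power = sum(s * s for s in signal)
    noise_power = sum((s - q) ** 2 for s, q in zip(signal, quantized))
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)
samples = [rng.uniform(1.0, 2.0) for _ in range(10_000)]  # assumed test signal
bf16_sqnr = sqnr_db(samples, [to_bfloat16(s) for s in samples])
print(f"bfloat16 SQNR on [1, 2]: {bf16_sqnr:.1f} dB")
```

Repeating the measurement over several input ranges, rather than the single range used here, is one way to see how SQNR varies with signal magnitude.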
uv run src/wobble_plot.py
# → wobble_plot.pdf

QAT ablation on Qwen2.5-7B comparing stochastic rounding chunk sizes (1, 16, 256) against a bfloat16 baseline (300 steps).
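For readers unfamiliar with stochastic rounding, here is a minimal sketch of the general idea on a uniform grid. It is an assumption-laden illustration, not the code in src/train_ablation.py: the function `stochastic_round` and its `step` parameter are hypothetical, and the chunk size studied in the ablation is not modeled here.

```python
import math
import random


def stochastic_round(x: float, step: float, rng: random.Random) -> float:
    """Round x to a multiple of `step`.

    The value rounds up with probability equal to its fractional distance
    from the lower grid point, so the result is unbiased in expectation.
    """
    scaled = x / step
    lower = math.floor(scaled)
    round_up = rng.random() < (scaled - lower)
    return (lower + round_up) * step


rng = random.Random(0)
x, step = 0.3, 1.0  # assumed example value and grid spacing
draws = [stochastic_round(x, step, rng) for _ in range(100_000)]
mean = sum(draws) / len(draws)
print(f"mean of stochastic roundings of {x}: {mean:.3f}")
print(f"round-to-nearest of {x}: {round(x / step) * step}")
```

Round-to-nearest always maps 0.3 to 0.0 on this grid, whereas the stochastic version averages close to 0.3, which is why it can preserve small updates that deterministic rounding would discard.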
uv run src/train_ablation.py
# → sr_ablation_qwen_real_7b.pdf

Quantization-aware training comparing 8-bit AF8 (scale-free) vs FP8 E4M3 vs bfloat16 baseline on Qwen2.5-7B (200 steps).
uv run src/train_qat_af8.py
# → qat_8bit_convergence_ste_7b.pdf

Post-training quantization evaluation on WikiText-2, PIQA, and HellaSwag.
uv run src/eval_ptq_7b.py --fmt all

Run a single format with --fmt bf16, --fmt fp8, --fmt af8, or --fmt af16.
Synthesizes FP8 Base-2 and AF8 Base-4 MAC datapaths and compares area, delay, and power using Yosys and OpenSTA.
Note: This script requires Yosys and OpenSTA, which are EDA tools with complex build dependencies (Tcl, Boost, CUDD, etc.). The Dockerfile packages the entire toolchain so you don't need to install them on your host.
docker build -t aetherfloat-synth -f Dockerfile .
docker run --rm -v "$(pwd):/workspace" -w /workspace \
  aetherfloat-synth python3 src/synth_mac_datapath.py

Validates AetherFloat-16 encoding/decoding with 1 million random floats and verifies the O(1) lexicographic sorting property through monotonicity checks.
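The shape of such a monotonicity check can be sketched in a few lines. The example below is only an analogy: it uses IEEE float32 with the well-known sign-flip key transform as a stand-in encoder, not the AetherFloat-16 encoding implemented in src/aether_core.cpp, and the names `float32_sort_key`, `raw_float32_bits`, and `is_monotonic` are hypothetical.

```python
import random
import struct


def raw_float32_bits(x: float) -> int:
    """Raw IEEE float32 bit pattern as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def float32_sort_key(x: float) -> int:
    """Stand-in encoder (NOT AetherFloat-16): an unsigned integer whose
    ordering matches numeric ordering. Negatives have all bits flipped;
    non-negatives have the sign bit set."""
    bits = raw_float32_bits(x)
    return bits ^ 0xFFFFFFFF if bits & 0x80000000 else bits | 0x80000000


def is_monotonic(encode, values) -> bool:
    """True if the integer code never decreases as the value increases."""
    codes = [encode(v) for v in sorted(values)]
    return all(a <= b for a, b in zip(codes, codes[1:]))


rng = random.Random(0)
values = [rng.uniform(-1e3, 1e3) for _ in range(100_000)]  # assumed test range
print("sortable key monotonic:", is_monotonic(float32_sort_key, values))
print("raw bits monotonic:", is_monotonic(raw_float32_bits, values))
```

An encoding that passes this kind of check can be ordered with plain integer comparisons on its bit patterns, which is the property the validation program tests for AetherFloat-16.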
g++ -O2 -o aether_core src/aether_core.cpp
./aether_core

| File | Paper Reference | Description |
|---|---|---|
| src/aether_sim.py | — | Core quantization library (AF8/AF16, FP8 baseline, PTQ/QAT patching) |
| src/aether_core.cpp | Section IV-A | Lexicographic sort validation (1M random floats) |
| src/wobble_plot.py | Figure 1 | SQNR wobble comparison: bfloat16 vs AetherFloat-16 |
| src/train_ablation.py | Figure 2 | Stochastic rounding ablation study on Qwen2.5-7B |
| src/train_qat_af8.py | Figure 3 | QAT convergence: AF8 vs FP8 vs bfloat16 |
| src/eval_ptq_7b.py | Table II | PTQ benchmarks (WikiText-2, PIQA, HellaSwag) |
| src/synth_mac_datapath.py | Table III | MAC datapath synthesis (area / delay / power) |
| Dockerfile | Table III | Build environment for Yosys + OpenSTA |

- Python >= 3.11, uv
- Multi-GPU setup for training/eval scripts (Qwen2.5-7B)
- Docker for hardware synthesis (Yosys + OpenSTA are built inside the container)
- C++ compiler (g++ or clang++) for aether_core.cpp

If you find this code or our paper useful in your research, please consider citing:
@article{morisaki2026aetherfloat,
title={The AetherFloat Family: Block-Scale-Free Quad-Radix Floating-Point Architectures for AI Accelerators},
author={Morisaki, Keita},
journal={arXiv preprint arXiv:2603.08741},
year={2026},
eprint={2603.08741},
archivePrefix={arXiv},
primaryClass={cs.AR}
}