CTO & Co-Founder at QWERKY AI, where I build novel LLM architectures from CUDA kernels to production inference. Currently pursuing my MS in Computer Science at Georgia Tech while shipping AI systems at scale. Across 10+ years of experience, I've shipped 15+ production apps and led teams of 20+ engineers, and I have an LLM patent pending.
- LLM Architecture Research -- Novel attention mechanisms with custom CUDA implementations
- State Space Models -- Contributing Mamba SSM architectures to Modular's MAX framework in Mojo
- QDistill -- Converting transformer layers into state space layers for 4x throughput and 1M-token context lengths
- Open Source -- Kernels for selective scan, causal conv1d, and RMSNorm in the Modular ecosystem
Languages
AI / ML
Infrastructure
- Modular MAX Framework
- Pulley
- QWERKY AI
- key-gen
- Robot2815 -- FIRST Robotics Competition team code
- Bringing Blazing Fast State Space Models to the Modular MAX Framework -- Feb 2026
- Mother May AI: An Opinion on Geoffrey Hinton's Mother AI -- Sep 2025
- Attention: The Breakthroughs and the Bottlenecks -- Jun 2025
- Incidental Non-Determinism: When AI Surprises You (and Why) -- May 2025
Read more on the QWERKY AI blog →




