Data scientist at The New York Times, where I build experimentation infrastructure and statistical tooling. Side projects in Python and whatever seems interesting. Based in Brooklyn.
-
Beat the Streak v2 — PA-level MLB hit prediction model that beats published benchmarks on backtested data. LightGBM on 1.5M plate appearances, 13 provably leak-free features, validated across 6 seasons. Successor to my 2021 project.
-
FTL Reinforcement Learning — Training an RL agent to play FTL: Faster Than Light using Gymnasium + Stable Baselines3. Live memory reading via Mach VM API, custom action masking, curriculum learning.
-
Screenshot to Calendar — Turn screenshots of events into Google Calendar entries using Claude's vision AI. Vercel, Cloudflare Workers, Google OAuth.
-
NYC Parking Ticket Heatmap — Interactive heatmap of parking tickets by street, day, and hour. NYC Open Data API, SQLite, Streamlit.
-
Languages: Python, SQL, JavaScript
ML/Data Science: LightGBM, scikit-learn, pandas, PyTorch, Gymnasium
Data Infrastructure: dbt, BigQuery, Statsig, Airflow, SQLAlchemy
Infrastructure: GCP, Vercel, Docker, GitHub Actions, PostgreSQL