Kieran Bicheno combines deep expertise in AI/ML development, mass media, and data-driven infrastructure with a background in digital news leadership and economics. He has more than ten years' experience designing, deploying, and scaling workflow systems and data pipelines, including applied AI and MLOps work since 2018.
- Machine Learning & AI:
  - MLOps architecture, automated model training pipelines, and continuous-integration workflows
  - Research contributions in dataset documentation (co-author of “Datasheet for The Pile”)[^1]
  - Implementation of local LLM deployments and agentic coding systems (e.g., llama.cpp, Mistral)
- Infrastructure & DevOps:
  - Design and management of self-hosted GPU clusters (NVIDIA RTX-class hardware) on Ubuntu/Linux
  - Containerization with Docker and orchestration using Docker Compose and Kubernetes
  - Secrets management (Infisical), PostgreSQL configuration, and systemd service automation
- Programming & Automation:
  - Advanced Python development (data pipelines, audio processing) and Google Apps Script automation
  - Shell scripting and CLI tooling around ffmpeg and yt-dl, plus workflow automation (n8n, PM2/Bun)
- Data Engineering:
  - Large-scale economic and cosmological data ingestion, statistical analysis, and time-series forecasting
  - API design for Stable Diffusion and cost-push inflation models
- Led AI augmentation strategy and digital transformation in high-stress news environments at News Corp Australia, pioneering a Google Apps Script tool that cut a critical workflow from 2 hours to 12 minutes.
- Co-authored the influential “Datasheet for The Pile” paper on arXiv, establishing metadata standards for large language-model datasets.[^1]
- Architected and optimized self-hosted inference environments for LLMs, overcoming PyTorch compatibility issues with RTX 5090 GPUs.
- Directed GenFactory.io’s industrial-scale Stable Diffusion API platform, delivering data-driven image generation in production.
His GitHub profile showcases a broad range of repositories spanning AI/ML prototypes, self-hosting tooling, data-analysis libraries, and multimedia processing utilities. Highlights include:
- MLOps pipeline templates for training and deploying transformer models
- Kubernetes manifests and Helm charts for GPU-accelerated inference
- Audio-visual production scripts and video-processing workflows
- “Datasheet for The Pile” – co-author, arXiv:2201.07311
- Independent MLOps research on autonomous systems control and workforce augmentation (2020–2021)
- Cosmological data analysis using radio-telescope datasets (2020–2021)
- LinkedIn: linkedin.com/in/kieranbicheno
- Email: kieran.bicheno@gmail.com
- Phone: 0451 139 937




