This repo contains the code and infrastructure configuration for a machine-learning prediction API deployed on AWS EKS (Kubernetes) with Redis caching, Istio, and Grafana-based load-testing analysis (k6 + Prometheus).
I wrote a full project walkthrough here:
👉 Project Page: https://napronald.github.io/pages/mlapi.html
The goal of the project is to demonstrate how to run a real ML service in production-style infrastructure — including caching, autoscaling, and latency analysis under load.