This project implements a real-time data pipeline for Polymarket, consisting of a Python ingestion client, a Kafka message broker, and a Flink analytics engine running on Kubernetes.
- Ingestion Producer: A Python application that subscribes to Polymarket's WebSocket API and pushes market events to Kafka.
- Kafka: Acts as the message broker buffering events.
- Flink Analytics:
- Market Analytics: A Java-based Flink job that consumes events from Kafka, performs real-time aggregation/analytics.
- Sentiment Analysis: A Flink job that performs sentiment analysis on market data.
- Session Cluster: An interactive Flink session cluster for ad-hoc job submission.
- Infrastructure: The entire stack runs on Kubernetes, managed via Helm charts and K8s manifests. Includes Prometheus/Grafana for monitoring and Postgres for storage.
apps/ingestion-producer/: Python client for fetching Polymarket data.flink-analytics/: Flink Java application for processing data streams.
infrastructure/helm/: Values files for Helm charts (Kafka, Postgres, Prometheus).k8s/: Kubernetes manifests for deployments, services, and the Flink Operator (including Market Analytics, Sentiment Analysis, and Session Cluster).
Tiltfile: Configuration for local development with Tilt.
- Docker
- Minikube
- kubectl
- Helm
- Java JDK 17 (for building Flink jobs)
- Maven (for building Flink jobs)
- Tilt (for local development)
The project uses Tilt for local development, which handles building, deploying, and live-reloading automatically.
Create a Minikube cluster with sufficient resources:
minikube start --cpus 4 --memory 7000Point your terminal's Docker client to Minikube's Docker daemon:
eval $(minikube docker-env)Run Tilt to build images, deploy infrastructure, and start all applications:
tilt upTilt will automatically:
- Install infrastructure (Flink Operator, Kafka, Postgres, Prometheus/Grafana)
- Build Docker images for the Ingestion Producer and Flink Analytics jobs
- Deploy all applications (Ingestion, Market Analytics, Sentiment Analysis, Session Cluster) with proper dependency ordering
- Set up port forwarding for services
To stop and clean up all resources:
tilt downkubectl port-forward svc/prometheus-grafana 3000:80
Market Analytics:
kubectl port-forward svc/poly-example-rest 8081Sentiment Analysis:
kubectl port-forward svc/poly-sentiment-rest 8082:8081Session Cluster:
kubectl port-forward svc/flink-session-rest 8083:8081Check the logs of the ingestion producer:
kubectl logs -f -l app=polymarket-clientCheck the logs of the Flink TaskManager:
kubectl logs -f -l component=taskmanagerTilt provides live reload - changes to source files automatically trigger rebuilds and redeployments.
- Edit code in
apps/flink-analytics/jobs/poly/src/. - Tilt automatically runs Maven, rebuilds the Docker image, and redeploys.
- Edit code in
apps/ingestion-producer/. - Tilt automatically rebuilds and redeploys the container.
For an interactive session download Flink jars and use the sql-client to interact with the session cluster:
./bin/sql-client.sh embedded -Drest.address=localhost -Drest.port=8083