Data Engineer building the infrastructure behind data-driven products
What I Do:
- Design & deploy scalable ETL pipelines handling terabytes of real-time data
- Optimize data warehouses for performance & cost (reduced costs by 30% through a more efficient architecture)
- Build MLOps platforms with real-time & batch feature computation
- Ensure data governance & quality at scale
Technologies:
AWS, Apache Spark, Kafka, Airflow, Redshift, EMR, MSK, S3, Cassandra, Lakehouse
Currently Learning:
- 🏗️ Lakehouse architecture patterns (Delta Lake, Iceberg, Hudi)
- ⚙️ End-to-end MLOps platform design
Looking to Contribute:
- 🤝 Apache Cassandra open-source projects
- 💬 Open to collaborating on real-time data engineering & MLOps initiatives

