I am a Data Scientist at Amazon with experience in agentic systems, causal ML, and retrieval optimization. I hold a Master's in Mathematics, Vision, and Learning (MVA) from ENS Paris-Saclay.
Thesis evaluating retrieval strategies for enterprise knowledge bases connected to agentic systems. Studied agentic chunking, GraphRAG, query-aware hybrid fusion, and conditional reranking — achieving a 52% relative improvement in Top-1 Exact Match.
Implementation for the final project of the LLMs class at MVA, focusing on post-training pruning of Mixture-of-Experts (MoE) models.
Fine-tuning & evaluating pre-trained audio models (HuBERT & XLSR) on the ML-SUPERB dataset for monolingual and multilingual speech recognition tasks.
Implementation of a reinforcement learning solution for HIV treatment optimization using Fitted Q-Iteration (FQI) with XGBoost regression models.
Evaluation of various anomaly detection techniques on time series data & implementation of tools for benchmarking, ensemble creation, & performance analysis.
Implementation & analysis of the PCA-based K-Means Clustering method proposed in the paper "K-means Clustering via Principal Component Analysis" by Ding and He (2004).
Fine-tuning RoBERTa using Low-Rank Adaptation (LoRA) on different datasets.
- Data Scientist at Amazon — Building agentic solutions, knowledge base infrastructure, and causal ML models for supply chain optimization
- Research Intern at Cambridge Centre for Alternative Finance (CCAF) — Anomaly detection and NLP for financial research
- M2 MVA (Mathematics, Vision, Learning) at ENS Paris-Saclay
- Engineering Diploma (Exchange semester) at CentraleSupélec
- Bachelor in Mathematics from the University of Luxembourg
- Programming: Python, Java, SQL, R
- AI & ML Frameworks: PyTorch, TensorFlow, scikit-learn
- AWS: Bedrock, AgentCore, Strands
- Data Engineering: PostgreSQL, MongoDB, Spark, Hadoop
- Languages: English (C2), French (C1), German (C1), Luxembourgish (C2)
- Email: olijacklu@gmail.com
- LinkedIn: oliver-jack-41a998216