🎓 Data Scientist | Machine Learning Enthusiast | AI Innovator

📍 Based in London, UK
Welcome to my GitHub! I’m a passionate Data Scientist with a Master’s degree in Data Science, driven by curiosity and a desire to turn data into actionable insights. I love exploring how AI and analytics can solve real-world problems, from building predictive models and time series forecasting systems to developing NLP and generative AI solutions.
- 🎓 Master’s in Data Science with hands-on experience across the entire ML lifecycle, from data cleaning and feature engineering to model deployment and interpretation.
- 💡 Fascinated by the intersection of AI, business strategy, and decision intelligence.
- 🧮 Experienced in machine learning, deep learning (Transformers, LSTMs), NLP, and statistical modeling.
- 🌱 Currently expanding my expertise in Generative AI and MLOps for scalable production systems.
- 🎯 My goal: to use data science not just for prediction but for impact, by creating smarter, more sustainable, and human-centered solutions.
- Languages: Python, SQL, R
- Libraries & Frameworks: NumPy, Pandas, Scikit-learn, TensorFlow, PyTorch, Hugging Face, Statsmodels
- Data Visualization: Matplotlib, Seaborn, Plotly, Tableau, Power BI
- Databases & Cloud: MySQL, AWS, GCP (familiarity)
- Other Tools: Git, Jupyter, Streamlit, Excel, Airflow (conceptual)
Predicting product-level sales using ARIMA, SARIMAX, and LSTM models to improve inventory management, reduce waste, and meet customer demand.

- 🔹 Techniques: Time Series Analysis, Feature Engineering, Model Evaluation
- 🔹 Tools: Python, Pandas, Statsmodels, TensorFlow
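As a rough illustration of the model-evaluation step, here is a minimal sketch of a seasonal-naive baseline scored with mean absolute error on a holdout week. This is not the project's code: the sales figures are made up, and the function names and the 7-day season length are assumptions chosen for the example. A baseline like this gives a reference error that ARIMA, SARIMAX, or LSTM forecasts would need to beat.

```python
from statistics import mean


def seasonal_naive_forecast(history, horizon, season_length):
    """Forecast by repeating the last full season of observed values."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical daily sales for one product with a weekly pattern (made-up numbers).
sales = [
    12, 15, 14, 13, 20, 30, 28,
    11, 16, 15, 12, 22, 31, 27,
    13, 15, 16, 14, 21, 29, 30,
]

# Hold out the final week for evaluation; never shuffle time series data.
train, test = sales[:-7], sales[-7:]

baseline = seasonal_naive_forecast(train, horizon=len(test), season_length=7)
baseline_mae = mean_absolute_error(test, baseline)
print(f"Seasonal-naive MAE: {baseline_mae:.2f}")
```

The same `train`/`test` split and error function can be reused to score any other forecaster, which keeps the comparison between models like-for-like.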
Developed an end-to-end Transformer-based neural machine translation (NMT) model using the OPUS Books dataset to translate between English and French, leveraging Hugging Face and PyTorch.

- 🔹 Techniques: NLP, Sequence-to-Sequence Models, Attention Mechanism
- 🔹 Tools: Python, PyTorch, Hugging Face, BLEU Evaluation
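To show what BLEU evaluation measures, here is a simplified, unsmoothed sentence-level BLEU in plain Python. It is a teaching sketch only, not the scoring code used in the project: real evaluations are normally corpus-level and use an established library implementation, and the function names and example sentences here are assumptions.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Unsmoothed BLEU for one candidate against one reference (token lists)."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        # Clipped matches: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Hypothetical example sentences, whitespace-tokenized.
reference = "le chat est sur le tapis".split()
candidate = "le chat est sur le sol".split()
score = sentence_bleu(reference, candidate)
print(f"BLEU: {score:.3f}")
```

Because the score is a geometric mean of 1- to 4-gram precisions, a single wrong word lowers it at every n-gram order, which is why short sentences are penalized heavily without smoothing.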
Worked collaboratively to build machine learning models predicting passenger survival on an interstellar voyage.

- 🔹 Conducted data cleaning, feature extraction, and missing-value imputation
- 🔹 Engineered features such as total expenditure, deck-based grouping, and cabin structure
- 🔹 Tested multiple models (Logistic Regression, Random Forest, XGBoost)
- 🔹 Evaluated performance using accuracy, ROC-AUC, and cross-validation
- 🔹 Built explainability plots to interpret model decisions

This project strengthened my teamwork, version control, and structured experimentation skills, while reinforcing best practices in the ML lifecycle.
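The feature-engineering ideas in this project (total expenditure, deck grouping, cabin structure) can be sketched as follows. The column names, the `deck/number/side` cabin format, and the `engineer_features` helper are all assumptions made for illustration, not the project's actual schema or code.

```python
# Assumed spending columns; the real dataset's column names may differ.
SPEND_COLUMNS = ("RoomService", "FoodCourt", "ShoppingMall", "Spa", "VRDeck")


def engineer_features(passenger):
    """Derive spend and cabin features from one passenger record (a dict)."""
    # Treat missing or None spend values as zero before summing.
    spend = [passenger.get(col) or 0.0 for col in SPEND_COLUMNS]
    total_spend = sum(spend)

    # Assumed cabin format: "deck/number/side", e.g. "B/12/P".
    cabin = passenger.get("Cabin")
    parts = cabin.split("/") if cabin else []
    if len(parts) == 3 and parts[1].isdigit():
        deck, number, side = parts[0], int(parts[1]), parts[2]
    else:
        deck, number, side = "Unknown", None, "Unknown"

    return {
        "TotalSpend": total_spend,
        "NoSpend": total_spend == 0,
        "Deck": deck,
        "CabinNumber": number,
        "Side": side,
    }


# Hypothetical passenger record with one missing spend value.
example = {"RoomService": 10.0, "FoodCourt": None, "Spa": 5.5, "Cabin": "B/12/P"}
features = engineer_features(example)
print(features)
```

Keeping an explicit `"Unknown"` category for missing cabins, rather than dropping those rows, lets tree-based models such as Random Forest or XGBoost learn from the missingness itself.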
- Exploring LLM fine-tuning for domain-specific text generation and summarization.
- Building data dashboards to visualise performance metrics interactively using Streamlit.
- Learning MLOps tools for end-to-end pipeline deployment.
When I’m not coding, you’ll probably find me: 🎮 Gaming on my PS5 (big fan of story-driven titles), 🎨 Sketching and exploring creative design, or
- 💼 LinkedIn
- 📧 omkarssss1414@gmail.com
- 🧠 Always open to collaborations, research discussions, or interesting AI projects.
"Data by itself is just noise — insight is what turns it into a story that drives change."