OmkarSawant23/README.md

👋 Hi, I'm Omkar Sawant

🎓 Data Scientist | Machine Learning Enthusiast | AI Innovator
📍 Based in London, UK

Welcome to my GitHub! I’m a passionate Data Scientist with a Master’s degree in Data Science, driven by curiosity and a desire to turn data into actionable insights. I love exploring how AI and analytics can solve real-world problems, from building predictive models and time series forecasting systems to developing NLP and generative AI solutions.


🧠 About Me

  • 🎓 Master’s in Data Science with hands-on experience across the entire ML lifecycle, from data cleaning and feature engineering to model deployment and interpretation.
  • 💡 Fascinated by the intersection of AI, business strategy, and decision intelligence.
  • 🧮 Experienced in machine learning, deep learning (Transformers, LSTMs), NLP, and statistical modeling.
  • 🌱 Currently expanding my expertise in Generative AI and MLOps for scalable production systems.
  • 🎯 My goal: to use data science not just for prediction, but for impact, creating smarter, more sustainable, and human-centered solutions.

🔬 Technical Skills

  • Languages: Python, SQL, R
  • Libraries & Frameworks: NumPy, Pandas, Scikit-learn, TensorFlow, PyTorch, Hugging Face, Statsmodels
  • Data Visualization: Matplotlib, Seaborn, Plotly, Tableau, Power BI
  • Databases & Cloud: MySQL, AWS, GCP (familiarity)
  • Other Tools: Git, Jupyter, Streamlit, Excel, Airflow (conceptual)


📊 Featured Projects

🛒 Supermarket Sales Forecasting using Time Series Models

Predicting product-level sales using ARIMA, SARIMAX, and LSTM models to improve inventory management, reduce waste, and meet customer demand.
🔹 Techniques: Time Series Analysis, Feature Engineering, Model Evaluation
🔹 Tools: Python, Pandas, Statsmodels, TensorFlow

🧠 Neural Machine Translation with Transformers

Developed an end-to-end Transformer-based NMT model using the OPUS Books dataset to translate between English and French, leveraging Hugging Face and PyTorch.
🔹 Techniques: NLP, Sequence-to-Sequence Models, Attention Mechanism
🔹 Tools: Python, PyTorch, Hugging Face, BLEU Evaluation
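For readers unfamiliar with BLEU, the metric listed above, here is a minimal sketch of how a sentence-level score can be computed. It is a simplified illustration, not the project's evaluation code: it assumes whitespace tokenization, a single reference, no smoothing, and n-grams up to order 4. The function names `ngram_counts` and `sentence_bleu` are hypothetical. A real evaluation would normally rely on an established library implementation.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference.

    Assumptions (illustrative only): whitespace tokenization, uniform
    weights over n-gram orders 1..max_n, and a score of 0.0 whenever
    any order has no matching n-grams.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not hyp:
        return 0.0

    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        # Modified precision: clip each hypothesis n-gram count
        # by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precision += math.log(overlap / total) / max_n

    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hyp) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))
    return brevity_penalty * math.exp(log_precision)


perfect = sentence_bleu("the cat sat on the mat", "the cat sat on the mat")
partial = sentence_bleu("the cat sat on the mat", "the cat sat on a mat")
unrelated = sentence_bleu("the cat sat on the mat", "dogs run fast in parks")
```

An exact match scores 1.0, a translation with no shared words scores 0.0, and a near miss lands in between. Corpus-level BLEU, which is what is usually reported, aggregates n-gram counts over all sentences before taking the geometric mean.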

🚀 Group Project: Kaggle – Spaceship Titanic Classification Challenge

Worked collaboratively to build machine learning models predicting whether passengers on an interstellar voyage were transported to an alternate dimension.
🔹 Conducted data cleaning, feature extraction, and missing value imputation
🔹 Engineered features such as total expenditure, deck-based grouping, and cabin structure
🔹 Tested multiple models (Logistic Regression, Random Forest, XGBoost)
🔹 Evaluated performance using accuracy, ROC-AUC, and cross-validation
🔹 Built explainability plots to interpret model decisions

This project strengthened my teamwork, version control, and structured experimentation skills, while reinforcing best practices in the ML lifecycle.
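The engineered features mentioned for this project (total expenditure, deck grouping, cabin structure) can be sketched roughly as follows. This is an illustrative example, not the group's actual code. It assumes the public Kaggle column names (`RoomService`, `FoodCourt`, `ShoppingMall`, `Spa`, `VRDeck`, `Cabin`), the `deck/num/side` cabin format, missing values represented as `None`, and missing spend treated as zero. The function name `engineer_features` and the output keys are hypothetical, and plain dictionaries are used only to keep the sketch dependency-free.

```python
SPEND_COLUMNS = ("RoomService", "FoodCourt", "ShoppingMall", "Spa", "VRDeck")


def engineer_features(row):
    """Return a copy of a passenger record with derived features added.

    Assumption: `row` is a dict keyed by the Kaggle column names and
    missing values are None.
    """
    features = dict(row)

    # Total expenditure: sum of all amenity spend, treating missing as 0.
    features["TotalSpend"] = sum(float(row.get(col) or 0.0) for col in SPEND_COLUMNS)

    # Cabin structure: split "deck/num/side" into three separate features.
    cabin = row.get("Cabin")
    if cabin and cabin.count("/") == 2:
        deck, num, side = cabin.split("/")
        features["Deck"] = deck
        features["CabinNum"] = int(num) if num.isdigit() else None
        features["Side"] = side
    else:
        # Keep missing or malformed cabins as an explicit category.
        features["Deck"] = "Unknown"
        features["CabinNum"] = None
        features["Side"] = "Unknown"
    return features


example = engineer_features({
    "RoomService": 10.0, "FoodCourt": None, "ShoppingMall": 5.0,
    "Spa": 0.0, "VRDeck": 2.5, "Cabin": "B/12/P",
})
no_cabin = engineer_features({"RoomService": 3.0, "Cabin": None})
```

Keeping an explicit "Unknown" category for missing cabins, rather than dropping those rows, lets tree-based models such as Random Forest or XGBoost learn from the missingness itself.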


🧩 What I’m Currently Working On

  • Exploring LLM fine-tuning for domain-specific text generation and summarization.
  • Building data dashboards to visualise performance metrics interactively using Streamlit.
  • Learning MLOps tools for end-to-end pipeline deployment.

🕹️ Beyond Data

When I’m not coding, you’ll probably find me 🎮 gaming on my PS5 (big fan of story-driven titles) or 🎨 sketching and exploring creative design.


📫 Get in Touch

💼 LinkedIn
📧 omkarssss1414@gmail.com
🧠 Always open to collaborations, research discussions, or interesting AI projects.


"Data by itself is just noise — insight is what turns it into a story that drives change."


Pinned

  1. Advanced-Research-Topics (Public)

    A comprehensive time series forecasting project applying both classical statistical models (ARIMA) and deep learning methods (LSTM) to Johnson & Johnson EPS and Amazon stock price data. Includes ex…

    Jupyter Notebook 1

  2. Assignment_2LLM (Public)

    A full German→English translation pipeline built using the OPUS Books dataset and a Transformer-based neural machine translation model. The project covers dataset exploration, preprocessing, tokeni…

    Jupyter Notebook

  3. Major_Project (Public)

    A time series forecasting project comparing ARIMA, SARIMAX, LSTM, DES, and TES models to predict supermarket beverage sales (beer, wine, liquor). Includes data preprocessing, model training, evalua…

    Jupyter Notebook 1

  4. Research-Methods-in-data-science (Public)

    A complete PyTorch pipeline for training and evaluating Mask R-CNN and semantic segmentation models on the COCO dataset. Includes dataset loading, EDA visualizations, custom dataset classes, mask g…

    Jupyter Notebook 1

  5. DHV_infographics_project (Public)

    A Python-based E-Commerce analytics dashboard that merges multiple dimensional datasets, filters sales data for 2021, and visualizes insights such as top suppliers, best-selling products, payment m…

    Jupyter Notebook 1

  6. AP_Assignment_2 (Public)

    A Python project that analyzes World Bank indicator datasets (1990–2020) for selected countries. It performs statistical analysis, generates line plots, bar charts, and correlation heatmaps, and vi…

    Python