Data Science Student | Data Analyst | Business Analyst | Machine Learning Engineer | Research Enthusiast
Data Science Student at S.I.E.S College of Arts, Science and Commerce (Autonomous), Mumbai
Graduating: 2026
I'm a data science practitioner who builds end-to-end ML solutions, from exploratory analysis and model development to statistical validation and production deployment. My work spans time series forecasting, natural language processing, predictive analytics, and interactive data applications.
- Research Areas: Deep Learning, Time Series, NLP, Sports Analytics
- Technical Focus: PyTorch, Feature Engineering, Model Optimization, Deployment
- Approach: Data-driven decision making with rigorous validation
- Passion: Solving real-world business problems with AI and statistical methods
- Status: Open to ML Engineering, Data Science, and Research opportunities
24-hour AQI predictions across 10 Indian cities using Temporal Fusion Transformer
- Built a synthetic dataset from source data with 99.7% missing values using Gaussian Process Regression
- Achieved a 96.5% RMSE reduction through iterative model optimization (v1 → v3)
- Deployed interactive Streamlit dashboard with probabilistic forecasts (Q10/Q50/Q90)
- Tech: PyTorch, TFT Architecture, Scikit-learn, Streamlit, Plotly
- Results: R² = 0.9584, statistically validated (p < 0.05)
View Project | Live Demo
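To illustrate how probabilistic forecasts like the Q10/Q50/Q90 bands above are commonly scored, here is a minimal sketch in plain Python. It is not taken from the project code: the function names `pinball_loss` and `interval_coverage` and the sample AQI values are hypothetical, chosen only to show the idea.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        diff = y - y_hat
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


def interval_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)


if __name__ == "__main__":
    # Hypothetical AQI observations and quantile forecasts (illustrative only).
    observed = [100.0, 150.0, 200.0]
    q10 = [90.0, 130.0, 210.0]
    q50 = [100.0, 150.0, 190.0]
    q90 = [120.0, 170.0, 220.0]

    for level, forecast in ((0.1, q10), (0.5, q50), (0.9, q90)):
        print(f"pinball loss at q={level}: {pinball_loss(observed, forecast, level):.3f}")
    print(f"Q10-Q90 coverage: {interval_coverage(observed, q10, q90):.2f}")
```

For a well-calibrated model, the Q10-Q90 band should contain roughly 80% of observations, so comparing empirical coverage against that nominal level is a quick sanity check alongside the per-quantile loss.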
Multi-model NLP architecture for email security using BERT and LSTM
- Implemented dual-model approach combining contextual embeddings (BERT) and sequential patterns (LSTM)
- Engineered features from email headers, body text, and metadata
- Built robust preprocessing pipeline for handling diverse email formats
- Tech: BERT, LSTM, PyTorch, NLP, Scikit-learn
- Application: Cybersecurity, Email Filtering
View Project
Sports analytics and match outcome forecasting using historical data
- Analyzed historical FIFA World Cup data with statistical modeling
- Developed predictive models for match outcomes and tournament progression
- Engineered features from team statistics, player performance, and historical matchups
- Tech: Python, Scikit-learn, Pandas, Statistical Analysis
- Domain: Sports Analytics, Predictive Modeling
View Project
- Frameworks: PyTorch, TensorFlow, Scikit-learn, Keras
- Architectures: Transformers (TFT, BERT), LSTM, CNN, Ensemble Methods
- Techniques: Time Series Forecasting, NLP, Feature Engineering, Model Optimization
- Specialization: Uncertainty Quantification, Multi-horizon Forecasting, Transfer Learning
- Analysis: Pandas, NumPy, Statistical Methods, Hypothesis Testing
- Visualization: Matplotlib, Plotly, Seaborn, Power BI
- Preprocessing: Feature Engineering, Data Wrangling, Synthetic Data Generation
- Statistical Tools: Gaussian Processes, Bootstrap Methods, A/B Testing
- Databases: MySQL, Oracle, Advanced DBMS
- Web Development: HTML, CSS, JavaScript, Bootstrap
- Mobile: Android Studio, Java (Beginner)
- Advanced ML: Transformers (GPT, Vision Transformers), Graph Neural Networks
- MLOps: Model Monitoring, CI/CD for ML, Experiment Tracking
- Big Data: Distributed Computing, Spark, Scalable ML Systems
- Specialized Topics: Reinforcement Learning, Federated Learning, AutoML
- Time Series Forecasting: Multi-horizon predictions, Uncertainty quantification
- Natural Language Processing: Transformers, Sentiment analysis, Text generation
- Environmental AI: Climate modeling, Pollution forecasting, Sustainability applications
- Sports Analytics: Predictive modeling, Performance optimization
- Explainable AI: Model interpretability, Feature importance, Trust in ML
I'm actively seeking opportunities where I can apply my data science and machine learning skills to solve impactful real-world problems.
Looking for:
- Data Science / ML Engineering roles
- Research collaborations
- Open-source contributions
- Kaggle competitions
Email: kaustubh.n007@gmail.com
Resume: View Resume
Portfolio: All Projects
"Data is the new oil, but insights are the refined fuel"
