Ashishbadal-source/Data_Science_Module_2

🚀 Data_Science_Module_2

A foundational Data Science and Machine Learning repository built with Python.
It focuses on hands-on analysis, real datasets, and practical ML workflows.

🔗 All notebooks can be run directly on Google Colab, with no local setup required.


📌 Overview

This repository demonstrates end-to-end data analysis, statistics, and machine learning techniques, covering everything from raw data exploration to clustering and decision tree models.

It is designed as both a learning resource and a portfolio repository, with an emphasis on practice rather than theory alone.


🧠 Topics Covered

  • Data Visualization & EDA
  • Statistical Analysis & Probability
  • Missing Data Handling
  • Data Integration
  • Clustering (K-Means, Hierarchical)
  • PCA (Dimensionality Reduction)
  • Decision Trees
  • Hypothesis Testing
  • Domain-based analysis (Movies, Music, Retail)
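
As a rough illustration of how two of the topics above (K-Means clustering and PCA) fit together, here is a minimal sketch using scikit-learn. It is not taken from the notebooks: the synthetic data from `make_blobs`, the function name `cluster_and_project`, and all parameter values are assumptions chosen only for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def cluster_and_project(n_samples=300, n_features=5, n_clusters=3, seed=42):
    """Cluster synthetic data with K-Means, then project it to 2D with PCA.

    Hypothetical helper for illustration; the notebooks may structure
    this workflow differently.
    """
    # Synthetic stand-in for a real dataset (assumption, not repo data).
    X, _ = make_blobs(
        n_samples=n_samples,
        n_features=n_features,
        centers=n_clusters,
        random_state=seed,
    )

    # Standardize features so no single feature dominates the distances.
    X_scaled = StandardScaler().fit_transform(X)

    # Assign each sample to one of n_clusters groups.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(X_scaled)

    # Reduce to two principal components, e.g. for a scatter plot.
    pca = PCA(n_components=2, random_state=seed)
    X_2d = pca.fit_transform(X_scaled)

    return labels, X_2d, pca.explained_variance_ratio_


labels, X_2d, explained = cluster_and_project()
print("Cluster sizes:", np.bincount(labels))
print("Variance explained by 2 components:", round(float(explained.sum()), 3))
```

The 2D coordinates in `X_2d` can be passed to Matplotlib or Seaborn and colored by `labels` to inspect the clusters visually.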

📓 Notebooks (Open in Google Colab)

📊 Data Analysis & Statistics


🤖 Machine Learning & Data Mining


📁 Dataset Collection

📦 Additional_Datasets

Includes:

  • Movie datasets
  • Music datasets (ragas, emotions, mental health tags)
  • Retail & transactional datasets
  • Classification & clustering datasets

Used across notebooks for realistic analysis.


🛠️ Tech Stack

  • Python
  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn
  • Scikit-learn
  • Jupyter Notebook / Google Colab

🎯 What This Repo Shows

✔ Practical Data Science skills
✔ Clean analytical workflows
✔ ML concepts applied to real data
✔ Portfolio-ready notebooks
✔ Industry-style experimentation


▶️ Run Online (No Setup)

All notebooks can be opened and executed directly in Google Colab using the links above.


👤 Author

Ashish Kumar
GitHub: https://github.com/Ashishbadal-source


⭐ If this repository helped you or inspired you, consider giving it a star!