Prediction Using Supervised ML

Project Overview

This project was completed as part of the Graduate Rotational Internship Program (GRIP) at The Sparks Foundation. It consists of two tasks:

Task 1: Prediction Using Supervised Machine Learning
Task 2: Species Segmentation Using K-Means Clustering

Author

Oluwashina Dedenuola

Task 1: Prediction Using Supervised Machine Learning

Libraries Used

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from sklearn.linear_model import LinearRegression
# Jupyter notebook magic that renders plots inline; omit it in a plain Python script
%matplotlib inline

Dataset

The dataset used in this task was provided by The Sparks Foundation. It records hours studied and the scores obtained.

Steps

Data Loading: The data is imported into a pandas DataFrame and displayed.
Data Analysis: Descriptive statistics are generated for the data.
Data Visualization: A scatter plot is created to show the relationship between hours studied and scores obtained.
Regression Model: A Linear Regression model is created to predict scores based on hours studied.
Predictions: The model is used to make predictions for given hours of study.
Visualization of Regression Line: The regression line is plotted on the scatter plot to visualize the predictions.
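
The steps above can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: the column names `Hours` and `Scores`, the synthetic data, and the hours used for prediction are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data, NOT the project's dataset:
# scores are generated as roughly 10 * hours + 2 plus some noise.
rng = np.random.default_rng(0)
hours = np.linspace(1.0, 9.0, 25)
scores = 10.0 * hours + 2.0 + rng.normal(0.0, 3.0, size=hours.size)
df = pd.DataFrame({"Hours": hours, "Scores": scores})

# Data analysis: descriptive statistics (count, mean, std, quartiles, ...)
summary = df.describe()

# Regression model: scikit-learn expects a 2-D feature matrix,
# so the single feature is selected with double brackets.
X = df[["Hours"]]
y = df["Scores"]
model = LinearRegression().fit(X, y)

# Predictions for given hours of study (values chosen arbitrarily)
new_hours = pd.DataFrame({"Hours": [2.5, 9.25]})
predicted = model.predict(new_hours)

print(summary)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("predicted scores:", predicted)
```

In a notebook, `plt.scatter(df["Hours"], df["Scores"])` followed by `plt.plot(df["Hours"], model.predict(X))` would overlay the fitted line on the scatter plot.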

Results

The model achieves an R-squared value of approximately 0.953, meaning that about 95% of the variance in scores is explained by hours studied. This indicates a strong linear relationship between study hours and scores.
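
For reference, R-squared compares the model's squared error with the error of always predicting the mean score. The sketch below uses made-up numbers (not the project's data) to compute it by hand and check the result against scikit-learn's `r2_score`.

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up values for illustration only
y_true = np.array([20.0, 35.0, 50.0, 62.0, 81.0])
y_pred = np.array([22.0, 33.0, 52.0, 65.0, 78.0])

# Residual sum of squares: error left over after the model's predictions
ss_res = np.sum((y_true - y_pred) ** 2)
# Total sum of squares: error of always predicting the mean
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

r2_sklearn = r2_score(y_true, y_pred)
print(r2_manual, r2_sklearn)
```

For a fitted `LinearRegression`, `model.score(X, y)` returns the same quantity.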

Task 2: Species Segmentation Using K-Means Clustering

Libraries Used

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans

Dataset

The dataset used for this task is the Iris dataset, which contains measurements of various features of different Iris species.

Steps

Data Loading: The data is loaded from a CSV file provided by The Sparks Foundation.
Data Mapping: The species names are mapped to numerical values for clustering.
Data Visualization: Scatter plots are created to visualize the distribution of data points based on sepal length and width.
Clustering: K-Means clustering is performed, and the optimal number of clusters is determined using the Elbow method.
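
The mapping and Elbow-method steps above might look something like the sketch below. It rests on several assumptions: scikit-learn's bundled Iris data stands in for the project's CSV file, the numeric codes 0 to 2 are an arbitrary choice, and the range of cluster counts (1 to 10) is illustrative. Because K-Means is unsupervised, the species codes are kept aside here and are not passed to the clustering.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: the bundled Iris data replaces the project's CSV file
iris = load_iris(as_frame=True)
features = iris.data  # four measurement columns

# Data mapping: species names -> numeric codes (codes chosen arbitrarily)
names = [str(n) for n in iris.target_names]
species = pd.Series([names[i] for i in iris.target], name="species")
species_to_code = {"setosa": 0, "versicolor": 1, "virginica": 2}
species_code = species.map(species_to_code)

# Elbow method: record the within-cluster sum of squares (inertia)
# for each candidate number of clusters.
k_values = list(range(1, 11))
inertias = []
for k in k_values:
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    km.fit(features)
    inertias.append(km.inertia_)

for k, inertia in zip(k_values, inertias):
    print(k, round(inertia, 2))
```

Plotting `inertias` against `k_values` shows where adding more clusters stops reducing the inertia sharply, which is the "elbow".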

Results

The resulting clusters are plotted, showing how the data points group and how those groups relate to the different Iris species.
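
One way to check how clusters line up with species is a cross-tabulation. The sketch below is illustrative only: it assumes three clusters (the source does not state the number chosen) and uses scikit-learn's bundled Iris data in place of the project's CSV file.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
features = iris.data

# Assumption: k = 3, matching the number of species in the data
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(features)

# Rows are the true species, columns are the cluster ids assigned by K-Means.
# Cluster ids are arbitrary, so only the grouping pattern is meaningful.
species = pd.Series([str(iris.target_names[i]) for i in iris.target], name="species")
comparison = pd.crosstab(species, pd.Series(cluster_labels, name="cluster"))

print(comparison)
print(kmeans.cluster_centers_)
```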

Conclusion

This project applies a supervised technique (linear regression) and an unsupervised technique (K-Means clustering) to prediction and segmentation tasks. It provides a practical understanding of both regression analysis and clustering.

Future Work

Further improvements could include:

Testing additional algorithms for better accuracy.
Applying feature scaling techniques for better clustering results.
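
As a rough illustration of the feature-scaling idea, the sketch below standardizes each feature before clustering. It is a suggestion rather than part of the project: the use of `StandardScaler`, the choice of three clusters, and the bundled Iris data are all assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = load_iris().data

# Scale each feature to zero mean and unit variance, then cluster.
# K-Means relies on Euclidean distance, so features with larger
# ranges would otherwise dominate the result.
pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
labels = pipeline.fit_predict(X)

# Inspect the scaled features produced by the first pipeline step
scaled = pipeline.named_steps["standardscaler"].transform(X)
print(scaled.mean(axis=0).round(6), scaled.std(axis=0).round(6))
```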

Acknowledgments

Thanks to The Sparks Foundation for the opportunity to participate in the Graduate Rotational Internship Program.
