Projects

Projects using Exploratory Data Analysis, Predictive Data Analysis, and Machine Learning techniques.

Exploratory Data Analysis and Predictive Analysis using R: Boston House Data

Overview

A simple EDA on the Boston House Data.
The goal was to prepare the data as well as possible for predictive analysis.
- Checked the dataset structure
- Removed any NA values in the columns
- Statistical summary
  - Min, Mean, Median, Max, Quartiles
- Visuals
  - Boxplot
    - Helped visualize the summary statistics and show where the outliers fell
  - Correlation plot
    - Shows the strength of the linear relationship between pairs of variables
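The project itself does these steps in R. As a rough Python analog, the same checks can be sketched with NumPy. The toy array, the helper names `stat_summary` and `boxplot_outliers`, and the 1.5 × IQR whisker rule are illustrative assumptions, not taken from the project.

```python
import numpy as np

# Hypothetical toy data (NOT the Boston data): two columns, one missing value.
data = np.array([
    [1.0, 2.1],
    [2.0, 3.9],
    [3.0, np.nan],
    [4.0, 8.2],
    [5.0, 9.8],
    [6.0, 30.0],  # deliberately extreme value in the second column
])

# Remove every row that contains an NA value.
clean = data[~np.isnan(data).any(axis=1)]

def stat_summary(x):
    """Min, quartiles, median, mean and max of a 1-D array."""
    q1, med, q3 = np.percentile(x, [25, 50, 75])
    return {"min": x.min(), "q1": q1, "median": med,
            "mean": x.mean(), "q3": q3, "max": x.max()}

def boxplot_outliers(x):
    """Points farther than 1.5 * IQR from the quartiles (common boxplot rule)."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return x[(x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)]

summary = stat_summary(clean[:, 1])
outliers = boxplot_outliers(clean[:, 1])
# Pearson correlation between the two columns (what a correlation plot displays).
corr = np.corrcoef(clean[:, 0], clean[:, 1])[0, 1]
```

In R, `str()`, `na.omit()`, `summary()`, `boxplot()` and `cor()` cover the same ground.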
Predictive Analysis using Linear Regression
- Used 3 different linear models
  - Linear Model: lm(Y ~ A, ...)
    - A straight-line model; R adds the y-intercept implicitly
  - Linear Model: lm(Y ~ A + I(A^2), ...)
    - Second-degree polynomial model to capture a curved relationship between the independent variable and the dependent variable
  - Linear Model: lm(Y ~ A + B, ...)
    - First-order model in A and B, with no interaction terms
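To make the three formulas concrete, here is a minimal Python sketch that fits the same model forms by ordinary least squares with NumPy. The synthetic data, the coefficients used to generate it, and the helper `fit_ols` are assumptions for illustration; the project fits these models with R's `lm`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical synthetic data: A and B stand in for two predictors, y for the response.
A = rng.uniform(0, 10, size=200)
B = rng.uniform(0, 5, size=200)
y = 3.0 + 2.0 * A - 0.5 * A**2 + 1.5 * B + rng.normal(0, 1.0, size=200)

def fit_ols(columns, y):
    """Least-squares fit with an intercept column; returns (coefficients, R^2)."""
    X = np.column_stack([np.ones_like(y)] + columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2

coef_line, r2_line = fit_ols([A], y)        # analog of lm(Y ~ A)
coef_poly, r2_poly = fit_ols([A, A**2], y)  # analog of lm(Y ~ A + I(A^2))
coef_add, r2_add = fit_ols([A, B], y)       # analog of lm(Y ~ A + B)
```

The column of ones plays the role of the implicit intercept, and `A**2` corresponds to `I(A^2)`, which tells R to square A arithmetically instead of reading `^` as a formula operator.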

If you have any questions, let me know via email.

View full project here

❗IMPORTANT❗

If you are going to download the file, make sure the following libraries are installed.

Libraries used in this Project:

For Visualization:

library(corrplot)
library(lattice)
library(ggplot2)
library(plotly)

For Data Splitting:

library(dplyr)

Predictive Analysis using Python: Amazon Stock Price Prediction

Overview

A Predictive Analysis project involving stock market data.
The goal was to prepare data for stock price prediction using a machine learning algorithm.
- This included using the Yahoo Finance API
- Reviewing our data
- Visuals
  - Line plot to show Adj Close vs. Stock Prediction

Predictive Analysis using Support Vector Machine

Setup for the model:
kernel = 'rbf'
For the C and gamma values, I did a couple of things:
- created different classifiers over a range of values
- C value range:
```
c_values = [0.1, 1.0, 10.0, 100.0, 1000.0]
```
- gamma value range:
```
gamma_values = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7]
```
From here, I took the combination of values with the best R^2 score and used it to create the SVM regression model.
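The search described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: it uses scikit-learn's `SVR` (the regression counterpart of an SVM classifier), a synthetic sine-wave dataset instead of stock prices, and a held-out validation split for scoring. How the project actually split and scored its data is not stated here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Hypothetical synthetic data standing in for the stock features and prices.
X = np.linspace(0, 10, 200).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, size=200)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

c_values = [0.1, 1.0, 10.0, 100.0, 1000.0]
gamma_values = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7]

best_score, best_params = -np.inf, None
for c in c_values:
    for gamma in gamma_values:
        model = SVR(kernel="rbf", C=c, gamma=gamma)
        model.fit(X_train, y_train)
        score = model.score(X_val, y_val)  # R^2 on the held-out split
        if score > best_score:
            best_score, best_params = score, {"C": c, "gamma": gamma}

# Refit the final model with the best combination found.
best_model = SVR(kernel="rbf", **best_params).fit(X_train, y_train)
```

Scoring on a held-out split keeps the chosen C and gamma from simply rewarding overfitting to the training data.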
The results cover October 25, 2021 through 15 days out.
Here are the final results:

| Predicted stock price |
| --------------------- |
| $3116.55 |
| $3097.92 |
| $3031.01 |
| $3021.73 |
| $3024.18 |
| $3160.62 |
| $3168.10 |
| $3172.84 |
| $3166.63 |
| $3166.21 |
| $3179.58 |
| $3183.07 |
| $3189.64 |
| $3185.40 |
| $3182.98 |

I encourage you to try this out with different values and see what you get!

If you have any questions, let me know via email.

View the full project here.

❗IMPORTANT❗

If you are going to download the file, make sure the following libraries are installed.

Libraries used in this Project:

Data Preprocessing/Manipulation

import pandas as pd
import numpy as np

Visualization

import matplotlib.pyplot as plt
import seaborn as sns

Stock Market Data from Yahoo Finance API

import yfinance as yf

Support Vector Machine Model

from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split

Predictive Analysis using Logistic Regression: Loan Status

Overview

A Predictive Analysis project involving loan data to predict whether or not someone can receive a loan.
The goal was to prepare data for loan status prediction using a machine learning algorithm.
- This included using a dataset from Kaggle
  - We used 2 different datasets: Training/Testing
This process included:
- Preliminary Data Analysis
- Data Cleaning
- Exploratory Data Analysis
- Data Preprocessing for Modeling
- Machine Learning Implementation
Machine Learning Implementation: Logistic Regression
- Using logistic regression, we searched over the following values to decide the best parameters:

parameters_log_reg = {
    'penalty' :['l2'],
    'C' : [0.01, 0.1, 1, 2, 10, 100]
}
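As a minimal sketch of how such a grid is typically used, the example below passes it to scikit-learn's `GridSearchCV`. The synthetic features and labels, the 5-fold cross-validation, the accuracy scoring, and `max_iter=1000` are assumptions for illustration; the project's actual data and search settings are not shown here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
# Hypothetical synthetic data standing in for the preprocessed loan features.
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(0, 0.5, size=200) > 0).astype(int)

parameters_log_reg = {
    "penalty": ["l2"],
    "C": [0.01, 0.1, 1, 2, 10, 100],
}

# Try every combination in the grid with 5-fold cross-validation.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    parameters_log_reg,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

best_C = search.best_params_["C"]
best_model = search.best_estimator_  # refit on all of X, y with the best parameters
```

Smaller `C` means stronger L2 regularization, so the grid spans heavily regularized to nearly unregularized models.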

Our results:
- Yes on a loan
  - Applicant Income: 52k
  - Loan Amount: 14k
  - Has credit history
- No on a loan
  - Applicant Income: 34k
  - Loan Amount: 13k
  - No credit history

❗IMPORTANT❗

If you are going to download the file, make sure the following libraries are installed.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn import metrics
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV


from sklearn.linear_model import LogisticRegression

If you have any questions, let me know via email.

iamAngelSH/Data-Science-Portfolio
