Use PySpark to perform the ETL process to extract the dataset, transform the data, connect to an AWS RDS instance, and load the transformed data into pgAdmin. Determine if there is any bias toward favorable reviews from Vine members in the dataset.

slafton/Amazon_Vine_Analysis

Amazon Vine Analysis

Overview

Use PySpark to perform the ETL process: extract the dataset, transform the data, connect to an AWS RDS instance, and load the transformed data into a PostgreSQL database managed through pgAdmin. Then determine whether the dataset shows any bias toward favorable reviews from Vine members.
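
The ETL flow described above could look roughly like the following. This is a minimal sketch, not the repository's actual code: the function names, the tab-separated input format, the column names (`review_id`, `star_rating`, `vine`), and the host, database, and table names are all assumptions made for illustration. It expects an existing `SparkSession` with the PostgreSQL JDBC driver available.

```python
def build_jdbc_url(host, port, database):
    """Build a PostgreSQL JDBC URL for an RDS endpoint (hypothetical helper)."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


def run_etl(spark, source_path, jdbc_url, table, user, password):
    """Extract a review file, keep the Vine-related columns, and load them over JDBC.

    `spark` is an existing SparkSession; every column name here is an assumption.
    """
    # Extract: read the raw review file (assumed tab-separated with a header row).
    reviews = spark.read.option("header", True).option("sep", "\t").csv(source_path)

    # Transform: keep only the columns needed for the Vine comparison,
    # cast the rating to an integer, and drop incomplete rows.
    vine_table = reviews.select(
        "review_id",
        reviews["star_rating"].cast("int").alias("star_rating"),
        "vine",
    ).dropna()

    # Load: append the rows to a table in the PostgreSQL database on RDS.
    vine_table.write.jdbc(
        url=jdbc_url,
        table=table,
        mode="append",
        properties={
            "user": user,
            "password": password,
            "driver": "org.postgresql.Driver",
        },
    )
    return vine_table


# Placeholder endpoint and database name, for illustration only.
url = build_jdbc_url("example-host.rds.amazonaws.com", 5432, "reviews_db")
print(url)
```

Once the load finishes, the table can be inspected from pgAdmin by connecting to the same RDS endpoint.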

Results

Number of Reviews

* Vine Reviews: 0

* Non-Vine Reviews: 403,807

5 Star Reviews

* 5 Star Vine Reviews: 0

* 5 Star Non-Vine Reviews: 242,889

Percentage of 5 Star Reviews

* Percentage of 5 Star Vine Reviews: 0.00%

* Percentage of 5 Star Non-Vine Reviews: 60.15%
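
The percentages above follow from dividing the 5-star count by the total review count for each group. The sketch below reproduces that arithmetic from the reported counts. The function name is hypothetical, and returning `None` for an empty group is a design choice of this sketch: with zero Vine reviews the ratio is 0/0, so the 0.00% reported above reflects an empty group, not a measured rate.

```python
def five_star_share(five_star_count, total_count):
    """Return the percentage of 5-star reviews, or None when the group is empty."""
    if total_count == 0:
        # 0/0 is undefined, so signal "no data" instead of reporting 0%.
        return None
    return 100.0 * five_star_count / total_count


# Counts taken from the results listed above.
vine_pct = five_star_share(0, 0)
non_vine_pct = five_star_share(242_889, 403_807)

print(vine_pct)                 # None: there are no Vine reviews to measure
print(f"{non_vine_pct:.2f}%")   # 60.15%
```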

Summary

The dataset contains no reviews from the Vine program, so this analysis cannot show any bias toward favorable reviews from Vine members: there is no Vine group to compare against the 60.15% five-star rate among non-Vine reviews. For further analysis, I would recommend computing the average of the star ratings.
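
The recommended follow-up, averaging the star ratings per group, could be sketched as below. This is plain Python for illustration, and the sample rows are hypothetical, not taken from the dataset (which has no Vine reviews). The `"Y"`/`"N"` flag values and the function name are assumptions. In the PySpark pipeline the same computation would typically be a grouped aggregation over the review table.

```python
from collections import defaultdict


def average_rating_by_group(rows):
    """Average the star rating for each Vine flag.

    `rows` is an iterable of (vine_flag, star_rating) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # vine flag -> [rating sum, review count]
    for vine_flag, stars in rows:
        totals[vine_flag][0] += stars
        totals[vine_flag][1] += 1
    # Only groups that actually appear in `rows` get an entry, so no division by zero.
    return {flag: rating_sum / count for flag, (rating_sum, count) in totals.items()}


# Hypothetical sample rows, for illustration only.
sample_rows = [("N", 5), ("N", 4), ("N", 3), ("Y", 5), ("Y", 4)]
averages = average_rating_by_group(sample_rows)
print(averages)
```

A group with no reviews is simply absent from the result, which makes an empty Vine group visible instead of reporting it as a zero average.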
