mppuerta/ih_datamadpt0420_project_m2

Ironhack Data Analytics Module 2 Project

Analysis of the Diamonds dataset to see how different parameters affect price. This analysis will later be used to build a model that estimates price.


💻 Job done

Preliminary analysis and data processing in Python. Dashboard built in Tableau to show the main conclusions.

💥 Main Conclusions

I carried out two main analyses:

  • For numerical variables: as a first step, the model will only consider carat. Price is also strongly correlated with x, y and z, but those are themselves correlated with carat.

  • For categorical variables: analysing Cut, Color and Clarity separately did not lead to a clear conclusion. I created groups from the values of Cut, Color and Clarity and plotted the relationship between carat and price for each group. The relationship is linear: the slope of the line is almost constant across groups, and the only thing that varies is its offset (the intercept).
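The correlation check described for the numerical variables could look like the sketch below. It is only an illustration: the data is synthetic (generated with made-up coefficients, not taken from the Kaggle file), and only the column names carat, x, y, z and price follow the dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the diamonds data (NOT the real dataset).
# All coefficients below are assumptions chosen for illustration.
rng = np.random.default_rng(0)
n = 500
carat = rng.uniform(0.2, 2.5, n)

# Dimensions are assumed to grow with the cube root of carat (weight ~ volume).
x = 6.5 * np.cbrt(carat) + rng.normal(0, 0.05, n)
y = 6.5 * np.cbrt(carat) + rng.normal(0, 0.05, n)
z = 4.0 * np.cbrt(carat) + rng.normal(0, 0.05, n)

# Price is assumed to be roughly linear in carat, plus noise.
price = 4000 * carat + rng.normal(0, 300, n)

df = pd.DataFrame({"carat": carat, "x": x, "y": y, "z": z, "price": price})

# Pairwise Pearson correlation between all numerical columns.
corr = df.corr()
print(corr.round(2))
```

When x, y and z are almost as correlated with carat as they are with price, they add little independent information, which is the reason for keeping only carat at first.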

Next steps: estimate the equation of the line for each subgroup so that price can be estimated accurately.
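One possible way to approach this next step is a least-squares line fit of price against carat within each subgroup. The sketch below is an assumption about how that could be done, not the project's implementation: the group names, slope, offsets and noise are hypothetical, and the data is synthetic.

```python
import numpy as np
import pandas as pd

# Synthetic data: three hypothetical subgroups sharing one slope
# but with different offsets, mimicking the pattern described above.
rng = np.random.default_rng(1)
offsets = {"group_a": 500.0, "group_b": 1500.0, "group_c": 2500.0}
slope = 4000.0

frames = []
for name, offset in offsets.items():
    carat = rng.uniform(0.2, 2.0, 200)
    price = slope * carat + offset + rng.normal(0, 100, 200)
    frames.append(pd.DataFrame({"group": name, "carat": carat, "price": price}))
df = pd.concat(frames, ignore_index=True)


def fit_lines(data):
    """Fit price = slope * carat + intercept separately for each group."""
    rows = {}
    for name, sub in data.groupby("group"):
        m, b = np.polyfit(sub["carat"].to_numpy(), sub["price"].to_numpy(), 1)
        rows[name] = {"slope": m, "intercept": b}
    return pd.DataFrame.from_dict(rows, orient="index")


lines = fit_lines(df)
print(lines.round(1))


def estimate_price(group, carat):
    """Estimate price from the fitted line of the given group."""
    row = lines.loc[group]
    return row["slope"] * carat + row["intercept"]


print(estimate_price("group_b", 1.0))
```

If the fitted slopes turn out to be nearly identical across groups, a single shared slope with one intercept per group would be a simpler alternative to fitting each line independently.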

🔧 Technology Stack and Configuration

Python, with the following libraries: NumPy, Pandas, Matplotlib and Seaborn. Tableau was used for the dashboard.

The dataset used can be found at the following link: https://www.kaggle.com/shivam2503/diamonds

The Tableau dashboard can be seen here: https://public.tableau.com/profile/marta.p.rez.puerta#!/vizhome/DiamondsDataset/DiamondsDashboard
