The aim of this project is to carry out an exploratory analysis of the Diamonds dataset, which contains 40,455 diamond records with attributes such as cut, carat, color and clarity, among other variables.
The analysis is complemented by a Tableau dashboard that summarizes the most important information about the diamonds in a more visual and accessible way.
- Challenge 1_Exploratory Analysis: You will find a notebook with the exploratory analysis carried out to understand the data and make inferences about diamond prices.
- Challenge 2_Dashboard: You will find a Tableau workbook and a txt file with a link to the same workbook on Tableau Public.
- Data: Here you will find the csv file of the diamonds dataset.
- price: price in US dollars ($326--$18,823)
- carat: weight of the diamond (0.2--5.01)
- cut: quality of the cut (Fair, Good, Very Good, Premium, Ideal)
- color: diamond colour, from J (worst) to D (best)
- clarity: a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
- x: length in mm (0--10.74)
- y: width in mm (0--58.9)
- z: depth in mm (0--31.8)
- depth: total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43--79)
- table: width of top of diamond relative to widest point (43--95)
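The depth formula above can be checked against the `x`, `y` and `z` columns. The following is a minimal sketch, not the code used in the notebook: the sample rows are illustrative values rather than records from the csv file, the helper name `add_computed_depth` and the column name `depth_computed` are hypothetical, and the factor of 100 is an assumption made so that the result falls in the 43--79 range listed for `depth`. Because the listed ranges for `x`, `y` and `z` start at 0, zero dimensions are treated as missing to avoid dividing by zero.

```python
import numpy as np
import pandas as pd


def add_computed_depth(df):
    """Return a copy of df with a 'depth_computed' column, in percent."""
    out = df.copy()
    # A dimension of 0 mm is not physically possible, so treat it as missing.
    dims = out[["x", "y", "z"]].replace(0, np.nan)
    # depth = z / mean(x, y) = 2 * z / (x + y), scaled to a percentage.
    out["depth_computed"] = 100 * 2 * dims["z"] / (dims["x"] + dims["y"])
    return out


# Illustrative rows only; the last one mimics a record with zero dimensions.
sample = pd.DataFrame(
    {
        "x": [3.95, 4.0, 0.0],
        "y": [3.98, 4.0, 0.0],
        "z": [2.43, 2.5, 0.0],
    }
)
result = add_computed_depth(sample)
print(result)
```

Comparing `depth_computed` with the `depth` column in the real data is one way to spot records whose dimensions look inconsistent.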
The core of the project is written in Python 3.7.3. To run it, you need to install the following libraries:
- Pandas (v.0.24.2)
- Numpy (v.1.18.1)
- Matplotlib (v.3.2.1)
- Seaborn (v.0.10.1)
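To confirm that the environment matches the versions listed above, a short check such as the one below may help. It is only a sketch: the helper name `installed_versions` and the `REQUIRED` dictionary are hypothetical and not part of the project, and the check relies on each library exposing a `__version__` attribute.

```python
import importlib

# Versions listed in this README.
REQUIRED = {
    "pandas": "0.24.2",
    "numpy": "1.18.1",
    "matplotlib": "3.2.1",
    "seaborn": "0.10.1",
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Fall back to "unknown" if the module exposes no __version__.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


found_versions = installed_versions(REQUIRED)
for name, wanted in REQUIRED.items():
    found = found_versions[name]
    status = "OK" if found == wanted else "differs"
    print(f"{name}: required {wanted}, found {found} ({status})")
```

A different version does not necessarily break the notebook; the listed versions are simply the ones the project was developed with.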