- What is Data Visualization?
- Why Data Visualization?
- What is Matplotlib?
- Types of Plots
Data visualization is the presentation of data in a pictorial or graphical form.
- It presents data in a more understandable form.
- It allows us to quickly interpret the data and adjust different variables to see their effect.
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.
A line graph is commonly used to display change over time. It helps show the relationship between two sets of values, where one set depends on the other. Example:
# Importing Libraries
import matplotlib.pyplot as plt
import numpy as np
# Data for plotting Line Plot
x = np.arange(1,51)
y = 2*x
# Plotting Line Plot
plt.plot(x,y, color='green', linewidth=5, linestyle=":", marker='o')
plt.title("Line Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
# Plotting Multiple line plots
x1 = np.arange(1,11)
y1 = 2*x1 + 5
y2 = 3*x1 + 4
plt.plot(x1,y1, color='green',linestyle='dashed',linewidth=2, marker='o')
plt.plot(x1,y2,color='red', linewidth=2, marker='*')
plt.grid(True, color='black')
plt.show()
A scatter plot is a chart type commonly used to observe and display the relationship between two variables. Example:
x = np.arange(1,11)
y1 = x**4
y2 = x**3 + 7
# The 'seaborn' style was renamed in Matplotlib 3.6; on older versions use 'seaborn'
plt.style.use('seaborn-v0_8')
plt.figure(figsize = (6,6))
plt.scatter(x,y1, color='red', label='Plot 1')
plt.scatter(x,y2, color='yellow', label='Plot 2')
plt.legend()
plt.show()
- Bar plots are used to show the distribution of categorical data.
- Bar plots are used to compare things between different groups or to track changes over time.
Example:
# Simple Bar plot
students = {"Rahul":40, "Neha":70, "Shubham":90}
names = list(students.keys())
marks = list(students.values())
plt.bar(names,marks, color='skyblue')
plt.xlabel('Student Names')
plt.ylabel('Marks')
plt.show()
# Comparison Bar plots
# The 'seaborn' style was renamed in Matplotlib 3.6; on older versions use 'seaborn'
plt.style.use('seaborn-v0_8')
x_coord = np.array([1, 2, 3, 4])
x1 = x_coord - 0.125
x2 = x_coord + 0.125
y1 = np.array([10,15,20,25])
y2 = np.array([15,10,18,26])
xlabels = ['Gold','Silver','Bronze','Platinum']
# Shift each year's bars by half a bar width so the pairs sit side by side
plt.bar(x1, y1, width=0.25, tick_label=xlabels, color='yellow', label='2019')
plt.bar(x2, y2, width=0.25, color='blue', label='2020')
plt.xlabel('Metals')
plt.ylabel('Price')
plt.title('Metal Prices')
plt.legend(fontsize = 15)
plt.show()
- Pie charts are used to give a general sense of the part-to-whole relationship in your data.
- They convey that one segment of the total is relatively less or more important. Example:
# Simple pie plot
data = np.array([35,25,25,15])
plt.pie(data)
plt.show()
# Pie plot with labels, exploded wedges, shadow, and percentages
population = np.array([90,80,40,98,20])
countries = ["India","China","America","Japan","Pakistan"]
mycolors = ["orange","red","skyblue","blue","green"]
spacing = (0.1,0,0.1,0,0)
plt.figure(figsize = (8,8))
plt.pie(population, labels=countries, explode=spacing, shadow = True, colors=mycolors, autopct= '%1.1f%%')
plt.title('Countries Population')
plt.legend()
plt.show()
Note: By default, the first wedge starts at the positive x-axis and the wedges are drawn counterclockwise.
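The default wedge placement described in the note can be changed. The sketch below is a minimal illustration, assuming the `startangle` and `counterclock` parameters of `pie`; the labels "A" to "D" are hypothetical and only serve to make the wedge order visible.

```python
import matplotlib.pyplot as plt
import numpy as np

data = np.array([35, 25, 25, 15])
labels = ["A", "B", "C", "D"]  # hypothetical labels, for illustration only

fig, (ax_default, ax_rotated) = plt.subplots(1, 2, figsize=(10, 5))

# Default: the first wedge starts at the positive x-axis (0 degrees)
# and the wedges are laid out counterclockwise.
ax_default.pie(data, labels=labels)
ax_default.set_title("Default")

# Start the first wedge at the top (90 degrees) and lay the wedges out clockwise.
ax_rotated.pie(data, labels=labels, startangle=90, counterclock=False)
ax_rotated.set_title("startangle=90, counterclock=False")

plt.show()
```

Starting at 90 degrees and going clockwise is a common choice when the wedges should read like a clock face, with the first category at the top.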
- A histogram is a graph showing frequency distributions.
- It is a graph showing the number of observations within each given interval. Example:
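# Draw 100 samples from the standard normal distribution
X_standardNormal = np.random.randn(100)
sigma = 5  # standard deviation of the marks
u = 75     # mean of the maths marks
# Scale and shift the samples to get marks, rounded to whole numbers
X = np.round(sigma*X_standardNormal + u)
# Physics marks reuse the same samples, shifted down by 20
Y = np.round(sigma*X_standardNormal + u-20)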
plt.hist(X,label = 'maths',color = 'blue',alpha = 0.3)
plt.hist(Y,label = 'physics',color = 'red',alpha = 0.2)
plt.title('Marks of the students')
plt.xlabel('marks')
plt.ylabel('Number of students')
plt.legend()
plt.show()
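To make "the number of observations within each given interval" concrete, the following sketch prints the counts that `plt.hist` computes for each bin. The fixed seed, the choice of 8 bins, and the variable names are assumptions made for illustration, not part of the example above.

```python
import matplotlib.pyplot as plt
import numpy as np

# Fixed seed (an assumption) so the sample is repeatable between runs
rng = np.random.default_rng(0)
marks = np.round(5 * rng.standard_normal(100) + 75)

# plt.hist returns the count per bin, the bin edges, and the drawn bars
counts, bin_edges, _ = plt.hist(marks, bins=8, color='blue', alpha=0.3)

# Each bar's height is the number of marks between two neighbouring edges
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {int(count)} students")

plt.xlabel('Marks')
plt.ylabel('Number of students')
plt.show()
```

The counts always add up to the number of observations (100 here), and there is one more bin edge than there are bins.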