Matplotlib is a Python plotting library that is versatile, highly customizable, and integrates well with libraries like pandas and NumPy.
Key takeaways:
Matplotlib supports a wide range of plot types, from basic charts to advanced 3D and animated plots.
Matplotlib offers extensive options to tailor plots and integrates well with libraries like pandas and NumPy.
Matplotlib is beginner-friendly: it pairs a simple interface with advanced features for professional-grade visualizations.
Matplotlib is a versatile Python library that empowers data scientists and analysts to create various visualizations. Matplotlib gives us the means to visualize data in a variety of ways, from straightforward line plots to complex 3D representations. Users can tailor plots to specific needs by leveraging its extensive customization options, enhancing data exploration and insight extraction.
Use the pip command to install this library:
pip install matplotlib
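One quick way to confirm that the installation worked is to import the package and print its version. This is only a minimal check, and the version string you see depends on what pip installed.

```python
import matplotlib

# If the import succeeds, the package is installed; the exact version will vary.
print(matplotlib.__version__)
```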
pyplot from matplotlib
The Matplotlib library contains the pyplot module, which offers a MATLAB-like interface for making visualizations.
It offers a stateful approach, meaning that each function call modifies the current figure or axes. This makes it easy to create quick and simple plots without needing to explicitly create figure and axes objects.
import matplotlib.pyplot as plt
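To make the stateful idea concrete, the sketch below draws the same made-up data twice: once through pyplot's implicit "current axes", and once by creating the figure and axes objects explicitly. The data values and titles are illustrative assumptions, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

# Stateful (pyplot) style: each call acts on the current figure and axes,
# which pyplot creates on demand.
plt.plot([0, 1, 2], [0, 1, 4])
plt.title("Implicit interface")
implicit_ax = plt.gca()  # the axes that the two calls above modified

# Explicit style: create the figure and axes first, then call methods on them.
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_title("Explicit interface")

print(implicit_ax.get_title())
print(ax.get_title())
```

The pyplot style is shorter for quick plots. The explicit style tends to be easier to follow once a script manages more than one figure or axes. With an interactive backend, calling plt.show() would display both figures.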
Comprehensive visualization tools: Matplotlib covers a wide variety of plot types and supports advanced features like subplots, annotations, and 3D visualizations.
Highly customizable: Create professional, publication-ready graphs by tweaking fonts, colors, line styles, and more.
Seamless integration: Works with other libraries like NumPy, pandas, and seaborn for extended functionality.
Open source and widely supported: Free to use, with active community support and extensive documentation.
We can create a whole variety of plots using Matplotlib, with some examples listed below:
A plot contains a few important elements that you can add using this library:
Adding a title: Sets the main title of the plot.
matplotlib.pyplot.title(label, fontdict=None, loc='center', pad=None, **kwargs)
Adding X and Y labels: Sets the x-axis and y-axis labels to describe the data.
matplotlib.pyplot.xlabel(xlabel, fontdict=None, labelpad=None, **kwargs)
matplotlib.pyplot.ylabel(ylabel, fontdict=None, labelpad=None, **kwargs)
Setting limits and tick labels: Defines the range of values displayed on the axes and customizes the tick marks and their labels.
matplotlib.pyplot.xticks([x1, x2, x3], ['label1', 'label2', 'label3'])
matplotlib.pyplot.yticks([y1, y2, y3], ['label1', 'label2', 'label3'])
Adding legends: Creates a legend to identify different plot elements.
matplotlib.pyplot.legend(['label1', 'label2', 'label3'])
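The elements listed above can be combined on a single plot. The following is a minimal sketch with made-up data: the month, sales, and returns values are assumptions for illustration only, and the Agg backend line just lets the sketch run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

# Made-up data for illustration
months = [1, 2, 3]
sales = [10, 14, 9]
returns = [2, 3, 1]

plt.figure()
plt.plot(months, sales)
plt.plot(months, returns)

# Title and axis labels
plt.title("Monthly figures", loc="center")
plt.xlabel("Month")
plt.ylabel("Units")

# Tick positions with custom labels, then the visible y-range
plt.xticks([1, 2, 3], ["Jan", "Feb", "Mar"])
plt.yticks([0, 5, 10, 15], ["0", "5", "10", "15"])
plt.ylim(0, 15)

# Legend entries are matched to the lines in the order they were plotted
plt.legend(["Sales", "Returns"])

ax = plt.gca()
print(ax.get_title(), ax.get_xlabel(), ax.get_ylabel())
```

With an interactive backend, plt.show() would display the result.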
In Matplotlib, a line chart is a graphic depiction of data points joined by straight lines. It is helpful for displaying correlations, trends, and patterns among continuous variables or over time.
# Importing required libraries
import numpy as np
import matplotlib.pyplot as plt

# Generation of variables
x = np.arange(0, 10)  # Array of integers from 0 to 9
y = x**3

# Printing the variables
print(x)
print(y)

plt.plot(x, y)  # Function to plot
plt.title('Line Chart')  # Function to give title

# Functions to give x and y labels
plt.xlabel('X-Axis')
plt.ylabel('Y-Axis')

# Function to show the graph
plt.show()
The plt.plot(x, y) call generates a line plot, where the x and y values are plotted as points connected by straight lines.
A multiple line chart in Matplotlib is a visualization technique used to compare trends of multiple datasets over a common x-axis.
# Importing required libraries
import numpy as np
import matplotlib.pyplot as plt

# Generation of the first set of variables
x = np.arange(0, 11)
y = x**3

# Generation of the second set of variables
x2 = np.arange(0, 11)
y2 = (x2**3) / 2

# Printing all variables
print(x, y, x2, y2, sep="\n")

# "linewidth" is used to specify the width of the lines
# "color" is used to specify the color of the lines
# "label" is used to specify the name shown for each line in the legend
plt.plot(x, y, color='r', label='first data', linewidth=5)
plt.plot(x2, y2, color='y', linewidth=5, label='second data')

plt.title('Multiline Chart')
plt.ylabel('Y axis')
plt.xlabel('X axis')

# Builds the legend from the label attributes and places it in the best position
plt.legend()
plt.show()
The two plt.plot calls draw one line each, with additional customization: color ('r', 'y'), line width (linewidth=5), and legend entries (label='first data', label='second data').
A bar chart is a data visualization in which various categories are represented by rectangular bars or columns. Each bar’s length reflects the value it represents.
# Importing required libraries
import matplotlib.pyplot as plt

# Generation of variables
x = ["India", "USA", "Japan", "Australia", "Italy"]
y = [6, 7, 8, 9, 2]

# Printing the variables
print(x)
print(y)

plt.bar(x, y, label='Bars1', color='r')  # Function to plot

# Functions to give x and y labels
plt.xlabel("Country")
plt.ylabel("Inflation Rate (%)")

# Function to give heading of the chart
plt.title("Bar Graph")

# Function to show the chart
plt.show()
The plt.bar call generates a bar chart from the x (category) and y (value) data. The color is set to red (color='r'), and a label is added that a legend would display if plt.legend() were called.
A multiple bar chart, also known as a grouped bar chart, is used to compare multiple categories across different groups. It’s particularly useful for visualizing comparisons between different groups or time periods.
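# Importing required libraries
import numpy as np
import matplotlib.pyplot as plt

# Generation of the first set of variables
x = ["India", "USA", "Japan", "Australia", "Italy"]
y = [6, 7, 8, 9, 5]

# Generation of the second set of variables (same countries)
y2 = [5, 1, 3, 4, 2]

# Printing all variables
print(x, y, y2, sep="\n")

# Bar positions: shift each set by half a bar width so the bars sit side by side
# instead of being drawn on top of each other
positions = np.arange(len(x))
width = 0.4

# Functions to plot
plt.bar(positions - width / 2, y, width=width, label='Inflation', color='y')
plt.bar(positions + width / 2, y2, width=width, label='Growth', color='g')

# Put the country names under each pair of bars
plt.xticks(positions, x)

# Functions to give x and y labels
plt.xlabel("Country")
plt.ylabel("Inflation & Growth Rate (%)")
plt.title("Multiple Bar Graph")
plt.legend()
plt.show()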
The two plt.bar calls draw one set of bars per dataset. Each set is given a label (label='Inflation', label='Growth') and a different color ('y', 'g').
A histogram graphically represents the distribution of numerical data. It counts the number of data points in each bin after dividing the data into bins. The height of each bar in the histogram shows the frequency of data points within each bin.
import matplotlib.pyplot as plt

# Generation of variables
stock_prices = [32, 67, 43, 56, 45, 43, 42, 46, 48, 53, 73, 55, 54,
                56, 43, 55, 54, 20, 33, 65, 62, 51, 79, 31, 27]

# Functions to create and show the chart
plt.figure(figsize=(8, 5))
plt.hist(stock_prices, bins=5)
plt.show()
The plt.hist call creates a histogram of the stock_prices data. It divides the data into 5 equal-width bins (bins=5) and shows how many values fall in each.
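To see the binning itself, it can help to look at the values plt.hist returns. The sketch below reuses the stock_prices list from the example and prints each bin's range next to its count. The Agg backend line is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

stock_prices = [32, 67, 43, 56, 45, 43, 42, 46, 48, 53, 73, 55, 54,
                56, 43, 55, 54, 20, 33, 65, 62, 51, 79, 31, 27]

plt.figure(figsize=(8, 5))

# plt.hist returns the count per bin, the bin edges, and the drawn bar patches
counts, bin_edges, _ = plt.hist(stock_prices, bins=5)

# Five bins are described by six edges: bin i spans bin_edges[i] to bin_edges[i + 1]
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {int(count)} values")
```

The edges run from the smallest price (20) to the largest (79), and the five counts add up to 25, the number of prices in the list.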
Data points are represented graphically on a two-dimensional plane in a scatter plot. It’s helpful for illustrating how two numerical variables relate to one another. On the plot, each data point is represented by a dot, whose location is established by its x- and y-coordinates.
# Importing required libraries
import matplotlib.pyplot as plt

# Generation of x and y variables
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 2, 4, 2, 1, 4, 5, 2]

# Function to plot the graph
plt.scatter(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Scatter Plot')

# Function to show the graph
plt.show()
The plt.scatter(x, y) call generates a scatter plot, where individual points are plotted based on their coordinates (x and y).
A pie chart is a circular diagram with slices that each show a different percentage of the total. It’s useful for visualizing categorical data and showing the relative sizes of different categories.
# Importing required libraries
import pandas as pd
import matplotlib.pyplot as plt

# Collection of raw data
raw_data = {
    'names': ['Nick', 'Sani', 'John', 'Rubi', 'Maya'],
    'jan_score': [123, 124, 125, 126, 128],
    'feb_score': [23, 24, 25, 27, 29],
    'march_score': [3, 5, 7, 6, 9],
}

# Segregating the raw data into usable form/variables
df = pd.DataFrame(raw_data, columns=['names', 'jan_score', 'feb_score', 'march_score'])
df['total_score'] = df['jan_score'] + df['feb_score'] + df['march_score']

# Printing the data
print(df)

# Function to plot the graph
plt.pie(df['total_score'], labels=df['names'], autopct='%.2f%%')
plt.axis('equal')  # Equal aspect ratio so the pie is drawn as a circle
plt.show()
The plt.pie call creates a pie chart, where each slice represents the total_score of one individual, labeled with their name, and the percentage of the whole is displayed on each slice (autopct='%.2f%%').
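The percentages that autopct prints are each slice's share of the sum of all values. As a rough check, the sketch below computes those shares by hand and compares them with the text Matplotlib draws. The totals are the jan + feb + march sums from the example data, written out as a plain list. The Agg backend line is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

names = ['Nick', 'Sani', 'John', 'Rubi', 'Maya']
total_scores = [149, 153, 157, 159, 166]  # jan + feb + march for each person

# Share of the whole for each slice, computed by hand
grand_total = sum(total_scores)
shares = [100 * score / grand_total for score in total_scores]

plt.figure()
# With autopct set, plt.pie also returns the percentage text objects
wedges, texts, autotexts = plt.pie(total_scores, labels=names, autopct='%.2f%%')
plt.axis('equal')

for name, share, autotext in zip(names, shares, autotexts):
    print(f"{name}: computed {share:.2f}%, drawn as {autotext.get_text()}")
```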
Using subplots, you can create several plots inside a single figure. This is helpful for visualizing several variables, comparing different datasets, and decomposing complex data into smaller, more focused plots.
# Importing required libraries
import numpy as np
import matplotlib.pyplot as plt

# Defining the size of the figure
plt.figure(figsize=(10, 10))

# Generation of variables
x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([5, 2, 4, 2, 1, 4, 5, 2])

# Generating 4 subplots in the form of a 2x2 grid
# The arguments of plt.subplot are as follows:
# 2 - no. of rows
# 2 - no. of columns
# 1 - position in the grid (counted row by row, starting at 1)

# Position (0,0)
plt.subplot(2, 2, 1)
plt.plot(x, y, 'g')
plt.title('Sub Plot 1')
plt.xlabel('X-Axis')
plt.ylabel('Y-Axis')

# Position (0,1)
plt.subplot(2, 2, 2)
plt.plot(y, x, 'b')
plt.title('Sub Plot 2')
plt.xlabel('X-Axis')
plt.ylabel('Y-Axis')

# Position (1,0)
plt.subplot(2, 2, 3)
plt.plot(y * 2, x * 2, 'y')
plt.title('Sub Plot 3')
plt.xlabel('X-Axis')
plt.ylabel('Y-Axis')

# Position (1,1)
plt.subplot(2, 2, 4)
plt.plot(x * 2, y * 2, 'm')
plt.title('Sub Plot 4')
plt.xlabel('X-Axis')
plt.ylabel('Y-Axis')

# Function for layout and spacing
plt.tight_layout(h_pad=5, w_pad=10)

# Function to show the figure
plt.show()
Each plt.subplot(2, 2, n) call selects one cell in a grid of subplots (2 rows and 2 columns) in the same figure. Each subplot contains a different plot, and the third argument to plt.subplot() specifies the position of the plot within the grid.
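As an alternative to repeated plt.subplot() calls, plt.subplots() creates the figure and the whole grid of axes in one step. The sketch below rebuilds the same 2x2 layout that way. The panels list and the loop are illustrative choices, not part of the original example, and the Agg backend line is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([5, 2, 4, 2, 1, 4, 5, 2])

# One call creates the figure and a 2x2 array of axes
fig, axes = plt.subplots(2, 2, figsize=(10, 10))

# (x values, y values, color) for each panel, in row-by-row order
panels = [(x, y, 'g'), (y, x, 'b'), (y * 2, x * 2, 'y'), (x * 2, y * 2, 'm')]

# axes.flat walks the grid row by row, matching plt.subplot positions 1 to 4
for number, (ax, (xs, ys, color)) in enumerate(zip(axes.flat, panels), start=1):
    ax.plot(xs, ys, color)
    ax.set_title(f'Sub Plot {number}')
    ax.set_xlabel('X-Axis')
    ax.set_ylabel('Y-Axis')

fig.tight_layout()
print(axes.shape)
```

Individual panels can also be addressed by row and column, for example axes[1, 0] for the bottom-left plot.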
Matplotlib is a robust and flexible library for data visualization in Python. Its extensive customization options, compatibility with other libraries, and range of visualization types make it an essential tool for anyone working with data. Whether you’re a beginner exploring simple plots or an expert creating complex visualizations, Matplotlib has you covered.