Pandas is a Python library used to perform operations on data. It provides the plot function, which helps visualize data through different graphs. It is built on top of the data visualization library Matplotlib, which means that when we use the pandas plot function, it internally uses Matplotlib to create the visualizations (graphs).
In this Answer, we will explore the implementation of graphs using the plot function.
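Because plot delegates the drawing to Matplotlib, the object it returns is an ordinary Matplotlib Axes that can be adjusted further with Matplotlib calls. The following is a minimal sketch of that relationship. The DataFrame contents are made up for illustration, and the Agg backend is selected only on the assumption that no display is available.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data, used only to have something to draw.
df = pd.DataFrame({"x": [1, 2, 3], "y": [2, 4, 8]})

# pandas builds the figure through Matplotlib and hands back the Axes.
ax = df.plot(x="x", y="y")
print(type(ax))

# Since it is a regular Axes, plain Matplotlib methods work on it.
ax.set_title("Customized through Matplotlib")
```

In a notebook or an interactive session, the matplotlib.use line can be dropped so that the figure is displayed as usual.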
plot() syntax
The plot function has the following syntax:
DataFrame.plot(<parameters...>)
The <parameters...> that the plot function takes are:
x: We specify the x-axis of the graph by passing a DataFrame column label.
y: We specify the y-axis of the graph by passing a DataFrame column label.
kind: We pass the type of graph that we want to create as a string. The available options are:
    line: A line graph.
    bar: A vertical bar graph.
    barh: A horizontal bar graph.
    hist: A histogram plot.
    box: A box plot.
    kde: A kernel density estimation graph.
    density: Same as the kernel density estimation graph.
    area: An area plot.
    pie: A pie chart.
    scatter: A scatter plot, available only for DataFrames.
    hexbin: A hexbin plot, available only for DataFrames.
ax: We pass the Matplotlib axes object on which the graph is to be drawn.
subplots: We define whether to draw each column in its own subplot. If it is set to True, a separate subplot is created for every column, and we can arrange the subplots by passing a (rows, columns) tuple to the layout parameter. Otherwise, all columns are drawn on the same axes.
title: We define the title to be displayed at the top of the graph. In the case of subplots, we can pass a list of titles, one for each subplot.
grid: We set it to True if we want to show grid lines on the axes. By default, it is set to None.
legend: We set it to True if we want to show the legend on the axes.
logx: We set this bool value to use log scaling on the x-axis.
logy: We set this bool value to use log scaling on the y-axis.
loglog: We set this bool value to use log scaling on both the x-axis and the y-axis.
xticks: We define a sequence of values to use for the x-axis ticks.
yticks: We define a sequence of values to use for the y-axis ticks.
xlim: We use it to set the limits of the x-axis.
ylim: We use it to set the limits of the y-axis.
xlabel: We use it to define the label for the x-axis.
ylabel: We use it to define the label for the y-axis.
rot: We use it to rotate the tick labels.
fontsize: We use it to set the font size of the xticks and yticks values.
colormap: We pass a Matplotlib colormap from which the plot colors are selected.
colorbar: We set it to True to plot a colorbar on the graph. It is relevant only for scatter and hexbin plots.
position: We use it to align the bar plot layout. It takes a value from 0 (left or bottom-end) to 1 (right or top-end). By default, the value is set to 0.5 to align the bars in the center.
table: We set it to True to draw a table under the plot using the DataFrame data.
yerr: We pass in a dictionary, list, string, DataFrame, or Series to define uncertainty on each point along the y-axis.
xerr: We pass in a dictionary, list, string, DataFrame, or Series to define uncertainty on each point along the x-axis.
stacked: We use it to create a stacked plot. By default, it is set to False for line and bar plots and True for area plots.
secondary_y: We pass a bool or sequence to plot on a secondary y-axis on the right of the plot.
mark_right: When a secondary y-axis is used, we set it to True to append "(right)" in the legend to the labels of the columns plotted on that axis.
include_bool: We set it to True to plot boolean values.
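To see how several of these parameters combine, the following is a minimal sketch that draws two columns as separate bar subplots arranged side by side. The "Mileage (mpg)" column and its values are hypothetical and exist only so that there is a second column to plot. The figsize argument is a further plot parameter that is not covered in the list above, and the Agg backend is an assumption that lets the code run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import pandas as pd

df = pd.DataFrame({
    "Car": ["Toyota", "Honda", "Nissan", "Audi"],
    "Price ($)": [20000, 25000, 43000, 50000],
    "Mileage (mpg)": [35, 33, 30, 27],  # hypothetical values for illustration
})

axes = df.plot(
    x="Car",                     # column used for the x-axis
    kind="bar",                  # vertical bars
    subplots=True,               # one subplot per remaining column
    layout=(1, 2),               # 1 row, 2 columns of subplots
    title=["Price", "Mileage"],  # one title per subplot
    grid=True,                   # show grid lines
    legend=False,                # hide the legends
    rot=45,                      # rotate the tick labels by 45 degrees
    figsize=(10, 4),             # figure size in inches (not in the list above)
)

# With subplots and a layout, an array of Axes is returned instead of one Axes.
print(axes.shape)
```

Each element of axes is a Matplotlib Axes, so a single subplot can still be adjusted on its own, for example through axes[0, 0].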
Let's look at how we can use the pandas plot() function to draw a few graphs.
A line graph is created by joining the coordinates on a graph with straight line segments. By default, the plot function draws a line graph. The code below displays the line graph.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "Car": ["Toyota", "Honda", "Nissan", "Audi"],
    "Price ($)": [20000, 25000, 43000, 50000]
})

df.plot(x="Car", y="Price ($)", ylabel="Price ($)", xlabel="Car Model", title="Cars Prices")
Line 1: We import the pandas library to create the DataFrame.
Line 2: We import Matplotlib's pyplot module.
Lines 4–7: We create a DataFrame with two columns, Car and Price ($). The DataFrame is stored in the df variable.
Line 9: We call the df.plot() function and pass in the x and y parameters to set the x-axis and y-axis, respectively. Further, we pass in the xlabel and ylabel parameters to set the labels for the x-axis and y-axis. The title parameter sets the title of the graph. We do not pass the kind parameter because line is the default. We could also set kind to line explicitly, in the same way the scatter graph example sets it to scatter.
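As a quick check of the default behavior, the following minimal sketch draws the same data three ways: without kind, with kind set to line, and through the df.plot.line accessor. It then compares the y-values of the drawn lines. The Agg backend is an assumption that lets the code run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import pandas as pd

df = pd.DataFrame({
    "Car": ["Toyota", "Honda", "Nissan", "Audi"],
    "Price ($)": [20000, 25000, 43000, 50000],
})

# Each call creates its own figure and returns that figure's Axes.
ax_default = df.plot(x="Car", y="Price ($)")               # kind omitted
ax_line = df.plot(kind="line", x="Car", y="Price ($)")     # kind given explicitly
ax_accessor = df.plot.line(x="Car", y="Price ($)")         # accessor form

# Read back the y-values of the single line drawn on each Axes.
default_y = ax_default.lines[0].get_ydata().tolist()
line_y = ax_line.lines[0].get_ydata().tolist()
accessor_y = ax_accessor.lines[0].get_ydata().tolist()

print(default_y == line_y == accessor_y)
```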
A scatter graph is created by displaying coordinate points on the graph. The code below draws the scatter graph.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "Car": ["Toyota", "Honda", "Nissan", "Audi"],
    "Price ($)": [20000, 25000, 43000, 50000]
})

df.plot(kind="scatter", x="Car", y="Price ($)")
Line 9: We call the df.plot() function and set the kind parameter to scatter. We pass in the x and y parameters to set the x-axis and y-axis, respectively. We do not pass the xlabel, ylabel, or title parameters here, so pandas uses its defaults for the axis labels and shows no title.
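The scatter call above can also take the labeling parameters used in the line graph example. The following is a minimal sketch of that variant. It assumes a pandas version that accepts a string column on the x-axis of a scatter plot and supports xlabel and ylabel for scatter plots, and it selects the Agg backend only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import pandas as pd

df = pd.DataFrame({
    "Car": ["Toyota", "Honda", "Nissan", "Audi"],
    "Price ($)": [20000, 25000, 43000, 50000],
})

ax = df.plot(
    kind="scatter",
    x="Car",
    y="Price ($)",
    xlabel="Car Model",   # label for the x-axis
    ylabel="Price ($)",   # label for the y-axis
    title="Cars Prices",  # title shown above the graph
)

# Read the labels back from the returned Axes.
print(ax.get_xlabel(), "|", ax.get_ylabel(), "|", ax.get_title())
```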
A bar graph displays data using rectangular bars whose lengths are proportional to the values they represent. The code below draws a vertical bar graph.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "Car": ["Toyota", "Honda", "Nissan", "Audi"],
    "Price ($)": [20000, 25000, 43000, 50000]
})

df.plot(kind="bar", x="Car", y="Price ($)")
Line 9: We call the df.plot() function and set the kind parameter to bar. We pass in the x and y parameters to set the x-axis and y-axis, respectively. We do not pass the xlabel, ylabel, or title parameters here, so pandas uses its defaults for the axis labels and shows no title.
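Switching to a horizontal layout only requires changing kind from bar to barh. The following is a minimal sketch of that variant, reading the bar widths back from the returned Axes to confirm that they match the prices. The legend=False argument is an optional choice made here, and the Agg backend is an assumption that lets the code run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import pandas as pd

df = pd.DataFrame({
    "Car": ["Toyota", "Honda", "Nissan", "Audi"],
    "Price ($)": [20000, 25000, 43000, 50000],
})

# barh draws one horizontal bar per row; "Car" labels the y-axis ticks.
ax = df.plot(kind="barh", x="Car", y="Price ($)", legend=False)

# Each bar is a Matplotlib Rectangle; for horizontal bars the value is the width.
widths = [bar.get_width() for bar in ax.patches]
print(widths)
```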
In conclusion, pandas provides us with the plot() function to easily visualize our data. We can play around with the function's parameters and create different graphs according to our use cases.