pandas is a popular Python-based data analysis toolkit that can be imported using:
import pandas as pd
It offers a wide range of utilities, from parsing multiple file formats to converting an entire data table into a NumPy array. This versatility makes pandas a common choice in data science and machine learning.
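As a small illustration of the two capabilities mentioned above, the sketch below parses CSV text and converts the resulting table into a NumPy array. The CSV content and the column names are made up for this example.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = "math,physics\n70,65\n85,90\n60,75\n"

# Parse the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Convert the whole table into a NumPy array.
matrix = df.to_numpy()

print(matrix.shape)
```

The same read_csv call accepts a file path instead of an in-memory buffer.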
pandas can help with the creation of multiple types of data analysis graphs. One such graph is a boxplot.
The signature of DataFrame.boxplot, with its default values, is:
DataFrame.boxplot(column=None, by=None, ax=None, fontsize=None, rot=0, grid=True, figsize=None, layout=None, return_type=None, backend=None, **kwargs)
column: str or list of str - Column name or list of names, or a vector. Can be any valid input to pandas.DataFrame.groupby().
by: str or array-like - Column in the DataFrame to pass to pandas.DataFrame.groupby(). One boxplot is drawn per value of the column(s) in by.
ax: object of class matplotlib.axes.Axes - The matplotlib axes to be used by the boxplot.
fontsize: float or str - The tick label font size, in points or as a string (e.g., 'large').
rot: int or float - The rotation angle of the labels, in degrees.
grid: bool - Whether or not to show the grid.
figsize: tuple (width, height) - The size of the figure to create, in inches.
layout: tuple (rows, columns) - The layout of the subplots, e.g., (4, 1) arranges the plots in four rows and a single column.
return_type: {'axes', 'dict', 'both'} or None - The kind of object to return:
- 'axes' returns the matplotlib axes that the boxplot is drawn on.
- 'dict' returns a dictionary whose values are the matplotlib Lines of the boxplot.
- 'both' returns a named tuple with the axes and the dict.
- When grouping with by, a Series mapping columns to return_type is returned.
- When grouping with by and return_type is None, a NumPy array of axes with the same shape as layout is returned.
backend: str - Backend to use instead of the backend specified in the option plotting.backend (e.g., 'matplotlib'). Alternatively, to set the plotting backend for the whole session, set pd.options.plotting.backend.
**kwargs: All other plotting keyword arguments to be passed to matplotlib.pyplot.boxplot().
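To make the interaction between column, by, layout, and return_type more concrete, here is a minimal sketch. The DataFrame, its column names ("section", "math", "physics"), and the random marks are assumptions made for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import numpy as np
import pandas as pd

# Hypothetical data: 20 students split across two sections.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "section": ["A", "B"] * 10,
    "math": rng.integers(40, 100, size=20),
    "physics": rng.integers(40, 100, size=20),
})

# One subplot per column; within each subplot, one box per section.
axes = df.boxplot(
    column=["math", "physics"],
    by="section",
    layout=(2, 1),
    figsize=(5, 8),
    return_type="axes",
)

# When grouping with by, the result is a Series indexed by column name.
print(type(axes))
print(list(axes.index))
```

Each entry of the returned Series is the matplotlib axes for that column, so axes["math"] can be customized further.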
Let’s look at an example. Import the library and load the dataset into a DataFrame. Here, the dataset includes the marks of students for multiple subjects.
# import library
import pandas as pd

# load the CSV file into a DataFrame
df = pd.read_csv('dataset.csv')

# create boxplot
boxplot = df.boxplot(figsize=(5, 5))
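Since dataset.csv is not included here, the following sketch builds a small DataFrame in memory so that the same call can be tried directly. The subject names and marks are invented for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import pandas as pd

# Hypothetical marks for five students in three subjects.
marks = pd.DataFrame({
    "math": [70, 85, 60, 92, 78],
    "physics": [65, 90, 75, 88, 58],
    "chemistry": [80, 72, 68, 95, 61],
})

# Without by, boxplot draws one box per numeric column and returns the axes.
ax = marks.boxplot(figsize=(5, 5))
ax.set_ylabel("marks")
```

The returned axes can be saved to an image with ax.figure.savefig and a file name of your choice.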
Similarly, we can rotate the labels, remove the grid, and set the font size.
# import library
import pandas as pd

# load the CSV file into a DataFrame
df = pd.read_csv('dataset.csv')

# create boxplot with rotated labels, a smaller font, and no grid
boxplot = df.boxplot(figsize=(5, 5), rot=90, fontsize=8, grid=False)
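The return_type and **kwargs parameters can be combined with these styling options. The sketch below again uses invented marks. It passes showmeans, a matplotlib boxplot keyword, through **kwargs and uses return_type='dict' to restyle the median lines afterwards.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import pandas as pd

# Hypothetical marks for five students in three subjects.
marks = pd.DataFrame({
    "math": [70, 85, 60, 92, 78],
    "physics": [65, 90, 75, 88, 58],
    "chemistry": [80, 72, 68, 95, 61],
})

# showmeans is forwarded to matplotlib; return_type="dict" returns the drawn lines.
lines = marks.boxplot(
    figsize=(5, 5),
    rot=90,
    grid=False,
    return_type="dict",
    showmeans=True,
)

# Restyle the median line of every box.
for median in lines["medians"]:
    median.set_color("red")
    median.set_linewidth(2)
```

The dictionary also holds entries such as "boxes" and "whiskers", which can be restyled in the same way.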