pandas is a popular Python-based data analysis toolkit that can be imported using:
    import pandas as pd
It offers a wide range of utilities, from parsing multiple file formats to converting an entire data table into a NumPy array. This versatility makes pandas a trusted ally in data science and machine learning.
pandas can also create several types of plots for data analysis. One such plot is the histogram.
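As a rough illustration of those two utilities, the sketch below parses CSV text and converts the resulting table into a NumPy array. The column names and values are made up for the example, and the CSV is held in memory so that no file is needed.

```python
import io

import pandas as pd

# Hypothetical CSV content (not from the article), kept in memory.
csv_text = "height,weight\n150,52\n160,61\n170,70\n"

# Parse the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Convert the whole table into a two-dimensional NumPy array.
matrix = df.to_numpy()
print(matrix.shape)  # (3, 2)
```

The same read_csv() call accepts a file path instead of an in-memory buffer.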
The signature of DataFrame.hist(), with its default argument values, is:
    DataFrame.hist(
        column=None,
        by=None,
        grid: bool = True,
        xlabelsize=None,
        xrot=None,
        ylabelsize=None,
        yrot=None,
        ax=None,
        sharex=False,
        sharey=False,
        figsize=None,
        layout=None,
        bins=10,
        backend=None,
        legend: bool = False,
        **kwargs)
column: string or list of strings - The columns that should be plotted.
by: object - Used to form histograms for separate groups.
grid: bool - Whether or not to show the axis grid lines.
xlabelsize: int - The font size of the x-axis labels.
xrot: float - The rotation of the x-axis labels.
ylabelsize: int - The font size of the y-axis labels.
yrot: float - The rotation of the y-axis labels.
ax: Matplotlib axes object - The axes on which to plot the histogram.
sharex: bool, default True if ax is None, else False - In case subplots = True, share the x-axis and set some x-axis labels to invisible.
sharey: bool - In case subplots = True, share the y-axis and set some y-axis labels to invisible.
figsize: tuple (width, height) - The size of the figure, in inches.
layout: tuple (rows, columns) - The layout of the output graphs. For example, (4, 1) arranges the figures in four rows and a single column.
bins: int or sequence - The number of histogram bins to be used. If a sequence is given, it specifies the bin edges.
backend: str - Backend to use instead of the backend specified in the option plotting.backend, for instance, 'matplotlib'. Alternatively, set pd.options.plotting.backend to choose the plotting backend for the whole session.
legend: bool - Whether or not to show the legend.
**kwargs: All other plotting keyword arguments to be passed to matplotlib.pyplot.hist().
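To make a few of these parameters concrete, here is a minimal sketch that combines column, by, bins, grid, xrot, figsize, layout, and sharey in one call. The DataFrame, its column names ("score", "group"), and the random values are assumptions made up for illustration. The Agg backend is selected only so that the script also runs on machines without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line to see windows

import numpy as np
import pandas as pd

# Hypothetical data: 200 scores, each assigned to group "a" or "b".
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "score": rng.normal(70, 10, 200),
    "group": rng.choice(["a", "b"], 200),
})

# One histogram of "score" per value of "group", side by side.
axes = df.hist(
    column="score",   # plot only this column
    by="group",       # one subplot per group
    bins=15,          # number of bins in each histogram
    grid=False,       # hide the grid lines
    xrot=45,          # rotate the x-axis labels by 45 degrees
    figsize=(8, 3),   # figure size in inches (width, height)
    layout=(1, 2),    # one row, two columns of subplots
    sharey=True,      # use a common y-axis for both subplots
)

# Count the subplots that were created.
n_subplots = len(np.ravel(axes))
print(n_subplots)
```

Changing layout to (2, 1) would stack the two histograms vertically instead.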
The following code shows how a histogram can be created with pandas in Python. You can change the parameters and see how the output varies.
    # import library
    import pandas as pd

    # add csv file to dataframe
    df = pd.read_csv('dataset.csv')

    # create histogram
    histogram = df.hist(bins=7)
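Since dataset.csv is not included here, the following sketch uses a small made-up DataFrame to show how the bins parameter changes the result. It compares an integer bin count with an explicit sequence of bin edges and counts the bars that get drawn. The column name "value" and the numbers are assumptions for illustration only.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line to see windows

import pandas as pd

# Hypothetical data standing in for dataset.csv.
df = pd.DataFrame({"value": [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]})

# bins as an integer: the value range is split into 7 equal-width bins.
axes_int = df.hist(bins=7)

# bins as a sequence: the values are the bin edges, giving [0, 5) and [5, 10].
axes_edges = df.hist(bins=[0, 5, 10])

# df.hist() returns an array of Matplotlib axes; each bar is one patch.
bars_int = len(axes_int.ravel()[0].patches)
bars_edges = len(axes_edges.ravel()[0].patches)
print(bars_int, bars_edges)
```

Each call to df.hist() opens a new figure, so the two histograms do not overwrite each other.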