The var() function in pandas computes the variance of the values along a specified axis of a given DataFrame.
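Mathematically, variance measures how far the values of a data set spread out from their mean.
For a sample, which is what var() computes by default, it takes the formula below:
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where:
x_i is each value in the data set
\bar{x} is the mean of the values
n is the number of values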
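As a minimal sketch, assuming the formula refers to the sample variance (sum of squared deviations from the mean, divided by n - 1, which matches the pandas default of ddof=1), the calculation can be done by hand and compared with var(). The Series values below are made up for illustration.

```python
import pandas as pd

# Hypothetical sample values, chosen only for illustration
values = pd.Series([2.0, 4.0, 4.0, 6.0])

n = len(values)
mean = values.mean()

# Sum of squared deviations from the mean, divided by n - 1
manual_variance = ((values - mean) ** 2).sum() / (n - 1)

# pandas uses ddof=1 by default, so this should agree with the manual result
pandas_variance = values.var()

print(manual_variance)
print(pandas_variance)
```

Both printed values should agree up to floating-point rounding.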
In another context, the variance of a dataset is given as (standard deviation)². That is, the square of the standard deviation. Equivalently, the standard deviation is the square root of the variance.
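The relationship between variance and standard deviation can be checked directly. The sketch below, which assumes a small made-up Series, squares the result of std() and compares it with var(); both methods use ddof=1 by default.

```python
import pandas as pd

# Made-up values used only to compare the two statistics
data = pd.Series([1.0, 7.0, 5.0, 9.0, 0.5])

variance = data.var()   # sample variance
std_dev = data.std()    # sample standard deviation

# Squaring the standard deviation should reproduce the variance
print(variance)
print(std_dev ** 2)
```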
The var() function takes the following syntax:
DataFrame.var(axis=NoDefault.no_default, skipna=True, ddof=1, numeric_only=None, **kwargs)
The var() function takes the following optional parameter values:
axis: This represents the axis to compute along: the row axis (designated as 0 or 'index') or the column axis (designated as 1 or 'columns').
skipna: This takes a boolean value indicating whether NA or null values are to be excluded.
ddof: This takes an int that represents the delta degrees of freedom. The divisor used in the calculation is N - ddof, where N is the number of values.
numeric_only: This takes a boolean value indicating whether to include only float, int, or boolean columns.
**kwargs: These are additional keyword arguments that can be passed to the function.
When called on a DataFrame, the var() function returns a Series holding the variance of each column (axis 0) or each row (axis 1).
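The effect of these parameters can be sketched on a small frame. The DataFrame below and its column names (score, count, label) are hypothetical, and numeric_only=True is passed explicitly because the default for this parameter differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: a float column with a missing value,
# an integer column, and a text column
df = pd.DataFrame({
    "score": [1.0, 2.0, np.nan, 4.0],
    "count": [2, 4, 6, 8],
    "label": ["a", "b", "c", "d"],
})

# numeric_only=True leaves the text column out of the result
sample_var = df.var(numeric_only=True)

# ddof=0 divides by N instead of N - 1 (population variance)
population_var = df.var(ddof=0, numeric_only=True)

# skipna=False propagates the missing value in "score" as NaN
strict_var = df.var(skipna=False, numeric_only=True)

print(type(sample_var))
print(sample_var)
print(population_var)
print(strict_var)
```

The result in each case is a Series indexed by the numeric column labels.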
# A code to illustrate the var() function in pandas
# Importing the pandas library
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame([[1,2,3,4,5],
                   [1,7,5,9,0.5],
                   [3,11,13,14,12]],
                  columns=list('ABCDE'))

# Printing the DataFrame
print(df)

# Obtaining the variance of each column (down the rows, axis 0)
print(df.var())

# Obtaining the variance of each row (across the columns, axis 1)
print(df.var(axis="columns"))
We import the pandas library.
We create a DataFrame, df.
We print the DataFrame, df.
Using the var() function, we obtain the variance of the values that run downwards across the rows (axis 0). We print the result to the console.
Using the var() function, we obtain the variance of the values that run horizontally across the columns (axis 1). We print the result to the console.
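To make the two directions concrete, the sketch below rebuilds the same DataFrame and inspects how each result is labelled. The variable names by_column and by_row are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [1, 7, 5, 9, 0.5], [3, 11, 13, 14, 12]],
    columns=list("ABCDE"),
)

# axis 0 (default): one variance per column, labelled by column name
by_column = df.var()

# axis 1: one variance per row, labelled by row index
by_row = df.var(axis="columns")

print(list(by_column.index))
print(list(by_row.index))
print(by_row.loc[0])
```

The first result has one entry per column label and the second has one entry per row label, so the length of the output depends on the chosen axis.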