The var() function in pandas computes the variance of the values along a specified axis of a given DataFrame.
Mathematically, variance is a measure of how far the values of a data set spread out from their mean.
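It takes the formula below:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where:
$s^2$: the sample variance.
$x_i$: the i-th value in the data set.
$\bar{x}$: the mean of the values.
$n$: the number of values.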
In another context, the variance of a dataset is given as the square of its standard deviation. Equivalently, the standard deviation is the square root of the variance.
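The relationship between variance and standard deviation can be checked numerically. The sketch below assumes the usual sample-variance definition (the sum of squared deviations from the mean, divided by n - 1), which is what pandas computes by default. The data values and the variable names (values, manual_var, and so on) are made up for illustration.

```python
import math
import pandas as pd

# Hypothetical sample data, chosen only for illustration
values = [1, 7, 5, 9, 0.5]

# Sample variance by hand: sum of squared deviations divided by (n - 1)
n = len(values)
mean = sum(values) / n
manual_var = sum((x - mean) ** 2 for x in values) / (n - 1)

# The same quantities computed by pandas (ddof=1 by default)
s = pd.Series(values)
pandas_var = s.var()
pandas_std = s.std()

# All three numbers should agree up to floating-point rounding
print(manual_var, pandas_var, pandas_std ** 2)
print(math.isclose(manual_var, pandas_var))
print(math.isclose(pandas_std ** 2, pandas_var))
```

Comparing with math.isclose() rather than == avoids false mismatches caused by floating-point rounding.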
The var() function takes the following syntax:
DataFrame.var(axis=NoDefault.no_default, skipna=True, ddof=1, numeric_only=None, **kwargs)
The var() function takes the following optional parameter values:
axis: This represents the axis to compute over: the row axis (designated as 0 or 'index') or the column axis (designated as 1 or 'columns').
skipna: This takes a boolean value indicating whether NA or null values are to be excluded.
ddof: This takes an int that represents the delta degrees of freedom. The divisor used in the calculation is N - ddof, where N is the number of values. The default is 1.
numeric_only: This takes a boolean value indicating whether to include only float, int, or boolean columns.
**kwargs: These are additional keyword arguments that can be passed to the function.
The var() function returns a Series holding the results, with one variance value per column or per row.
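The effect of these parameters is easier to see on a small example. The following is a minimal sketch, not an exhaustive demonstration: the DataFrame contents and the variable names (sample_var, population_var, with_na) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value and one non-numeric column
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "label": ["w", "x", "y", "z"],
})

# numeric_only=True restricts the calculation to the numeric columns A and B
sample_var = df.var(numeric_only=True)               # ddof=1: divide by N - 1
population_var = df.var(ddof=0, numeric_only=True)   # ddof=0: divide by N
with_na = df.var(skipna=False, numeric_only=True)    # NaN is not skipped

print(type(sample_var))   # the result is a pandas Series
print(sample_var)
print(population_var)
print(with_na)            # column A becomes NaN because of the missing value
```

With skipna=True (the default), the missing value in column A is ignored and the variance is computed from the remaining three values.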
# A code to illustrate the var() function in Pandas

# Importing the pandas library
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame([[1, 2, 3, 4, 5],
                   [1, 7, 5, 9, 0.5],
                   [3, 11, 13, 14, 12]],
                  columns=list('ABCDE'))

# Printing the DataFrame
print(df)

# Obtaining the variance of each column (down the rows, axis 0)
print(df.var())

# Obtaining the variance of each row (across the columns, axis 1)
print(df.var(axis="columns"))
We import the pandas library.
We create a DataFrame, df.
We print df.
Using the df.var() function, we obtain the variance of the values that run downwards across the rows (axis 0). We print the result to the console.
Using the var() function with axis="columns", we obtain the variance of the values that run horizontally across the columns (axis 1). We print the result to the console.