How to obtain the variance over a specified axis in pandas

Overview

The var() function in pandas returns the variance of the values along a specified axis of a DataFrame.

Variance measures how far the values in a dataset spread out from their mean.

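For a sample of n values, it is computed with the formula below:

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Where:

  • S² = sample variance
  • xᵢ = the i-th value in the dataset
  • x̄ = the mean of the values in the dataset
  • n = the number of values in the dataset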
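To make the formula concrete, here is a minimal sketch that computes the sample variance by hand and compares it with var(). The dataset and the variable names are made up for illustration.

```python
import pandas as pd

# A small, made-up dataset used only for illustration
data = pd.Series([2, 4, 6])

n = len(data)                             # number of values
mean = data.sum() / n                     # the mean, x-bar
squared_deviations = (data - mean) ** 2   # (x_i - x-bar)^2 for each value
manual_variance = squared_deviations.sum() / (n - 1)

print(manual_variance)  # 4.0
print(data.var())       # 4.0
```

Both values agree because var() divides by n - 1 by default.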

The variance is also the square of the standard deviation. Equivalently, the standard deviation is the square root of the variance.
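The relationship between variance and standard deviation can be checked with a short sketch. The DataFrame below is made up for illustration; std() is the pandas method for the standard deviation.

```python
import pandas as pd

# Made-up data used only for illustration
df = pd.DataFrame({"A": [2, 4, 6], "B": [1, 5, 12]})

variance = df.var()
std = df.std()

# Squaring the standard deviation should reproduce the variance
print(variance)
print(std ** 2)
```

Up to floating-point rounding, the two printed results should match for every column.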

Syntax

The var() function takes the following syntax:

DataFrame.var(axis=NoDefault.no_default, skipna=True, ddof=1, numeric_only=None, **kwargs)
Syntax for the var() function in Pandas

Parameter values

The var() function takes the following optional parameter values:

  • axis: This is the axis along which the variance is computed. 0 or 'index' gives the variance of each column (computed down the rows), and 1 or 'columns' gives the variance of each row (computed across the columns).
  • skipna: This takes a boolean value indicating whether NA or null values are to be excluded from the calculation. The default is True.
  • ddof: This takes an int that represents the delta degrees of freedom. The divisor used in the calculation is N - ddof, where N is the number of values. The default is 1.
  • numeric_only: This takes a boolean value indicating whether to include only float, int, or boolean columns.
  • **kwargs: These are additional keyword arguments that can be passed to the function.
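The effect of these parameters can be seen in a minimal sketch. The DataFrame, its column names, and the variable names are made up for illustration. numeric_only=True is passed on every call so that the text column is left out regardless of the pandas version in use.

```python
import numpy as np
import pandas as pd

# Made-up data: one column with a missing value and one text column
df = pd.DataFrame({
    "A": [2.0, 4.0, 6.0, np.nan],
    "B": [1.0, 5.0, 12.0, 3.0],
    "label": ["a", "b", "c", "d"],
})

# Defaults: NaN is skipped and the divisor is N - 1 (ddof=1)
sample = df.var(numeric_only=True)

# ddof=0 divides by N instead, giving the population variance
population = df.var(ddof=0, numeric_only=True)

# skipna=False keeps NaN, so column A has no defined variance
with_nan = df.var(skipna=False, numeric_only=True)

print(sample)
print(population)
print(with_nan)
```

With ddof=0 each result is smaller than with the default, because the same sum of squared deviations is divided by a larger number.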

Return value

The var() function returns a Series holding the variance of each column (or of each row when axis is 1 or 'columns').
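As a quick check of the return value, the sketch below inspects the type and index of the result in current pandas versions. The data and variable names are made up for illustration.

```python
import pandas as pd

# Made-up data used only for illustration
df = pd.DataFrame([[1, 2], [3, 8]], columns=["A", "B"])

by_column = df.var()             # one value per column
by_row = df.var(axis="columns")  # one value per row

print(type(by_column))
print(by_column.index.tolist())  # ['A', 'B']
print(by_row.index.tolist())     # [0, 1]
```

The result is indexed by column labels when the variance is computed down the rows, and by row labels when it is computed across the columns.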

Example

# Code to illustrate the var() function in pandas

# Importing the pandas library
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame([[1, 2, 3, 4, 5],
                   [1, 7, 5, 9, 0.5],
                   [3, 11, 13, 14, 12]],
                  columns=list('ABCDE'))
# Printing the DataFrame
print(df)

# Obtaining the variance of each column (down the rows, axis 0)
print(df.var())

# Obtaining the variance of each row (across the columns, axis 1)
print(df.var(axis="columns"))

Explanation

  • Line 4: We import the pandas library.
  • Lines 7–10: We create a DataFrame, df.
  • Line 12: We print df.
  • Line 15: Using the var() function, we obtain the variance of each column, computed down the rows (axis 0). We print the result to the console.
  • Line 18: Using the var() function, we obtain the variance of each row, computed across the columns (axis 1). We print the result to the console.