How to obtain the variance over a specified axis in pandas

Overview

The var() function in pandas computes the variance of the values along a specified axis of a DataFrame.

Mathematically, variance measures how far the values of a dataset are spread out from their mean.

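The sample variance is given by the formula below:

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

Where:

  • $S^2$ = the sample variance
  • $x_i$ = the i-th value of the dataset
  • $\bar{x}$ = the mean of the values in the dataset
  • $n$ = the number of values in the dataset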

Equivalently, the variance of a dataset is the square of its standard deviation. Put the other way around, the standard deviation is the square root of the variance.
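As a quick numeric check of the formula, the sketch below computes the sample variance by hand and compares it with the pandas results. The data in `values` is hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
values = pd.Series([2.0, 4.0, 4.0, 6.0])

n = len(values)
mean = values.sum() / n

# Sum of squared deviations from the mean, divided by n - 1
manual_var = ((values - mean) ** 2).sum() / (n - 1)

print(manual_var)         # variance computed from the formula
print(values.var())       # pandas result (ddof=1 by default)
print(values.std() ** 2)  # square of the sample standard deviation
```

All three printed values should agree up to floating-point rounding, since the squared standard deviation and the variance are the same quantity.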

Syntax

In pandas 2.x, the var() function takes the following syntax:

DataFrame.var(axis=0, skipna=True, ddof=1, numeric_only=False, **kwargs)
Syntax for the var() function in Pandas

Parameter values

The var() function takes the following optional parameter values:

  • axis: This is the axis to compute over. A value of 0 or 'index' (the default) gives the variance of each column, and a value of 1 or 'columns' gives the variance of each row.
  • skipna: This takes a boolean value indicating whether NA or null values are to be excluded. It defaults to True.
  • ddof: This takes an int that represents the delta degrees of freedom. The divisor used in the calculation is N - ddof, where N is the number of values. The default of 1 gives the sample variance.
  • numeric_only: This takes a boolean value indicating whether to include only float, int, or boolean columns.
  • **kwargs: These are additional keyword arguments that can be passed to the function.
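The following is a minimal sketch of how these parameters change the result. The DataFrame and its column names (`score`, `label`) are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one numeric column with a missing value, one text column
df = pd.DataFrame({
    "score": [1.0, 2.0, np.nan, 3.0],
    "label": ["a", "b", "c", "d"],
})

# numeric_only=True leaves the text column out of the calculation
sample_var = df.var(numeric_only=True)              # ddof=1: divide by n - 1
population_var = df.var(ddof=0, numeric_only=True)  # ddof=0: divide by n
with_nan = df.var(skipna=False, numeric_only=True)  # missing value is not skipped

print(sample_var)
print(population_var)
print(with_nan)
```

With skipna=False, a column that contains a missing value yields NaN, which is why the default of skipping missing values is usually the more convenient choice.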

Return value

When called on a DataFrame, the var() function returns a Series holding the variance of each column (axis 0) or of each row (axis 1).
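To see what comes back in practice, the sketch below inspects the returned object for both axes. The two-column DataFrame is a hypothetical example; with current pandas versions, the result is expected to be a Series labeled by column names or by row labels.

```python
import pandas as pd

# Hypothetical two-column DataFrame
df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 4, 6]})

by_column = df.var()     # one value per column
by_row = df.var(axis=1)  # one value per row

print(type(by_column).__name__)
print(by_column.index.tolist())  # labeled by column names
print(by_row.index.tolist())     # labeled by row labels
```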

Example

# Code to illustrate the var() function in pandas

# Importing the pandas library
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame([[1,2,3,4,5],
[1,7,5,9,0.5],
[3,11,13,14,12]],
columns=list('ABCDE'))
# Printing the DataFrame
print(df)

# Obtaining the variance of each column (computed down the rows)
print(df.var())

# Obtaining the variance of each row (computed across the columns)
print(df.var(axis="columns"))

Explanation

  • Line 4: We import the pandas library.
  • Lines 7–10: We create a DataFrame, df.
  • Line 12: We print df.
  • Line 15: Using the var() function with its default axis (axis 0), we obtain the variance of each column, computed down the rows. We print the result to the console.
  • Line 18: Using the var() function with axis="columns" (axis 1), we obtain the variance of each row, computed across the columns. We print the result to the console.
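
One way to sanity-check the example's results is to recompute them with NumPy, as in the sketch below. It rebuilds the same DataFrame and relies on the fact that `numpy.var` uses ddof=0 by default, so ddof=1 is passed explicitly to match pandas.

```python
import numpy as np
import pandas as pd

# Same data as the example above
df = pd.DataFrame([[1, 2, 3, 4, 5],
                   [1, 7, 5, 9, 0.5],
                   [3, 11, 13, 14, 12]],
                  columns=list("ABCDE"))

# NumPy divides by n by default, so ddof=1 is passed to match pandas
np_by_column = np.var(df.to_numpy(), axis=0, ddof=1)
np_by_row = np.var(df.to_numpy(), axis=1, ddof=1)

print(np.allclose(df.var(), np_by_column))
print(np.allclose(df.var(axis="columns"), np_by_row))
```

Both comparisons should print True, which indicates that the two libraries agree once the same degrees-of-freedom setting is used.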