What is DataFrame.median() in polars?

Polars is a DataFrame library for data manipulation and analysis. It is designed to process and analyze large datasets quickly and efficiently.

The DataFrame.median() function

The DataFrame.median() function in polars computes the median of every column in a DataFrame. The same method can also be called on a single column.

The median is the middle value of a list of numbers after the list has been sorted.

  • If the list contains an odd number of values, the middle value is the median.

  • If the list contains an even number of values, the median is the mean of the two middle values.

Figure: Median of a list with an odd number of values
Figure: Median of a list with an even number of values
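The two rules above can be sketched in plain Python. This is only an illustration of the definition, not how polars implements it; the helper name median_of is hypothetical.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value is the median.
        return ordered[mid]
    # Even count: the mean of the two middle values is the median.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = median_of([7, 1, 4])        # sorted: [1, 4, 7]
even_result = median_of([2, 1, 24, 25])  # sorted: [1, 2, 24, 25]
print(odd_result, even_result)
```

For the odd-length list, the middle value 4 is returned. For the even-length list, the result is the mean of 2 and 24, which is 13.0.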

Syntax

Here is the syntax for the median function:

DataFrame.median()

We can also compute the median of a single column by selecting that column first and then calling median() on it:

DataFrame[<column_name>].median()

Return value

The function returns a DataFrame with a single row that holds the median of each column. If we select a single column and call median() on it, we get a single median value for that column instead.

Note: Non-numeric columns, such as string columns, are not dropped from the result. Their median is reported as null. Null values inside a numeric column are ignored when the median is computed.

Code

Here's a coding example that uses the DataFrame.median() function to calculate the median of the columns of a DataFrame in polars:

import polars as pl
df = pl.DataFrame(
    {
        "Product": ["Note holder", "Scissors", "Stapler", "Paper clip"],
        "Price": [2, 1, 24, 25],
        "Quantity": [70, None, 30, 200],
    }
)

# Compute the median of every column in the table
print(df.median())

# Compute the median of only one column, "Price"
print("Median of the column with Prices: ", df["Price"].median())

Explanation

  • Line 1: We import the polars library as pl.

  • Lines 2–8: We define our DataFrame as df for the stationery shop with the product's name, price, and quantity columns.

  • Line 11: We call df.median() and print the median of every column in the table.

  • Line 14: We select the Price column with df["Price"] and call median() on it to print the median of the Price column only.

Understanding measures of central tendency is important in data analysis and manipulation. The DataFrame.median() function in polars offers an efficient and robust way to compute medians in our data, making it a useful addition to our data analysis toolkit.

Copyright ©2025 Educative, Inc. All rights reserved