Polars is a powerful library for data manipulation and analysis. It is designed to process and analyze large datasets more quickly and efficiently.
DataFrame.median()
functionThe DataFrame.median()
function in polars facilitates the computation of the median across columns or specific columns within a DataFrame.
The median is computed by finding the middle value after sorting the list of numbers.
If there are an odd amount of numbers in the list, then the middle number is the median of the list.
If there are an even amount of numbers in the list, then the mean of the middle pair is the median of the list.
Here is the syntax for the median function:
DataFrame.median()
We can also compute the median of a specific column by specifying the column name in the median function like the following:
DataFrame[<column_name>].median()
The function returns a DataFrame, which contains the median values calculated for each numeric column. The resulting DataFrame will have a single row containing the median values for each column. If we specify the column name to compute the median of the specific column, it will return a single median value for that column.
Note: The non-numeric columns will be excluded from the computation.
Here's the coding example of the DataFrame.median()
function to calculate the median of numeric columns in polars:
import polars as pldf = pl.DataFrame({"Product": ["Note holder", "Scissors", "Stapler", "Paper clip"],"Price": [2, 1, 24, 25],"Quantity": [70, None, 30, 200],})# Computing the median of complete tableprint(df.median())# Computing the median of only one column, "Price"print("Median of the column with Prices: ", df["Price"].median())
Line 1: We import the polars
library as pl
.
Lines 2–9: We define our DataFrame as df
for the stationary shop with the product's name, price, and quantity columns.
Line 11: We use the df.median()
function to print the median of the complete table.
Line 14: We use the df.median()
function with the column name Price
to only print the median of the Price
column.
Understanding the central tendencies is very important in the field of data analysis and manipulation. The DataFrame.median()
in polars is a powerful tool that offers efficiency and robustness for computing medians in our data, making it a great addition to our data analysis toolkit.
Free Resources