Polars is a fast and efficient data manipulation library written in Rust. It is designed for high-performance operations on large tabular datasets and often handles them more quickly than pandas. One of its useful functions is pl.sum_horizontal() (also available as the DataFrame.sum_horizontal() method), which computes the sum across rows, aggregating values horizontally over the specified columns. This is especially helpful for row-wise computations, such as summing related features in a dataset, without manually iterating through rows.
The sum_horizontal() method
The sum_horizontal() function computes the row-wise sum of values across columns in a DataFrame.
Below is the syntax of the sum_horizontal() function:
pl.sum_horizontal(*exprs)
*exprs: The column(s) to aggregate. It accepts expression input. Strings are parsed as column names; other non-expression inputs are parsed as literals.
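pl.sum_horizontal() returns an expression (Expr). When that expression is evaluated in a context such as select(), it produces a column holding the sum of values for each row in the DataFrame. The DataFrame.sum_horizontal() method, by contrast, returns a Series directly.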
To demonstrate the use of the sum_horizontal() function, consider the following example:
import polars as pl

# Creating a DataFrame
data = pl.DataFrame(
    {
        "alpha": [10, 20, 30, 40],
        "beta": [5.0, 15.0, 25.0, 30.0],
        "gamma": [2, 4, 6, 6],
    }
)

# Use sum_horizontal to compute the sum across columns for each row
result_sum = data.select([
    pl.sum_horizontal(["alpha", "beta", "gamma"]).alias("sum_horizontal")
])
# Display the result
print(result_sum)
Lines 4–10: We create a new DataFrame, data, that has three columns (alpha, beta, and gamma) and four rows of numeric values.
Lines 13–15: The sum_horizontal() function builds an expression that sums alpha, beta, and gamma for each row, and alias("sum_horizontal") names the resulting column "sum_horizontal". Passing this expression to the select() method of data produces a new DataFrame that contains only the computed sums.
Line 17: We print the result_sum DataFrame.
The pl.sum_horizontal() function simplifies the process of creating composite metrics or overall scores from multiple data points. It is useful in contexts where data needs to be aggregated across multiple columns for each row, such as in financial analysis, survey scoring, quality control, fitness tracking, and environmental monitoring.