Polars, a Rust DataFrame library with Python bindings, excels at high-performance data processing for large datasets. Its parallel processing capabilities and support for various data sources make it a strong choice for efficient tabular data management, often with better performance than pandas.
Here, we’ll discuss the update() function of the Polars library.
The update() function
The update() function helps us to merge or align two DataFrames, updating values in the target DataFrame with non-null values from the source DataFrame. This function facilitates the integration of new data into an existing DataFrame, allowing for flexible updating strategies, such as inner, outer, and explicit column-based joins.
Here’s the syntax for the update() function:
DataFrame.update(
    other: DataFrame,
    on: str | Sequence[str] | None = None,
    how: Literal['left', 'inner', 'outer'] = 'left',
    *,
    left_on: str | Sequence[str] | None = None,
    right_on: str | Sequence[str] | None = None,
    include_nulls: bool = False,
) → DataFrame
In the syntax above:
other: The DataFrame whose values are used to update the original one.
on: The column name(s) to join on. If it is None (the default), rows are matched by their position, using the implicit row index of each DataFrame.
how: The merging approach. left retains all left rows, inner keeps only the rows with matching keys, and outer updates existing matches while adding new rows. Recent Polars releases name this last strategy full instead of outer.
left_on: The join column(s) of the left DataFrame.
right_on: The join column(s) of the right DataFrame.
include_nulls: If True, null values in the right DataFrame also overwrite values in the left DataFrame. By default (False), they are ignored.
Let’s discuss a coding example to better understand how this function works:
import polars as pl

df = pl.DataFrame(
    {
        "EmployeeName": ["John", "Smith", "David", "Ronaldo"],
        "Age": [24, 32, 19, 26],
        "Salary": [100, 300, 250, 320]
    }
)

new_df = pl.DataFrame(
    {
        "Salary": [140, 330, None],
        "LastName": ["Dan", "Rohn", "Michel"],
    }
)

# Simply update DataFrame
print(df.update(new_df))

# Update DataFrame by keeping those rows that are common
print(df.update(new_df, how="inner"))

# Update DataFrame containing all rows in both DataFrames
print(df.update(new_df, how="outer"))

# Explicitly joining columns in each DataFrame, including null values
print(df.update(new_df, left_on="EmployeeName", right_on="LastName", how="outer", include_nulls=True))
Let’s discuss the above code in detail.
Line 1: We import the polars library as pl.
Lines 3–9: We create a DataFrame named df containing the EmployeeName, Age, and Salary columns.
Lines 11–16: We create another DataFrame, new_df, to update the previous one.
Line 19: We call the update() function to update the values of df with those of new_df. No join key is passed, so rows are matched by position.
Line 22: We keep only the rows that are common to both DataFrames by passing the how="inner" argument to the function.
Line 25: We keep all rows from both DataFrames by passing the how="outer" argument to the function.
Line 28: We join explicitly on the EmployeeName column of df and the LastName column of new_df by passing left_on and right_on, keep all rows with how="outer", and let null values from new_df overwrite existing values by passing the include_nulls=True argument.