The DataFrame.melt() method
In Polars, the melt operation transforms a DataFrame from a wide format to a long format by reshaping columns into rows.
It reorganizes tabular data by turning the columns that hold measured variables into rows. Any identifier columns are kept as they are, and two new columns are added: one holding the original column names (the variable column) and one holding the corresponding values (the value column).
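To make the wide-to-long idea concrete, here is a minimal plain-Python sketch of the reshaping. The helper name melt_rows and the list-of-dicts layout are assumptions made for illustration only; this is not how Polars implements melt internally.

```python
from typing import Any


def melt_rows(
    rows: list[dict[str, Any]],
    id_vars: list[str],
    value_vars: list[str],
    variable_name: str = "variable",
    value_name: str = "value",
) -> list[dict[str, Any]]:
    """Hypothetical helper: reshape wide rows into long rows."""
    long_rows = []
    # Emit one block of rows per melted column.
    for column in value_vars:
        for row in rows:
            # Identifier values are copied through unchanged.
            new_row = {key: row[key] for key in id_vars}
            # The former column name goes into the variable column...
            new_row[variable_name] = column
            # ...and the cell it held goes into the value column.
            new_row[value_name] = row[column]
            long_rows.append(new_row)
    return long_rows


wide = [
    {"a": "aa", "b": 2, "c": 3},
    {"a": "bb", "b": 4, "c": 6},
]
long_rows = melt_rows(wide, id_vars=["a"], value_vars=["b", "c"])
for row in long_rows:
    print(row)
```

Two wide rows with two measured columns become four long rows, each carrying the identifier, the name of the column it came from, and the value.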
The syntax of the DataFrame.melt() method is given below:
DataFrame.melt(
    id_vars: ColumnNameOrSelector | Sequence[ColumnNameOrSelector] | None = None,
    value_vars: ColumnNameOrSelector | Sequence[ColumnNameOrSelector] | None = None,
    variable_name: str | None = None,
    value_name: str | None = None,
)
Line 2: The id_vars parameter specifies the column(s) or selector(s) to use as identifier variables. These columns are carried over unchanged into the result. If it is not specified, no identifier columns are kept.
Line 3: The value_vars parameter defines the column(s) or selector(s) to use as value variables. If it is not specified, all columns not listed in id_vars are melted.
Line 4: The variable_name parameter sets the name of the variable column. The default is variable.
Line 5: The value_name parameter sets the name of the value column. The default is value.
Now let’s look at a coding example to understand the DataFrame.melt() method:
import polars as pl
import polars.selectors as cs

# Create a sample DataFrame
df = pl.DataFrame({
    "a": ["aa", "bb", "cc"],
    "b": [2, 4, 6],
    "c": [3, 6, 9],
    "d": [4, 8, 12]
})

# Melt the DataFrame
melted_df = df.melt(id_vars="a", value_vars=cs.numeric())
# Melting some columns
melted_some = df.melt(id_vars="b", value_vars=("c", "d"))

# Display the result
print(melted_df)
print("melted some values are: ", melted_some)
Let’s take a look at the above code step-by-step:
Lines 1–2: We import the Polars library and its selectors module. The pl alias is commonly used for Polars, and cs is used for selectors.
Lines 5–10: We create a DataFrame named df with columns a, b, c, and d, each containing sample data.
Line 13: We apply the melt() method to the DataFrame df with the following arguments:
id_vars="a": The a column is set as the identifier variable.
value_vars=cs.numeric(): All numeric columns (b, c, and d) are melted.
Line 15: We apply the melt() method to the DataFrame df again, this time with the following arguments:
id_vars="b": The b column is set as the identifier variable. This means that the values in the b column are retained as is in the resulting melted DataFrame.
value_vars=("c", "d"): The columns c and d are specified as the columns to be melted. Their values are unpivoted into the value column, and a new variable column is created to hold the column names (c and d).
Line 18: We print the melted DataFrame for all numeric columns.
Line 19: We print the melted DataFrame for the specific columns c and d, with b as the identifier variable.
The output of the above code example shows the new DataFrames returned by the DataFrame.melt() method, each containing the melted data.
The DataFrame.melt() method in Polars is particularly useful for data manipulation tasks that call for a long-format representation of the data, which often simplifies downstream analysis and visualization. Its parameters let us choose the identifier and value columns and set custom names for the resulting variable and value columns, so the same method supports a wide range of data analysis workflows.