The `apply` function in pandas applies a function along an axis of a DataFrame, that is, to each column or to each row.
On every call, the function receives a Series. With `axis = 0`, that Series is a column, indexed by the DataFrame's index. With `axis = 1`, it is a row, indexed by the DataFrame's columns.
The `apply` function has the following syntax:
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
The `apply` function takes the following parameters:
| Parameter | Description |
|---|---|
| `func` | The function to apply to each row or column. |
| `axis` | The axis along which the function is applied. `0` applies the function to each column. `1` applies the function to each row. |
| `raw` | Determines whether the row or column is passed as a Series or an ndarray. `False` passes a Series. `True` passes an ndarray. By default, it is `False`. |
| `result_type` | Only used when `axis = 1`. Can take four values: `expand`, `reduce`, `broadcast`, `None`. By default, it is `None`. |
| `args` | Positional arguments to pass to the function in addition to the array or Series. |
| `**kwargs` | Additional keyword arguments to pass to the function. |
Only the `func` parameter is required. The rest have default values.
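The optional parameters can be sketched as follows. The helpers `scale_and_shift` and `is_ndarray` are hypothetical names introduced only for this example; they show how `args` and `**kwargs` are forwarded to the function and how `raw` changes the object it receives.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=["A", "B"])

def scale_and_shift(values, factor, offset=0):
    # hypothetical helper: factor arrives through args, offset through **kwargs
    return values * factor + offset

# Each column is multiplied by 2 and then increased by 1
scaled = df.apply(scale_and_shift, args=(2,), offset=1)
print(scaled)

def is_ndarray(values):
    # hypothetical helper: reports whether apply passed a NumPy array
    return isinstance(values, np.ndarray)

as_series = df.apply(is_ndarray)             # raw=False: each column is a Series
as_ndarray = df.apply(is_ndarray, raw=True)  # raw=True: each column is an ndarray
print(as_series)
print(as_ndarray)
```

Passing `raw=True` can be worthwhile when the function only needs the underlying values, such as a NumPy reduction, and does not rely on index labels.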
`result_type` parameter

The `result_type` argument is only used when `axis = 1`. It can take four values: `expand`, `reduce`, `broadcast`, and `None`. By default, it is `None`.

Each value is discussed below:

- `expand`: list-like results are turned into columns.
- `reduce`: returns a Series if possible, instead of expanding list-like results.
- `broadcast`: results are broadcast to the original shape of the DataFrame, and the original index and columns are retained.
- `None`: the behavior depends on the return value of the applied function. List-like results are returned as a Series of those list-likes. However, if the applied function returns a Series, the results are expanded to columns.

The `apply` function returns a Series or a DataFrame after applying the specified function.
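The differences between these values are easier to see side by side. The sketch below uses a hypothetical row function, `pair`, that returns a two-element list; the function and the data are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=["A", "B"])

def pair(row):
    # hypothetical row function that returns a list-like result
    return [row["A"] + 1, row["B"] + 1]

# None (default): the lists are kept as elements of a Series
default = df.apply(pair, axis=1)

# reduce: also a Series of lists, since list-like results are not expanded
reduced = df.apply(pair, axis=1, result_type="reduce")

# expand: each list element becomes its own column, labeled 0 and 1
expanded = df.apply(pair, axis=1, result_type="expand")

# broadcast: the results fill a frame with the original index and columns
broadcast = df.apply(pair, axis=1, result_type="broadcast")

for name, result in [("default", default), ("reduce", reduced),
                     ("expand", expanded), ("broadcast", broadcast)]:
    print(name, type(result).__name__)
    print(result)
```

Note that `broadcast` expects each result to match the width of the original DataFrame, which is why `pair` returns exactly two values here.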
The snippet below shows how the `apply` function works in pandas. It uses the `lambda` keyword, which defines a small anonymous function in Python.

```python
import pandas as pd
import numpy as np

# Three identical rows: A = 4, B = 9
df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

print("Original Dataframe")
print(df)
print('\n')

# np.sqrt is applied to every column
print("Applying square root")
print(df.apply(np.sqrt))
print('\n')

# axis=0: one sum per column
print("Column-wise Sum")
print(df.apply(np.sum, axis=0))
print('\n')

# axis=1: one sum per row
print("Row-wise Sum")
print(df.apply(np.sum, axis=1))
print('\n')

# The lambda returns a list for every row, so the result is a Series of lists
print("Returning a list-like value for each row")
print(df.apply(lambda x: [1, 2], axis=1))
print('\n')
```