What is Pandas DataFrame.where() in Python?

Overview

The DataFrame.where() method in Pandas keeps the values of a DataFrame where a specified condition is true and replaces the values where the condition is false. By default, the replaced values become NaN.

Note: A Pandas DataFrame is a two-dimensional labeled data structure.
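
As a quick illustration of the behavior described above, here is a minimal sketch on a small numeric DataFrame. The column names a and b and the threshold 2 are made up for this example.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Keep values greater than 2; every other entry is replaced with NaN
masked = df.where(df > 2)
print(masked)
```

The result has the same shape as df. Only the entries that fail the condition change.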

Syntax


# Signature as documented for Pandas 1.x
DataFrame.where(cond,
                other=NoDefault.no_default,
                inplace=False,
                axis=None,
                level=None,
                errors='raise',
                try_cast=NoDefault.no_default)

Parameters

This method takes the following argument values:

  • cond: Boolean DataFrame, Series, array-like, or callable
    • It is the condition to check the DataFrame against. Entries where it is true are kept, and entries where it is false are replaced. Several conditions can be combined with logical operators.
  • other: Series, scalar, DataFrame, or callable
    • It is the value used to replace entries where the condition is false. If it is omitted, those entries become NaN.
  • inplace: Boolean, default=False
    • It determines whether to perform the operation on the original data (True) or on a copy (False).
  • axis: Integer, default=None
    • It is the axis (rows or columns) along which other is aligned, if alignment is needed.
  • level: Integer, default=None
    • It is the index level to use for alignment, if needed.
  • errors: String, default='raise'
    • raise: It allows this method to raise exceptions.
    • ignore: It suppresses exceptions.
  • try_cast: Boolean, default=None
    • It tries to cast the result back to the input type.

Note: The errors and try_cast parameters were deprecated and then removed in Pandas 2.0, so they should be omitted with current releases.
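
The other and axis parameters are easier to follow with a small example. The sketch below assumes a made-up DataFrame with columns x and y and a hypothetical fallback Series. It passes a scalar, a callable, and a row-aligned Series as the replacement.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"x": [1, -2, 3], "y": [-4, 5, -6]})

# Scalar replacement: negative entries become 0
clipped = df.where(df >= 0, other=0)

# Callable replacement: the callable receives the DataFrame and
# returns the replacement values (here, the negated frame)
flipped = df.where(df >= 0, other=lambda frame: -frame)

# Series replacement aligned along the rows (axis=0):
# each row gets its own fallback value
fallback = pd.Series([10, 20, 30])
filled = df.where(df >= 0, other=fallback, axis=0)

print(clipped, flipped, filled, sep="\n\n")
```

In each case, only the entries where the condition is false are replaced.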

Return value

This method returns an object of the same type as the caller. When inplace is True, it modifies the caller directly and returns None instead.
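
The difference between the two return values can be checked with a short sketch. The score column and the pass mark of 60 are assumptions made for this example.

```python
import pandas as pd

# Hypothetical sample data (floats, so NaN fits without a dtype change)
df = pd.DataFrame({"score": [55.0, 80.0, 91.0]})

# Default (inplace=False): a new DataFrame is returned and df is unchanged
passed = df.where(df["score"] >= 60)

# inplace=True: the caller is modified and the call returns None
in_place = df.copy()
returned = in_place.where(in_place["score"] >= 60, inplace=True)

print(passed)
print(returned)
```

Because the in-place call returns None, assigning its result to a variable and then using that variable as a DataFrame is a common mistake.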

Code

Let's look at how we can create a DataFrame and filter its data on specified conditions using where().

Single condition operation

# Importing the Pandas package
import pandas as pd
# Nested lists of data
data = [['Julia', 'Grade 10', 78],
        ['Butller', 'Grade 12', 90],
        ['Monitosh', 'Grade 11', 88],
        ['Butller', 'Grade 5', 95],
        ['vyohi', 'Grade 7', 72]]
# Creating a DataFrame
df = pd.DataFrame(data, columns=['Name', 'Class', 'Marks'])
# Creating a Boolean series for the name Butller
_filter = df["Name"] == "Butller"
# Keeping the matching rows; all other rows become NaN
results = df.where(_filter)
# Showing the data on the console
print(results)

Explanation

  • Lines 4–8: We create a nested list of five observations to convert into a DataFrame.
  • Line 10: We invoke the DataFrame() constructor from the Pandas package to convert this nested list into a DataFrame with the columns Name, Class, and Marks.
  • Line 12: We create a Boolean series that is True for the rows where Name is Butller.
  • Lines 14–16: We call df.where() with this series and print the result. The rows for Butller keep their values, and every other row is filled with NaN.
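
Since where() keeps the original shape, the non-matching rows are still present in the result as NaN rows. If only the matching rows are wanted, one option is to drop the all-NaN rows afterward, and another is plain Boolean indexing. The sketch below uses a shortened version of the data above. The variable names is_butller, masked, only_butller, and selected are chosen for this example.

```python
import pandas as pd

# Shortened version of the student data, used only for illustration
df = pd.DataFrame(
    {"Name": ["Julia", "Butller", "Monitosh"], "Marks": [78, 90, 88]}
)
is_butller = df["Name"] == "Butller"

# where() keeps all three rows; the non-matching rows hold NaN
masked = df.where(is_butller)

# Dropping the all-NaN rows leaves only the matching row
only_butller = masked.dropna(how="all")

# Boolean indexing selects the matching rows directly
selected = df[is_butller]

print(masked, only_butller, selected, sep="\n\n")
```

Boolean indexing also leaves the column types untouched, whereas introducing NaN through where() can turn an integer column such as Marks into a floating-point one.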

Code

Here, the code is the same as above except for the filtering condition. Instead of a single condition, we combine multiple conditions using logical operators.

Multiple conditions operation

# Importing the Pandas package
import pandas as pd
# Nested lists of data
data = [['Julia', 'Grade 10', 78],
        ['Butller', 'Grade 12', 90],
        ['Monitosh', 'Grade 11', 88],
        ['Butller', 'Grade 5', 95],
        ['vyohi', 'Grade 7', 72]]
# Creating a DataFrame
df = pd.DataFrame(data, columns=['Name', 'Class', 'Marks'])
# Creating Boolean series for the name Butller and for Grade 5
_filter1 = df["Name"] == "Butller"
_filter2 = df["Class"] == "Grade 5"
# Keeping rows that satisfy both conditions; all other rows become NaN
results = df.where(_filter1 & _filter2)
# Showing data on the console
print(results)

Explanation

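  • Line 12: We create a Boolean series that is True for the rows where Name is Butller.
  • Line 13: We create a Boolean series that is True for the rows where Class is Grade 5.
  • Lines 15–17: We combine both series with the & operator, pass the result to df.where(), and print it. Only the student named Butller who studies in Grade 5 keeps their values, and every other row is filled with NaN.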
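
Besides & (and), conditions can also be combined with | (or) and negated with ~ (not). Each comparison should be wrapped in parentheses, because these operators bind more tightly than == and >. The following sketch uses a shortened, made-up version of the student data.

```python
import pandas as pd

# Shortened version of the student data, used only for illustration
df = pd.DataFrame(
    {"Name": ["Julia", "Butller", "vyohi"], "Marks": [78, 95, 72]}
)

# OR: keep rows where the name is Julia or the marks exceed 90
either = df.where((df["Name"] == "Julia") | (df["Marks"] > 90))

# NOT: keep rows whose name is anything other than Butller
not_butller = df.where(~(df["Name"] == "Butller"))

print(either, not_butller, sep="\n\n")
```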
