How to filter pandas DataFrame by column value

Pandas is an open-source Python library mostly used in data sciences, data analysis, and machine learning projects. In machine learning tasks, it helps clean and prepare the data to pass them into machine learning models. The most common operations performed using pandas are filling in missing values, manipulating the data, loading data from files, and visualizing data.

In this Answer, we will explore how to select a DataFrame with specific values by applying a filter on the column values.

Filtering methods

We can filter DataFrame by column values using the following methods:

  1. Boolean indexing: We can create a condition (Boolean expression). Thee DataFrame will then keep the rows where the condition was true and discard the rows where it was false.

filtered_df = df[df['Age'] > 18]
Filtering using boolean indexing
  1. qeury() method: We can pass a query in string format in the query method. It is mainly used for complex filtering.

filtered_df = df.query('Age > 25 and Country == "USA"')
Filtering using query method
  1. loc[] accessor: We can access the column values of a DataFrame directly by applying conditions in the loc[] accessor.

filtered_df = df.loc[df['Age'] > 25]
Filtering using loc[] accessor
  1. isin() method: We can filter rows of a DataFrame based on whether the values in a specified column are present in a given list or array.

ages = [25, 30, 35]
filtered_df = df[df['Age'].isin(ages)]
Filtering using isin() method

Example

To learn how we can apply a filter on the column values, let's first create a data example. Consider that we have the following 2-D data present in a pandas DataFrame. The data contains the person's name, country, and age:

Name

Country

Age

Albert

USA

12

James

USA

14

Alex

USA

20

Bob

Canada

25

Sara

Canada

29

Bill

Germanny

45

Daniel

Germany

32

John

Egypt

60

In this example, we aim to infer two separate pieces of information from the data:

1. Information of persons living in The USA.

The DataFrame will provide us with the following person's information if we find the person living in the USA:

Name

Country

Age

Albert

USA

12

James

USA

14

Alex

USA

20

2. Information of persons older than 18 but younger than 40.

The DataFrame will provide us with the following person's information if we find the person older than 18 and younger than 40:

Name

Country

Age

Alex

USA

20

Bob

Canada

25

Sara

Canada

29

Daniel

Germany

32

In the coming sections, we will load the example data in a pandas DataFrame and apply the filters to the data.

Coding example

First of all, let's create a pandas DataFrame and load it with some hard-coded data:

import pandas as pd

person_data = pd.DataFrame({
  "Name": ["Albert", "James", "Alex", "Bob", "Sara", "Bill", "Daniel", "John"],
  "Country": ["USA", "USA", "USA", "canada", "Canada", "Germany", "Germany", "Egypt"] ,
  "Age": [12, 14, 20, 25, 29, 45, 32, 60]
})

print(person_data)
Creating pandas DataFrame

Code explanation

  • Line 1: We import pandas and give it the name pd.

  • Lines 3–7: We create a DataFrame using pd.DataFrame function, store it in the person_data variable, and pass the person's information in the curly brackets {}.

Now that we have created the DataFrame, we can move towards applying filters on column values.

import pandas as pd

person_data = pd.DataFrame({
  "Name": ["Albert", "James", "Alex", "Bob", "Sara", "Bill", "Daniel", "John"],
  "Country": ["USA", "USA", "USA", "canada", "Canada", "Germany", "Germany", "Egypt"] ,
  "Age": [12, 14, 20, 25, 29, 45, 32, 60]
})

age = [25,60]

age_greater_than_25 = person_data.loc[person_data['Age'] > 25]
germany_and_older_than_25 = person_data.query('Age > 25 and Country == "Germany"')
age_25_and_60 = person_data[person_data["Age"].isin(age)]
living_in_USA = person_data[person_data["Country"] == "USA"]
age_filter = person_data[(person_data["Age"] > 18) & (person_data["Age"] < 40)]

print("Age greater than 25:")
print(age_greater_than_25)

print("Germany citizen and age greater than 25:")
print(germany_and_older_than_25)

print("Age 25 and 60:")
print(age_25_and_60)

print("Person living in USA:")
print(living_in_USA)

print("Person older than 18 and younger than 40:")
print(age_filter)
Applying filter on DataFrame column values

Code explanation

  • Line 11: We use the loc accessor and pass the condition person_data['Age'] > 25 in the square brackets [].

  • Line 12: We use the query method and pass the condition Age > 25 and Country == "Germany" in the method.

  • Line 13: We use the isin() method and pass the age list. The result of the isin() method is stored in the age_25_and_60 variable.

  • Line 14: We use the person_data variable and apply the filter person_data["Country"] == "USA" in the square brackets []. The result of applying the filter is stored in the living_in_USA variable.

  • Line 15: We again use the person_data variable and apply the filter (person_data["Age"] > 18) & (person_data["Age"] < 40) inside the square brackets []. The result of applying the filter is stored in the age_filter variable.

  • Lines 17–30: We use the print statement and print out the filtered data.

Conclusion

Pandas provide us with the capability of filtering out DataFrame values by applying filters on the column values. This can help us to extract information and get insights from the data.

Free Resources

Copyright ©2025 Educative, Inc. All rights reserved