How to check for duplicated rows of a DataFrame in Pandas

Parameters

The duplicated() function takes the following parameter values:

subset (optional): A column label or sequence of labels specifying the columns to consider when identifying duplicates. By default, all columns are used.
keep (optional): Determines which occurrences are marked as duplicates. It takes any of the following values:

"first" (default): Marks every duplicate as True except for the first occurrence.
"last": Marks every duplicate as True except for the last occurrence.
False (the Boolean, not a string): Marks all duplicates as True.

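As a minimal sketch of how these parameters change the result, the snippet below runs duplicated() on a small made-up DataFrame. The column names and values are illustrative assumptions and are not part of the example later in this article.

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 2 are identical,
# and rows 0, 2, and 3 share the same "name".
df = pd.DataFrame({"name": ["Ann", "Bob", "Ann", "Ann"],
                   "score": [90, 85, 90, 70]})

# keep="first" (the default): only the later copy of a repeated row is flagged
first_mask = df.duplicated(keep="first")

# keep="last": only the earlier copy of a repeated row is flagged
last_mask = df.duplicated(keep="last")

# keep=False: every row that has a copy is flagged
all_mask = df.duplicated(keep=False)

# subset="name": only the "name" column is compared
name_mask = df.duplicated(subset="name")

print(first_mask.tolist())  # [False, False, True, False]
print(last_mask.tolist())   # [True, False, False, False]
print(all_mask.tolist())    # [True, False, True, False]
print(name_mask.tolist())   # [False, False, True, True]
```

Note that keep=False is the Boolean False. Passing the string "false" is not one of the accepted values.
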
Return value

The duplicated() function returns a Boolean Series with one value per row: True if the row is marked as a duplicate, and False otherwise.

By default, the duplicated() function returns False for the first occurrence of a duplicated row and True for every later occurrence. Setting keep="last" reverses this: the last occurrence is marked False, and all earlier occurrences are marked True.
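
Because the return value is a Boolean Series aligned with the DataFrame's index, it can be used as a mask. The following is a small sketch with made-up data (the column names are assumptions for illustration) that counts the duplicates and separates them from the remaining rows.

```python
import pandas as pd

# Hypothetical sample data: row 2 repeats row 0
df = pd.DataFrame({"city": ["Oslo", "Lima", "Oslo"],
                   "code": [1, 2, 1]})

# Boolean Series: True where a row repeats an earlier row
mask = df.duplicated()

# True counts as 1, so the sum is the number of flagged rows
dup_count = int(mask.sum())

# Boolean indexing: keep only the flagged rows, or only the unflagged ones
duplicates = df[mask]
unique_rows = df[~mask]

print(dup_count)                  # 1
print(duplicates.index.tolist())  # [2]
print(unique_rows.index.tolist()) # [0, 1]
```

Selecting df[~mask] with the default keep="first" gives the same rows as df.drop_duplicates().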

Example

# Code to illustrate the duplicated() function

# importing the pandas library
import pandas as pd

# creating a dataframe
df = pd.DataFrame([["THEO",1,1,3,"A"],
                   ["Theo",1,1,3,"A"],
                   ["THEO",1,1,3,"A"]],
                   columns=list('ABCDE'))
# printing the dataframe
print(df)

print("\n")
# to check for duplicate rows
print(df.duplicated())

print("\n")
# marking every occurrence except the last as True
print(df.duplicated(keep = "last"))

print("\n")
# checking for duplicates based on column A only
print(df.duplicated(subset = ["A"]))


License: Creative Commons-Attribution NonCommercial-ShareAlike 4.0 (CC-BY-NC-SA 4.0)
