How to make a copy of a data frame in pandas

Overview

The copy method returns a copy of the DataFrame it is called on. A DataFrame can be copied in two ways:

  1. Deep copy: It creates a new DataFrame with a copy of the data and indices of the given DataFrame. Changes to the copy’s data or indices will not be reflected in the original DataFrame.
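  2. Shallow copy: It creates a new DataFrame without copying the data or index of the caller object (only references to the data and index are copied). In pandas versions without Copy-on-Write, any modification to the original’s data is mirrored in the copy (and vice versa). With Copy-on-Write enabled (the default from pandas 3.0), the two objects still share data initially, but a modification to one is no longer visible in the other.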

By default, the method makes a deep copy. Pass deep=False to make a shallow copy instead.
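The difference between the two kinds of copy can be observed by checking whether the underlying arrays overlap in memory. The following is a minimal sketch, assuming a numeric NumPy-backed column for which to_numpy() returns a view rather than a copy; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Name": ["dom", "abhi", "celeste"], "Age": [10, 15, 14]})

deep = df.copy()               # deep=True is the default
shallow = df.copy(deep=False)  # only references to the data are copied

# np.shares_memory reports whether two arrays overlap in memory.
# Assumption: to_numpy() returns a view for this int64 column.
shares_deep = np.shares_memory(df["Age"].to_numpy(), deep["Age"].to_numpy())
shares_shallow = np.shares_memory(df["Age"].to_numpy(), shallow["Age"].to_numpy())

print("deep copy shares memory with original:", shares_deep)
print("shallow copy shares memory with original:", shares_shallow)
```

The check is made before any modification, because writing to either object may cause pandas to stop sharing the data.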

Note: Refer to What is pandas in Python? to learn more about pandas.

Syntax

DataFrame.copy(deep=True)

Parameter

deep is a boolean parameter that indicates whether to make a deep or a shallow copy. If True, a deep copy is made. Otherwise, a shallow copy is made.
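As a small illustration of the default, the sketch below calls copy without arguments and then changes both the data and the index of the copy. The names df_copy and the replacement index labels are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["dom", "abhi", "celeste"], "Age": [10, 15, 14]})

# No argument is passed, so deep=True applies.
df_copy = df.copy()

# Change a value and the index of the copy only.
df_copy.loc[0, "Age"] = 99
df_copy.index = ["a", "b", "c"]

# The original keeps its data and its index.
print(df.loc[0, "Age"])   # 10
print(list(df.index))     # [0, 1, 2]
```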

Code example (deep copy)

Let’s look at the code below:

import pandas as pd
data = [['dom', 10], ['abhi', 15], ['celeste', 14]]
df = pd.DataFrame(data, columns = ['Name', 'Age'])
df_deep_copy = df.copy(deep=True)
print("Original Dataframe - \n")
print(df)
print("Deep Copy Dataframe - \n")
print(df_deep_copy)
print("\n")
print("Changing value in original dataframe\n")
df.iloc[0,1] = -9
print("Original Dataframe after changes - \n")
print(df)
print("Deep Copy Dataframe after changes - \n")
print(df_deep_copy)

Explanation

  • Line 1: We import the pandas module.
  • Lines 2 to 3: We create a dataframe called df.
  • Line 4: We get a deep copy of df called df_deep_copy using the copy method with the deep argument set to True.
  • Lines 5 to 8: We print df and df_deep_copy.
  • Line 11: We modify the Age value in the first row of df.
  • Lines 12 to 15: We print df and df_deep_copy again.

In the above code, modifying the original DataFrame does not affect the deep copy, which keeps its original values.

Code example (shallow copy)

Let’s look at the code below:

import pandas as pd
data = [['dom', 10], ['abhi', 15], ['celeste', 14]]
df = pd.DataFrame(data, columns = ['Name', 'Age'])
df_shallow_copy = df.copy(deep=False)
print("Original Dataframe - \n")
print(df)
print("Shallow Copy Dataframe - \n")
print(df_shallow_copy)
print("\n")
print("Changing value in original dataframe\n")
df.iloc[0,1] = -9
print("Original Dataframe after changes - \n")
print(df)
print("Shallow Copy Dataframe after changes - \n")
print(df_shallow_copy)

Explanation

  • Line 1: We import the pandas module.
  • Lines 2 to 3: We create a dataframe called df.
  • Line 4: We get a shallow copy of df called df_shallow_copy using the copy method with the deep argument set to False.
  • Lines 5 to 8: We print df and df_shallow_copy.
  • Line 11: We modify the Age value in the first row of df.
  • Lines 12 to 15: We print df and df_shallow_copy again.

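In the above code, modifying the original DataFrame also changes the shallow copy, because both share the same data. This holds for pandas versions without Copy-on-Write. With Copy-on-Write enabled (the default from pandas 3.0), the shallow copy keeps its original values.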
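Whether a change to the original shows up in a shallow copy depends on the installed pandas version and on whether Copy-on-Write is active. The following is a minimal sketch for checking this locally; the variable name propagated is illustrative, and no particular output is assumed.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["dom", "abhi", "celeste"], "Age": [10, 15, 14]})
shallow = df.copy(deep=False)

# Modify the original only.
df.iloc[0, 1] = -9

# True when the shallow copy mirrors the change (no Copy-on-Write),
# False when the shallow copy kept its old value (Copy-on-Write active).
propagated = bool(shallow.iloc[0, 1] == -9)

print("pandas version:", pd.__version__)
print("change visible in shallow copy:", propagated)
```

If the copy must stay independent of the original regardless of version, a deep copy is the safer choice.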
