Get the subset of the columns of a dataframe based on dtype

Overview

In pandas, the DataFrame.select_dtypes() function returns the subset of a dataframe's columns whose data types match the ones we specify.

Common column data types (dtypes) in pandas include float64, bool, int64, and more.
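To see which dtype each column holds before filtering, we can inspect the dtypes attribute of a dataframe. The following is a minimal sketch; the column names count, flag, and price are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "count": [1, 2, 3],
    "flag": [True, False, True],
    "price": [1.5, 2.5, 3.5],
})

# df.dtypes holds one dtype per column; convert to plain strings for printing
dtype_names = df.dtypes.astype(str).to_dict()
print(dtype_names)
```

The names printed here are the same strings that can be passed to select_dtypes().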

Syntax

DataFrame.select_dtypes(include=None, exclude=None)
Syntax for the select_dtypes() function in Pandas

Parameter value

This function takes the following parameter values:

  • include: This is the data type, or list of data types, to be included in the output result.
  • exclude: This is the data type, or list of data types, to be excluded from the output result.
  • Note: At least one of the parameters, include or exclude, must be passed to the select_dtypes() function. Otherwise, a ValueError is raised.
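The parameter rules above can be sketched as follows. This is an illustrative example rather than an exhaustive one: it passes a list to include, uses the generic "number" selector, and shows the error raised when neither parameter is given.

```python
import pandas as pd

df = pd.DataFrame({
    "INTEGERS": [1, 0] * 3,
    "BOOLEAN": [True, False] * 3,
    "FLOAT": [1.0, 2.0] * 3,
})

# A list selects several dtypes at once
ints_and_floats = df.select_dtypes(include=["int64", "float64"])
print(list(ints_and_floats.columns))

# "number" matches the numeric dtypes; bool columns are not matched by it
numeric = df.select_dtypes(include="number")
print(list(numeric.columns))

# Calling the function with neither include nor exclude raises ValueError
try:
    df.select_dtypes()
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```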

Return value

This function returns a DataFrame that contains only the columns whose data types are specified in include and are not specified in exclude.
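As a rough sketch of the return value, the example below combines include and exclude in one call, and also shows what comes back when no column matches. The expectation that an unmatched selection keeps the rows but has zero columns is based on observed pandas behavior and is worth confirming on the installed version.

```python
import pandas as pd

df = pd.DataFrame({
    "INTEGERS": [1, 0] * 3,
    "BOOLEAN": [True, False] * 3,
    "FLOAT": [1.0, 2.0] * 3,
})

# Keep the numeric columns, but leave out the float64 ones
result = df.select_dtypes(include="number", exclude="float64")
print(type(result).__name__)
print(list(result.columns))

# No column holds datetime values, so the result has no columns
no_match = df.select_dtypes(include="datetime")
print(no_match.shape)

# The original dataframe still has all of its columns
print(list(df.columns))
```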

Example

import pandas as pd
# creating a DataFrame with integer, boolean, and float columns
df = pd.DataFrame({'INTEGERS': [1, 0] * 3,
                   'BOOLEAN': [True, False] * 3,
                   'FLOAT': [1.0, 2.0] * 3})
# printing the DataFrame
print(df)
# selecting only the boolean columns
print(df.select_dtypes(include="bool"))
# selecting every column except the boolean ones
print(df.select_dtypes(exclude="bool"))

Explanation

  • Line 1: We import the pandas module.
  • Lines 3–5: We create a dataframe, df, with one integer, one boolean, and one float column.
  • Line 7: We print the dataframe, df.
  • Line 9: We use the select_dtypes() function with include="bool" to keep only the boolean column. We print the result to the console.
  • Line 11: We use the select_dtypes() function with exclude="bool" to drop the boolean column. We print the result to the console.
