How to compute correlation using pandas

pandas is a popular Python-based data analysis toolkit that can be imported using:

import pandas as pd

It offers a wide range of utilities, from parsing multiple file formats to converting an entire data table into a NumPy array. This versatility makes pandas a staple of data science and machine learning work.
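
As a rough illustration of those two ends of the workflow, the sketch below parses CSV text and converts the resulting table to a NumPy array. The CSV content is made up for this example, and `io.StringIO` stands in for a real file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content, used here in place of a real file.
csv_text = "dogs,cats\n0.2,0.3\n0.01,0.6\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Convert the whole table into a 2-D NumPy array.
arr = df.to_numpy()
print(arr.shape)
```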

pandas also provides tools for many kinds of statistical analysis. One such tool is correlation.

With its default arguments, the correlation method is called as:

DataFrame.corr(method='pearson', min_periods=1)

Parameters

  • method: {'pearson', 'kendall', 'spearman'} or callable - Method of correlation:
    -pearson: Standard correlation coefficient
    -kendall: Kendall Tau correlation coefficient
    -spearman: Spearman rank correlation
    -callable: Any callable that takes two 1d ndarrays as input and returns a float.

  • min_periods: int - The minimum number of observations required per pair of columns to have a valid result. Currently this is only available for Pearson and Spearman correlation.
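
The parameters above can be tried out with a short sketch like the following. The `cosine_similarity` function and the `gappy` table are hypothetical, introduced only to illustrate a callable method and the effect of `min_periods` when values are missing. The `kendall` method is left out because, as far as I know, pandas relies on SciPy for it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"dogs": [0.2, 0.01, 0.6, 0.2],
                   "cats": [0.3, 0.6, 0.01, 0.1]})

# Built-in methods are selected by name.
pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")

# Hypothetical custom measure: takes two 1d ndarrays, returns a float.
def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

custom = df.corr(method=cosine_similarity)

# Hypothetical table with gaps: only two rows have both values present.
gappy = pd.DataFrame({"a": [1.0, 2.0, 3.0, np.nan],
                      "b": [2.0, 4.0, np.nan, 8.0]})

loose = gappy.corr(min_periods=1)   # two shared observations are enough
strict = gappy.corr(min_periods=3)  # too few shared observations: NaN

print(spearman)
print(custom)
print(strict)
```

In this sketch the off-diagonal entry of `strict` is NaN because columns `a` and `b` share only two complete observations, fewer than the three required.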

Code

The following code shows how correlation can be computed in Python. You can change the parameters and see how the output varies.

It shows the correlation between dogs and cats using the default settings.

# import the library
import pandas as pd

# build a DataFrame from a list of (dogs, cats) tuples
df = pd.DataFrame([(.2, .3), (.01, .6), (.6, .01), (.2, .1)],
                  columns=['dogs', 'cats'])

# compute the pairwise correlation matrix (Pearson by default)
corr = df.corr()
print(corr)
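
One way to sanity-check the result is to compare the matrix entry with the same coefficient computed by other routes. The sketch below uses `Series.corr` and NumPy's `corrcoef` for that. The variable names are chosen for this example only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([(.2, .3), (.01, .6), (.6, .01), (.2, .1)],
                  columns=['dogs', 'cats'])

# Entry of the correlation matrix for the (dogs, cats) pair.
r_matrix = df.corr().loc['dogs', 'cats']

# The same coefficient computed directly between two columns.
r_series = df['dogs'].corr(df['cats'])

# Cross-check with NumPy's Pearson correlation.
r_numpy = np.corrcoef(df['dogs'], df['cats'])[0, 1]

print(r_matrix, r_series, r_numpy)
```

All three values should agree up to floating-point error. For this data the coefficient is negative (roughly -0.84), meaning higher dogs values tend to go with lower cats values.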
