pandas is a popular Python-based data analysis toolkit that can be imported using:
import pandas as pd
It offers a wide range of utilities, from parsing multiple file formats to converting an entire data table into a NumPy array. This breadth makes pandas a widely used tool in data science and machine learning.
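As a minimal sketch of the table-to-array conversion mentioned above, the snippet below builds a small DataFrame and converts it with to_numpy(). The column names and values are sample data chosen for illustration only.

```python
import numpy as np
import pandas as pd

# Sample data for illustration only.
df = pd.DataFrame({
    "dogs": [0.2, 0.01, 0.6, 0.2],
    "cats": [0.3, 0.6, 0.01, 0.1],
})

# to_numpy() returns the table's values as a NumPy ndarray
# with one row per DataFrame row and one column per DataFrame column.
arr = df.to_numpy()
print(type(arr), arr.shape)
```

Because both columns hold floats, the resulting array has a single floating-point dtype; mixed column types would generally produce a more general dtype.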
pandas supports many kinds of data analysis. One such tool is correlation, which measures how strongly pairs of columns vary together.
The signature of the correlation method, with its default values, is:
DataFrame.corr(method='pearson', min_periods=1)
method: {'pearson', 'kendall', 'spearman'} or callable - Method of correlation:
-pearson: Standard correlation coefficient
-kendall: Kendall Tau correlation coefficient
-spearman: Spearman rank correlation
-callable: Any callable that takes two 1d ndarrays as input and returns a float.
min_periods: int - The minimum number of observations required per pair of columns to have a valid result. This is currently only available for Pearson and Spearman correlation.
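To illustrate the method parameter described above, here is a small sketch that computes the same correlation with 'pearson', 'spearman', and a callable. The function abs_pearson is a hypothetical custom measure invented for this example; it is not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [(0.2, 0.3), (0.01, 0.6), (0.6, 0.01), (0.2, 0.1)],
    columns=["dogs", "cats"],
)

# Built-in methods are selected by name.
pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")


def abs_pearson(a, b):
    """Hypothetical custom measure: absolute value of Pearson's r.

    Receives two 1d ndarrays and returns a single float.
    """
    return abs(np.corrcoef(a, b)[0, 1])


# A callable is passed in place of a method name.
custom = df.corr(method=abs_pearson)

print(pearson)
print(spearman)
print(custom)
```

method='kendall' is passed the same way; it is left out of this sketch because it may require SciPy to be installed.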
The following code shows how correlation can be computed in Python. You can change the parameters and see how the output varies.
It computes the correlation between the dogs and cats columns using the default settings.
# import library
import pandas as pd

# build a DataFrame from a list of (dogs, cats) tuples
df = pd.DataFrame([(.2, .3), (.01, .6), (.6, .01), (.2, .1)], columns=['dogs', 'cats'])

# compute the pairwise correlation matrix
corr = df.corr()
print(corr)
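The min_periods parameter only makes a visible difference when some values are missing. The sketch below assumes a variant of the dogs and cats data with NaN entries added for illustration, so that only three rows have a value in both columns.

```python
import numpy as np
import pandas as pd

# Illustrative data: the NaN entries are assumptions added for this example.
df = pd.DataFrame({
    "dogs": [0.2, 0.01, 0.6, 0.2, np.nan],
    "cats": [0.3, 0.6, np.nan, 0.1, 0.4],
})

# Three rows have both values, so a threshold of 3 is satisfied...
loose = df.corr(min_periods=3)

# ...while a threshold of 4 is not, and the dogs/cats entry becomes NaN.
strict = df.corr(min_periods=4)

print(loose)
print(strict)
```

Each column on its own still has four observations, so the diagonal entries remain valid under the stricter threshold.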