The statistics module in Python comes with many statistical functions that help analyze numerical data.
The statistics.correlation() function, available since Python 3.10, returns Pearson's correlation coefficient between two inputs.
The Pearson's correlation formula is:

$$r = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2 \, \sum_{i} (y_i - \bar{y})^2}}$$

Where:
$r$ = The correlation coefficient. It always lies between -1 (perfect negative correlation) and +1 (perfect positive correlation). A value of zero means that there is no linear correlation between the inputs.
$x_i$ = The values of the x dataset.
$\bar{x}$ = The mean of the x dataset.
$y_i$ = The values of the y dataset.
$\bar{y}$ = The mean of the y dataset.
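To make the formula concrete, here is a minimal sketch that computes the coefficient by hand: the sum of the products of each pair's deviations from the means, divided by the square root of the product of the summed squared deviations. The helper name pearson_r and the sample data are assumptions made for illustration; they are not part of the statistics module.

```python
import math

def pearson_r(x, y):
    """Hypothetical helper: Pearson's r computed directly from the formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / len(x)  # mean of the x values
    y_bar = sum(y) / len(y)  # mean of the y values
    # Numerator: sum of products of paired deviations from the means
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the summed squared deviations
    sum_sq_x = sum((xi - x_bar) ** 2 for xi in x)
    sum_sq_y = sum((yi - y_bar) ** 2 for yi in y)
    return numerator / math.sqrt(sum_sq_x * sum_sq_y)

# Illustrative data: y doubles x in the first case and falls as x rises in the second
r_linear = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_inverse = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_linear, r_inverse)
```

The first pair lies on a rising straight line and the second on a falling one, so the two results should sit at the positive and negative ends of the range.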
statistics.correlation(x, y, /)
The statistics.correlation() function takes the x and y parameters, which represent the x and y values for which the correlation coefficient is to be determined.
The statistics.correlation() function returns Pearson's correlation coefficient for the two given inputs.
Let's use the statistics.correlation() function to determine Pearson's correlation coefficient for two inputs, x and y:
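```python
import statistics

x = [11, 2, 7, 4, 15, 6, 10, 8, 9, 1, 11, 5, 13, 6, 15]
y = [2, 5, 17, 6, 10, 8, 13, 4, 6, 9, 11, 2, 5, 4, 7]

# Compute Pearson's correlation coefficient between x and y
pearsons_coefficient = statistics.correlation(x, y)
print("The Pearson's coefficient of the x and y inputs is:", pearsons_coefficient)
```

In the code above, we import the statistics module and create two lists, x and y. We then call the statistics.correlation(x, y) function and assign the result to a variable, pearsons_coefficient, which we print. NumPy's np.corrcoef(x, y) function computes the same coefficient but returns a 2×2 correlation matrix instead of a single number; the coefficient is the off-diagonal entry.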