What is the statistics correlation() method in Python?

Overview

The statistics module in Python comes with many statistical functions that help analyze numerical data.

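The statistics.correlation() method, available in Python 3.10 and later, returns Pearson’s correlation coefficient between two inputs. This coefficient measures the strength and direction of the linear relationship between them.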

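The formula for Pearson’s correlation coefficient is:

$$r = \frac{\sum_{i=1}^{n} (x_{i} - \bar{x})(y_{i} - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_{i} - \bar{x})^{2} \, \sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}}$$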

Where:

$r$ = The correlation coefficient. It always lies between -1 (perfect negative linear correlation) and +1 (perfect positive linear correlation). A value of zero means that there is no linear correlation between the inputs.

$x_{i}$ = The values of the x dataset.

$\bar{x}$ = The mean of the x dataset.

$y_{i}$ = The values of the y dataset.

$\bar{y}$ = The mean of the y dataset.
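
To make the symbols concrete, the formula can be written out term by term in plain Python. This is only a minimal sketch: pearson_r, sample_x, and sample_y are illustrative names that are not part of the statistics module, and the helper skips the input validation that a library function performs.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed directly from the formula (illustrative helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least two values)")
    x_bar = sum(x) / len(x)  # mean of the x dataset
    y_bar = sum(y) / len(y)  # mean of the y dataset
    # Numerator: sum of the products of the deviations from each mean
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the summed squared deviations
    sum_sq_x = sum((xi - x_bar) ** 2 for xi in x)
    sum_sq_y = sum((yi - y_bar) ** 2 for yi in y)
    # Note: a constant input makes the denominator zero (ZeroDivisionError here)
    return numerator / math.sqrt(sum_sq_x * sum_sq_y)

# Made-up sample data, used only to exercise the helper
sample_x = [1, 2, 3, 4, 5]
sample_y = [2, 4, 5, 4, 5]
r_manual = pearson_r(sample_x, sample_y)
print(r_manual)
```

For this sample, the numerator is 6 and the two sums of squares are 10 and 6, so the result is 6 / sqrt(60), or roughly 0.77. On Python 3.10 or later, statistics.correlation(sample_x, sample_y) should agree with it up to floating-point rounding.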

Syntax

statistics.correlation(x, y, /)

Parameters

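The statistics.correlation() method takes two parameters, x and y, which are the two sequences of numeric values for which the correlation coefficient is to be determined. Both inputs must have the same length and contain at least two values, and neither may be constant. Otherwise, a StatisticsError is raised. The / in the syntax means that x and y must be passed as positional arguments.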

Return value

The statistics.correlation() method returns Pearson’s correlation coefficient for the two given inputs as a float between -1 and +1.

Example

Let’s use the statistics.correlation() method to determine Pearson’s correlation coefficient for two inputs, x and y:

import statistics
x = [11, 2, 7, 4, 15, 6, 10, 8, 9, 1, 11, 5, 13, 6, 15]
y = [2, 5, 17, 6, 10, 8, 13, 4, 6, 9, 11, 2, 5, 4, 7]
# compute Pearson's correlation coefficient for x and y
pearsons_coefficient = statistics.correlation(x, y)
print("The Pearson's correlation coefficient of the x and y inputs is:", pearsons_coefficient)

Explanation

  • Line 1: We import the statistics module.
  • Lines 2 and 3: We make two datasets, x and y, of equal length.
  • Line 5: We calculate the coefficient using the statistics.correlation(x, y) function and assign the result to a variable, pearsons_coefficient.
  • Line 6: We print the result.
