How to create a confusion matrix in Python using scikit-learn

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It can be used to evaluate the performance of a classification model through the calculation of performance metrics like accuracy, precision, recall, and F1-score.
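To make the link between the table and these metrics concrete, here is a minimal sketch for the binary case. The helper `binary_metrics` and the counts passed to it are hypothetical, chosen only for illustration; they are not part of scikit-learn or of the example that follows.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # share of all predictions that are correct
    precision = tp / (tp + fp)                   # share of positive predictions that are correct
    recall = tp / (tp + fn)                      # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    return accuracy, precision, recall, f1


# Hypothetical counts: true positives, false positives, false negatives, true negatives
accuracy, precision, recall, f1 = binary_metrics(tp=8, fp=2, fn=4, tn=6)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: accuracy=0.70 precision=0.80 recall=0.67 f1=0.73
```

The sketch assumes that no denominator is zero; a classifier that never predicts the positive class would need a separate case for precision.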

Suppose that a classifier has produced a predicted label for each of several instances whose actual labels are known.

Code

The following code snippet shows how to create a confusion matrix and calculate some important metrics using a Python library called scikit-learn (also known as sklearn):

# Importing the dependencies
from sklearn import metrics
# Predicted values
y_pred = ["a", "b", "c", "a", "b"]
# Actual values
y_act = ["a", "b", "c", "c", "a"]
# Printing the confusion matrix
# The columns will show the instances predicted for each label,
# and the rows will show the actual number of instances for each label.
print(metrics.confusion_matrix(y_act, y_pred, labels=["a", "b", "c"]))
# Printing the precision and recall, among other metrics
print(metrics.classification_report(y_act, y_pred, labels=["a", "b", "c"]))

Explanation

y_pred is a list that holds the predicted labels. y_act contains the actual labels.

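metrics.confusion_matrix() takes the list of actual labels, the list of predicted labels, and an optional labels argument that specifies the order of the labels. It returns the confusion matrix as an array in which each row corresponds to an actual label and each column to a predicted label, in the order given by labels.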
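As a small sketch of how the label order shapes the result, the snippet below builds the matrix twice for the same data as in the code above, once with the labels in the order a, b, c and once reversed. The variable names `cm` and `cm_rev` are illustrative only.

```python
from sklearn.metrics import confusion_matrix

y_act = ["a", "b", "c", "c", "a"]   # actual labels
y_pred = ["a", "b", "c", "a", "b"]  # predicted labels

# Rows follow the actual labels and columns the predicted labels,
# both in the order a, b, c.
cm = confusion_matrix(y_act, y_pred, labels=["a", "b", "c"])
print(cm)
# [[1 1 0]
#  [0 1 0]
#  [1 0 1]]

# Reversing the label order reverses both the rows and the columns.
cm_rev = confusion_matrix(y_act, y_pred, labels=["c", "b", "a"])
print(cm_rev)
# [[1 0 1]
#  [0 1 0]
#  [0 1 1]]
```

The diagonal holds the correctly classified instances in both cases; only the position of each label changes.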

metrics.classification_report() takes the list of actual labels, the list of predicted labels, and an optional labels argument that specifies the order of the labels. It returns a text report with the precision, recall, and F1-score for each label, along with its support (the number of actual instances of that label).
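The report can also be requested as a dictionary by passing output_dict=True, which makes individual values easy to read programmatically. The sketch below is one possible cross-check, not the library's internal method: it derives per-label precision, recall, and support directly from the confusion matrix and prints them next to the reported values. It assumes every label is predicted at least once, so no column sum is zero.

```python
import numpy as np
from sklearn import metrics

y_act = ["a", "b", "c", "c", "a"]   # actual labels
y_pred = ["a", "b", "c", "a", "b"]  # predicted labels
labels = ["a", "b", "c"]

cm = metrics.confusion_matrix(y_act, y_pred, labels=labels)

correct = np.diag(cm)                 # correctly classified instances per label
precision = correct / cm.sum(axis=0)  # column sums: instances predicted as each label
recall = correct / cm.sum(axis=1)     # row sums: actual instances of each label
support = cm.sum(axis=1)              # support is the row sum

# Same report as the printed one, but as a nested dictionary
report = metrics.classification_report(
    y_act, y_pred, labels=labels, output_dict=True
)

for i, label in enumerate(labels):
    row = report[label]
    print(
        f"{label}: precision {precision[i]:.2f} vs {row['precision']:.2f}, "
        f"recall {recall[i]:.2f} vs {row['recall']:.2f}, "
        f"support {support[i]} vs {row['support']}"
    )

# Overall accuracy: correct predictions divided by all predictions
print("accuracy:", metrics.accuracy_score(y_act, y_pred))
```

For this data, three of the five predictions are correct, so the accuracy is 0.6.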

Copyright ©2025 Educative, Inc. All rights reserved