A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It is used to evaluate the performance of a classification model by computing performance metrics like accuracy, precision, recall, and F1-score.
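To see how these metrics fall out of the matrix, consider a binary case where the matrix reduces to four counts: true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The sketch below uses made-up counts purely for illustration:

# A minimal sketch of how the four metrics derive from the counts
# in a binary confusion matrix. The counts are invented for illustration.
tp, fp, fn, tn = 40, 10, 5, 45

accuracy = (tp + tn) / (tp + fp + fn + tn)          # fraction of all predictions that are correct
precision = tp / (tp + fp)                          # fraction of positive predictions that are correct
recall = tp / (tp + fn)                             # fraction of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(accuracy, precision, recall, f1)  # approximately 0.85, 0.8, 0.889, 0.842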
Suppose that a classifier predicts the labels ["a", "b", "c", "a", "b"] for five samples whose actual labels are ["a", "b", "c", "c", "a"]. The following code snippet shows how to create a confusion matrix and calculate some important metrics for these results using the Python library scikit-learn (also known as sklearn):
# Importing the dependencies
from sklearn import metrics

# Predicted values
y_pred = ["a", "b", "c", "a", "b"]

# Actual values
y_act = ["a", "b", "c", "c", "a"]

# Printing the confusion matrix
# The columns will show the instances predicted for each label,
# and the rows will show the actual number of instances for each label.
print(metrics.confusion_matrix(y_act, y_pred, labels=["a", "b", "c"]))

# Printing the precision and recall, among other metrics
print(metrics.classification_report(y_act, y_pred, labels=["a", "b", "c"]))
In this code, y_pred is a list that holds the predicted labels, while y_act contains the actual labels.
metrics.confusion_matrix() takes in the list of actual labels, the list of predicted labels, and an optional labels argument that fixes the order of the labels. It computes the confusion matrix for the given inputs.
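For the lists defined above, the printed matrix is:

[[1 1 0]
 [0 1 0]
 [1 0 1]]

Reading the first row: of the two actual "a" instances, one was predicted as "a" and one as "b".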
metrics.classification_report() takes in the same arguments and produces a formatted report of performance metrics like precision, recall, and F1-score, along with the support (the number of actual occurrences of each label).
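If the individual numbers are needed rather than a formatted report, scikit-learn also exposes per-metric functions; a minimal sketch using the same lists:

from sklearn import metrics

y_pred = ["a", "b", "c", "a", "b"]
y_act = ["a", "b", "c", "c", "a"]

# average=None returns one score per label instead of a single aggregate
print(metrics.precision_score(y_act, y_pred, labels=["a", "b", "c"], average=None))
print(metrics.recall_score(y_act, y_pred, labels=["a", "b", "c"], average=None))

# Overall fraction of correct predictions
print(metrics.accuracy_score(y_act, y_pred))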