What is a precision and recall curve?

Precision and recall

Precision and recall are useful measures of prediction success when the classes are highly imbalanced. In information retrieval, precision measures how relevant the returned results are, whereas recall measures how many of the truly relevant results are returned.

The precision and recall curve

The precision and recall curve depicts the tradeoff between precision and recall at various thresholds.
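As a rough illustration of how the points on such a curve can be obtained, the sketch below uses each distinct score as a threshold and computes one precision and recall pair per threshold. The function name `pr_curve_points` and the toy labels and scores are assumptions made for this example, and the sketch assumes that at least one label is positive.

```python
def pr_curve_points(y_true, scores):
    """Return (threshold, precision, recall) for each distinct score used as a threshold."""
    total_pos = sum(y_true)  # number of actual positives (assumed > 0)
    points = []
    for t in sorted(set(scores), reverse=True):
        # Predict positive when the score is at or above the threshold
        preds = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for p, y in zip(preds, y_true) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(preds, y_true) if p == 1 and y == 0)
        precision = tp / (tp + fp)  # tp + fp >= 1 because t is one of the scores
        recall = tp / total_pos
        points.append((t, precision, recall))
    return points


# Toy data invented for illustration: 1 = positive class, 0 = negative class
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

for t, p, r in pr_curve_points(y_true, scores):
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

Plotting the recall values against the precision values from this list would give the curve for the toy data.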

A high area under the curve indicates both high recall and high precision, with high precision corresponding to a low false positive rate and high recall corresponding to a low false negative rate. If the scores are high for both, the classifier is returning accurate results (high precision) and a majority of all positive results (high recall).
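One common way to summarize the area under the curve is a step-wise sum, often called average precision: each increase in recall is weighted by the precision at that point. The sketch below assumes the curve points are already ordered by increasing recall; the function name and the sample points are hypothetical.

```python
def average_precision(precisions, recalls):
    """Step-wise area: sum of (R_n - R_{n-1}) * P_n, with recall starting at 0."""
    area = 0.0
    prev_recall = 0.0
    for p, r in zip(precisions, recalls):
        area += (r - prev_recall) * p
        prev_recall = r
    return area


# Hypothetical curve points, ordered by increasing recall
recalls = [0.25, 0.5, 0.75, 1.0]
precisions = [1.0, 1.0, 0.75, 0.5]

print(average_precision(precisions, recalls))  # 0.8125
```

A classifier that keeps precision at 1.0 for every recall level would score 1.0 under this summary.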

Graph

The precision and recall curve plots precision on the y-axis and recall on the x-axis. Recall is the same as sensitivity, and precision is equivalent to the positive predictive value.

The following graph shows the relationship between precision and recall. Notice that as recall increases, precision generally decreases. The curve has a stair-step shape: at the edges of these steps, the slightest change in the threshold significantly affects precision, with only a minor improvement in recall.

Two scenarios on the precision and recall curve are worth describing:

High recall and low precision

A system with high recall but low precision returns many results, but most of its predicted labels are incorrect when compared to the true labels.

High precision and low recall

A system with high precision but low recall does the converse, returning very few results, but most of its predicted labels are correct when compared to the true labels.

Precision and recall are defined in more detail below:

Precision

Precision $(P)$ is the ratio of true positives $(T_p)$ to the true positives plus false positives $(F_p)$: $P = \frac{T_p}{T_p + F_p}$.

Precision doesn't necessarily decrease as recall increases. By the definition of precision, lowering a classifier's threshold may increase the denominator, $T_p + F_p$, by increasing the number of results returned.

Note: If the previous threshold was set too high, the new results may all be true positives, which increases precision. If the previous threshold was about right or too low, lowering it further will introduce false positives, which reduces precision.
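The effect of lowering the threshold on precision can be seen in a small sketch. The helper `precision_at`, the labels, and the scores are all invented for illustration; the threshold values are chosen so that every threshold returns at least one result.

```python
def precision_at(y_true, scores, threshold):
    """Precision = Tp / (Tp + Fp) when predicting positive for score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / (tp + fp)


# Toy data: the second-highest score belongs to a negative example
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.4, 0.3]

print(precision_at(y_true, scores, 0.9))  # 0.5  (1 true positive, 1 false positive)
print(precision_at(y_true, scores, 0.7))  # 0.75 (the two new results are both true positives)
print(precision_at(y_true, scores, 0.3))  # 0.5  (the two new results are both false positives)
```

In this toy case, moving the threshold from 0.9 to 0.7 raises precision, while moving it further to 0.3 lowers it again.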

Recall

Recall $(R)$ is the ratio of true positives $(T_p)$ to the true positives plus false negatives $(F_n)$: $R = \frac{T_p}{T_p + F_n}$.

In recall's definition, the classifier threshold does not affect the denominator, $T_p + F_n$, which is simply the number of actual positives. Lowering the classifier threshold may increase recall by increasing the number of true positives. Lowering the threshold may also leave recall unchanged while precision changes.
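A short sketch can make the fixed denominator concrete. The helper `tp_fn_at` and the toy data are assumptions for this example; the loop prints the sum of true positives and false negatives together with the recall for a few decreasing thresholds.

```python
def tp_fn_at(y_true, scores, threshold):
    """Count true positives and false negatives when predicting positive for score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    return tp, fn


# Toy data invented for illustration
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.4, 0.3]

for t in (0.9, 0.8, 0.7, 0.3):
    tp, fn = tp_fn_at(y_true, scores, t)
    # tp + fn stays equal to the number of actual positives (3 here)
    print(f"threshold={t}  Tp+Fn={tp + fn}  recall={tp / (tp + fn):.2f}")
```

With this data, recall rises as the threshold drops from 0.9 to 0.7 and then stays at its maximum between 0.7 and 0.3, even though more results are returned.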

Note: The harmonic mean of precision and recall is known as the $F_1$ score: $F_1 = 2 \cdot \frac{P \cdot R}{P + R}$.
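The harmonic mean can be computed directly from a precision and recall pair, as in the minimal sketch below. The function name `f1_score` is chosen for this example, and returning 0.0 when both inputs are zero is a convention assumed here rather than part of the definition.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention to avoid dividing by zero
    return 2 * precision * recall / (precision + recall)


print(f1_score(0.5, 0.5))   # 0.5
print(f1_score(0.75, 1.0))  # between 0.75 and 1.0, closer to the smaller value
```

Because the harmonic mean is pulled toward the smaller of the two values, a classifier cannot reach a high score by maximizing only one of precision or recall.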

Application

Precision and recall curves are commonly used in binary classification to study a classifier's output. To extend the precision-recall curve to multi-class or multi-label classification, the output must first be binarized.
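One way to binarize multi-class labels is a one-vs-rest encoding, where each class gets its own 0/1 label list. The sketch below is a minimal version of that idea; the function name `binarize_one_vs_rest` and the example labels are hypothetical.

```python
def binarize_one_vs_rest(labels, classes):
    """Return a dict mapping each class to a 0/1 list: 1 where the label equals that class."""
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


# Toy multi-class labels invented for illustration
labels = ["cat", "dog", "bird", "cat", "dog"]
binary = binarize_one_vs_rest(labels, ["bird", "cat", "dog"])

print(binary["cat"])   # [1, 0, 0, 1, 0]
print(binary["dog"])   # [0, 1, 0, 0, 1]
print(binary["bird"])  # [0, 0, 1, 0, 0]
```

Each 0/1 list can then be paired with the classifier's scores for that class to draw one precision and recall curve per class.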
