Object detection is a method of recognizing and locating instances of an object from an image or video. Mean average precision (mAP) is a metric used to measure the accuracy of object detection models.
It is a value between 0 and 1, with higher values indicating a more accurate model.
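The following formula describes it:
$$\text{mAP} = \frac{1}{N} \sum_{i=1}^{N} AP_i$$
In the formula above, $N$ is the number of classes and $AP_i$ is the average precision of class $i$.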
To understand the calculation of mAP for object detection, we must first explore intersection over union, precision-recall curve, and average precision.
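Intersection over union (IOU) is a metric used to measure the accuracy of a predicted bounding box.
IOU is the extent to which the predicted bounding box overlaps with the actual (ground-truth) bounding box. It is a value between 0 and 1, where 0 means no overlap and 1 means the two boxes coincide exactly.
An IOU is calculated by:
$$\text{IOU} = \frac{\text{area of overlap}}{\text{area of union}}$$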
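As a minimal sketch of this calculation, the function below computes IOU for two axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` box layout and the function name `iou` are assumptions made for illustration; other tools may use a different box encoding.

```python
def iou(box_a, box_b):
    """Return the IOU of two boxes given as (x_min, y_min, x_max, y_max).

    The corner-based box layout is an assumption made for this sketch.
    """
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes sharing a 1x1 corner: overlap 1, union 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

For the two boxes in the usage line, the overlap area is 1 and the union area is 4 + 4 - 1 = 7, so the result is 1/7.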
IOU scores are converted into positive or negative predictions using a threshold. A typical threshold is 0.5.
The following table highlights how samples are categorized.
| True positive (TP) | False positive (FP) | False negative (FN) |
| --- | --- | --- |
| The predicted class is correct, and the IOU is greater than or equal to the threshold. | The IOU score is less than the threshold (less than 0.5), or the box is a duplicate detection of an object that has already been matched. | A ground-truth object has no matching prediction: either no box was predicted for it, or the IOU score was greater than or equal to 0.5 but the prediction was made for the wrong class. |
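The training data is assumed to have some objects present in each image. A true negative (TN) would be any background region that the model correctly leaves unboxed, which cannot be counted meaningfully. As a result, true negatives are not considered.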
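The categorization above can be sketched in code for a single image. This is one common greedy matching scheme, not necessarily the one any particular benchmark uses: detections are visited in order of decreasing confidence, and each ground-truth box can be matched at most once. The tuple layouts and the names `classify_detections` and `iou` are assumptions for illustration.

```python
def iou(a, b):
    """IOU of two (x_min, y_min, x_max, y_max) boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def classify_detections(detections, ground_truths, threshold=0.5):
    """Count (TP, FP, FN) for one image.

    detections:    list of (box, label, confidence)  -- hypothetical layout
    ground_truths: list of (box, label)              -- hypothetical layout
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = fp = 0
    # Most confident detections get first pick of the ground-truth boxes.
    for box, label, _conf in sorted(detections, key=lambda d: d[2], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, (gt_box, gt_label) in enumerate(ground_truths):
            if idx in matched or gt_label != label:
                continue  # already matched, or a different class
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= threshold:
            matched.add(best_idx)
            tp += 1
        else:
            fp += 1  # low IOU, wrong class, or duplicate detection
    fn = len(ground_truths) - len(matched)  # objects left undetected
    return tp, fp, fn


gts = [((0, 0, 10, 10), "dog")]
dets = [
    ((0, 0, 10, 9), "dog", 0.9),     # good overlap -> TP
    ((0, 0, 10, 10), "dog", 0.8),    # duplicate of a matched object -> FP
    ((50, 50, 60, 60), "dog", 0.4),  # no overlap -> FP
]
print(classify_detections(dets, gts))  # (1, 2, 0)
```

Note that under this scheme a well-placed box with the wrong class counts as a false positive for the predicted class and also leaves the ground-truth object unmatched, which makes it a false negative.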
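A precision-recall curve is a graph that shows the tradeoff between precision and recall. Precision is TP / (TP + FP), the fraction of predictions that are correct. Recall is TP / (TP + FN), the fraction of actual objects that are detected.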
A precision-recall curve can be plotted by following these steps:
To understand the calculations in each step, let's consider an object detection model that detects dogs. Let's suppose that the dataset contains only four images of dogs.
For each image, the model returns a bounding box, a predicted class, and a confidence score.
The resultant values generate the following curve:
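As a rough sketch of how such precision and recall values can be computed, the snippet below ranks detections by confidence and accumulates TP and FP counts down the ranking. The four detections are made-up numbers for illustration and are not the values from the example in this article; the function name `precision_recall_points` is likewise an assumption.

```python
def precision_recall_points(detections, num_ground_truths):
    """Return (precisions, recalls), one point per detection.

    detections: list of (confidence, is_true_positive) pairs, where the
    TP/FP flag has already been decided using the IOU threshold.
    """
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    tp = fp = 0
    for _conf, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        # Precision and recall over the detections seen so far.
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truths)
    return precisions, recalls


# Hypothetical detections for four ground-truth dogs (illustrative only).
detections = [(0.9, True), (0.8, False), (0.7, True), (0.6, True)]
precisions, recalls = precision_recall_points(detections, num_ground_truths=4)
print(precisions)  # approximately [1.0, 0.5, 0.667, 0.75]
print(recalls)     # [0.25, 0.25, 0.5, 0.75]
```

Precision drops when a false positive enters the ranking and recovers when later true positives arrive, which is what gives the curve its zig-zag shape.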
Average precision is the area under the generated precision-recall curve. However, it is common practice to first smooth out the zig-zag pattern.
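Using the interpolated precision $p_{interp}(r) = \max_{r' \geq r} p(r')$, the precision at each recall level $r$ is replaced with the maximum precision found at any recall level greater than or equal to $r$.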
This results in the following graph:
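The smoothing step can be sketched as a single right-to-left pass over the precision values, assuming they are ordered by increasing recall. The function name `interpolate_precision` and the input numbers are hypothetical.

```python
def interpolate_precision(precisions):
    """Replace each precision with the maximum precision to its right.

    Assumes `precisions` is ordered by increasing recall.
    """
    smoothed = list(precisions)  # leave the caller's list untouched
    # Walk from the second-to-last point back to the first, carrying the
    # running maximum leftwards.
    for i in range(len(smoothed) - 2, -1, -1):
        smoothed[i] = max(smoothed[i], smoothed[i + 1])
    return smoothed


# Illustrative zig-zag precision values (not taken from the article).
print(interpolate_precision([1.0, 0.5, 0.6, 0.75]))  # [1.0, 0.75, 0.75, 0.75]
```

The result never increases as recall increases, so the dips in the original curve are filled in.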
There are many different variations for calculating AP from the precision-recall curve. The Pascal VOC challenge uses an 11-point interpolation.
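$$AP = \frac{1}{11} \sum_{r \in \{0, 0.1, \ldots, 1\}} p_{interp}(r)$$
In the equation above, $p_{interp}(r)$ is the interpolated precision at recall level $r$, that is, the highest precision observed at any recall greater than or equal to $r$.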
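A minimal sketch of 11-point interpolation follows. It samples the recall levels 0, 0.1, ..., 1 and, at each level, takes the highest precision among points whose recall is at least that level (or 0 if there are none). The function name `eleven_point_ap` and the input values are assumptions for illustration, not the numbers from this article's example.

```python
def eleven_point_ap(precisions, recalls):
    """Average the interpolated precision at recall 0, 0.1, ..., 1."""
    total = 0.0
    for i in range(11):
        level = i / 10
        # Precisions of all points at or beyond this recall level.
        candidates = [p for p, r in zip(precisions, recalls) if r >= level]
        # If the curve never reaches this recall, it contributes 0.
        total += max(candidates) if candidates else 0.0
    return total / 11


# Hypothetical precision-recall points (illustrative only).
precisions = [1.0, 0.5, 2 / 3, 0.75]
recalls = [0.25, 0.25, 0.5, 0.75]
print(round(eleven_point_ap(precisions, recalls), 4))  # 0.6136
```

With these inputs, the sampled precisions are 1.0 at recall 0 to 0.2, 0.75 at recall 0.3 to 0.7, and 0 at recall 0.8 to 1, giving (3 × 1.0 + 5 × 0.75 + 3 × 0) / 11 = 6.75 / 11.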
For the ongoing example, AP is calculated below:
The calculation for AP is repeated for each class of the object detector. The mAP for the model is the mean of the per-class APs.
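The final averaging step is small enough to sketch directly. The class names and AP values below are invented for illustration, and `mean_average_precision` is a hypothetical helper name.

```python
def mean_average_precision(ap_per_class):
    """Return the mean of per-class AP values given as {class_name: AP}."""
    if not ap_per_class:
        raise ValueError("at least one class is required")
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical per-class APs (illustrative only).
print(mean_average_precision({"dog": 0.5, "cat": 0.75, "bird": 1.0}))  # 0.75
```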