Background subtraction is a fundamental technique in computer vision and image processing used to extract foreground objects from a video stream by removing the stationary or static background. This technique finds applications in various fields, such as surveillance, object tracking, and motion detection.
In this Answer, we will explore how to perform background subtraction using the OpenCV library in Python.
Background subtraction is a critical step in many computer vision tasks, allowing us to focus on moving objects within a scene. By identifying the differences between consecutive frames of a video stream and subtracting the background, we can highlight the foreground objects.
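Before reaching for a library algorithm, it helps to see this idea in its simplest form. The sketch below is a minimal illustration (it assumes a readable file named video.mp4 and that the first frame loads successfully): it takes the absolute difference between consecutive grayscale frames and thresholds it to produce a crude motion mask.

import cv2

# Naive background subtraction via frame differencing (illustration only)
vidcap = cv2.VideoCapture('video.mp4')  # assumed sample file
suc, prev_frame = vidcap.read()
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)

while True:
    suc, frame = vidcap.read()
    if not suc:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Pixels that changed between consecutive frames are likely foreground
    diff = cv2.absdiff(gray, prev_gray)
    _, motion_mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    cv2.imshow("Naive frame difference", motion_mask)
    prev_gray = gray
    if cv2.waitKey(30) == ord("q"):
        break

vidcap.release()
cv2.destroyAllWindows()

This naive approach only flags pixels that change between adjacent frames; the algorithms below instead maintain a statistical model of the background, which handles gradual lighting changes and briefly stationary objects far better.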
OpenCV provides several background subtraction algorithms, including:
MOG2: Gaussian mixture-based background/foreground segmentation algorithm.
KNN: K-nearest neighbors background/foreground segmentation algorithm.
GMG: A background/foreground segmentation algorithm (named after its authors Godbehere, Matsukawa, and Goldberg) that combines statistical background image estimation with per-pixel Bayesian segmentation.
For this Answer, we will use the MOG2 algorithm due to its effectiveness and adaptability.
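For reference, all three subtractors are created through factory functions and expose the same apply() interface; note that GMG lives in the cv2.bgsegm module, which ships with the opencv-contrib-python package rather than the base install:

import cv2

# Each factory returns a subtractor whose apply(frame) yields a foreground mask
mog2 = cv2.createBackgroundSubtractorMOG2()
knn = cv2.createBackgroundSubtractorKNN()

# GMG requires the contrib modules (pip install opencv-contrib-python)
gmg = cv2.bgsegm.createBackgroundSubtractorGMG()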
Before we proceed, make sure to have OpenCV installed. We can install it using pip:
pip install opencv-python
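To verify the installation, you can print the library version from a Python shell:

import cv2
print(cv2.__version__)  # prints the installed version, e.g. 4.x.x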
Now let's see the steps involved in background subtraction using OpenCV:
import cv2
import numpy as np

# Path to the input video file
inputvid = 'video.mp4'

# Create a VideoCapture object to read from the video file
vidcap = cv2.VideoCapture(inputvid)

# Create a background subtractor object
bgsub = cv2.createBackgroundSubtractorMOG2()

while True:
    # Read the video frame
    suc, vidframe = vidcap.read()

    # If there are no more frames to show, break the loop
    if not suc:
        break

    # Apply the background subtractor to the frame
    foremask = bgsub.apply(vidframe)

    # Convert the mask to 3 channels
    foremask = cv2.cvtColor(foremask, cv2.COLOR_GRAY2BGR)

    # Resize the frame and mask to a smaller size
    scale_percent = 50
    winwidth = int(vidframe.shape[1] * scale_percent / 100)
    winheight = int(vidframe.shape[0] * scale_percent / 100)
    small_vidframe = cv2.resize(vidframe, (winwidth, winheight))
    small_vidmask = cv2.resize(foremask, (winwidth, winheight))

    # Stack the resized frame and mask horizontally
    hstacked_frames = np.hstack((small_vidframe, small_vidmask))
    cv2.imshow("Frame and Mask (Scaled)", hstacked_frames)

    # If the 'q' key is pressed, stop the loop
    if cv2.waitKey(30) == ord("q"):
        break

# Release the video capture object
vidcap.release()

# Close all OpenCV windows
cv2.destroyAllWindows()
Here's the complete code implementing the above steps:
import cv2
import numpy as np

# Path to the input video file
inputvid = 'video.mp4'

# Create a VideoCapture object to read from the video file
vidcap = cv2.VideoCapture(inputvid)

# Create a background subtractor object
bgsub = cv2.createBackgroundSubtractorMOG2()

while True:
    # Read the video frame
    suc, vidframe = vidcap.read()

    # If there are no more frames to show, break the loop
    if not suc:
        break

    # Apply the background subtractor to the frame
    foremask = bgsub.apply(vidframe)

    # Convert the mask to 3 channels
    foremask = cv2.cvtColor(foremask, cv2.COLOR_GRAY2BGR)

    # Resize the frame and mask to a smaller size
    scale_percent = 50  # Adjust this value to control the scaling
    winwidth = int(vidframe.shape[1] * scale_percent / 100)
    winheight = int(vidframe.shape[0] * scale_percent / 100)
    small_vidframe = cv2.resize(vidframe, (winwidth, winheight))
    small_vidmask = cv2.resize(foremask, (winwidth, winheight))

    # Stack the resized frame and mask horizontally
    hstacked_frames = np.hstack((small_vidframe, small_vidmask))
    cv2.imshow("Original and Masked Video", hstacked_frames)

    # If the 'q' key is pressed, stop the loop
    if cv2.waitKey(30) == ord("q"):
        break

# Release the video capture object
vidcap.release()
cv2.destroyAllWindows()
Here’s the explanation for the above code:
Lines 1–2: We import the necessary libraries for our video processing task: the cv2 library for computer vision functions and the numpy library for numerical operations.
Line 5: Here, we specify the path to the input video file that we want to process. This is where the video frames will be read from. Make sure to replace 'video.mp4' with the actual path to your video file.
Line 8: We create a VideoCapture object named vidcap to read frames from the input video file. This object allows us to access the video frames sequentially.
Line 11: We create a background subtractor object named bgsub using the MOG2 (Mixture of Gaussians) algorithm. This algorithm helps us distinguish moving objects from the static background.
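The factory function also accepts tuning parameters. The sketch below simply spells out OpenCV's documented defaults so you can see what there is to adjust:

import cv2

# MOG2 with its documented default parameters made explicit
bgsub = cv2.createBackgroundSubtractorMOG2(
    history=500,        # number of recent frames used to model the background
    varThreshold=16,    # pixel-vs-model distance needed to call a pixel foreground
    detectShadows=True  # mark shadows gray (127) instead of white (255)
)

Because detected shadows appear as gray (value 127) in the mask, a binary threshold such as cv2.threshold(foremask, 254, 255, cv2.THRESH_BINARY) keeps only the solid-white foreground pixels if shadows are unwanted.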
Line 13: This section forms the core of our processing. We enter a while loop that continues until there are no more frames to process from the input video.
Line 15: We read the next video frame from the vidcap object using the read() method. The suc variable indicates whether the read was successful, and vidframe holds the actual frame data.
Lines 22–25: We apply the background subtractor to the current video frame using the bgsub.apply() method. This produces a foreground mask, foremask, which highlights the moving objects in the frame. We then convert this grayscale mask to a 3-channel image using cv2.cvtColor() to prepare it for stacking.
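The apply() method also takes an optional learningRate argument, and the raw mask is often speckled with noise. A common refinement, sketched below as a drop-in for the bgsub.apply() line inside the loop (before the cvtColor conversion; the 3x3 kernel size is an assumption to tune), is a morphological opening:

import numpy as np

# learningRate=-1 lets OpenCV choose automatically; 0 freezes the background
# model, and 1 relearns it from the current frame alone
foremask = bgsub.apply(vidframe, learningRate=-1)

# Remove small noise specks with a morphological opening
kernel = np.ones((3, 3), np.uint8)
foremask = cv2.morphologyEx(foremask, cv2.MORPH_OPEN, kernel)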
Lines 28–32: We resize both the original video frame and the foreground mask using the cv2.resize() function. The scale_percent variable controls the scaling factor, which determines the final dimensions of the frames. We calculate the width and height from the original frame size and the specified scaling percentage.
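As an aside, cv2.resize can also take fractional scale factors instead of an explicit pixel size, which expresses the same 50% reduction more directly; cv2.INTER_AREA is generally preferred when shrinking:

# Same 50% downscale expressed with scale factors instead of pixel sizes
small_vidframe = cv2.resize(vidframe, None, fx=0.5, fy=0.5,
                            interpolation=cv2.INTER_AREA)
small_vidmask = cv2.resize(foremask, None, fx=0.5, fy=0.5,
                           interpolation=cv2.INTER_AREA)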
Lines 34–35: We stack the resized video frame and mask horizontally using the np.hstack() function, creating a side-by-side comparison of the original frame and the foreground mask.
Line 36: We display the horizontally stacked frame and mask using the cv2.imshow() function. The window title is set to "Original and Masked Video".
Lines 39–40: We check whether the ‘q’ key is pressed using cv2.waitKey(30). If it is, the loop exits, and the program proceeds to release resources and close windows.
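On some platforms waitKey() returns an integer with extra high bits set, so a common defensive variant of this check inside the loop masks the result down to one byte before comparing:

# Mask to the low byte so the comparison behaves the same across platforms
key = cv2.waitKey(30) & 0xFF
if key == ord("q"):
    break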
Lines 43–44: Once the loop finishes (either due to the end of the video or the ‘q’ key press), we release the vidcap object using the release() method to free up system resources, and we close any remaining OpenCV windows using the cv2.destroyAllWindows() function.
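Note that if an exception occurs mid-loop, these cleanup calls are skipped. A slightly more defensive structure, sketched here rather than required by the original code, wraps the loop in try/finally so the capture is always released:

vidcap = cv2.VideoCapture(inputvid)
try:
    while True:
        suc, vidframe = vidcap.read()
        if not suc:
            break
        ...  # per-frame processing as shown above
finally:
    # Runs even if the loop raises, so the file handle and windows are freed
    vidcap.release()
    cv2.destroyAllWindows()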
Background subtraction is a powerful technique for isolating moving objects in a video stream. By leveraging OpenCV's MOG2 algorithm, we can effectively perform background subtraction and extract foreground objects. This Answer provided a step-by-step guide along with a complete code implementation for reading a video file and applying background subtraction. Experiment with different algorithms and parameters to optimize the foreground extraction for your specific use case; one such extension is sketched below.
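For instance, the foreground mask can be turned into object detections by finding contours and drawing bounding boxes. The sketch below assumes a single-channel binary mask (i.e., taken before the 3-channel cvtColor conversion), the OpenCV 4.x findContours signature, and an arbitrary 500-pixel area threshold:

# Sketch: turn the binary foreground mask into bounding boxes on the frame
contours, _ = cv2.findContours(foremask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    if cv2.contourArea(cnt) > 500:  # skip tiny noise blobs (tunable)
        x, y, w, h = cv2.boundingRect(cnt)
        cv2.rectangle(vidframe, (x, y), (x + w, y + h), (0, 255, 0), 2)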