Computer vision builds on a handful of fundamental techniques for image recognition and processing, and the Hough transform is one of them. It detects shapes in an image and remains a cornerstone of modern image processing.
The Hough transform is a popular feature-extraction method in image analysis and digital image processing. Its main purpose is to identify imperfect or partially formed instances of objects belonging to a particular class of shapes using a voting process. It is widely used to detect simple shapes such as lines, circles, and ellipses.
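To make the voting process concrete, here is a minimal pure-Python sketch, not OpenCV's implementation. The function name hough_votes, its parameters, and the sample points are assumptions made for illustration. Each edge point votes for every (rho, theta) bin it could lie on, and bins that collect many votes correspond to likely lines.

```python
import math
from collections import Counter

def hough_votes(points, rho_step=1.0, theta_steps=180):
    """Accumulate votes in (rho, theta) space for a list of (x, y) edge points."""
    votes = Counter()
    for x, y in points:
        for t in range(theta_steps):
            theta = t * math.pi / theta_steps          # angle of this bin in radians
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), t)] += 1     # one vote per (point, angle)
    return votes

# Ten points on the horizontal line y = 5, plus two stray points (made-up data).
points = [(x, 5) for x in range(10)] + [(2, 9), (7, 1)]
votes = hough_votes(points)

# The bin for rho = 5, theta = 90 degrees receives one vote per collinear point.
print(votes[(5, 90)])

# A vote threshold keeps only well-supported bins.
threshold = 10
candidates = [bin_ for bin_, count in votes.items() if count >= threshold]
```

Because the bins are coarse, a few neighboring angle bins can tie with the true one on such a short segment. Raising the threshold or using longer lines sharpens the peak.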
OpenCV is an open-source library of programming functions designed primarily for real-time computer vision tasks, with heavily optimized implementations. When combined with the Hough transform, OpenCV allows for robust and precise shape detection.
Before we dive into the Hough transform, it's important to have OpenCV installed. We can install it using pip:
pip install opencv-python
The following steps use OpenCV to apply the Hough transform to a simple image and extract the lines it contains.
In this step, we import all the necessary libraries. Here, cv2 is the OpenCV library, numpy is a library for handling arrays (which images essentially are), and matplotlib is for displaying images.
import cv2
import numpy as np
import matplotlib.pyplot as plt
Load the image and convert it into grayscale. The reason for conversion to grayscale is that edge detection, which will be used later, works more effectively on grayscale images than on color images.
input_image = cv2.imread('image.jpg')
gray = cv2.cvtColor(input_image, cv2.COLOR_BGR2GRAY)
Here, apply the Canny edge detection method, which helps identify areas of the image with abrupt pixel intensity changes, denoting edges.
edge_image = cv2.Canny(gray, 50, 150, apertureSize=3)
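The two thresholds passed to Canny (50 and 150 here) drive its hysteresis step. The following is a simplified one-dimensional sketch of that idea, not OpenCV's actual implementation. The function hysteresis_1d and the sample gradient magnitudes are hypothetical. Samples at or above the high threshold are kept as strong edges, and samples between the two thresholds are kept only if they connect to a strong one.

```python
def hysteresis_1d(magnitudes, low, high):
    """Keep strong samples (>= high) and weak samples (>= low) linked to a strong one."""
    n = len(magnitudes)
    keep = [m >= high for m in magnitudes]       # start from the strong samples
    stack = [i for i in range(n) if keep[i]]
    while stack:
        i = stack.pop()
        for j in (i - 1, i + 1):                 # look at both neighbours
            if 0 <= j < n and not keep[j] and magnitudes[j] >= low:
                keep[j] = True                   # weak sample connected to an edge
                stack.append(j)
    return keep

# Made-up gradient magnitudes along a row of pixels.
magnitudes = [10, 60, 160, 70, 20, 80, 30]
edges = hysteresis_1d(magnitudes, low=50, high=150)
print(edges)
```

In this sketch the value 80 exceeds the low threshold but is dropped, because the 20 next to it separates it from the strong sample.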
After performing edge detection, apply the Hough transform to the detected edges. The Hough transform will return an array of parameters of the detected lines in the form of (rho, theta) pairs.
detected_lines = cv2.HoughLines(edge_image, 1, np.pi / 180, 200)
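For reference, rho and theta describe a line in its normal (polar) form: rho is the perpendicular distance from the origin (the image's top-left corner) to the line, and theta is the angle of that perpendicular. The symbols x and y for pixel coordinates are introduced here for illustration:

```latex
\rho = x\cos\theta + y\sin\theta, \qquad (x_0, y_0) = (\rho\cos\theta,\ \rho\sin\theta)
```

Here (x_0, y_0) is the point on the line closest to the origin, which is the same point computed as x_0 and y_0 when the lines are drawn.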
In this step, iterate through all the detected lines and draw them on a copy of the original image.
output_image = input_image.copy()
for line in detected_lines:
    for rho, theta in line:
        cos_theta = np.cos(theta)
        sin_theta = np.sin(theta)
        x_0 = cos_theta * rho
        y_0 = sin_theta * rho
        x_1 = int(x_0 + 1000 * (-sin_theta))
        y_1 = int(y_0 + 1000 * (cos_theta))
        x_2 = int(x_0 - 1000 * (-sin_theta))
        y_2 = int(y_0 - 1000 * (cos_theta))
        cv2.line(output_image, (x_1, y_1), (x_2, y_2), (0, 0, 255), 2)
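The endpoint arithmetic in this step can be isolated in a small helper. This is a sketch using only the standard library. The name line_endpoints and the length parameter are assumptions for illustration, with 1000 mirroring the constant used in the drawing code. The idea is to start from the point on the line closest to the origin and step the same distance in both directions along the line.

```python
import math

def line_endpoints(rho, theta, length=1000):
    """Return two integer points far apart on the line given by (rho, theta)."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    x0, y0 = cos_t * rho, sin_t * rho    # point on the line closest to the origin
    dx, dy = -sin_t, cos_t               # unit vector pointing along the line
    p1 = (int(x0 + length * dx), int(y0 + length * dy))
    p2 = (int(x0 - length * dx), int(y0 - length * dy))
    return p1, p2

# theta = 0 describes a vertical line; with rho = 5 it is the line x = 5.
p1, p2 = line_endpoints(rho=5.0, theta=0.0)
print(p1, p2)  # (5, 1000) (5, -1000)
```

The endpoints usually fall outside the image, which is fine because cv2.line clips the drawing to the image bounds.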
The OpenCV library reads images in BGR format by default, but matplotlib's imshow() function expects images in RGB format. So, convert the original and the processed image to RGB format before displaying them.
# Convert the processed image from BGR to RGB
output_image_rgb = cv2.cvtColor(output_image, cv2.COLOR_BGR2RGB)
# Convert the original image from BGR to RGB
original_image_rgb = cv2.cvtColor(cv2.imread('image.jpg'), cv2.COLOR_BGR2RGB)
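To see what the BGR-to-RGB conversion does, the sketch below uses plain NumPy on a tiny made-up image. For a three-channel image, reversing the channel axis has the same effect as cv2.COLOR_BGR2RGB. The array values are hypothetical test data.

```python
import numpy as np

# A 1x2 image in BGR order: one pure-blue pixel and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels, giving RGB order.
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255]  (blue, now in RGB order)
print(rgb[0, 1].tolist())  # [255, 0, 0]  (red, now in RGB order)
```

Skipping this conversion is why images loaded with OpenCV can appear with red and blue swapped in matplotlib.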
Finally, use matplotlib to display the original and processed images side by side. The original image is displayed on the left, and the processed image with detected lines is on the right.
plt.figure(figsize=(15, 10))
plt.subplot(1, 2, 1)
plt.title('Original Image')
plt.imshow(original_image_rgb)
plt.axis('off')
plt.subplot(1, 2, 2)
plt.title('Image with Hough Lines')
plt.imshow(output_image_rgb)
plt.axis('off')
plt.show()
Here's the complete executable code implementing the above steps:
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image and convert it to grayscale
input_image = cv2.imread('image.jpg')
gray = cv2.cvtColor(input_image, cv2.COLOR_BGR2GRAY)

# Perform edge detection
edge_image = cv2.Canny(gray, 50, 150, apertureSize=3)

# Apply the Hough Transform
detected_lines = cv2.HoughLines(edge_image, 1, np.pi / 180, 200)

# Draw the detected lines on the image
output_image = input_image.copy()

for line in detected_lines:
    for rho, theta in line:
        cos_theta = np.cos(theta)
        sin_theta = np.sin(theta)
        x_0 = cos_theta * rho
        y_0 = sin_theta * rho
        x_1 = int(x_0 + 1000 * (-sin_theta))
        y_1 = int(y_0 + 1000 * (cos_theta))
        x_2 = int(x_0 - 1000 * (-sin_theta))
        y_2 = int(y_0 - 1000 * (cos_theta))
        cv2.line(output_image, (x_1, y_1), (x_2, y_2), (0, 0, 255), 2)

# Convert the processed image from BGR to RGB
output_image_rgb = cv2.cvtColor(output_image, cv2.COLOR_BGR2RGB)

# Convert the original image from BGR to RGB
original_image_rgb = cv2.cvtColor(cv2.imread('image.jpg'), cv2.COLOR_BGR2RGB)

# Display original and output images
plt.figure(figsize=(15, 10))

plt.subplot(1, 2, 1)
plt.title('Original Image')
plt.imshow(original_image_rgb)
plt.axis('off')

plt.subplot(1, 2, 2)
plt.title('Image with Hough Lines')
plt.imshow(output_image_rgb)
plt.axis('off')

plt.show()

Here's a line-by-line breakdown of the code:
Lines 1–3: We import the required libraries. OpenCV (cv2) is used for image processing, NumPy (np) for numerical operations, and matplotlib.pyplot (plt) for visualizing the images.
Lines 6–7: The image is loaded using cv2.imread and then converted to grayscale using cv2.cvtColor. Grayscale is used because Canny edge detection works on intensity gradients and is typically applied to a single-channel image.
Line 10: We perform edge detection using the Canny algorithm, which detects a wide range of edges in an image. The two numbers 50 and 150 are the lower and upper thresholds for the hysteresis procedure in the Canny algorithm.
Line 13: This line performs the Hough transform on edge_image, which is the result of edge detection on the grayscale image. The function identifies lines in the image and represents each one in polar form (rho and theta). The parameters 1 and np.pi / 180 specify the distance resolution (in pixels) and the angle resolution (in radians, here one degree) used in the Hough space. The value 200 is the threshold, which sets the minimum number of votes required for a line to count as detected.
Lines 16–28: These lines iterate through each line detected by the Hough transform and draw it onto a copy of the original image. The rho and theta values represent the distance of the line from the origin and the angle of its normal, respectively; together they fully describe a line in image space. The np.cos and np.sin functions are used to calculate the x and y coordinates of the two points defining the line.
Line 31: Here, we convert the processed BGR image to RGB. OpenCV loads images in BGR format by default, but matplotlib displays images in RGB, so the images need to be converted.
Line 34: We load the original image and convert it to RGB.
Lines 37–47: These lines visualize the original and processed images side by side using matplotlib. plt.figure creates a new figure, plt.subplot adds a subplot to the figure, plt.title sets a title for the subplot, plt.imshow displays an image in the subplot, and plt.axis('off') turns off the axis.
Line 49: plt.show() is used to display the figure with the two plots.
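One practical caveat about the drawing loop: in the OpenCV Python bindings, cv2.HoughLines typically returns None rather than an empty array when no line reaches the vote threshold, and iterating over None raises an error. The sketch below shows one way to guard against that. The helper iter_lines and the hand-built array are assumptions for illustration, and the array only mimics the (N, 1, 2) shape that HoughLines normally returns.

```python
import numpy as np

def iter_lines(detected_lines):
    """Yield (rho, theta) pairs; yield nothing if no lines were detected."""
    if detected_lines is None:
        return
    # Flatten the (N, 1, 2) result into N rows of (rho, theta).
    for rho, theta in np.asarray(detected_lines).reshape(-1, 2):
        yield float(rho), float(theta)

# Hand-built stand-in for a HoughLines result with two lines.
fake_lines = np.array([[[5.0, 0.0]], [[12.0, np.pi / 2]]], dtype=np.float32)

pairs = list(iter_lines(fake_lines))
no_pairs = list(iter_lines(None))
print(len(pairs), len(no_pairs))  # 2 0
```

With a guard like this, lowering or raising the threshold while experimenting does not crash the script when nothing is detected.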
The Hough transform is a simple, yet powerful way to find lines in images. Using Python and OpenCV, even beginners can apply this method and start their journey in the exciting world of image processing. Play around with the parameters to better understand how they impact the results.