Hand tracking in Python

Hand tracking is a computer vision application that detects and follows human hand gestures and movements in real time.

In this Answer, we’ll explore how to perform hand tracking using OpenCV and the MediaPipe library. We’ll walk through the entire process, from setting up the environment to creating a Python script that tracks hands in a video.

Hand tracking has numerous applications, from virtual reality and gesture-based interfaces to sign language recognition and more. OpenCV is a popular computer vision library that provides tools for image and video processing, while MediaPipe is a powerful library developed by Google that offers various pre-trained models for tasks like face detection, hand tracking, and pose estimation.

OpenCV and MediaPipe

Prerequisites

Before we begin, make sure to have the following installed:

  • Python (3.6 or later)

  • OpenCV (cv2)

  • MediaPipe (mediapipe)

Installing required libraries

We can install the necessary libraries using pip:

pip install opencv-python mediapipe

Understanding MediaPipe hands

The MediaPipe Hands module provides a pre-trained model for hand tracking, which can detect and track the landmarks (key points) of the human hand in images or video frames. The model reports 21 landmarks per hand, each corresponding to a specific point such as the wrist, a knuckle, or a fingertip.

Hand landmarks
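
To make the landmark layout concrete, here is a minimal sketch. It assumes the standard 21-point MediaPipe Hands layout, where index 0 is the wrist and indices 4, 8, 12, 16, and 20 are the fingertips, and that each landmark's x and y values are normalized to the range 0 to 1 relative to the image width and height. The names FINGERTIPS and to_pixel are hypothetical helpers for illustration, not part of MediaPipe.

```python
# Indices of a few landmarks in the 21-point MediaPipe Hands layout.
WRIST = 0
FINGERTIPS = {
    "thumb": 4,
    "index": 8,
    "middle": 12,
    "ring": 16,
    "pinky": 20,
}


def to_pixel(x_norm, y_norm, width, height):
    """Convert normalized landmark coordinates to integer pixel coordinates.

    Values are clamped to the image bounds, since a landmark that is
    partly off-screen can fall slightly outside the 0..1 range.
    """
    px = min(max(int(x_norm * width), 0), width - 1)
    py = min(max(int(y_norm * height), 0), height - 1)
    return px, py


if __name__ == "__main__":
    # A landmark at the center of a 640x480 frame
    print(to_pixel(0.5, 0.5, 640, 480))  # prints (320, 240)
```

With a real result, the index fingertip of a hand would be read as hand.landmark[FINGERTIPS["index"]] and passed through to_pixel with the frame's width and height.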

Setting up the environment

Let’s start by importing the required libraries and initializing the MediaPipe Hands module:

import cv2
import mediapipe as mp
# Initialize MediaPipe Hands module
mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils
# Specify the path to your video file
video_path = 'video.mov'
# Initialize video capture
vidcap = cv2.VideoCapture(video_path)
# Set the desired window width and height
winwidth = 350
winheight = 600
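
The setup above reads from a video file. For live tracking, cv2.VideoCapture also accepts an integer camera index, where 0 is usually the default webcam. The sketch below uses a hypothetical helper, resolve_source, that turns a command-line style string into the right argument type. The helper is only an illustration and is not part of OpenCV.

```python
def resolve_source(source):
    """Return an int camera index for digit strings, else the path unchanged."""
    text = str(source).strip()
    if text.isdigit():
        return int(text)
    return text


if __name__ == "__main__":
    print(resolve_source("0"))          # 0 (camera index)
    print(resolve_source("video.mov"))  # video.mov (file path)
    # Assumed usage with OpenCV:
    #   vidcap = cv2.VideoCapture(resolve_source("0"))
    #   if not vidcap.isOpened():
    #       raise SystemExit("Could not open the video source")
```

Checking vidcap.isOpened() right after creating the capture gives an early, readable failure when the path is wrong or the camera is unavailable.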

Loading and processing video

We’ll load the video file and process each frame for hand tracking:

while vidcap.isOpened():
    ret, frame = vidcap.read()
    if not ret:
        break
    # Convert the BGR image to RGB
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

Hand detection and landmark tracking

Now, let’s perform hand detection and landmark tracking using the MediaPipe Hands module:

# Create the Hands object once, so that it wraps the frame loop
# rather than being rebuilt for every frame
with mp_hands.Hands(min_detection_confidence=0.5, min_tracking_confidence=0.5) as hands:
    # For each frame: process the RGB image for hand tracking
    process_frames = hands.process(rgb_frame)
    # Draw landmarks on the frame
    if process_frames.multi_hand_landmarks:
        for lm in process_frames.multi_hand_landmarks:
            mp_drawing.draw_landmarks(frame, lm, mp_hands.HAND_CONNECTIONS)
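
To use the landmark values rather than only draw them, the result object can be unpacked into plain tuples. This is a sketch under the assumption that each entry of multi_hand_landmarks exposes a landmark sequence whose items have x, y, and z attributes, and that multi_hand_landmarks is None when no hand is found. hands_to_points is a hypothetical helper, and the SimpleNamespace objects below only stand in for a real MediaPipe result so that the example runs without a camera.

```python
from types import SimpleNamespace


def hands_to_points(results):
    """Return one list of (x, y, z) tuples per detected hand."""
    detected = getattr(results, "multi_hand_landmarks", None)
    if not detected:
        return []  # No hands in this frame
    return [
        [(lm.x, lm.y, lm.z) for lm in hand.landmark]
        for hand in detected
    ]


if __name__ == "__main__":
    # Stand-in for a MediaPipe result with one hand and two landmarks
    fake_hand = SimpleNamespace(landmark=[
        SimpleNamespace(x=0.1, y=0.2, z=0.0),
        SimpleNamespace(x=0.3, y=0.4, z=-0.1),
    ])
    fake_results = SimpleNamespace(multi_hand_landmarks=[fake_hand])
    print(hands_to_points(fake_results))
    print(hands_to_points(SimpleNamespace(multi_hand_landmarks=None)))
```

Returning an empty list when nothing is detected lets the calling loop treat "no hands" and "some hands" the same way, without a separate None check.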

Displaying the results

Finally, we’ll resize the frame and display the results:

# Resize the frame to the desired window size
resized_frame = cv2.resize(frame, (winwidth, winheight))
# Display the resized frame
cv2.imshow('Hand Tracking', resized_frame)
# Exit loop by pressing 'q'
if cv2.waitKey(1) & 0xFF == ord('q'):
    break
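
Resizing every frame to a fixed 350×600 window can stretch the image when the video has a different aspect ratio. One option, sketched below, is to compute a target size that fits inside the window while keeping the proportions. fit_within is a hypothetical helper and not an OpenCV function. It uses integer arithmetic to avoid floating-point rounding surprises.

```python
def fit_within(src_w, src_h, max_w, max_h):
    """Largest (width, height) inside max_w x max_h with the source aspect ratio."""
    if min(src_w, src_h, max_w, max_h) <= 0:
        raise ValueError("all dimensions must be positive")
    if src_w * max_h >= src_h * max_w:
        # Width is the limiting dimension
        return max_w, max(1, src_h * max_w // src_w)
    # Height is the limiting dimension
    return max(1, src_w * max_h // src_h), max_h


if __name__ == "__main__":
    # A 1080x1920 portrait video shown in a 350x600 window
    print(fit_within(1080, 1920, 350, 600))  # prints (337, 600)
    # Assumed usage with OpenCV, which expects the size as (width, height):
    #   h, w = frame.shape[:2]
    #   resized_frame = cv2.resize(frame, fit_within(w, h, winwidth, winheight))
```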

Complete code

Here's the complete hand tracking code that implements the steps above:

import cv2
import mediapipe as mp

# Initialize mediapipe hands module
mphands = mp.solutions.hands
mpdrawing = mp.solutions.drawing_utils

# Specify the path to your video file
vidpath = 'video.mov'

# Initialize video capture
vidcap = cv2.VideoCapture(vidpath)

# Set the desired window width and height
winwidth = 350
winheight = 600

# Initialize hand tracking
with mphands.Hands(min_detection_confidence=0.5, min_tracking_confidence=0.5) as hands:
    while vidcap.isOpened():
        ret, frame = vidcap.read()
        if not ret:
            break

        # Convert the BGR image to RGB
        rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # Process the frame for hand tracking
        processFrames = hands.process(rgb_frame)

        # Draw landmarks on the frame
        if processFrames.multi_hand_landmarks:
            for lm in processFrames.multi_hand_landmarks:
                mpdrawing.draw_landmarks(frame, lm, mphands.HAND_CONNECTIONS)

        # Resize the frame to the desired window size
        resized_frame = cv2.resize(frame, (winwidth, winheight))

        # Display the resized frame
        cv2.imshow('Hand Tracking', resized_frame)

        # Exit loop by pressing 'q'
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

# Release the video capture and close windows
vidcap.release()
cv2.destroyAllWindows()

Code explanation

Here's an explanation for the above code:

  • Lines 1–2: In these lines, we import the necessary libraries for the hand tracking project.

  • Lines 4–6: Here, we initialize the MediaPipe Hands module and its drawing utilities. This module will be used for hand tracking, and the drawing utilities will help visualize the detected landmarks on the frames.

  • Lines 8–16: In these lines, we specify the path to the input video file and initialize a video capture object using OpenCV’s VideoCapture class. Additionally, we set the desired dimensions for the display window.

  • Lines 18–19: This section initializes the hand tracking process using the mphands.Hands context manager. It specifies the minimum confidence levels for both detection and tracking. The main tracking process is enclosed within a while loop that reads frames from the video capture object.

  • Lines 20–34: This block of code handles the processing of each frame for hand tracking. It involves converting the frame’s color space, processing the frame using the hand tracking module, and drawing the detected landmarks on the frame if any hands are detected.

  • Lines 36–44: These lines handle the final steps of the process. The frame is resized to the desired window dimensions and displayed using cv2.imshow(), and the loop continues until the "q" key is pressed.

  • Lines 46–48: The final lines of the code release the video capture object and close any open windows, ensuring a clean exit from the program.
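
One detail of the display step is worth a sketch: cv2.waitKey(1) waits only about one millisecond per frame, so a video file plays as fast as frames can be processed rather than at its recorded speed. A possible adjustment is to derive the delay from the video's frame rate. frame_delay_ms is a hypothetical helper, and the 33 ms fallback is an assumed default for when the frame rate is reported as 0 or is otherwise unusable.

```python
def frame_delay_ms(fps, default_ms=33):
    """Milliseconds to wait per frame for a given frame rate.

    Falls back to default_ms when the frame rate is missing or invalid.
    """
    if not fps or fps <= 0:
        return default_ms
    return max(1, round(1000 / fps))


if __name__ == "__main__":
    print(frame_delay_ms(25))  # prints 40
    print(frame_delay_ms(0))   # prints 33 (fallback)
    # Assumed usage with OpenCV:
    #   delay = frame_delay_ms(vidcap.get(cv2.CAP_PROP_FPS))
    #   if cv2.waitKey(delay) & 0xFF == ord('q'):
    #       break
```

This does not account for the time spent on hand tracking itself, so playback may still run slower than the source on a slow machine.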

Conclusion

In this Answer, we explored how to perform hand tracking using OpenCV and the MediaPipe library. We learned how to set up the environment, load and process a video, detect and track hand landmarks, and display the results in a resized window. Hand tracking has a wide range of applications, and with the help of OpenCV and MediaPipe, we can easily integrate this functionality into our projects.


Copyright ©2025 Educative, Inc. All rights reserved