Create a virtual paint app using MediaPipe

MediaPipe is a framework developed by Google that provides developers with pre-built, customizable AI solutions. Its open-source, pre-trained models let developers build applications that process video or audio data in real time.

What is a virtual paint app?

It is a painting application that allows the user to create attractive visuals or doodles through their finger gestures without the need for a physical medium to write on or write with. It is specially designed for people who enjoy painting as a hobby or a quick escape from their routine. The canvas is reusable; the user can clear the screen and use it again for a different visual which saves storage.

Color palette

The color palette is a set of color options for choosing the brush shade. Any number of shades can be provided; this app offers three: blue, pink, and yellow. There is also an eraser for removing unwanted strokes.

The color palette.

The purpose of providing a color palette along with an eraser is to give users a sense of control and liberty to be creative with their drawings.

How does the header update?

We use a set of images, each highlighting one of the icons in the header, and save them in a folder in a fixed order. When the user makes a selection, we change the header overlay image to match the selected option. For example, if the user selects the blue color, the header is changed to the image where the blue color is highlighted.

Headers used for this code.

Note: These are four different .png images saved in the Images directory.
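
Because the code picks header images by their position in a list, the order in which the files are read matters, and os.listdir() does not guarantee any particular order. The following is a minimal sketch of one way to get a stable order. The file names and the list_header_files() helper are hypothetical, since the actual names of the header images are not given here.

```python
import os
import tempfile

def list_header_files(folder):
    """Return the .png file names in `folder`, sorted so list indexes stay stable."""
    return sorted(
        name for name in os.listdir(folder) if name.lower().endswith(".png")
    )

# Demo with empty placeholder files; these names are made up for illustration.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ["3_pink.png", "0_blue.png", "2_yellow.png", "1_eraser.png", "notes.txt"]:
        open(os.path.join(demo_dir, name), "w").close()
    header_files = list_header_files(demo_dir)

print(header_files)  # ['0_blue.png', '1_eraser.png', '2_yellow.png', '3_pink.png']
```

With numeric prefixes like these, the sorted position of each file would line up with the index used to select it.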

Expected performance

The virtual paint application detects the live hand movement via a web camera and displays the drawn design on the screen. It offers different features that can be accessed through different finger movements. Let's take a look at them one by one.

Select color

Use the index finger and the middle finger to select a color from the palette. We define an x-coordinate range for each shade; when the two raised fingers are detected in the header area, the code checks which range the index fingertip falls in and selects the corresponding color from the header.

The following code snippet implements this functionality. We check whether the index and middle fingers are up, then use the if conditions to set the header image and brush color for the matching range. A filled rectangle is drawn between the two fingertips to indicate that selection mode is active.

nonSel = [0, 3, 4] # indexes of the fingers that need to be down in the Selection Mode
if (fingers[1] and fingers[2]) and all(fingers[i] == 0 for i in nonSel):
    xp, yp = [x1, y1]
    # Selecting the colors and the eraser on the screen
    if(y1 < 125):
        if(170 < x1 < 295):
            header = overlayList[0]
            drawColor = (255, 0, 0) # blue (OpenCV colors are in BGR order)
        elif(436 < x1 < 561):
            header = overlayList[3]
            drawColor = (95, 0, 189) # pink
        elif(700 < x1 < 825):
            header = overlayList[2]
            drawColor = (0, 255, 255) # yellow
        elif(980 < x1 < 1105):
            header = overlayList[1]
            drawColor = (0, 0, 0) # black, used as the eraser
    cv2.rectangle(image, (x1-10, y1-15), (x2+10, y2+23), drawColor, cv2.FILLED)
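
The same range checks can also be written as a lookup table. This is only a sketch of an alternative layout, not the code used in this app: the PALETTE table and the pick_tool() helper are hypothetical names, while the coordinate ranges, header indexes, and BGR colors are taken from the snippet above.

```python
HEADER_HEIGHT = 125  # the header occupies the top 125 pixels of the frame

# (x_min, x_max, index into overlayList, BGR color)
PALETTE = [
    (170, 295, 0, (255, 0, 0)),     # blue
    (436, 561, 3, (95, 0, 189)),    # pink
    (700, 825, 2, (0, 255, 255)),   # yellow
    (980, 1105, 1, (0, 0, 0)),      # eraser (black)
]

def pick_tool(x, y):
    """Return (header_index, color) for a fingertip at (x, y), or None."""
    if y >= HEADER_HEIGHT:
        return None  # the fingertip is below the header
    for x_min, x_max, header_index, color in PALETTE:
        if x_min < x < x_max:
            return header_index, color
    return None  # inside the header but between icons

print(pick_tool(200, 60))    # (0, (255, 0, 0))
print(pick_tool(200, 300))   # None
```

With a table like this, adding a color would come down to one more row plus a matching header image.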

Erase strokes

This feature is implemented in the same way as color selection. To erase strokes, select the eraser icon at the far right of the header and draw over the stroke to be removed. The eraser paints in black, and black pixels on the canvas are treated as empty when the canvas is merged with the camera image, so the erased area appears clear again.

The following if statement from the color selection code snippet implements this functionality. The header image is updated, and the color is set to black.

elif(980 < x1 < 1105):
    header = overlayList[1]
    drawColor = (0, 0, 0)
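
To see why painting in black erases a stroke, it helps to look at how the canvas is merged with the camera frame in the complete example. The sketch below is a NumPy-only approximation of that step (the complete example uses cv2.cvtColor, cv2.threshold, cv2.bitwise_and, and cv2.bitwise_or); the composite() helper and the tiny 2x2 test images are assumptions made for illustration.

```python
import numpy as np

def composite(frame, canvas, threshold=5):
    """Overlay canvas strokes on frame; near-black canvas pixels show the frame."""
    # Approximate grayscale using the standard luma weights in BGR order.
    gray = canvas @ np.array([0.114, 0.587, 0.299])
    # 255 where the canvas is empty (or erased), 0 where a stroke exists.
    inverse_mask = np.where(gray > threshold, 0, 255).astype(np.uint8)
    # Cut stroke-shaped holes in the frame, then fill them with the strokes.
    return (frame & inverse_mask[..., None]) | canvas

frame = np.full((2, 2, 3), 200, np.uint8)   # stand-in for a camera frame
canvas = np.zeros((2, 2, 3), np.uint8)      # blank canvas
canvas[0, 0] = (255, 0, 0)                  # a blue stroke pixel
canvas[0, 1] = (0, 0, 0)                    # a pixel painted over with the eraser

result = composite(frame, canvas)
print(result[0, 0])   # [255   0   0] -> the stroke is shown
print(result[0, 1])   # [200 200 200] -> the camera pixel shows through
```

Because an erased pixel is indistinguishable from one that was never painted, no separate transparency layer is needed.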

Increase brush thickness

Use the index finger and the thumb to change the thickness of the brush. The thickness increases as the gap between the finger and the thumb widens and decreases as it narrows. Once you reach the desired size, raise the pinky finger to set it.

The following code snippet implements this functionality. We check whether the index finger and the thumb are up, compute the distance between their tips to derive the brush size, and find the midpoint between them. A circle of the corresponding size is drawn beside the midpoint to preview the thickness. If the pinky finger is also up, the current size is stored as the thickness and a 'Check' message is shown to confirm that the size has been recorded.

# Adjust the thickness of the line using the index finger and thumb
selecting = [1, 1, 0, 0, 0]
setting = [1, 1, 0, 0, 1]
if all(fingers[i] == j for i, j in zip(range(0, 5), selecting)) or all(fingers[i] == j for i, j in zip(range(0, 5), setting)):
    r = int(math.sqrt((x1-x3)**2 + (y1-y3)**2)/3)
    # Getting the middle point between these two fingers
    x0, y0 = [(x1+x3)/2, (y1+y3)/2]
    # Getting the vector that is orthogonal to the line formed between these fingers
    v1, v2 = [x1 - x3, y1 - y3]
    v1, v2 = [-v2, v1]
    # Normalizing it (fall back to 1.0 to avoid dividing by zero when the tips coincide)
    mod_v = math.sqrt(v1**2 + v2**2) or 1.0
    v1, v2 = [v1/mod_v, v2/mod_v]
    # Draw the circle that represents the draw thickness, offset from (x0, y0)
    c = 3 + r
    x0, y0 = [int(x0 - v1*c), int(y0 - v2*c)]
    cv2.circle(image, (x0, y0), int(r/2), drawColor, -1)
    # Set the thickness when pinky finger is up
    if fingers[4]:
        thickness = max(r, 1) # cv2.line requires a thickness of at least 1
        cv2.putText(image, 'Check', (x4-25, y4-8), cv2.FONT_HERSHEY_TRIPLEX, 0.8, (0,0,0), 1)
    xp, yp = [x1, y1]
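
The geometry in this snippet can be checked in isolation, without a camera. The sketch below repeats the same arithmetic in a standalone function; the thickness_preview() name and the sample coordinates are hypothetical.

```python
import math

def thickness_preview(index_tip, thumb_tip, gap=3):
    """Return (r, center) for the preview circle, given two (x, y) pixel points."""
    (x1, y1), (x3, y3) = index_tip, thumb_tip
    distance = math.hypot(x1 - x3, y1 - y3)
    r = int(distance / 3)  # brush size grows with the finger-thumb gap
    if distance == 0:
        return r, (x1, y1)  # tips coincide, so there is no direction to offset along
    # Unit vector perpendicular to the thumb-to-index line.
    nx, ny = -(y1 - y3) / distance, (x1 - x3) / distance
    # Start at the midpoint and step away from the line by (gap + r) pixels.
    mid_x, mid_y = (x1 + x3) / 2, (y1 + y3) / 2
    center = (int(mid_x - nx * (gap + r)), int(mid_y - ny * (gap + r)))
    return r, center

print(thickness_preview((400, 300), (310, 300)))   # (30, (355, 267))
```

Here the tips are 90 pixels apart on a horizontal line, so r is 30 and the preview circle sits 33 pixels above the midpoint, clear of the line between the fingers.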

Break the stroke

Use the index finger and the pinky finger to break the stroke while drawing. This leaves a gap between the drawn objects and produces a clearer visual. Hold the two fingers in front of the webcam until you notice a thin line appearing between them.

The following code snippet implements this functionality. We check whether the index and pinky fingers are up, draw a line on the camera image to show that the app is in standby mode, and record the current fingertip position so that no stroke is drawn on the canvas.

#Stand by Mode
nonStand = [0, 2, 3]
if (fingers[1] and fingers[4]) and all(fingers[i] == 0 for i in nonStand):
    cv2.line(image, (xp, yp), (x4, y4), drawColor, 5)
    xp, yp = [x1, y1]
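
The stroke breaks because standby mode keeps updating the previous point (xp, yp) without painting anything on the canvas. The following simplified model illustrates the idea. It is an assumption-laden sketch rather than the app's code: painted_segments() and the event list are hypothetical, and the very first draw event is skipped instead of painting a zero-length line.

```python
def painted_segments(events):
    """Return the line segments that draw mode would paint for (mode, point) events."""
    segments = []
    previous = None
    for mode, point in events:
        if mode == "draw" and previous is not None:
            segments.append((previous, point))
        # Every mode, including standby, moves the previous point to the fingertip.
        previous = point
    return segments

events = [
    ("draw", (10, 10)),
    ("draw", (20, 10)),
    ("standby", (60, 10)),   # the hand moves while nothing is painted
    ("draw", (62, 10)),
]
print(painted_segments(events))
# [((10, 10), (20, 10)), ((60, 10), (62, 10))]
```

No segment joins (20, 10) to (60, 10), which is the gap left by standby mode.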

Clear the screen

Close all the fingers and the thumb to form a fist to clear the screen. The hand must still be detected in the frame, because the check runs only while MediaPipe reports hand landmarks.

The following code snippet implements this functionality. We check whether all five entries of the fingers list are 0, meaning no finger or thumb is raised. If so, we replace imgCanvas with a blank canvas, which clears the drawing shown to the user. The previous-point coordinates are also reset to the current index fingertip position.

#Clear the canvas when no fingers are up
if all(fingers[i] == 0 for i in range(0, 5)):
    imgCanvas = np.zeros((height, width, 3), np.uint8)
    xp, yp = [x1, y1]
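
The gestures covered in this section, together with the draw mode that appears in the complete example, each correspond to one pattern of the fingers list. As a compact summary, they can be collected in a lookup table. The classify_gesture() helper and the mode names below are hypothetical; the finger patterns follow the checks used in the code.

```python
def classify_gesture(fingers):
    """Map a [thumb, index, middle, ring, pinky] list of 0/1 flags to a mode name."""
    modes = {
        (0, 1, 1, 0, 0): "select",
        (0, 1, 0, 0, 0): "draw",
        (0, 1, 0, 0, 1): "standby",
        (0, 0, 0, 0, 0): "clear",
        (1, 1, 0, 0, 0): "preview thickness",
        (1, 1, 0, 0, 1): "set thickness",
    }
    return modes.get(tuple(fingers), "none")

print(classify_gesture([0, 1, 1, 0, 0]))   # select
print(classify_gesture([1, 1, 1, 1, 1]))   # none
```

Since every pattern is distinct, at most one mode is active for any given frame.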

Let's combine all these code snippets to integrate the features described above into the virtual paint application.

Required imports

To implement a virtual paint application in code, we first need to import the following libraries and modules.

import cv2
import numpy as np
import os
import mediapipe as mp
import math
mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands
  • cv2: The OpenCV library that is used for computer vision-related tasks.

  • numpy: Used to do numerical computations.

  • os: Used to interact with the operating system and access file operations.

  • mediapipe: Used to build computer vision and AI-related applications.

  • math: Used to access mathematical functions.

Example code

In this code, we implement an application that allows the user to select the color and thickness of the stroke and paint on the live canvas through gestures.

import cv2
import mediapipe as mp
import numpy as np
import os
import math

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

#Take webcam input:
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FPS, 5)
width = 1280
height = 720
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

imgCanvas = np.zeros((height, width, 3), np.uint8)

#Get header images from the image directory
script_dir = os.path.dirname(os.path.abspath(__file__))
folderPath = os.path.join(script_dir, 'Images')

myList = sorted(os.listdir(folderPath))  # sort so header indexes are stable
overlayList = []
for imPath in myList:
    image = cv2.imread(f'{folderPath}/{imPath}')
    overlayList.append(image)

#Default setting:
header = overlayList[0]
drawColor = (255, 0, 0)  # blue in BGR, matching the default header
thickness = 20
tipIds = [4, 8, 12, 16, 20] 
xp, yp = [0, 0]

with mp_hands.Hands(min_detection_confidence=0.85, min_tracking_confidence=0.5, max_num_hands=1) as hands:
    while cap.isOpened():
        success, image = cap.read()
        if not success:
            print("Ignoring empty camera frame.")
            break
            
        image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
        image.flags.writeable = False
        results = hands.process(image)

        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
        if results.multi_hand_landmarks:
            for hand_landmarks in results.multi_hand_landmarks:
                points = []
                for lm in hand_landmarks.landmark:
                    points.append([int(lm.x * width), int(lm.y * height)])

                #If hand is detected initialize coordinates
                if len(points) != 0:
                    x1, y1 = points[8]  # Index finger
                    x2, y2 = points[12] # Middle finger
                    x3, y3 = points[4]  # Thumb
                    x4, y4 = points[20] # Pinky

                    #Checking if a finger/thumb is up
                    fingers = []
                    if points[tipIds[0]][0] < points[tipIds[0] - 1][0]:
                        fingers.append(1)
                    else:
                        fingers.append(0)

                    for id in range(1, 5):
                        if points[tipIds[id]][1] < points[tipIds[id] - 2][1]:
                            fingers.append(1)
                        else:
                            fingers.append(0)

                    #Selection Mode - Two fingers are up
                    nonSel = [0, 3, 4] # indexes of the fingers that need to be down in the Selection Mode
                    if (fingers[1] and fingers[2]) and all(fingers[i] == 0 for i in nonSel):
                        xp, yp = [x1, y1]

                        # Selecting the colors and the eraser on the screen
                        if(y1 < 125):
                            if(170 < x1 < 295):
                                header = overlayList[0]
                                drawColor = (255 , 0, 0)
                            elif(436 < x1 < 561):
                                header = overlayList[3]
                                drawColor = (95, 0, 189)
                            elif(700 < x1 < 825):
                                header = overlayList[2]
                                drawColor = (0, 255, 255)
                            elif(980 < x1 < 1105):
                                header = overlayList[1]
                                drawColor = (0, 0, 0)

                        cv2.rectangle(image, (x1-10, y1-15), (x2+10, y2+23), drawColor, cv2.FILLED)

                    #Stand by Mode
                    nonStand = [0, 2, 3]
                    if (fingers[1] and fingers[4]) and all(fingers[i] == 0 for i in nonStand):
                        cv2.line(image, (xp, yp), (x4, y4), drawColor, 5) 
                        xp, yp = [x1, y1]

                    #Draw Mode
                    nonDraw = [0, 2, 3, 4]
                    if fingers[1] and all(fingers[i] == 0 for i in nonDraw):
                        
                        cv2.circle(image, (x1, y1), int(thickness/2), drawColor, cv2.FILLED) 
                        if xp==0 and yp==0:
                            xp, yp = [x1, y1]
                        cv2.line(imgCanvas, (xp, yp), (x1, y1), drawColor, thickness)
                        xp, yp = [x1, y1]

                    #Clear the canvas
                    if all(fingers[i] == 0 for i in range(0, 5)):
                        imgCanvas = np.zeros((height, width, 3), np.uint8)
                        xp, yp = [x1, y1]

                    #Adjust the thickness 
                    selecting = [1, 1, 0, 0, 0] 
                    setting = [1, 1, 0, 0, 1]  
                    if all(fingers[i] == j for i, j in zip(range(0, 5), selecting)) or all(fingers[i] == j for i, j in zip(range(0, 5), setting)):
                        r = int(math.sqrt((x1-x3)**2 + (y1-y3)**2)/3)
                        
                        # Get the middle point between fingers
                        x0, y0 = [(x1+x3)/2, (y1+y3)/2]
                        
                        # Getting the vector that is orthogonal
                        v1, v2 = [x1 - x3, y1 - y3]
                        v1, v2 = [-v2, v1]

                        #Normalizing it 
                        mod_v = math.sqrt(v1**2 + v2**2) or 1.0  # avoid dividing by zero
                        v1, v2 = [v1/mod_v, v2/mod_v]
                        
                        # Draw the circle that represents the draw thickness
                        c = 3 + r
                        x0, y0 = [int(x0 - v1*c), int(y0 - v2*c)]
                        cv2.circle(image, (x0, y0), int(r/2), drawColor, -1)

                        #Set the thickness
                        if fingers[4]:                        
                            thickness = max(r, 1)  # cv2.line needs thickness >= 1
                            cv2.putText(image, 'Check', (x4-25, y4-8), cv2.FONT_HERSHEY_TRIPLEX, 0.8, (0,0,0), 1)

                        xp, yp = [x1, y1]

        #Set the header
        header_height = 125
        header_width = width

        header = cv2.resize(header, (header_width, header_height))
        image[0:125, 0:width] = header

        #Camera image with the drawing made in imgCanvas
        imgGray = cv2.cvtColor(imgCanvas, cv2.COLOR_BGR2GRAY)
        _, imgInv = cv2.threshold(imgGray, 5, 255, cv2.THRESH_BINARY_INV)
        imgInv = cv2.cvtColor(imgInv, cv2.COLOR_GRAY2BGR)
        img = cv2.bitwise_and(image, imgInv)
        img = cv2.bitwise_or(img, imgCanvas)

        cv2.imshow('MediaPipe Hands', img)
        if cv2.waitKey(3) == 13:
            break

cap.release()
cv2.destroyAllWindows()
Code for a virtual paint app.

Note: This code is not executable here. Copy it into a Python file on your machine, install the required libraries, and make sure the header images are in the Images directory to run it successfully.

Code explanation

This is the main file. It sets up the application's appearance, the video source, and the gesture checks.

  • Lines 1–8: Import all the necessary libraries and modules.

  • Lines 11–16: Use VideoCapture() to open the camera and set the frame rate, width, and height of the video.

    • Pass 0 as a parameter to open a webcam.

    • Pass a filename or URL in quotes to open a pre-recorded video.

  • Line 18: Define the blank image (imgCanvas) that holds the drawing and is later merged with the camera image.

  • Lines 21–28: Get the path of the folder that contains the header images and pass it to listdir() so that each image in the folder can be read and stored in overlayList.

  • Lines 31–35: Define the default settings for the application that are set as soon as the webcam opens, and initialize a list containing landmark IDs corresponding to the fingertips and the coordinates that will change with the movement.

  • Line 37: Set the detection and tracking confidence thresholds as well as the maximum number of hands in the Hands() object.

  • Lines 38–42: Create a loop that runs while the webcam is open and read() each frame. If the read fails, print a message and exit the loop.

  • Lines 44–46: Flip the image horizontally, convert it from BGR to RGB, and store it back in image. The image is then processed by hands.process() and the output is saved to results.

  • Line 48: Convert the processed image back from RGB to BGR using the cvtColor() method.

  • Lines 49–60: Check whether hand landmarks are detected in the frame, iterate through them to compute their pixel coordinates, append these to the points list, and initialize the coordinates of the fingertips used by the gestures.

  • Lines 63–73: Check whether the thumb is extended horizontally and whether each finger is extended vertically; append 1 to the fingers list if so, otherwise append 0.

  • Lines 81–95: Add the code snippet for selecting a color or the eraser from the palette in the header.

  • Lines 98–101: Add the code snippet for breaking the stroke by adding a standby mode.

  • Lines 104–111: Add the code snippet for activating the drawing mode by setting the color and coordinates.

  • Lines 114–116: Add the code snippet for clearing the screen.

  • Lines 119–145: Add the code snippet for adjusting the thickness of the stroke and dynamically changing the size of the circle that is visible on the screen.

  • Lines 148–152: Resize the header image to fit the width of the camera frame that is displayed to the user, and place it at the top of the frame as a header.

  • Lines 155–159: Take the content drawn on imgCanvas, create its binary mask, and overlay it on the live camera image.

  • Lines 161–166: Display the application using imshow(). Close the application when the Enter key is pressed, then release the webcam and destroy the window.
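
The finger check in lines 63–73 is the basis for every gesture, so it is worth trying on its own. Below is a minimal sketch that runs the same comparisons on a synthetic list of 21 landmark points instead of a camera frame. The fingers_up() helper and the sample points are hypothetical, and the thumb comparison assumes the same hand orientation and mirrored image as the code above.

```python
TIP_IDS = [4, 8, 12, 16, 20]  # thumb, index, middle, ring, and pinky fingertips

def fingers_up(points):
    """Return five 0/1 flags from 21 (x, y) landmark points in pixel coordinates."""
    # Thumb: compare the x of the tip with the joint just below it.
    flags = [1 if points[TIP_IDS[0]][0] < points[TIP_IDS[0] - 1][0] else 0]
    # Other fingers: a tip with a smaller y than the joint two landmarks below is up.
    for tip in TIP_IDS[1:]:
        flags.append(1 if points[tip][1] < points[tip - 2][1] else 0)
    return flags

points = [[100, 200] for _ in range(21)]  # synthetic hand with all joints level
points[8] = [100, 100]                    # raise only the index fingertip
print(fingers_up(points))                 # [0, 1, 0, 0, 0]
```

Image y-coordinates grow downward, which is why a raised fingertip has the smaller y value.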

Code output

This video shows a working virtual paint app with all the features integrated and a dynamically changing header that contains the color palette and an eraser.

Note: To learn more about MediaPipe, see the Answer on building a finger counter using MediaPipe.

Common queries

Question

How can we add more colors to the palette?


Copyright ©2025 Educative, Inc. All rights reserved