Create a virtual paint app using MediaPipe

MediaPipe is an open-source framework developed by Google that gives developers pre-built, customizable AI solutions. It provides pre-trained models so that developers can build applications that process video or audio data in real time.

What is a virtual paint app?

It is a painting application that lets the user draw visuals or doodles with finger gestures, without a physical medium to write on or write with. It is designed for people who enjoy painting as a hobby or as a quick escape from their routine. The canvas is reusable: the user can clear the screen and use it again for a different drawing, which saves storage.

Color palette

The color palette is a set of color options that lets the user choose the brush shade they prefer. More shades can be added; for now, we provide three colors: blue, pink, and yellow. There is also an eraser that lets the user remove strokes that are not required.

The purpose of providing a color palette along with an eraser is to give users a sense of control and liberty to be creative with their drawings.

How does the header update?

We use a set of images, each highlighting one of the icons in the header, and save them inside a folder in a fixed order so that each image can be looked up by its index. When the user makes a selection, we replace the header overlay with the image for the selected option. For example, if the user selects the blue color, the default header is replaced by the image in which the blue color is highlighted.

Note: These are four different .png images saved in the Images directory.
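
Because each header image is looked up by its list index, the order in which the files are read matters. The sketch below shows one way to keep that order predictable by sorting the file names. The helper name list_header_images and the file names in the usage note are hypothetical and are not taken from the original project.

```python
import os


def list_header_images(folder):
    """Return the .png file names in `folder`, sorted by name.

    os.listdir() returns entries in arbitrary order, so sorting keeps
    each header image at a stable index.
    """
    return sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(".png")
    )
```

With hypothetical files named 1_blue.png, 2_eraser.png, 3_yellow.png, and 4_pink.png, the sorted list puts blue at index 0, the eraser at index 1, yellow at index 2, and pink at index 3, which is the index order the example code in this answer relies on.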

Expected performance

The virtual paint application detects the live hand movement via a web camera and displays the drawn design on the screen. It offers different features that can be accessed through different finger movements. Let's take a look at them one by one.
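
As a rough summary, each feature corresponds to one pattern of raised fingers. The sketch below collects the patterns used in the example code of this answer into a lookup table. The mode names and the classify_gesture helper are hypothetical labels added for illustration; the example code itself checks the patterns with separate if statements.

```python
# Finger order: [thumb, index, middle, ring, pinky]; 1 = raised, 0 = folded.
GESTURES = {
    (0, 1, 1, 0, 0): "select",             # index + middle: pick a color or the eraser
    (0, 1, 0, 0, 1): "standby",            # index + pinky: break the stroke
    (0, 1, 0, 0, 0): "draw",               # index only: draw on the canvas
    (0, 0, 0, 0, 0): "clear",              # closed fist: clear the canvas
    (1, 1, 0, 0, 0): "thickness_preview",  # thumb + index: preview the brush size
    (1, 1, 0, 0, 1): "thickness_set",      # thumb + index + pinky: confirm the size
}


def classify_gesture(fingers):
    """Return the mode name for a list of five 0/1 finger states, or None."""
    return GESTURES.get(tuple(fingers))
```

For example, classify_gesture([0, 1, 0, 0, 0]) returns "draw", and any pattern that is not in the table returns None, so unrecognized hand poses are ignored.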

Select color

Raise the index finger and the middle finger to select a color from the palette. We define a horizontal pixel range for each option in the header; when the two raised fingers are detected inside the header area, the code checks which range the index fingertip falls in and selects the corresponding color or the eraser.

nonSel = [0, 3, 4]  # indexes of the fingers that need to be down in Selection Mode
if (fingers[1] and fingers[2]) and all(fingers[i] == 0 for i in nonSel):
    xp, yp = [x1, y1]
    # Selecting the colors and the eraser on the screen
    if y1 < 125:
        if 170 < x1 < 295:
            header = overlayList[0]
            drawColor = (255, 0, 0)    # blue (OpenCV uses BGR order)
        elif 436 < x1 < 561:
            header = overlayList[3]
            drawColor = (95, 0, 189)   # pink
        elif 700 < x1 < 825:
            header = overlayList[2]
            drawColor = (0, 255, 255)  # yellow
        elif 980 < x1 < 1105:
            header = overlayList[1]
            drawColor = (0, 0, 0)      # eraser (black)
    cv2.rectangle(image, (x1-10, y1-15), (x2+10, y2+23), drawColor, cv2.FILLED)
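
The range check above can also be written as a small table-driven function, which makes it easier to add more colors later. This is a minimal sketch that reuses the pixel ranges, header indexes, and BGR colors from the snippet; the names HEADER_HEIGHT, OPTIONS, and select_option are hypothetical.

```python
HEADER_HEIGHT = 125

# Each entry: (x_min, x_max, header_index, BGR color)
OPTIONS = [
    (170, 295, 0, (255, 0, 0)),    # blue
    (436, 561, 3, (95, 0, 189)),   # pink
    (700, 825, 2, (0, 255, 255)),  # yellow
    (980, 1105, 1, (0, 0, 0)),     # eraser (black)
]


def select_option(x, y):
    """Return (header_index, color) for the option under (x, y), else None."""
    if y >= HEADER_HEIGHT:
        return None  # the fingertip is below the header
    for x_min, x_max, header_index, color in OPTIONS:
        if x_min < x < x_max:
            return header_index, color
    return None  # inside the header but between two icons
```

Calling select_option(x1, y1) with the index fingertip coordinates returns None when nothing is selected, so the caller can keep the current header and color in that case.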

# Adjust the thickness of the line using the index finger and thumb
selecting = [1, 1, 0, 0, 0] 
setting = [1, 1, 0, 0, 1]  
if all(fingers[i] == j for i, j in zip(range(0, 5), selecting)) or all(fingers[i] == j for i, j in zip(range(0, 5), setting)):
                    
    r = int(math.sqrt((x1-x3)**2 + (y1-y3)**2)/3)
                        
    # Getting the middle point between these two fingers
    x0, y0 = [(x1+x3)/2, (y1+y3)/2]
                        
    # Getting the vector that is orthogonal to the line formed between these fingers
    v1, v2 = [x1 - x3, y1 - y3]
    v1, v2 = [-v2, v1]
    #Normalizing it 
    mod_v = math.sqrt(v1**2 + v2**2) or 1.0  # avoid division by zero when both tips coincide
    v1, v2 = [v1/mod_v, v2/mod_v]
                        
    #Draw the circle that represents the draw thickness in (x0, y0)
    c = 3 + r
    x0, y0 = [int(x0 - v1*c), int(y0 - v2*c)]
    cv2.circle(image, (x0, y0), int(r/2), drawColor, -1)
    #Set the thickness when pinky finger is up
    if fingers[4]:                        
        thickness = r
        cv2.putText(image, 'Check', (x4-25, y4-8), cv2.FONT_HERSHEY_TRIPLEX, 0.8, (0,0,0), 1)
    xp, yp = [x1, y1]
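
The geometry in this snippet can be isolated into a pure function, which makes it easier to check by hand. The sketch below follows the same steps (distance divided by 3, midpoint, perpendicular unit vector, offset of 3 + r); the function name thickness_preview and its offset parameter are hypothetical, and the guard for coinciding fingertips is an addition.

```python
import math


def thickness_preview(index_tip, thumb_tip, offset=3):
    """Return (thickness, circle_center) for the thumb/index pinch gesture.

    thickness is one third of the distance between the two fingertips;
    circle_center is the midpoint shifted perpendicular to the line that
    joins them, so the preview circle does not sit on top of the fingers.
    """
    x1, y1 = index_tip
    x3, y3 = thumb_tip
    dist = math.hypot(x1 - x3, y1 - y3)
    r = int(dist / 3)
    mid_x, mid_y = (x1 + x3) / 2, (y1 + y3) / 2
    if dist == 0:
        # Fingertips coincide: no direction to offset along
        return r, (int(mid_x), int(mid_y))
    # Unit vector orthogonal to the thumb-to-index direction
    nx, ny = -(y1 - y3) / dist, (x1 - x3) / dist
    c = offset + r
    return r, (int(mid_x - nx * c), int(mid_y - ny * c))
```

For an index tip at (300, 100) and a thumb tip at (0, 100), the fingertips are 300 pixels apart, so the thickness is 100 and the circle center is the midpoint (150, 100) shifted 103 pixels upward, to (150, -3).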

import cv2
import mediapipe as mp
import numpy as np
import os
import math

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

#Take webcam input:
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FPS, 5)
width = 1280
height = 720
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

imgCanvas = np.zeros((height, width, 3), np.uint8)

#Get header images from the image directory
script_dir = os.path.dirname(os.path.abspath(__file__))
folderPath = os.path.join(script_dir, 'Images')

myList = sorted(os.listdir(folderPath))  # listdir() order is arbitrary; sort so indexes are stable
overlayList = []
for imPath in myList:
    image = cv2.imread(f'{folderPath}/{imPath}')
    overlayList.append(image)

#Default setting:
header = overlayList[0]
drawColor = (255, 0, 0)  # blue in BGR, matching the default header
thickness = 20
tipIds = [4, 8, 12, 16, 20] 
xp, yp = [0, 0]

with mp_hands.Hands(min_detection_confidence=0.85, min_tracking_confidence=0.5, max_num_hands=1) as hands:
    while cap.isOpened():
        success, image = cap.read()
        if not success:
            print("Could not read a frame from the camera; stopping.")
            break
            
        image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
        image.flags.writeable = False
        results = hands.process(image)

        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
        if results.multi_hand_landmarks:
            for hand_landmarks in results.multi_hand_landmarks:
                points = []
                for lm in hand_landmarks.landmark:
                    points.append([int(lm.x * width), int(lm.y * height)])

                #If hand is detected initialize coordinates
                if len(points) != 0:
                    x1, y1 = points[8]  # Index finger
                    x2, y2 = points[12] # Middle finger
                    x3, y3 = points[4]  # Thumb
                    x4, y4 = points[20] # Pinky

                    #Checking if a finger/thumb is up
                    fingers = []
                    if points[tipIds[0]][0] < points[tipIds[0] - 1][0]:
                        fingers.append(1)
                    else:
                        fingers.append(0)

                    for idx in range(1, 5):  # idx avoids shadowing the built-in id()
                        if points[tipIds[idx]][1] < points[tipIds[idx] - 2][1]:
                            fingers.append(1)
                        else:
                            fingers.append(0)

                    #Selection Mode - Two fingers are up
                    nonSel = [0, 3, 4] # indexes of the fingers that need to be down in the Selection Mode
                    if (fingers[1] and fingers[2]) and all(fingers[i] == 0 for i in nonSel):
                        xp, yp = [x1, y1]

                        # Selecting the colors and the eraser on the screen
                        if(y1 < 125):
                            if(170 < x1 < 295):
                                header = overlayList[0]
                                drawColor = (255 , 0, 0)
                            elif(436 < x1 < 561):
                                header = overlayList[3]
                                drawColor = (95, 0, 189)
                            elif(700 < x1 < 825):
                                header = overlayList[2]
                                drawColor = (0, 255, 255)
                            elif(980 < x1 < 1105):
                                header = overlayList[1]
                                drawColor = (0, 0, 0)

                        cv2.rectangle(image, (x1-10, y1-15), (x2+10, y2+23), drawColor, cv2.FILLED)

                    #Stand by Mode
                    nonStand = [0, 2, 3]
                    if (fingers[1] and fingers[4]) and all(fingers[i] == 0 for i in nonStand):
                        cv2.line(image, (xp, yp), (x4, y4), drawColor, 5) 
                        xp, yp = [x1, y1]

                    #Draw Mode
                    nonDraw = [0, 2, 3, 4]
                    if fingers[1] and all(fingers[i] == 0 for i in nonDraw):
                        
                        cv2.circle(image, (x1, y1), int(thickness/2), drawColor, cv2.FILLED) 
                        if xp==0 and yp==0:
                            xp, yp = [x1, y1]
                        cv2.line(imgCanvas, (xp, yp), (x1, y1), drawColor, thickness)
                        xp, yp = [x1, y1]

                    #Clear the canvas
                    if all(fingers[i] == 0 for i in range(0, 5)):
                        imgCanvas = np.zeros((height, width, 3), np.uint8)
                        xp, yp = [x1, y1]

                    #Adjust the thickness 
                    selecting = [1, 1, 0, 0, 0] 
                    setting = [1, 1, 0, 0, 1]  
                    if all(fingers[i] == j for i, j in zip(range(0, 5), selecting)) or all(fingers[i] == j for i, j in zip(range(0, 5), setting)):
                        r = int(math.sqrt((x1-x3)**2 + (y1-y3)**2)/3)
                        
                        # Get the middle point between fingers
                        x0, y0 = [(x1+x3)/2, (y1+y3)/2]
                        
                        # Getting the vector that is orthogonal
                        v1, v2 = [x1 - x3, y1 - y3]
                        v1, v2 = [-v2, v1]

                        #Normalizing it 
                        mod_v = math.sqrt(v1**2 + v2**2) or 1.0  # avoid division by zero
                        v1, v2 = [v1/mod_v, v2/mod_v]
                        
                        # Draw the circle that represents the draw thickness
                        c = 3 + r
                        x0, y0 = [int(x0 - v1*c), int(y0 - v2*c)]
                        cv2.circle(image, (x0, y0), int(r/2), drawColor, -1)

                        #Set the thickness
                        if fingers[4]:                        
                            thickness = r
                            cv2.putText(image, 'Check', (x4-25, y4-8), cv2.FONT_HERSHEY_TRIPLEX, 0.8, (0,0,0), 1)

                        xp, yp = [x1, y1]

        #Set the header
        header_height = 125
        header_width = width

        header = cv2.resize(header, (header_width, header_height))
        image[0:125, 0:width] = header

        #Camera image with the drawing made in imgCanvas
        imgGray = cv2.cvtColor(imgCanvas, cv2.COLOR_BGR2GRAY)
        _, imgInv = cv2.threshold(imgGray, 5, 255, cv2.THRESH_BINARY_INV)
        imgInv = cv2.cvtColor(imgInv, cv2.COLOR_GRAY2BGR)
        img = cv2.bitwise_and(image, imgInv)
        img = cv2.bitwise_or(img, imgCanvas)

        cv2.imshow('MediaPipe Hands', img)
        if cv2.waitKey(3) & 0xFF == 13:  # 13 is the Enter key
            break

cap.release()
cv2.destroyAllWindows()

Code for a virtual paint app.

Note: This code cannot be executed on this page. Copy it into a Python file on your machine and install the required imports to run it successfully.

Code explanation

This is the main file. It defines the application's appearance, the source of the video data, and the gesture checks.

Lines 1–8: Import all the necessary libraries and modules.
Lines 11–16: Use VideoCapture() to open the camera and specify the frame rate, width, and height of the video.
- Pass 0 as a parameter to open a webcam.
- Pass a file name or link inside quotation marks to open a pre-recorded video instead.
Line 18: Define the blank canvas image that holds the drawing and is later merged with the camera image.
Lines 21–28: Get the path of the folder that contains the header images, pass it to listdir(), and load each image in the folder into overlayList.

Lines 31–35: Define the default settings that apply as soon as the webcam opens, and initialize the list of landmark IDs for the fingertips along with the coordinates that change as the hand moves.
Line 37: Set the detection and tracking confidence thresholds and the maximum number of hands in the Hands() object.
Lines 38–42: Create a loop that runs while the webcam is open and read() each frame; if the read fails, print a message indicating that it was unsuccessful and exit the loop.

Lines 44–46: Flip the image horizontally, convert it from BGR to RGB, and store the result in the image variable, which is then processed by hands.process() and saved to results.
Line 48: Convert the processed image back from RGB to BGR using the cvtColor() method.
Lines 49–60: Check whether hand landmarks are detected in the frame, iterate through them to compute their pixel coordinates, append the coordinates to the points list, and initialize the coordinates of each fingertip.
Lines 63–73: Check whether the thumb is extended horizontally and whether each finger is extended vertically; append 1 to the fingers list if it is, else append 0.
Lines 76–95: Handle the selection mode, which selects a color or the eraser from the palette in the header.
Lines 98–101: Handle the standby mode, which breaks the stroke.
Lines 104–111: Handle the drawing mode, which draws on the canvas with the current color from the previous coordinates to the current ones.
Lines 114–116: Clear the screen.
Lines 119–145: Adjust the thickness of the stroke and dynamically change the size of the preview circle that is visible on the screen.
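
The finger-state check described for lines 63–73 can be sketched as a standalone function that takes the list of 21 landmark points in pixel coordinates. The function name fingers_up is hypothetical. Note that the thumb test compares x coordinates, so whether a raised thumb is reported as 1 depends on which hand is shown to the mirrored camera image.

```python
TIP_IDS = [4, 8, 12, 16, 20]  # thumb, index, middle, ring, pinky fingertips


def fingers_up(points):
    """Return five 0/1 flags, one per finger, from 21 [x, y] landmark points."""
    fingers = []
    # Thumb: raised if the tip is to the left of the joint just below it
    thumb_tip = TIP_IDS[0]
    fingers.append(1 if points[thumb_tip][0] < points[thumb_tip - 1][0] else 0)
    # Other fingers: raised if the tip is above the joint two landmarks below.
    # Image y grows downward, so "above" means a smaller y value.
    for tip in TIP_IDS[1:]:
        fingers.append(1 if points[tip][1] < points[tip - 2][1] else 0)
    return fingers
```

The returned list uses the same order as the fingers list in the example code: thumb, index, middle, ring, pinky.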

Lines 148–152: Resize the header image to fit the width of the camera frame that is displayed to the user, and place it at the top of the frame as a header.
Lines 155–159: Take the content that is drawn on imgCanvas, create its binary mask, and overlay it on the live camera frame.
Lines 161–166: Display the application using imshow() and close the application when the Enter key is pressed, then release the webcam and destroy the window.
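
The masking step in lines 155–159 can be illustrated with NumPy alone. This is a simplified sketch, not the code used in the application: it treats a canvas pixel as drawn when its brightest channel exceeds the threshold, whereas the application thresholds a grayscale conversion and combines the images with bitwise operations. The function name overlay_canvas is hypothetical.

```python
import numpy as np


def overlay_canvas(frame, canvas, threshold=5):
    """Return a copy of `frame` with every drawn canvas pixel copied on top.

    A canvas pixel counts as drawn when any of its channels is brighter
    than `threshold`; black pixels are treated as transparent.
    """
    drawn = canvas.max(axis=2) > threshold  # boolean mask, shape (H, W)
    out = frame.copy()
    out[drawn] = canvas[drawn]
    return out
```

Because black pixels are treated as transparent, drawing with black on the canvas reveals the camera frame again, which is why the eraser can be implemented as a black brush.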

Code output

This video shows a working virtual paint app with all the features integrated and a dynamically changing header that contains the color palette and an eraser.
