MediaPipe is an open-source framework developed by Google that provides developers with pre-built, customizable AI solutions. Its pre-trained models let developers build applications that process video or audio data in real time.
The virtual paint application lets users create doodles and other visuals with finger gestures, without a physical surface to write on or a tool to write with. It is designed for people who enjoy painting as a hobby or as a quick escape from their routine. The canvas is reusable: the user can clear the screen and start a new drawing, which saves storage.
The color palette is a set of color options from which the user chooses the brush shade. Any number of shades can be provided; for now, we offer three: blue, pink, and yellow. There is also an eraser that lets the user remove strokes that are not required.
The purpose of providing a color palette along with an eraser is to give users a sense of control and liberty to be creative with their drawings.
We use a set of images, each highlighting one of the icons in the header, and save them in a folder in a fixed order. When the user makes a selection, we replace the header overlay with the image for the selected option. For example, if the user selects the blue color, the default header is replaced by the image in which the blue color is highlighted.
Note: These are four different .png images saved in the Images directory.
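Because the code refers to the header images by list index, the order in which the files are read matters. The sketch below shows one way to keep that order predictable. The file names 1.png to 4.png and the helper name header_files are hypothetical and are not part of the original code.

```python
def header_files(names):
    """Return the .png file names in a stable, sorted order.

    `names` stands in for the result of os.listdir('Images'), which
    returns entries in arbitrary order. The file names are hypothetical.
    """
    return sorted(n for n in names if n.lower().endswith(".png"))


listing = ["3.png", "1.png", "notes.txt", "4.png", "2.png"]
ordered = header_files(listing)
print(ordered)

# Index 0 is now always the first file by name, so it can act as the default header.
default_header_name = ordered[0]
```

With a sorted list, an index such as overlayList[0] refers to the same image on every run and on every operating system.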
The virtual paint application detects the live hand movement via a web camera and displays the drawn design on the screen. It offers different features that can be accessed through different finger movements. Let's take a look at them one by one.
Use the index finger and the middle finger to select a color from the palette. We set a horizontal range for each shade; when the two fingers are detected, the code checks which range the index fingertip falls in and selects the corresponding color from the header.
The following code snippet implements this functionality. We check whether the index and middle fingers are up, then use the if conditions to set the header image and brush color corresponding to the range. A rectangle appears between the two fingers to indicate that the selection is taking place.
nonSel = [0, 3, 4]  # indexes of the fingers that need to be down in the Selection Mode
if (fingers[1] and fingers[2]) and all(fingers[i] == 0 for i in nonSel):
    xp, yp = [x1, y1]

    # Selecting the colors and the eraser on the screen
    if y1 < 125:
        if 170 < x1 < 295:
            header = overlayList[0]
            drawColor = (255, 0, 0)
        elif 436 < x1 < 561:
            header = overlayList[3]
            drawColor = (95, 0, 189)
        elif 700 < x1 < 825:
            header = overlayList[2]
            drawColor = (0, 255, 255)
        elif 980 < x1 < 1105:
            header = overlayList[1]
            drawColor = (0, 0, 0)

    cv2.rectangle(image, (x1-10, y1-15), (x2+10, y2+23), drawColor, cv2.FILLED)
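One possible way to make the palette easier to extend is to keep the ranges in a table instead of a chain of conditions. The following is a minimal sketch of that idea, not the original implementation: the ranges, header indexes, and BGR colors are copied from the snippet above, while the names OPTIONS, HEADER_HEIGHT, and pick_option are assumptions made for this illustration.

```python
HEADER_HEIGHT = 125  # height of the header strip in pixels, as in the snippet

# (x_min, x_max, header_index, BGR color) for each icon in the header.
# The table layout is an assumption made for this sketch.
OPTIONS = [
    (170, 295, 0, (255, 0, 0)),     # blue
    (436, 561, 3, (95, 0, 189)),    # pink
    (700, 825, 2, (0, 255, 255)),   # yellow
    (980, 1105, 1, (0, 0, 0)),      # eraser (black)
]


def pick_option(x, y):
    """Return (header_index, color) for a fingertip at (x, y), or None."""
    if y >= HEADER_HEIGHT:
        return None  # the fingertip is below the header
    for x_min, x_max, header_index, color in OPTIONS:
        if x_min < x < x_max:
            return header_index, color
    return None  # inside the header, but between two icons


print(pick_option(200, 50))    # fingertip over the first icon
print(pick_option(1000, 100))  # fingertip over the eraser
print(pick_option(200, 300))   # fingertip below the header
```

With this layout, adding a color would mean adding one row to the table and one more highlighted header image.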
This feature is implemented in the same way as the color selection method. To erase strokes, select the eraser icon at the far right of the header and draw over the stroke that is to be removed. The eraser draws in black, which the application treats as transparent when it merges the canvas with the camera image, so the erased area appears clear again.
The following elif branch from the color selection code snippet implements this functionality. The header image is updated, and the color is set to black.
elif 980 < x1 < 1105:
    header = overlayList[1]
    drawColor = (0, 0, 0)
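Black works as an eraser because of how the complete program merges the drawing canvas with the camera frame: canvas pixels that are black, or nearly black, count as empty, so the camera image shows through them. The following is a minimal NumPy sketch of that idea on a tiny 1×3 image. It uses a boolean mask as a simplified stand-in for the cv2.threshold() and bitwise calls in the application, and the function name merge is hypothetical.

```python
import numpy as np


def merge(frame, canvas, threshold=5):
    """Overlay the drawn canvas pixels on the camera frame (both BGR, uint8)."""
    b, g, r = canvas[..., 0], canvas[..., 1], canvas[..., 2]
    # Approximate grayscale brightness of each canvas pixel.
    gray = 0.114 * b + 0.587 * g + 0.299 * r
    drawn = gray > threshold  # True where a visible stroke exists
    # Keep the stroke where something is drawn; otherwise show the camera frame.
    return np.where(drawn[..., None], canvas, frame)


frame = np.full((1, 3, 3), 100, np.uint8)   # stand-in for a gray camera frame
canvas = np.zeros((1, 3, 3), np.uint8)      # empty (black) canvas
canvas[0, 0] = (255, 0, 0)                  # a blue stroke pixel
canvas[0, 1] = (0, 255, 255)                # a yellow stroke pixel
canvas[0, 1] = (0, 0, 0)                    # "erased" by drawing black over it

merged = merge(frame, canvas)
print(merged[0].tolist())
```

The blue pixel survives the merge, while the erased pixel and the untouched pixel both show the camera frame again.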
Use the index finger and the thumb to change the thickness of the brush. The thickness increases as the gap between the finger and thumb is increased and decreases as the gap between the finger and the thumb decreases. Once the desired size is obtained, use the pinky finger to select and set that size.
The following code snippet implements this functionality. We check whether the index finger and the thumb are up, calculate the distance between the two fingertips, and find their midpoint. A circle of the corresponding size is drawn on the screen to show the thickness. If the pinky finger is also up, the current size is set as the thickness and a "Check" message indicates that the size has been recorded.
# Adjust the thickness of the line using the index finger and thumb
selecting = [1, 1, 0, 0, 0]
setting = [1, 1, 0, 0, 1]
if all(fingers[i] == j for i, j in zip(range(0, 5), selecting)) or all(fingers[i] == j for i, j in zip(range(0, 5), setting)):
    r = int(math.sqrt((x1-x3)**2 + (y1-y3)**2)/3)

    # Getting the middle point between these two fingers
    x0, y0 = [(x1+x3)/2, (y1+y3)/2]

    # Getting the vector that is orthogonal to the line formed between these fingers
    v1, v2 = [x1 - x3, y1 - y3]
    v1, v2 = [-v2, v1]

    # Normalizing it
    mod_v = math.sqrt(v1**2 + v2**2)
    v1, v2 = [v1/mod_v, v2/mod_v]

    # Draw the circle that represents the draw thickness in (x0, y0)
    c = 3 + r
    x0, y0 = [int(x0 - v1*c), int(y0 - v2*c)]
    cv2.circle(image, (x0, y0), int(r/2), drawColor, -1)

    # Set the thickness when pinky finger is up
    if fingers[4]:
        thickness = r
        cv2.putText(image, 'Check', (x4-25, y4-8), cv2.FONT_HERSHEY_TRIPLEX, 0.8, (0,0,0), 1)

    xp, yp = [x1, y1]
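The geometry in this snippet can be tried out without a camera. The sketch below repeats the same arithmetic as a plain function, assuming fingertip positions are given as (x, y) pixel tuples. The function name thickness_preview and the guard for coinciding fingertips are additions made for this illustration.

```python
import math


def thickness_preview(index_tip, thumb_tip):
    """Return (r, preview_center) for the two fingertip positions.

    Mirrors the arithmetic in the snippet above. The zero-distance guard
    is an addition for this sketch.
    """
    (x1, y1), (x3, y3) = index_tip, thumb_tip
    distance = math.hypot(x1 - x3, y1 - y3)
    r = int(distance / 3)
    if distance == 0:
        return r, (x1, y1)  # fingertips coincide, so there is no direction to offset along

    # Midpoint between the two fingertips.
    mx, my = (x1 + x3) / 2, (y1 + y3) / 2
    # Unit vector perpendicular to the line joining the fingertips.
    nx, ny = -(y1 - y3) / distance, (x1 - x3) / distance
    # Shift the preview circle off the line between the fingertips.
    offset = 3 + r
    return r, (int(mx - nx * offset), int(my - ny * offset))


# Index fingertip 60 pixels directly above the thumb tip.
print(thickness_preview((100, 100), (100, 160)))  # (20, (77, 130))
```

Here the fingertips are 60 pixels apart, so r is 20, and the preview circle is centered 23 pixels to the side of the midpoint (100, 130).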
Use the index finger and the pinky finger to break the stroke while drawing. This leaves a gap between the drawn objects and makes the visual clearer. Hold the two fingers in front of the webcam until a thin line appears between them.
The following code snippet implements this functionality. We check whether the index and pinky fingers are up, draw a line between the two fingers, and record the current position to show that the application is in standby mode and no stroke is being drawn.
# Stand by Mode
nonStand = [0, 2, 3]
if (fingers[1] and fingers[4]) and all(fingers[i] == 0 for i in nonStand):
    cv2.line(image, (xp, yp), (x4, y4), drawColor, 5)
    xp, yp = [x1, y1]
Close all the fingers and thumb to form a fist or move the hand away from the camera proximity to clear the screen.
The following code snippet implements this functionality. We check whether all the fingers and the thumb are down; if so, the code creates a blank canvas that replaces the one visible to the user. The coordinates are also updated to the current index finger coordinates.
# Clear the canvas when no fingers visible
if all(fingers[i] == 0 for i in range(0, 5)):
    imgCanvas = np.zeros((height, width, 3), np.uint8)
    xp, yp = [x1, y1]
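The gestures can be summarized as patterns over the fingers list, ordered as thumb, index, middle, ring, and pinky. The lookup table below is a small sketch of that summary, not how the application itself is written. The names GESTURES and classify are hypothetical, and the drawing pattern (index finger only) is taken from the complete program.

```python
# Finger order in the list: [thumb, index, middle, ring, pinky]; 1 = up, 0 = down.
GESTURES = {
    (0, 1, 1, 0, 0): "select",     # index + middle: pick a color or the eraser
    (0, 1, 0, 0, 1): "standby",    # index + pinky: break the stroke
    (0, 1, 0, 0, 0): "draw",       # index only: draw on the canvas
    (0, 0, 0, 0, 0): "clear",      # fist: clear the canvas
    (1, 1, 0, 0, 0): "thickness",  # thumb + index: preview the brush size
    (1, 1, 0, 0, 1): "thickness",  # thumb + index + pinky: confirm the brush size
}


def classify(fingers):
    """Return the gesture name for a fingers list, or "none" if nothing matches."""
    return GESTURES.get(tuple(fingers), "none")


print(classify([0, 1, 1, 0, 0]))
print(classify([1, 1, 0, 0, 1]))
print(classify([1, 1, 1, 1, 1]))
```

Every pattern is distinct, so at most one mode is active for any hand pose.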
Let's combine all these code snippets to integrate the mentioned features into the virtual paint application.
To implement a virtual paint application in code, we first need to import the following libraries and modules.
import cv2
import numpy as np
import os
import mediapipe as mp
import math

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands
cv2: The OpenCV library, used for computer vision tasks.
numpy: Used for numerical computations.
os: Used to interact with the operating system and perform file operations.
mediapipe: Used to build computer vision and AI applications.
math: Used to access mathematical functions.
In this code, we implement an application that allows the user to select the color and thickness of the stroke and paint on the live canvas through gestures.
import cv2
import mediapipe as mp
import numpy as np
import os
import math

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

#Take webcam input:
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FPS, 5)
width = 1280
height = 720
cap.set(3, width)
cap.set(4, height)

imgCanvas = np.zeros((height, width, 3), np.uint8)

#Get header images from the image directory
script_dir = os.path.dirname(os.path.abspath(__file__))
folderPath = os.path.join(script_dir, 'Images')
myList = sorted(os.listdir(folderPath))  # sorted so the overlayList indexes are stable
overlayList = []

for imPath in myList:
    image = cv2.imread(f'{folderPath}/{imPath}')
    overlayList.append(image)

#Default setting:
header = overlayList[0]
drawColor = (0, 0, 255)
thickness = 20
tipIds = [4, 8, 12, 16, 20]
xp, yp = [0, 0]

with mp_hands.Hands(min_detection_confidence=0.85, min_tracking_confidence=0.5, max_num_hands=1) as hands:
    while cap.isOpened():
        success, image = cap.read()
        if not success:
            print("Ignoring empty camera frame.")
            break

        image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
        image.flags.writeable = False
        results = hands.process(image)

        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
        if results.multi_hand_landmarks:
            for hand_landmarks in results.multi_hand_landmarks:
                points = []
                for lm in hand_landmarks.landmark:
                    points.append([int(lm.x * width), int(lm.y * height)])

                #If hand is detected initialize coordinates
                if len(points) != 0:
                    x1, y1 = points[8]   # Index finger
                    x2, y2 = points[12]  # Middle finger
                    x3, y3 = points[4]   # Thumb
                    x4, y4 = points[20]  # Pinky

                    #Checking if a finger/thumb is up
                    fingers = []
                    if points[tipIds[0]][0] < points[tipIds[0] - 1][0]:
                        fingers.append(1)
                    else:
                        fingers.append(0)

                    for id in range(1, 5):
                        if points[tipIds[id]][1] < points[tipIds[id] - 2][1]:
                            fingers.append(1)
                        else:
                            fingers.append(0)

                    #Selection Mode - Two fingers are up
                    nonSel = [0, 3, 4]  # indexes of the fingers that need to be down in the Selection Mode
                    if (fingers[1] and fingers[2]) and all(fingers[i] == 0 for i in nonSel):
                        xp, yp = [x1, y1]

                        # Selecting the colors and the eraser on the screen
                        if y1 < 125:
                            if 170 < x1 < 295:
                                header = overlayList[0]
                                drawColor = (255, 0, 0)
                            elif 436 < x1 < 561:
                                header = overlayList[3]
                                drawColor = (95, 0, 189)
                            elif 700 < x1 < 825:
                                header = overlayList[2]
                                drawColor = (0, 255, 255)
                            elif 980 < x1 < 1105:
                                header = overlayList[1]
                                drawColor = (0, 0, 0)

                        cv2.rectangle(image, (x1-10, y1-15), (x2+10, y2+23), drawColor, cv2.FILLED)

                    #Stand by Mode
                    nonStand = [0, 2, 3]
                    if (fingers[1] and fingers[4]) and all(fingers[i] == 0 for i in nonStand):
                        cv2.line(image, (xp, yp), (x4, y4), drawColor, 5)
                        xp, yp = [x1, y1]

                    #Draw Mode
                    nonDraw = [0, 2, 3, 4]
                    if fingers[1] and all(fingers[i] == 0 for i in nonDraw):
                        cv2.circle(image, (x1, y1), int(thickness/2), drawColor, cv2.FILLED)
                        if xp == 0 and yp == 0:
                            xp, yp = [x1, y1]
                        cv2.line(imgCanvas, (xp, yp), (x1, y1), drawColor, thickness)
                        xp, yp = [x1, y1]

                    #Clear the canvas
                    if all(fingers[i] == 0 for i in range(0, 5)):
                        imgCanvas = np.zeros((height, width, 3), np.uint8)
                        xp, yp = [x1, y1]

                    #Adjust the thickness
                    selecting = [1, 1, 0, 0, 0]
                    setting = [1, 1, 0, 0, 1]
                    if all(fingers[i] == j for i, j in zip(range(0, 5), selecting)) or all(fingers[i] == j for i, j in zip(range(0, 5), setting)):
                        r = int(math.sqrt((x1-x3)**2 + (y1-y3)**2)/3)

                        # Get the middle point between fingers
                        x0, y0 = [(x1+x3)/2, (y1+y3)/2]

                        # Getting the vector that is orthogonal
                        v1, v2 = [x1 - x3, y1 - y3]
                        v1, v2 = [-v2, v1]

                        #Normalizing it
                        mod_v = math.sqrt(v1**2 + v2**2)
                        v1, v2 = [v1/mod_v, v2/mod_v]

                        # Draw the circle that represents the draw thickness
                        c = 3 + r
                        x0, y0 = [int(x0 - v1*c), int(y0 - v2*c)]
                        cv2.circle(image, (x0, y0), int(r/2), drawColor, -1)

                        #Set the thickness
                        if fingers[4]:
                            thickness = r
                            cv2.putText(image, 'Check', (x4-25, y4-8), cv2.FONT_HERSHEY_TRIPLEX, 0.8, (0,0,0), 1)

                        xp, yp = [x1, y1]

        #Set the header
        header_height = 125
        header_width = width
        header = cv2.resize(header, (header_width, header_height))
        image[0:125, 0:width] = header

        #Camera image with the drawing made in imgCanvas
        imgGray = cv2.cvtColor(imgCanvas, cv2.COLOR_BGR2GRAY)
        _, imgInv = cv2.threshold(imgGray, 5, 255, cv2.THRESH_BINARY_INV)
        imgInv = cv2.cvtColor(imgInv, cv2.COLOR_GRAY2BGR)
        img = cv2.bitwise_and(image, imgInv)
        img = cv2.bitwise_or(img, imgCanvas)

        cv2.imshow('MediaPipe Hands', img)
        if cv2.waitKey(3) == 13:
            break

cap.release()
cv2.destroyAllWindows()
Note: This code cannot be executed on this page. To run it, copy it into a Python file on your machine and make sure the imported libraries are installed.
This is the main file. It defines the application's appearance, the input source (the webcam), and the gesture checks.
Lines 1–8: Import all the necessary libraries and modules.
Lines 11–16: Use VideoCapture() to open the camera and specify the video resolution (width and height).
Pass 0 as a parameter to open a webcam.
Pass the filename or file link, inside "", to open a pre-recorded video.
Line 18: Define the image that holds the drawing and is later merged with the camera image.
Lines 21–28: Get the path of the folder that contains all the header images and pass it to listdir() so the images in the folder can be saved in overlayList.
Lines 31–35: Define the default settings that apply as soon as the webcam opens, and initialize a list of the landmark IDs that correspond to the fingertips, along with the coordinates that change with the hand movement.
Line 37: Set the detection and tracking confidence thresholds as well as the maximum number of hands in the Hands() object.
Lines 38–42: Create a loop that runs while the webcam is open and read() each frame; if the read fails, print a message indicating that the action was unsuccessful.
Lines 44–46: Flip the image horizontally, convert it from BGR to RGB, and store the result in the image variable, which is then processed, with the output saved to results.
Line 48: Convert the processed image back from RGB to BGR using the cvtColor() method.
Lines 49–60: Check if hand landmarks are detected in the frame, iterate through them to compute their coordinates, append these to the points list, and initialize the coordinates for each finger.
Lines 63–73: Check whether the thumb is extended horizontally and whether each finger is extended vertically; append 1 to the fingers list if so, otherwise append 0.
Lines 81–95: Add the code snippet for selecting a color or the eraser from the palette in the header.
Lines 98–101: Add the code snippet for breaking the stroke by adding a standby mode.
Lines 104–111: Add the code snippet for activating the drawing mode by setting the color and coordinates.
Lines 114–116: Add the code snippet for clearing the screen.
Lines 119–145: Add the code snippet for adjusting the thickness of the stroke and dynamically changing the size of the circle that is visible on the screen.
Lines 148–152: Resize the header image to fit the width of the camera frame that is displayed to the user, and place it at the top of the screen as a header.
Lines 155–159: Take the content drawn on imgCanvas, create its binary mask, and display it on the live camera image.
Lines 161–166: Display the application using imshow(), and close the application when the Enter key is pressed, which turns off the webcam and destroys the window.
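The landmark conversion and the finger check described above can be tried without a webcam. The sketch below feeds the same comparisons a synthetic hand, assuming 21 landmarks with normalized x and y values in the range 0 to 1, as MediaPipe provides. The helper names to_pixels and fingers_up are hypothetical, and plain tuples stand in for MediaPipe's landmark objects.

```python
TIP_IDS = [4, 8, 12, 16, 20]  # thumb, index, middle, ring, and pinky fingertips


def to_pixels(landmarks, width, height):
    """Convert normalized (x, y) landmarks to pixel coordinates."""
    return [[int(x * width), int(y * height)] for x, y in landmarks]


def fingers_up(points):
    """Return [thumb, index, middle, ring, pinky] flags (1 = up, 0 = down)."""
    # Thumb: compare the x coordinate of the tip with the joint just before it.
    fingers = [1 if points[TIP_IDS[0]][0] < points[TIP_IDS[0] - 1][0] else 0]
    # Other fingers: the tip counts as up when it is higher in the image
    # (smaller y) than the joint two landmarks before it.
    for tip in TIP_IDS[1:]:
        fingers.append(1 if points[tip][1] < points[tip - 2][1] else 0)
    return fingers


# Synthetic hand: every landmark at the frame center, then raise the index finger.
landmarks = [(0.5, 0.5)] * 21
landmarks[6] = (0.5, 0.4)  # index finger middle joint
landmarks[8] = (0.5, 0.2)  # index fingertip, above the joint

points = to_pixels(landmarks, 1280, 720)
print(fingers_up(points))  # [0, 1, 0, 0, 0]
```

Only the index finger is reported as up, which is the pattern the application uses for drawing.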
This video shows a working virtual paint app with all the features integrated and a dynamically changing header that contains the color palette and an eraser.
Note: To learn more about MediaPipe, see the Answer on building a finger counter using MediaPipe.