A face mesh is a 3D model representation of a person's face, built from a collection of interconnected vertices and edges that define the structure of the face in three-dimensional space. How amazing is that?
The vertices represent specific points on the face, such as the corners of the eyes, nose, mouth, and other facial landmarks. Edges connect these vertices, and polygons form the surface of the 3D model.
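To make this concrete, here's a minimal sketch in plain Python of how a tiny patch of a mesh can be represented as vertices, edges, and triangular polygons. The coordinates and indices are made up for illustration; a real MediaPipe face mesh uses hundreds of landmarks.

```python
# A tiny, hypothetical patch of a face mesh: 4 vertices in 3D space.
vertices = [
    (0.0, 0.0, 0.0),  # e.g. a corner of the eye
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (1.0, 1.0, 0.1),  # slight depth makes the patch three-dimensional
]

# Polygons (here, triangles) reference vertices by index;
# two triangles together form a quad-shaped patch of surface.
triangles = [(0, 1, 2), (1, 3, 2)]

# Edges are derived from the triangles: each triangle contributes
# three edges, and shared edges are stored only once.
edges = set()
for a, b, c in triangles:
    for u, v in ((a, b), (b, c), (c, a)):
        edges.add((min(u, v), max(u, v)))

print(len(vertices), len(edges), len(triangles))  # 4 5 2
```

Scaling this idea up to hundreds of landmark vertices gives the full surface of a 3D face model.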
For media processing and face mesh creation, we will be incorporating the two libraries below.
OpenCV is a highly efficient open-source computer vision and machine learning library. We can process images and videos, handle real-time computer vision tasks, perform feature detection, object recognition, and many other computer vision tasks using this library.
Mediapipe is a library developed by Google that focuses on building pipelines for various media processing tasks including computer vision. It simplifies our work by providing pre-built solutions for tasks like face detection, hand tracking, pose estimation, face mesh generation, and even more.
Let's start with images initially. The following section gives an easy-to-grasp walkthrough on how to set up a mesh for images.
```python
import cv2
import mediapipe as mp
```
First and foremost, we import the required libraries in order to execute the code: `cv2` for image processing and `mediapipe` for the face mesh model.
```python
def detectFaceLandmarks(imagePath):
    mpFaceMesh = mp.solutions.face_mesh.FaceMesh(
        static_image_mode=True,
        max_num_faces=1,
        min_detection_confidence=0.7,
        min_tracking_confidence=0.7,
    )
    frame = cv2.imread(imagePath)
    frameRGB = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = mpFaceMesh.process(frameRGB)
    return frame, results
```
We define `detectFaceLandmarks`, which takes an `imagePath` as input and returns the original `frame` along with the `results` obtained from the FaceMesh model.

It creates an instance of the `mp.solutions.face_mesh.FaceMesh` class with customized parameters to perform face landmark detection. We pass `static_image_mode` as `True` since we're dealing with a single static image.

The image is read from the provided `imagePath` through `cv2.imread` and converted from BGR to RGB using `cv2.cvtColor`, since MediaPipe expects RGB input.

Finally, we apply the face mesh model to our image using `mpFaceMesh.process`, and the face landmarks are stored in the `results` variable.
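As a side note, the BGR-to-RGB conversion simply reverses the channel order of each pixel. The following sketch shows the equivalent operation with NumPy (which OpenCV uses for its image arrays); the tiny two-pixel "image" here is made up for illustration:

```python
import numpy as np

# A hypothetical 1x2 "image" with BGR pixels: pure blue and pure red.
bgr = np.array([[[255, 0, 0],    # blue in BGR order
                 [0, 0, 255]]],  # red in BGR order
               dtype=np.uint8)

# Reversing the last (channel) axis reorders B,G,R into R,G,B,
# which matches what cv2.COLOR_BGR2RGB does for 3-channel images.
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255] -> the blue pixel in RGB order
```

Using `cv2.cvtColor` is still preferable in practice, since it is explicit about the conversion and handles other color spaces too.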
```python
def displayFaceLandmarks(frame, results):
    frameHeight, frameWidth, _ = frame.shape
    if results.multi_face_landmarks:
        for faceLandmarks in results.multi_face_landmarks:
            for landmark in faceLandmarks.landmark:
                x, y = int(landmark.x * frameWidth), int(landmark.y * frameHeight)
                cv2.circle(frame, (x, y), 1, (0, 255, 0), -1)
    cv2.imshow('3D face mesh', frame)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
```
Next up, our `displayFaceLandmarks` method takes the `frame` and `results` as input and displays the image along with the face landmarks.

It extracts the height and width of the `frame` using `frame.shape` so that we can map the normalized face landmarks to pixel coordinates.

The function checks whether any face landmarks were detected using `results.multi_face_landmarks`. If landmarks are detected, we draw a green circle at each landmark point on our initial image using `cv2.circle`.

The processed image is displayed in a window titled '3D face mesh' using `cv2.imshow`.

Note: The window will close when any key is pressed (`cv2.waitKey(0)`), and all OpenCV windows will be closed by `cv2.destroyAllWindows()`.
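The landmark-to-pixel mapping deserves a closer look: MediaPipe reports each landmark's `x` and `y` as fractions of the frame size in the range [0.0, 1.0], so scaling by the width and height recovers pixel positions. A minimal sketch (the function name is ours, not MediaPipe's):

```python
def landmarkToPixel(normX, normY, frameWidth, frameHeight):
    # MediaPipe landmark coordinates are normalized to [0.0, 1.0],
    # so multiplying by the frame dimensions yields pixel positions.
    return int(normX * frameWidth), int(normY * frameHeight)

# A landmark at the center of a 1280x720 frame maps to (640, 360).
print(landmarkToPixel(0.5, 0.5, 1280, 720))  # (640, 360)
```

This is exactly the computation performed inline inside `displayFaceLandmarks`.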
```python
if __name__ == "__main__":
    imagePath = 'sample.png'
    frame, results = detectFaceLandmarks(imagePath)
    displayFaceLandmarks(frame, results)
```
In the `__main__` block, which runs when the script is executed, we specify the path to our image and see our face mesh in action!
Congratulations, we just made our first 3D face mesh! Experiment with the code below and click on "Run" to see it in action.
```python
import cv2
import mediapipe as mp

def detectFaceLandmarks(imagePath):
    mpFaceMesh = mp.solutions.face_mesh.FaceMesh(
        static_image_mode=True,
        max_num_faces=1,
        min_detection_confidence=0.7,
        min_tracking_confidence=0.7,
    )
    frame = cv2.imread(imagePath)
    frameRGB = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = mpFaceMesh.process(frameRGB)
    return frame, results

def displayFaceLandmarks(frame, results):
    frameHeight, frameWidth, _ = frame.shape
    if results.multi_face_landmarks:
        for faceLandmarks in results.multi_face_landmarks:
            for landmark in faceLandmarks.landmark:
                x, y = int(landmark.x * frameWidth), int(landmark.y * frameHeight)
                cv2.circle(frame, (x, y), 1, (0, 255, 0), -1)
    cv2.imshow('3D face mesh', frame)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

if __name__ == "__main__":
    imagePath = 'sample.png'
    frame, results = detectFaceLandmarks(imagePath)
    displayFaceLandmarks(frame, results)
```
We're now ready to process videos as well, so let's dive straight into it!
```python
import cv2
import mediapipe as mp
```
Similar to the last example, we make our imports first.
```python
def drawFaceMesh(frame, faceLandmarks):
    mp_drawing.draw_landmarks(
        image=frame,
        landmark_list=faceLandmarks,
        connections=mp_face_mesh.FACEMESH_TESSELATION,
        landmark_drawing_spec=None,
        connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_tesselation_style()
    )
    mp_drawing.draw_landmarks(
        image=frame,
        landmark_list=faceLandmarks,
        connections=mp_face_mesh.FACEMESH_CONTOURS,
        landmark_drawing_spec=None,
        connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_contours_style()
    )
    mp_drawing.draw_landmarks(
        image=frame,
        landmark_list=faceLandmarks,
        connections=mp_face_mesh.FACEMESH_IRISES,
        landmark_drawing_spec=None,
        connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_iris_connections_style()
    )
```
In this block, we define a method called `drawFaceMesh` that takes two parameters: `frame` and `faceLandmarks`, i.e., the face landmarks detected by `mediapipe`.

For this purpose, we use `mp_drawing.draw_landmarks` from the `drawing_utils` module to draw the face mesh annotations on our input frame.

We draw the three annotations given below:

- Tessellation
- Contours
- Irises
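Under the hood, each of these annotation sets (such as `FACEMESH_CONTOURS`) is a collection of `(start, end)` landmark-index pairs, and the drawing utility walks those pairs to draw line segments. Here's a sketch of that idea with a made-up connection set and mock pixel positions (the function and data are ours, not MediaPipe's):

```python
# A hypothetical connection set: three (start, end) landmark-index pairs
# forming a triangle, plus mock pixel positions for each landmark index.
connections = {(0, 1), (1, 2), (2, 0)}
landmarkPixels = {0: (10, 10), 1: (50, 10), 2: (30, 40)}

def drawConnections(connections, points):
    segments = []
    for start, end in sorted(connections):
        # A real renderer would call something like
        # cv2.line(frame, points[start], points[end], color, thickness)
        segments.append((points[start], points[end]))
    return segments

segments = drawConnections(connections, landmarkPixels)
print(len(segments))  # 3 line segments for 3 connections
```

MediaPipe's real connection sets contain hundreds of such pairs, which is why the tessellation covers the whole face with fine triangles.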
```python
def main():
    videoCapture = cv2.VideoCapture('https://player.vimeo.com/external/373966277.sd.mp4?s=bc69e79a8007eb5682e9e72a415a2142173228f6&profile_id=164&oauth2_token_id=57447761')
    videoCapture.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    videoCapture.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
    frameRate = 30
    with mp_face_mesh.FaceMesh(
        max_num_faces=1,
        refine_landmarks=True,
        min_detection_confidence=0.7,
        min_tracking_confidence=0.7,
    ) as faceMesh:
        while videoCapture.isOpened():
            isSuccess, frame = videoCapture.read()
            if not isSuccess:
                break
            frame.flags.writeable = False
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            results = faceMesh.process(frame)
            frame.flags.writeable = True
            frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
            if results.multi_face_landmarks:
                for faceLandmarks in results.multi_face_landmarks:
                    drawFaceMesh(frame, faceLandmarks)
            cv2.imshow('3D video face mesh', cv2.flip(frame, 1))
            if cv2.waitKey(int(1000 / frameRate)) & 0xFF == 27:
                break
    videoCapture.release()
    cv2.destroyAllWindows()
```
Our `main` method handles all the core logic, so let's get straight into its functionality!

- Create `videoCapture` to read frames from the given URL (changeable).
- Set the desired width/height of the video frames.
- Initialize the FaceMesh model using `mp_face_mesh.FaceMesh`, customizing `max_num_faces`, `min_detection_confidence`, and `min_tracking_confidence`.
- Start a loop that processes each frame of the video. Each iteration follows this flow:
  - Read a frame from the video capture.
  - Convert the frame from BGR to RGB format, since `mediapipe` requires RGB input.
  - Process the frame using the FaceMesh model to get the `results`, i.e., the landmark detections.
  - Convert the frame back to BGR format for display.
  - Check for face landmarks in the current frame using `results.multi_face_landmarks`.
  - If landmarks were detected, call the `drawFaceMesh` function to draw them on the frame.
  - Show the processed frame with face landmarks in our window, `'3D video face mesh'`.
- Release the video capture object and close all windows after the loop.
Note: The loop will exit when the Esc key (key code 27) is pressed. Masking the return value of `cv2.waitKey` with `0xFF` keeps only its low byte, which holds the key code.
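The key-handling and timing arithmetic in the loop can be illustrated in isolation (the helper name and sample values below are ours, for demonstration only):

```python
ESC = 27  # ASCII code of the Escape key

def isEscPressed(waitKeyReturn):
    # cv2.waitKey can return a value with extra high bits set on some
    # platforms; masking with 0xFF keeps only the low byte (the key code).
    return (waitKeyReturn & 0xFF) == ESC

print(isEscPressed(27))        # True: plain Esc code
print(isEscPressed(0x10001B))  # True: Esc with extra high bits set
print(isEscPressed(ord('q')))  # False: a different key

# The per-frame delay for roughly 30 fps playback:
frameRate = 30
print(int(1000 / frameRate))   # 33 milliseconds
```

This is why the loop condition reads `cv2.waitKey(int(1000 / frameRate)) & 0xFF == 27`: wait about one frame's worth of time for a key, then check whether that key was Esc.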
```python
if __name__ == "__main__":
    mp_drawing = mp.solutions.drawing_utils
    mp_drawing_styles = mp.solutions.drawing_styles
    mp_face_mesh = mp.solutions.face_mesh
    main()
```
Lastly, in the `__main__` block, we set up the `mediapipe` drawing utilities and call `main`, which renders the video in the window along with the corresponding face mesh for each frame!
The following code is completely executable and renders a face mesh for the frames of the video.
You can replace the video link with any compatible video link as well.
```python
import cv2
import mediapipe as mp

def drawFaceMesh(frame, faceLandmarks):
    mp_drawing.draw_landmarks(
        image=frame,
        landmark_list=faceLandmarks,
        connections=mp_face_mesh.FACEMESH_TESSELATION,
        landmark_drawing_spec=None,
        connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_tesselation_style()
    )
    mp_drawing.draw_landmarks(
        image=frame,
        landmark_list=faceLandmarks,
        connections=mp_face_mesh.FACEMESH_CONTOURS,
        landmark_drawing_spec=None,
        connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_contours_style()
    )
    mp_drawing.draw_landmarks(
        image=frame,
        landmark_list=faceLandmarks,
        connections=mp_face_mesh.FACEMESH_IRISES,
        landmark_drawing_spec=None,
        connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_iris_connections_style()
    )

def main():
    videoCapture = cv2.VideoCapture('https://player.vimeo.com/external/373966277.sd.mp4?s=bc69e79a8007eb5682e9e72a415a2142173228f6&profile_id=164&oauth2_token_id=57447761')
    videoCapture.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    videoCapture.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
    frameRate = 30
    with mp_face_mesh.FaceMesh(
        max_num_faces=1,
        refine_landmarks=True,
        min_detection_confidence=0.7,
        min_tracking_confidence=0.7,
    ) as faceMesh:
        while videoCapture.isOpened():
            isSuccess, frame = videoCapture.read()
            if not isSuccess:
                break
            frame.flags.writeable = False
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            results = faceMesh.process(frame)
            frame.flags.writeable = True
            frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
            if results.multi_face_landmarks:
                for faceLandmarks in results.multi_face_landmarks:
                    drawFaceMesh(frame, faceLandmarks)
            cv2.imshow('3D video face mesh', cv2.flip(frame, 1))
            if cv2.waitKey(int(1000 / frameRate)) & 0xFF == 27:
                break
    videoCapture.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    mp_drawing = mp.solutions.drawing_utils
    mp_drawing_styles = mp.solutions.drawing_styles
    mp_face_mesh = mp.solutions.face_mesh
    main()
```
The most exciting part is here! We can also project the 3D face mesh on ourselves by using our own webcam or an external camera.
We pass `0` to the `VideoCapture` function so that the video source becomes our default webcam.
```python
import cv2
import mediapipe as mp

def drawFaceMesh(frame, faceLandmarks):
    mp_drawing.draw_landmarks(
        image=frame,
        landmark_list=faceLandmarks,
        connections=mp_face_mesh.FACEMESH_TESSELATION,
        landmark_drawing_spec=None,
        connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_tesselation_style()
    )
    mp_drawing.draw_landmarks(
        image=frame,
        landmark_list=faceLandmarks,
        connections=mp_face_mesh.FACEMESH_CONTOURS,
        landmark_drawing_spec=None,
        connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_contours_style()
    )
    mp_drawing.draw_landmarks(
        image=frame,
        landmark_list=faceLandmarks,
        connections=mp_face_mesh.FACEMESH_IRISES,
        landmark_drawing_spec=None,
        connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_iris_connections_style()
    )

def main():
    videoCapture = cv2.VideoCapture(0)
    videoCapture.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    videoCapture.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
    frameRate = 30
    with mp_face_mesh.FaceMesh(
        max_num_faces=1,
        refine_landmarks=True,
        min_detection_confidence=0.7,
        min_tracking_confidence=0.7,
    ) as faceMesh:
        while videoCapture.isOpened():
            isSuccess, frame = videoCapture.read()
            if not isSuccess:
                break
            frame.flags.writeable = False
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            results = faceMesh.process(frame)
            frame.flags.writeable = True
            frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
            if results.multi_face_landmarks:
                for faceLandmarks in results.multi_face_landmarks:
                    drawFaceMesh(frame, faceLandmarks)
            cv2.imshow('Web cam face mesh', cv2.flip(frame, 1))
            if cv2.waitKey(int(1000 / frameRate)) & 0xFF == 27:
                break
    videoCapture.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    mp_drawing = mp.solutions.drawing_utils
    mp_drawing_styles = mp.solutions.drawing_styles
    mp_face_mesh = mp.solutions.face_mesh
    main()
```
Note: Run this code on your local machine so that your code can connect to the webcam.
The field of computer vision is advancing rapidly and offers a lot of potential for new discoveries. Currently, 3D face mesh technology is highly useful in the use cases depicted in the diagram below.
Note: Here's the complete list of related projects in MediaPipe or deep learning.
Let’s test your knowledge!
What does passing `0` to the `VideoCapture` function do?