In fields such as graphic design, data visualization, and image analysis, color palette extraction is a helpful tool. In this Answer, we reduce the colors of any picture to five representative colors, which together form a basic color palette for that image. We use the Pillow library in Python to handle the image, with scikit-learn performing the clustering.
Pillow is the actively maintained fork of the Python Imaging Library (PIL).
Computer vision is one of the most crucial advancements in artificial intelligence. It provides us with another medium to communicate with the computer. Computer vision serves as the foundational concept that supports real-life applications such as autonomous cars, improved medical diagnosis, and facial recognition.
Now let's focus on the implementation of our program using the Pillow library.
Before getting into the details of the program, we will show a brief overview of what we do, step by step. The steps include:
Open and convert the image to RGB mode.
Apply k-means clustering to group similar colors together.
Calculate color occurrences in each cluster.
Sort colors based on occurrences to identify dominant colors.
Visualize the dominant colors in a color palette.
For the actual implementation of our program, we will use the following libraries, each listed with its pip install command:
Pillow
pip install pillow
NumPy
pip install numpy
OpenCV
pip install opencv-python
scikit-learn
pip install scikit-learn
The following program extracts the most dominant colors from an input image, referenced in the code as pic4.jpg.
```python
import cv2
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

image_path = "pic4.jpg"
num_colors = 5
num_clusters = 5

def get_dominant_colors(image_path, num_colors=10, num_clusters=5):
    image = Image.open(image_path)
    image = image.resize((200, 200))
    image = image.convert('RGB')
    img_array = np.array(image)
    pixels = img_array.reshape(-1, 3)
    kmeans = KMeans(n_clusters=num_clusters, random_state=0)
    labels = kmeans.fit_predict(pixels)
    centers = kmeans.cluster_centers_
    color_counts = {}
    for label in np.unique(labels):
        color = tuple(centers[label].astype(int))
        color_counts[color] = np.count_nonzero(labels == label)
    sorted_colors = sorted(color_counts.items(), key=lambda x: x[1], reverse=True)
    dominant_colors = [color for color, count in sorted_colors[:num_colors]]
    color_occurrences = [count for color, count in sorted_colors[:num_colors]]
    dominant_colors_hex = ['#%02x%02x%02x' % color for color in dominant_colors]
    return dominant_colors_hex, color_occurrences

dominant_colors, color_occurrences = get_dominant_colors(image_path, num_colors, num_clusters)

print("Dominant Colors:")
print(dominant_colors)

palette_height = 100
palette_width = 100 * num_colors
palette = np.zeros((palette_height, palette_width, 3), dtype=np.uint8)

start_x = 0
for color_hex in dominant_colors:
    color_rgb = tuple(int(color_hex[i:i+2], 16) for i in (1, 3, 5))
    end_x = start_x + 100
    palette[:, start_x:end_x] = color_rgb
    start_x = end_x

palette_image = Image.fromarray(palette)
palette_bgr = cv2.cvtColor(np.array(palette_image), cv2.COLOR_RGB2BGR)

cv2.imshow("Palette", palette_bgr)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
Lines 1–4: Import the required modules.
Line 6: Set the path to the input image using the variable image_path.
Line 7: Define the variable num_colors and set it to 5, representing the number of dominant colors to extract from the image.
Line 8: Define the variable num_clusters and set it to 5, representing the number of clusters to use in the k-means clustering.
Line 10: Define a function get_dominant_colors that takes image_path, num_colors, and num_clusters as input parameters.
Line 11: Inside the get_dominant_colors function, load the image using Pillow's Image.open function and store it in the variable image.
Lines 12–13: Resize the image to 200×200 pixels using the resize method, which reduces the number of pixels to cluster, and convert it to RGB mode using the convert method.
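As a rough illustration of these two calls, the sketch below builds a small in-memory RGBA image (a hypothetical stand-in for a loaded file, so no image path is assumed) and applies the same resize and convert steps.

```python
from PIL import Image

# Hypothetical stand-in for a loaded file: a 640x480 semi-transparent red image.
image = Image.new("RGBA", (640, 480), (255, 0, 0, 128))

# Same two calls as in the program: shrink, then reduce to three RGB channels.
image = image.resize((200, 200))
image = image.convert("RGB")

print(image.size, image.mode)
```

Note that resize((200, 200)) does not preserve the aspect ratio. That is usually acceptable here because only the color distribution matters, not the shape of the image.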
Line 14: Convert the image to a NumPy array using the np.array function and store it in img_array.
Line 15: Flatten the NumPy array into a two-dimensional array of pixels, where each row is an RGB triplet. This is achieved using the reshape method with -1 as the first dimension, which tells NumPy to infer that dimension's size from the total number of elements. The flattened array is stored in pixels.
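To make the effect of reshape(-1, 3) concrete, here is a minimal sketch on a tiny made-up array; the 2×3 "image" and its values are arbitrary and only serve to show the shapes.

```python
import numpy as np

# A tiny 2x3 "image" with 3 channels; the values are arbitrary.
img_array = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)

# -1 lets NumPy work out the row count: 18 values / 3 channels = 6 pixels.
pixels = img_array.reshape(-1, 3)

print(img_array.shape)  # (2, 3, 3)
print(pixels.shape)     # (6, 3)

# Row 4 of the flattened array is the pixel at image row 1, column 1.
print(pixels[4].tolist(), img_array[1, 1].tolist())
```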
Line 16: Create a KMeans object kmeans with num_clusters as the number of clusters and random_state=0 for reproducibility.
Line 17: Perform k-means clustering on the flattened pixel array using the fit_predict method of the kmeans object. It assigns each pixel to one of the num_clusters clusters and returns the cluster labels, which are stored in the variable labels.
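For intuition about what the clustering step computes, the following is a simplified NumPy sketch of the basic k-means loop (assign each point to its nearest center, then move each center to the mean of its points). It is an illustration only and not scikit-learn's implementation; the function name simple_kmeans, its parameters, and the sample pixels are hypothetical.

```python
import numpy as np

def simple_kmeans(pixels, k, n_iter=10, seed=0):
    """Illustrative k-means; not scikit-learn's implementation."""
    rng = np.random.default_rng(seed)
    points = np.asarray(pixels, dtype=float)
    # Start from k distinct rows of the data chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()

    def assign(current_centers):
        # Distance from every point to every center, shape (n_points, k).
        diffs = points[:, None, :] - current_centers[None, :, :]
        return np.linalg.norm(diffs, axis=2).argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)

    # Final assignment against the updated centers.
    return assign(centers), centers

# Hypothetical pixels: two reddish and two bluish colors.
pixels = np.array([[250, 0, 0], [255, 5, 5], [0, 0, 250], [5, 5, 255]])
labels, centers = simple_kmeans(pixels, k=2)
print(labels)
print(centers)
```

The returned labels play the role of fit_predict's output, and centers plays the role of cluster_centers_. Like any k-means run, the result depends on the random starting centers, which is why the program fixes random_state.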
Line 18: Get the cluster centers, which represent the dominant colors found by the k-means clustering algorithm, and store them in the variable centers.
Line 19: Create an empty dictionary color_counts to store the number of pixels in each cluster.
Lines 20–22: Loop through the unique labels from the clustering result, convert each cluster center to a tuple of integer RGB values, count the pixels assigned to that cluster, and store each color with its count in the color_counts dictionary.
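The counting step can be tried in isolation. The sketch below uses a made-up labels array and compares the per-label loop from the program with np.unique(..., return_counts=True), which produces the same counts in a single call.

```python
import numpy as np

# Hypothetical cluster labels for eight pixels and three clusters.
labels = np.array([0, 2, 1, 2, 2, 0, 2, 1])

# Same counting as the loop in the program, one label at a time.
loop_counts = {int(label): int(np.count_nonzero(labels == label))
               for label in np.unique(labels)}

# Equivalent single call: unique labels together with their counts.
unique_labels, counts = np.unique(labels, return_counts=True)
unique_counts = dict(zip(unique_labels.tolist(), counts.tolist()))

print(loop_counts)    # {0: 2, 1: 2, 2: 4}
print(unique_counts)  # {0: 2, 1: 2, 2: 4}
```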
Line 23: Sort the colors by their occurrences in descending order using the sorted function.
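A small example of this sort, using a hypothetical color_counts dictionary with made-up colors and counts:

```python
# Hypothetical counts: color tuple -> number of pixels in its cluster.
color_counts = {(200, 30, 30): 120, (20, 20, 220): 300, (10, 180, 10): 45}

# Sort the (color, count) pairs by count, largest first.
sorted_colors = sorted(color_counts.items(), key=lambda x: x[1], reverse=True)

# Keep only the colors of the two largest clusters.
top_two = [color for color, count in sorted_colors[:2]]

print(sorted_colors)
# [((20, 20, 220), 300), ((200, 30, 30), 120), ((10, 180, 10), 45)]
print(top_two)
# [(20, 20, 220), (200, 30, 30)]
```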
Lines 24–26: Extract the dominant colors and their occurrences from the sorted list, taking only the first num_colors items. The dominant colors are stored in the dominant_colors list, and their corresponding occurrences are stored in the color_occurrences list. The dominant colors are also converted to hexadecimal format and stored in the dominant_colors_hex list. Line 27 then returns dominant_colors_hex and color_occurrences.
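One detail worth seeing in isolation is how a floating-point cluster center becomes a hex string. The program uses astype(int), which truncates the fractional part; the sketch below compares that with rounding via np.rint, shown only as a possible alternative. The center values are made up.

```python
import numpy as np

# Hypothetical cluster center; k-means centers are floating-point means.
center = np.array([254.7, 127.6, 0.2])

# astype(int) truncates, as in the program; np.rint rounds to the nearest integer.
truncated = tuple(int(c) for c in center.astype(int))
rounded = tuple(int(c) for c in np.rint(center))

hex_truncated = '#%02x%02x%02x' % truncated
hex_rounded = '#%02x%02x%02x' % rounded

print(hex_truncated)  # #fe7f00
print(hex_rounded)    # #ff8000
```

The difference is at most one step per channel, so either choice gives a visually similar palette.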
Line 29: Call the get_dominant_colors function with the input image path, num_colors, and num_clusters, and store the results in dominant_colors and color_occurrences.
Lines 31–32: Print the list of dominant colors using the print function.
Line 34: Set the height of the color palette image to 100 pixels using the variable palette_height.
Line 35: Calculate the width of the color palette image based on the number of colors and store it in palette_width. Each color block will have a width of 100 pixels.
Line 36: Create a zero-filled (black) NumPy array palette with dimensions (palette_height, palette_width, 3), where the last dimension holds the RGB channels.
Line 38: Set the variable start_x to 0, which will be used to position the colored blocks in the palette array.
Lines 39–43: Loop through the dominant colors, convert them from hexadecimal to RGB format, and fill the palette array with colored blocks for each dominant color. Each colored block has a size of 100x100 pixels.
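The hex parsing and block filling can be sketched on a much smaller canvas. The helper name hex_to_rgb, the two colors, and the 4-pixel block width below are hypothetical choices made to keep the output easy to inspect.

```python
import numpy as np

def hex_to_rgb(color_hex):
    """Convert a '#rrggbb' string to an (r, g, b) tuple of ints."""
    return tuple(int(color_hex[i:i + 2], 16) for i in (1, 3, 5))

colors = ['#ff8000', '#0000ff']  # hypothetical palette
block = 4                        # block width in pixels (100 in the program)
palette = np.zeros((2, block * len(colors), 3), dtype=np.uint8)

for index, color_hex in enumerate(colors):
    start_x = index * block
    # The RGB tuple is broadcast across every pixel in the column slice.
    palette[:, start_x:start_x + block] = hex_to_rgb(color_hex)

print(hex_to_rgb('#ff8000'))    # (255, 128, 0)
print(palette[0, 0].tolist())   # [255, 128, 0]
print(palette[1, 7].tolist())   # [0, 0, 255]
```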
Line 45: Convert the NumPy array palette back to a Pillow image using the Image.fromarray function, creating palette_image.
Line 46: Convert palette_image from RGB to BGR channel order using the cv2.cvtColor function, because OpenCV expects BGR order rather than RGB.
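For a three-channel image, converting from RGB to BGR amounts to swapping the first and third channels. The NumPy-only sketch below shows that swap on a single made-up pixel by reversing the last axis; it is meant to illustrate the channel order, not to replace cv2.cvtColor.

```python
import numpy as np

# One pure-red pixel in RGB order.
rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channels (RGB -> BGR).
bgr = rgb[..., ::-1]

print(rgb[0, 0].tolist())  # [255, 0, 0]
print(bgr[0, 0].tolist())  # [0, 0, 255]
```

If this conversion were skipped, OpenCV would read the red channel as blue, and the displayed palette would show the wrong colors.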
Lines 48–50: Show the color palette using OpenCV's cv2.imshow function. The call to cv2.waitKey(0) keeps the window open until any key is pressed, after which cv2.destroyAllWindows closes it.
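In environments without a display, such as a remote server, opening a window may not be possible. One option, sketched here as an assumption rather than part of the original program, is to save the palette to a file with Pillow instead. The two-color palette and the output location in the temporary directory are hypothetical.

```python
import os
import tempfile

import numpy as np
from PIL import Image

# Hypothetical two-color palette: 100x100 red and blue blocks side by side.
palette = np.zeros((100, 200, 3), dtype=np.uint8)
palette[:, :100] = (255, 0, 0)
palette[:, 100:] = (0, 0, 255)

# Write the palette to a PNG file instead of opening a window.
output_path = os.path.join(tempfile.gettempdir(), "palette.png")  # assumed location
Image.fromarray(palette).save(output_path)

print("Saved palette to", output_path)
```

Because Pillow works in RGB order, no BGR conversion is needed on this path.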