How to create a kernel density estimate plot in seaborn

A KDE (kernel density estimate) plot is a data visualization technique that draws a smooth curve estimating how the values in a dataset are distributed across their range. Because the curve is continuous, it often makes the overall shape of the distribution easier to read than the raw values alone.

The idea behind KDE is to place a kernel, usually a Gaussian or similar bell-shaped function, at each data point in the dataset. These kernels are then summed and normalized so that the area under the resulting density curve equals 1. The width of each kernel, known as the bandwidth, controls the smoothness of the curve. A wider bandwidth leads to a smoother curve that can hide real structure, while a narrower bandwidth captures finer fluctuations in the data but also more noise.
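The kernel-summing idea described above can be sketched directly in NumPy. This is a minimal illustration and not how seaborn computes its curves internally; the function name gaussian_kde_manual, the sample data, and the two bandwidth values are assumptions chosen for the example.

```python
import numpy as np

def gaussian_kde_manual(samples, grid, bandwidth):
    """Evaluate a Gaussian KDE on `grid` by summing one kernel per sample.

    Hypothetical helper for illustration only.
    """
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance from every grid point to every sample,
    # shape (len(grid), len(samples))
    u = (grid[:, None] - samples[None, :]) / bandwidth
    # Standard Gaussian kernel evaluated at each scaled distance
    kernels = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    # Sum the kernels, then normalize so the curve integrates to about 1
    return kernels.sum(axis=1) / (len(samples) * bandwidth)

# Assumed sample data: 200 draws from a standard normal distribution
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=200)
grid = np.linspace(-6, 6, 601)

narrow = gaussian_kde_manual(samples, grid, bandwidth=0.1)  # wiggly curve
wide = gaussian_kde_manual(samples, grid, bandwidth=1.0)    # smooth curve

# Approximate the area under each curve with a simple Riemann sum
step = grid[1] - grid[0]
print("area under narrow curve:", narrow.sum() * step)
print("area under wide curve:", wide.sum() * step)
```

Both areas should come out close to 1, which is what makes the curve a density estimate. Plotting narrow and wide against grid shows the smoothness trade-off controlled by the bandwidth.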

Advantages of KDE plots

KDE plots are especially useful for exploratory data analysis because they can reveal insights about the central tendency, spread, and multimodality (presence of multiple peaks) of the data distribution. They provide a way to visualize data patterns that might not be apparent from simple summary statistics.

Here are a few key points to understand about KDE plots:

Smooth representation: KDE plots provide a continuous and smooth estimate of the data distribution.

Bandwidth parameter: This controls the balance between capturing finer details and smoothing out noise in the data. Selecting an appropriate bandwidth is crucial for accurate representation.

Non-parametric: Unlike parametric methods that assume a specific distribution (e.g., Gaussian), KDE is non-parametric, meaning it does not assume a particular shape for the data distribution.

Visualization: KDE plots are often used as standalone visualizations or overlaid on other plots, such as histograms or scatter plots, to provide additional insight into the data distribution.

Multivariate KDE: KDE can also be extended to visualize joint distributions of two or more variables, known as 2D or multivariate KDE plots.

The seaborn library, which is built on top of Matplotlib, provides the kdeplot() function for creating KDE plots, making it easy for data analysts and scientists to generate these visualizations for their datasets.

Creating univariate KDE plots

The code below creates three overlapping univariate KDE plots on the same axes.

# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Generate random numbers
x = np.random.randint(3, 50, 50)
y = np.random.randint(3, 50, 50)
z = np.random.randint(3, 50, 50)
# Create KDE plots
sns.kdeplot(x, label='Dataset 1', fill=True, color='green')
sns.kdeplot(y, label='Dataset 2', color='red', linestyle='dashed')
sns.kdeplot(z, label='Dataset 3')
# Beautify the charts
plt.legend()
plt.title('Data Distribution')
plt.show()

The code is explained below:

  • Lines 2–4: We import the necessary libraries.
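  • Lines 6–8: The code generates three arrays of 50 random integers each, ranging from 3 to 49 (the upper bound of 50 passed to randint is exclusive).
  • Lines 10–12: The code creates three univariate KDE plots inside the same chart. The first is filled and green, the second is a dashed red line, and the third uses the default style.
  • Lines 14–16: We add a legend and a title to the chart, and then render it.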

Creating bivariate KDE plots

The code below creates a bivariate (2D) KDE plot.

# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Generate random numbers
y = np.random.randint(3, 19, 50)
z = np.random.randint(3, 19, 50)
# Create KDE plot
sns.kdeplot(x=y, y=z, fill=True, levels=20, cmap='Blues')
# Beautify chart
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
plt.title('2D KDE Plot Example')
plt.show()

The code is explained below:

  • Lines 6–7: The code generates two arrays of 50 random integers each, ranging from 3 to 18 (the upper bound of 19 passed to randint is exclusive).

  • Line 9: The code creates a bivariate KDE plot that uses the following arguments:

    • fill: This is a boolean that controls whether the regions between contour lines are filled with color.
    • levels: This sets the number of contour levels to draw. It also accepts a list of level values.
    • cmap: This represents the color map used to color the contour levels.
  • Lines 11–14: We add axis labels and a title to the chart, and then render it.
