How to create a scatter plot with Plotly Express in Python

Plotly Express is a Python library designed for creating interactive and customizable data visualizations, including scatter plots. The scatter function of Plotly Express creates a scatter plot from two variables, x and y. It’s a flexible function that can be used to visualize a variety of data, including trends over time, connections between two continuous variables, and patterns in categorical data.

Some of the key features of the scatter function include:

  • Customizable markers and colors: The scatter function enables users to customize the appearance of data points using a wide range of markers and colors. This includes specifying marker size, shape, and color and defining custom color scales.

  • Support for categorical variables: The scatter function visualizes patterns in categorical data by assigning different markers or colors to each category, enabling users to easily identify relationships and distributions.

  • Trend lines and error bars: The scatter function provides options for adding trend lines and error bars to visualizations, making it easy to see patterns or trends in the data.

  • Interactive features: Plots created with the scatter function are interactive by default, allowing users to zoom in and out, pan, and hover over data points to view additional information.

  • Ease of use: The scatter function provides a simple and intuitive syntax for creating visualizations, making it easy for users to quickly create and customize scatter plots.

Syntax

The scatter function syntax typically follows this structure:

import plotly.express as px
fig = px.scatter(data_frame, x=x_column, y=y_column,
                 color=color_column, size=size_column,
                 hover_data=[hover_column_1, hover_column_2])
Syntax of the scatter function

Parameters

The scatter function of Plotly Express offers a wide range of parameters that allow users to customize and enhance their scatter plots. Here are the key parameters:

  • data_frame: The data to be plotted, typically a pandas DataFrame. Array-like objects and dictionaries are also accepted and are converted to a DataFrame internally.

  • x: A string or list of strings specifying the column(s) of the DataFrame to be plotted on the x-axis.

  • y: A string or list of strings specifying the column(s) of the DataFrame to be plotted on the y-axis.

  • color: A string naming the column of the DataFrame (or an array-like object) whose values are used to assign colors to the data points.

  • symbol: A string naming the column of the DataFrame (or an array-like object) whose values are used to assign different marker symbols to the data points.

  • size: A string naming the column of the DataFrame (or an array-like object) whose numeric values are used to assign different marker sizes to the data points.

  • hover_name: A string naming the column of the DataFrame (or an array-like object) whose values appear in bold as the title of the tooltip shown when the user hovers over a data point. The purpose of hover_name is to give each data point a concise, easily recognizable label.

  • hover_data: A list of strings specifying additional columns of the data frame to be displayed when the user hovers over a data point. The purpose of hover_data is to provide more detailed and comprehensive information about each data point beyond just a single column.

  • log_x: A boolean value indicating whether or not the x-axis should be scaled in log units.

  • log_y: A boolean value indicating whether or not the y-axis should be scaled in log units.

  • title: A string specifying the title of the plot.

  • template: A string naming a registered template (for example, plotly_white), a dictionary, or a plotly.graph_objects.layout.Template instance specifying the layout template to be used for the plot.

  • width: An integer indicating the plot’s width in pixels.

  • height: An integer indicating the plot’s height in pixels.

Return type

The scatter function returns a plotly.graph_objects.Figure object, which can be further customized and manipulated using the methods provided by the Plotly library, such as update_layout() and update_traces().

Implementation

In the following playground, we create a scatter plot using a sample dataset called “iris” provided by Plotly Express. The attributes used are as follows:

  • sepal_length: It represents the length of the sepal, which is the outer part of the flower that protects the petals. It’s typically measured in centimeters.

  • sepal_width: It represents the width of the sepal, measured in centimeters. It’s the measurement taken at the widest part of the sepal.

  • species: It refers to the different types of iris flowers (setosa, versicolor, and virginica).

Create a scatter plot of the iris dataset

Explanation

The code above is explained in detail below:

  • Lines 2–3: We import the required libraries for the code, i.e., plotly.express as px for creating the scatter plot and pandas as pd for handling data in a DataFrame.

  • Line 6: We load a sample dataset called iris using the px.data.iris() function provided by Plotly Express. The dataset contains sepal and petal measurements for three species of iris flowers.

  • Line 9: We print the first five rows of the loaded dataset. The head() function retrieves the top rows of the DataFrame and print() displays the result in the console. It helps to inspect the data and verify its structure quickly.

  • Line 12: We create a scatter plot using Plotly Express. The px.scatter() function is used to generate the scatter plot. We pass the DataFrame data (which contains the loaded dataset) as the data_frame parameter. We specify the column to be plotted on the x-axis using the x parameter, which is set to sepal_width. The y parameter is set to sepal_length, representing the column to be plotted on the y-axis. The color parameter is set to species, allowing different species of iris flowers to be color-coded. Finally, we set the title parameter to “Sepal Width vs. Sepal Length” to give the plot a title.

  • Line 15: We display the plot using the fig.show() method, which shows the interactive plot.

Conclusion

The scatter function of Plotly Express is a versatile and interactive tool that creates scatter plots from two variables, x and y. It supports a wide range of data, including relationships between continuous variables, patterns in categorical data, and trends over time. Key features include customizable markers and colors, support for categorical variables, trend lines, error bars, and simple syntax for creating and customizing plots.


HowDev By Educative. Copyright ©2025 Educative, Inc. All rights reserved