In Julia, parallel coordinates plots are commonly used for visualizing and analyzing multivariate numerical data. They are ideal for comparing multiple variables or features and analyzing their relationships.
A parallel coordinates plot is a visualization technique in which each variable or dimension gets its own vertical axis, and these axes are laid out in parallel. Depending on each variable's unit of measurement, every axis can have its own scale, or all axes can be normalized to a common one. Each data element is drawn as a line connecting its values across these axes.
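As a rough illustration of uniform normalization, the sketch below rescales each variable to the interval [0, 1] with min-max scaling, using only Base Julia. The helper name `minmax_normalize` and the sample values are assumptions made for this illustration; they are not part of any plotting package.

```julia
# Min-max normalization: rescale a variable to [0, 1] so that all axes
# can share one scale. `minmax_normalize` is a hypothetical helper.
function minmax_normalize(values::AbstractVector{<:Real})
    lo, hi = extrema(values)
    # A constant column has no spread; map it to 0.5 to avoid dividing by zero.
    hi == lo && return fill(0.5, length(values))
    return [(v - lo) / (hi - lo) for v in values]
end

# Sample values, assumed for illustration only.
price  = [800, 250, 700, 1400, 120]
height = [200, 150, 230, 540, 40]

normalized_price  = minmax_normalize(price)
normalized_height = minmax_normalize(height)

println(normalized_price)
println(normalized_height)
```

After this step, the smallest value of each variable sits at 0 and the largest at 1, so axes measured in different units become directly comparable.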
When working with the parallel coordinates plot, the order of the axes affects the reader's interpretation of the data and can help discover patterns or correlations. It is worth noting that rendering too many variables or features may result in a cluttered chart with a confusing appearance.
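One possible way to pick an axis order is to place strongly correlated variables next to each other. The sketch below is a simple greedy heuristic, not a standard algorithm from any library: starting from a chosen variable, it repeatedly appends the remaining variable with the largest absolute correlation to the last axis placed. The helper `order_axes` and the sample values are assumptions for illustration.

```julia
using Statistics  # standard library; provides `cor`

# Greedy axis ordering. `order_axes` is a hypothetical helper: it starts at
# `start` and keeps appending the remaining variable whose absolute
# correlation with the most recently placed axis is largest.
function order_axes(columns::AbstractDict, start)
    ordered = [start]
    remaining = setdiff(collect(keys(columns)), ordered)
    while !isempty(remaining)
        last_axis = ordered[end]
        strengths = [abs(cor(columns[last_axis], columns[name])) for name in remaining]
        best = remaining[argmax(strengths)]
        push!(ordered, best)
        filter!(!=(best), remaining)
    end
    return ordered
end

# Sample values, assumed for illustration only.
columns = Dict(
    "price"  => [800, 250, 700, 1400, 120],
    "height" => [200, 150, 230, 540, 40],
    "width"  => [350, 250, 180, 120, 30],
)

println(order_axes(columns, "price"))
```

A greedy pass like this is cheap but does not guarantee the best overall ordering; with only a handful of axes, trying a few orders by hand is often just as effective.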
Let's look at the following example showing how to draw a parallel coordinates plot in Julia for multidimensional exploratory data analysis:
This example uses the PlotlyJS.jl package. Built on top of the Plotly JavaScript library, it lets users of Julia's ecosystem create attractive, insightful, and interactive web-based visualizations.
using PlotlyJS, DataFrames

df = DataFrame(
    product_id = [1, 2, 3, 4, 5],
    product_name = ["Oven", "Microwave", "Dishwasher", "Refrigerator", "Toaster"],
    price = [800, 250, 700, 1400, 120],
    height = [200, 150, 230, 540, 40],
    width = [350, 250, 180, 120, 30]
);

mytrace = parcoords(;
    line = attr(color = df.product_id),
    dimensions = [
        attr(range = [0, 10000],
             label = "price", values = df.price),
        attr(range = [0, 1000],
             label = "height", values = df.height),
        attr(range = [0, 1000],
             label = "width", values = df.width)
    ]
);

layout = Layout(
    title_text = "Parallel Coordinates Plot",
    title_x = 0.5,
    title_y = 0)
myplot = plot(mytrace, layout)
Let's go through the code above to get a better understanding of this topic:
Line 1: Load the modules PlotlyJS.jl and DataFrames.jl.
Lines 3–9: Construct a sample DataFrame holding data about some home appliances. It includes several features or dimensions like price, height, and width.
Lines 11–22: Invoke the method parcoords to build the trace for a parallel coordinates plot.
For each product, we set a different line color based on its identifier: line = attr(color=df.product_id).
Moreover, we specify the dimensions to plot based on the features of the products, like price, height, and width. For each dimension, we specify the axis range, a label for the related axis, and the series of values.
Lines 23–26: Define a layout for the generated plot. Within this layout, we specify a title and indicate its location using the parameters title_x and title_y.
Line 27: Invoke the function plot with the trace and the layout to render the parallel coordinates plot.
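The axis ranges in the example are hard-coded (for instance, [0, 10000] for price, although the largest price is 1400), which squeezes the lines into a small part of the axis. As a minimal sketch, the ranges could instead be derived from the data. The helper `dimension_spec` and the 5% padding below are assumptions for illustration; the helper only builds a plain NamedTuple with the same three keys the example passes to attr, and does not call PlotlyJS itself.

```julia
# Build one dimension description with a range taken from the data.
# `dimension_spec` is a hypothetical helper; the 5% padding is an arbitrary choice.
function dimension_spec(label::AbstractString, values::AbstractVector{<:Real}; padding = 0.05)
    lo, hi = extrema(values)
    margin = padding * (hi - lo)
    # Same keys as in the example's attr(...) calls: range, label, values.
    return (range = [lo - margin, hi + margin], label = label, values = values)
end

price = [800, 250, 700, 1400, 120]
spec = dimension_spec("price", price)

println(spec.range)
println(spec.label)
```

With PlotlyJS loaded, each NamedTuple should be usable as keyword arguments, for example attr(; spec...), though that step is not exercised in this sketch.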