How to create a boxplot in R

A boxplot, also called a box-and-whisker plot, is a graphical summary of a dataset’s distribution. It shows the median, quartiles, range, and outliers, giving a quick view of the data’s central tendency and spread.

Box and whisker plot

The plot comprises a box representing the interquartile range (IQR), the range between the first and third quartiles. The median is shown as a line within the box. Whiskers extend from each end of the box to the most extreme data point that lies within 1.5 times the IQR of that end. Data points beyond the whiskers are considered outliers and are usually drawn as individual points.
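As a minimal sketch of this rule, the R code below computes the box and whisker limits by hand and compares them with the built-in boxplot.stats() function. The vector x is a hypothetical sample made up for illustration, and the code assumes the default whisker multiplier of 1.5.

```r
# Hypothetical sample with one unusually large value
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 10, 25)

# fivenum() returns min, lower hinge, median, upper hinge, max
five <- fivenum(x)
box_iqr <- five[4] - five[2]   # width of the box

# Fences: points beyond these are drawn individually as outliers
lower_fence <- five[2] - 1.5 * box_iqr
upper_fence <- five[4] + 1.5 * box_iqr

outliers <- x[x < lower_fence | x > upper_fence]

# Whiskers stop at the most extreme points that are still inside the fences
whisker_ends <- range(x[x >= lower_fence & x <= upper_fence])

# boxplot.stats() performs the same calculation
s <- boxplot.stats(x)
print(s$stats)  # whisker low, lower hinge, median, upper hinge, whisker high
print(s$out)    # values flagged as outliers
```

Note that the whiskers end at actual data values (here 2 and 10), not at the fences themselves.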

R provides the built-in boxplot() function to create boxplots. It requires at least one argument: the numeric data to plot.

Let’s create a simple boxplot using the mtcars dataset.

# Load the dataset
data(mtcars)
# Create the boxplot
boxplot(mtcars$mpg)

Explanation

  • Line 2: This line loads the mtcars dataset, a built-in dataset.

  • Line 4: It creates a box plot of the mpg variable from the mtcars dataset.

Now let’s add some more arguments to the plot.

# Load the dataset
data(mtcars)
# Create the boxplot
boxplot(mpg ~ factor(cyl), data = mtcars,
        main = "Box Plot of MPG by Number of Cylinders",
        xlab = "Number of Cylinders",
        ylab = "Miles per Gallon",
        col = "pink",
        border = "black",
        lwd = 2)

Explanation

  • Line 2: This line loads the mtcars dataset, a built-in dataset.

  • Line 4: The formula mpg ~ factor(cyl) groups the mpg values by the levels of the factor cyl, so one box is drawn per group. The data argument names the data frame that holds both variables, which in our case is mtcars.

  • Line 5: This argument sets the main title of the plot.

  • Lines 6–7: These arguments set the labels for the x-axis and y-axis.

  • Line 8: This argument sets the fill color of the boxes in the box plot.

  • Line 9: This argument sets the color of the box outlines, along with the whiskers and median lines.

  • Line 10: This argument sets the width of the lines in the box plot, which is 2 in our case.
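To check what the grouped plot is actually drawing, the statistics behind each box can be inspected without opening a graphics window. The sketch below assumes the same formula as above and uses the plot = FALSE argument of boxplot(), which returns the computed summary as a list. The variable name b is arbitrary.

```r
# Load the dataset
data(mtcars)

# Compute the per-group statistics without drawing anything
b <- boxplot(mpg ~ factor(cyl), data = mtcars, plot = FALSE)

print(b$names)  # group labels, one per box
print(b$n)      # number of observations in each group
print(b$stats)  # one column per group: whisker low, lower hinge,
                # median, upper hinge, whisker high
print(b$out)    # outlying mpg values, if any
print(b$group)  # index of the group each outlier belongs to
```

Each column of b$stats corresponds to one box in the plot, in the same left-to-right order as b$names.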

Copyright ©2025 Educative, Inc. All rights reserved