Box plot in R using ggplot2
Last Updated: 15 Jul, 2025
A box plot (or box-and-whisker plot) visually summarizes the distribution, central value and spread of a dataset, helping to quickly identify outliers and variability. It consists of the following:
- Box: The box shows the interquartile range (IQR), containing the middle 50% of the data. The lower edge is the 25th percentile (Q1) and the upper edge is the 75th percentile (Q3). This box shows where most data points are concentrated.
- Median: A line inside the box marks the median (Q2), the middle value of the dataset.
- Whiskers: The lines extending from the box (whiskers) reach the most extreme data points that lie within a set distance of the quartiles (usually 1.5 times the IQR).
- Outliers: Data points beyond the whiskers are marked as outliers, indicating values far from the rest of the data (typically more than 1.5 times the IQR below Q1 or above Q3).
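The components above can be computed by hand to see how they relate. A minimal sketch in base R, assuming a small made-up sample (the vector x and the variable names are hypothetical and are not part of the dataset used later in this article). It uses R's default quantile method, which is only one of several conventions for computing quartiles:

```r
# Hypothetical sample: eight typical values plus one unusually large value.
x <- c(12, 15, 14, 10, 18, 16, 13, 17, 40)

# Q1, median (Q2) and Q3 using R's default quantile method.
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
iqr <- IQR(x)  # Q3 - Q1

# Fences: observations beyond these limits are treated as outliers.
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr
outliers <- x[x < lower_fence | x > upper_fence]

# Whiskers end at the most extreme observations inside the fences.
whiskers <- range(x[x >= lower_fence & x <= upper_fence])

print(q)
print(outliers)
print(whiskers)
```

Here the whiskers stop at the smallest and largest values that are still inside the fences, so they do not necessarily extend the full 1.5 times the IQR.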
Why Use a Box Plot
- Central Tendency: It shows where the center of the data lies (median), helping to understand the typical value.
- Spread and Variability: The box and whiskers give an immediate sense of the range and how spread out the values are.
- Skewness: If the median line is closer to the top or bottom of the box, it indicates skewness in the data.
- Outliers: Box plots are excellent for spotting outliers—values that are much higher or lower than the rest of the data.
Plotting Box Plots in R using ggplot2
We can plot a box plot in R programming language using the ggplot2 library.
Syntax:
geom_boxplot(mapping = NULL, outlier.colour = NULL, outlier.shape = 19, outlier.size = 1.5, notch = FALSE)
Parameters:
- mapping: Used to define aesthetic mappings like x, y, fill, or color using aes().
- outlier.colour: Sets the color of the outlier points (if not specified, default color is used).
- outlier.shape: Specifies the shape of the outlier points (e.g., 19 for solid circle).
- outlier.size: Sets the size of the outlier points.
- notch: If TRUE, adds a notch to the box to show a confidence interval around the median.
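The parameters above can be tried on a built-in dataset before moving to the crop data. A minimal sketch, assuming the mpg data frame that ships with ggplot2 as a stand-in (it is not the dataset used in the rest of this article). The red triangle styling is an arbitrary choice for illustration:

```r
library(ggplot2)

# mpg ships with ggplot2; it is used here only as a stand-in dataset.
p <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot(
    outlier.colour = "red",  # color of the outlier points
    outlier.shape = 17,      # 17 = filled triangle
    outlier.size = 2,        # size of the outlier points
    notch = FALSE            # set to TRUE to draw notches around the median
  )

p
```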
1. Creating a Basic Box Plot
To create a basic box plot, first load the required library and dataset. Then pass the data and the aesthetic mappings to the ggplot() function and add geom_boxplot().
You can download the dataset from here: Crop_recommendation
- ggplot: Initializes a ggplot2 plot object with dataset and aesthetic mapping.
- aes: Sets aesthetic mappings for x and y axes.
- geom_boxplot: Adds the box plot layer to the chart.
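R
# Install ggplot2 once if it is not already available
install.packages("ggplot2")
library(ggplot2)
# Load the dataset (adjust the path to where the CSV file is saved)
ds <- read.csv("/content/Crop_recommendation.csv", header = TRUE)
# Box plot of temperature for each crop label
ggplot(data = ds, mapping = aes(x = label, y = temperature)) +
  geom_boxplot()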
Output:
[Figure: Box plot in R using ggplot2]
2. Adding Mean Value to the Box Plot
To add the mean value to the box plot, we can use the stat_summary() function. It lets us add summary statistics such as the mean directly to the plot.
Syntax:
stat_summary(fun, geom)
- fill: Fills the interior of each box according to group.
- stat_summary: Adds summary statistics like mean or median.
- fun: The summary function to apply (here, mean).
- geom (in stat_summary): Defines how the summary is displayed.
- shape: Shape of the mean point (e.g., 8 = star).
- size: Size of the point.
- color: Color of the point.
R
library(ggplot2)
ds <- read.csv("Crop_recommendation.csv", header = TRUE)
ggplot(ds, aes(x = label, y = temperature, fill = label)) +
geom_boxplot() +
stat_summary(fun = "mean", geom = "point", shape = 8,
size = 2, color = "white")
Output:
[Figure: Box plot in R using ggplot2]
3. Change Legend Position of Box Plot
The position of the legend on the plot is easy to customize with the use of the theme() function. For instance, we can include the legend on top, at the bottom, or suppress it altogether.
- theme: Customizes non-data parts of the plot like background, title, legend.
- legend.position: Changes the position of the legend.
R
library(ggplot2)
ds <- read.csv("Crop_recommendation.csv", header = TRUE)
ggplot(ds, aes(x = label, y = temperature, fill = label)) +
geom_boxplot() +
theme(legend.position = "top")
Output:
[Figure: Box plot in R using ggplot2]
Explanation: This puts the legend at the top of the plot. The theme() function offers further customizations of plot titles, axes and background.
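The other legend placements mentioned in this section work the same way. A minimal sketch, assuming the built-in mpg dataset as a stand-in for the crop data. The names base_plot, legend_bottom and legend_hidden are illustrative only:

```r
library(ggplot2)

# Stand-in data: mpg ships with ggplot2.
base_plot <- ggplot(mpg, aes(x = class, y = hwy, fill = class)) +
  geom_boxplot()

# Place the legend below the plot.
legend_bottom <- base_plot + theme(legend.position = "bottom")

# Suppress the legend entirely.
legend_hidden <- base_plot + theme(legend.position = "none")

legend_bottom
legend_hidden
```

Hiding the legend is often reasonable when the fill variable is the same as the x-axis variable, since the axis labels already identify each box.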
4. Creating a Horizontal Box Plot
Box plots can also be drawn horizontally using the coord_flip() function, which swaps the x and y axes.
- coord_flip: Flips the axes.
R
library(ggplot2)
ds <- read.csv("c:/crop/archive/Crop_recommendation.csv", header = TRUE)
# Creating a Horizontal Boxplot using ggplot2 in R
ggplot(ds, aes(x = label, y = temperature, fill = label)) +
geom_boxplot() +
coord_flip()
Output:
[Figure: Box plot in R using ggplot2]
5. Changing Box Plot Line Colors
We can change the outline colors of box plots in different ways depending on how we want to represent the grouping variable.
5.1. Default Line Colors by Groups
We can vary the outline color of the boxes according to a grouping variable by mapping the color aesthetic to that variable.
- color: Changes the outline color of each box by group.
R
crop2<-ggplot(ds, aes(x=label, y=temperature, color=label)) +
geom_boxplot()
crop2
Output:
[Figure: Box plot in R using ggplot2]
5.2. Custom Line Colors
For greater control over the box outline colors, we can use the scale_color_manual() function to specify a color for each group.
- scale_color_manual: Defines specific colors for each group manually. The values vector needs at least one color per group.
R
ggplot(ds, aes(x = label, y = temperature, color = label)) +
geom_boxplot() +
scale_color_manual(values = c("#999999", "#E69F00", "#56B4E9", "Red", "Green"))
Output:
[Figure: Box plot in R using ggplot2]
5.3. Using Brewer Color Palettes
We can also change the outline color of the box plot with ColorBrewer palettes. To do so, use the scale_color_brewer() function and set its palette argument.
- scale_color_brewer: Applies a predefined ColorBrewer palette to line colors.
- palette: The name of the palette used (e.g., "Dark2", which supports up to 8 distinct colors).
R
ggplot(ds, aes(x = label, y = temperature, color = label)) +
geom_boxplot() +
scale_color_brewer(palette = "Dark2")
Output:
[Figure: Box plot in R using ggplot2]
6. Fill the Box Plot with color
We can fill the interior of box plots using solid colors, grouped fills, or custom palettes to improve visual clarity or aesthetics.
6.1. Default Filling
To fill every box with the same color, set the fill argument inside the geom_boxplot() function.
R
ggplot(data = ds, aes(x = label, y = temperature)) +
geom_boxplot(fill = 'green')
Output:
[Figure: Box plot in R using ggplot2]
6.2. Fill by Group
If we want to fill the boxes with different colors based on the label variable, we can map the fill aesthetic to this variable.
R
ggplot(ds, aes(x = label, y = temperature, fill = label)) +
geom_boxplot(outlier.colour = "black", outlier.shape = 16, outlier.size = 2)
Output:
[Figure: Box plot in R using ggplot2]
6.3. Custom Fill Colors
To manually specify colors for the fills, use scale_fill_manual().
- scale_fill_manual: Lets us assign custom fill colors to groups. The values vector needs at least one color per group.
R
ggplot(ds, aes(x = label, y = temperature, fill = label)) +
geom_boxplot(outlier.colour = "black", outlier.shape = 16, outlier.size = 2) +
scale_fill_manual(values = c("#999999", "#E69F00", "#56B4E9", "Red", "Green"))
Output:
[Figure: Box plot in R using ggplot2]
6.4. Using Brewer Color Palettes for Filling
Similar to the outline color, we can use scale_fill_brewer() to apply a color palette to the fill.
R
ggplot(ds, aes(x = label, y = temperature, fill = label)) +
geom_boxplot(outlier.colour = "black", outlier.shape = 16, outlier.size = 2) +
scale_fill_brewer(palette = "Dark2")
Output:
[Figure: Box plot in R using ggplot2]
6.5. Using Grayscale
To fill the boxes in grayscale, use scale_fill_grey(), here combined with theme_classic().
- scale_fill_grey: Applies a grayscale color scheme.
- theme_classic: Uses a clean white background with no grid.
R
crop3<-ggplot(ds, aes(x = label, y = temperature, fill = label)) +
geom_boxplot(outlier.colour="black", outlier.shape=16, outlier.size=2)
crop3 + scale_fill_grey() + theme_classic()
Output:
[Figure: Box plot in R using ggplot2]
7. Adding Jitters in Box Plots
Jittering reduces overplotting when data points coincide. geom_jitter() draws the individual observations with a small amount of random noise, and the position_jitter() function controls how much noise is added.
- geom_jitter: Adds random noise to the data points to avoid overlap.
- position_jitter: Controls how much noise to add (e.g., 0.2 on x-axis).
R
ggplot(ds, aes(x = label, y = temperature)) +
geom_boxplot() +
geom_jitter(position = position_jitter(0.2))
Output:
[Figure: Box plot in R using ggplot2]
8. Notched Box Plot
A notched box plot adds a notch around the median that indicates an approximate confidence interval for it. To draw one, set the notch parameter to TRUE.
R
ggplot(ds, aes(x = label, y = temperature)) +
geom_boxplot(notch = TRUE) +
geom_jitter(position = position_jitter(0.2))
Output:
[Figure: Box plot in R using ggplot2]
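To make the notch in section 8 more concrete: the ggplot2 documentation describes the notches as extending 1.58 * IQR / sqrt(n) on either side of the median, which gives a roughly 95% interval for comparing medians. A minimal sketch of that calculation in base R, assuming a small made-up sample (the vector x and the variable names are hypothetical):

```r
# Hypothetical sample of nine measurements.
x <- c(21, 23, 24, 25, 26, 27, 28, 30, 33)

med <- median(x)

# Half-width of the notch: 1.58 * IQR / sqrt(n).
half_width <- 1.58 * IQR(x) / sqrt(length(x))

notch_lower <- med - half_width
notch_upper <- med + half_width

print(c(lower = notch_lower, median = med, upper = notch_upper))
```

If the notches of two boxes do not overlap, this is commonly read as informal evidence that the medians differ. It is a visual guide rather than a formal test.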