
R - Box Plot



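A box plot, also known as a box-and-whisker plot, is a method for graphically describing groups of numerical data through their quartiles. The box is drawn from the first quartile (Q1) to the third quartile (Q3), and a line inside the box marks the median. The whiskers extend from the box to show the range of the data. By default, R draws each whisker to the most extreme data point that lies within 1.5 times the interquartile range of the box and plots any points beyond that individually as outliers.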
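To make the description above concrete, the following sketch computes the values a box plot is built from and compares them with what R reports through boxplot.stats(). The sample data and the variable names (five, box_height, lower_fence and so on) are illustrative assumptions, not part of the original tutorial.

```r
# Sketch: the numbers behind a box plot (sample data are illustrative)
set.seed(10)
x <- rnorm(1000, 0, 2)

# fivenum() returns minimum, lower hinge, median, upper hinge, maximum;
# boxplot() uses the hinges as Q1 and Q3
five <- fivenum(x)
q1  <- five[2]
med <- five[3]
q3  <- five[4]

# default whisker limit: 1.5 times the box height beyond each hinge
box_height  <- q3 - q1
lower_fence <- q1 - 1.5 * box_height
upper_fence <- q3 + 1.5 * box_height

# whiskers end at the most extreme data points inside the fences
whisker_low  <- min(x[x >= lower_fence])
whisker_high <- max(x[x <= upper_fence])

# points outside the fences are drawn individually as outliers
outliers <- x[x < lower_fence | x > upper_fence]

# hand-computed values, followed by the values R computes itself
print(c(whisker_low, q1, med, q3, whisker_high))
print(boxplot.stats(x)$stats)
```

The hinges returned by fivenum() can differ slightly from the quartiles returned by quantile() for some sample sizes, which is why the sketch uses fivenum() when matching the drawn box.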

Box Plot

The R boxplot() function draws a box-and-whisker plot for each vector passed to it, or for each group defined by a formula.

Syntax

boxplot(x, data, main, xlab, ylab, notch, 
        names, horizontal, col, border)

Parameters

x Required. A numeric vector, or a formula such as y ~ group.
data Optional. The data frame from which the variables in a formula are taken.
main, xlab, ylab Optional. The main title, x-axis label and y-axis label, respectively.
notch Optional. A logical value. If TRUE, a notch is drawn on each side of the box at the median.
names Optional. A vector of group labels, one for each box.
horizontal Optional. A logical value. If TRUE, the box plot is drawn horizontally.
col Optional. The color used to fill the box.
border Optional. The color of the outline of the box.

Example:

In the example below, the rnorm() function generates a sample of 1000 normally distributed random values with mean 0 and standard deviation 2. A box plot of this sample is then drawn and written to a PNG file.

#fixing the seed to maintain the
#reproducibility of the result
set.seed(10)

#creating a vector containing 1000
#normally distributed random values  
x <- rnorm(1000, 0, 2)

#opening a PNG device that
#writes to boxplot.png
png(file = "boxplot.png")

#drawing the boxplot
boxplot(x)

#closing the device to
#finish writing the file
dev.off()

The output of the above code will be:

[Figure: box plot of x]

Example: Add features to a box plot

Additional arguments customize the plot. For example, main adds a title, notch = TRUE draws a notched box plot, horizontal = TRUE draws the box plot horizontally, and col sets the fill color of the box.

#fixing the seed to maintain the
#reproducibility of the result
set.seed(10)

#creating a vector containing 1000
#normally distributed random values  
x <- rnorm(1000, 0, 2)

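#opening a PNG device that
#writes to boxplot.png
png(file = "boxplot.png")

#drawing a notched, horizontal boxplot
#with a title and a red fill
boxplot(x, notch=TRUE, horizontal=TRUE,
        main="Box plot", col="red")

#closing the device to
#finish writing the file
dev.off()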

The output of the above code will be:

[Figure: notched, horizontal box plot of x, filled in red, with the title "Box plot"]
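The xlab, ylab and border parameters listed in the Parameters section are not used in the examples above. The following is a minimal sketch of how they might be combined. The file name boxplot_labels.png, the label text and the color choices are illustrative assumptions.

```r
# Sketch: axis labels and outline color (names and colors are illustrative)
set.seed(10)
x <- rnorm(1000, 0, 2)

# opening a PNG device; "boxplot_labels.png" is a hypothetical file name
png(file = "boxplot_labels.png")

# xlab and ylab label the axes, col fills the box,
# and border sets the color of the box outline;
# boxplot() also returns its summary statistics, kept here in res
res <- boxplot(x, main = "Box plot",
               xlab = "Sample", ylab = "Value",
               col = "lightblue", border = "darkblue")

# closing the device to finish writing the file
dev.off()
```

Assigning the result to res is optional. It is done here only so that the statistics behind the drawing (for example res$stats and res$n) can be inspected afterwards.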

Multiple Box Plots

To draw multiple box plots in a single plot, a formula can be used. The same result can also be achieved by passing multiple vectors to the function.

Example:

In the example below, three normally distributed random datasets are created using the rnorm() function. Each sample contains 1000 records. A box plot is then drawn for each dataset in a single plot.

#fixing the seed to maintain the
#reproducibility of the result
set.seed(10)

#creating three vectors containing 1000
#normally distributed random values each,
#with different means and standard deviations
x1 <- rnorm(1000, 2, 4)
x2 <- rnorm(1000, 1, 3)
x3 <- rnorm(1000, 0, 2)

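#opening a PNG device that
#writes to boxplot.png
png(file = "boxplot.png")

#drawing one box per vector, with
#a fill color and a label for each
boxplot(x1, x2, x3, main="Box plot",
        col = c("red", "blue", "green"),
        names = c("x1", "x2", "x3"))

#closing the device to
#finish writing the file
dev.off()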

The output of the above code will be:

[Figure: box plots of x1, x2 and x3 side by side, filled in red, blue and green]
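The formula approach mentioned at the start of this section is not demonstrated above. The following sketch shows one way it might look, assuming the data are stored in long format in a data frame. The data frame df, its column names value and group, and the file name boxplot_formula.png are illustrative assumptions.

```r
# Sketch: multiple box plots from a formula (names are illustrative)
set.seed(10)

# long-format data frame: one column of values and
# one column saying which group each value belongs to
df <- data.frame(
  value = c(rnorm(1000, 2, 4), rnorm(1000, 1, 3), rnorm(1000, 0, 2)),
  group = factor(rep(c("x1", "x2", "x3"), each = 1000))
)

# opening a PNG device; "boxplot_formula.png" is a hypothetical file name
png(file = "boxplot_formula.png")

# value ~ group draws one box for each level of group,
# taking both columns from df
res <- boxplot(value ~ group, data = df, main = "Box plot",
               col = c("red", "blue", "green"))

# closing the device to finish writing the file
dev.off()
```

With a formula, the box labels come from the levels of the grouping column, so the names argument is not needed. This form tends to be convenient when the data already live in a single data frame rather than in separate vectors.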