R - Scatter Plot

A scatter plot uses Cartesian coordinates to display the values of, typically, two variables in a data set. The data are shown as a collection of points: the value of one variable sets each point's position on the horizontal axis, and the value of the other sets its position on the vertical axis. A third variable can be encoded in the color, shape, or size of the marker.

The R plot() function draws a scatter plot of y against x.
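The idea of encoding a third variable in the marker can be sketched as follows. This is a minimal illustration, not part of the original examples: it assumes the built-in mtcars data set, and the color choices and the output file name scatterplot_by_cyl.png are hypothetical.

```r
# Built-in mtcars data: weight (wt), fuel economy (mpg), cylinder count (cyl)
cyl_groups <- factor(mtcars$cyl)                 # one level per cylinder count
palette_cols <- c("blue", "darkgreen", "red")    # illustrative colors, one per level

# Index the palette by the factor's integer codes to get one color per point
point_cols <- palette_cols[as.integer(cyl_groups)]

# Hypothetical output file name
png(file = "scatterplot_by_cyl.png")

plot(mtcars$wt, mtcars$mpg, col = point_cols, pch = 19,
     main = "Mileage by Weight and Cylinders",
     xlab = "Car Weight", ylab = "Miles Per Gallon")

# Legend mapping each color back to its cylinder count
legend("topright", legend = levels(cyl_groups),
       col = palette_cols, pch = 19, title = "Cylinders")

dev.off()
```

Here the position of each point still comes from wt and mpg, while the color carries the cylinder count; the pch argument could be varied in the same way to change the marker shape instead.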

Syntax

plot(x, y, main, xlab, ylab,
xlim, ylim, col)

Parameters

x                 Required. Specify a vector of horizontal coordinates.
y                 Optional. Specify a vector of vertical coordinates.
main, xlab, ylab  Optional. Used to specify the main title, x-axis label and y-axis label respectively.
xlim, ylim        Optional. Used to specify the range of values on the x-axis and y-axis respectively.
col               Optional. Used to specify colors for the points.
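The xlim and ylim parameters each take a two-element vector of the form c(lower, upper). The sketch below reuses the age and weight data from this page; the axis bounds and the output file name scatterplot_limits.png are arbitrary choices made for illustration.

```r
# Age and weight data, as used in the examples on this page
age <- c(40, 45, 32, 89, 65, 55, 35, 61, 75, 99, 65, 45)
weight <- c(82, 92, 81, 89, 94, 88, 82, 73, 93, 78, 80, 85)

# Axis ranges as c(lower, upper); these bounds are assumptions for illustration
age_range <- c(20, 100)
weight_range <- c(60, 100)

# Hypothetical output file name
png(file = "scatterplot_limits.png")

# Fix both axes to the chosen ranges instead of letting R pick them from the data
plot(age, weight, xlim = age_range, ylim = weight_range,
     xlab = "Age", ylab = "Weight")

dev.off()
```

Fixing the ranges this way is useful when several plots need to share the same axes; points that fall outside the chosen range are simply not drawn.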

Example:

In the example below, the plot() function is used to create a scatter plot of the age and weight of 12 people.

#creating age and weight dataset
age <- c(40, 45, 32, 89, 65, 55, 35, 61, 75, 99, 65, 45)
weight <- c(82, 92, 81, 89, 94, 88, 82, 73, 93, 78, 80, 85)

#naming the file
png(file = "scatterplot.png")

#drawing the scatterplot
plot(age, weight)

#saving the file
dev.off()

The code above saves the scatter plot to scatterplot.png.

Example: Add features to a scatter plot

More features can be added to the plot by passing more parameters to the function. For example, the main parameter adds a title and the col parameter sets the color of the points.

#creating age and weight dataset
age <- c(40, 45, 32, 89, 65, 55, 35, 61, 75, 99, 65, 45)
weight <- c(82, 92, 81, 89, 94, 88, 82, 73, 93, 78, 80, 85)

#naming the file
png(file = "scatterplot.png")

#drawing the scatterplot
plot(age, weight, main="Scatter Plot",
col="blue", xlab="Age", ylab="Weight")

#saving the file
dev.off()

The code above saves the titled, colored scatter plot to scatterplot.png.

R also allows fit lines to be added to a scatter plot. The example below uses the built-in mtcars dataset.

attach(mtcars)

#naming the file
png(file = "scatterplot.png")

plot(wt, mpg, main="Scatter plot",
xlab="Car Weight", ylab="Miles Per Gallon")

# regression line (mpg~wt)
abline(lm(mpg~wt), col="red")

# lowess line (wt,mpg)
lines(lowess(wt,mpg), col="blue")

#saving the file
dev.off()

The fit-line example saves its plot, with the regression line in red and the lowess curve in blue, to scatterplot.png.

Scatterplot Matrices

When a dataset contains more than two variables and the relationship between each pair of variables is of interest, a scatterplot matrix can be used. The pairs() function draws a scatterplot matrix; its syntax is given below:

Syntax

pairs(formula, data, main)

Parameters

formula  Required. A formula listing the series of variables to be plotted in pairs.
data     Required. The data set from which the variables are taken.
main     Optional. Used to specify the main title.

Example:

In the example below, the mtcars dataset is used. Four of its variables are used to create a scatterplot matrix.

attach(mtcars)

#naming the file
png(file = "scatterplot matrix.png")

pairs(~mpg+disp+cyl+wt,data=mtcars,
main="Scatterplot Matrix")

#saving the file
dev.off()

The code above saves the scatterplot matrix to "scatterplot matrix.png".
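A scatterplot matrix shows the pairwise relationships visually. As a complement, the pairwise correlations for the same four mtcars variables can be computed numerically with the cor() function. This is a minimal sketch, not part of the original examples; the variable names vars and cor_matrix are illustrative.

```r
# Select the same four variables that were passed to pairs()
vars <- mtcars[, c("mpg", "disp", "cyl", "wt")]

# Pearson correlation for every pair of columns (4 x 4 symmetric matrix)
cor_matrix <- cor(vars)

# Round for easier reading; the diagonal is 1 by definition
print(round(cor_matrix, 2))
```

Each off-diagonal entry corresponds to one panel of the scatterplot matrix, so a strongly negative value such as the one between mpg and wt should match a downward-sloping cloud of points in that panel.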