Category Archives: Plots

Creating a scree plot in R

A scree plot displays the proportion of the total variation in a dataset that is explained by each of the components in a principal component analysis. It helps you to identify how many of the components are needed to summarise the data.

To create a scree plot of the components, use the screeplot function.

> screeplot(modelname)

where modelname is the name of a previously saved principal component analysis, created with the princomp function as explained in the article Performing a principal component analysis in R.

Example: Scree plot for the iris dataset

In the article Performing a principal component analysis with R we performed a principal component analysis for the iris dataset, and saved it to an object named irispca.

To create a scree plot of the components, use the command:

> screeplot(irispca)

The result is shown below.

Scree plot

From the scree plot we can see that the amount of variation explained drops dramatically after the first component. This suggests that just one component may be sufficient to summarise the data.
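
The numbers behind the scree plot can also be inspected directly. The following is a minimal sketch, assuming that irispca was fitted with princomp on the four numeric columns of iris (the referenced article is not reproduced here, so that call is an assumption). The name prop_var is introduced only for illustration.

```r
# Assumption: the saved model was fitted on the four numeric iris columns.
irispca <- princomp(iris[, 1:4])

# Variance of each component is the square of its standard deviation;
# dividing by the total gives the proportion of variation explained.
prop_var <- irispca$sdev^2 / sum(irispca$sdev^2)
round(prop_var, 3)

# Draw the scree plot as a line chart instead of the default bar chart.
screeplot(irispca, type = "lines", main = "Scree plot for the iris data")
```

If the first proportion is much larger than the rest, that matches the sharp drop visible in the plot.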

Creating an interaction plot in R

An interaction plot is a visual representation of the interaction between the effects of two factors, or between a factor and a numeric variable. It is suitable for experimental data.

You can create an interaction plot with the interaction.plot function. The command takes the general form:

> interaction.plot(dataset$var1, dataset$var2, dataset$response)

where var1 and var2 are the names of the explanatory variables and response is the name of the response variable.

If one of the explanatory variables is numeric and the other is a factor, list the numeric variable first and the factor second. This way the numeric variable is displayed along the x-axis and the factor is represented by separate lines on the plot.

Example: Interaction plot with ToothGrowth data

Consider the ToothGrowth dataset, which is included with R. The dataset gives the results of an experiment to determine the effect of vitamin C on tooth growth in 60 guinea pigs. The vitamin C was delivered by one of two methods (orange juice, coded OJ, or ascorbic acid, coded VC), each at three dose levels (0.5, 1 or 2 mg/day). The len variable gives the tooth length, the supp variable gives the supplement type and the dose variable gives the dose. You can view more information about the ToothGrowth dataset by entering help(ToothGrowth).

To create an interaction plot illustrating the interaction between supplement type and supplement dose, use the command:

> interaction.plot(ToothGrowth$dose, ToothGrowth$supp, ToothGrowth$len)

The results are shown below.

Interaction plot for the ToothGrowth data
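
Each point on an interaction plot is the mean response for one combination of the two explanatory variables. As a rough sketch, the same means can be tabulated with tapply and the plot redrawn with descriptive labels. The name cell_means and the label text are illustrative choices, not part of the dataset.

```r
# Mean tooth length for each dose (rows) and supplement type (columns)
cell_means <- with(ToothGrowth, tapply(len, list(dose, supp), mean))
cell_means

# Same plot as above, with axis labels and a legend title added
with(ToothGrowth,
     interaction.plot(dose, supp, len,
                      xlab = "Dose (mg/day)",
                      ylab = "Mean tooth length",
                      trace.label = "Supplement"))
```

Lines that are roughly parallel suggest little interaction; lines that converge or cross suggest that the effect of one variable depends on the level of the other.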

Creating a 3D scatter plot in R

A 3D scatter plot is similar to an ordinary scatter plot, but for three continuous variables. It allows you to examine the relationship between them.

R has a function for creating three-dimensional scatter plots, called scatterplot3d. This function is not part of the base R installation; it comes from an add-on package written by Uwe Ligges, which is also called scatterplot3d. You must install and load the package before you can use the scatterplot3d function. You can learn how to do this in the article Installing and loading add-on packages in R.

Once you have loaded the package, you can create a 3D scatter plot with the command:

> scatterplot3d(dataset$xvar, dataset$yvar, dataset$zvar)

where xvar, yvar and zvar are the variables that you want to display on the x, y and z axes.

Example: 3D scatter plot using the trees data

Consider the trees dataset (included with R), which gives the girth, height and volume of 31 trees. You can view more information about the dataset by entering help(trees). To view the dataset, enter the dataset name:

> trees
   Girth Height Volume
1    8.3     70   10.3
2    8.6     65   10.3
3    8.8     63   10.2
4   10.5     72   16.4
5   10.7     81   18.8
6   10.8     83   19.7
7   11.0     66   15.6
8   11.0     75   18.2
9   11.1     80   22.6
10  11.2     75   19.9
...
31  20.6     87   77.0

To create a 3D scatter plot of Volume against Girth and Height, use the command:

> scatterplot3d(trees$Girth, trees$Height, trees$Volume)

The result is shown below.

3D scatter plot
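
For reference, the whole workflow can be sketched in one place, from loading the package to drawing a labelled plot. This is only a sketch: the axis labels, title and the type = "h" option (which drops a vertical line from each point to the base plane) are illustrative choices, and the code skips the plot if the package has not been installed.

```r
# Install the package once if needed:
# install.packages("scatterplot3d")

if (requireNamespace("scatterplot3d", quietly = TRUE)) {
  library(scatterplot3d)

  # Volume on the vertical axis, against Girth and Height
  scatterplot3d(trees$Girth, trees$Height, trees$Volume,
                xlab = "Girth", ylab = "Height", zlab = "Volume",
                type = "h", pch = 16,
                main = "3D scatter plot of the trees data")
}
```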


Creating a normal probability plot in R

A normal probability plot is a plot for a continuous variable that helps to determine whether a sample is drawn from a normal distribution. If the data is drawn from a normal distribution, the points will fall approximately in a straight line. If the data points deviate from a straight line in any systematic way, it suggests that the data is not drawn from a normal distribution [1].

You can create a normal probability plot using the qqnorm function. The command takes the following general form:

> qqnorm(dataset$variable)

For example, to plot the Height variable from the trees dataset (included with R), use the command:

> qqnorm(trees$Height)

You can also add a reference line to the plot, which makes it easier to determine whether the data points are falling into a straight line. To add a reference line, use the qqline function directly after the qqnorm function as shown.

> qqnorm(trees$Height)
> qqline(trees$Height)

The result is shown below.

Normal probability plot
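
The qqnorm function also returns the plotted coordinates invisibly, which gives a rough numerical check on how straight the points are. The sketch below stores them in an object named qq (a name chosen here for illustration) and computes the correlation between the theoretical and sample quantiles. This is an informal supplement to the plot, not a formal test of normality.

```r
# Draw the plot and keep the coordinates: x = theoretical normal
# quantiles, y = the observed data values
qq <- qqnorm(trees$Height, main = "Normal probability plot of tree height")
qqline(trees$Height)

# A correlation close to 1 means the points lie close to a straight line
cor(qq$x, qq$y)
```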

There is also a function called qqplot, which allows you to create quantile-quantile plots for comparing data with other standard probability distributions besides the normal distribution. Enter help(qqplot) for more details.
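
As a hedged illustration of qqplot, the sketch below compares the Volume variable from the trees dataset with the quantiles of an exponential distribution that has the same mean. The exponential distribution is chosen purely for illustration, not because Volume is expected to follow it, and the names x, theo and qq are introduced only for this example.

```r
x <- trees$Volume

# Theoretical quantiles of an exponential distribution with the same mean
theo <- qexp(ppoints(length(x)), rate = 1 / mean(x))

# Plot sample quantiles against theoretical quantiles
qq <- qqplot(theo, x,
             xlab = "Theoretical exponential quantiles",
             ylab = "Sample quantiles (Volume)")

# Both axes are on the same scale, so y = x is the reference line
abline(0, 1)
```

As with the normal probability plot, points that fall close to the reference line suggest that the chosen distribution describes the data reasonably well.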

[1] Devore, J. and Peck, R., 2001. Statistics: The Exploration and Analysis of Data. Pacific Grove, CA: Brooks/Cole. p266.
