R has powerful graphical capabilities, and it's possible to create almost any kind of graph, chart or plot. It also has powerful annotation options, allowing you to write and draw all over your plot, using labels, shapes, highlighting, and more.

You might have previously created plots in R, and annotated them using a different graphic program (e.g. Photoshop, Corel Draw etc.). But, you could just do it all in R! This guide will show you some of the ways in which you can scribble on your plots, which can be useful for keeping notes, or to highlight certain features of your data...

## Plot annotation

In this guide, I'll show you some of the ways in which you can annotate your plots. I'll cover text annotation with `text()` and `mtext()`, drawing arrows with `arrows()`, and drawing shapes including rectangles with `rect()` and polygons with `polygon()`.

This guide refers to plot and figure regions - for a quick overview of these, take a look at my guide to layout.

First, we'll generate some random data for use in this guide, and plot it:

```
# Generate random data
set.seed(421)
x <- runif(50, min=1, max=100)
y <- rnorm(50)
# Create data frame and add random row names
df <- data.frame(x, y)
rownames(df) <- c(letters[1:26], LETTERS[1:24])
# Plot
par(mar=c(5, 5, 5, 5)) # Make large margins
plot(df)
```

Which will look like this:

## Adding text

If you want to annotate your plot or figure with labels, there are two basic options: `text()` will allow you to add labels to the plot region, and `mtext()` will allow you to add labels to the margins.

For the plot region, to add labels you need to specify the coordinates and the label. For example:

`text(x=50, y=-1.5, labels="1st label")`

This code would add a single label to the plot at the specified coordinates. Using `text()` gives you complete control over the positioning of your labels. If you wanted to label the first 5 points from the data frame, you could use the following code:

`text(x=df$x[1:5], y=df$y[1:5], labels=rownames(df[1:5,]), pos=4, col="red")`

This time, the coordinates are taken directly from the data frame, and the labels are the row names for the first 5 points. So that the labels are not plotted directly on top of the points, `pos=4` will plot them on the right-hand side of the point, and `col="red"` will make them red, so they are easily distinguishable.
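To see what each `pos` value does, here is a minimal sketch that labels the first point four times, once per position. It regenerates the same data so it runs on its own; the names `pos_values` and `pos_names` are just illustrative.

```r
# Same data as above, regenerated so this block runs on its own
set.seed(421)
df <- data.frame(x = runif(50, min = 1, max = 100), y = rnorm(50))
rownames(df) <- c(letters[1:26], LETTERS[1:24])

plot(df)

# pos: 1 = below, 2 = left, 3 = above, 4 = right of the coordinates
pos_values <- 1:4
pos_names <- c("below", "left", "above", "right")

# Label the first point once for each pos value
text(x = rep(df$x[1], 4), y = rep(df$y[1], 4),
     labels = pos_names, pos = pos_values, col = "red")
```

If you leave `pos` out altogether, the label is centred on the coordinates, which is usually what you want for free-standing notes rather than point labels.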

You could also use a vector to specify either the coordinates or the labels:

`text(x=df$x[c(10, 20, 30)], y=df$y[c(10, 20, 30)], labels=c("Point 10", "Point 20", "Point 30"), pos=4, col="blue")`

In this code, the x and y coordinates are pulled from the data frame for the specified points `[c(10, 20, 30)]`, and the labels are now user defined (you can probably come up with better labels for your own data!).

So far we have added labels to the plot region. If you want to add labels to the margins (within the figure region), use `mtext()` instead. For example, let's add some labels to the x axis:

`mtext(c("Lower", "Higher"), side=1, line=3, at=c(10, 80), col=c("blue", "red"))`

In this code, we add two labels `c("Lower", "Higher")` to the bottom x axis `side=1`, positioning them on the "third line" of the margin `line=3`, at specific locations on the x axis `at=c(10, 80)`, and colour each label differently `col=c("blue", "red")`.

If you add a label to the y axis, it will automatically be rotated by 90 degrees, unless you use `las=1`.

```
mtext("Another label", side=4, line=1, at=2, col="green2") # Rotated y axis label
mtext("Another \nlabel", side=4, line=1, at=-1, col="green2", las=1) # Horizontal label
```

The first line of code will add a label rotated 90 degrees to match the y axis, while the second line will not rotate the label. The addition of `\n` before "label" will start a new line.
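As a quick reference for the `side` argument, the sketch below writes one label in each of the four margins. The label wording and the `line` values are my own choices for illustration: the bottom and left labels sit on line 4 so that they clear the axis titles.

```r
# Same data as above, regenerated so this block runs on its own
set.seed(421)
df <- data.frame(x = runif(50, min = 1, max = 100), y = rnorm(50))

par(mar = c(5, 5, 5, 5)) # Make large margins
plot(df)

# side: 1 = bottom, 2 = left, 3 = top, 4 = right
side_names <- c("bottom", "left", "top", "right")
margin_line <- c(4, 4, 1, 1) # bottom/left pushed out past the axis titles
for (s in 1:4) {
  mtext(paste0("side=", s, " (", side_names[s], ")"),
        side = s, line = margin_line[s], col = "grey40")
}
```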

Let's see what all the labels look like on our plot:

You'll notice that one of the labels ("Point 30") is cut off. This is because the label extends outside the plot region, and into the figure region. In order for the label to be displayed fully, you should add `xpd=TRUE` as an argument in the `text()` function.
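Here is a minimal sketch of the fix, with a rough check for whether a label would run past the right-hand edge of the plot region. The check is only an approximation that I have added for illustration: it assumes the default `pos` offset of half a character width, and `right_edge` and `overflows` are made-up names.

```r
# Same data as above, regenerated so this block runs on its own
set.seed(421)
df <- data.frame(x = runif(50, min = 1, max = 100), y = rnorm(50))

par(mar = c(5, 5, 5, 5)) # Make large margins
plot(df)

label <- "Point 30"
usr <- par("usr") # Plot region limits: c(xmin, xmax, ymin, ymax)

# Approximate right edge of a pos=4 label, in x axis units
right_edge <- df$x[30] + 0.5 * par("cxy")[1] + strwidth(label)
overflows <- right_edge > usr[2]
cat("Label runs past the plot region:", overflows, "\n")

# xpd=TRUE lets the label continue into the figure region
text(x = df$x[30], y = df$y[30], labels = label, pos = 4, col = "blue", xpd = TRUE)
```

Setting `xpd=TRUE` is harmless when the label already fits, so you can simply add it to any label that sits near the edge.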

## Identifying points and labelling

Another way to label points is to use the `identify()` function. This lets you click on the points you want to "identify" and will add a label. To stop identifying points, hit the `ESC` key, or press the stop button on the R menu bar. Alternatively, you can specify how many points you want to identify by adding `n=10` to the command (to identify 10 points).

For example, to automatically label points with the row name use the following code:

```
# Identify points
identify(x=df, labels=rownames(df), col="red")
```

Here, `x` specifies the data holding the coordinates of the plotted points (our data frame), and `labels=rownames(df)` tells R to label the points using the row names from the data frame. You could also use a character string or vector to label the points.

If you do not specify the `labels` argument, then it will default to plotting the row number from the data frame (or if using vectors, the position of the point in the vector). This is useful if you want to find out which data the point belongs to. You could also create this as an object, for example:

`ident <- identify(df)`

And typing "ident" into the R console will give you a vector of the points you clicked (which in our example, relate to the row numbers from our data frame).

```
> ident
[1] 13 35 42 49
```

(Your results will vary)
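Once you have the row numbers, you can use them to look up the underlying data. The sketch below uses the four indices from the sample output above as a stand-in for a real `identify()` session (which needs an interactive graphics device), then pulls out and highlights those rows. The name `picked` is just for illustration.

```r
# Same data as above, regenerated so this block runs on its own
set.seed(421)
df <- data.frame(x = runif(50, min = 1, max = 100), y = rnorm(50))
rownames(df) <- c(letters[1:26], LETTERS[1:24])

plot(df)

# Stand-in for ident <- identify(df); indices copied from the sample output
ident <- c(13, 35, 42, 49)

# Look up the rows that were "clicked"
picked <- df[ident, ]
print(picked)

# Highlight and label them on the plot
points(picked$x, picked$y, pch = 19, col = "red")
text(picked$x, picked$y, labels = rownames(picked), pos = 4, col = "red")
```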

## Drawing arrows

You can draw arrows on your plot to point to specific data points. Like `text()`, you need to specify the x and y coordinates for the arrow, but you need to do this for the "start" and "end" (where the arrow head is drawn) positions.

Let's say we want to draw an arrow pointing at the first data point. You can get the coordinates for the "end" position (`x1` and `y1`) from the data frame, and then specify any coordinates for the "start" position (`x0` and `y0`), depending on where you want the arrow to be drawn from. For example:

`arrows(x0=40, y0=-1, x1=df$x[1], y1=df$y[1], col="blue", lwd=2)`

This will draw a blue arrow from the starting coordinates `x0=40, y0=-1` to the specified data point `x1=df$x[1], y1=df$y[1]`. However, the arrow head will now cover the data point. To avoid this, you can offset the coordinates for the "end" point:

`arrows(x0=40, y0=-1, x1=df$x[1]-2, y1=df$y[1]+0.02, col="blue", lwd=2)`

The exact numbers you offset by will depend on your own data, as they are linked to the values of the axis.
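If you would rather not pick offsets by hand, one alternative is to stop the arrow a little short of the point along its own direction. This is only a sketch of that idea, not something built into `arrows()`: the `shrink` value of 0.95 is an arbitrary choice, and the visible gap will still depend on the scale of your axes.

```r
# Same data as above, regenerated so this block runs on its own
set.seed(421)
df <- data.frame(x = runif(50, min = 1, max = 100), y = rnorm(50))

plot(df)

# Start position of the arrow
x0 <- 40
y0 <- -1

# Draw only 95% of the way from the start to the data point (assumed value)
shrink <- 0.95
x1 <- x0 + shrink * (df$x[1] - x0)
y1 <- y0 + shrink * (df$y[1] - y0)

arrows(x0 = x0, y0 = y0, x1 = x1, y1 = y1, col = "blue", lwd = 2)
```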

This method might be a bit cumbersome if you want to draw several arrows, as you'll need to figure out the best coordinates to use for each arrow. An easier way to get the coordinates is to use the `locator()` function. Similar to `identify()`, you click on a position on the plot and/or figure region to get the coordinates for where you clicked.

For example, let's draw three arrows using `locator()`. We'll also start the arrows outside the plot region. First, we'll get the coordinates:

```
# Get coordinates
a1 <- locator(2)
a2 <- locator(2)
a3 <- locator(2)
# Create a matrix of the coordinates
co.x <- cbind(a1$x, a2$x, a3$x)
co.y <- cbind(a1$y, a2$y, a3$y)
```

Each call to `locator(2)` creates a list object that contains the x and y coordinates for the two points that you clicked on the figure. You should click the "start" position first, followed by the "end" position, for where you want the arrow to point to. We then combine the coordinates into a matrix, which we'll use to draw the arrows on our plot:

```
arrows(x0=co.x[1,], y0=co.y[1,], x1=co.x[2,], y1=co.y[2,], col=c("red", "green", "blue"), lwd=2, xpd=TRUE)
```

In this code we point the coordinates to the matrix we created, we use a vector of colours to colour each arrow differently, and specify `xpd=TRUE` to draw the arrows outside the plot region. The resulting plot will look something like this (your results will vary):
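Because `locator()` needs you to click on an interactive graphics device, it can't be run as part of a script. If you want to try the arrow code without clicking, here is a sketch that swaps in hard-coded lists with the same `x` and `y` structure that `locator(2)` returns. The coordinate values are made up for illustration.

```r
# Same data as above, regenerated so this block runs on its own
set.seed(421)
df <- data.frame(x = runif(50, min = 1, max = 100), y = rnorm(50))

par(mar = c(5, 5, 5, 5)) # Make large margins
plot(df)

# Hypothetical stand-ins for locator(2): "start" first, then "end"
a1 <- list(x = c(-15, 10), y = c(1.5, 1.0))
a2 <- list(x = c(50, 45), y = c(3.2, 1.5))
a3 <- list(x = c(115, 90), y = c(-1.5, -1.0))

# Row 1 holds the start positions, row 2 the end positions; one column per arrow
co.x <- cbind(a1$x, a2$x, a3$x)
co.y <- cbind(a1$y, a2$y, a3$y)

arrows(x0 = co.x[1, ], y0 = co.y[1, ], x1 = co.x[2, ], y1 = co.y[2, ],
       col = c("red", "green", "blue"), lwd = 2, xpd = TRUE)
```

Saving clicked coordinates like this is also a handy way to make an annotated figure reproducible: click once, print the result, and paste the numbers into your script.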

## Drawing shapes

You can draw shapes on your plot using `rect()` (to add squares or rectangles) or `polygon()` (to add polygons).

For `rect()`, you need to specify the left, bottom, right and top edges of the rectangle as plot coordinates. For example:

`rect(xleft=20, ybottom=-1, xright=80, ytop=2, col=NA, border="orange", lwd=2)`

This code will draw a large orange rectangle on our plot at the specified coordinates (which correspond to the x and y axis). `col=NA` stops the rectangle from being "filled", i.e. it will be transparent. Specifying a colour will create a filled rectangle.
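A solid fill will hide any points underneath the rectangle. One way around this, sketched below, is to use a semi-transparent fill built with `adjustcolor()`. The alpha value of 0.3 is an arbitrary choice, and transparency relies on your graphics device supporting it (most modern ones do).

```r
# Same data as above, regenerated so this block runs on its own
set.seed(421)
df <- data.frame(x = runif(50, min = 1, max = 100), y = rnorm(50))

plot(df)

# Orange at 30% opacity, so the points stay visible through the fill
fill <- adjustcolor("orange", alpha.f = 0.3)

rect(xleft = 20, ybottom = -1, xright = 80, ytop = 2,
     col = fill, border = "orange", lwd = 2)
```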

Polygons might be more useful for drawing on the plot, for example, to draw around a group of data points. To draw the polygon, you would need to specify the x and y coordinates of the shape. The easiest way to do this is with `locator()`.

```
# Get coordinates
p1 <- locator(8)
p2 <- locator(12)
# Draw polygons
polygon(p1, border="green", lwd=2)
polygon(p2, border="blue", lwd=2)
```

In this code we created two polygons. The first used 8 sets of coordinates (from clicking on the plot 8 times), and the second used 12 sets of coordinates. They were added to the plot using `polygon()`. If you want the polygons to be filled, you would need to specify a colour with `col`.
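If you would rather not click out every corner, a different approach is to let R work out the outline for you. The sketch below uses `chull()` to find the convex hull of a group of points and passes it to `polygon()`. Note that this is a swapped-in technique rather than the `locator()` method above, and the grouping rule (points with `x` below 40) is made up purely for illustration.

```r
# Same data as above, regenerated so this block runs on its own
set.seed(421)
df <- data.frame(x = runif(50, min = 1, max = 100), y = rnorm(50))

plot(df)

# Hypothetical group: all points on the left-hand side of the plot
group <- df[df$x < 40, ]

# chull() returns the indices of the points that form the convex hull
hull <- chull(group$x, group$y)

# Draw around the group
polygon(group$x[hull], group$y[hull], border = "green", lwd = 2)
```

Unlike a hand-drawn polygon, the hull passes exactly through the outermost points, so it will touch them rather than leave a margin around the group.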

The plot will now look similar to this:

## Bringing it all together

So far in this guide, I have shown you different ways to annotate your plot. Let's bring them all together in a single plot:

```
par(mar=c(5, 5, 5, 5)) # Make large margins
plot(df)
# Add polygons
polygon(p1, border="green", lwd=2)
polygon(p2, border="blue", lwd=2)
# Label polygons
text(x=c(18, 62), y=c(-0.7, -0.9), labels=c("Green Group", "Blue Group"), col=c("green2", "blue2"))
# Add arrows
arrows(x0=co.x[1,], y0=co.y[1,], x1=co.x[2,], y1=co.y[2,], col=c("red2", "green2", "blue2"), lwd=2, xpd=TRUE)
# Label arrows
text(x=co.x[1,], y=co.y[1,], labels=c("Red Arrow", "Green Arrow", "Blue \nArrow"), col=c("red2", "green2", "blue2"), pos=c(3, 3, 4), xpd=TRUE)
# Add margin text
mtext(c("Lower", "Higher"), side=1, line=3, at=c(10, 80), col=c("blue", "red"))
# Margin arrows
arrows(x0=c(40, 60), y0=-2.65, x1=c(20, 72), y1=-2.65, col=c("blue", "red"), length=0.15, lwd=3, xpd=TRUE)
```

Which results in a plot with some crazy annotations! Hopefully, this guide has given you some ideas for your own plots. Another way in which you might "annotate" your plot is to highlight specific data points by overplotting them with larger or different symbols. This was covered in the second part of my guide to PCA in R.

Thanks for reading, please leave any comments or questions below.

## Further reading

A quick guide to pch symbols - A quick guide to the different pch symbols which are available in R, and how to use them. [R Graphics]

A quick guide to line types (lty) - A quick guide to the different line types available in R, and how to use them. [R Graphics]

A quick guide to layout() in R - How to create multi-panel plots and figures using the layout() function. Also covers plot and figure regions. [R Graphics]

Principal components analysis (PCA) - Part 2 - The second part of this guide for PCA, that covers loadings plots, convex hulls, specifying/limiting labels and/or variable arrows, and more biplot customisations - including over plotting data points.
