Pages (Desktop)

Pages (Mobile)

Pollen diagrams in R using rioja - Part 1

If you have ever needed to create a pollen or stratigraphic diagram, you'll know there are various ways to do this. And of course, it can be done in R! In this two-part guide series I will show you how to plot a pollen diagram using the "rioja" package in R. You can also use this guide to create other stratigraphic diagrams, for example, diatom diagrams. Read on for more...

! Never used R before? Check out my getting started with R guide first.

Guide Information

Title Pollen diagrams in R using rioja - Part 1
Author Benjamin Bell
Published February 07, 2018
Last updated
R version 3.4.2
Packages base; rioja; vegan

This is a 3 part guide:

Part 1: Pollen diagrams using rioja How to plot pollen diagrams using rioja in R.
Part 2: Pollen diagrams using rioja "Hacks" to further customise the plot.
Part 3: Pollen diagrams using rioja How to use different plot types on the same plot.

Pollen diagrams using rioja

In this guide I will show you how to plot a pollen diagram in R using the strat.plot() function in the "rioja" package. "rioja" is written by Steve Juggins who also wrote the excellent "C2" program for visualising palaeoenvironmental data. This guide can also be used for plotting other palaeoenvironmental / chemical / biological etc. data. Part 1 (this guide) covers all of the possible plot types, and the main options which are available for creating pollen diagrams, while part 2 looks at some additional features and "hacks" you can do to change your pollen diagram. Part 3 shows you how to combine different plot types into a single pollen diagram.

Go ahead and install.packages("rioja") to download it.

Pollen diagrams consist of multiple plots side by side, each showing a different species. Of course, it's possible to construct a pollen diagram in R "manually" (and this guide shows you how using ggplot2), but rioja makes it nice and simple - once you know how to use all the different plotting options available.

© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Importing pollen data into R

For this guide, I'll use the pollen data from my first published paper (Modern surface pollen assemblages from the Middle and High Atlas, Morocco: insights into pollen representation and transport.), which you can download from Mendeley Data. The data is freely available, and you do not need to sign in to download it. The file can be found under "Experiment data files" titled "Morocco Pollen Surface Sample Data.csv". If you have your own pollen data, feel free to use that when following this guide!

This dataset contains pollen data for 33 modern surface samples collected in the Middle, and High Atlas Mountains in Morocco. Typically, pollen diagrams would show pollen changes down a "core", with each pollen sample representing a different depth or age. Surface samples on the other hand are collected from "the surface" (usually the first few centimetres of soil), and their age should be representative of the last few years. For the moment, we'll ignore that these are surface samples.

First, we'll import the data into R, and remove the unnecessary columns.

# Import data
ma.pollen.raw <- read.csv("Morocco Pollen Surface Sample Data.csv", header=TRUE, row.names=1, sep=",", check.names=FALSE)
# Remove the first four columns
ma.pollen <- ma.pollen.raw[-1:-4]
# Remove total AP/NAP and pollen concentration columns
ma.pollen <- ma.pollen[-49:-51]

In total, there are 48 pollen types in the dataset. Since some of the pollen types occur very infrequently, we'll remove them from the dataset.

# Remove pollen types where total abundanace is less than 10%
ma.sum <- colSums(ma.pollen) # Calculate column sums
ma.pollen1 <- ma.pollen[, which(ma.sum > 10)] # Subset data

In this code, we created a new object (ma.sum) which contains the total values for each column in the pollen dataset, which represent the different pollen types. The second part of the code creates another object, by subsetting the pollen dataset (ma.pollen) to keep only the pollen types where total abundance is more than 10%.

If you are using your own pollen data (or other data), the data frame should be formatted with columns as the pollen type or species, and the rows should contain the abundances. Row names should be the sample number, while column names should be the species.

Now the dataset is ready for plotting!

© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Plotting pollen diagrams

Bar plots

The strat.plot() function can plot pollen diagrams in several different ways, and this guide will cover them all. If you type ?strat.plot into the R console, you will invoke the help page, which details all of the arguments (options) available. As you can see, there is quite a lot available, and some arguments only work when used in combination with others.

We'll start by plotting a pollen diagram using bar plots.

# Load package
library("rioja")
# Define colour scheme
p.col <- c(rep("forestgreen", times=7), rep("gold2", times=20))
# Define y axis 
y.scale <- 1:33
# Plot bar plot
pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=FALSE, plot.bar=TRUE, col.bar=p.col, lwd.bar=10, scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)

Which results in the following plot:

As you can see from the code, lots of arguments were used to produce this diagram! We'll go through each of these arguments to explain what it does and how to use it, starting with the first argument (and most crucial), which tells R which object your pollen data is stored in (ma.pollen1). The actual argument is d, although it is not necessary to write this in your code.

yvar
This argument is a vector that specifies the y axis. This can be the sample number, core depth or age. By default, it will use the row number if this is not specified.
y.tks
This argument is a numerical vector that specifies where to draw tick lines on the y axis, as well as what to label them. This cannot be a character vector (i.e. it can only consist of numbers), but there is a way around this (shown in part 2).

In this example, I have used the "y.scale" object to specify both the yvar, and y.tks arguments, depending on your own data, you may need to specify different objects for these arguments.

y.rev
This argument is a logical argument (TRUE or FALSE) to specify whether the y axis should be reversed. By default, the value is FALSE which would plot the diagram from bottom to top e.g. "sample 1" would appear at the bottom of the pollen diagram rather than the top. As pollen diagrams are usually plotted from top to bottom, you should specify this as TRUE.

Next, you'll notice that I have included three plot type arguments plot.line, plot.poly and plot.bar (each of which is logical), even though we are just plotting a bar plot. By default, the strat.plot() function will always plot a line graph unless told not to, so its a good idea to state explicitly what you want to plot, and what you do not want to plot.

col.bar
This argument specifies the colours of the bars in the diagram. This can either be a vector of colours, or a single colour to plot all bars the same.

In this example, the colours were defined in the "p.col" object, which tells R to use "forestgreen" for the first 7 bar charts (arboreal pollen), and "gold2" for the remaining (20) bar charts (non-arboreal pollen). If you wanted, you can specify a different colour for each bar chart, just create your vector object specifying each colour.

lwd.bar
This argument specifies the width of the bars that are plotted.
scale.percent
This is a logical argument which specifies whether the width of each graph (each pollen type or species) is scaled relative to the max abundance for that species. e.g. a pollen type which has a max abundance of 40% will have a wider graph than a pollen type with a max abundance of 10%. If the value of this argument is FALSE, then each graph will have the same width. You can also specify the width of each graph manually by using the graph.widths argument.
xSpace
This argument specifies the space between each graph, the values are "relative units". You can experiment with the value to increase or decrease the space between the graphs. If you have a lot of pollen types (or species) you may need to use a smaller value e.g. 0.005
x.pc.lab
This is a logical argument that tells R whether or not to draw the x-values (along the bottom of the diagram) when you use the scale.percent=TRUE argument.
x.pc.omit0
This is a logical argument that if TRUE, will not draw the "0" value label on the bottom x axis.
las
This argument specifies the style of axis labels - "2" tells R to draw the axis labels perpendicular to the axis. Type ?par in the R console for all options.
© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Bar plots with different coloured rows

With a bar plot pollen diagram, it is also possible to colour each row differently. This is instead of specifying the colour for each separate chart (pollen type or species).

# Specify row colours
p.col.group <- c(rep("blue", times=16), rep("red", times=14), rep("green2", times=3))
# plot
pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=FALSE, plot.bar=TRUE, col.bar=p.col.group, lwd.bar=10, sep.bar=TRUE, scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)

In this example, sep.bar is also included. This is a logical argument which if TRUE tells R to colour each row separately, instead of colouring each chart separately. Note, that col.bar now specifies a different vector containing a different colour scheme.

This option is useful if you want to highlight or group certain samples together. For example, in the following plot, I have colour-coded the samples by their geographical location.

© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Line plots

To plot your pollen diagram as a line plot, use the following code:

# Line plot
pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=TRUE, plot.poly=FALSE, plot.bar=FALSE, col.line=p.col, lwd.line=2, scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)

This code uses most of the same arguments used to plot the bar plot, with the addition of:

col.line
This argument specifies the colour of the line plots. Works the same as bar.col
lwd.line
This argument specifies the width of the line plots. Works the same as lwd.bar
© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Symbol plots

You could also plot your pollen diagram as symbols (although the customisation options are more limited)

# Symbol plot
pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=FALSE, plot.bar=FALSE, plot.symb=TRUE, symb.pch=16, symb.cex=1, scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)


plot.symb
This is a logical argument telling R whether to plot the diagram as symbols.
symb.pch
This argument tells R which symbol to plot. Type ?pch into the R console to see a list of symbols. It is possible to use more than 1 symbol, but the different symbols will be plotted on the same chart i.e. it won't plot different symbols for different charts.
symb.cex
This argument specifies the symbol size.

© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Filled (silhouette) plots

To plot a filled, or silhouette plot, use the following code:

# Filled plot
pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=TRUE, plot.bar=FALSE, col.poly=p.col, col.poly.line="black", lwd.poly=1, scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)


col.poly
This argument specifies the colour of the filled region. Works the same as bar.col
col.poly.line
This specifies the colour of the border line for the filled region. This can be a single colour, or a vector of colours.
lwd.poly
This specifies the width of the border line.
© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

To make it easier to pick out individual samples (or depths) on the silhouette plot, you could also add bars for each sample.

# Filled plot with bars
pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=TRUE, plot.bar=TRUE, col.bar="black", col.poly=p.col, col.poly.line="black", scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)

In this code, we have specified both plot arguments plot.poly and plot.bar as TRUE, which adds the bars to the plot. You can configure the appearance of the bars using the options previously discussed in the bar plot example.

© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Filled (silhouette) plots with exaggerated curves

Often, when you plot a pollen diagram, you'll have several pollen types where the abundances are low, and so changes in abundance between samples are more difficult to see. To make changes in abundance easier to see, you could additionally plot an exaggerated curve for these pollen types.

# Specify which pollen types will have an exaggeration curve
ex <- c(FALSE, rep(TRUE, times=5), FALSE, rep(TRUE, times=7), FALSE, rep(TRUE, times=6), FALSE, rep(TRUE, times=3), FALSE, TRUE)
# Plot
pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=TRUE,plot.bar=TRUE, col.bar="black", col.poly=p.col, col.poly.line="black", scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2, exag=ex, col.exag="auto", exag.alpha=0.50, exag.mult=3)


exag
This is a logical argument to add exaggerated curves to the plot. This only works with filled (silhouette) plots plot.poly=TRUE. This can be a single value (TRUE or FALSE) or a vector which specifies a TRUE or FALSE value for every chart in the diagram. In this example, we created a vector of logical values (ex), and specify this in the argument exag=ex
col.exag
This specifies the colours to use for the exaggerated curves. You could specify a single colour or a vector of colours. col.exag="auto" tells R to use the same colours as the main curve - which will be semi-transparent.
exag.alpha
This argument sets the transparency of the exaggerated curves. This should be a value between 0 and 1. This only works if you specify col.exag="auto"
exag.mult
This specifies the multiplier for the exaggerated curves. e.g. a value of 3, will multiply all the abundances by 3. This can be a single value, or you can use a vector to specify different multiplier values for different curves.

© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Additional options, including text rotation

There are several other arguments (options) which you can also specify to add or change things on your pollen diagram. Check the help page for full details, but some of the common ones you might want to include are:

title
This argument adds a title to your diagram.
x.names
This is a character vector that specifies the names for each graph. By default, it will use the column names from your data frame (hence, why it was not specified in the examples), but should you want to use different names, you can define them with this argument.
ylabel
This argument adds a label to the y axis e.g. "Depth" and/or "Age"
srt.xlabel
This is a numerical argument that allows you to rotate the text for the pollen types (or species). They are usually rotated by 45 degrees.
x.pc.inc
This will change the increments on the x axis to the value specified. This only works when scale.percent=TRUE. For example, to draw the x axis with increments every 5%, you would use x.pc.inc=5

© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Cluster analysis

If you want to add cluster analysis to your pollen diagram to show pollen assemblage zones, you can do this by creating a "chclust" object, and then adding it to your plot. You'll need to install the "vegan" package to calculate the dissimilarity indices which are used to create the "chclust" object. Go ahead and install "vegan" if you do not already have it installed install.packages("vegan")

Then, use the following code to create the "chclust" object for the pollen data.

# Cluster analysis
library("vegan")
ma.dist <- vegdist(ma.pollen1, method="bray", binary=FALSE, diag=FALSE, upper=FALSE, na.rm = FALSE) 
ma.chclust <- chclust(ma.dist, method="coniss")

The first part of the code calculates "dissimilarity indices" for the pollen data using the "bray" method. There are many options available, and are detailed in the help page, type ?vegdist in the R console for full details.

The second part of the code creates the "chclust" object from the dissimilarity indices, using the "coniss" method (Constrained Incremental Sums of Squares), which you can use to add to your pollen diagram.

To plot the pollen diagram and the cluster analysis, use the clust argument to specify the "chclust" object:

# Filled plot with cluster analysis
pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=TRUE, plot.bar=TRUE, col.bar="black", col.poly=p.col, col.poly.line="black", scale.percent=TRUE, xSpace=0.01, x.pc.inc=5, x.pc.lab=TRUE, x.pc.omit0=TRUE, srt.xlabel=45, las=2, exag=ex, col.exag="auto", exag.mult=3, exag.alpha=0.50, title="Morocco Pollen Surface Samples", ylabel="Sample", clust=ma.chclust)

Next, to show the pollen assemblage zones on the diagram, you'll need to run the following code:

# Add cluster zones
addClustZone(pol.plot, ma.chclust, nZone=6, lwd=1.5, lty=2, col="grey25")

The addClustZone() function plots lines on your pollen diagram (pol.plot), based on the "chclust" object (ma.chclust), to create the number of "zones" as specified by the nZone argument.

This will result in the following plot:

And that's how you create pollen diagrams in R using rioja. Thanks for reading! As always, please leave any questions or comments below.

If you want to take these plots a little further, part 2 of this guide covers some additional options and "hacks" that you can do to add or change features on your pollen diagram. And part 3 shows you how to create a pollen diagram using multiple plot types. See below for details.

This is a 3 part guide:

Part 1: Pollen diagrams using rioja How to plot pollen diagrams using rioja in R.
Part 2: Pollen diagrams using rioja "Hacks" to further customise the plot.
Part 3: Pollen diagrams using rioja How to use different plot types on the same plot.
© Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Further reading: Advanced techniques

Pollen diagrams using rioja - Part 2 - In the second part of this guide, I show you some additional options available for plotting pollen diagrams.

Pollen diagrams using rioja - Part 3 - In the third part of this guide, I show you how to use multiple plot types on the same pollen diagram.

95 comments

  1. Can you define the nZone argument using a broken stick method? Something like:
    nZone = min( which(bstick(clust)$bstick < bstick(clust)$dispersion) )-1

    ReplyDelete
    Replies
    1. As far as I know, the nZone argument should be an integer - the addClustZone() function uses the cutree function in stats... take a look at ?cutree for more details.

      You could use addZone(x, upper=) for total control on the placement of zones.

      Also take a look at: plot(ma.chclust), and bstick(ma.chclust) to see the broken stick model versus the cluster analysis.

      Delete
  2. I have a few questions: 1) How would you add a secondary y-axis to the stratigraphic plot (e.g. containing 210Pb Ages); 2) How do you have different x labels at the top for different types of plots (e.g. species names at an angle in italics, dendrogram with label CONISS at no angle, and PCA sample scores (line graph), also at no angle etc..); 3) How do you add x-axis labels at the bottom of the stratigraphic plot for Percentage abundance (%) (under all graphs with species), Total sum of squares (for the dendrogram), etc? Thank you.

    ReplyDelete
    Replies
    1. Hi Sarah, thanks for your questions!

      1) This should be a case of using something like: axis(2, at="", labels="", line=3) - which will add a second y axis to your plot.Take a look at part 3 of this guide for an example.

      The problem will be lining up the axis correctly - this will involve adjusting the par settings (There's an example in part 3 of this guide). Best way is to play around with the different settings and see what works. Unfortunately, i don't think there is an easy way to do this.

      2) There doesn't appear to be an easy way to do this either - it appears the srt.xlabel= argument can only take a single value.

      I would suggest that you use mtext - this will add text to the plot "margins" - again, you'll need to play around with the settings, something like: mtext("CONISS", side=3, line=-2, at=6), Experiment with different values until it appears in the right place :)

      3) This is fairly easy, mtext will do what you want - something like: mtext("Percentages", side=1, line=4) and mtext("Sum of Squares", side=1, line=4, at=6).

      Again, you'll need to experiment with the arguments here, adjusting the "at=" value, and "line=" values.

      If you have access to a graphics editor - it might be easier to add them in that way :)

      Delete
  3. Hi, I tried to plot the data but I get this message after

    > pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=FALSE, plot.bar=TRUE, col.bar=p.col, lwd.bar=10, scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)

    Error in if (col.exag[1] == "auto") col.exag <- make.col(cc.poly, exag.alpha) :
    valeur manquante là où TRUE / FALSE est requis

    What's wrong? Thanks for your help!

    ReplyDelete
    Replies
    1. Hi Manon,

      I am unable to reproduce your error. The col.exag function relates to the colour of the exaggeration curves - but in the plot code in your comment, exaggeration curves are not included, so I am unsure why you would encounter this error.

      What version of R and rioja are you using? try updating to the latest versions to see if this resolves your problem.

      Best wishes,
      Ben



      Delete
    2. Hi Ben, thanks for your answer. When I use R 3.4.2, I have this message

      This is rioja 0.9-15.1
      Warning message:
      le package ‘rioja’ a été compilé avec la version R 3.4.4

      So I tried with R 3.4.4, also 3.5.1
      and always the same message

      Error in if (col.exag[1] == "auto") col.exag <- make.col(cc.poly, exag.alpha) :
      valeur manquante là où TRUE / FALSE est requis

      Could it be a problem with grDevices package? It seems not available any more on Cran.
      Thanks,
      Manon

      Delete
    3. Hi Manon,

      Okay, rioja 0.9-15.1 is the latest version of the package :) I have tried again to reproduce the error - testing on MacOS using R 3.5.0, and Fedora using 3.5.0, and Windows 10 using 3.5.1, and it is all working okay at my end.

      Are you using the Moroccan pollen data from this guide, or your own data? I am wondering if there may be a problem with the data that is causing this issue?

      You could also try adding the argument: exag=FALSE to the code to see if that resolves the issue.

      grDevices is a core package, and is installed with R, so it shouldn't be causing an issue. But, try also using: update.packages() - this should update all the packages you have installed to the latest version.

      Sorry I can't be of any more help here - looking on Google, this error message has appeared for other people using other packages, so this may relate to a more general problem, although my French is not the best :)

      Best wishes,
      Ben

      Delete
  4. Hi, it works now!
    I have to change sep="," by ";"
    ma.pollen.raw <- read.csv("pollen1.csv", header=TRUE, row.names=1, sep=";", check.names=FALSE)

    ;-) I am a beginner!
    Thank you very much!
    Best!
    Manon

    ReplyDelete
    Replies
    1. Hi Manon,

      Great! glad it's now working for you now.

      I should update the guide regarding importing csv files as different countries have different conventions for decimal places and the separator that is used.

      You might be better off using the read.csv2() function, rather than read.csv(), as this should have the correct defaults.

      In the R console, type ?read.csv and it will show you the best options to use.

      Best wishes,
      Ben

      Delete
    2. Hi!
      I got the same error message when working with the Morocco Pollen Data. I originally downloaded the data file from the supplemental section on the paper's website (https://www.tandfonline.com/doi/full/10.1080/00173134.2015.1108996?journalCode=sgra20). However when I downloaded the file from Mendeley, everything worked out perfectly!
      I noticed that in R, the environment tab changed when working with the datasets. The number of variables seemed to change.

      Delete
  5. Hi, How can I manually adjust the x axis limits for each panel?

    ReplyDelete
    Replies
    1. Hi, I actually have the same question. I'm using strat.plot to create one plot that shows species abundances as well as chemical parameters (e.g. metal concentrations). However when I plot the chemical parameters, I can't control the limit of the x-axis. So for example I have concentrations that range from 2.34 to 24.32 but strat.plot automatically gives me an x-axis with the range 5 to 30. All the data is plotted correctly, but I would like to be able to control what values the x-axis displays (in this case I would like it to show 0 to 25). Thanks!

      Delete
    2. Hi L.

      Now I understand, in my previous answer I was confusing x and y axis, oops! :)

      But - as far as I can see, there is no way to control the x axis for each panel using strat.plot.

      It appears the x axis will always start at 0, while the last tick will always be rounded up to the "x increment". So, if your x increment is 10, and your highest value is 24.32, the last tick will be 30.

      You can however customise the x increment using: x.pc.inc - but, this only applies when using scale.percent=TRUE. So, if you were to use x.pc.inc=5, the last tick should now be 25 for your panel.

      If you want to show the starting zero, add or change x.pc.omit0 to x.pc.omit0=FALSE.

      However, any changes you make to the x increment will apply to every panel. As yet, there appears to be no way to control this on a panel by panel basis.

      Hope this helps, although it doesn't solve the problem :)

      Delete
  6. Hi, for my plot I have to make the Y axis as names of lakes (data was collected using a top/bottom approach for several lakes). Is there a way of doing this using strat.plot?

    ReplyDelete
    Replies
    1. Yep - take a look at part 2 of this guide which shows you how to do this.

      Best wishes,
      Ben

      Delete
  7. Hi! Do you know if there's a way to add names to the pollen assemblage zones using the code (addClustZone(pol.plot, ma.chclust, nZone=6, lwd=1.5, lty=2, col="grey25"))?

    Kristina

    ReplyDelete
    Replies
    1. Hi Kristina,

      As far as I know there is not an option to do this with addclustZone() - (at least I could not find one, but I could be wrong!)

      But you can add text to the pollen plot once you have finished plotting everything using text(). There is a bit of trial and error involved here as you'll need to know the coordinates of where you want the text to appear. This is not straight forward due to how the pollen diagram is created.

      You'll also need to add the argument xpd=NA to the text code to plot "on top" of your pollen diagram.

      For example:

      text(x=0, y=10, labels="your text here", xpd=NA, col="red")

      - Play around with the coordinates until the labels appear where you want.

      Hope that helps.

      Ben


      Delete
    2. Hi,

      That's what I ended up doing, thanks! :)

      Kristina

      Delete
  8. Hello, I was wondering if you could help with a custom code for adding error bars to a line plot? For example in the rioja documentation, Steve Juggins has a custom function code for adding a smoother but I need some help in doing the same but for adding error bars instead. I would like to plot a diatom-inferred variable to my strat plot but I need to be able to show the errors associated with the transfer function. Thank you!

    ReplyDelete
    Replies
    1. On a standard plot, you can add error bars using the arrows() function, for example:

      x <- rnorm(10)
      y <- 1:10
      plot(x,y)
      arrows(x0=x-0.1, y0=1:10, x1=x+0.1, y1=1:10, length=0.05, angle=90, code=3)

      In this example, we are using horizontal error bars, so x0 and x1 control the size of the error bar. x0 should always be the x value minus the size of the error bar, while x1 is the x value plus the size of the error bar. (In this example, I have just used 1, but you could use the standard deviation or error).

      Hopefully that makes sense... (run the code in R, and it should be clear)


      In strat.plot(), you can add error bars to the diagram using the fun1 argument (as you have seen from the documentation). So, you could do something like:

      z <- data.frame(x, y)
      z <- cbind(x,x,x,x)

      errorb <- function(x,y,i,nm) {
      arrows(x0=z-1, y0=1:10, x1=z+1, y1=1:10, length=0.05, angle=90, code=3, lwd=1.6)
      }

      pol.plot <- strat.plot(z, yvar=y, y.tks=y, y.rev=TRUE, plot.line=TRUE, plot.poly=FALSE, plot.bar=FALSE)


      Unfortunately, this method will add the error bars to every plot, which is probably not what you want.

      As yet, I do not know a workaround for this problem - but editing the strat.plot() source code will likely be involved using: fix(strat.plot)

      Let me know if you can come up with a solution! :)









      Delete
    2. Hi Ben, thanks for the code! It really worked. To prevent adding errors bars to every plot, I included the add = T argument in my strat plot for the diatom-inferred variable and only error bars were added to that plot. I have another question - I have a specific variable with the prediction errors (e.g. RMSEP = 0.4). How would you included this 0.4 in the code above so that the plotted error bars are reflective of this 0.4? Thanks!

      Delete
    3. Great!

      For the error bars, in the example code I used above, you would simply change the x0 and x1 arguments to be + or - 0.4.

      I’ll put together a blog post on error bars and get that online next week to show how thy work more clearly.

      Best wishes

      Delete
  9. Hi is there a way to plot two lines onto the same graph in a strat plot?

    ReplyDelete
    Replies
    1. Hi Z,

      Yes - please see my reply to the comment above about using the fun1 and fun2 arguments - these will allow you to plot additional features (e.g. two lines) on the strat plot graph. The only drawback is they will draw the additional features on every graph.

      The alternative is to "overplot" the graph - please see part 3 of this guide which shows you how to do this.

      Best wishes,
      Ben

      Delete
  10. Hi Ben,
    Do you know how to add the sample numbers to the clusters next to each line in my case from each row?

    ReplyDelete
    Replies
    1. Hi,

      According to the documentation, it should be possible to add sample numbers next to the clusters using the "labels" argument - but I have been playing around with it and can't seem to get it to work :\

      For now I would suggest manually adding them - but i'll take another look later and see if I can get it to work.

      You can also check out the help file for the function, ?chclust, and also ?hclust (which the function is based on).

      Apologies for not having a better answer!

      Ben

      Delete
    2. I have tried that many times but no success! although if you plot the cluster without using the strat.plot it will show the number for each line.

      Thank you very much

      Majed

      Delete
  11. Hi

    I tried to plot the hclust with pollen but it did not work, since I want to use another method than coniss? is there a way to do so?

    ReplyDelete
    Replies
    1. Hi MicroPaleo,

      To create the cluster diagram, you first need to calculate the distances using the dist function (base R).

      Something like:

      pollen.dist <- dist(yourpollendata, method="euclidean")

      Check out ?dist to see all the methods available ("euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski").

      Then you can run hclust to create the diagram e.g.

      pollen.h <- hclust(pollen.dist, "average").


      Hope this helps,
      Best wishes

      Delete
    2. I did the same as you said above but I mean when I try to add the cluster to the diagram an error message said it not possible. because the clust in the stratplot() only accept an constrained calssification object class chclust.


      this is what I did:
      ma.NC.dis <- dist(ma.NC, method = "euclidean")
      ma.NC.clust <- hclust(ma.NC.dis, "ward.D2")


      pol.plot <- strat.plot(ma.NC, yvar=y.scale, y.tks=y.scale, y.rev=FALSE, plot.line=FALSE, plot.poly=TRUE, plot.bar=TRUE, col.bar="black", col.poly=p.col, col.poly.line="black", scale.percent=TRUE, xSpace=0.02, x.pc.inc=5, x.pc.lab=TRUE, x.pc.omit0=TRUE, srt.xlabel=45, las=2, title="Ferron-Notom Delta - Cainveill North Samples", ylabel="Samples", clust=ma.NC.clust)

      error message:
      Error in strat.plot(ma.NC, yvar = y.scale, y.tks = y.scale, y.rev = FALSE, :
      object 'ma.NC.clust' not found

      Delete
    3. Hi MicroPaleo,

      I understand, the strat.plot function will only accept a "chlust" object - which is created using the chlust function from rioja. As you have already discovered, this is limited to CONISS method of clustering, and it will not accept any other.

      Unfortunately, this appears to be a limitation with the functions from rioja package.

      My suggestion would be to run your cluster analysis separately using R base functions, or take a look at the options in the vegan package for clustering. Plot the pollen diagram and cluster dendrogram separately, and then put them together in a graphics program. Not ideal, but should work around the issue you have.

      Best wishes,
      Ben

      Delete
  12. Hello, thank you for the great blog! This has been really helpful for me.

    I just have a question about the chclust function. I have a pollen data-set with additional data on pollen and charcoal concentrations. I want to show the pollen and charcoal concentrations on my diagram but I don't want the chlust function to take these concentrations into account. I want the chlust function to ignore the concentration columns.

    I tried to work around this by adding the concentrations as new columns after I used the chlust function. But I got the following message:

    #add a new df containing the pollen concentrations
    > Conc <- read.csv("Conc.csv")
    > #merge the 2 dfs
    > pollenandconc <- cbind(ma.pollen1, Conc)
    > # Filled plot with cluster analysis
    > pol.plot <- strat.plot(pollenandconc, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=TRUE, plot.bar=TRUE, col.bar="black", col.poly=p.col, col.poly.line="black", scale.percent=TRUE, xSpace=0.01, x.pc.inc=5, x.pc.lab=TRUE, x.pc.omit0=TRUE, srt.xlabel=45, las=2, exag=1, col.exag="auto", exag.mult=3, exag.alpha=0.50, title="Ngofe Marsh Pollen Percentage Diagram", ylabel="Depth (cm)", clust=ma.chclust)

    Error in if (inc < 0) stop("Too many variables, curves will be too small. Reduce xSpace.") :
    missing value where TRUE/FALSE needed

    Is there anyway to get chclust to ignore certain rows of data? or to add columns of data after I have run the chclust function?

    I hope that makes sense! Thanks
    Nichola

    ReplyDelete
    Replies
    1. Hi Nichola,

      Yes this is possible - im not sure of your specific error, this may be related to something else - perhaps you have too many pollen species?

      Anyway - this is how i did it (using the example code as in this blog). Rather than trying to add the concentrations back onto the diagram, i created a new pollen data object (removing the concentration columns), run the cluster analysis on this, then plotted the main pollen data (including concentration), and then added the cluster analysis to the plot.

      Here's an example using the code as in this guide.

      After importing all the pollen data, lets pretend the final two columns contain the concentration data. First remove this and create a new object:

      # Remove columns from pollen data
      ma.pollen2 <- ma.pollen1[c(-26, -27)]

      # Run cluster analysis on this data
      ma.dist <- vegdist(ma.pollen2, method="bray", binary=FALSE, diag=FALSE, upper=FALSE, na.rm = FALSE)
      ma.chclust <- chclust(ma.dist, method="coniss")

      # Plot original pollen data, and the cluster analysis (from above).
      pol.plot <- strat.plot(ma.pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=TRUE, plot.bar=TRUE, col.bar="black",
      col.poly=p.col, col.poly.line="black", scale.percent=TRUE, xSpace=0.01, x.pc.inc=5, x.pc.lab=TRUE, x.pc.omit0=TRUE, srt.xlabel=45, las=2,
      exag=ex, col.exag="auto", exag.mult=3, exag.alpha=0.50, title="Morocco Pollen Surface Samples", ylabel="Sample", clust=ma.chclust)


      Hopefully this makes sense, and you can adjust it for your own data.

      Best wishes - let me know if you still have further problems.
      Ben

      Delete
    2. HI Ben,

      Do you know if there is a way I can add a silhouette from PAM plot rather than a dendrogram?

      Thanks,
      Graham

      Delete
    3. Hi Graham,

      As far as i know, this may be possible using a custom function to add to the plot.

      If you look at all the arguments available for strat.plot, you'll find two arguments called: fun1 and fun2 - these arguments allow you to write your own functions, and run them when the plot is created to add custom features to it.

      It isn't something i have played around with - so i can't offer any further advice on this unfortunately.

      Best wishes,
      Ben

      Delete
    4. This is probably a bit late to be helpful for the first person who asked about it, but the 'Too many variables, curves will be too small. Reduce xSpace" can be fixed by clearing your plots, then removing the 'x.space = ' parameter completely from your plot instructions. I asked Steve Juggins about it, and the general jist was that if the number of curves multiplied by the x.space is going to total more that 1 (the size of the plot space) it will throw up this message. e.g. if you have x.space = 0.05 and there are 20 or more plots (taxa) then r will get upset. Hope that helps!

      Delete
  13. Hi Ben,
    Thanks very much for a very helpful blog.
    I'm having trouble with the graphical output - the species names are cut off (not enough space for them), and the spaces in my .csv file are converted to full stops. This happens with my data, and when i used your data and code the same thing happened (i.e. I didn't get the same nicely-formatted output that you did).
    Have you come across this problem before?

    Here's the code i used (for my data):
    > p.col.group <- c(rep("blue", times=14), rep("red", times=5))
    > y.scale <- pollen$depth
    > pol.plot <- strat.plot(bc.pollen, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=FALSE, plot.bar=TRUE, col.bar=p.col.group, lwd.bar=10, sep.bar=TRUE, scale.percent=TRUE, xSpace=0.02, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=1,srt.xlabel=45,title = "M3101 Pollen Data",ylabel = "Depth",cex.ylabel = 3)

    Thanks kindly,
    Nick

    ReplyDelete
    Replies
    1. Hi Nick,

      When you import your pollen data, use the argument: check.names=FALSE

      This should preserve the formatting of the names, and should also fix the problem with the names being cut off due to space.

      Best wishes,
      Ben

      Delete
    2. Thanks very much Ben.
      It fixed the formatting, but not the space-above-the-plot issue. I'll keep fiddling and see what i can figure out. I can reduce the text size so more text fits, but it's still not ideal, and the 45 degree text for the last 1 or 2 taxa is also cut off to the right of the plot.
      Thanks kindly,
      Nick

      Delete
  14. Hi,

    I am relatively new to R. I am trying to us strat.plot for a pollen dataset using the following code:

    > strat.plot(broad.pollen, yvar=y.scale, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=FALSE, plot.bar=TRUE, col.bar=p.col, lwd.bar=10, scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)

    I have applied everything to the data's restrictions (i.e. scale etc.), yet the following error comes up:

    Error in segments(rep(0, nsam), yvar, d[, i], yvar, lwd = lwd.bar, col = cc.bar[i]) :
    invalid third argument

    I'm not entirely sure why, and am unsure how to correct it :(
    Thank you!!

    Janb

    ReplyDelete
    Replies
    1. Hi Janb,

      It’s not an error I am familiar with, but usually errors with Rioja occur when there is something wrong with the input data. So, firstly I would check your data for any errors, e.g. formatting errors, missing data etc. to see if that resolves it.

      If you use the example data from this guide with your code, do you get the same error? The code looks okay to me :)

      Best wishes

      Delete
    2. Hi Ben,

      I checked the data and surely enough instead of a comma, there was a ';' as I am working on a macbook.

      Thank you very much!

      Janb

      Delete
  15. Hi Ben,

    I have been following through your guide using my own pollen data and editing parameters such as the y axis length accordingly. However, I am frequently encountering an issue when creating the plot, producing the following error: 'Error in xy.coords(x, y, setLab = FALSE) : 'x' and 'y' lengths differ'.

    This is the code I am using:

    pol.plot <- strat.plot(ma.pollen1, y.tks=y.scale, y.rev=TRUE, plot.line=FALSE, plot.poly=TRUE, plot.bar=TRUE, col.bar="black", col.poly=p.col, col.poly.line="black", scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)

    Thanks,

    Dan

    ReplyDelete
    Replies
    1. Hi Dan,

      It’s probably an issue with your imported data.


      Did you create a csv file using Excel? Sometimes it adds blank columns or rows onto the file which can cause problems.

      One way to check is to type: fix(your data) in the r console, replacing with the actual name of your data object. Have a look to see if you can spot any blank rows/columns, and if so delete them. Try the plot again.

      If this does not fix the issue, can you check if the code works with the data used in this guide?

      Best wishes
      Ben

      Delete
    2. Hi Ben,

      Thanks for your quick reply. My data is currently in .xlsx format and does not appear to have any blank rows or columns when viewing it with the fix() function. I have previously run this code successfully on both your dataset and a dataset of my own, although now I am running into issues using it on an identical dataset (albeit with one additional pollen taxa).

      Thanks,

      Dan

      Delete
    3. Hi Dan,

      Apologies for the late reply - did you get this sorted in the end? If not, i would be happy to take a look at the file, if you could email me that, and the code you are using. My contact info on the contact page.

      Best wishes

      Delete
  16. Hi! This tutorial was incredibly helpful as I am totally new to R. I was wondering about a couple of things: a) would it be possible to add a stack area graph to the end of the plot to show the proportion of trees/herbs/etc.? b) is there a way to append the count numbers for each sample?, and c) is there a command to control the spacing of each sample in the yAxis? It seems that by default it tries to take over as much page as possible but this looks a bit silly when you have just 4 or 6 samples. Thank you very much!

    ReplyDelete
    Replies
    1. Hi Eduardo,

      Glad the guide was useful! Regarding your questions:

      a) I think you would have to create this manually in R, and add it to the pollen diagram using a graphics editor.

      b) Where do you want to append the count numbers?

      c) Do you have an example you can send me, as im not sure of the issue here? My contact details are available on the contact page.

      Best wishes,
      Ben

      Delete
  17. Hi Benjamin,
    Thank you so much for your generous sharing.I have a question about how can I plot my dinoflagellate data in the sediment core against time? I tried to change the code of y-axis but never worked properly.
    Best wishes,
    Suning

    ReplyDelete
    Replies
    1. Hi Suning,

      Can you provide more details about the error you are getting?

      Whether the y-axis is age or depth, they are just numbers, so the code should work the same way. With age, you should use line graphs to display the data rather than bar graphs, which may result in over lapping bars.

      If its easier, please feel free to send the code/data via email and i'll take a look.

      Best wishes,
      Ben

      Delete
    2. Hi Ben, I just sent an email to you. In case it went to the spam, please have a check. Thank you very much for your help!

      Delete
    3. Hi again Suning,

      I have replied to your email, but for the benefit of everyone else who may have this issue - the problem being how to control the location of the y axis tick marks and labels.

      Use the "y.tks" argument in the plot code, and specify where you want these labels/ticks to appear. This can be a simple vector of numbers of a sequence. For example:

      y.tks=seq(from=1, to=5, by=1)

      would create tick marks and corresponding labels at 1, 2, 3, 4, and 5. You could also use the following code to generate the same data

      y.tks=1:5 or y.tks=c(1,2,3,4,5)

      Best wishes,

      Delete
  18. Hello everyone.

    I am a very beginner in R. Can anyone help my with the issue?
    I imported file and created objects but during plotting the diagram I get this message: "missing value where TRUE/FALSE needed". I just did the same, just changing name to my object.

    And thanks Ben for your tutorial.

    ReplyDelete
    Replies
    1. Hi Matiush,

      Apologies for the very late reply! - Did you manage to resolve this? if not please include the code you are using, and i will try and help.

      Best wishes,
      Ben

      Delete
    2. Hello Ben,

      Thanks for your reply. Unfortunately not. Here is the code and file just in case.

      library("rioja")

      pollen <- read.csv("file", header=TRUE,row.names = 1, sep=",", check.names = FALSE)

      pollen.sum <-colSums(pollen)

      pollen1 <-pollen[, which(pollen.sum > 10)]

      p.col <- c(rep("forestgreen", times=20), rep("gold2", times=27))

      y.scale <- 1:40

      pol.plot <- strat.plot(pollen1, yvar=y.scale, y.tks=y.scale, y.rev=TRUE,
      plot.line=FALSE, plot.poly=FALSE, plot.bar=TRUE,
      col.bar=p.col, lwd.bar=10, scale.percent=TRUE,
      xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)

      ###################################

      Link to the file: shorturl.at/ejJSX

      Thank you very much.

      Delete
    3. Hi Matiush - i cannot access the file, perhaps you can email it to me - my details are on the contact or about me page.

      But, i just run the code using my data and i cannot reproduce the error.

      Is the problem with your input data i wonder - send us the file and i'll take a look.

      Best wishes,
      Ben

      Delete
  19. Hi!

    Thank you for the great tutorial! I am using my own data and I have two questions.

    - The line of my x-axis shows up very thick, unlike my nice thin y-axis. It seems like two different scales are used. How could I change this? I used the following code:

    pol.plot <- strat.plot(ma.pollen2, yvar=age, y.tks=age, y.rev=FALSE, plot.line=FALSE, plot.poly=FALSE, plot.bar=TRUE, col.bar=p.col, lwd.bar=5, scale.percent=TRUE, xSpace=0.01, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2, cex.xlabel=0.5, cex.ylabel=0.5, srt.xlabel=60, cex.yaxis=0.5, cex.axis=0.5, title="Haraldstadmyr Bog Pollen", ylabel="Age (Yr)")

    - Some of my graphs turned out empty or almost empty. I was hoping this data would be filtered out by using
    > ma.pollen1 <- ma.pollen[, which(ma.sum > 10)].
    However, a lot of them still end up being 'empty', even when I adjust the number 10 to 100. Another problem I encountered with this code is that it uses the sum of the entire column and thus filters out columns that for example have a low total sum but some important peaks for individual samples. Is there any other way to filter the data set?

    Thank you!

    Best wishes,
    Anna

    ReplyDelete
    Replies
    1. Hi Anna,

      I'm not sure why the x-axis would be thicker than the y axis - there does not appear to be anything in your code that would cause this.

      Are you saving the diagram as a pdf?


      The filtering code i use is just an example, specific to the data i used in the guide. You can filter your data in any way which suits you, you'll just need to amend the code, or add new code. Alternatively, if you are more comfortable with Excel, you could filter your data there before you import into R. Then you can skip this step in R.

      Best wishes,
      Ben

      Delete
    2. Thank you for your quick reply! That is good advice, thank you.

      Yes, I have saved it as a pdf. I can send you an email with the pdf to show the x-axis.
      I can't seem to figure out how to change it.

      Best wishes,
      Anna

      Delete
    3. Hi Anna,

      Yes received the email - i believe the problem is because you are plotting bar plots against age.

      I think you can only use bar plots when you are plotting against depth. For Age, you should use a line or polygon chart.

      That should resolve the problem.

      Best wishes,
      Ben

      Delete
  20. Hi!

    Thank you for the nice tutorial!
    I am using the following code:

    pol.plot <- strat.plot(ma.pollen1,
    yvar=age,
    y.tks=seq(from=-3350, to=2000, by=50),
    y.rev=FALSE,
    plot.line=FALSE,
    plot.poly=TRUE,
    plot.bar=FALSE,
    col.poly=p.col,
    col.poly.line="black",
    lwd.poly=100,
    scale.percent=TRUE,
    min.width=0.2,
    xSpace=0.01,
    x.pc.lab=TRUE,
    x.pc.omit0=TRUE,
    las=2,
    cex.xlabel=0.5,
    cex.ylabel=0.5,
    srt.xlabel=60,
    cex.yaxis=0.5,
    cex.axis=0.5,
    ylabel="Age (CE)")

    However, the black outlline/line around the polygons does not show up, though I did set col.poly.line to "black".

    What could be the cause of this?

    ReplyDelete
    Replies
    1. Hi Anna,


      I have had a look into this, because i also cannot get the border to be plotted around the polygon.

      It looks like this is being caused by an error in the function itself, rather than the code you are using.

      It can be fixed by editing the script code. To do this, use the command:

      fix(strat.plot)

      This should open a new window with a text editor showing the code of the function. At least, this is what happens when using the default R gui program on Windows. RStudio might be different.

      If you take a look at part 2 of this guide, there is a walk through for editing the code.

      Anyway - you need to find the text string that says "border = NA", and you need to replace this with: "border = col.poly.line" - Do this for each occurrence of the string in the function code, except the one where the line starts with rect().

      Save the function code, close the text editor, and then re-run your script, it should now work properly. Although, change the lwd.poly=100, otherwise you will have very big lines indeed!.

      You would have to do this every time you restart R, unless the issue is fixed by the package author.

      Best wishes,
      Ben

      Delete
    2. Thank you so much for your help and figuring it out! It works now! Amazing!
      Thank you for the quick reply!

      Best wishes,
      Anna

      Delete
  21. First of all, thanks to Ben for creating such a useful guide. I've found it really helpful. I was wondering if anyone can help? Is there a way to 'read' the values that the cluster zonation has chosen? I want to be able to say 'Zone 1 is between Xcm and Ycm'), but my samples are very close together (1cm interval) so it's hard to see if the line has been drawn at say 74.5 or 75.5cm.

    ReplyDelete
    Replies
    1. Hi there,

      You can "read" all of the information from the cluster object you created by using the unclass() function.

      In the example i use, i create a cluster object called "ma.chclust", so if you type unclass(ma.chclust) into the R console, it will give you lots of information, and hopefully somewhere will be the info you require :)

      Best wishes,

      Delete
    2. Ah that's brilliant, many thanks!

      Delete
  22. Hi!

    Thank you for the amazing guide!
    I was wondering whether there is any possibility to rotate the diagram, such that the x-axis shows the agse and the y-axis the pollen percentages.

    Thank you!

    ReplyDelete
    Replies
    1. Hi Anna,

      I don't think its possible with rioja - you'd have to create the diagram another way.

      Best wishes

      Delete
  23. Thanks for the great tutorial. Is there any way to re-order the graphs (pollen taxa) aside from re-ordering the columns in the data frame?

    ReplyDelete
    Replies
    1. Hi,

      You can try the "wa.order" argument to re-order the taxa, the options are "topleft" or "bottomleft", and they will re-order the taxa according to their weighted average in relation to the y axis. Check out the help page for more info: ?strat.plot

      But, i think changing the column order in the data frame is the easiest way.

      Best wishes,
      Ben

      Delete
  24. Hi Ben,

    I was wondering if it is possible to add a secondary y axis on the left of the one that the pollen diagram already has?

    Thanks
    Racin

    ReplyDelete
    Replies
    1. Hi Racin,

      In part 3 of this guide I show how to add the y axis manually, and you can use this method to add a secondary y axis as well.

      Unfortunately, rioja does not have an easy way of adding the second axis, so you have to play around to get it right - e.g. adjusting the labels and positioning etc. Check out the third part of this guide for details of the manual plotting... and just run the code twice, and you'll have two y axis.

      Best wishes,
      Ben

      Delete
    2. Hi, Ben thanks a lot for the information, I will look into part 3
      Racin

      Delete
  25. I hope it's ok to jump in here - I can help. There's a package called ggpalaeo written by Richard Telford that will allow you to do this (I assume you want age BP secondary axis?). You need to get it from github. https://github.com/richardjtelford/ggpalaeo Make sure you have updated your version of R or it will throw up an interesting array of error messages. You'll need to mess around with xLeft settings to make sure it doesn't plot over the existing diagram.

    ReplyDelete
    Replies
    1. OOH, yes I want the age BP secondary axis, I'll give it a try
      thank you very much

      Delete
  26. Hi!

    The tutorial was very helpful. I just had one issue, while plotting the bar graphs, I was failing to get different depths from my own data as the y axis. I just got the default row numbers (as I used y.scale <- 2:8). I was unable to set the "yvar" with respect to the different sample depths of my data.
    I used the same code as yours (the first one). If u could kindly guide me about that.

    Thank you.

    ReplyDelete
    Replies
    1. Hello -

      If you used y.scale <- 2:8, and then used yvar=y.scale, this would result (or it should result) in the "depth" values being 2 through 8.

      Perhaps you could provide a bit more information - what values are you wanting to appear in the y axis? and what do you mean by the "default row numbers"?

      Ben


      Delete
    2. Hi,

      Thank you for replying back. So, I used y.scale <- 2:8, and then used yvar=y.scale and the divisions of the Y axis of the graph showed the range from 2-8 (2,3,4,5,6,7,8). Now I have 7 variables as my samples are from 7 different depths. I want to change the y axis values to the depths of my data (20,25,32,45,65,120).

      As you have mentioned earlier "yvar: This argument is a vector that specifies the y axis. This can be the sample number, core depth or age. By default, it will use the row number if this is not specified." how do i specify the yvar according to the core depth?

      Thank you again for helping.

      Delete
    3. Change y.scale to: y.scale <- c(20, 25, 32, 45, 65, 120) and it should produce the result you want.

      Delete
    4. Ok, thank you very much for your help.

      Delete
    5. No worries - I guess you have these values as part of your pollen data that you imported into R.

      The guide is perhaps a little confusing here... and you don't really need to set "y.scale".

      Assuming you imported some pollen data, that contains a column of depth values, you could equally get the result you want using:

      yvar=data$depth, or yvar=data[,1]

      where data = your pollen data, and $depth or [,1] = the name or number of the column.

      I should update the guide to make this clearer! Hopefully you have the result you wanted! :)

      Delete
  27. Hi there, what is the code to reduce the size of the pollen labels? Mine are overlapping but I don't want to reduce my X_Space any further. Thank you!

    ReplyDelete
    Replies
    1. Hi Emma,

      Apologies for late reply - add the argument cex.xlabel to your code, and this can control the label size for the pollen names. To make them smaller, you could try cex.xlabel=0.5 for example.

      Best wishes,
      Ben

      Delete
  28. Hi Ben,
    Thanks so much for this tutorial, as an infrequent used of R I keep going back to it and still find it so useful after (ahem) all those years.
    This time I find myself with a set of counted taxa (ostracodes) I'd like to express in abundances and not in percent. The problem is that it'd work better with an x axis in log scale, which I can't seem to do with strat.plot. I tried to add log="x" in the arguments but it seems to mess so much with the rest of the code that I only get the y and x axes of each graph plus some squiggles.
    Do you know if there's a way to do it without directly transforming the data in the csv?

    Thank you!

    ReplyDelete
    Replies
    1. Hi,

      You could try the log() function in R and then plot using strat.plot()?

      e.g.

      strat.plot(log(YOURDATA), scale.percent=FALSE)

      Best wishes,
      Ben

      Delete
    2. Thanks a lot!
      I tried it and it sorts of works (except with scale.percent=TRUE because I still need the graph widths to be the same when the number of individuals is the same), but of course when the number is 0 you get log(0) -> error which interrupts the graph line.
      I ended transforming the csv (and replacing the errors with 0) before making the diagram with strat.plot and then uploaded the result in Inkscape to change the x labels to the counted data values instead of their logarithm.

      Best wishes,
      Helene

      Delete
  29. Hello Ben,

    I was wondering if you could help. I am trying to add total pollen concentration as a separate column to my pollen percentage data diagram that has gone through cluster analysis. And once done, ensuring that the x axis scale isn't negatively impacted (enlarged/shrunk).

    I have searched and seen other questions/solutions on the page about removing the total pollen concentration column from the plot and plotting the pollen data with pollen concentration as the cluster analysis (22 January 2020 at 19:43) but not plotting pollen percentage data (with cluster analysis) alongside total pollen concentration.

    If this helps, I have imported my data set, removed the depth, age and total pollen concentration column, gone through the steps of filtering out rare pollen taxa, run cluster analysis on the filtered pollen percentage data and plotted this diagram. It's the last step I need help with of then adding the total pollen concentration back into this diagram.

    Thank you in advance.

    Felix.

    ReplyDelete
    Replies
    1. Hi Felix,


      This should be possible - take a look at the help page for the function - ?strat.plot - and it gives the following example:

      # use fig to contol diagram size and position
      x <- strat.plot(spec, xRight = 0.7, yvar = depth, y.rev=TRUE,
      scale.percent=TRUE, title="Round Loch of Glenhead", ylabel="Depth (cm)")
      # add curves for first two DCA components of diatom data
      dca <- decorana(spec, iweigh=1)
      sc <- scores(dca, display="sites", choices=1:2)
      strat.plot(sc, xLeft = 0.7, yvar = depth, y.rev=TRUE, xRight=0.99,
      y.axis=FALSE, clust=clust, clust.width=0.08, add=TRUE)

      You should be able to apply something similar for your scenario.

      Essentially, you need to plot the first part of the pollen diagram without concentration and without the cluster analysis - but, you should add the argument xRight=0.7 to the plot code. (You will likely have to play around with the value).

      Then, you need to plot the second part of the diagram with just the concentration and cluster analysis, and this time add the argument xLeft=0.5 (again adjust value as needed), and also, y.axis=FALSE and add=TRUE to the plot code.

      This should hopefully achieve the desired outcome - let me know if any problems - if this doesn't work, then you'll likely need to manually add the additional graph/curves.

      Best wishes,
      Ben

      Delete
  30. Hello! Thank you for this very usefull blog! I was trying to create a line plot of my data but it seems that whichever combination of xSpace= and graph.width= I use the outcome is very compressed and unredable. My code is : pol.plot <- strat.plot(data2[,2:9], y.tks=y.scale, y.rev=TRUE, plot.line=TRUE, plot.poly=FALSE, plot.bar=FALSE, col.line=p.col, lwd.line=2, scale.percent=FALSE,
    xSpace=0.000000001, graph.widths =1, x.pc.lab=TRUE, x.pc.omit0=TRUE, las=2)
    Do you have any suggestions?
    Thank you,
    Stavroula

    ReplyDelete
    Replies
    1. Hi Stavroula,

      Is this your own data, or the example data? It's a little hard to diagnose, so if you have the data you can share, please drop me an email (details on about me page) and i'll take a look.

      Ben

      Delete
  31. Hi Ben, thanks for all your work on this code and tutorial which has been really useful to me. My samples are irregularly spaced but I have row names which are the depth of samples in cm. Is there any way to show the data using these values as the y-axis? (Apologies if this is covered above!) I've followed the guide on how to change the labels to show the depth.
    Mike

    ReplyDelete
    Replies
    1. Hi Mike,

      Check out part 2 and 3 of the guide which shows a couple of ways to do that.

      Best wishes,
      Ben

      Delete
  32. Hello Ben,

    Can you please help on how to do DCA (Detrended Correspondence Analysis) ? Can you please share some resources where I can find more about it?

    ReplyDelete
    Replies
    1. Not entirely sure, but perhaps the vegan r package might have some functions to do this. I haven't used R to do DCA before.

      Delete

Comments are moderated. There may be a delay until your comment appears.