RasterStacks and raster::plot

Following on from the previous guide looking at WorldClim data, in this guide I take a look at using RasterStacks to import WorldClim datasets. In the previous guide, we used RasterLayers to import the data, so what is the difference? A RasterStack is a collection of RasterLayers bundled together, or stacked into one object. If you're only dealing with a few RasterLayers, then a RasterStack might not be necessary. However, when the number of RasterLayers increase, it can be easier to work with RasterStacks.

For example, if you wanted to look at minimum, maximum, and mean temperature data, along with precipitation data from WorldClim, using RasterLayers would require you to create 48 separate objects. Using a RasterStack, you could instead create 4 objects containing all the data for each variable.

This guide will also take a look at some of the additional options you have when plotting RasterStacks and other raster objects. Did you know that when you plot a raster object in R using plot(), it actually invokes the raster::plot function, while standard objects (e.g. scatterplots) would invoke graphics::plot? The previous guide suggested you might want to plot a figure which includes separate plots for each month of the year, or you might want a minimum and maximum temperature plot? I'll show you how you can do this easily using RasterStacks and raster::plot.


© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Importing data into a RasterStack

! This guide was written using R version 3.4.2 on Windows 10.

For this guide, we're going to use mean, minimum and maximum temperature data from WorldClim. This guide follows on from the previous guide, it is helpful, but not necessary to have completed it. If you did follow the previous guide, then you should already have the mean temperature data. Go ahead and download the min and max temperature data. Instructions for downloading can be found in the previous guide section: Downloading WorldClim Data

We'll be using the "raster" package, and you should put the additional WorldClim data into your "WorldClim" R project folder. You should extract the mean, min, and max temperature files into separate folders, e.g. ../WorldClim/wc2.0_30s_tavg/, ../WorldClim/wc2.0_30s_tmin/, ../WorldClim/wc2.0_30s_tmax/. If you have a different folder setup, you should amend the code below (e.g. replace ../ with your file path).

# Load packages
library(raster)

# Create RasterStack objects
t.mean.files <- list.files("../WorldClim/wc2.0_30s_tavg/", ".tif", full.names=TRUE)
t.mean <- stack(t.mean.files)

t.min.files <- list.files("../WorldClim/wc2.0_30s_tmin/", ".tif", full.names=TRUE)
t.min <- stack(t.min.files)

t.max.files <- list.files("../WorldClim/wc2.0_30s_tmax/", ".tif", full.names=TRUE)
t.max <- stack(t.max.files)

The first part of the code creates a list of all the .tif files within the specified folder, while the full.names=TRUE argument tells R to return the complete file path. Type "t.mean.files" into the R console to see the list. If you are using Windows and you setup your R directory structure as C:\R\WorldClim\, then "t.mean.files" would look something like this:

> t.mean.files
 [1] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_01.tif"
 [2] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_02.tif"
 [3] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_03.tif"
 [4] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_04.tif"
 [5] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_05.tif"
 [6] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_06.tif"
 [7] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_07.tif"
 [8] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_08.tif"
 [9] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_09.tif"
[10] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_10.tif"
[11] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_11.tif"
[12] "C:/R/WorldClim/wc2.0_30s_tavg/wc2.0_30s_tavg_12.tif"

The second part of the code t.mean <- stack(t.mean.files) creates the RasterStack object from the list of files in "t.mean.files".

If you were to take a look at the "t.mean" object in the R console, you would see it has an additional "dimension", nlayers, with a value of 12. These layers are the individual data files from the list you created ("t.mean.files"). You'll also notice that the "names" attribute is automatically populated. Lets rename these layers to something more useful.

# Rename layers in the RasterStack
month <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

names(t.mean) <- month
names(t.min) <- month
names(t.max) <- month

If you look at the "t.mean" object now, it should look something like this:

> t.mean
class       : RasterStack 
dimensions  : 21600, 43200, 933120000, 12  (nrow, ncol, ncell, nlayers)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
names       :   Jan,   Feb,   Mar,   Apr,   May,   Jun,   Jul,   Aug,   Sep,   Oct,   Nov,   Dec 
min values  : -46.9, -45.5, -57.2, -63.0, -63.6, -63.2, -66.8, -64.7, -62.7, -54.5, -42.0, -45.7 
max values  :  34.4,  33.2,  34.6,  34.8,  36.4,  38.4,  47.1,  46.2,  36.5,  34.4,  33.7,  33.8

In the previous guide we extracted data for our sample sites from separate RasterLayers, and "built" a data frame using separate commands for each month. With a RasterStack, it is possible to extract all the data using a single command.

# Create a data.frame with sample site coordinates
site <- c("Manchester", "Liverpool", "Oxford", "London")
lon <- c(-2.24, -2.98, -1.25, -0.11)
lat <- c(53.47, 53.4, 51.74, 51.49)
samples <- data.frame(site, lon, lat, row.names="site")

# Extract data from RasterLayer
temp.data <- extract(t.mean, samples)

The data frame should now contain mean temperature data for your samples sites for every month.

> temp.data
     Jan Feb Mar Apr  May  Jun  Jul  Aug  Sep  Oct Nov Dec
[1,] 4.2 4.3 6.1 8.0 11.3 14.2 16.4 16.1 13.4 10.4 7.0 4.9
[2,] 4.5 4.5 6.2 8.1 11.4 14.2 16.3 16.0 13.6 10.5 7.3 5.4
[3,] 4.5 4.6 6.7 8.5 12.0 15.0 17.3 16.9 14.1 10.8 7.2 5.3
[4,] 5.2 5.1 7.5 9.4 13.1 16.1 18.3 18.2 14.8 11.6 8.2 6.1

If you compare this data frame to the one we created in the previous guide, you'll see it contains the same data, but it was much easier to create.

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Subsetting a RasterStack

In the previous guide we subsetted the individual RasterLayers to a smaller geographical area, and we defined the extent() using coordinates we input. It's also possible to define the extent using another object, rather than typing in the coordinates, for example, using the outline of the country we want to subset data for.

Lets use the outline of Morocco to define our extent. First, we need to get the data from the Global Administrative Areas (GADM) database. You can do this using a command within R.

Morocco <- getData('GADM', country="MAR", level=0)

This code will automatically download a "SpatialPolygonsDataFrame" to the R working directory, which contains an outline of Morocco. The country code refers to ISO-country codes, and you can find a list here, either the two or three letter code should work.

You could plot this object to confirm you have the right country: e.g.

plot(Morocco)

Now, to subset the data, you simply define "Morocco" as the extent() and then crop() the RasterStack object.

# Crop data to Morocco
M.area <- extent(Morocco)

ma.t.mean <- crop(t.mean, M.area)
ma.t.min <- crop(t.min, M.area)
ma.t.max <- crop(t.max, M.area)

This will create a new RasterBrick object, containing data subsetted from each layer within the RasterStack object to a square area large enough to fit all of Morocco within it. A RasterBrick is similar to a RasterStack, except the data source is now a single file rather than a collection of files. (R creates a temporary data file when you create a new object using crop). Type "ma.t.mean" into the R console for details:

> ma.t.mean
class       : RasterBrick 
dimensions  : 991, 1460, 1446860, 12  (nrow, ncol, ncell, nlayers)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -13.16667, -1, 27.66667, 35.925  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : ..\raster\r_tmp_2018-01-30_151001_9116_42606.grd 
names       :  Jan,  Feb,  Mar,  Apr,  May,  Jun,  Jul,  Aug,  Sep,  Oct,  Nov,  Dec 
min values  : -8.4, -7.1, -4.9, -1.0,  2.5,  7.9, 12.0, 11.3,  7.2,  1.1, -4.2, -7.0 
max values  : 17.9, 18.8, 21.5, 25.8, 30.2, 35.1, 38.5, 37.6, 33.2, 26.8, 21.0, 18.7 

Note that the number of data points, the extent, and the min/max values have all changed from the larger "t.mean" object.

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Plotting raster objects with raster::plot

When you plot a raster object in R using plot() the command is sent to the plot() function contained within the raster package, rather than the standard plot() function which is part of the default "graphics" package. To see what I mean, type ?graphics::plot and ?raster::plot into the R console, which will invoke their respective help pages.

R is smart enough to know what type of object it is that you are plotting and sends it to the correct function automatically. Hence, why you do not need to use the full command raster::plot(ma.t.mean) when plotting your raster objects. Why is this important? Well, raster::plot() contains some additional arguments which you can use when plotting, that are not available in graphics::plot().

Lets plot mean August temperatures for Morocco. Because our data is now contained within a single multi-layer RasterBrick object, we have to specify which layer we want to plot for August temperatures.

tempcol <- colorRampPalette(c("purple", "blue", "skyblue", "green", "lightgreen", "yellow", "orange", "red", "darkred"))
plot(ma.t.mean$Aug, zlim=c(-10, 40), col=tempcol(100))

Which results in the following plot:

In the previous guide, we added the country outline, place names, and a title to the plot figure. You can do that using the same commands when plotting a single layer from a RasterBrick/RasterStack object.

But, what if you want to plot all 12 months of the year contained within the RasterBrick/RasterStack object on a single figure? The following code will produce a multi-panel figure:

plot(ma.t.mean, zlim=c(-10, 40), col=tempcol(100))

Simple! But, we wanted to add the place names and country outline to each figure. To do this, we need to firstly create a function() which will contain the additional commands, then we need to use the addfun= argument in our plot command:

# Create place names (This is not part of the function)
place <- c("RABAT", "Casablanca", "Marrakech", "Agadir")
place.lon <- c(-6.83,  -7.60, -7.99, -9.6)
place.lat <- c(33.96,  33.544, 31.63, 30.421)

## Create function to add additional things to the map
ma.map <- function() {
plot(Morocco, add=TRUE)
points(place.lon, place.lat, pch=15, col="black", cex=1.5)
text(place.lon, place.lat, labels=place, pos=2, cex=1.5)
}

# Plot
plot(ma.t.min, zlim=c(-10, 40), col=tempcol(100), addfun=ma.map)

What if we also want to change the layout of our figure? When plotting "normal" objects, or a single RasterLayer, you would use layout() to control the layout of the plot window. For multi-layer raster objects, you instead control the layout within the plot() command itself using the arguments nc= and nr= which define the number of columns and rows respectively. e.g.

plot(ma.t.mean, zlim=c(-10, 40), col=tempcol(100), addfun=ma.map, nc=3, nr=4)

Which results in the following plot:

Have a look at the help page ?raster::plot for all the plot options available and experiment with your plots.

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Minimum and maximum temperature map, using mask()

Lets create a figure that shows the coldest and hottest places in Morocco during the year. First, we'll need to create two new raster objects containing the minimum and maximum values for the year using the calc() function from the raster package.

ma.t.MIN <- calc(ma.t.min, min)
ma.t.MAX <- calc(ma.t.max, max)

This code creates a new raster object, calculated from either a RasterStack/Brick (ma.t.min) or a collection of RasterLayers. The second argument in the code tells R what to calculate, in this case, the min and max values.

Looking at these new objects in the R console, you can easily see the min and max values, which is useful to know when plotting to specify the scale. You'll also see that they are now a single layer RasterLayer object.

> ma.t.MIN
class       : RasterLayer 
dimensions  : 991, 1460, 1446860  (nrow, ncol, ncell)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -13.16667, -1, 27.66667, 35.925  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : in memory
names       : layer 
values      : -9.6, 13.7  (min, max)

> ma.t.MAX
class       : RasterLayer 
... omitted ...
values      : 13.2, 45.8  (min, max)

In the plot we created earlier, it showed Morocco and Algeria. If you want to only show a specific area, e.g. just Morocco, you can do this by masking the raster object to the country outline using mask() from the raster package. e.g.

# Mask raster 
ma.t.MIN.mask <- mask(ma.t.MIN, Morocco)
ma.t.MAX.mask <- mask(ma.t.MAX, Morocco)

Note, that mask() is not the same as crop() in that it "hides" the extra data, by changing the values outside the mask area to NA (not available), rather than removing them. Take a look at "ma.t.MIN.mask" in the R console:

> ma.t.MIN.mask
class       : RasterLayer 
dimensions  : 991, 1460, 1446860  (nrow, ncol, ncell)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -13.16667, -1, 27.66667, 35.925  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : in memory
names       : layer 
values      : -9.6, 13.7  (min, max)

You can see that the masked object contains the same number of data points (ncell) as the unmasked object (ma.t.MIN). However, since the values outside the mask have been changed to NA, when you plot the mask, you'll only see Morocco.

mask() is also useful if you want to do data analysis for a specific area or region on the map. To demonstrate the differences between the two objects, let's convert them to a data frame, remove the missing values, compare the number of rows in each data frame, and calculate the mean values.

First, convert the masked, and unmasked raster objects to a data frame - we'll just use the min temperature data for this example.

df.t.MIN <- as.data.frame(ma.t.MIN, xy=TRUE, na.rm=TRUE)
df.t.MIN.mask <- as.data.frame(ma.t.MIN.mask, xy=TRUE, na.rm=TRUE)

Next, lets compare the number of rows of the two data frames in the R console:

> lengths(df.t.MIN)
     x      y  layer 
909102 909102 909102 
> lengths(df.t.MIN.mask)
     x      y  layer 
567647 567647 567647 

You can see that in fact the masked object has significantly fewer data points (containing actual data) than the masked object. Finally, lets calculate the mean minimum temperature values for each object.

> colMeans(df.t.MIN)
     x      y  layer 
-5.322814 31.049857  2.633266  
> colMeans(df.t.MIN.mask)
     x      y  layer 
-6.276108 31.846655  2.285625 

Looking at the masked object, we can see that Morocco has an average minimum temperature of -2.3 ℃, compared to -2.6 ℃ for the wider area. Using this technique, you could for example, also calculate the average maximum temperatures for the entire country to know the average temperature range. We'll come back to using mask() in later guides.

Back to our minimum and maximum temperature map for Morocco, lets go ahead and plot this! Because these objects are RasterLayers, we'll have to use layout() to setup the plot layout.

layout(matrix(1:2, ncol=2))
plot(ma.t.MIN.mask, zlim=c(-10, 50), col=tempcol(200), main="Minimum Temperatures")
plot(Morocco, add=TRUE)
plot(ma.t.MAX.mask, zlim=c(-10, 50), col=tempcol(200), main="Maximum Temperatures")
plot(Morocco, add=TRUE)

Which will result in the following plot:

Masks can be useful if you want to highlight a specific area. E.g. you might want to create a map that shows climate data for only certain States in America. Or, like in the above example you want to highlight one specific country.

Lets add the countries surrounding Morocco to our map - we'll just plot maximum temperatures this time.

Firstly, we need to download the additional country outlines from the GADM database. Why don't we just use the "world" outline from the "maps" package as we did in the previous guide? Well, the resolution of that outline is much lower than the individual country outlines, so will not look as good.

# Get country outlines
Algeria <- getData('GADM', country="DZ", level=0)
Spain <- getData('GADM', country="ES", level=0)
Portugal <- getData('GADM', country="PT", level=0)
WesternSahara <- getData('GADM', country="EH", level=0)
Mauritania <- getData('GADM', country="MR", level=0)

Next, since the extent for the "ma.t.MAX.mask" object is limited to Morocco, we'll first plot a dummy plot with a bigger extent. We'll then plot our max temperatures raster of Morocco, and additional country outlines.

However, before we plot the additional countries, we need to first set the clipping clip() region for our plot to match the coordinates defined in the dummy plot. If we skip this step, then the additional countries will be plotted outside the main figure.

# Create a "masked" plot. Full guide available at http://www.benjaminbell.co.uk
par(mar=c(5, 5, 5, 5)) # Sets plot margins
plot(0, xlim=c(-14, 0), ylim=c(26, 38), xaxs="i", yaxs="i", ann=FALSE) # Dummy plot with larger extent
plot(ma.t.MAX.mask, zlim=c(-10, 50), col=tempcol(200), add=TRUE) # Our max temperature object
title(main="Maximum Temperatures \nin Morocco") # Add title
clip(-14, 0, 26, 38) # Set clipping area
plot(Algeria, col="#F0F0E2", add=TRUE) # Add additional countries
plot(WesternSahara, col="#F0F0E2", add=TRUE)
plot(Spain, col="#F0F0E2", add=TRUE)
plot(Portugal, col="#F0F0E2", add=TRUE)
plot(Mauritania, col="#F0F0E2", add=TRUE)
plot(Morocco, lwd=2, border="black", add=TRUE) # Add a thicker border around Morocco
box() # Draw a box around the figure

Which results in the following figure:

Thanks for reading! As ever, if you have any comments of questions please leave them below.


Ad

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Further Reading

Extracting Worldclim climate data - Downloading, extracting and plotting WorldClim data - using RasterLayers.


No comments:

Post a Comment