Extracting data and making climate maps using WorldClim datasets

In this guide, i'll show you another way in which you can get climate data, by using WorldClim global climate datasets. This guide will take you through all the steps for downloading, opening, extracting, and plotting the data using R.

I'll show you how you can make some good looking climate maps using WorldClim data, for anywhere in the world. And, the guide will also look at some of the differences between the WorldClim datasets and CRU datasets (we looked at this in earlier guides), and why you might choose to use one or the other.

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

WorldClim global climate datasets

! This guide was updated on 01/02/2018.
Added reference to RasterStacks, and links to a new guide showing how to use them.
! This guide was written using R version 3.4.2 on Windows 10.

This is part three of my guides for getting climate data and working with it using R. If you missed part one or part two, you can go back and read them, but it is not necessary to have worked through them to follow this guide (although it may help!).

WorldClim is a collection of global gridded climate data "layers", which include temperature (mean, max, min), precipitation, solar radiation, wind speed and water vapour pressure. Unlike CRU datasets which contain data for every month between 1901 and 2016, WorldClim contains monthly averages for the period between 1970 and 2000. However, the resolution of WorldClim is much higher, with grid squares representing an area ~1 km2, versus the ~55 km2 resolution of CRU data. Both datasets are based on weather station data.

WorldClim data is also stored in a different file format. You might remember that CRU data came as NetCDF files, which are basically large data arrays. WorldClim data on the other hand can be download as GeoTiff files. These are essentially image files that contain additional metadata - such as map projection and coordinates.

Ultimately, the dataset you choose to work with will depend on the application, what kind of data you need, and personal preference. CRU data is better for looking at long-term data and identifing trends, while WorldClim might be better if you just want to know the typical climate conditions for an area.

Of course, WorldClim and CRU are not the only climate datasets out there, there are many, many more! But, since most datasets tend to provide either NetCDF or GeoTiff files, you should be able to use these guides as a basis for working with other datasets.

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Downloading WorldClim data

For this guide, we'll look at average temperature. The latest version of WorldClim datasets is version 2.0. This includes files for "current" climate conditions (1970 - 2000), the previous version 1.4, also contains data for past climate (6000, and 20,000 years ago!), and future climate scenarios. For detailed information about the dataset, check out the associated research paper.

So, head over to the WorldClim site to download version 2.0 of the average temperature data. For this guide, you should download the "30 seconds" file (which is the highest resolution), and the link should be named "tavg 30s". It'll look something like this:

The .zip file will contain 12 separate files named "wc2.0_30s_tavg_01.tif", "wc2.0_30s_tavg_02.tif" etc. Each file contains data for one month with 01 being January, 02 February etc. Extract the files to a new R project folder, e.g. "WorldClim".

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Extracting WorldClim data

To work with GeoTiff files in R, we'll once again use the "raster" package. If you have not already installed this package, go ahead and install.packages("raster") to install it. And remember to load the package after installation library(raster)

For this guide, we are going to load all the data into R by creating separate raster objects: RasterLayers, for each month.

Update: You can also load all the data into R at the same time using a RasterStack. Have a look at this guide to find out how.

# Import WorldClim Mean Temp data
temp1 <- raster("wc2.0_30s_tavg_01.tif")
temp2 <- raster("wc2.0_30s_tavg_02.tif")
temp3 <- raster("wc2.0_30s_tavg_03.tif")
temp4 <- raster("wc2.0_30s_tavg_04.tif")
temp5 <- raster("wc2.0_30s_tavg_05.tif")
temp6 <- raster("wc2.0_30s_tavg_06.tif")
temp7 <- raster("wc2.0_30s_tavg_07.tif")
temp8 <- raster("wc2.0_30s_tavg_08.tif")
temp9 <- raster("wc2.0_30s_tavg_09.tif")
temp10 <- raster("wc2.0_30s_tavg_10.tif")
temp11 <- raster("wc2.0_30s_tavg_11.tif")
temp12 <- raster("wc2.0_30s_tavg_12.tif")

Lets take a look at one of the files in more detail. Simply type "temp1" into the R console:

> temp1
class       : RasterLayer 
dimensions  : 21600, 43200, 933120000  (nrow, ncol, ncell)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : wc2.0_30s_tavg_01.tif 
names       : wc2.0_30s_tavg_01 
values      : -46.9, 34.4  (min, max)

This provides some information about the file. For example, we can see how many data points there are (ncell), the coordinate system used, and the min/max values of the data. The temperature data is in degrees Celsius (℃), and we can see that the coldest average temperature in January was -46.9℃ somewhere in the world!

To extract data for your sample sites from WorldClim datasets, we'll use the same commands as we did for CRU data. Even though the raw data files are in different formats, loading them as raster objects in R means we can manipulate them in the same way.

For this example, we'll create a data frame with the sample site coordinates.

# Create a data.frame with sample site coordinates
site <- c("Manchester", "Liverpool", "Oxford", "London")
lon <- c(-2.24, -2.98, -1.25, -0.11)
lat <- c(53.47, 53.4, 51.74, 51.49)

samples <- data.frame(site, lon, lat, row.names="site")

Here, we created three vector objects containing our data, then we created the data frame object by telling R which vectors to use to construct the data frame. The vectors will become columns in the data frame. Specifying row.names="site" tells R that the "site" vector should be used for the row names in the data frame. To check that the data frame was created as you intended, simply type "samples" into the R console.

> samples
             lon   lat
Manchester -2.24 53.47
Liverpool  -2.98 53.40
Oxford     -1.25 51.74
London     -0.11 51.49

In the previous guide we extracted all the raw data to a new data frame using one command. Since our WorldClim data is stored in separate RasterLayer objects, we will instead "build" our new data frame by extracting each month separately, adding this data as a new column to the data frame.

# Extract data from WorldClim for your sites
temp.data <- samples 
temp.data$Jan <- extract(temp1, samples)
temp.data$Feb <- extract(temp2, samples)
temp.data$Mar <- extract(temp3, samples)
temp.data$Apr <- extract(temp4, samples)
temp.data$May <- extract(temp5, samples)
temp.data$Jun <- extract(temp6, samples)
temp.data$Jul <- extract(temp7, samples)
temp.data$Aug <- extract(temp8, samples)
temp.data$Sep <- extract(temp9, samples)
temp.data$Oct <- extract(temp10, samples)
temp.data$Nov <- extract(temp11, samples)
temp.data$Dec <- extract(temp12, samples)

First, we duplicated the "samples" data frame, and called it "temp.data" - this is the data frame that will contain our extracted temperature data. extract(temp1, samples) tells R to extract the data contained in the "temp1" object, using the coordinates in the "samples" object. temp.data$Jan tells R that the extracted data should be added as a new column named "Jan" to the existing data frame "temp.data" - Then we simply repeat for each month.

The data frame should now contain average temperature data for each month for your sample sites.

Update: With RasterStack objects it is possible to extract all the data in one command, rather than "building" the data frame. This guide shows you how to do it. So why use RasterLayers instead of a RasterStack? Well, if you only want to look at a small number of files, it might be simpler to create a RasterLayer. You also might want more control over the individual layers in a RasterStack, or want more ways to manipulate the data (e.g. such as creating the seasonal data), so creating separate RasterLayers might be preferable. The beauty of R is there is almost always more than one way to do something, with different options to suit different applications. It is important when learning R to be aware of different approaches to working with data, to improve your understanding of the language.

> temp.data
             lon   lat Jan Feb Mar Apr  May  Jun  Jul  Aug  Sep  Oct Nov Dec
Manchester -2.24 53.47 4.2 4.3 6.1 8.0 11.3 14.2 16.4 16.1 13.4 10.4 7.0 4.9
Liverpool  -2.98 53.40 4.5 4.5 6.2 8.1 11.4 14.2 16.3 16.0 13.6 10.5 7.3 5.4
Oxford     -1.25 51.74 4.5 4.6 6.7 8.5 12.0 15.0 17.3 16.9 14.1 10.8 7.2 5.3
London     -0.11 51.49 5.2 5.1 7.5 9.4 13.1 16.1 18.3 18.2 14.8 11.6 8.2 6.1

So that's how you can extract data for your sample sites from WorldClim datasets. You could now perform different kinds of analysis on the data, or plot some climate graphs. Refer back to part two of this guide for help. For example, you could calculate mean annual temperature for your sites by using rowsMeans() e.g.

> rowMeans(temp.data[3:14])
Manchester  Liverpool     Oxford     London 
  9.691667   9.833333  10.241667  11.133333 

Or mean summer temperatures:

> rowMeans(temp.data[8:10])
Manchester  Liverpool     Oxford     London 
  15.56667   15.50000   16.40000   17.53333  
© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Plotting climate maps using WorldClim data

Since the resolution of WorldClim datasets is very high, its possible to plot some good looking climate maps without too much work! So, lets make some climate maps - first, we'll create a colour scale for our temperature data - you are free to use any colours you like for your maps, so why not experiment creating different schemes? try ?colorRampPalette and ?colorRamp in the R console for help.

tempcol <- colorRampPalette(c("purple", "blue", "skyblue", "green", "lightgreen", "yellow", "orange", "red", "darkred"))

Next, we'll plot a map of global January temperatures using this colour scheme:

plot(temp1, col=tempcol(100))

The number in brackets after "tempcol" tells R how many colours to "create", using the colours defined in the "tempcol" object as its palette. i.e. it blends these colours together. So, tempcol(100) creates 100 colours which will create a "smooth" colour scale.

You could also create a map of the January temperatures for the UK by defining the boundaries of the plot, using xlim=c() and ylim=c() e.g.

plot(temp1, xlim=c(-12, 4), ylim=c(48, 64), col=tempcol(100))

Which will result in the following map:

But now the colour scale looks a bit misleading. This is because R automatically sets the scale to the min and max values that you are plotting, so our colour scale no longer makes sense. It is possible to override this using zlim=c() e.g.

plot(temp1, xlim=c(-12, 4), ylim=c(48, 64), zlim=c(-10,30), col=tempcol(100))

Now this map looks more realistic!

You can also use zlim=c() to define the range of values you want to plot. Lets re-plot the world map, but this time we'll just plot the areas with temperatures above 25℃ in August. We'll also use the inbuilt colour scale "heat.colors".

plot(temp8, zlim=c(25, 50), col=rev(heat.colors(20)))

Could be useful for planning your next holiday? - But, perhaps it is now a bit hard to see where these places are. Lets add a simple outline of the world using the "maps" package - install.packages("maps") to install it.

plot(temp8, zlim=c(25, 50), col=rev(heat.colors(20)))
map("world", add=TRUE)

Which results in the following map:

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Subsetting WorldClim data

You may have noticed that it can take R quite a long time to plot maps using the WorldClim dataset (unless you have a very powerful computer!) - this is because, as their name implies, they contain data for the whole world, so when loaded into R, they use a lot of memory. If you're only interested in a small part of the world, it makes sense to subset the data before you plot it. We did this in part one of the guide when we plotted a map of the UK with the CRU dataset.

After subsetting the WorldClim raster object to a smaller area, you can also combine the objects - for example, to create seasonal or annual temperature maps. (You can also do this with the full data, if your computer is powerful enough!)

Lets take a look at another part of the World. Most of my research has been focused in Morocco, so lets look at temperatures there. First, we'll define the extent() and then use crop() to subset our WorldClim data to Morocco.

# Subset Morocco data
ma.area <- extent(-15, 0, 27, 37)
ma.temp1 <- crop(temp1, ma.area)
ma.temp2 <- crop(temp2, ma.area)
ma.temp3 <- crop(temp3, ma.area)
ma.temp4 <- crop(temp4, ma.area)
ma.temp5 <- crop(temp5, ma.area)
ma.temp6 <- crop(temp6, ma.area)
ma.temp7 <- crop(temp7, ma.area)
ma.temp8 <- crop(temp8, ma.area)
ma.temp9 <- crop(temp9, ma.area)
ma.temp10 <- crop(temp10, ma.area)
ma.temp11 <- crop(temp11, ma.area)
ma.temp12 <- crop(temp12, ma.area)

When you plot the these new objects, it should be much quicker. Lets plot a map of August temperatures in Morocco. We'll also add some place names for reference points.

# Plot
plot(ma.temp8, zlim=c(-10,40), col=tempcol(100))
map("world", add=TRUE, lwd=1.5) # Add country outlines
# Add place names
place <- c("RABAT", "Casablanca", "Marrakech", "Agadir")
place.lon <- c(-6.83,  -7.60, -7.99, -9.6)
place.lat <- c(33.96,  33.544, 31.63, 30.421)
points(place.lon, place.lat, pch=15, col="black", cex=1.5)
text(place.lon, place.lat, labels=place, pos=2, cex=1.5)
# Add title
title(main="Mean August Temperatures\n(1970 - 2000)")

From this map we can easily see the temperature range across Morocco during August. Looks like fairly nice summer temperatures for most of the country, getting much cooler in the mountains, and much hotter along the southern border with Algeria.

If you wanted to quickly know the temperature value for any point on the map, you could use the click() command from the raster package. After inputting the command in the R console, simply click anywhere on the map, and the value of where you clicked will be output to the R console. To stop clicking, press escape on the keyboard, or "stop" on the menu bar.

> click(ma.temp8, xy=TRUE)
        x        y value
1 -5.8625 34.22917  26.5
        x        y value
1 -6.9625 33.00417  25.7
        x        y value
1 -8.4125 32.70417  25.4
          x       y value
1 -4.629167 30.8625  33.5
          x        y value
1 -8.245833 29.27917  31.3

(Your results will vary!)

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Creating seasonal plots

Using the subsetted WorldClim data for Morocco, we can easily create seasonal temperature maps. First, we'll create the seasonal data.

ma.temp.spring <- mean(ma.temp3, ma.temp4, ma.temp5)
ma.temp.summer <- mean(ma.temp6, ma.temp7, ma.temp8)
ma.temp.autumn <- mean(ma.temp9, ma.temp10, ma.temp11)
ma.temp.winter <- mean(ma.temp12, ma.temp1, ma.temp2) 

In this code we are simply creating new raster objects which contain the average temperature values for each season.

Now, we'll plot all four seasons in one figure:

# Full guide available at http://www.benjaminbell.co.uk
# 4-part seasonal temperature plot using subsetted WorldClim data

layout(matrix(1:4, ncol=2, byrow=TRUE)) # Set up plot layout

# Spring
par(mar=c(4, 4, 4, 4)) # Set margin
plot(ma.temp.spring, zlim=c(-10,40), col=tempcol(100))
map("world", add=TRUE, lwd=1.5) # Add country outlines
points(place.lon, place.lat, pch=15, col="black", cex=1) # Add places
text(place.lon, place.lat, labels=place, pos=2, cex=1) # Add place names
title(main="Mean Spring Temperatures") # Title

# Summer
par(mar=c(4, 4, 4, 4)) 
plot(ma.temp.summer, zlim=c(-10,40), col=tempcol(100))
map("world", add=TRUE, lwd=1.5) 
points(place.lon, place.lat, pch=15, col="black", cex=1) 
text(place.lon, place.lat, labels=place, pos=2, cex=1)
title(main="Mean Summer Temperatures") 

# Autumn
par(mar=c(4, 4, 4, 4)) 
plot(ma.temp.autumn, zlim=c(-10,40), col=tempcol(100))
map("world", add=TRUE, lwd=1.5) 
points(place.lon, place.lat, pch=15, col="black", cex=1)
text(place.lon, place.lat, labels=place, pos=2, cex=1)
title(main="Mean Autumn Temperatures")

# Winter
par(mar=c(4, 4, 4, 4)) 
plot(ma.temp.winter, zlim=c(-10,40), col=tempcol(100))
map("world", add=TRUE, lwd=1.5) 
points(place.lon, place.lat, pch=15, col="black", cex=1)
text(place.lon, place.lat, labels=place, pos=2, cex=1)
title(main="Mean Winter Temperatures")

Which results in the following plot:

And that's it! - Making climate maps is really easy and straight forward using WorldClim data and R. You can use what you've learned in this guide to create all kinds of climate plots for anywhere in the world. Perhaps you want to plot a figure which has all 12 months of data on it instead? Or perhaps you want to plot a map which compares maximum and minimum temperatures for an area? - Just download the appropriate datasets from the WorldClim website. Update: I've now created a guide to show you exactly how to do this. Check it out!

Of course, its possible to create the same plots using other climate datasets, including the CRU data - although they might not look as nice since the resolution of the data is lower. However, its possible to interpolate the data to make the maps look much better. This will be the subject of a future guide!

Thanks for reading, please leave any questions or comments below.

© 2018 Benjamin Bell. All Rights Reserved. http://www.benjaminbell.co.uk

Further reading

Extracting CRU climate data - Part 1 of this guide!

Working with the extracted climate data - Part 2 of this guide!

Working with RasterStack objects and WorldClim Data - Part 4 of this guide!


No comments:

Post a Comment