Generation of ENVIREM variables from a set of worldclim monthly rasters

Pascal O. Title

2017-02-01

Back to main project site.

In this vignette we will demonstrate how to generate the ENVIREM variables from a set of monthly temperature, precipitation and solar radiation variables. Some of the steps presented here are purely organizational, and worked well for the authors, but could be handled differently. For high resolution datasets (for example, 2.5 arcmin resolution or finer), computer memory limits may be an issue. The authors succeeded in generating all datasets down to 30 arcsecond resolution on a quadcore machine with 32 Gb of memory.

Goal

The goal in this tutorial is to generate the climatic ENVIREM variables for North America for a future climate projection.

Set up

In order to accomplish this, we will need:

GDAL can be installed in a number of ways, for example from source via github, or via homebrew for Mac OS X. For our purposes, we want to make sure that two of GDAL’s functions, gdalinfo and gdal_translate are accessible from within R. To check that these functions will work, we can do the following:

Sys.which('gdalinfo')
##                  gdalinfo 
## "/usr/local/bin/gdalinfo"
Sys.which('gdal_translate')
##                  gdal_translate 
## "/usr/local/bin/gdal_translate"

If the functions were found, a path will be returned. If they were not found, an empty path will be returned. If these functions are not found in the default search paths, but you know specifically where they are, you can always provide the full path later on.

Acquiring monthly rasters

In this tutorial, we will generate the ENVIREM rasters for a future climate scenario. We downloaded min temperature, max temperature, precipitation and the bioclimatic variables for the CCSM4 global circulation model with the rcp85 greenhouse gas scenario from Worldclim, at a 2.5 arcminute resolution (this particular scenario was randomly selected).

Once we uncompress the downloaded directories, we have directories of GeoTiff rasters. In this particular case, we acquired 4 directories from WorldClim: cc85bi50 (bioclim), cc85pr50 (precip), cc85tn50 (min temp) and cc85tx50 (max temp).

For example, here are the 19 bioclimatic variables.

knitr::opts_knit$set(root.dir = '/tmp')
setwd("~/ccsm4_2050_2.5arcmin/cc85bi50")
list.files()
##  [1] "cc85bi501.tif"  "cc85bi5010.tif" "cc85bi5011.tif" "cc85bi5012.tif"
##  [5] "cc85bi5013.tif" "cc85bi5014.tif" "cc85bi5015.tif" "cc85bi5016.tif"
##  [9] "cc85bi5017.tif" "cc85bi5018.tif" "cc85bi5019.tif" "cc85bi502.tif" 
## [13] "cc85bi503.tif"  "cc85bi504.tif"  "cc85bi505.tif"  "cc85bi506.tif" 
## [17] "cc85bi507.tif"  "cc85bi508.tif"  "cc85bi509.tif"

Now we need to acquire a solar radiation dataset. Several such datasets likely exist, but we have used the dataset from CGIAR-CSI. Please see the CGIAR-CSI documentation for detailed information regarding this dataset. Once we uncompress the downloaded directory, we have ArcInfo rasters. We placed this directory in the same master directory as the worldclim data. Our working directory now looks like this:

setwd("~/ccsm4_2050_2.5arcmin")
list.files()
## [1] "cc85bi50"     "cc85pr50"     "cc85tn50"     "cc85tx50"    
## [5] "envirem"      "ET_SolRad"    "NorthAmerica" "processed"

Raster renaming and conversion

Worldclim rasters

The next step will involve renaming all of these rasters to a standardized naming system, and converting to the same raster format.

We will create a directory that will contain all of these converted rasters. We will name it processed.

The following script will loop across the main directories we have acquired, will read each raster, rename and write to the processed folder. In this particular case, the naming pattern in the downloaded files is cc85 [variableType] 50 [variableNumber]. The regular expression in the following code has been designed for this particular pattern.

require(raster)

maindir <- '~/ccsm4_2050_2.5arcmin'

subdirs <- c('cc85bi50', 'cc85pr50', 'cc85tn50', 'cc85tx50')

# define a function to recognize the worldclim names, rename, and write as GeoTiffs
rasterToTif <- function(file, outputDir, overwrite = TRUE) {
    r <- raster(file)
    outputDir <- gsub('/?$', '/', outputDir)
    var <- gsub("(.+)\\/(.+)(.bil|.tif)$", "\\2", file)
    var <- gsub("^([a-z]+)(_?)(\\d+)$", "\\1_\\3", var)
    if (!grepl("bio|prec|tmin|tmax|tmean", var)) {
        if (grepl('bi', var)) {
            var <- gsub("^\\w\\w\\d\\d\\w\\w\\d\\d(\\d\\d?)$", "bio_\\1", var)
        }
        if (grepl('pr', var)) {
            var <- gsub("^\\w\\w\\d\\d\\w\\w\\d\\d(\\d\\d?)$", "prec_\\1", var)
        }
        if (grepl('tn', var)) {
            var <- gsub("^\\w\\w\\d\\d\\w\\w\\d\\d(\\d\\d?)$", "tmin_\\1", var)
        }
        if (grepl('tx', var)) {
            var <- gsub("^\\w\\w\\d\\d\\w\\w\\d\\d(\\d\\d?)$", "tmax_\\1", var)
        }
    }
    fn <- paste(outputDir, var, '.tif', sep='')
    writeRaster(r, filename=fn, format='GTiff', overwrite = overwrite)
}


for (i in 1:length(subdirs)) {
    setwd(paste(maindir, '/', subdirs[i], sep=''))
  files <- list.files(pattern=".bil$|.tif$", full.names = TRUE)
    for (j in 1:length(files)) {
                cat(subdirs[i], '--', files[j], '\n')
                rasterToTif(files[j], '../processed', overwrite = TRUE)
    }
}
setwd('~/ccsm4_2050_2.5arcmin/processed')
head(list.files())
## [1] "bio_1.tif"  "bio_10.tif" "bio_11.tif" "bio_12.tif" "bio_13.tif"
## [6] "bio_14.tif"

Solar radiation rasters

The solar radiation rasters contain data for the entire globe, and are provided at a finer resolution than our worldclim rasters, we will therefore need to read them in, mask them to match the worldclim data, and resample to the same resolution. Ultimately, we will write them to the same processed directory.

This is what the solar radiation rasters look like for January:

setwd('~/ccsm4_2050_2.5arcmin/ET_SolRad')
library(maptools)
## Loading required package: sp
## Checking rgeos availability: TRUE
library(raster)
data(wrld_simpl)
r <- raster('et_solrad_1')
plot(r, legend = FALSE)
plot(wrld_simpl, add = TRUE, lwd = 0.2)

…and we want them to look like this:

setwd('~/ccsm4_2050_2.5arcmin/processed')
library(maptools)
library(raster)
data(wrld_simpl)
r <- raster('et_solrad_1.tif')
plot(r, legend = FALSE)
plot(wrld_simpl, add = TRUE, lwd = 0.2)

We will now read each monthly dataset in, resample, mask and write the output to the processed directory.

require(raster)

solradDir <- '~/ccsm4_2050_2.5arcmin/ET_SolRad'
outputDir <- '~/ccsm4_2050_2.5arcmin/processed/'

# read in the solar radiation rasters
solradStack <- stack(paste(solradDir, '/et_solrad_', 1:12, sep=''))

# read in a worldclim raster as a template
template <- raster('~/ccsm4_2050_2.5arcmin/processed/bio_1.tif')

# resample and mask according to template
solrad <- resample(solradStack, template)
solrad <- mask(solrad, template)

for (k in 1:nlayers(solrad)) {

  fn <- paste(outputDir, names(solrad)[k], '.tif', sep='')
  cat(fn, '\n')
    writeRaster(solrad[[k]], fn, format='GTiff', overwrite=TRUE)
}

# clear temporary file cache
removeTmpFiles(h = 0)

Clip rasters to North America

As we are primarily interested in North America in this particular case, we will clip our input rasters to a relevant extent, which is much less computationally intensive than generating the ENVIREM variables, and clipping those later.

We will first define an extent with a polygon. Here we have already identified coordinates that, when linked, create a polygon that encompasses North America. We will then read in each input raster, crop/mask it and write to disk, to a directory called NorthAmerica.

library(rgeos)
## rgeos version: 0.3-22, (SVN revision 544)
##  GEOS runtime version: 3.4.2-CAPI-1.8.2 r3921 
##  Linking to sp version: 1.2-4 
##  Polygon checking: TRUE
library(maptools)
data(wrld_simpl)

poly <- readWKT("POLYGON((
  -166.9 89.6,
  -10.4 87.4,
  -10.4 80.1,
  -27.2 65.5,
  -36.6 54.2,
  -62.1 21.3,
  -66.0 14.4,
  -77.8 11.8,
  -80.0  4.6,
  -105.3 -0.2,
  -181.5 -0.4,
  -182.1 51.6,
  -178.7 54.9,
  -168.0 61.9,
  -168.9 65.2,
  -166.9 89.6
))", p4s = CRS("+proj=longlat +datum=WGS84"))

plot(poly, border = 'blue', mar = c(0,0,0,0))
plot(wrld_simpl, add=TRUE, lwd=0.2)

# crop and mask input rasters
inputDir <- "~/ccsm4_2050_2.5arcmin/processed"
outputDir <- "~/ccsm4_2050_2.5arcmin/NorthAmerica/"

files <- list.files(inputDir, pattern = '.tif$', full.names = TRUE)

r <- stack(files)
r <- crop(r, poly)
r <- mask(r, poly)

for (i in 1:nlayers(r)) {
  fn <- paste(outputDir, names(r)[i], '.tif', sep = '')
  writeRaster(r[[i]], filename = fn, format = 'GTiff', overwrite = TRUE)
}

Check conversion

We can check that we have the required rasters and have named them appropriately. We expect to have:

library(envirem)
verifyFileStructure("~/ccsm4_2050_2.5arcmin/NorthAmerica", returnFileNames = FALSE, rasterExt = '.tif')

This function should return nothing, because our files have been appropriately prepared. If there was a problem with any of the naming, or if particular files were missing, this function would have said so.

We can now supply the processed input rasters to the envirem package’s main function, generateRasters. As we are inputing high resolution rasters, we supply a tileNum of 4, which means that rasters will be split into 4 pieces. We also specify var = 'all' because we want all available variables to be generated. resName and timeName are just tags for naming the output.

require(envirem)
require(raster)

inputDir <- '~/ccsm4_2050_2.5arcmin/NorthAmerica'
outDir <- '~/ccsm4_2050_2.5arcmin/envirem'

# resolution of 2.5arcmin, so we will split rasters into tiles to ease the computational burden
tileNum <- 4

generateRasters(var = 'all', maindir = inputDir, resName = '2.5arcmin', timeName = 'future2050', outputDir = outDir, nTiles = tileNum)

The set of ENVIREM variables should now be found in the output directory.

setwd('~/ccsm4_2050_2.5arcmin/envirem/future2050/2.5arcmin')
list.files()
##  [1] "future2050_2.5arcmin_annualPET.tif"               
##  [2] "future2050_2.5arcmin_aridityIndexThornthwaite.tif"
##  [3] "future2050_2.5arcmin_climaticMoistureIndex.tif"   
##  [4] "future2050_2.5arcmin_continentality.tif"          
##  [5] "future2050_2.5arcmin_embergerQ.tif"               
##  [6] "future2050_2.5arcmin_growingDegDays0.tif"         
##  [7] "future2050_2.5arcmin_growingDegDays5.tif"         
##  [8] "future2050_2.5arcmin_maxTempColdest.tif"          
##  [9] "future2050_2.5arcmin_minTempWarmest.tif"          
## [10] "future2050_2.5arcmin_monthCountByTemp10.tif"      
## [11] "future2050_2.5arcmin_PETColdestQuarter.tif"       
## [12] "future2050_2.5arcmin_PETDriestQuarter.tif"        
## [13] "future2050_2.5arcmin_PETseasonality.tif"          
## [14] "future2050_2.5arcmin_PETWarmestQuarter.tif"       
## [15] "future2050_2.5arcmin_PETWettestQuarter.tif"       
## [16] "future2050_2.5arcmin_thermicityIndex.tif"