# Generation of ENVIREM variables from a set of worldclim monthly rasters

#### 2018-03-05

Back to main project site.

In this vignette we will demonstrate how to generate the ENVIREM variables from a set of monthly temperature, precipitation and solar radiation variables. Some of the steps presented here are purely organizational, and worked well for the authors, but could be handled differently. For high resolution datasets (for example, 2.5 arcmin resolution or finer), computer memory limits may be an issue. The authors succeeded in generating all datasets down to 30 arcsecond resolution on a quadcore machine with 32 Gb of memory.

## Goal

The goal in this tutorial is to generate the climatic ENVIREM variables for North America for a future climate projection.

## Set up

In order to accomplish this, we will need:

• the raster package
• the envirem package
• the rgeos and maptools packages for some polygon work
• GDAL
• monthly rasters

GDAL can be installed in a number of ways, for example from source via github, or via homebrew for Mac OS X. For our purposes, we want to make sure that two of GDAL’s functions, gdalinfo and gdal_translate are accessible from within R. To check that these functions will work, we can do the following:

Sys.which('gdalinfo')
##                  gdalinfo
## "/usr/local/bin/gdalinfo"
Sys.which('gdal_translate')
##                  gdal_translate
## "/usr/local/bin/gdal_translate"

If the functions were found, a path will be returned. If they were not found, an empty path will be returned. If these functions are not found in the default search paths, but you know specifically where they are, you can always provide the full path later on.

## Acquiring monthly rasters

In this tutorial, we will generate the ENVIREM rasters for a future climate scenario. We downloaded min temperature, max temperature, precipitation and the bioclimatic variables for the CCSM4 global circulation model with the rcp85 greenhouse gas scenario from Worldclim, at a 2.5 arcminute resolution (this particular scenario was randomly selected).

Once we uncompress the downloaded directories, we have directories of GeoTiff rasters. In this particular case, we acquired 4 directories from WorldClim: cc85bi50 (bioclim), cc85pr50 (precip), cc85tn50 (min temp) and cc85tx50 (max temp).

For example, here are the 19 bioclimatic variables.

setwd("~/ccsm4_2050_2.5arcmin/cc85bi50")
list.files()
##  [1] "cc85bi501.tif"  "cc85bi5010.tif" "cc85bi5011.tif" "cc85bi5012.tif"
##  [5] "cc85bi5013.tif" "cc85bi5014.tif" "cc85bi5015.tif" "cc85bi5016.tif"
##  [9] "cc85bi5017.tif" "cc85bi5018.tif" "cc85bi5019.tif" "cc85bi502.tif"
## [13] "cc85bi503.tif"  "cc85bi504.tif"  "cc85bi505.tif"  "cc85bi506.tif"
## [17] "cc85bi507.tif"  "cc85bi508.tif"  "cc85bi509.tif"

Now we need to acquire a solar radiation dataset. Several such datasets likely exist, but we have used the dataset from CGIAR-CSI. Please see the CGIAR-CSI documentation for detailed information regarding this dataset. Once we uncompress the downloaded directory, we have ArcInfo rasters. We placed this directory in the same master directory as the worldclim data. Our working directory now looks like this:

setwd("~/ccsm4_2050_2.5arcmin")
list.files()
## [1] "cc85bi50"     "cc85pr50"     "cc85tn50"     "cc85tx50"
## [5] "envirem"      "ET_SolRad"    "NorthAmerica" "processed"

## Raster renaming and conversion

#### Worldclim rasters

The next step will involve renaming all of these rasters to a standardized naming system, and converting to the same raster format.

We will create a directory that will contain all of these converted rasters. We will name it processed.

The following script will loop across the main directories we have acquired, will read each raster, rename and write to the processed folder. In this particular case, the naming pattern in the downloaded files is cc85 [variableType] 50 [variableNumber]. The regular expression in the following code has been designed for this particular pattern.

require(raster)

maindir <- '~/ccsm4_2050_2.5arcmin'

subdirs <- c('cc85bi50', 'cc85pr50', 'cc85tn50', 'cc85tx50')

# define a function to recognize the worldclim names, rename, and write as GeoTiffs
rasterToTif <- function(file, outputDir, overwrite = TRUE) {
r <- raster(file)
outputDir <- gsub('/?$', '/', outputDir) var <- gsub("(.+)\\/(.+)(.bil|.tif)$", "\\2", file)
var <- gsub("^([a-z]+)(_?)(\\d+)$", "\\1_\\3", var) if (!grepl("bio|prec|tmin|tmax|tmean", var)) { if (grepl('bi', var)) { var <- gsub("^\\w\\w\\d\\d\\w\\w\\d\\d(\\d\\d?)$", "bio_\\1", var)
}
if (grepl('pr', var)) {
var <- gsub("^\\w\\w\\d\\d\\w\\w\\d\\d(\\d\\d?)$", "prec_\\1", var) } if (grepl('tn', var)) { var <- gsub("^\\w\\w\\d\\d\\w\\w\\d\\d(\\d\\d?)$", "tmin_\\1", var)
}
if (grepl('tx', var)) {
var <- gsub("^\\w\\w\\d\\d\\w\\w\\d\\d(\\d\\d?)$", "tmax_\\1", var) } } fn <- paste(outputDir, var, '.tif', sep='') writeRaster(r, filename=fn, format='GTiff', overwrite = overwrite) } for (i in 1:length(subdirs)) { setwd(paste(maindir, '/', subdirs[i], sep='')) files <- list.files(pattern=".bil$|.tif$", full.names = TRUE) for (j in 1:length(files)) { cat(subdirs[i], '--', files[j], '\n') rasterToTif(files[j], '../processed', overwrite = TRUE) } } setwd('~/ccsm4_2050_2.5arcmin/processed') head(list.files()) ## [1] "bio_1.tif" "bio_10.tif" "bio_11.tif" "bio_12.tif" "bio_13.tif" ## [6] "bio_14.tif" #### Solar radiation rasters The solar radiation rasters contain data for the entire globe, and are provided at a finer resolution than our worldclim rasters, we will therefore need to read them in, mask them to match the worldclim data, and resample to the same resolution. Ultimately, we will write them to the same processed directory. This is what the solar radiation rasters look like for January: setwd('~/ccsm4_2050_2.5arcmin/ET_SolRad') library(maptools) ## Loading required package: sp ## Checking rgeos availability: TRUE library(raster) data(wrld_simpl) r <- raster('et_solrad_1') plot(r, legend = FALSE) plot(wrld_simpl, add = TRUE, lwd = 0.2) …and we want them to look like this: setwd('~/ccsm4_2050_2.5arcmin/processed') library(maptools) library(raster) data(wrld_simpl) r <- raster('et_solrad_1.tif') plot(r, legend = FALSE) plot(wrld_simpl, add = TRUE, lwd = 0.2) We will now read each monthly dataset in, resample, mask and write the output to the processed directory. require(raster) solradDir <- '~/ccsm4_2050_2.5arcmin/ET_SolRad' outputDir <- '~/ccsm4_2050_2.5arcmin/processed/' # read in the solar radiation rasters solradStack <- stack(paste(solradDir, '/et_solrad_', 1:12, sep='')) # read in a worldclim raster as a template template <- raster('~/ccsm4_2050_2.5arcmin/processed/bio_1.tif') # resample and mask according to template solrad <- resample(solradStack, template) solrad <- mask(solrad, template) for (k in 1:nlayers(solrad)) { fn <- paste(outputDir, names(solrad)[k], '.tif', sep='') cat(fn, '\n') writeRaster(solrad[[k]], fn, format='GTiff', overwrite=TRUE) } # clear temporary file cache removeTmpFiles(h = 0) ## Clip rasters to North America As we are primarily interested in North America in this particular case, we will clip our input rasters to a relevant extent, which is much less computationally intensive than generating the ENVIREM variables, and clipping those later. We will first define an extent with a polygon. Here we have already identified coordinates that, when linked, create a polygon that encompasses North America. We will then read in each input raster, crop/mask it and write to disk, to a directory called NorthAmerica. library(rgeos) ## rgeos version: 0.3-26, (SVN revision 560) ## GEOS runtime version: 3.6.1-CAPI-1.10.1 r0 ## Linking to sp version: 1.2-5 ## Polygon checking: TRUE library(maptools) data(wrld_simpl) poly <- readWKT("POLYGON(( -166.9 89.6, -10.4 87.4, -10.4 80.1, -27.2 65.5, -36.6 54.2, -62.1 21.3, -66.0 14.4, -77.8 11.8, -80.0 4.6, -105.3 -0.2, -181.5 -0.4, -182.1 51.6, -178.7 54.9, -168.0 61.9, -168.9 65.2, -166.9 89.6 ))", p4s = CRS("+proj=longlat +datum=WGS84")) plot(poly, border = 'blue', mar = c(0,0,0,0)) plot(wrld_simpl, add=TRUE, lwd=0.2) # crop and mask input rasters inputDir <- "~/ccsm4_2050_2.5arcmin/processed" outputDir <- "~/ccsm4_2050_2.5arcmin/NorthAmerica/" files <- list.files(inputDir, pattern = '.tif$', full.names = TRUE)

r <- stack(files)
r <- crop(r, poly)

for (i in 1:nlayers(r)) {
fn <- paste(outputDir, names(r)[i], '.tif', sep = '')
writeRaster(r[[i]], filename = fn, format = 'GTiff', overwrite = TRUE)
}

### Check conversion

We can check that we have the required rasters and have named them appropriately. We expect to have:

• 19 bioclimatic variables named as bio_1.tif
• 12 precipitation variables named as prec_1.tif
• 12 min temp variables named as tmin_1.tif
• 12 max temp variables named as tmax_1.tif
• 12 solar radiation variables named as et_solrad_1.tif
library(envirem)
##
## Important changes have been made regarding units for temperature rasters. Please check out ?envirem
verifyFileStructure("~/ccsm4_2050_2.5arcmin/NorthAmerica", returnFileNames = FALSE, rasterExt = '.tif')
##  tmean files are not properly named or missing. Tmean will therefore be calculated.

If there was a problem with any of the naming, or if particular files were missing, this function would have informed us. The message regarding tmean is telling us that the function was not provided with monthly rasters of mean temperature, therefore it will calculate those as (tmax + tmin) / 2.

We can now supply the processed input rasters to the envirem package’s main function, generateRasters. As we are inputting high resolution rasters, we supply a tileNum of 4, which means that rasters will be split into 4 pieces. We also specify var = 'all' because we want all available variables to be generated. resName and timeName are just tags for naming the output.

The input rasters come from WorldClim v1.4 where temperature rasters are in units of degrees C * 10. Therefore, we will inform the function of this with tempScale = 10.

require(envirem)
require(raster)

inputDir <- '~/ccsm4_2050_2.5arcmin/NorthAmerica'
outDir <- '~/ccsm4_2050_2.5arcmin/envirem'

# resolution of 2.5arcmin, so we will split rasters into tiles to ease the computational burden
tileNum <- 4

generateRasters(var = 'all', maindir = inputDir, resName = '2.5arcmin', timeName = 'future2050', outputDir = outDir, nTiles = tileNum, tempScale = 10)

The set of ENVIREM variables should now be found in the output directory.

setwd('~/ccsm4_2050_2.5arcmin/envirem/future2050/2.5arcmin')
list.files()
##  [1] "future2050_2.5arcmin_annualPET.tif"
##  [2] "future2050_2.5arcmin_aridityIndexThornthwaite.tif"
##  [3] "future2050_2.5arcmin_climaticMoistureIndex.tif"
##  [4] "future2050_2.5arcmin_continentality.tif"
##  [5] "future2050_2.5arcmin_embergerQ.tif"
##  [6] "future2050_2.5arcmin_growingDegDays0.tif"
##  [7] "future2050_2.5arcmin_growingDegDays5.tif"
##  [8] "future2050_2.5arcmin_maxTempColdest.tif"
##  [9] "future2050_2.5arcmin_minTempWarmest.tif"
## [10] "future2050_2.5arcmin_monthCountByTemp10.tif"
## [11] "future2050_2.5arcmin_PETColdestQuarter.tif"
## [12] "future2050_2.5arcmin_PETDriestQuarter.tif"
## [13] "future2050_2.5arcmin_PETseasonality.tif"
## [14] "future2050_2.5arcmin_PETWarmestQuarter.tif"
## [15] "future2050_2.5arcmin_PETWettestQuarter.tif"
## [16] "future2050_2.5arcmin_thermicityIndex.tif"