Precipitation Tutorial

Matthias Schultheiss

Introduction

The following example aims to show how climate data can be analyzed and visualized with the help of Rstudio. For this purpose, the measured precipitation data in Baden-Württemberg from 2018 is used. They were provided by the German Weather Service (DWD) and can be downloaded online from their CDC-server (Climate Data Center).

# Loading basic packages
if (!require(sp)){install.packages("sp"); library(sp)}
if (!require(sf)){install.packages("sf"); library(sf)}
if (!require(dplyr)){install.packages("dplyr"); library(dplyr)}
if (!require(tidyverse)){install.packages("tidyverse"); library(tidyverse)}

Import precipitation data

In total there are three ways to access the data:

  1. Direct (manual) data retrieval from the CDC server: ftp://ftp-cdc.dwd.de/pub/CDC/

  2. Indirect data retrieval from the CDC server (filter options in the browser and interactive map): https://cdc.dwd.de/portal/201902131424/mapview

  3. Direct data download from the CDC server via R-package “rdwd”: https://cran.r-project.org/web/packages/rdwd/vignettes/rdwd.html

In particular, the following information was searched for: Sum of annual precipitation in mm, date of survey, station coordinates (x,y,z), station ID. The data used included the name of the station and the corresponding federal state. The next steps are working with data from option number two.

In total two CSV files were downloaded. The object “metaBW” contains data about the measuring stations and the object “dataBW” contains the measured climate data. These are then cleaned, filtered and combined with each other.

# Import and filter data
bw.meta<-read.csv("data/rohdaten/DWD/CDC_DL/data/sdo_JA_RR_MN006.csv") %>%
  as_tibble() %>%
  select(SDO_ID,
         SDO_Name,
         Geogr_Laenge,
         Geogr_Breite,
         Hoehe_ueber_NN)

bw.data<-read.csv("data/rohdaten/DWD/CDC_DL/data/data_JA_RR_MN006.csv") %>%
  as_tibble() %>% 
  select(SDO_ID,
         Zeitstempel,
         Wert) %>%
  filter(Zeitstempel==2018) %>%
# Get rid of NA values
  na.omit()

# Merge the two datasets and transform to sf object
prec.2018<-merge(bw.meta, 
                 bw.data) %>%
  st_as_sf(coords=c("Geogr_Breite",
                    "Geogr_Laenge"), crs=4326) %>%
# Export your precipitation data
  st_write("data/features/Prec2018.gpkg",
           driver="GPKG", 
           delete_dsn=TRUE)
## Deleting source `data/features/Prec2018.gpkg' using driver `GPKG'
## Updating layer `Prec2018' to data source `data/features/Prec2018.gpkg' using driver `GPKG'
## Writing 192 features with 5 fields and geometry type Point.

Analise your data with ggplot2

With the cleaned, merged, filtered and reformatted data, we will now continue to work and try to analyze a relationship between the absolute precipitation sum and station height. Each station is represented in the data set with the value measured on site. All missing measured values (NA’s) were filtered out in advance. The parameters and time dimension are arbitrarily exchangeable for future analyses.

# Load needed additional package
if (!require(ggplot2)){install.packages("ggplot2"); library(ggplot2)}

# Create new output directory for plots
dir.create(file.path("ergebnisse", "figures"))

outfile<-file.path("ergebnisse", "figures")

  if (!dir.exists(outfile)) {
    dir.create(outfile, recursive = TRUE)
  } else {
    message(paste("Directory already exists, continue."))
  }

# Convert wanted data to data frame (or tibble)
df.prec<-data.frame(prec.2018)

# Create your plot
plot.prec<-ggplot(data=df.prec,
                  mapping=aes(x=Hoehe_ueber_NN, y=Wert)) +
  geom_point(alpha=0.3) +
  geom_smooth() +
  labs(x="Elevation above sea level",
       y="Total precipitation (in mm)",
       title="Precipitation in relation to station height in BW (2018)",
       subtitle="Points = Observation stations",
       caption="Datasource: DWD 2019") +
  # Export and save results
  ggsave(device="png",
         filename=paste0("Precipitation_BaWü_2018.png"),
         path=outfile,
         dpi=360)

# Or look at your plot in Rstudio 
plot(plot.prec)

Mapping precipitation per station with tmap

The station values visualized with ggplot can also be displayed on a map. In retrospective, it is not very useful to display the exact value of the individual stations, because there are simply too many textual overlaps. It should only be shown that there would be the possibility to do so, if desired.

# Load needed additional package
if (!require(tmap)){install.packages("tmap"); library(tmap)}

# Import regional borders
bw.region<-st_transform(
  st_read("data/rohdaten/NUTS/AX_Region.shp"),
  4326)
## Reading layer `AX_Region' from data source `V:\github\nils_website\blogdown\content\showcase\schultheiss\data\rohdaten\NUTS\AX_Region.shp' using driver `ESRI Shapefile'
## Simple feature collection with 12 features and 2 fields
## geometry type:  MULTIPOLYGON
## dimension:      XYZ
## bbox:           xmin: 3388369 ymin: 5266823 xmax: 3610189 ymax: 5517400
## z_range:        zmin: 0 zmax: 0
## epsg (SRID):    NA
## proj4string:    +proj=tmerc +lat_0=0 +lon_0=9 +k=1 +x_0=3500000 +y_0=0 +datum=potsdam +units=m +no_defs
# Import data (as sf object)
prec.sf<-st_transform(
  st_read("data/features/Prec2018.gpkg"),
  4326)
## Reading layer `Prec2018' from data source `V:\github\nils_website\blogdown\content\showcase\schultheiss\data\features\Prec2018.gpkg' using driver `GPKG'
## Simple feature collection with 192 features and 5 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 7.6202 ymin: 47.559 xmax: 10.3644 ymax: 49.7636
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
# Map with tmap
tm_shape(bw.region) +
  tm_polygons(col="grey",
              border.col="white",
              alpha=0.5) +
  tm_shape(prec.sf) +
  tm_dots(col="Wert", palette="RdBu",
          title="Annual precipitation\n in BW 2018\n (in mm)",
          size=0.7) +
  tm_text("Wert",
          just="left",
          xmod=.5,
          size=0.7) +
  tm_legend(legend.outside=TRUE)

Create Voronoi-Diagram

In order to represent the spatial measurement density it is much more reasonable to map with Thiessen polygons. Thiessen polygons, also known as Voronoi diagrams or Dirichlet decomposition, are used to divide space into regions determined by a given set of points in that space, the so called centers.

# Load needed additional packages
if (!require(spatstat)){install.packages("spatstat"); library(spatstat)}
if (!require(maptools)){install.packages("maptools"); library(maptools)}
if (!require(raster)){install.packages("raster"); library(raster)}

# load data again and change projection
bw.landesgrenzen<-st_transform(
  st_read("data/rohdaten/NUTS/AX_Gebiet_Bundesland.shp"),
  32632) 
## Reading layer `AX_Gebiet_Bundesland' from data source `V:\github\nils_website\blogdown\content\showcase\schultheiss\data\rohdaten\NUTS\AX_Gebiet_Bundesland.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 2 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 388337.8 ymin: 5265159 xmax: 610069.4 ymax: 5515632
## epsg (SRID):    NA
## proj4string:    +proj=utm +zone=32 +ellps=GRS80 +units=m +no_defs
prec.sf<-st_transform(
  st_read("data/features/Prec2018.gpkg"),
  32632)
## Reading layer `Prec2018' from data source `V:\github\nils_website\blogdown\content\showcase\schultheiss\data\features\Prec2018.gpkg' using driver `GPKG'
## Simple feature collection with 192 features and 5 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 7.6202 ymin: 47.559 xmax: 10.3644 ymax: 49.7636
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
# sf to SpatialPointsDataFrame
prec.spdf<-as(prec.sf, "Spatial")
bw.landesgrenzen.spdf<-as(bw.landesgrenzen, "Spatial")

# Replace point boundary extent
prec.spdf@bbox<-bw.landesgrenzen.spdf@bbox

# Create a mosaic surface 
tessellated<-as(dirichlet(as(prec.spdf, "ppp")),
                "SpatialPolygons")
proj4string(tessellated)<-proj4string(prec.spdf)

# Informations do not seem to be transferred, so you have to insert them again
tessellated.z<-over(tessellated,
                    prec.spdf,
                    fn=mean)
tessellated.spdf<-SpatialPolygonsDataFrame(tessellated,
                                           tessellated.z)
# Clip tesselated zu BaWü
tessellated.clp<-raster::intersect(bw.landesgrenzen.spdf,
                                   tessellated.spdf)
# Plot 
tm_shape(tessellated.clp) + 
  tm_polygons(col="Wert", palette="RdBu",
              title="Annual precipitation\n in BW 2018\n (in mm)") +
  tm_legend(legend.outside=TRUE)

IDW interpolation

By far the best way to map our precipitation data is via interpolation. For this purpose Inverse Distance Weighting (IDW) was used. The IDW interpolation determines cell values with a linearly weighted combination of different reference points. The weighting depends on the inverse distance. The surface to be interpolated must be that of a position-related dependent variable. This method allows to define precipitation-rich or precipitation-poor spaces more precisely and without gaps, which could not be implemented very well with the two methods presented so far. For a better understanding of this method read this tutorial: https://gisgeography.com/inverse-distance-weighting-idw-interpolation/.

if (!require(gstat)){install.packages("gstat"); library(gstat)}

# Create empty grid (n=number of grid cells)
newgrid<- as.data.frame(spsample(prec.spdf,
                                 "regular",
                                 n=50000))
names(newgrid)<-c("X", "Y")
coordinates(newgrid)<-c("X", "Y")
gridded(newgrid)<-TRUE  # create SpatialPixel object
fullgrid(newgrid)<-TRUE  # create SpatialGrid object

# Apply proj4string
proj4string(newgrid) <- proj4string(prec.spdf)

# Interpolates the grid cells by 2 (idp=2.0)
PIDW <- gstat::idw(Wert ~ 1,
                   prec.spdf,
                   newdata=newgrid,
                   idp=2.0)

# Convert to raster and clip to Baden-Württemberg
r<-raster(PIDW)
r.mask<-mask(r,bw.landesgrenzen.spdf)
# Plot 
tm_shape(r.mask) +
  tm_raster(n=10,palette="RdBu",
            title="Annual precipitation\n in BW 2018\n (in mm)") +
  tm_shape(prec.spdf) +
  tm_dots(size=0.2) +
  tm_legend(legend.outside=TRUE)
Previous