---
title: 'Intro to r5r: Rapid Realistic Routing with R5 in R'
author: "Rafael H. M. Pereira, Marcus Saraiva, Daniel Herszenhut, Carlos Kaue Braga"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
abstract: "`{r5r}` is an R package for rapid realistic routing on multimodal transport networks (walk, bike, public transport and car) using R5. The package allows users to generate detailed routing analysis or calculate travel time matrices using seamless parallel computing on top of the R5 Java machine "
urlcolor: blue
vignette: >
%\VignetteIndexEntry{Intro to r5r: Rapid Realistic Routing with R5 in R}
%\VignetteEngine{knitr::rmarkdown}
\usepackage[utf8]{inputenc}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
eval = identical(tolower(Sys.getenv("NOT_CRAN")), "true"),
out.width = "100%"
)
## removes files previously created by 'setup_r5()'
#data_path <- system.file("extdata/poa", package = "r5r")
#existing_files <- list.files(data_path)
#files_to_keep <- c(
# "poa_hexgrid.csv",
# "poa_osm.pbf",
# "poa_points_of_interest.csv",
# "poa_eptc.zip",
# "poa_trensurb.zip",
# 'fares'
# )
#files_to_remove <- existing_files[! existing_files %in% files_to_keep]
#invisible(file.remove(file.path(data_path, files_to_remove)))
```
# 1. Introduction
**r5r** is an [R package for rapid realistic routing on multimodal transport networks](https://github.com/ipeaGIT/r5r) (walk, bike, public transport and car). It provides a simple and friendly interface to R5, a really fast and open source Java-based routing engine developed separately by [Conveyal](https://www.conveyal.com/). R5 stands for [Rapid Realistic Routing on Real-world and Reimagined networks](https://github.com/conveyal/r5). More details about **r5r** can be found on the [package webpage](https://ipeagit.github.io/r5r/index.html) or on this [paper](
https://doi.org/10.32866/001c.21262).
# 2. Installation
You can install `{r5r}` from CRAN, or the development version from github.
```{r, eval = FALSE}
# from CRAN
install.packages('r5r')
# dev version with latest features
devtools::install_github("ipeaGIT/r5r", subdir = "r-package")
```
Please bear in mind that you need to have *Java Development Kit (JDK) 21* installed
on your computer to use `{r5r}`. No worries, you don't have to pay for it. There are
numerous open-source JDK implementations, and you only need to install one JDK. Here are a few options:
- [Adoptium/Eclipse Temurin](https://adoptium.net/) (our preferred option)
- [Amazon Corretto](https://aws.amazon.com/corretto/)
- [Oracle OpenJDK](https://jdk.java.net/21/).
The easiest way to install JDK is using the new [{rJavaEnv}](https://www.ekotov.pro/rJavaEnv/) package in R:
```{r, eval = FALSE}
# install {rJavaEnv} from CRAN
install.packages("rJavaEnv")
# check version of Java currently installed (if any)
rJavaEnv::java_check_version_rjava()
## if this is the first time you use {rJavaEnv}, you might need to run this code
## below to consent the installation of Java.
# rJavaEnv::rje_consent(provided = TRUE)
# install Java 21
rJavaEnv::java_quick_install(version = 21)
# check if Java was successfully installed
rJavaEnv::java_check_version_rjava()
```
# 3. Usage
First, we need to increase the memory available to Java. This has to be done **before** loading the `{r5r}` library because, by default, `R` allocates only 512MB of memory for Java processes, which is not enough for large queries using `{r5r}`. To increase available memory to 2GB, for example, we need to set the `java.parameters` option at the beginning of the script, as follows:
```{r, message = FALSE}
options(java.parameters = "-Xmx2G")
# By default, {r5r} uses all CPU cores available. If you want to limit the
# number of CPUs to 4, for example, you can run:
options(java.parameters = c("-Xmx2G", "-XX:ActiveProcessorCount=4"))
```
Note: It's very important to allocate enough memory before loading `{r5r}` or any other Java-based package, since `rJava` starts a Java Virtual Machine only once for each R session. It might be useful to restart your R session and execute the code above right after, if you notice that you haven't succeeded in your previous attempts.
Then we can load the packages used in this vignette:
```{r, message = FALSE, warning = FALSE}
library(r5r)
library(sf)
library(data.table)
library(ggplot2)
```
The `{r5r}` package has seven **fundamental functions**:
1. `setup_r5()` to initialize an instance of `{r5r}`, that also builds a routable
transport network;
2. `accessibility()` for fast computation of access to opportunities considering
a selected decay function;
3. `travel_time_matrix()` for fast computation of travel time estimates between origin/destination pairs;
4. `expanded_travel_time_matrix()` for calculating travel matrices between origin destination pairs with additional information such as routes used and total time disaggregated by access, waiting, in-vehicle and transfer times.
5. `detailed_itineraries()` to get detailed information on one or multiple alternative routes between origin/destination pairs.
6. `pareto_frontier()` for analyzing the trade-off between the travel time and monetary costs of multiple route alternatives between origin/destination pairs.
7. `isochrone()` to estimate the polygons of the areas that can be reached from an origin point at different travel time limits.
Most of these functions also allow users to account for monetary travel costs
when generating travel time matrices and accessibility estimates. More info about
how to consider monetary costs can be found in [this vignette](https://ipeagit.github.io/r5r/articles/fare_structure.html).
The package also includes a few **support functions**.
1. `street_network_to_sf()` to extract OpenStreetMap network in sf format from a `network.dat` file.
2. `transit_network_to_sf()` to extract transit network in sf format from a `network.dat` file.
3. `find_snap()` to find snapped locations of input points on street network.
4. `r5r_sitrep()` to generate a situation report to help debug eventual errors.
## 3.1 Data requirements:
To use `{r5r}`, you will need:
- A road network data set from OpenStreetMap in `.pbf` format (*mandatory*)
- A public transport feed in `GTFS.zip` format (optional)
- A raster file of Digital Elevation Model data in `.tif` format (optional)
Here are a few places from where you can download these data sets:
- OpenStreetMap
- [osmextract](https://docs.ropensci.org/osmextract/) R package
- [geofabrik](https://download.geofabrik.de/) website
- [hot export tool](https://export.hotosm.org/) website
- [BBBike.org](https://extract.bbbike.org/) website
- [Protomaps](https://protomaps.com/downloads/osm) website
- GTFS
- [tidytransit](https://r-transit.github.io/tidytransit/) R package
- [transitland](https://www.transit.land/) website
- [Mobility Database](https://database.mobilitydata.org/) website
- Elevation
- [elevatr](https://github.com/jhollist/elevatr) R package
- Nasa's SRTMGL1 website
Let's have a quick look at how `{r5r}` works using a sample data set.
# 4. Demonstration on sample data
## Data
To illustrate the functionalities of `{r5r}`, the package includes a small sample data for the city of Porto Alegre (Brazil). It includes seven files:
* An OpenStreetMap network: `poa_osm.pbf`
* Two public transport feeds: `poa_eptc.zip` and `poa_trensurb.zip`
* A raster elevation data: `poa_elevation.tif`
* A `poa_hexgrid.csv` file with spatial coordinates of a regular hexagonal grid covering the sample area, which can be used as origin/destination pairs in a travel time matrix calculation.
* A `poa_points_of_interest.csv` file containing the names and spatial coordinates of 15 places within Porto Alegre
* A `fares_poa.zip` file with the fare rules of the city's public transport system.
```{r}
data_path <- system.file("extdata/poa", package = "r5r")
list.files(data_path)
```
The points of interest data can be seen below. In this example, we will be looking at transport alternatives between some of those places.
```{r}
poi <- fread(file.path(data_path, "poa_points_of_interest.csv"))
head(poi)
```
The data with origin destination pairs is shown below. In this example, we will be using 200 points randomly selected from this data set.
```{r}
points <- fread(file.path(data_path, "poa_hexgrid.csv"))
# sample points
sampled_rows <- sample(1:nrow(points), 200, replace=TRUE)
points <- points[ sampled_rows, ]
head(points)
```
## 4.1 Building routable transport network with `setup_r5()`
The first step is to build the multimodal transport network used for routing in R5. This is done with the `setup_r5` function. This function does two things: (1) downloads/updates a compiled JAR file of R5 and stores it locally in the `{r5r}` package directory for future use; and (2) combines the osm.pbf and gtfs.zip data sets to build a routable network object.
```{r, message = FALSE}
# Indicate the path where OSM and GTFS data are stored
r5r_core <- setup_r5(data_path = data_path)
```
## 4.2 Accessibility analysis
The fastest way to calculate accessibility estimates is using the `accessibility()`
function. In this example, we calculate the number of schools and health care
facilities accessible in less than 60 minutes by public transport and walking.
More details in this vignette on [Calculating and visualizing Accessibility](https://ipeagit.github.io/r5r/articles/accessibility.html).
```{r, message = FALSE}
# set departure datetime input
departure_datetime <- as.POSIXct("13-05-2019 14:00:00",
format = "%d-%m-%Y %H:%M:%S")
# calculate accessibility
access <- accessibility(r5r_core = r5r_core,
origins = points,
destinations = points,
opportunities_colnames = c("schools", "healthcare"),
mode = c("WALK", "TRANSIT"),
departure_datetime = departure_datetime,
decay_function = "step",
cutoffs = 60
)
head(access)
```
## 4.3 Routing analysis
For fast routing analysis, **r5r** currently has three core functions:
`travel_time_matrix()`, `expanded_travel_time_matrix()` and `detailed_itineraries()`.
### Fast many to many travel time matrix
The `travel_time_matrix()` function is a really simple and fast function to
compute travel time estimates between one or multiple origin/destination pairs.
The origin/destination input can be either a spatial `sf POINT` object, or a
`data.frame` containing the columns `id, lon, lat`. The function also receives
as inputs the *max walking distance*, in meters, and the *max trip duration*, in
minutes. Resulting travel times are also output in minutes.
This function also allows users to very efficiently capture the travel time
uncertainties inside a given time window considering multiple departure times.
[More info on this vignette](https://ipeagit.github.io/r5r/articles/time_window.html).
```{r, message = FALSE}
# set inputs
mode <- c("WALK", "TRANSIT")
max_walk_time <- 30 # minutes
max_trip_duration <- 120 # minutes
departure_datetime <- as.POSIXct("13-05-2019 14:00:00",
format = "%d-%m-%Y %H:%M:%S")
# calculate a travel time matrix
ttm <- travel_time_matrix(r5r_core = r5r_core,
origins = poi,
destinations = poi,
mode = mode,
departure_datetime = departure_datetime,
max_walk_time = max_walk_time,
max_trip_duration = max_trip_duration)
head(ttm)
```
```{r ttm head, echo=FALSE, message=FALSE, out.width='100%', eval = FALSE}
knitr::include_graphics("https://github.com/ipeaGIT/r5r/blob/master/r-package/inst/img/vig_output_ttm.png?raw=true")
```
### Expanded travel time matrix with minute-by-minute estimates
For those interested in more detailed outputs, the `expanded_travel_time_matrix()`
works very similarly with `travel_time_matrix()` but it brings much more
information. It estimates for each origin destination pair the routes used and
total time disaggregated by access, waiting, in-vehicle and transfer times.
Please note this function can be very memory intensive for large data sets.
```{r, message = FALSE}
# calculate a travel time matrix
ettm <- expanded_travel_time_matrix(r5r_core = r5r_core,
origins = poi,
destinations = poi,
mode = mode,
departure_datetime = departure_datetime,
breakdown = TRUE,
max_walk_time = max_walk_time,
max_trip_duration = max_trip_duration)
head(ettm)
```
### Detailed itineraries
Most routing packages only return the fastest route. A key advantage of the `detailed_itineraries()` function is that is allows for fast routing analysis
while providing multiple alternative routes between origin destination pairs.
The output also brings detailed information for each route alternative at the
trip segment level, including the transport mode, waiting times, travel time and
distance of each trip segment.
In this example below, we want to know some alternative routes between one origin/destination pair only.
```{r, message = FALSE}
# set inputs
origins <- poi[10,]
destinations <- poi[12,]
mode <- c("WALK", "TRANSIT")
max_walk_time <- 60 # minutes
departure_datetime <- as.POSIXct("13-05-2019 14:00:00",
format = "%d-%m-%Y %H:%M:%S")
# calculate detailed itineraries
det <- detailed_itineraries(r5r_core = r5r_core,
origins = origins,
destinations = destinations,
mode = mode,
departure_datetime = departure_datetime,
max_walk_time = max_walk_time,
shortest_path = FALSE)
head(det)
```
```{r detailed head, echo = FALSE, out.width='100%', message = FALSE, eval = FALSE}
knitr::include_graphics("https://github.com/ipeaGIT/r5r/blob/master/r-package/inst/img/vig_output_detailed.png?raw=true")
```
The output is a `data.frame sf` object, so we can easily visualize the results.
#### Visualize results
**Static visualization** with `ggplot2` package: To provide a geographic context
for the visualization of the results in `ggplot2`, you can also use the `street_network_to_sf()` function to extract the OSM street network used in the routing.
```{r, message = FALSE}
# extract OSM network
street_net <- street_network_to_sf(r5r_core)
# extract public transport network
transit_net <- r5r::transit_network_to_sf(r5r_core)
# plot
ggplot() +
geom_sf(data = street_net$edges, color='gray85') +
geom_sf(data = det, aes(color=mode)) +
facet_wrap(.~option) +
theme_void()
```
```{r ggplot2 output, echo = FALSE, out.width='100%', message = FALSE, eval = FALSE}
knitr::include_graphics("https://github.com/ipeaGIT/r5r/blob/master/r-package/inst/img/vig_detailed_ggplot.png?raw=true")
```
### Cleaning up after usage
`{r5r}` objects are still allocated to any amount of memory previously set after they are done with their calculations. In order to remove an existing `{r5r}` object and reallocate the memory it had been using, we use the `stop_r5` function followed by a call to Java's garbage collector, as follows:
```{r, message = FALSE}
r5r::stop_r5(r5r_core)
rJava::.jgc(R.gc = TRUE)
```
```{r, eval = TRUE, include = FALSE, message = FALSE}
# clean cache (CRAN policy)
r5r::r5r_cache(delete_file = 'all')
```
If you have any suggestions or want to report an error, please visit [the package GitHub page](https://github.com/ipeaGIT/r5r).