Using dodgr to generate flows throughout the Bristol street network

Note at the outset that these routines likely require the latest, development version of dodgr, installed with

devtools::install_github ("ATFutures/dodgr")

and not the somewhat older CRAN version installed with

install.packages ("dodgr")

1. Extract street network

Street networks and associated boundaries can be conveniently extracted from Open Street Map with the osmdata package. We first need to get a polygonal boundary for Bristol

library (osmdata)
bp <- getbb (place_name = "Bristol UK", format_out = "polygon")
class (bp)
## [1] "list"
length (bp)
## [1] 2

The returned object is a list with two polygons. The easiest way to work out which one we need is to look at them:

par (mfrow = c (1, 2))
plot (bp [[1]], type = "l")
plot (bp [[2]], type = "l")

The first one is obviously what we want (the second extents to the two islands of Flat Holm and Steep Holm within the River Severn, which are offically part of Bristol). The street network can be downloaded with osmdata using the following lines

net <- osmdata::opq (bp [[1]]) %>%
    osmdata::add_osm_feature (key = "highway") %>%
    osmdata::osmdata_sf (quiet = FALSE) %>%
    osmdata::osm_poly2line () %>%
    osmdata::trim_osmdata (bp [[1]])

Each of those lines does the following:

  1. opq converts the bounding polygon to a query to be submitted to the Open Street Map overpass server;
  2. add_osm_feature says we want to extract objects of class "highway";
  3. osmdata_sf says we want the data in Simple Features (sf) format;
  4. osm_poly2line is necessary because street components that are polygon are stored in a separate item to non-polygonal lines; this command merges the two; and finally
  5. trim_osmdata trims the data to within our specified bounding polygon

The result is an object of class osmdata:

net
## Object of class 'osmdata' with:
##                  $bbox : 51.3972838,-2.7183704,51.5444317,-2.5104192
##         $overpass_call : The call submitted to the overpass API
##             $timestamp : [ Do 5 Apr 2018 10:50:15 ]
##            $osm_points : 'sf' Simple Features Collection with 86803 points
##             $osm_lines : 'sf' Simple Features Collection with 19796 linestrings
##          $osm_polygons : 'sf' Simple Features Collection with 538 polygons
##        $osm_multilines : 'sf' Simple Features Collection with 0 multilinestrings
##     $osm_multipolygons : 'sf' Simple Features Collection with 0 multipolygons

The osm_poly2line function merges the osm_polygons with the osm_lines objects, so that all street segments are then contained in the single osm_lines object:

net <- net$osm_lines
dim (net)
## [1] 19796   254

The city has just under 20,000 streets, and there are 254 columns in the sf data.frame. These contain all kinds of information on each of the streets:

names (net)
##   [1] "osm_id"                                
##   [2] "name"                                  
##   [3] "access"                                
##   [4] "access.backward"                       
##   [5] "access.conditional"                    
##   [6] "access.motor_vehicle"                  
##   [7] "addr.city"                             
##   [8] "addr.housename"                        
##   [9] "addr.housenumber"                      
##  [10] "addr.interpolation"                    
##  [11] "addr.postcode"                         
##  [12] "addr.street"                           
##  [13] "addr.suburb"                           
##  [14] "addr.unit"                             
##  [15] "agricultural"                          
##  [16] "alt_name"                              
##  [17] "ambulance"                             
##  [18] "amenity"                               
##  [19] "arcade.left"                           
##  [20] "arcade.right"                          
##  [21] "area"                                  
##  [22] "attraction"                            
##  [23] "beauty"                                
##  [24] "bicycle"                               
##  [25] "bicycle.oneway"                        
##  [26] "bridge"                                
##  [27] "bridgemaster"                          
##  [28] "bridge.movable"                        
##  [29] "bridge.name"                           
##  [30] "bridge_ref"                            
##  [31] "bridge.ref"                            
##  [32] "building"                              
##  [33] "bus"                                   
##  [34] "bus_lane"                              
##  [35] "bus.lanes"                             
##  [36] "busway"                                
##  [37] "busway.left"                           
##  [38] "busway.right"                          
##  [39] "campus"                                
##  [40] "car"                                   
##  [41] "carriageway_ref"                       
##  [42] "coach"                                 
##  [43] "comment"                               
##  [44] "complete"                              
##  [45] "construction"                          
##  [46] "construction.active_traffic_management"
##  [47] "construction.bridge"                   
##  [48] "conveying"                             
##  [49] "covered"                               
##  [50] "created_by"                            
##  [51] "crossing"                              
##  [52] "crossing_ref"                          
##  [53] "cuisine"                               
##  [54] "cutting"                               
##  [55] "cyclestreets_id"                       
##  [56] "cycleway"                              
##  [57] "cycleway.both"                         
##  [58] "cycleway.both.oneway"                  
##  [59] "cycleway.left"                         
##  [60] "cycleway.left.width"                   
##  [61] "cycleway.oneside.width"                
##  [62] "cycleway.otherside"                    
##  [63] "cycleway.otherside.width"              
##  [64] "cycleway.right"                        
##  [65] "cycleway.right.width"                  
##  [66] "date"                                  
##  [67] "description"                           
##  [68] "designated"                            
##  [69] "designation"                           
##  [70] "destination"                           
##  [71] "destination.lanes"                     
##  [72] "destination.ref"                       
##  [73] "direction"                             
##  [74] "distance"                              
##  [75] "disused"                               
##  [76] "drinking_water"                        
##  [77] "ele"                                   
##  [78] "embankment"                            
##  [79] "emergency"                             
##  [80] "est_width"                             
##  [81] "except"                                
##  [82] "fenced"                                
##  [83] "fhrs.id"                               
##  [84] "fixme"                                 
##  [85] "FIXME"                                 
##  [86] "foot"                                  
##  [87] "footway"                               
##  [88] "ford"                                  
##  [89] "goods"                                 
##  [90] "handrail"                              
##  [91] "hgv"                                   
##  [92] "highway"                               
##  [93] "highways_england.area"                 
##  [94] "historic"                              
##  [95] "home_zone"                             
##  [96] "horse"                                 
##  [97] "hov"                                   
##  [98] "hov.lanes"                             
##  [99] "hov.minimum"                           
## [100] "hvg"                                   
## [101] "image"                                 
## [102] "incline"                               
## [103] "incline.value"                         
## [104] "indoor"                                
## [105] "int_ref"                               
## [106] "invalid_carriage"                      
## [107] "is_in"                                 
## [108] "is_in.city"                            
## [109] "junction"                              
## [110] "landuse"                               
## [111] "lane"                                  
## [112] "lanes"                                 
## [113] "lanes.backward"                        
## [114] "lanes.bus.forward"                     
## [115] "lanes.forward"                         
## [116] "lanes.left"                            
## [117] "lanes.psv"                             
## [118] "lanes.psv.backward"                    
## [119] "lanes.psv.forward"                     
## [120] "layer"                                 
## [121] "lcn"                                   
## [122] "lcn_ref"                               
## [123] "leisure"                               
## [124] "level"                                 
## [125] "lit"                                   
## [126] "loc_name"                              
## [127] "loc_ref"                               
## [128] "man_made"                              
## [129] "maxcc"                                 
## [130] "maxheight"                             
## [131] "maxheight.imperial"                    
## [132] "maxheight.physical"                    
## [133] "maxlength"                             
## [134] "maxspeed"                              
## [135] "maxspeed.type"                         
## [136] "maxspeed.variable"                     
## [137] "maxweight"                             
## [138] "maxweightrating"                       
## [139] "maxwidth"                              
## [140] "memorial"                              
## [141] "moped"                                 
## [142] "motorbike"                             
## [143] "motorcar"                              
## [144] "motorcar.conditional"                  
## [145] "motorcycle"                            
## [146] "motorcycle.conditional"                
## [147] "motorroad"                             
## [148] "motor_vehicle"                         
## [149] "motor_vehicle.conditional"             
## [150] "mtb"                                   
## [151] "mtb.scale"                             
## [152] "mtb.scale.imba"                        
## [153] "mtb.scale.uphill"                      
## [154] "name.botanical"                        
## [155] "name.left"                             
## [156] "name.right"                            
## [157] "narrow"                                
## [158] "natural"                               
## [159] "ncn_ref"                               
## [160] "noname"                                
## [161] "note"                                  
## [162] "note2"                                 
## [163] "note_2"                                
## [164] "note.lit"                              
## [165] "note.maxwidth"                         
## [166] "note.name"                             
## [167] "note.postal_code"                      
## [168] "note.postcode"                         
## [169] "not.name"                              
## [170] "nus"                                   
## [171] "old_name"                              
## [172] "old_ref"                               
## [173] "oneway"                                
## [174] "oneway.bicycle"                        
## [175] "oneway.conditional"                    
## [176] "oneway.psv"                            
## [177] "operator"                              
## [178] "osmarender.renderName"                 
## [179] "overtaking"                            
## [180] "path"                                  
## [181] "paved"                                 
## [182] "phone"                                 
## [183] "placement"                             
## [184] "place_numbers"                         
## [185] "postal_code"                           
## [186] "postcode"                              
## [187] "priority"                              
## [188] "private"                               
## [189] "psv"                                   
## [190] "psv.backward"                          
## [191] "psv.guided"                            
## [192] "railway"                               
## [193] "railway.historic"                      
## [194] "ramp"                                  
## [195] "ramp.bicycle"                          
## [196] "ramp.wheelchair"                       
## [197] "rcn_ref"                               
## [198] "ref"                                   
## [199] "roundabout"                            
## [200] "sac_scale"                             
## [201] "segregated"                            
## [202] "ser"                                   
## [203] "service"                               
## [204] "shared"                                
## [205] "shop"                                  
## [206] "sidewalk"                              
## [207] "ski"                                   
## [208] "smoothness"                            
## [209] "source"                                
## [210] "source.access"                         
## [211] "source.addr"                           
## [212] "source.address"                        
## [213] "source.bicycle"                        
## [214] "source.date"                           
## [215] "source.designation"                    
## [216] "source.maxspeed"                       
## [217] "source.maxweight"                      
## [218] "source.name"                           
## [219] "source.not.name"                       
## [220] "source.outline"                        
## [221] "source.position"                       
## [222] "source.ref"                            
## [223] "source.track"                          
## [224] "sport"                                 
## [225] "step_count"                            
## [226] "steps"                                 
## [227] "surface"                               
## [228] "taxi"                                  
## [229] "temporary.access"                      
## [230] "temporary.date_off"                    
## [231] "temporary.date_on"                     
## [232] "todo"                                  
## [233] "toll"                                  
## [234] "tourism"                               
## [235] "tourist_bus"                           
## [236] "tracktype"                             
## [237] "traffic_calming"                       
## [238] "trail_colour"                          
## [239] "trail_visibility"                      
## [240] "tunnel"                                
## [241] "tunnel.ref"                            
## [242] "turn"                                  
## [243] "turn.lanes"                            
## [244] "turn.lanes.backward"                   
## [245] "turn.lanes.forward"                    
## [246] "type"                                  
## [247] "vehicle"                               
## [248] "vehicle.lanes"                         
## [249] "website"                               
## [250] "wheelchair"                            
## [251] "width"                                 
## [252] "wikidata"                              
## [253] "wikipedia"                             
## [254] "geometry"

The dodgr package (Distances on Directed Graphs) contains a helper function for extracting street networks:

library (dodgr)
net2 <- dodgr_streetnet ("Bristol UK")
dim (net2)
## [1] 31401   257

This function yields more streets because it only extracts rectangular street networks. Direct use of the osmdata package allows finer control. From here on, we’ll use the first, polygonal network.

2. dodgr distances

dodgr can be used to calculate distances between points. Let’s work out how far it is to walk from the main train station (Temple Meads) to the mouth of the Avon (where there is a railway station called “Avonmouth”)

station <- osmdata::opq (bp [[1]]) %>%
    osmdata::add_osm_feature (key = "building") %>%
    osmdata::add_osm_feature (key = "name", value = "Temple Meads",
                              value_exact = FALSE) %>%
    osmdata::osmdata_sf ()
avonmouth <- osmdata::opq (bp [[1]]) %>%
    osmdata::add_osm_feature (key = "railway") %>%
    osmdata::add_osm_feature (key = "name", value = "Avonmouth") %>%
    osmdata::osmdata_sf ()

Those are two osmdata objects with the points we want contained in the osm_polygons item. Let’s just get the distance between the first point of each polygon

station <- station$osm_polygons %>%
    sf::st_coordinates ()
station <- as.numeric (station [1, 1:2])
avonmouth <- avonmouth$osm_polygons %>%
    sf::st_coordinates ()
avonmouth <- as.numeric (avonmouth [1, 1:2])
station; avonmouth
## [1] -2.582369 51.449353
## [1] -2.699559 51.500274

The final step before we can route through the street network is to convert the sf-format network into a dodgr network. This is necessary in order to weight each street segment according to a specified weighting profile. In this case, we want to walk, so,

net_walk <- weight_streetnet (net, wt_profile = "foot")
dim (net_walk)
## [1] 176366     13

The weight_streetnet function breaks the network down into individual street segments, of which there are 176,000. Each of these is allocated a weighted distance dependeing on the kind of way and the preferred mode of transport. Calculating the distances is then as easy as,

dodgr_dists (net_walk, from = station, to = avonmouth)
##            595411468
## 5434139604  13.05817

2.1 Street networks and routing

To find out how far it is to cycle to Avonmouth, we need to generate a new dodgr street network weighted for bicycle travel.

net_bike <- weight_streetnet (net, wt_profile = "bicycle")
dodgr_dists (net_bike, from = station, to = avonmouth)
##            595411468
## 5434139604  12.56132

It is about half a kilometre shorter to cycle than to walk to Avonmouth, presumably because Bristol has put some excellent cyclepaths somewhere along the way. Note that routing different kinds of transport always requires entirely different street networks, weighted for each particular kind.

3 Routes and dodgr

Where do our routes actually go? We can find out with dodgr_paths:

p_foot <- dodgr_paths (net_walk, from = station, to = avonmouth)
class (p_foot)
## [1] "list"
length (p_foot)
## [1] 1
head (p_foot [[1]] [[1]])
## [1] "5434139603" "4975934117" "4975934116" "5467924658" "285888512" 
## [6] "17408409"

dodgr_paths returns lists because it’s designed for many-to-many routing tasks (see below), where the first list item is the [[from]] entry, and the second is the [[to]] entry. Each of these then contains a sequence of Open Street Map node IDs. How do we map these back on to the network to generate the coordindates of this path?

verts <- dodgr_vertices (net_walk)
head (verts)
##           id         x        y component n
## 1     331744 -2.666747 51.50947         1 0
## 2 1591662663 -2.666181 51.51003         1 1
## 3     331756 -2.667861 51.50466         1 2
## 4 2983641845 -2.667944 51.50281         1 3
## 5  991949098 -2.684810 51.49633         1 4
## 6     305900 -2.674981 51.50161         1 5

The dodgr_vertices function extracts all of the nodal vertices from the network. Each of these has an id value, so we just need to match our path to these:

index <- match (p_foot [[1]] [[1]], verts$id)
p_foot <- verts [index, ]

We can view this via mapview (or any other suitable way) by converting this to an sf object

p_foot <- p_foot [, c ("x", "y")] %>%
    as.matrix () %>%
    sf::st_linestring () %>%
    sf::st_sfc ()
sf::st_crs (p_foot) <- 4326 # OSM CRS

Then we’re good to go. The following code will open up an interactive map of our path.

library (mapview)
mapview (p_foot)

We can compare the two of them by adding the bike path in a different colour:

p_bike <- dodgr_paths (net_bike, from = station, to = avonmouth)
verts <- dodgr_vertices (net_bike)
index <- match (p_bike [[1]] [[1]], verts$id)
p_bike <- verts [index, ]
p_bike <- p_bike [, c ("x", "y")] %>%
    as.matrix () %>%
    sf::st_linestring () %>%
    sf::st_sfc ()
sf::st_crs (p_bike) <- 4326 # OSM CRS
mapview (p_foot) %>%
    addFeatures (p_bike, color = "red")

Running this code will reveal why and precisely where the walking path is slightly longer than the cycling path.

3.1 What is the advantage of dodgr?

Thus far, dodgr does nothing more than what stplanr can do, yet takes considerably more code to get there. The real advantage of dodgr, and the purpose of its development, is in massive routing tasks. dodgr can calculate distances between loads of points really quickly. Let’s take a sample of the vertices listed previously (and it suffices to just pass the OSM "id" values to these dodgr routines):

vert_sample <- verts$id [sample (nrow (verts), size = 100)]
system.time (
dists <- dodgr_dists (net_bike, from = vert_sample, to = vert_sample)
)
##    user  system elapsed 
##   2.978   0.020   1.030

That call just calculated \(100\times 100=10,000\) distances in around one second. Such calculations can technically be done by calling routing APIs, but even where that is possible, the big constraint is the number of calls. google, for example, only offers 2,500 free requests per day, so we’re already over the limit there. They then charge US$0.50 per additional 1,000 requests, so that would have cost us \(7.5\times 0.5 = \$US3.25\). dodgr can scale to far larger calls. Try scaling the previous call to 1,000 vertices instead of 100. That generally takes around one minute, and performs one million calculations. Using the google API, that would translate to \(1,000\times 0.5 = \$US500\).

Advantage #1: dodgr is free!

Not only is dodgr free, it is also very highly optimized code. Let’s compare dodgr with the “routing-industry-standard” igraph. We’ll do the comparison on a smaller sub-sample of the graph, extracted using the dodgr_sample function. We also need to do some fiddling around to convert our network to igraph format.

nets <- dodgr_sample (net_bike, nverts = 1000)
netc <- dodgr_contract_graph (nets)$graph # removes all points except junctions
from_id <- unique (netc$from_id)
to_id <- unique (netc$to_id)

# set up igraph:
edges <- cbind (netc$from_id, netc$to_id)
edges <- as.vector (t (edges))
igr <- igraph::make_directed_graph (edges)
igraph::E (igr)$weight <- netc$d_weighted

rbenchmark::benchmark (
                       d <- dodgr_dists (netc, from = from_id, to = to_id),
                       d <- igraph::distances (igr, v = from_id, to = to_id,
                                               mode = "out"),
                       replications = 10, order = "relative")
##                                                                 test
## 1                 d <- dodgr_dists(netc, from = from_id, to = to_id)
## 2 d <- igraph::distances(igr, v = from_id, to = to_id, mode = "out")
##   replications elapsed relative user.self sys.self user.child sys.child
## 1           10   0.076    1.000     0.203    0.004          0         0
## 2           10   0.208    2.737     0.208    0.000          0         0

Advantage #2: dodgr is fast!

dodgr is a few times faster than the faster alternative. In computing times, speed increases of a few percent are often big news. A few times really is an awful lot faster. (Note also that igraph does some clever caching behind the scenes, so re-running the above benchmark code will greatly increase the relative speed of igraph; fair comparisons require re-running in a fresh R session.)

4 dodgr and aggregated flows

Beyond the usual distances and routes, the real workhorse of dodgr is the ability to aggregate flows between multiple points. This can readily be demonstrated using our Bristol network, calculating flows between a random sample of 10 points. Flow aggregation requires specifying a "flow" matrix, equivalent to an OD matrix.

n <- 10
pts <- sample (verts$id, size = n)
flows <- array (runif (n ^ 2), dim = rep (n, 2)) # random OD values
f <- dodgr_flows_aggregate (net_bike, from = pts, to = pts, flows = flows)
head (f)
##   geom_num edge_id    from_id  from_lon from_lat      to_id    to_lon
## 1        1       1     331744 -2.666747 51.50947 1591662663 -2.666181
## 2        1       2 1591662663 -2.666181 51.51003 1953103652 -2.664919
## 3        2       3     331756 -2.667861 51.50466 2983641845 -2.667944
## 4        2       4 2983641845 -2.667944 51.50281     331757 -2.668086
## 5        3       5  991949098 -2.684810 51.49633     305908 -2.684199
## 6        4       6     305900 -2.674981 51.50161 1591662605 -2.670281
##     to_lat          d d_weighted  highway  way_id component flow
## 1 51.51003 0.07407219   7407.219 motorway 3391098         1    0
## 2 51.51115 0.15223452  15223.452 motorway 3391098         1    0
## 3 51.50281 0.20554267  20554.267 motorway 3391104         1    0
## 4 51.50205 0.08595035   8595.035 motorway 3391104         1    0
## 5 51.49676 0.06397381   6397.381 motorway 3391107         1    0
## 6 51.50432 0.44360430  44360.430 motorway 3391109         1    0

The dodgr_flows_aggregate() function returns the input data (net_bike) modified through the additional of an extra flow column. Plotting these flows is best done through first filtering out all flow == 0 values:

f <- f [which (f$flow > 0), ]

This dodgr data.frame can then be converted to an sf object for plotting with

fsf <- dodgr_to_sf (f)

This is not quite a standard sf object, as the two components of data and geometry are split into separate list items. The following lines can be used to plot the results, scaling both line thicknesses and colours according to the flow values.

cols <- topo.colors (30) [ceiling (30 * fsf$dat$flow / max (fsf$dat$flow))]
mapview (fsf$geoms, lwd = fsf$dat$flow, color = cols)

That result is only from a made-up sample of 10 points. It is obviously straightforward both to extend to many more points, and also to insert real OD data. As a final note, mapview - and any other leaflet-based map generators - struggles with large data sets (say, beyond 1,000 or so sf objects). For such large data sets, it is often advantageous to filter out flows below some finite minimal value, through replacing the above line with

flim <- 0.1 # or whatever; examine histogram of flows?
f <- f [which (f$flow > flim), ]

4.1 Dispersed flows

Finally, dodgr includes an additional function, dodgr_flows_disperse. This calculates flows describing movement that simply disperses away from a given set of points according to a simple exponential dispersal model. This model corresponds to a typical spatial interaction model, and this function can be applied in places lacking OD data but for which population densities are available.

f <- dodgr_flows_disperse (net_bike, from = pts, dens = runif (length (pts)),
                           k = 0.5) # k is exponential decay parameter in km

This can also be plotted with the same code as above, although (i) it may not respond as many more line segments will be generated, and (ii) the line thicknesses may need to be adjusted with

line_scale <- 2 # or whatever
mapview (fsf$geoms, lwd = line_scale * fsf$dat$flow, color = cols)