Note at the outset that these routines likely require the latest, development version of dodgr
, installed with
devtools::install_github ("ATFutures/dodgr")
and not the somewhat older CRAN version installed with
install.packages ("dodgr")
Street networks and associated boundaries can be conveniently extracted from Open Street Map with the osmdata
package. We first need to get a polygonal boundary for Bristol
library (osmdata)
bp <- getbb (place_name = "Bristol UK", format_out = "polygon")
class (bp)
## [1] "list"
length (bp)
## [1] 2
The returned object is a list with two polygons. The easiest way to work out which one we need is to look at them:
par (mfrow = c (1, 2))
plot (bp [[1]], type = "l")
plot (bp [[2]], type = "l")
The first one is obviously what we want (the second extents to the two islands of Flat Holm and Steep Holm within the River Severn, which are offically part of Bristol). The street network can be downloaded with osmdata
using the following lines
net <- osmdata::opq (bp [[1]]) %>%
osmdata::add_osm_feature (key = "highway") %>%
osmdata::osmdata_sf (quiet = FALSE) %>%
osmdata::osm_poly2line () %>%
osmdata::trim_osmdata (bp [[1]])
Each of those lines does the following:
opq
converts the bounding polygon to a query to be submitted to the Open Street Map overpass
server;add_osm_feature
says we want to extract objects of class "highway"
;osmdata_sf
says we want the data in Simple Features (sf
) format;osm_poly2line
is necessary because street components that are polygon are stored in a separate item to non-polygonal lines; this command merges the two; and finallytrim_osmdata
trims the data to within our specified bounding polygonThe result is an object of class osmdata
:
net
## Object of class 'osmdata' with:
## $bbox : 51.3972838,-2.7183704,51.5444317,-2.5104192
## $overpass_call : The call submitted to the overpass API
## $timestamp : [ Do 5 Apr 2018 10:50:15 ]
## $osm_points : 'sf' Simple Features Collection with 86803 points
## $osm_lines : 'sf' Simple Features Collection with 19796 linestrings
## $osm_polygons : 'sf' Simple Features Collection with 538 polygons
## $osm_multilines : 'sf' Simple Features Collection with 0 multilinestrings
## $osm_multipolygons : 'sf' Simple Features Collection with 0 multipolygons
The osm_poly2line
function merges the osm_polygons
with the osm_lines
objects, so that all street segments are then contained in the single osm_lines
object:
net <- net$osm_lines
dim (net)
## [1] 19796 254
The city has just under 20,000 streets, and there are 254 columns in the sf data.frame
. These contain all kinds of information on each of the streets:
names (net)
## [1] "osm_id"
## [2] "name"
## [3] "access"
## [4] "access.backward"
## [5] "access.conditional"
## [6] "access.motor_vehicle"
## [7] "addr.city"
## [8] "addr.housename"
## [9] "addr.housenumber"
## [10] "addr.interpolation"
## [11] "addr.postcode"
## [12] "addr.street"
## [13] "addr.suburb"
## [14] "addr.unit"
## [15] "agricultural"
## [16] "alt_name"
## [17] "ambulance"
## [18] "amenity"
## [19] "arcade.left"
## [20] "arcade.right"
## [21] "area"
## [22] "attraction"
## [23] "beauty"
## [24] "bicycle"
## [25] "bicycle.oneway"
## [26] "bridge"
## [27] "bridgemaster"
## [28] "bridge.movable"
## [29] "bridge.name"
## [30] "bridge_ref"
## [31] "bridge.ref"
## [32] "building"
## [33] "bus"
## [34] "bus_lane"
## [35] "bus.lanes"
## [36] "busway"
## [37] "busway.left"
## [38] "busway.right"
## [39] "campus"
## [40] "car"
## [41] "carriageway_ref"
## [42] "coach"
## [43] "comment"
## [44] "complete"
## [45] "construction"
## [46] "construction.active_traffic_management"
## [47] "construction.bridge"
## [48] "conveying"
## [49] "covered"
## [50] "created_by"
## [51] "crossing"
## [52] "crossing_ref"
## [53] "cuisine"
## [54] "cutting"
## [55] "cyclestreets_id"
## [56] "cycleway"
## [57] "cycleway.both"
## [58] "cycleway.both.oneway"
## [59] "cycleway.left"
## [60] "cycleway.left.width"
## [61] "cycleway.oneside.width"
## [62] "cycleway.otherside"
## [63] "cycleway.otherside.width"
## [64] "cycleway.right"
## [65] "cycleway.right.width"
## [66] "date"
## [67] "description"
## [68] "designated"
## [69] "designation"
## [70] "destination"
## [71] "destination.lanes"
## [72] "destination.ref"
## [73] "direction"
## [74] "distance"
## [75] "disused"
## [76] "drinking_water"
## [77] "ele"
## [78] "embankment"
## [79] "emergency"
## [80] "est_width"
## [81] "except"
## [82] "fenced"
## [83] "fhrs.id"
## [84] "fixme"
## [85] "FIXME"
## [86] "foot"
## [87] "footway"
## [88] "ford"
## [89] "goods"
## [90] "handrail"
## [91] "hgv"
## [92] "highway"
## [93] "highways_england.area"
## [94] "historic"
## [95] "home_zone"
## [96] "horse"
## [97] "hov"
## [98] "hov.lanes"
## [99] "hov.minimum"
## [100] "hvg"
## [101] "image"
## [102] "incline"
## [103] "incline.value"
## [104] "indoor"
## [105] "int_ref"
## [106] "invalid_carriage"
## [107] "is_in"
## [108] "is_in.city"
## [109] "junction"
## [110] "landuse"
## [111] "lane"
## [112] "lanes"
## [113] "lanes.backward"
## [114] "lanes.bus.forward"
## [115] "lanes.forward"
## [116] "lanes.left"
## [117] "lanes.psv"
## [118] "lanes.psv.backward"
## [119] "lanes.psv.forward"
## [120] "layer"
## [121] "lcn"
## [122] "lcn_ref"
## [123] "leisure"
## [124] "level"
## [125] "lit"
## [126] "loc_name"
## [127] "loc_ref"
## [128] "man_made"
## [129] "maxcc"
## [130] "maxheight"
## [131] "maxheight.imperial"
## [132] "maxheight.physical"
## [133] "maxlength"
## [134] "maxspeed"
## [135] "maxspeed.type"
## [136] "maxspeed.variable"
## [137] "maxweight"
## [138] "maxweightrating"
## [139] "maxwidth"
## [140] "memorial"
## [141] "moped"
## [142] "motorbike"
## [143] "motorcar"
## [144] "motorcar.conditional"
## [145] "motorcycle"
## [146] "motorcycle.conditional"
## [147] "motorroad"
## [148] "motor_vehicle"
## [149] "motor_vehicle.conditional"
## [150] "mtb"
## [151] "mtb.scale"
## [152] "mtb.scale.imba"
## [153] "mtb.scale.uphill"
## [154] "name.botanical"
## [155] "name.left"
## [156] "name.right"
## [157] "narrow"
## [158] "natural"
## [159] "ncn_ref"
## [160] "noname"
## [161] "note"
## [162] "note2"
## [163] "note_2"
## [164] "note.lit"
## [165] "note.maxwidth"
## [166] "note.name"
## [167] "note.postal_code"
## [168] "note.postcode"
## [169] "not.name"
## [170] "nus"
## [171] "old_name"
## [172] "old_ref"
## [173] "oneway"
## [174] "oneway.bicycle"
## [175] "oneway.conditional"
## [176] "oneway.psv"
## [177] "operator"
## [178] "osmarender.renderName"
## [179] "overtaking"
## [180] "path"
## [181] "paved"
## [182] "phone"
## [183] "placement"
## [184] "place_numbers"
## [185] "postal_code"
## [186] "postcode"
## [187] "priority"
## [188] "private"
## [189] "psv"
## [190] "psv.backward"
## [191] "psv.guided"
## [192] "railway"
## [193] "railway.historic"
## [194] "ramp"
## [195] "ramp.bicycle"
## [196] "ramp.wheelchair"
## [197] "rcn_ref"
## [198] "ref"
## [199] "roundabout"
## [200] "sac_scale"
## [201] "segregated"
## [202] "ser"
## [203] "service"
## [204] "shared"
## [205] "shop"
## [206] "sidewalk"
## [207] "ski"
## [208] "smoothness"
## [209] "source"
## [210] "source.access"
## [211] "source.addr"
## [212] "source.address"
## [213] "source.bicycle"
## [214] "source.date"
## [215] "source.designation"
## [216] "source.maxspeed"
## [217] "source.maxweight"
## [218] "source.name"
## [219] "source.not.name"
## [220] "source.outline"
## [221] "source.position"
## [222] "source.ref"
## [223] "source.track"
## [224] "sport"
## [225] "step_count"
## [226] "steps"
## [227] "surface"
## [228] "taxi"
## [229] "temporary.access"
## [230] "temporary.date_off"
## [231] "temporary.date_on"
## [232] "todo"
## [233] "toll"
## [234] "tourism"
## [235] "tourist_bus"
## [236] "tracktype"
## [237] "traffic_calming"
## [238] "trail_colour"
## [239] "trail_visibility"
## [240] "tunnel"
## [241] "tunnel.ref"
## [242] "turn"
## [243] "turn.lanes"
## [244] "turn.lanes.backward"
## [245] "turn.lanes.forward"
## [246] "type"
## [247] "vehicle"
## [248] "vehicle.lanes"
## [249] "website"
## [250] "wheelchair"
## [251] "width"
## [252] "wikidata"
## [253] "wikipedia"
## [254] "geometry"
The dodgr
package (Distances on Directed Graphs) contains a helper function for extracting street networks:
library (dodgr)
net2 <- dodgr_streetnet ("Bristol UK")
dim (net2)
## [1] 31401 257
This function yields more streets because it only extracts rectangular street networks. Direct use of the osmdata
package allows finer control. From here on, we’ll use the first, polygonal network.
dodgr
can be used to calculate distances between points. Let’s work out how far it is to walk from the main train station (Temple Meads) to the mouth of the Avon (where there is a railway station called “Avonmouth”)
station <- osmdata::opq (bp [[1]]) %>%
osmdata::add_osm_feature (key = "building") %>%
osmdata::add_osm_feature (key = "name", value = "Temple Meads",
value_exact = FALSE) %>%
osmdata::osmdata_sf ()
avonmouth <- osmdata::opq (bp [[1]]) %>%
osmdata::add_osm_feature (key = "railway") %>%
osmdata::add_osm_feature (key = "name", value = "Avonmouth") %>%
osmdata::osmdata_sf ()
Those are two osmdata
objects with the points we want contained in the osm_polygons
item. Let’s just get the distance between the first point of each polygon
station <- station$osm_polygons %>%
sf::st_coordinates ()
station <- as.numeric (station [1, 1:2])
avonmouth <- avonmouth$osm_polygons %>%
sf::st_coordinates ()
avonmouth <- as.numeric (avonmouth [1, 1:2])
station; avonmouth
## [1] -2.582369 51.449353
## [1] -2.699559 51.500274
The final step before we can route through the street network is to convert the sf
-format network into a dodgr
network. This is necessary in order to weight each street segment according to a specified weighting profile. In this case, we want to walk, so,
net_walk <- weight_streetnet (net, wt_profile = "foot")
dim (net_walk)
## [1] 176366 13
The weight_streetnet
function breaks the network down into individual street segments, of which there are 176,000. Each of these is allocated a weighted distance dependeing on the kind of way and the preferred mode of transport. Calculating the distances is then as easy as,
dodgr_dists (net_walk, from = station, to = avonmouth)
## 595411468
## 5434139604 13.05817
To find out how far it is to cycle to Avonmouth, we need to generate a new dodgr
street network weighted for bicycle travel.
net_bike <- weight_streetnet (net, wt_profile = "bicycle")
dodgr_dists (net_bike, from = station, to = avonmouth)
## 595411468
## 5434139604 12.56132
It is about half a kilometre shorter to cycle than to walk to Avonmouth, presumably because Bristol has put some excellent cyclepaths somewhere along the way. Note that routing different kinds of transport always requires entirely different street networks, weighted for each particular kind.
Where do our routes actually go? We can find out with dodgr_paths
:
p_foot <- dodgr_paths (net_walk, from = station, to = avonmouth)
class (p_foot)
## [1] "list"
length (p_foot)
## [1] 1
head (p_foot [[1]] [[1]])
## [1] "5434139603" "4975934117" "4975934116" "5467924658" "285888512"
## [6] "17408409"
dodgr_paths
returns lists because it’s designed for many-to-many routing tasks (see below), where the first list item is the [[from]]
entry, and the second is the [[to]]
entry. Each of these then contains a sequence of Open Street Map node IDs. How do we map these back on to the network to generate the coordindates of this path?
verts <- dodgr_vertices (net_walk)
head (verts)
## id x y component n
## 1 331744 -2.666747 51.50947 1 0
## 2 1591662663 -2.666181 51.51003 1 1
## 3 331756 -2.667861 51.50466 1 2
## 4 2983641845 -2.667944 51.50281 1 3
## 5 991949098 -2.684810 51.49633 1 4
## 6 305900 -2.674981 51.50161 1 5
The dodgr_vertices
function extracts all of the nodal vertices from the network. Each of these has an id
value, so we just need to match our path to these:
index <- match (p_foot [[1]] [[1]], verts$id)
p_foot <- verts [index, ]
We can view this via mapview
(or any other suitable way) by converting this to an sf
object
p_foot <- p_foot [, c ("x", "y")] %>%
as.matrix () %>%
sf::st_linestring () %>%
sf::st_sfc ()
sf::st_crs (p_foot) <- 4326 # OSM CRS
Then we’re good to go. The following code will open up an interactive map of our path.
library (mapview)
mapview (p_foot)
We can compare the two of them by adding the bike
path in a different colour:
p_bike <- dodgr_paths (net_bike, from = station, to = avonmouth)
verts <- dodgr_vertices (net_bike)
index <- match (p_bike [[1]] [[1]], verts$id)
p_bike <- verts [index, ]
p_bike <- p_bike [, c ("x", "y")] %>%
as.matrix () %>%
sf::st_linestring () %>%
sf::st_sfc ()
sf::st_crs (p_bike) <- 4326 # OSM CRS
mapview (p_foot) %>%
addFeatures (p_bike, color = "red")
Running this code will reveal why and precisely where the walking path is slightly longer than the cycling path.
Thus far, dodgr
does nothing more than what stplanr
can do, yet takes considerably more code to get there. The real advantage of dodgr
, and the purpose of its development, is in massive routing tasks. dodgr
can calculate distances between loads of points really quickly. Let’s take a sample of the vertices listed previously (and it suffices to just pass the OSM "id"
values to these dodgr
routines):
vert_sample <- verts$id [sample (nrow (verts), size = 100)]
system.time (
dists <- dodgr_dists (net_bike, from = vert_sample, to = vert_sample)
)
## user system elapsed
## 2.978 0.020 1.030
That call just calculated \(100\times 100=10,000\) distances in around one second. Such calculations can technically be done by calling routing APIs, but even where that is possible, the big constraint is the number of calls. google, for example, only offers 2,500 free requests per day, so we’re already over the limit there. They then charge US$0.50 per additional 1,000 requests, so that would have cost us \(7.5\times 0.5 = \$US3.25\). dodgr
can scale to far larger calls. Try scaling the previous call to 1,000 vertices instead of 100. That generally takes around one minute, and performs one million calculations. Using the google API, that would translate to \(1,000\times 0.5 = \$US500\).
Not only is dodgr
free, it is also very highly optimized code. Let’s compare dodgr
with the “routing-industry-standard” igraph
. We’ll do the comparison on a smaller sub-sample of the graph, extracted using the dodgr_sample
function. We also need to do some fiddling around to convert our network to igraph
format.
nets <- dodgr_sample (net_bike, nverts = 1000)
netc <- dodgr_contract_graph (nets)$graph # removes all points except junctions
from_id <- unique (netc$from_id)
to_id <- unique (netc$to_id)
# set up igraph:
edges <- cbind (netc$from_id, netc$to_id)
edges <- as.vector (t (edges))
igr <- igraph::make_directed_graph (edges)
igraph::E (igr)$weight <- netc$d_weighted
rbenchmark::benchmark (
d <- dodgr_dists (netc, from = from_id, to = to_id),
d <- igraph::distances (igr, v = from_id, to = to_id,
mode = "out"),
replications = 10, order = "relative")
## test
## 1 d <- dodgr_dists(netc, from = from_id, to = to_id)
## 2 d <- igraph::distances(igr, v = from_id, to = to_id, mode = "out")
## replications elapsed relative user.self sys.self user.child sys.child
## 1 10 0.076 1.000 0.203 0.004 0 0
## 2 10 0.208 2.737 0.208 0.000 0 0
dodgr
is a few times faster than the faster alternative. In computing times, speed increases of a few percent are often big news. A few times really is an awful lot faster. (Note also that igraph
does some clever caching behind the scenes, so re-running the above benchmark code will greatly increase the relative speed of igraph
; fair comparisons require re-running in a fresh R session.)
Beyond the usual distances and routes, the real workhorse of dodgr
is the ability to aggregate flows between multiple points. This can readily be demonstrated using our Bristol network, calculating flows between a random sample of 10 points. Flow aggregation requires specifying a "flow"
matrix, equivalent to an OD matrix.
n <- 10
pts <- sample (verts$id, size = n)
flows <- array (runif (n ^ 2), dim = rep (n, 2)) # random OD values
f <- dodgr_flows_aggregate (net_bike, from = pts, to = pts, flows = flows)
head (f)
## geom_num edge_id from_id from_lon from_lat to_id to_lon
## 1 1 1 331744 -2.666747 51.50947 1591662663 -2.666181
## 2 1 2 1591662663 -2.666181 51.51003 1953103652 -2.664919
## 3 2 3 331756 -2.667861 51.50466 2983641845 -2.667944
## 4 2 4 2983641845 -2.667944 51.50281 331757 -2.668086
## 5 3 5 991949098 -2.684810 51.49633 305908 -2.684199
## 6 4 6 305900 -2.674981 51.50161 1591662605 -2.670281
## to_lat d d_weighted highway way_id component flow
## 1 51.51003 0.07407219 7407.219 motorway 3391098 1 0
## 2 51.51115 0.15223452 15223.452 motorway 3391098 1 0
## 3 51.50281 0.20554267 20554.267 motorway 3391104 1 0
## 4 51.50205 0.08595035 8595.035 motorway 3391104 1 0
## 5 51.49676 0.06397381 6397.381 motorway 3391107 1 0
## 6 51.50432 0.44360430 44360.430 motorway 3391109 1 0
The dodgr_flows_aggregate()
function returns the input data (net_bike
) modified through the additional of an extra flow
column. Plotting these flows is best done through first filtering out all flow == 0
values:
f <- f [which (f$flow > 0), ]
This dodgr data.frame
can then be converted to an sf
object for plotting with
fsf <- dodgr_to_sf (f)
This is not quite a standard sf
object, as the two components of data and geometry are split into separate list items. The following lines can be used to plot the results, scaling both line thicknesses and colours according to the flow values.
cols <- topo.colors (30) [ceiling (30 * fsf$dat$flow / max (fsf$dat$flow))]
mapview (fsf$geoms, lwd = fsf$dat$flow, color = cols)
That result is only from a made-up sample of 10 points. It is obviously straightforward both to extend to many more points, and also to insert real OD data. As a final note, mapview
- and any other leaflet
-based map generators - struggles with large data sets (say, beyond 1,000 or so sf
objects). For such large data sets, it is often advantageous to filter out flows below some finite minimal value, through replacing the above line with
flim <- 0.1 # or whatever; examine histogram of flows?
f <- f [which (f$flow > flim), ]
Finally, dodgr
includes an additional function, dodgr_flows_disperse
. This calculates flows describing movement that simply disperses away from a given set of points according to a simple exponential dispersal model. This model corresponds to a typical spatial interaction model, and this function can be applied in places lacking OD data but for which population densities are available.
f <- dodgr_flows_disperse (net_bike, from = pts, dens = runif (length (pts)),
k = 0.5) # k is exponential decay parameter in km
This can also be plotted with the same code as above, although (i) it may not respond as many more line segments will be generated, and (ii) the line thicknesses may need to be adjusted with
line_scale <- 2 # or whatever
mapview (fsf$geoms, lwd = line_scale * fsf$dat$flow, color = cols)