Creating travel time isochrone maps with pgRouting and OpenStreetMap data
Lately I’ve been doing quite a lot of travelling across South Africa, visiting many smaller cities in the provinces. Because of this I was inspired to revisit and improve a project from earlier this year in which I produced an isochrone map of South Africa showing travel time from Johannesburg.
The map below shows how long (approximately) it takes to travel from Johannesburg to any point in South Africa using cars and scheduled airline flights. Specifically, it shows the shortest travel time from Johannesburg City Hall, either by driving directly to the destination, or by driving to OR Tambo International Airport, flying to another airport and then driving to the destination.
The map also shows (using dashed lines) the “catchment area” for each airport, by which I mean the area of the country for which the fastest route from Joburg involves flying to that airport. It makes some optimistic assumptions: that no traffic is encountered on the road; that there are no flight delays; and that a total of 1 hour and 30 minutes is used for checking in and boarding a flight and for disembarking at the other end.
I’ve also made similar maps for travel time from Cape Town and Durban, which can be seen below.
I produced these maps using the pgRouting routing engine and road network data from OpenStreetMap. This post details how I did it; the accompanying code is available on GitHub.
Strategy
The strategy for producing these maps can be briefly described as follows:
- Calculate the driving time from the origin point (e.g. Johannesburg City Hall) to every point on the road network of South Africa.
- Calculate the driving time from the origin point to the nearest airport (e.g. OR Tambo International).
- Look up the flying times for every scheduled route out of that airport.
- Calculate the driving time from every airport in South Africa to every point on the road network of South Africa.
- For every point on the road network, work out the shortest time travel time out of all all possible routes.
- One possible route is the direct drive from the origin which has been calculated in point 1.
- Another possible route is given by each flight route out of the airport identified in point 3. The time for each of those routes is given by adding up the driving time from the origin from point 2, the flight time from point 3, the driving time to the destination from point 4, and an extra 90 minutes to account for check-in, boarding, disembarking, etc.
One caveat regarding point 3: every airport in South Africa that is served by scheduled flights has flights from OR Tambo, so it’s unnecessary to consider flights from other airports (e.g. Lanseria or Wonderboom). When it comes to Cape Town and Durban, each city is served by only one airport, but they don’t have direct flights to all airports, so you have to consider routes that involve changing planes in Johannesburg.
Implementation
If you want to follow along on your computer, you’ll need to install some software:
On my Mac I installed all of these with Homebrew by running
On another OS, you’re on your own; I’m pretty sure they’re all available through apt
on a Debian or Ubuntu system.
Create the database and load the SA road network
First we create a PostgreSQL database, and add the PostGIS and pgRouting extensions. I’ve called the database traveltime
.
Then we download and unzip the South African OpenStreetMap extract graciously provided by GeoFabrik.
Next we load the OSM data into the pgRouting database. You will need to change the username from adrian
to your username. The mapconfig.xml
file, which describes the features to load from the OSM data, is available from the git repository.
Remove unconnected ways and vertices
From here on, everything we do is going to be inside the PostgreSQL shell, which we start by running something like the following.
The database produced by osm2pgrouting
has two main tables: ways
, which contains road segments; and ways_vertices_pgr
, which contains vertices, i.e. points where two or more road segments meet.
In the OSM extract there are some ways and vertices that are not connected to the main South African road network, meaning that they will never be reachable from our origin point or from an airport. We want to remove these from the database so that they won’t complicate things later. First we produce a list of all the vertices that are connected to the network.
This is our first encounter with the pgr_drivingDistance
function, and it needs some explaining. This function walks the road network starting at a specified vertex, and calculates the driving distance (or time) from the starting point to each vertex in the network. In this case we don’t care about the driving time; we’re just using it to find out which vertices are connected at all.
The first parameter to the function is a SQL function which defines the “cost” (distance or time) of travelling along each way; since we don’t care about the cost, we set this to 1 for both the forward and reverse directions.
The second parameter is the origin vertex for the walk; the subquery there is to find the ID of the vertext corresponding to OSM node 263568563 which is a point on the N5 just outside of Winburg in the Free State. (This is correct as of July 2017; if you’re visiting this in the future the OSM data might have changed.) The choice of this point is arbitrary; it has to be a point on the South African road network, and I chose Winburg because it’s central.
The third parameter is a limit on the distance that the function will walk before giving up; in this case 20000 is a limit large enough to make sure we’ve walked the whole network.
Now we can continue by deleting the unconnected ways and vertices.
Calculate the driving time from the origin point
We create a table to hold the origin vertex for our map.
OSM node 201596578 is the corner of Albertina Sisulu Road and Rissik Street in front of the Johannesburg City Hall. We could add records here for other origins, like Cape Town and Durban.
Now we use pgr_drivingDistance
to find the driving time from this vertex to every vertex in the road network.
This time we’ve used an actual cost function in the first parameter to pgr_drivingDistance
. It takes the length of the way in metres divided by the speed limit in kilometres per hour, and multiplies that by a factor of 0.08, which scales it to a time in minutes and increases that time by a third. This increase accounts for the fact that one can’t always drive at the declared maximum speed of a road. I determined it by comparing the results of pgRouting’s calculations with times given by other routing engines (Google Maps, OSRM, etc.)
The array_agg
in the second parameter constructs an array of all the vertices in the origin
table, so that if there’s more than one the function will calculate driving times for all the possible origins.
Load airports and flying times
We create a table of airport vertices using the list of OSM node IDs from the file airports.csv. These are the IDs of nodes on the road immediately outside the airport terminal; as before, these are valid as of July 2017 but might need to be updated in the future.
Next we load details of flying times from one airport to another, from the file flying-time.csv. I collected these times from the airline schedules.
Calculate driving times from the airports
Just as we created a table of driving times from the origin point, we now create a table of driving times from each airport vertex.
The repeated uses of pgr_drivingDistance
with the LIMIT
and OFFSET
parameters in the subquery are necessary because that function uses more memory if you have more origin vertices. My computer was not able to calculate the driving times from all 24 airports simultaneously, so I split them up into three batches.
Finding the shortest route to every vertex
At this point we have built up all the data we need according to points 1 to 4 of the strategy that I laid out above. Now we can proceed to calculate the shortest route and travel time from our origin to every vertex on the road network.
The first part of this query, the indented part in the brackets following WITH all_routes AS
, is a common table expression (CTE), which is basically a subquery that we can refer to by the name all_routes
in the main part of the query that follows. This CTE builds up all possible routes from Joburg to every vertex.