GIS, or Geographic Information Systems, seems to have fallen out of favour amongst those with 'data science' leanings. R is a fantastic toolset, with some great geospatial packages that are built on solid GIS libraries, but sometimes I feel it just doesn't compare to using programs purpose-built for geo work. This is why this post will focus on visualising data using QGIS (http://www.qgis.org).
Don't be alarmed by this. QGIS is open source and well supported. It has a Python console that allows the user to script things as required, and it also hooks up to other GIS software such as GRASS, GDAL and SAGA. Go to the link above to find out more about it.
The purpose of this post is to show how easy it is to download OpenStreetMap (OSM) data, load it into QGIS and have a look around. For those unaware of what OSM is, check it out here.
Firstly, I'm on Ubuntu, which means scripting things is a breeze: we can download the OSM data from Geofabrik in a one-liner. We'll get data for The Netherlands:
wget -c http://download.geofabrik.de/europe/netherlands-latest.osm
This .osm file can be quite large, and compressed versions are available. Either way, we treat it as a source data file to import into a database. I've used PostgreSQL along with its GIS extension, PostGIS, for the last two years now. It works well and is very capable.
In order to import the .osm file into PostgreSQL + PostGIS we require a library, GDAL. This is a well-known, open source library that runs on multiple platforms; you may have used it from R. We're going to use the ogr2ogr program from GDAL. The manual describes the function of this program quite well:
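For anyone short on disk space, the compressed extract can be fetched instead of the raw file. The sketch below assumes Geofabrik's file naming follows the URL above with a .osm.pbf or .osm.bz2 suffix; extract_url is a made-up helper name, not part of any tool. GDAL's OSM driver reads .pbf as well as .osm files, so the rest of the workflow should carry over.

```shell
# extract_url REGION [EXT]
# Hypothetical helper: builds a Geofabrik URL for a Europe extract.
# EXT defaults to osm.pbf (the compact binary format); osm.bz2 is the
# bzip2-compressed XML. The URL pattern is assumed from the wget line above.
extract_url() {
    region="$1"
    ext="${2:-osm.pbf}"
    printf 'http://download.geofabrik.de/europe/%s-latest.%s\n' "$region" "$ext"
}

# Print the URL so it can be checked before downloading anything.
extract_url netherlands
```

To actually download, pass the result to wget: wget -c "$(extract_url netherlands)".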
This program can be used to convert simple features data between file formats performing various operations during the process such as spatial or attribute selections, reducing the set of attributes, setting the output coordinate system or even reprojecting the features during translation.
Assuming you've installed things correctly, and you have access to a working PostgreSQL + PostGIS database, you can load the .osm data straight in.
ogr2ogr -f "PostgreSQL" PG:dbname=data netherlands-latest.osm -gt 65536 -overwrite --config PG_USE_COPY YES --config OVERWRITE YES --config SPATIAL_INDEX NO -progress
The command instructs ogr2ogr to insert data from 'netherlands-latest.osm' into the 'PostgreSQL' database 'data'. I'll leave the additional parameters for you to investigate, but quickly mention '--config PG_USE_COPY YES', which makes the driver load rows with COPY rather than individual INSERT statements and greatly speeds up the operation on certain systems.
Now we can fire up QGIS and investigate what we've done. You'll need to click Layer>Add Layer>Add PostGIS Layers. Upon connecting to the database in QGIS, we should see five tables with vector geometries. These are:
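As a starting point for investigating those parameters, here is the same command wrapped in a small function with the better-known flags annotated. The wrapper name import_osm and the DRY_RUN switch are assumptions for illustration only; the flags themselves are copied from the command above.

```shell
# import_osm FILE DBNAME
# Hypothetical wrapper around the ogr2ogr call above. With DRY_RUN set,
# the command is printed instead of executed.
#
#   -f "PostgreSQL"            write through the PostgreSQL/PostGIS driver
#   -gt 65536                  group 65536 features into each transaction
#   -overwrite                 drop and recreate output layers that exist
#   --config PG_USE_COPY YES   load rows with COPY instead of INSERT
#   -progress                  show progress where the source supports it
#
# The OVERWRITE and SPATIAL_INDEX settings are passed through exactly as
# written in the original command.
import_osm() {
    osm_file="$1"
    db_name="$2"
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty.
    ${DRY_RUN:+echo} ogr2ogr -f "PostgreSQL" "PG:dbname=${db_name}" "${osm_file}" \
        -gt 65536 -overwrite \
        --config PG_USE_COPY YES \
        --config OVERWRITE YES \
        --config SPATIAL_INDEX NO \
        -progress
}

# Dry run: prints the full command without touching the database.
DRY_RUN=1 import_osm netherlands-latest.osm data
```

Calling import_osm netherlands-latest.osm data without DRY_RUN runs the import for real, so it needs GDAL installed and a reachable database.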
- points
- lines
- multilinestrings
- multipolygons
- other_relations
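The five tables can also be checked from the shell before opening QGIS. This is a minimal sketch that only prints one ogrinfo command per table (-so asks for a summary rather than a dump of every feature); layer_summary_cmds is a hypothetical helper, and the database name 'data' is taken from the import command.

```shell
# The five layers listed above.
OSM_LAYERS="points lines multilinestrings multipolygons other_relations"

# layer_summary_cmds DBNAME
# Hypothetical helper: prints one "ogrinfo -so" command per layer.
layer_summary_cmds() {
    db_name="$1"
    for layer in $OSM_LAYERS; do
        printf 'ogrinfo -so "PG:dbname=%s" %s\n' "$db_name" "$layer"
    done
}

# Show the commands; nothing is executed against the database here.
layer_summary_cmds data
```

Piping the output into sh (layer_summary_cmds data | sh) would run the commands, assuming ogrinfo is on the PATH and the database is reachable.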
We're going to add roads to our map, so we select lines and filter with the following code:
"highway" in ('motorway','motorway_junction','motorway_link','primary','primary_link','trunk','trunk_link')
This allows us to add only those roads in the lines table that are of type motorway, primary or trunk, along with their link roads. This filter is important on low-spec machines, as it's often possible to import more data than your machine can handle. Let's see the outcome.
Image 1. Adding a PostGIS geometry table with a filter in QGIS
Image 2. Motorways, Primary and Trunk roads of The Netherlands via OSM
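The filter on the lines table can also be generated rather than typed by hand, which is handy when experimenting with different road classes. A minimal sketch, where highway_filter is a hypothetical helper that just assembles the same kind of expression used above:

```shell
# highway_filter CLASS...
# Hypothetical helper: builds a "highway" IN (...) expression from the
# road classes given as arguments.
highway_filter() {
    filter=""
    for class in "$@"; do
        # Append a comma before every class except the first.
        filter="${filter:+$filter,}'$class'"
    done
    printf '"highway" IN (%s)\n' "$filter"
}

# Print a filter for the main road classes.
highway_filter motorway primary trunk
```

The printed expression can be pasted into the QGIS filter box. Assuming psql is installed and the import created a highway column on lines, it can also be used to count how many features a filter would load: psql -d data -c "SELECT count(*) FROM lines WHERE $(highway_filter motorway primary trunk)".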
With a little work, you can achieve some beautiful maps and unlock some incredibly useful data from OSM. Below is just a small sample of the potential of OSM data with QGIS, some additional SQL and creative styling.
Image 3. Bank branch locations in Amsterdam and surrounding region
This post barely touches on any of the fantastic things you can do with QGIS, PostGIS or OSM data, and doesn't even mention the interaction with R. Hopefully I'll have time to show more cool things this software combination has to offer.
Cheers