You can't do web mapping these days without knowing your GeoJSON. It's the vector format of choice among popular mapping libraries like Leaflet, D3.js and Polymaps. Size matters on the web, especially if you want to distribute complex geometries, like the world's countries. The challenge is even bigger if you want to target mobile users - or support web browsers with poor vector handling (IE < 9). This blog post will show you how to minify your GeoJSON files before sending them over the wire.
The first thing you should do is to generalize your vectors so they don't contain more detail than you need. In a previous blog post, I was able to remove 90% of the coordinates without losing too much detail for the map scale I wanted to use. This will of course have a great effect on the file size.
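To give an idea of what generalization does to a line, here is a minimal sketch of the Douglas-Peucker algorithm in plain Python. This is an assumption on my part for illustration: it is not necessarily the method used in the previous post, and the function names and the tolerance value are made up. The tolerance is in the same units as the coordinates (degrees for unprojected data).

```python
import math

def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to the end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Douglas-Peucker: keep only points that deviate more than `tolerance`."""
    if len(points) < 3:
        return list(points)
    # Find the point farthest from the segment joining the two end points.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    # Otherwise split at that point and simplify both halves.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right

line = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (4, 3), (5, 0)]
print(simplify(line, tolerance=0.1))
# → [(0, 0), (3, 0), (4, 3), (5, 0)]
```

The two small wiggles disappear while the sharp corner survives. For real country borders you would probably rather use a GIS tool, since simplifying each polygon on its own can open gaps along shared borders.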
Today, I'm going to use country borders from the Natural Earth dataset. These datasets are already generalized for different scales (1:10m, 1:50m, and 1:110m), so I'll use them as they are. The 1:110m (small scale) and 1:50m (medium scale) shapefiles will cover the needs for the thematic world maps I plan to make:
[Figure: The 110m and 50m country polygons shown in QGIS.]
Let's open the datasets in QGIS. If you look at the attribute table you'll see that each dataset contains 63 attributes, which makes them very versatile. For your web maps, you probably need just a few of the attributes, and you should remove the ones you don't need. I'm keeping the country name and the ISO 3166-1 country codes (alpha-2, alpha-3, and numeric), which can be used to link country geometries to statistical data.
[Figure: Only keep the attributes you need.]
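If you would rather drop the attributes after the conversion to GeoJSON, the same idea can be sketched in a few lines of Python. The property names below (name, iso_a2, iso_a3, example_extra) are placeholders for illustration; check the attribute table of your own file for the actual field names.

```python
def keep_properties(collection, keep):
    """Return a copy of a FeatureCollection with only the properties in `keep`."""
    features = []
    for feature in collection["features"]:
        props = feature.get("properties") or {}
        # Keep a property only if it is both requested and present.
        slim = {key: props[key] for key in keep if key in props}
        features.append({**feature, "properties": slim})
    return {**collection, "features": features}

# Tiny hand-made sample; the values are illustrative only.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {
                "name": "Norway",
                "iso_a2": "NO",
                "iso_a3": "NOR",
                "example_extra": "dropped",
            },
            "geometry": {"type": "Point", "coordinates": [10.7, 59.9]},
        }
    ],
}

slim = keep_properties(sample, keep=("name", "iso_a2", "iso_a3"))
print(slim["features"][0]["properties"])
```

The function builds new dictionaries instead of editing the input, so the original collection is left untouched.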
Next, we can convert the shapefiles to GeoJSON with ogr2ogr:
ogr2ogr -f "GeoJSON" -lco COORDINATE_PRECISION=1 ne_110m_admin_0_countries.json ne_110m_admin_0_countries.shp
ogr2ogr -f "GeoJSON" -lco COORDINATE_PRECISION=2 ne_50m_admin_0_countries.json ne_50m_admin_0_countries.shp
The important thing is that I'm only keeping one decimal (coordinate precision) for the 110m dataset, and two decimals for the 50m dataset, which is sufficient for my map scales. This will reduce the size of the GeoJSON files by more than half. The size of the 110m GeoJSON is now 207 kB and the 50m version is 1,897 kB. But we can do better.
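To see why fewer decimals give smaller files, here is a rough Python sketch that rounds a nested coordinates array and compares the serialized lengths. It only illustrates the effect; it is not how ogr2ogr works internally, and the ring below is made up. As a rule of thumb, 0.1 degree of latitude is about 11 km and 0.01 degree about 1.1 km.

```python
import json

def round_coords(coords, ndigits):
    """Round every number in a (possibly nested) GeoJSON coordinates array."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(c, ndigits) for c in coords]

# A made-up ring with six decimals per coordinate.
ring = [[10.123456, 59.987654], [10.654321, 59.123456], [10.123456, 59.987654]]

print("original:", len(json.dumps(ring)), "characters")
for ndigits in (1, 2):
    rounded = round_coords(ring, ndigits)
    print(ndigits, "decimal(s):", len(json.dumps(rounded)), "characters", rounded[0])
```

The same function works for points, lines, polygons and multi-polygons, since it simply recurses until it reaches a number.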
The files contain a lot of whitespace, which is a waste of space. I planned to use Sublime Text to remove the whitespace, but it was not able to handle the 50m GeoJSON file, so I switched to Notepad++. I used these regular expressions:
Find: "([^a-z.]) "
Replace: "$1"
This will remove every space that does not follow a lowercase letter (a-z) or a dot, so the spaces inside country names are kept.
Find: "\n,"
Replace: ","
This will remove the line break in front of each comma (keeping some line breaks for readability).
Find: "\.0([,\]])"
Replace: "$1"
This will remove a trailing ".0" from whole-number coordinates, so 174.0 becomes 174.
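If you don't have Notepad++ at hand, the three find-and-replace steps above can be sketched with Python's re module. The patterns are the same; only the replacement syntax differs (\1 instead of $1). The sample text is made up to resemble ogr2ogr output. Keep in mind that the first pattern works on raw text, so a space inside a name would also be removed if it follows anything other than a-z or a dot, for example an uppercase or accented letter.

```python
import json
import re

def minify(text):
    """Apply the three regex replacements from the text, in order."""
    # 1. Drop a space unless it follows a lowercase letter (a-z) or a dot.
    text = re.sub(r"([^a-z.]) ", r"\1", text)
    # 2. Remove the line break in front of a comma.
    text = re.sub(r"\n,", ",", text)
    # 3. Drop ".0" when it is followed by a comma or a closing bracket.
    text = re.sub(r"\.0([,\]])", r"\1", text)
    return text

# Made-up sample in the style of ogr2ogr output.
sample = (
    '[\n'
    '{ "type": "Feature", "properties": { "name": "New Zealand" }, '
    '"geometry": { "type": "Point", "coordinates": [ 174.0, -41.5 ] } }\n'
    ',\n'
    '{ "type": "Feature" }\n'
    ']'
)

small = minify(sample)
print(small)
print(len(sample), "->", len(small), "characters")

# The minified text should still parse to the same data.
assert json.loads(small) == json.loads(sample)
```

The final assertion is a cheap safety check worth keeping: if a replacement ever damages the file, parsing or the comparison will fail.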
This will reduce the file size of the 110m GeoJSON from 207 to 156 kB, without losing any data quality. More than 400k whitespace characters were removed from the 50m GeoJSON file, reducing the file size from 1,897 to 1,481 kB.
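Another way to strip the whitespace, which avoids regular expressions altogether, is to parse the file and write it back without padding. This is an alternative to the editor approach rather than a description of it, and the compact function is just a name I picked. Note that Python's json module keeps 174.0 as 174.0, so the trailing ".0" is not removed this way.

```python
import json

def compact(geojson):
    """Serialize without spaces after commas and colons."""
    return json.dumps(geojson, separators=(",", ":"), ensure_ascii=False)

data = json.loads('{ "type": "Feature", "properties": { "name": "New Zealand" } }')
print(compact(data))
# → {"type":"Feature","properties":{"name":"New Zealand"}}
```

Because the text is parsed first, spaces inside names are always safe, whatever character comes before them.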
If your web server supports on-the-fly gzip compression, the 110m GeoJSON will end up being 45 kB and the 50m version will be 430 kB. Not bad!
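You can estimate the gzipped size before uploading anything with Python's gzip module. This is only a sketch: the sample below is a made-up, highly repetitive string, so it compresses far better than real country borders would, and a web server may use a different compression level.

```python
import gzip

def sizes(text):
    """Return (raw, gzipped) size in bytes for a UTF-8 encoded string."""
    raw = text.encode("utf-8")
    return len(raw), len(gzip.compress(raw))

# Made-up sample: the same small feature repeated 200 times.
feature = (
    '{"type":"Feature","properties":{"name":"Example"},'
    '"geometry":{"type":"Point","coordinates":[10.1,59.9]}}'
)
sample = '{"type":"FeatureCollection","features":[' + ",".join([feature] * 200) + "]}"

raw_size, gzip_size = sizes(sample)
print(raw_size, "bytes raw,", gzip_size, "bytes gzipped")
```

To measure a real file, read it with open(path, encoding="utf-8").read() and pass the text to sizes.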
And if this is too much work, you can always download the final GeoJSON files on thematicmapping.org.
NB! Mike Bostock’s TopoJSON would allow us to compress the GeoJSON even more, while preserving topology (shared borders between countries) - but we would need to use a map client supporting the format. Looks promising!