Thursday 13 March 2008

World borders for thematic web mapping

I've spent almost a week tweaking a world borders dataset. Since I need a dataset that I can use without restrictions, I had to dig into the public domain world. I found a suitable dataset on the Mapping Hacks website. The original dataset was derived by Schuyler Erle from public domain sources, and Sean Gillies has done some cleanup and enhancements.

The main changes I've made are:
  • Polygons representing one country/area are merged into one feature.
  • Added ISO 3166-1 country codes (alpha-2, alpha-3, numeric-3).
  • Various feature changes to make the dataset more compatible with ISO 3166-1.
  • Added region and sub-region codes from the UN Statistics Division.
  • Added longitude/latitude values for each country.
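
The first change in the list, merging all polygons of a country into one feature, can be illustrated with a minimal Python sketch. This is not the tooling used to build the dataset; the function name merge_by_country and the sample rings are hypothetical and only show the grouping idea.

```python
from collections import defaultdict


def merge_by_country(polygons):
    """Group (country_code, ring) pairs into one list of rings per country.

    Each resulting entry plays the role of a single multipolygon feature.
    """
    merged = defaultdict(list)
    for code, ring in polygons:
        merged[code].append(ring)
    return dict(merged)


# Hypothetical input: two separate polygons for "NOR", one for "SWE".
sample = [
    ("NOR", [(5.0, 60.0), (6.0, 60.0), (6.0, 61.0), (5.0, 60.0)]),
    ("SWE", [(12.0, 58.0), (13.0, 58.0), (13.0, 59.0), (12.0, 58.0)]),
    ("NOR", [(15.0, 78.0), (16.0, 78.0), (16.0, 79.0), (15.0, 78.0)]),
]
features = merge_by_country(sample)
```

After the merge, each country code maps to exactly one feature, which makes it straightforward to join one row of statistics to one country.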

More information is available on this page and in the "readme.txt" file provided with the dataset. You can download the dataset in two resolutions. I recommend the simplified version for thematic web mapping.

The dataset could be further optimised by removing small island polygons that are not individual countries.

Download and enjoy!

26 comments:

Beau Gunderson said...

Fantastic work! I've been looking for a world border dataset like this for a LONG time.

Ignatius Indiligentius said...

Congratulations for your blog!

I cannot open your shapefile. I use Linux, and this is the output:

$ unzip TM_WORLD_BORDERS-0.1.zip
Archive: TM_WORLD_BORDERS-0.1.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of TM_WORLD_BORDERS-0.1.zip or
TM_WORLD_BORDERS-0.1.zip.zip, and cannot find TM_WORLD_BORDERS-0.1.zip.ZIP, period.

Is your shapefile OK?

Thank you

Bjørn Sandvik said...

The shapefiles should be OK, but they were zipped on a Windows computer using 7-Zip (www.7-zip.org). I'm not familiar with the Linux unzip command.

Unknown said...

Great work!
I can't open the zip file (with 7-Zip on Windows); I downloaded the file three times and it never worked.
Perhaps you should re-upload the zip to your blog.
Many thanks in advance

Sebastien VIAL
GIS teacher

Bjørn Sandvik said...

Hi,
New zip archives are now available on this page.

Bjørn Sandvik said...

Use link above.

Anonymous said...

Whatever the nuances of copyright/database law in your area, can I assume you intend the dataset resulting from your modifications to also be in (or essentially in) the public domain? The Readme file is a bit unclear on this topic.

Bjørn Sandvik said...

This dataset is available under a Creative Commons Attribution-Share Alike License. I will add this information to the Readme-file.

Unknown said...

Bjorn,

Thanks for this great contribution. Using your data there's now a Mapnik tutorial here that creates a simple thematic map of world population: http://trac.mapnik.org/wiki/XMLGettingStarted#WorldPopulationXML

I'm curious if you have given thought to reprojection issues with this dataset? I'd like to use it for tutorials involving reprojection but it appears that the use of multipolygons can cause havoc near the dateline. This does not happen with the world_borders.shp from mapping hacks.

For an example of this try:

ogr2ogr -t_srs http://spatialreference.org/ref/user/north-pacific-albers-conic-equal-area/ TM_WORLD_BORDERS_SIMPL_pacific_albers.shp TM_WORLD_BORDERS_SIMPL-0.2.shp

Bjørn Sandvik said...

Hi Dane,

Good to see that this dataset is used in an open source project.

I've not given thought to reprojection issues with this dataset. I think you're right about multipolygons: having parts on both sides of the dateline might cause problems for some projections. Maybe splitting multipolygons that cross the dateline will solve your problem?

Please notify me if you're able to find a solution.
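
A rough Python sketch of the splitting idea, assuming each feature is a list of parts and each part is a list of (lon, lat) pairs in degrees. The function names, the margin parameter and the sample coordinates are hypothetical, and dividing parts by mean longitude is only a heuristic, not a tested fix for the reprojection problem.

```python
def spans_dateline(parts, margin=1.0):
    """True if some part reaches within `margin` degrees of +180
    and some part reaches within `margin` degrees of -180."""
    touches_east = any(max(lon for lon, _ in part) >= 180 - margin for part in parts)
    touches_west = any(min(lon for lon, _ in part) <= -180 + margin for part in parts)
    return touches_east and touches_west


def split_by_hemisphere(parts):
    """Divide parts into (west, east) by the sign of their mean longitude."""
    west, east = [], []
    for part in parts:
        mean_lon = sum(lon for lon, _ in part) / len(part)
        (west if mean_lon < 0 else east).append(part)
    return west, east


# Hypothetical feature with one part on each side of the dateline.
feature = [
    [(179.0, 60.0), (180.0, 60.0), (180.0, 61.0), (179.0, 60.0)],
    [(-180.0, 60.0), (-179.0, 60.0), (-179.0, 61.0), (-180.0, 60.0)],
]
if spans_dateline(feature):
    west_parts, east_parts = split_by_hemisphere(feature)
```

Each group could then be written out as its own feature before reprojecting.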

Anonymous said...

There is a small (but funny) bug in the dataset.

"United Arab Emirates" is misspelled as "Untied Arab Emirates", at least in version 0.2 (which is what I'm using right now).

Bjørn Sandvik said...

Version 0.3 of the world borders dataset is now available. See changelog in Readme.txt

Unknown said...

Hello again,

A few more items to report:

1) There is now a 'Hello World' GeoDjango application using your data available at:

http://code.google.com/p/geodjango-basic-apps

2) I've yet to look closely at the reprojection issue, but when I do I'm also going to look more closely at the invalid Antarctica polygon issue described here:

http://www.nabble.com/modifying-XMLGettingStarted-to-output-in-mercator-projection--tt17664268.html#a17664268

Of course if anyone else digs into the issue please post here!

Cheers, Dane

Roel said...

Hi Bjorn,

The download link isn't working anymore. Could you provide some help?

Thanks!

rOEL

Bjørn Sandvik said...

Hi rOEL,

The download link seems to work here.

Anonymous said...

Hi all,
Sorry if I'm wrong here, but I've noticed that some of the population estimates seem to be a factor of 10 out.... This is not necessarily an exhaustive list but countries I've identified are Italy, Colombia, Saudi Arabia, Niger, Mali, Ghana and Senegal. You can correct the populations in R using correctid <- c(86,38,177,126,113,69,183) and correctpop <- c(58093000,45600000,24573000,13957000,13518000,22113000,11658000). I've taken the figures from http://en.wikipedia.org/wiki/List_of_countries_by_population_in_2005
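
The same correction can be sketched in Python. The ids and population figures are the ones quoted in the comment above; the record layout, the key names "id" and "POP2005", and the sample records are assumptions made for illustration, so adjust them to however the attribute table is actually loaded.

```python
# Ids and 2005 population figures as quoted in the comment.
correct_id = [86, 38, 177, 126, 113, 69, 183]
correct_pop = [58093000, 45600000, 24573000, 13957000,
               13518000, 22113000, 11658000]


def apply_corrections(records, ids, pops, id_key="id", pop_key="POP2005"):
    """Return copies of the records with the population replaced
    wherever the record id appears in `ids`."""
    fixes = dict(zip(ids, pops))
    return [
        {**rec, pop_key: fixes[rec[id_key]]} if rec[id_key] in fixes else dict(rec)
        for rec in records
    ]


# Hypothetical records: one that needs fixing, one that does not.
records = [
    {"id": 86, "POP2005": 5809300},
    {"id": 1, "POP2005": 1000},
]
fixed = apply_corrections(records, correct_id, correct_pop)
```

The original records are left unchanged, so the corrected and uncorrected values can be compared before saving anything.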

Unknown said...

How up-to-date are the political boundaries? Circa 2008, or earlier?

Thanks,

Sara

Knut said...

This is great work indeed, Bjørn.

Could you explain how you went about the simplification to get it from 3.3 MB to 226 KB?

Bjørn Sandvik said...

Hi Knut,

The simplification process is described in this document (PDF).

Bjorn

Knut said...

Very helpful - thanks a lot!

Sherwin said...

Hi Bjorn!

Thanks a lot for this! It's very helpful to my project for my client. :)

I have a problem though, can you divide Sudan to "Sudan" and "South Sudan"? I need South Sudan in the list of countries.

http://en.wikipedia.org/wiki/Southern_Sudan

Can you also add Alderney or at least combine it with Guernsey to form Guernsey and Alderney?

http://en.wikipedia.org/wiki/Alderney

I'm using the simple one by the way. :)

Thank you very much!

Bjorn said...

This dataset is no longer maintained. You'll find the same data on www.naturalearthdata.com.

Bjorn

Anonymous said...

Hi! I used your map, but I noticed that "Occupied Palestinian Territory" is missing from the ".dbf" file.
Can you fix this problem?
I apologize for my bad English.

Thanks

Anonymous said...

The world borders dataset contains an invalid character in this record:
141;NULL;AX;ALA;248;[invisible invalid character]land Island;0;0;150;154;19.952;60.198
This bug prevents importing the shapefile as UTF-8 into PostGIS. Moreover, some coordinates seem to be out of extent:
ERROR: Coordinate values are out of range [-180 -90, 180 90] for GEOGRAPHY type
LINE 1: ...','Antarctica','0','0','0','0','21.304','-80.446','010600002...
All the best,
Jan
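
A small Python sketch for locating such coordinates before an import, assuming the vertices are available as (lon, lat) pairs. The function names and sample values are hypothetical, and clamping is only one possible workaround; it has not been verified against this dataset.

```python
def out_of_range(points):
    """Return the (lon, lat) pairs outside [-180, 180] x [-90, 90]."""
    return [
        (lon, lat) for lon, lat in points
        if not (-180 <= lon <= 180 and -90 <= lat <= 90)
    ]


def clamp(points):
    """Pull every coordinate back into the valid geographic range."""
    return [
        (max(-180.0, min(180.0, lon)), max(-90.0, min(90.0, lat)))
        for lon, lat in points
    ]


# Hypothetical vertices: the second one overshoots the valid range slightly.
vertices = [(21.304, -80.446), (180.0000001, -90.0000002)]
bad = out_of_range(vertices)
cleaned = clamp(vertices)
```

Listing the offending vertices first shows whether the overshoot is a tiny rounding artefact or a larger error that clamping would hide.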

Morten Myksvoll said...

Thank you for a great map! If I may, I'd like to ask for a version where Kosovo is separated from Serbia, as 108 UN-countries, including my own, recognize its independence.

Also, it is relevant for my use of the map, as my data separates the two entities.

South Sudan is also a country that I'd like to see in this map, as it is a member state of the UN.

Bjørn Sandvik said...

Morten: the dataset is no longer maintained. You'll find the same data on naturalearthdata.com