Tuesday, May 10, 2011

Time-animated Thematic Maps w/ US Census Data


While looking for cool data to visualize for my Google I/O talk (High Performance KML for Google Earth and Maps), I found a few datasets from the US Census and American Community Survey that were pretty interesting.

(Download this KML at the bottom of the page... read the warning ;P)


Datasets


Most of this data was pretty messy, came in a number of different formats (Excel, CSV, raw text) and required a lot of scrubbing. I used Google Refine, Excel, Open Office Calc and good ol' Notepad++ to massage it into a usable format (CSV).

I then found state and county polygon data on Google Fusion Tables. By maintaining the FIPS codes for all the states and counties in the Census/ACS data, merging with the geometries in Fusion Tables was pretty easy. I then downloaded a master CSV file from Fusion Tables that had a row for each state/county, and columns for each of the time-series values.
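For anyone who wants to do the same kind of join outside Fusion Tables, here is a minimal Python sketch of merging on FIPS codes. It is not the workflow used for this map (that merge happened inside Fusion Tables). The column names (`fips`, `geometry`, `pop_1900`, `pop_2010`) and the sample values are made up for illustration. The one real gotcha it shows is that spreadsheet tools tend to strip leading zeros from FIPS codes, so they are padded back before matching.

```python
import csv
import io

def normalize_fips(code, width=5):
    """Restore leading zeros that spreadsheet tools tend to strip."""
    return str(code).strip().zfill(width)

def join_on_fips(data_rows, geometry_rows, width=5):
    """Attach a geometry to every data row that has a matching FIPS code."""
    geometry_by_fips = {
        normalize_fips(row["fips"], width): row["geometry"]
        for row in geometry_rows
    }
    merged = []
    for row in data_rows:
        fips = normalize_fips(row["fips"], width)
        if fips in geometry_by_fips:
            merged.append({**row, "fips": fips, "geometry": geometry_by_fips[fips]})
    return merged

# Made-up sample data; the first data row has lost its leading zero.
data_csv = "fips,pop_1900,pop_2010\n6037,100,200\n06075,300,400\n"
geom_csv = "fips,geometry\n06037,POLY_A\n06075,POLY_B\n"

data_rows = list(csv.DictReader(io.StringIO(data_csv)))
geometry_rows = list(csv.DictReader(io.StringIO(geom_csv)))
merged = join_on_fips(data_rows, geometry_rows)
```

Use `width=2` for state-level codes and `width=5` for county-level codes.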

Here is the Fusion Table of all the raw data I used to make the map: raw data

To generate a KML from this CSV file I used the Python-based pyKML library, developed by Tyler Erickson. It is based on lxml's Objectify API, and supports schema validation against the KML XSDs. I’ll try to share some code snippets once I clean the code up... right now it’s a bunch of spaghetti :)
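Until then, here is a rough sketch of the general idea of turning a row into a KML polygon. To keep it dependency-free it uses Python's standard `xml.etree.ElementTree` rather than pyKML, so it is not the code behind this map. The function names, the placemark id and the square ring are all hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
# Serialize KML elements without a prefix (xmlns="...") instead of ns0:...
ET.register_namespace("", KML_NS)

def q(tag):
    """Qualify a tag name with the KML namespace."""
    return f"{{{KML_NS}}}{tag}"

def build_placemark(name, ring, placemark_id):
    """Build a <Placemark> holding one polygon; ring is a list of (lon, lat)."""
    placemark = ET.Element(q("Placemark"), {"id": placemark_id})
    ET.SubElement(placemark, q("name")).text = name
    polygon = ET.SubElement(placemark, q("Polygon"))
    outer = ET.SubElement(polygon, q("outerBoundaryIs"))
    linear_ring = ET.SubElement(outer, q("LinearRing"))
    # KML coordinates are lon,lat,alt tuples separated by whitespace.
    ET.SubElement(linear_ring, q("coordinates")).text = " ".join(
        f"{lon},{lat},0" for lon, lat in ring
    )
    return placemark

def build_kml(placemarks):
    """Wrap placemarks in <kml><Document> and return the XML as a string."""
    kml = ET.Element(q("kml"))
    document = ET.SubElement(kml, q("Document"))
    document.extend(placemarks)
    return ET.tostring(kml, encoding="unicode")

# Hypothetical closed ring (first and last vertex are the same).
ring = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
kml_text = build_kml([build_placemark("Example County", ring, "fips-00000")])
```

In a real run you would call `build_placemark` once per CSV row, using the FIPS code as the id so other parts of the file can target it.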

I also used the Google Charts API Wizard to dynamically generate graphs for each state/county during the animations (you must start the tour, pause or wait until the end and then click on a polygon to see the chart). This was made possible by a lot of <BalloonStyle> + <ExtendedData> hackery (see this tutorial).
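The chart-in-a-balloon trick boils down to storing a chart URL in `<ExtendedData>` and referencing it from a shared `<BalloonStyle>` with the `$[...]` substitution syntax. Here is a hedged sketch of that pattern, not the exact markup used in this KML. The `chart` data name, the style id, the sample values and the chart size are my own assumptions, and the Image Charts endpoint shown has since been deprecated by Google, so treat the URL as illustrative.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

def chart_url(values, title, size="300x150"):
    """Build a line-chart URL in the (deprecated) Google Image Charts format."""
    params = {
        "cht": "lc",                                   # chart type: line chart
        "chs": size,                                   # size in pixels, WxH
        "chd": "t:" + ",".join(str(v) for v in values),  # text-encoded data
        "chds": f"{min(values)},{max(values)}",        # scale data to its range
        "chtt": title,                                 # chart title
    }
    return "https://chart.googleapis.com/chart?" + urlencode(params)

def extended_data(url):
    """Store the chart URL on a placemark under the hypothetical name 'chart'."""
    return (
        '<ExtendedData><Data name="chart">'
        f"<value>{escape(url)}</value>"
        "</Data></ExtendedData>"
    )

# Shared style: $[name] is the placemark name, $[chart] pulls the value
# of the <Data name="chart"> element from that placemark's ExtendedData.
BALLOON_STYLE = (
    '<Style id="chart-balloon"><BalloonStyle><text><![CDATA['
    '<b>$[name]</b><br/><img src="$[chart]"/>'
    ']]></text></BalloonStyle></Style>'
)

# Made-up values, purely to show the shape of the output.
url = chart_url([1, 2, 3], "Example series")
data_block = extended_data(url)
```

Keeping the balloon template in one shared style means each polygon only carries its own URL, which helps keep a file with thousands of placemarks smaller.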

The final product is almost 650K lines of KML including 200K+ polygon vertices and point coordinates!

Ultimately, we need to add a feature to KML that allows you to associate altitudes with timestamps, like the way <gx:Track> works. But in the meantime, this will do :)




U.S. Population by State


California Population by Counties


Percent of children lacking health care in 1987 by State


Chart API for children without health insurance in 1994 in Texas


Percent of homes without complete indoor plumbing 1940



Warning: This is a massive KML (3MB compressed, 33MB uncompressed). The "Population (1900-2010) (County)" Tour is particularly cumbersome for laptops/older computers, because there are 3000+ animated county polygons. Click on the tour in the left-hand panel to view the data.

Here are some of the KML tags I used to make this:
  • <ExtendedData> & <BalloonStyle> (tutorial)
  • <gx:Tour> & <gx:AnimatedUpdate> (docs)
  • <Region> (docs & tutorial)
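
To give a feel for how <gx:Tour> and <gx:AnimatedUpdate> fit together, here is a small Python sketch that writes a tour stepping one polygon style through a sequence of colors. It is only an illustration of the tag structure: this post does not show which properties were actually animated, and the tour name, the `poly-style-06` target id and the colors are hypothetical. KML colors are written as aabbggrr hex.

```python
KML_NS = "http://www.opengis.net/kml/2.2"
GX_NS = "http://www.google.com/kml/ext/2.2"

def animated_update(target_id, color, duration):
    """One tour step: change the color of an existing <PolyStyle id=...>."""
    return (
        "<gx:AnimatedUpdate>"
        f"<gx:duration>{duration}</gx:duration>"
        # An empty targetHref means "a feature in this same file".
        "<Update><targetHref/><Change>"
        f'<PolyStyle targetId="{target_id}"><color>{color}</color></PolyStyle>'
        "</Change></Update>"
        "</gx:AnimatedUpdate>"
    )

def wait(duration):
    """Hold the tour so the preceding update has time to play out."""
    return f"<gx:Wait><gx:duration>{duration}</gx:duration></gx:Wait>"

def build_tour(name, target_id, colors, step_seconds=2.0):
    """Return a KML document containing one tour with a step per color."""
    steps = []
    for color in colors:
        steps.append(animated_update(target_id, color, step_seconds))
        steps.append(wait(step_seconds))
    return (
        f'<kml xmlns="{KML_NS}" xmlns:gx="{GX_NS}">'
        f"<gx:Tour><name>{name}</name><gx:Playlist>"
        + "".join(steps)
        + "</gx:Playlist></gx:Tour></kml>"
    )

# Hypothetical target id and colors (opaque red, then opaque yellow).
tour_kml = build_tour("Hypothetical tour", "poly-style-06", ["ff0000ff", "ff00ffff"])
```

The <gx:Wait> after each update matters because, without it, the playlist moves on immediately and the updates overlap instead of playing one after another.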
