Yours and Mine: the NYC Data Mine

Thursday was a happy day for me. I was quite proud to learn yesterday that NYC has finally shown a tangible commitment to the “open government” movement.

On 8 October 2009, NYC published a collection of open datasets in various machine-readable formats, including RSS feeds, spreadsheets, and more. These datasets are available at the NYC Data Mine. The NYC Data Mine is presently divided into two general types of datasets: the Geo Data Catalog, which offers “administrative and political boundaries, facilities and structures, and various imagery and base maps”; and the Raw Data Catalog, which offers all sorts of other types of data in the form of spreadsheets, RSS feeds, and various XML document formats.
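To give a sense of what “machine-readable” buys you, here is a minimal sketch of pulling the entries out of an RSS feed using nothing but Python’s standard library. I’m assuming a plain RSS 2.0 layout; the sample feed, its titles and links, and the function name `parse_rss_items` are all made up for illustration and are not taken from the Data Mine itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a tiny RSS 2.0 document standing in for a real feed.
# The titles and links below are invented for illustration only.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example city feed</title>
    <item>
      <title>First entry</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return a list of {'title', 'link'} dicts, one per <item> in the feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the child element is missing
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_RSS):
        print(entry["title"], "->", entry["link"])
```

A real feed would be fetched over HTTP first and may well carry extra fields or namespaces, so treat this as a starting point rather than a finished reader.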

Having browsed what’s presently published in the NYC Data Mine, I must admit that, in its present state, I find the breadth of the offered data lacking. If this is the final state of things, it’d be lame compared to, for example, the data that the state of Utah has published.

That said, I’m willing to give NYC the benefit of the doubt here. Every effort has to start somewhere.

Moving forward, however, I’d still like to see the following:

  • A complete itemization of the City’s expenditures, down to the dime, including staff and office-holder payrolls.

  • NYC public school data, from student performance metrics to faculty information and budgetary expenditures to nutritional reports outlining what foods are served (and the serving volume) by each school.

  • Geo data showing property and business taxes collected by the City, perhaps down to the block level (I can anticipate concerns over privacy issues arising at any greater granularity).

  • Public health care data, including the frequency of reported ailments, injuries, etc. at each hospital, school, and other institution.

Note that, in all cases, data collection should err on the side of preserving anonymity, whenever there is reasonable concern that the data can be traced back to specific private citizens (especially with respect to specific individuals’ health and educational situations).

But the announcement of NYC’s Data Mine is only part of the story.

The City also launched NYC BigApps, a competition intended to raise awareness of these new open datasets, and to promote their use in creating new tools to serve New York City residents, businesses, and visitors.

From Mayor Bloomberg’s introductory post on the competition site’s blog:

NYC BigApps provides a competitive outlet for developers and encourages the general public to get involved as well. We welcome public comment on the process — indicate your support for the competition, share app ideas, and inform contestants on what type of app you’d like to see.

Ultimately it’s great to finally see NYC — my city — step up to the plate on the open government scene. There’s yet a long (long) way to go, but yesterday’s announcements do give me a glimmer of hope.

So — anyone up for a hackathon weekend… or three?