Going Places – Great news from Adrian Holovaty

In the first installment of this mini-series I announced that dpa-infocom is geocoding its regional newswires, and that in forthcoming posts I would report on the rationale for doing so, on the solution we ended up using, and on our roadmap.

One of the main reasons for this mini-series is to get a discussion started about the semantics of geocoding news stories, and to work towards standardising both the syntax and the semantics.

Three days later Adrian Holovaty announced everyblock.com. In the Poynter interview following the launch, Adrian was asked:

Tompkins: How do you hope newsrooms will adapt your ideas and even your code to their own work?

Holovaty: We’re interested in spreading the concept of “geocoding” news — that is, classifying news articles by location. Currently, we do that by crawling news sites and applying algorithms and human editing efforts, but it’d be best for everybody if news organizations did this on their own. We’re interested in developing some sort of specification/standard for designating granular locations in news stories (Emphasized by me) — look for more about that from us soon.

This is great news! Obviously I agree that news organizations should geocode their news stories on their own; otherwise we wouldn’t be doing it. And we are very interested in working towards a spec/standard. That is why I’m evangelizing the need for geocoding news not only within dpa (one of the world’s largest news agencies) but also presenting our approach whenever I meet other news agencies.

For example, last September I presented our approach to geocoding news at the inaugural meeting of MINDS International, an association of currently 11 news agencies that focuses on the exchange of ideas and solutions between agencies in the online and mobile area.

The key: “Designating granular locations”

Not very surprisingly, Adrian hit the nail on the head by stating that a spec/standard for designating granular locations is what is needed most in order to geocode news stories.

Actually, realizing that granular location representations are key to geocoding news stories was one of the key learnings on our own road to geocoding news. Other key learnings were:

  1. It is essential to distinguish between locations of news stories and locations in news stories.
  2. Locations of news stories are at least as important as locations in news stories (at least for news agencies).
  3. The most important type of location of a news story is the scope of the story, i.e. its geographic area of relevance.
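
To make the distinction in the list above a bit more concrete, here is a minimal sketch in Python. It is not the dpa-infocom data model (that is the subject of the coming posts); the class names, field names and the sample story are all invented for illustration. The only point is that the scope (the location of a story) is kept apart from the places mentioned in its text (the locations in a story).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Location:
    """A named place; the 'kind' values are illustrative, not a real vocabulary."""
    name: str
    kind: str  # e.g. "city", "borough", "precinct"


@dataclass
class GeocodedStory:
    """Hypothetical container separating the two roles a location can play."""
    headline: str
    scope: Location  # location OF the story: its geographic area of relevance
    mentioned: List[Location] = field(default_factory=list)  # locations IN the story


def relevant_for(story: GeocodedStory, area_name: str) -> bool:
    """A story is relevant for an area if the area is its scope,
    no matter which other places the text mentions."""
    return story.scope.name == area_name


# Invented example: a regional story that mentions a faraway city.
story = GeocodedStory(
    headline="Local shipping company opens an office abroad",
    scope=Location("Hamburg", "city"),
    mentioned=[Location("Hamburg", "city"), Location("Shanghai", "city")],
)
```

With this split, a regional service for Hamburg would pick up the sample story, while one for Shanghai would not, even though Shanghai appears in the text.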

In the next couple of posts I’m going to elaborate on these key learnings and present our approach.

Hopefully this and the following posts will get a discussion as well as a joint effort for a common spec started.

EveryBlock launched – first impressions

I’ve just read on Adrian Holovaty’s blog that EveryBlock has launched. (You can read about the idea behind everyblock in the launch entry of the everyblock blog.) I had the chance to meet Adrian last August and to discuss with him some of my ideas and concerns related to geocoding news.

It’s great that his vision is now tangible for the cities of Chicago, New York and San Francisco.

Here are my first impressions:

Not very surprisingly, Adrian and his team “get it right”: all kinds of data accessible at the right URLs, classification into multiple administrative hierarchies, navigation along these hierarchies, graphically reduced maps, …

The only thing I’m wondering about is the lack of any feeds, especially given that Django’s feed framework makes generating feeds easy.

But maybe they are at the same point I’m at right now: “What is the best way to include the classification into the hierarchies, and the other metadata, in the feed?”

No, I’m not talking about the coordinates and geometries; you could use GeoRSS for that. I’m talking about adding the metadata that says a news item is relevant for a certain borough, precinct etc., and providing enough information for consumers of the feeds to build their own (re-)presentations. More on how we handle this at dpa-infocom in a post coming soon to the Going Places mini-series I just started.
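
To illustrate the question, here is one conceivable shape for such a feed entry, sketched in Python with the standard library. This is neither what everyblock does nor the dpa-infocom solution. It assumes an Atom entry, uses a GeoRSS-Simple point for the coordinates, and (this is the purely hypothetical part) expresses membership in each hierarchy as an Atom category whose scheme URI names the hierarchy. The example.org scheme URIs, the terms and the helper name build_entry are made up.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
GEORSS = "http://www.georss.org/georss"  # GeoRSS-Simple namespace
ET.register_namespace("atom", ATOM)
ET.register_namespace("georss", GEORSS)


def build_entry(title, lat, lon, areas):
    """Build a partial Atom entry (id and updated are omitted for brevity).

    areas is a list of (scheme, term, label) tuples, one per hierarchy
    level the story is relevant for. The scheme URIs are assumptions.
    """
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title

    # Coordinates: GeoRSS-Simple encodes a point as "latitude longitude".
    ET.SubElement(entry, f"{{{GEORSS}}}point").text = f"{lat} {lon}"

    # Area of relevance: one category element per hierarchy level.
    for scheme, term, label in areas:
        ET.SubElement(
            entry,
            f"{{{ATOM}}}category",
            {"scheme": scheme, "term": term, "label": label},
        )
    return entry


entry = build_entry(
    "Street festival announced",
    53.55,
    9.99,
    [
        ("http://example.org/geo/borough", "altona", "Altona"),
        ("http://example.org/geo/precinct", "p-21", "Precinct 21"),
    ],
)
xml_text = ET.tostring(entry, encoding="unicode")
```

A consumer could then filter entries by scheme and term without touching any geometry. What such a sketch leaves open is exactly the hard part: who defines the schemes and terms so that different publishers mean the same thing by them.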

Presumably everyblock uses Django as its backend. As a long-term follower of Christopher Schmidt’s great work, I find it especially good to see that everyblock.com is using OpenLayers and TileCache for its mapping components.

I’m wondering whether they are using the django-gis branch, and I would be eager to learn which GIS backend they are using for generating the tiles. Judging by the TileCache URLs, they are using a WMS-compatible one. Right now I’m wondering whether I should use GeoServer or MapServer for my own experiments, which I’m going to start very soon. Any recommendations?
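
For readers who have not seen one: a WMS-compatible backend is queried with a GetMap request, and a tile is just such a request for a small bounding box. The sketch below builds a WMS 1.1.1 GetMap URL in Python. The parameter names come from the WMS specification; the server URL, the layer name and the bounding box are placeholders and say nothing about everyblock’s actual setup.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width=256, height=256,
                   srs="EPSG:4326", image_format="image/png"):
    """Return a WMS 1.1.1 GetMap URL for one map image.

    bbox is (minx, miny, maxx, maxy) in the units of the given SRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder server and layer; the bounding box is an arbitrary example.
url = wms_getmap_url(
    "http://example.org/wms",
    ["neighborhoods"],
    (-87.7, 41.8, -87.6, 41.9),
)
```

Both GeoServer and MapServer answer requests of this form, so a tile cache sitting in front of them should not care which of the two is doing the rendering.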

Since I know how hard it is to get neighborhood and other geometries (it is even harder in Europe than in the States; in the end I had to buy them from a commercial provider for the German cities), I wonder how much Zillow’s move to make their neighborhoods available as shapefiles eases the task of creating everyblock for more cities.