Some Thoughts on News Registries

Editor's note: This post has been sitting around unpublished since July 24th. I just discovered that I forgot to push the publish button. Since I still think it is relevant, I'm going to push that button now.

Yesterday the Associated Press made its move and announced its plans for a news registry based on the hNews format and some kind of beacon / tracking device. The day before, fairsyndication.org announced that Conde Nast, Scripps, Gawker, Hearst Newspaper Group, McClatchy (NYSE: MNI) and Newsweek have joined the consortium, and that the first ad network, AdBrite, has joined as well.

So to me it looks like an arms race between the two right now (in the US market). I don't have time at the moment to explain the details and the differences between the two, but I wanted to add some more general remarks on the topic:

IMHO news registries are a very good idea, provided that they are open to the public, so that everybody can see what is out there and what rights are attached to the content. Preferably they provide a public web frontend as well as an API.

The best existing news registries I know about in this respect are the NYT API and the Guardian Open Platform.
Their disadvantage is that they are single-source only and do not describe the attached rights in a machine-readable format. hNews, the microformat proposed by the Media Standards Trust and the AP, seems to be a pragmatic way to do that (much better than ACAP).

Hence IMHO a centralized NewsRegistry:

Would ideally provide a public web frontend (as well as an API) to an index of the registered content
- containing at least URL, publisher and headline,
- preferably some metadata and content as decided on by the publisher,
- ideally near-real-time tracking data like you get with the URL shorteners (needs cooperation of the original content owner as well as the licensees).
Would provide some way to (automatically) license/use content. This could best be done by a multi-tiered API.
- Its free level should at least include the NYT API and Guardian Open Platform models: content excerpt with link to the original source (NYT) and full content with ads (Guardian), as well as full content for CC-licensed content.
- The non-free API levels would basically be different levels of rate limits on full content. Ideally there should be a price model that is as simple / complex as the current iTunes model, but not more complex. (It presumably gets more difficult if resellers are allowed to license from the registry.) Revenue sharing has to be considered as an alternative.
- The licensees are contractually mandated to provide statistics on their use of the content (this is presumably what AP's tracking beacon is about).
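
To make the wish list above a bit more concrete, here is a minimal Python sketch of what such a registry could look like. It is only an illustration under assumptions: all names (`RegistryEntry`, `FreeModel`, `PaidAccess`), the tier names, the daily limits and the 200-character excerpt length are made up for this example and are not taken from the AP, fairsyndication, NYT or Guardian.

```python
from dataclasses import dataclass, field
from enum import Enum


class FreeModel(Enum):
    """What the publisher allows on the free API level (hypothetical names)."""
    EXCERPT_WITH_LINK = "excerpt"    # NYT-style: excerpt plus link to the source
    FULL_WITH_AD = "full_with_ad"    # Guardian-style: full content plus an ad
    FULL_CC = "full_cc"              # CC-licensed: full content, no ad


@dataclass
class RegistryEntry:
    url: str
    publisher: str
    headline: str
    body: str
    free_model: FreeModel
    metadata: dict = field(default_factory=dict)  # optional, publisher-defined


# Assumed paid tiers: tier name -> full-content requests allowed per day.
PAID_TIER_DAILY_LIMITS = {"basic": 100, "pro": 10_000}
EXCERPT_CHARS = 200  # assumed excerpt length


def index_record(entry):
    """Public index view: at least URL, publisher and headline."""
    return {
        "url": entry.url,
        "publisher": entry.publisher,
        "headline": entry.headline,
        "metadata": dict(entry.metadata),
    }


def free_response(entry, ad_html="<!-- ad slot -->"):
    """Free level: what is returned depends on the publisher's chosen model."""
    record = index_record(entry)
    if entry.free_model is FreeModel.EXCERPT_WITH_LINK:
        record["content"] = entry.body[:EXCERPT_CHARS]
    else:
        record["content"] = entry.body
        if entry.free_model is FreeModel.FULL_WITH_AD:
            record["ad"] = ad_html
    return record


class PaidAccess:
    """Non-free level: full content, rate-limited per tier, with a usage log."""

    def __init__(self, tier):
        self.limit = PAID_TIER_DAILY_LIMITS[tier]
        self.used = 0
        self.usage_log = []  # stands in for the statistics licensees report

    def full_content(self, entry):
        if self.used >= self.limit:
            raise RuntimeError("daily rate limit exceeded for this tier")
        self.used += 1
        self.usage_log.append(entry.url)
        return {**index_record(entry), "content": entry.body}
```

The point of the sketch is that the index (URL, publisher, headline) is always public, while the amount of content handed out depends on the publisher's choice for the free level and on the licensee's tier for the paid levels. Pricing, revenue sharing and resellers are left out on purpose.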

AFAIK the fairsyndication consortium doesn't plan to open up its registry to the general public. I'm not sure what the AP is planning, but I don't think they will open the index up to the general public :-(

But what about other contenders? Especially the one?

The (technologically) best positioned news registry right now IMHO is Google News.
They already have the content of 25,000 sources, and they have proven that they can scale.
With the integration of Creative Commons filtering into Google Image Search, they have shown that they can filter on rights, etc.
A recent paper (this year's W3C paper) shows that they can track the origin of texts at web scale.
They already have a quite successful tracking system that can be used. It's called Google Analytics.

So here is a (maybe not so) crazy thought experiment:

What would happen if Google funded the operation of a NewsRegistry similar to the BookRightsRegistry?

It would be run as part of a non-profit organisation.
Google would provide licenses to its technology to the NewsRegistry.
Google News would be the first customer for full-text content (for the content that Google deems relevant to include in full on their site).

Your comments please.