I have already credited the BBC a couple of times for opening up their content via RSS feeds and APIs on Backstage.

This encourages and enables users to come up with innovative ideas for how to use that content.

One idea I especially like is BBCTouch, which uses the BBC Homepage RSS feed and information about the most read stories (also provided by the BBC) to determine how closely the two collections are synchronised, in near real time.
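To make "degree of synchronisation" a bit more concrete, here is a minimal sketch of one way to measure it. I don't know how BBCTouch actually computes its figure, so treat this as my own assumption: the function name `overlap` and both measures (the share of most read stories that are on the homepage, and the Jaccard similarity of the two sets) are just illustrations, and the inputs are assumed to be plain lists of story links.

```python
def overlap(homepage_links, most_read_links):
    """Compare two collections of story links.

    Returns a tuple (coverage, jaccard):
      coverage: share of most read stories that also appear on the homepage
      jaccard:  size of the intersection divided by size of the union
    Both measures are illustrative assumptions, not BBCTouch's method.
    """
    homepage = set(homepage_links)
    most_read = set(most_read_links)
    if not homepage or not most_read:
        # Nothing to compare, so report no overlap at all.
        return 0.0, 0.0
    shared = homepage & most_read
    coverage = len(shared) / len(most_read)
    jaccard = len(shared) / len(homepage | most_read)
    return coverage, jaccard
```

Fed with the links from the homepage feed and the most read list at regular intervals, the two numbers could be plotted over time to show when editorial choice and reader interest drift apart.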

I think that this kind of tool should be available in every newsroom (maybe with an even better visualisation like treemaps ;-)
Don’t get me wrong. It is not about following the masses. It is about making informed decisions and having a basis for defining something that goes into a balanced scorecard for news operations.

Via BBCTouch I also discovered the Yahoo term extraction API, one of the lesser known APIs of Yahoo. At least I didn’t know that it was there, although I signed up as a Yahoo developer quite some time ago. I couldn’t resist test-driving it immediately yesterday evening:

  • It took me 2 minutes to get a new API key
  • Thanks to the availability of a JSON output format and the included batteries of Python, only a few minutes later I was test-driving it with sample content from the English and German newswires of dpa (available to me because of my current employment situation :-)
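The test drive described above can be sketched roughly like this, using only the Python standard library. This is not the script I actually ran. The endpoint is deliberately left as a parameter, and the parameter names (`appid`, `context`, `output`) as well as the response shape (`ResultSet` containing a `Result` list) are assumptions that should be checked against Yahoo's documentation. The function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_request(endpoint, app_id, text):
    """Build a POST request; the parameter names are assumptions."""
    params = urlencode({
        "appid": app_id,    # the API key from the sign-up step
        "context": text,    # the text to extract terms from
        "output": "json",   # ask for JSON instead of XML
    }).encode("utf-8")
    # Passing data makes urllib send the request as a POST.
    return Request(endpoint, data=params)


def parse_terms(raw_json):
    """Pull the term list out of a response (assumed shape)."""
    payload = json.loads(raw_json)
    result = payload.get("ResultSet", {}).get("Result", [])
    # Be defensive in case a single term comes back as a bare string.
    if isinstance(result, str):
        return [result]
    return list(result)


def extract_terms(endpoint, app_id, text):
    """Send the text to the service and return the extracted terms."""
    request = build_request(endpoint, app_id, text)
    with urlopen(request, timeout=10) as response:
        return parse_terms(response.read().decode("utf-8"))
```

Keeping the request building and the response parsing in separate functions makes it easy to try the parsing on saved responses without hitting the service each time.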

I was sceptical, especially with regard to term extraction for German texts, but was more or less pleasantly surprised. At first glance it looks quite promising, at least as an input feed for specialized classifiers (e.g. for persons, places, organizations etc.).

I will definitely test it on a larger scale, but first I have to come up with an appropriate data model etc. that will be useful for subsequent steps like term classification, term clustering and the like.

If I find time during my day job I’ll try to compare this with dpa’s in-house term extraction and classification solution.

