On 20 May the Guardian [1] announced that its "Open Platform" [2], an API for accessing the Guardian's repository of over a million articles, video clips, photographs and audio tracks, was "open for business". With 36 million people making regular use of the repository, the Open Platform was created to service rapidly growing demand and to provide a high-performance interface for media application developers. More than 2,000 developers registered to use the API, and 200 applications were built on it during the beta evaluation phase. The API is implemented in Scala and uses Lucene/Solr [3] for media storage.
At Eurocon 2010 [5], Graham Tackley, the Web Platform Development Team Lead for guardian.co.uk, explained (slides [4]) how the team represents its complex 350-table relational database model in Solr for media storage, and how it uses Scala to meet demanding real-time requirements for content searching, indexing and updating. Using actors, for example, the team was able to reduce the search index build time from 20 hours to just one. Request patterns, he says, are hard to predict, so the ability to scale the services easily was essential. His presentation slides contain more detail on the system architecture, and you can find Graham's impressions of Scala on his blog [6].
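To give a rough feel for why concurrency helps the index build described above, here is a minimal sketch. It is not the Guardian's code: it uses standard-library Futures rather than actors so that it stays self-contained, and every name in it (ContentDoc, loadDoc, indexBatch, buildIndex) is hypothetical. The database load and the index write are stand-ins. The idea is simply to flatten each article and its related rows into one document, then index batches of documents concurrently.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical flattened document: related rows (here, tags) are folded
// into the article record instead of being joined at query time.
final case class ContentDoc(id: String, headline: String, tags: List[String])

object ParallelIndexBuild {
  // Stand-in for loading one article and its related rows from the database.
  def loadDoc(id: Int): ContentDoc =
    ContentDoc(s"article-$id", s"Headline $id", List("news", s"section-${id % 3}"))

  // Stand-in for sending a batch to the search index; returns the count indexed.
  def indexBatch(batch: Seq[ContentDoc]): Int = batch.size

  // Split the ids into batches and process the batches concurrently.
  def buildIndex(ids: Seq[Int], batchSize: Int): Int = {
    val batches = ids.grouped(batchSize).toList
    val work = batches.map(batch => Future(indexBatch(batch.map(loadDoc))))
    // Wait for every batch to finish, then total the documents indexed.
    Await.result(Future.sequence(work), 1.minute).sum
  }

  def main(args: Array[String]): Unit =
    println(buildIndex(1 to 1000, batchSize = 100))
}
```

In a real build the batch size and the degree of parallelism would need tuning against the database and the index, since either can become the bottleneck.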
Developers can create client-facing "MicroApps" using Google App Engine [7]. This gives them a way to create new, highly scalable applications that access the Guardian repository through the Open Platform API.
More information on the Open Platform can be found on the Guardian Open Platform blog [8] and in a slide presentation [9].
Links:
[1] http://www.guardian.co.uk/
[2] http://www.guardian.co.uk/open-platform/what-is-the-open-platform
[3] http://www.lucidimagination.com/About/Company-News/Guardian-News-and-Media-collaborates-Lucid-Imagination-Next-Generation-Open-Platf
[4] http://www.slideshare.net/openplatform/the-guardian-open-platform-content-api-implementation
[5] http://lucene-eurocon.org/sessions-track1-day1.html
[6] http://blog.tackley.net/
[7] http://googleappengine.blogspot.com/2010/05/microapps-from-guardian-and-google.html
[8] http://www.guardian.co.uk/open-platform/blog
[9] http://www.slideshare.net/openplatform/from-publisher-to-platform-how-the-guardian-used-content-search-and-open-source-to-build-a-powerful-new-business-model