UK Web Archive blog

Information from the team at the UK Web Archive, the Library's premier resource of archived UK websites

22 March 2013

APIs, data services, and being generous

Traditionally, the online presence of most galleries, archives, libraries and museums has concentrated on delivering access to individual items, directly to users, one by one. This is changing. As more items are either born digital or have excellent digital facsimiles, these organisations (sometimes collectively designated as GLAM) are beginning to offer data access and services in addition to simple direct use. This allows the communities we serve to build great things.

One of the most successful examples is the National Library of Australia's Trove database. Trove provides a rich API that allows independent developers such as Tim Sherratt (@wragge) to create all sorts of new interfaces for particular needs. Because these interfaces are fitted afresh to each community of users, they can come much nearer to what Dan Cohen (after Mitchell Whitelaw) has called "generous interfaces". Similarly, the British Library provides various free data services and The National Archives of the UK has started offering direct API access to its discovery systems.

Web archives have tended to focus on the playback of individual web pages, by means of the Wayback Machine, and this is what most users are used to. However, for many years now, that same playback infrastructure has also been used to develop other data about, and interfaces to, the content. The resulting APIs allow structured metadata about archival holdings to be retrieved programmatically, and in subsequent posts we'll explore how Wayback queries and the Memento protocol can be used to exploit web archives. (See our earlier post about our web-based use of Memento.)
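To give a flavour of what programmatic access looks like, here is a minimal Python sketch of the client side of a Memento-style exchange: asking for the copy of a page closest to a given date via the `Accept-Datetime` request header, and reading the `Link` header that comes back. The `TIMEGATE` address is a hypothetical placeholder rather than one of our real endpoints, the function names are invented for illustration, and the parser only copes with simple, quoted link parameters.

```python
import re
import urllib.request
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical TimeGate address (an assumption, not a real endpoint).
# Substitute whatever your chosen archive documents.
TIMEGATE = "https://archive.example.org/timegate/"

# Matches one <url>; key="value"; ... entry of a Link header.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def accept_datetime(when):
    """Format a timezone-aware datetime as an HTTP date in GMT."""
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)


def build_timegate_request(target_url, when):
    """Build (but do not send) a request for the copy nearest to `when`."""
    return urllib.request.Request(
        TIMEGATE + target_url,
        headers={"Accept-Datetime": accept_datetime(when)},
    )


def parse_links(header):
    """Split a Link header into dicts with url, rel list and datetime."""
    links = []
    for url, params in LINK_RE.findall(header):
        attrs = dict(PARAM_RE.findall(params))
        links.append({
            "url": url,
            "rel": attrs.get("rel", "").split(),
            "datetime": attrs.get("datetime"),
        })
    return links


def find_mementos(header):
    """Return only the links whose rel includes 'memento'."""
    return [link for link in parse_links(header) if "memento" in link["rel"]]
```

Sending the request with `urllib.request.urlopen(build_timegate_request(...))` and passing the response's `Link` header to `find_mementos` would then list the archived copies the server chose to advertise, each with the date it was captured.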

Alongside these online services, we've also been exploring the possibilities around making metadata datasets available for research and analysis. These are based on an archive of the UK web for 1996-2010, which was secured for the nation by the JISC and which we now look after. So far we've released an historical geo-index and a data format profile. We're also about to make further, even richer datasets available, based on the same archive and drawing on the experiences of the AADDA and Big Data projects. Watch this space for more news on these in future posts.

Andy Jackson, Web Archiving Technical Lead (British Library)
