UK Web Archive blog

Information from the team at the UK Web Archive, the Library's premier resource of archived UK websites


News and views from the British Library’s web archiving team and guests. Posts about the public UK Web Archive, and since April 2013, about web archiving as part of non-print legal deposit. Editor-in-chief: Jason Webber.

22 September 2016

Web Archiving Rio 2016 Olympic and Paralympic Games

‘For the Olympics, the whole world is captivated, turns on its television and supports their country’

The Olympic and Paralympic Games in Rio de Janeiro, Brazil may be over, but it will be some time before they are forgotten in the press and on social media. Web archives play a vital role in preserving the narratives that have come out of these Games. The Content Development Group (CDG) at the International Internet Preservation Consortium (IIPC) has been archiving both the Winter and Summer Games since 2010 and the Rio 2016 Collection will be available in October 2016.


Rio 2016 is the first time the CDG has archived events both on and off the playing field, making this its biggest collection so far in terms of the number of nominations and geographical coverage. The CDG also enlisted the help of subject experts as well as the general public to nominate sites from countries not usually covered in IIPC collections. As the IIPC only has members in around 33 countries, public nominations played an important role in filling this gap.

What’s involved?
But what’s involved in web archiving the Olympics? CDG members the British Library and the National Library of Scotland co-hosted a Twitter chat on 10th August 2016 to give an insight into the process. The Twitter chat was based on set questions published in an IIPC blog post, with a Q&A session and some time for live nominations. This was an international chat with participants from the USA, Ireland, England, Scotland, Serbia and even Australia. The chat was added to Storify as well as to the final archived collection of the Games. Even though the chat was small, it helped us to connect with a wider audience and increase the number of public nominations. You can follow updates on this project on Twitter by using the collection hashtag #Rio2016WA.

How can you get involved?
There is still time for you to get involved in web archiving the Olympics and Paralympics. The public nomination form will be open until 23rd September 2016. If you would like to make a nomination you can follow these guidelines. As Carly Lloyd stated above, the whole world is captivated by the Olympics; now is your opportunity to be part of it.

By Helena Byrne, Assistant Web Archivist, The British Library

15 September 2016

Commemorating the Battle of the Somme in the UK Web Archive

On 15 September 1916 the Battle of Flers-Courcelette (a phase of the greater Battle of the Somme) commenced. It is mostly famous for the introduction of the tank into battle (with mixed results). Less well known now is that it was the day that the Prime Minister's own son, Lt. Raymond Asquith, was killed when he went into action with his unit, the 3rd Grenadier Guards. It turned out to be the battalion's bloodiest single day of the war. Asquith's death is recorded in the battalion war diary that I transcribed while I was researching my own great-grandfather. This website is now saved as part of the UK Web Archive and will be available for future research even if the original goes offline.


Commemorating the Somme and the First World War
The UK Web Archive has been collecting websites about the First World War since 2014 and will continue to do so until at least 2019. So far we have 726 individual websites in the collection, 128 of which are available to view through the public website.

There is already a great range of websites in the collection. Many of them look at memorials linked to places (e.g. Crich parish roll of honour) or individual units (e.g. 36th Ulster Division). Others commemorate individual family members such as William Thomas Clarke.

The home front is not forgotten in projects such as 'A Year in the Life of Avon Dassett' or 'Sunderland in the First World war'.

We need your help!
We welcome any suggestions for making this collection as complete as possible. If you have a UK website that relates to the First World War (or know of one), please let us know through Twitter (@ukwebarchive) or our nomination form.

Online resources often last only a few years, and the UK Web Archive aims to keep copies of these First World War centenary websites in perpetuity. Help us keep these memories alive.

By Jason Webber, Web Archiving Engagement Manager, The British Library

14 September 2016

Surveying the Domain: Three Days with the Web Archiving Team

I’m Sara Day Thomson, researcher for the Digital Preservation Coalition. We’re a membership organisation that supports institutions, such as the BL, in ensuring long-term access to their digital content, no matter what that might be. To support my own professional development and satisfy my general curiosity, the Web Archiving team at the BL let me spend three days with them learning the ins and outs of archiving the Internet.


Web Archiving vs Digital Preservation?
What, you might ask, does web archiving have to do with digital preservation? I would answer: everything. Web archiving operates at the frontier of capturing and preserving our contemporary cultural and historical record. From the Information Highway to social networking sites, the Internet represents not only our cultural record but the inscription of an evolving technology. While tinkering with the web archiving ‘machine’, I got a first-hand look at the challenge this creates for archivists, who must keep pace with the development of the Web and how people use it.

In case you haven’t seen it, I’m the author of the recent Preserving Social Media Technology Watch report. Preserving Social Media presents these same issues as they are faced by organisations who want, or are required, to archive social media content. My three days with the BL team have given me a wider lens on the role of social media and on what it actually looks like to archive the wider Web.


Three days spent harvesting the Web with the BL team have solidified my view that web archiving is fundamentally an act of digital preservation. Just as with many ‘traditional’ digital media, such as PDFs or emails or mp4s, further action must be taken on web content in its native form in order to ensure its long-term accessibility. The need for further action for web content is urgent, even more so than for some other digital formats. During just my brief tenure, I came across more than one website that had disappeared since it was last harvested.

Challenges and rewards
Web content is complex—even discussing social media as a single category poses problems because different platforms function in different ways and are governed by varying Terms of Service. While social media has more recently become a dominant player, there’s a whole world of Web out there that isn’t ‘platformized’. Given this diversity—and the likelihood that technology will continue to dramatically alter how we dispense and consume information—web archivists are faced with the challenge of ensuring this content will be useable and comprehensible in the future. This challenge is at the centre of any digital preservation endeavour: it’s not just about saving the bits, or the code, but about preserving meaning.

The BL team are not alone in the effort to save the Web for future generations. While the team is relatively small (smaller than you’d think given the scale of the task), they work closely with their Legal Deposit partners, with curators within the BL, with curators outside the BL, and with researchers and other users. The creation of a meaningful record of our lives online requires the input of all of these specialists and is likely to be more successful through open collaboration.

The challenges—and rewards—of digital preservation are best shared, whether it’s for the preservation of digitised manuscripts from the Middle Ages or the emails of the prime minister or a national record of the World Wide Web.

By Sara Day Thomson, Digital Preservation Coalition