UK Web Archive blog

Information from the team at the UK Web Archive, the Library's premier resource of archived UK websites

Introduction

News and views from the British Library’s web archiving team and guests. Posts about the public UK Web Archive, and since April 2013, about web archiving as part of non-print legal deposit. Editor-in-chief: Jason Webber.

18 November 2016

Explore Your Archives Week at the UK Web Archive

The UK Web Archive is taking part in the annual Explore Your Archives week organised by The National Archives (TNA) and the Archives and Records Association (ARA). Each day of the week has its own hashtag for social media, and the UK Web Archive will be tweeting throughout the week using them. There is also a chance for you to join the conversation on Wednesday 23 November as we reflect on the work we have done in 2016.

How will the UK Web Archive Participate?

Saturday 19 November and Sunday 20 November
#ExploreArchives

This weekend we will be tweeting about the UK Web Archive’s aims and objectives as well as some FAQs that come up around copyright and preservation.

Monday 21 November 2016
#Archivepioneers

We will be tweeting about web archiving pioneers.

Tuesday 22 November 2016
#hairyarchives

We will try to uncover some of the most interesting hair-related pictures from our archive. Also, have you ever wondered how many times the words moustache and hipster appear online together? Keep an eye out for all hair-related tweets on Tuesday.

Wednesday 23 November 2016
#YearInArchives

2016 has been a very eventful year, both in politics and in the passing of so many celebrities. Let us know which moments were important to you.

Tune in for a live chat 1300-1400 (GMT) with the web archivists from the British Library and National Library of Scotland to find out the latest news on the 2016 collections.

The British Library:

Nicola Bingham – Lead Curator of Web Archives – @NicolaJBingham

Jason Webber – Engagement Manager – @UKWebArchive

Helena Byrne – Assistant Web Archivist – @HBee2015

The National Library of Scotland:

Eilidh MacGlone – Web Archivist – @dalmailing

Thursday 24 November 2016
#autoarchives

A key day for transport enthusiasts: keep an eye out for polls on different types of transport and pictures of some unusual forms of transport.

Friday 25 November 2016
#ArchiveAnimals

The crucial question of cats vs. dogs on the internet will finally be answered.

Saturday 26 and Sunday 27 November 2016
#ExploreArchives

To finish off the week we will have a few more fun facts about the UK Web Archive.

Get tweeting and don’t forget to use the designated hashtag for each day. If you know of any UK-based websites that cover these topics, why don’t you nominate them to the archive?

Nominate websites

More information on this event

22 September 2016

Web Archiving Rio 2016 Olympic and Paralympic Games

‘For the Olympics, the whole world is captivated, turns on its television and supports their country’

Introduction
The Olympic and Paralympic Games in Rio de Janeiro, Brazil may be over but it will be some time before they are forgotten about in the press and social media. Web archives play a vital role in preserving the narratives that have come out of these Games. The Content Development Group (CDG) at the International Internet Preservation Consortium (IIPC) has been archiving both the Winter and Summer Games since 2010 and the Rio 2016 Collection will be available in October 2016.

Rio 2016 is the first time the CDG has archived events both on and off the playing field, making this its biggest collection so far in terms of the number of nominations and geographical coverage. The CDG also enlisted the help of subject experts as well as the general public to nominate sites from countries not usually covered in IIPC collections. As the IIPC only has members in around 33 countries, public nominations played an important role in filling this void.

What’s involved?
But what’s involved in web archiving the Olympics? CDG members the British Library and the National Library of Scotland co-hosted a Twitter chat on 10th August 2016 to give an insight into the process. The Twitter chat was based on set questions published in an IIPC blog post, with a Q&A session and some time for live nominations. This was an international chat with participants from the USA, Ireland, England, Scotland, Serbia and even Australia. The chat was added to Storify as well as to the final archived collection of the Games. Even though the chat was small, it helped us to connect with a wider audience and increase the number of public nominations. You can follow updates on this project on Twitter by using the collection hashtag #Rio2016WA.

How can you get involved?
There is still time for you to get involved in web archiving the Olympics and Paralympics. The public nomination form will be open until 23rd September 2016. If you would like to make a nomination you can follow these guidelines. As Carly Lloyd stated above, the whole world is captivated by the Olympics; now is your opportunity to be part of it.

By Helena Byrne, Assistant Web Archivist, The British Library

15 September 2016

Commemorating the Battle of the Somme in the UK Web Archive

On 15 September 1916 the Battle of Flers-Courcelette (a phase of the greater Battle of the Somme) commenced. It is best known for the introduction of the tank into battle (with mixed results). Less well known now is that it was the day that the Prime Minister's own son, Lt. Raymond Asquith, was killed when he went into action with his unit, the 3rd Grenadier Guards. It turned out to be the battalion's bloodiest single day of the war. Asquith's death is recorded in the battalion war diary that I transcribed while I was researching my own great-grandfather. This website is now saved as part of the UK Web Archive and will be available for future research even if the original goes offline.

THE BATTLE OF THE SOMME, JULY-NOVEMBER 1916 © IWM (CO 802)

Commemorating the Somme and the First World War
The UK Web Archive has been collecting websites about the First World War since 2014 and will continue to do so until at least 2019. So far we have 726 individual websites in the collection, 128 of which are available to view through the public website.

There is already a great range of websites in the collection. Many of them look at memorials linked to places (e.g. Crich parish roll of honour) or individual units (e.g. 36th Ulster Division). Others commemorate individual family members such as William Thomas Clarke.

The home front is not forgotten in projects such as 'A Year in the Life of Avon Dassett' or 'Sunderland in the First World war'.

We need your help!
We welcome any suggestions for making this collection as complete as possible. If you have a UK website that relates to the First World War (or know of one), please let us know through Twitter (@ukwebarchive) or our nomination form.

Online resources often only last a few years and the UK Web Archive aims to keep copies of these First World War centenary websites in perpetuity. Help us keep these memories alive.

By Jason Webber, Web Archiving Engagement Manager, The British Library

14 September 2016

Surveying the Domain: Three Days with the Web Archiving Team

I’m Sara Day Thomson, researcher for the Digital Preservation Coalition. We’re a membership organisation that supports institutions, like the BL, in ensuring long-term access to their digital content, no matter what that might be. To support my own professional development and general curiosity, the Web Archiving team at the BL let me spend three days with them learning the ins and outs of archiving the Internet.

Web Archiving vs Digital Preservation?
What, you might ask, does web archiving have to do with digital preservation? I would answer: everything. Web Archiving operates at the frontier of capturing and preserving our contemporary cultural and historical record. From the Information Highway to social networking sites, the Internet represents not only our cultural record but the inscription of an evolving technology. While tinkering with the web archiving ‘machine’, I got a first-hand look at the challenge this creates for archivists, who must keep pace with the development of the Web and how people use it.

If you haven’t seen it, I’m the author of the recent Preserving Social Media Technology Watch report. Preserving Social Media presents these same issues faced by organisations who want—or are required—to archive social media content. My three days with the BL team have provided a wider lens to my understanding of the role of social media and what it actually looks like to archive the wider Web.

Three days spent harvesting the Web with the BL team have solidified my view that web archiving is fundamentally an act of digital preservation. Just like many ‘traditional’ digital media, such as PDFs or emails or mp4s, further action must be taken on web content in its native form in order to ensure its long-term accessibility. The need for further action for web content is urgent, even more so than for some other digital formats. During just my brief tenure, I came across more than one website that had disappeared since it was last harvested.

Challenges and rewards
Web content is complex—even discussing social media as a single category poses problems because different platforms function in different ways and are governed by varying Terms of Service. While social media has more recently become a dominant player, there’s a whole world of Web out there that isn’t ‘platformized’. Given this diversity—and the likelihood that technology will continue to dramatically alter how we dispense and consume information—web archivists are faced with the challenge of ensuring this content will be useable and comprehensible in the future. This challenge is at the centre of any digital preservation endeavour: it’s not just about saving the bits, or the code, but about preserving meaning.

The BL team are not alone in the effort to save the Web for future generations. While the team is relatively small (smaller than you’d think given the scale of the task), they work closely with their Legal Deposit partners, with curators within the BL, with curators outside the BL, and with researchers and other users. The creation of a meaningful record of our lives online requires the input of all of these specialists and is likely to be more successful through open collaboration.

The challenges—and rewards—of digital preservation are best shared, whether it’s for the preservation of digitised manuscripts from the Middle Ages or the emails of the prime minister or a national record of the World Wide Web.

By Sara Day Thomson, Digital Preservation Coalition

18 August 2016

Poetry Goes Online: Preserving poetry journals and zines for the Web archive

I have been working on a Special Collection for the UK Web Archive of UK-based online poetry journals and magazines. My own research at Goldsmiths, where I am completing the first year of a PhD, is concerned with contemporary poetic responses to the increasing ubiquity of the internet and networked culture. This project has been a fantastic opportunity to enrich my own understanding of digital poetry publishing in the UK and develop my research paradigm; I also hope my findings answer some questions regarding digital-only collection strategies for the Library’s on-going Non-Print Legal Deposit (NPLD) responsibilities. In this article I want to share some of my discoveries which will be included in the Special Collection.

The Next Generation of Poetry Journals
My research interests grew out of my experience with poetry communities which had emerged out of, and operated entirely within, digital spaces: participants used social media for networking, collaboration and promotion, taking advantage of cheap web hosting and free blog domains to publish zines and chapbooks. For a younger generation of digitally-native poets growing up in an era of cuts to arts funding (and perhaps less sentimental about print culture), the internet provides the easiest and cheapest method to publish, be read and interact with other poets. It also provides a space for groups often excluded or underrepresented in print publishing. The journal tender is an exceptional example of this latest generation of online journals; published quarterly as a highly-polished downloadable PDF file, it features a curated selection of original art, poetry, prose and interviews made exclusively by female-identified writers and artists.

Fig 1: tender

Making a Break from Print Culture
Unlike print publishing - with its propensities for the risk-averse and the commercial - the effectively free status of online publishing encourages greater formal and thematic experimentalism. For Every Year, for example, is publishing an original piece of prose, poetry or “something else” for every year since 1400 – they have already made it to the year 1821 and show no signs of stopping anytime soon. Other thematically adventurous publications in the collection include Visual Verse, a zine based entirely around ekphrastic writing; and PracCrit, a journal which publishes original poems juxtaposed with essayistic responses from other poets. Many of these publications are - like much online activity - international in outlook, with contributors hailing from around the globe. The lack of a clear geographical home for certain journals opens up a number of problems regarding NPLD scope, which is limited to the preservation of UK publications.

Fig 2: For Every Year
Fig 3: PracCrit

The Changing Digital Landscape
Examining the brief history of online poetry also charts a broader history of internet publishing trends, as the infrastructure of online spaces evolves with each successive technological shift. The simplistic text and image sites of “Web 1.0” have been replaced with increasingly sophisticated interfaces and professional graphic design as internet culture comes of age (Footballpoetry).

Footballpoetry in 2005
Footballpoetry in 2016

Elsewhere, journals like Conversation Poetry are published via Issuu, a skeuomorphic digital publishing platform which mimics the physical properties of a print publication. Conversely – and perhaps more interestingly – journals such as Proof in the UK and The Claudius App in the US foreground the aesthetics of their own digitality. Through the utilisation of multimedia platforms like Java and Flash, these journals aim to make the experience of reading itself consonant with the interactive, dynamic nature of computational technology.

These too present some of the greatest challenges for the Web Archive moving forward, since even advanced web crawlers have limitations when archiving plugins and streaming media content (although new advances in archiving technology show promise). As part of the broader born-digital genre of e-literature, these new experiments mark a break with traditional “bookbound” forms, and may offer a glimpse of the future of literary arts. Look out for the collection on the Web Archive in the next few months.

By Joe McCarney, PhD Placement Student, Goldsmiths, University of London

17 August 2016

Tender to Redevelop the UK Web Archive Website

The UK Web Archive (based at The British Library) is looking to appoint a superb User Experience (UX) company to help us improve our Web Archive service to the public and facilitate high quality academic research.

The project should be open source and bring an innovative and engaging interface to our web archive collections. The project should also integrate with our work on full-text search and trend analysis (see www.webarchive.org.uk/shine).

For a copy of the Invitation to Tender and how to respond please visit the BL eTendering Portal at the following link:

https://bl.bravosolution.co.uk; at the Home Page, under ‘Opportunities and notices’ please click on ‘View current opportunities and notices’.

If you wish to view/download the documents and are not registered on Bravo, please follow the instructions below (registration is free). If you are already registered, please go to step 2.

1. Register your company on the e-tendering portal (this is only required once).

Select the ‘Login or register to participate’ link above and click the ‘Click here to register’ link on the home page.

Accept the terms and conditions and click continue,
Enter your correct business and user details,
Note the username you chose and click Save when complete,
You will shortly receive an email with your unique password (please keep this secure).

2. Express an interest in the ITT

Log in to the portal with your username and password,
Click the ITT's Open To All Suppliers link,
Click on the relevant Tender,
Click the Express Interest button in the Actions box on the left-hand side of the page,
This will move the ITT into your My ITT's page (this is a secure area reserved for your projects only),
You can now access any attachments by clicking the Settings and Buyer Attachments in the Actions box.

3. Responding to the ITT

You can now choose to Reply or Reject (please give a reason if rejecting),
You must use the Messages function to communicate to the Library and seek any clarification,
Note the deadline for completion, then follow the onscreen instructions to complete your response to the ITT.

You must then publish your reply using the publish button in the Actions box on the left-hand side of the page.

Note: If you have any questions regarding the tender, please ask them through the portal.

By Jason Webber, Web Archiving Engagement Manager

27 June 2016

Capturing and Preserving the EU Referendum Debate (Brexit)

Following the announcement in May 2015 that there would be a referendum on the UK’s EU membership, the Legal Deposit UK Web Archive, led by curators at the Bodleian Libraries, started a collection of websites.

The team of curators includes contributors from the Bodleian Libraries, The British Library, the National Libraries of Scotland and Wales and also Queen’s University Belfast (for the Northern Ireland perspective) and the London School of Economics (for capturing and preserving individual documents, such as the PDF versions of campaigning leaflets).

The collection scope is to capture the ‘Brexit’ debate and the debate around the EU Referendum as well as the wider context of UK/EU relations, including:

  • media coverage
  • websites of political parties and other political institutions and groups
  • campaigning and lobbying
  • trade unions, professional organisations, businesses
  • academic debate
  • culture and arts
  • public opinion through blogs, comments and, if possible, social media.

We primarily archive UK websites under the Non-Print Legal Deposit mandate, but also decided to include some sites outside the UK, if relevant – e.g. websites of UK expats in Europe, or political parties, interest groups and think tanks in the EU and in EU member states – on a permission basis.

The collection (at the time of writing) has 2590 target websites. Some of these are whole websites; others will be a single news story or blog post.

Access and availability
The majority of the collection will be available in the reading rooms of UK Legal Deposit libraries, including both British Library sites. As is usual for web archive collections, there is a delay between collection and availability of up to a year.

By Svenja Kunze, Project Archivist, Bodleian Libraries (Oxford University)

17 May 2016

Saving BBC Recipes Website

There's been much coverage today of plans to remove the recipe pages from the BBC website.

The UK Web Archive has been collecting selected pages from the BBC, mainly news, for over ten years and since 2013 we have attempted to capture the entirety of the BBC web estate. A small number of pages are available on the Open UK Web Archive website. Most of the BBC's online presence, however, is only available in the reading rooms of UK Legal Deposit libraries, including both of the British Library sites at St. Pancras and Boston Spa in Yorkshire.

We have today instigated a further crawl of the BBC website with the specific aim of ensuring that we save the recipes from the food pages. We can also report that the Internet Archive, the Library of Alexandria and the National Library of Iceland have captured these pages, so their future is assured.

Polly Russell, British Library Curator and Food Historian, says:

"Cookery books, like cookery websites, obviously serve a practical purpose but that is not all. For historians, sociologists and anthropologists they also tell us about people's culinary aspirations and anxieties, cultural tastes and trends, dietary preoccupations, social expectations and economic conditions. They are, therefore, a rich source for researchers. So while it's sad news to hear about plans to close the much trusted and well-loved BBC Food website, it's a relief that the British Library is going to be able to archive the website for posterity."