UK Web Archive blog

Information from the team at the UK Web Archive, the Library's premier resource of archived UK websites

Introduction

News and views from the British Library’s web archiving team and guests. Posts about the public UK Web Archive, and since April 2013, about web archiving as part of non-print legal deposit. Editor-in-chief: Jason Webber.

06 October 2022

WARCnet Special Report: Skills, Tools and Knowledge Ecologies in Web Archive Research, 2022

by Sharon Healy, Maynooth University (Project Lead)

WARST report image - skills, tools and knowledge ecologies in web archive research

The WARST team are delighted to announce the publication of a WARCnet Special Report, titled: Skills, Tools and Knowledge Ecologies in Web Archive Research. This study is part of a collaborative project by researchers from Maynooth University, the British Library, the International Internet Preservation Consortium, Bayerische Staatsbibliothek, and the University of Siegen. The research team are all members of the Web ARChive studies network researching web domains and events (WARCnet).

The study focuses on individuals around the globe who participate in web archive research, in the context of web archiving, curation, and the use of web archives and archived web content for research or other purposes. We consider web archive research to be representative of the processes and activities described in Archive-It’s web archiving life cycle model from appraisal, acquisition, and preservation, to replay, access, use and reuse (Bragg & Hannah, 2013).

The methodology for the study entailed desk research, participation in WARCnet meeting discussions, and an online questionnaire. The study sought to identify and document the skills, tools, and knowledge required to achieve a broad range of goals within the web archiving life cycle and to explore the challenges for participation in web archive research, and the interludes of such challenges across communities of practice. We suggest that there is a perpetual need to examine the roles of skills, tools, and methods associated with the web archiving life cycle as long as internet, web and software technologies keep advancing, upgrading, and changing.

The Executive summary offers an overview of the findings, and is translated into Danish, French, Spanish and Catalan.

The Report is available to download from the WARCnet website:

https://cc.au.dk/fileadmin/dac/Projekter/WARCnet/Healy_et_al_Skills_Tools_and_Knowledge_Ecologies.pdf

A section of the Report that focused on the software, tools and methods used in the web archive research life cycle was presented in a poster at iPres 2022.

05 October 2022

iPres 2022 Conference Report from the UK Web Archive

By Helena Byrne, Nicola Bingham and Dr Andrew Jackson, British Library; Eilidh MacGlone, National Library of Scotland; and Caylin Smith, Cambridge University Libraries

iPres is the largest international conference on digital preservation. The conference has been held every year since 2004. The 2022 edition was hosted by the DPC in Glasgow. This meant that the official conference website ipres2022.scot was within scope for the UK Web Archive to preserve. You can view the archived version of the website here: 

https://www.webarchive.org.uk/wayback/archive/20220914105705/https://ipres2022.scot/ 

Screenshot of the iPres 2022 conference website

iPres 2022 was held from Monday 12 to Friday 16 September. There was a mix of presentations over the week, with workshops, long papers, short papers, poster presentations and lightning talks, as well as show and tell sessions in the form of a ‘Bake Off’. On the final day of the conference, there were a number of site visits to organisations that are running a digital preservation programme.

This year’s conference also coincided with the 20th anniversary celebrations of the DPC, as well as the DPC Preservation Awards that are held every two years. In 2020, the UK Web Archive won The National Archives (UK) Award for Safeguarding the Digital Legacy at the virtual Digital Preservation Awards 2020 ceremony.

There are also a number of awards given at iPres in various categories. This year’s winner of the Angela Dappert Memorial Award, established in 2021, was Dr Andrew Jackson, Technical Lead for the UK Web Archive, for his presentation ‘Design Patterns in Digital Preservation: Understanding Information Flows’.

Many UK Web Archive colleagues from the British Library, National Library of Scotland and Cambridge University Library attended the conference both as delegates and presenters. In this blog post they have reported back on their conference experience.

British Library

Dr Andrew Jackson
As well as presenting my Design Patterns paper, I was also involved in a workshop on format registries in digital preservation. Both sessions were well-attended and seemed to go well, and I’m planning to post about both in more detail in the future. 

I particularly enjoyed the session on DNA storage, especially because of Euan Cochrane’s approach: working with a DNA lab at Yale University to independently verify the work being done by Twist Bioscience.  It’s still a long way from being a storage option we can depend on, but it’s starting to look like it might actually happen!

There were a lot of good quality papers but I particularly enjoyed “Monitoring Bodleian Libraries' Repositories with Micro Services” presented by James Mooney. The overall approach was very similar to how I like to work, from the design of the overall architecture (federated monitoring of resources in situ rather than centralised and ingest-driven) to the style of implementation (microservices combined with best-in-class open source service components).

Nicola Bingham
This was the first iPres conference I have attended. I wish I could have been there in person but, due to practicalities, I attended online. One of my highlights was the presentation from William Kilbride, in which he stated that one of the aims of the DPC was to build “the social infrastructure of digital preservation” (as opposed to focussing on technical aspects). I think this has always been true but is now more so than ever, especially when it comes to diversifying our archives and enabling communities to have agency in telling their own stories, as articulated by Tamar Evangelista-Dougherty in her keynote.

Another highlight was hearing from Garth Stewart, Head of Digital Records at National Records of Scotland. Garth presented on NRS’s two-year project to ingest and make available Scottish Government Cabinet Records and had practical advice for negotiating the transfer of good quality metadata from depositors - it’s all about gaining trust and explaining to depositors that the quality of the metadata provided affects the experience of end users. I was also intrigued that they had the challenge of building and maintaining two access solutions, one for journalists and one for the public.

A final highlight for me was the long paper, “A Digital Preservation Wikibase” by Kenneth Seals-Nutt of Yale University. Kenneth’s presentation set down the practical steps taken by Yale University Library’s department of digital preservation to implement a Wikibase instance and how this was used to transform a data set related to software into a knowledge base using technologies of the Semantic Web. This is particularly useful to us at the UK Web Archive as we consider the next steps in our web archiving roadmap. 

Helena Byrne
This was my first time attending iPres but I wasn’t able to make it in person so I was delighted that they had an option to join the conference remotely. I was also involved in a collaborative poster presentation with Katharina Schmid (Bayerische Staatsbibliothek) and Sharon Healy (Maynooth University). Our poster ‘Exploring Software, Tools and Methods used in Web Archive Research’ was part of a bigger study that will be published through WARCnet in the coming weeks. 

There were so many great talks, especially around inclusion and diversity in the wider digital preservation field. This along with activism was also a common theme in the three keynotes. These were all very different in scope so it is hard to pick one over the other but I will definitely be watching back over these in the coming weeks and I will share them with colleagues when they are published online.

National Library of Scotland

Eilidh MacGlone
I was grateful to have the opportunity to attend iPres this year. This was my first experience of the conference, and it was a happy one. There were lots of opportunities to meet new people and catch up with those I knew from the preservation world. And it was useful! The continuous improvement models are a very handy way to set achievable targets for professionals who are often the only preservationists in their organisation. I know this will be useful to me, even though I am not on my own. I was fascinated to hear about DNA data storage, which, although not yet operating at scale, has interesting properties of robustness at room temperature.

You can read more about one of Eilidh’s takeaways from iPres in her blog post - iPres report: a simple workshop exercise using Robust Links.

Cambridge University Library

Caylin Smith
Glasgow 2022 was the second in-person iPres I’ve attended; I previously attended in 2019 when the conference was held in Amsterdam. I was grateful to attend again this year to present about ongoing research as well as catch up with friends and colleagues in the field and meet some new faces. 

Along with Sara Day-Thomson (Edinburgh University Library) and Patricia Falcao (TATE), I led a workshop on the first day of the conference. Titled “Preserving Complex Digital Objects: Revisited”, this workshop picked up on the workshop we gave at iPres in 2019 and focused on supporting the collection management of digital materials for which few or no solutions currently exist. 

There were many great submissions to iPres this year. One paper on the topic of web archiving that stood out to me was “These Crawls Can Talk. Context Information for Web Collections” by Susanne van den Eijkel and Daniel Steinmeier from the KB (National Library of the Netherlands). I’m looking forward to thinking further about their research in the context of web archiving activities at Cambridge University Libraries. 

The next iPres conference will be held in Champaign-Urbana, Illinois in the U.S.A. from September 19-22, 2023.

04 October 2022

iPres report: a simple workshop exercise using Robust Links 

By Eilidh MacGlone, Web Archivist, National Library of Scotland

Inspiration at iPres
I had the opportunity to attend iPres 2022, an international conference dedicated to digital preservation. One of the sessions - Robust Links - run by the Digital Preservation Coalition (DPC), really sparked ideas for me. Robust Links offers anyone the opportunity to make links more permanent and less susceptible to 'link rot'. You add a link and it offers several options, one being to link to a 'memento' version of the web page.

It initially seemed out of reach, a bit too technical; but, listening, I recalled using Glitch, a platform which can handle JavaScript and style sheets. I have known about Robust Links for a few years, but it delighted me to have it function in a page I built. This step was valuable to me: it helped me phrase the question I need to ask within my own organisation.
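For readers curious what a robust link looks like in markup, here is a minimal sketch in Python. It assumes the `data-originalurl`, `data-versionurl` and `data-versiondate` attributes described in the Robust Links specification, and it reuses the archived iPres 2022 site from the UK Web Archive as the 'memento'. The helper names `parse_wayback_timestamp` and `robust_link` and the link text are hypothetical; this is an illustration, not the code behind the workshop page.

```python
from datetime import datetime
from html import escape


def parse_wayback_timestamp(timestamp: str) -> datetime:
    """Parse a 14-digit Wayback-style timestamp (YYYYMMDDhhmmss)."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")


def robust_link(original_url: str, snapshot_url: str, snapshot_timestamp: str,
                text: str, link_to_snapshot: bool = False) -> str:
    """Build an HTML anchor carrying Robust Links attributes.

    Hypothetical helper: href points at the live page by default, or at the
    archived snapshot when link_to_snapshot is True.
    """
    version_date = parse_wayback_timestamp(snapshot_timestamp).date().isoformat()
    href = snapshot_url if link_to_snapshot else original_url
    return (
        f'<a href="{escape(href, quote=True)}" '
        f'data-originalurl="{escape(original_url, quote=True)}" '
        f'data-versionurl="{escape(snapshot_url, quote=True)}" '
        f'data-versiondate="{version_date}">{escape(text)}</a>'
    )


# Example using the archived conference website mentioned on this blog.
EXAMPLE = robust_link(
    original_url="https://ipres2022.scot/",
    snapshot_url=("https://www.webarchive.org.uk/wayback/archive/"
                  "20220914105705/https://ipres2022.scot/"),
    snapshot_timestamp="20220914105705",
    text="iPres 2022 conference website",
)
print(EXAMPLE)
```

Keeping the original URL, the snapshot URL and the snapshot date together on one anchor is what lets a reader fall back to the archived copy if the live page disappears.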

NLS workshop
I was therefore inspired to include Robust Links in this workshop exercise for National Library of Scotland staff. I asked attendees to create another category for an imaginary "Scottish Music collection". I built this with websites we already collect. I was going to share this as a document file, but it became a web page following a quick refresher on HTML. 

Screenshot of the 'scottish music collection' website 

In this way, Robust Links create a kind of distributed collection through “archived near” links without the risk of cutting each other off. Legal deposit items have to be read by one person at a time, which can make a task that shares the same titles a little tricky. It also gives us the chance to talk about how the new categories interact with the original list. Here were our results: 

Screenshot of the results section of the 'scottish music collection' website

It was also a starting point for retrieving information through public directories. These included OSCR, the charities register for Scotland, and the Companies House register. Finally, it is a kind of crowdsourcing exercise. More than a quarter (six out of twenty-one) were not in the archive.

Colleagues gave positive feedback about our workshop and this exercise. I plan to continue developing the idea and would love to hear from anyone making their own version.

30 September 2022

Celebrating Sporting Heritage Day 2022

By Helena Byrne, Curator of Web Archives, The British Library

This blog post gives an overview of our sports-related activities for the year to celebrate Sporting Heritage Day 2022.

2022 has been, and continues to be, a really busy year for international sport, especially in the UK. The Winter Olympics in Beijing and the Commonwealth Games in Birmingham were scheduled for 2022 years in advance. However, the Covid-19 pandemic disrupted many events in 2020 and 2021, and a number of sporting events were postponed. The UEFA Women's Euros and the Rugby League World Cup, both hosted by England, were moved from 2021 to 2022, making 2022 even busier than normal in terms of major sporting events.

Sport has always been an important part of the UK Web Archive, so 2022 has been a busy year for us so far. Since 2017, sport has been grouped into three separate collections.

Sports Collection - https://www.webarchive.org.uk/en/ukwa/collection/1768 

Sports: Football - https://www.webarchive.org.uk/en/ukwa/collection/1490 

Sports: International Events - https://www.webarchive.org.uk/en/ukwa/collection/2315 

The UK Web Archive regularly publishes blog posts about sport, which can be found here: https://blogs.bl.uk/webarchive/sports/

2022 Winter Olympics and Paralympics

As members of the International Internet Preservation Consortium (IIPC) both the British Library and the National Library of Scotland contributed to the IIPC Content Development Group (CDG) 2022 Winter Olympics and Paralympics collection.

The Olympics took place in Beijing from 4 to 20 February 2022, while the Paralympics were also in Beijing from 4 to 13 March 2022. 

The collection archived 863 items, which included whole websites, subsections or individual pages from websites. These items are from 38 countries, and 24 different languages are represented in the collection. Topics covered events both on and off the sporting field.

Browse the collection here:

https://archive-it.org/collections/18422 

UEFA Women’s Euro England 2022

The UEFA Women's Euro 2022 competition took place across England from July 6 to July 31, 2022. Although the event is over, we are still collecting websites about the Euros from around the UK until the end of October.

This collection covers both the sporting and cultural achievements of the event. There are over 275 items in the UEFA Women’s Euro England 2022 collection.

So far we have published seven blog posts about the Women’s Euros and there are still more to come. They can be found on the UK Web Archive blog with the sports tag here:

https://blogs.bl.uk/webarchive/sports/ 

Browse the collection here: https://www.webarchive.org.uk/en/ukwa/collection/4278

Commonwealth Games Birmingham 2022

Commonwealth Games Birmingham 2022 ran from 28 July to 8 August. Although the sporting events are over, the cultural programme is continuing for a number of weeks. This means that UKWA still has an open call for nominations for this collection.

The collection covers both the sporting and cultural achievements as well as the social impact of this mega event. So far there are 434 items in the Commonwealth Games Birmingham 2022 collection.

Browse the collection here: https://www.webarchive.org.uk/en/ukwa/collection/4228 

Rugby League World Cup 2021

The Rugby League World Cup 2021 will take place from 15 October to 19 November 2022 across England. 

This event is unique in that the men's, women's and wheelchair competitions all take place alongside each other. You can nominate your UK published Rugby League World Cup content here: https://www.webarchive.org.uk/nominate 

Updates on this collection will be published on the UK Web Archive blog and Twitter account.

When published this collection will sit as a subsection of the Sports: International Events collection on the UKWA Topics & Themes page and will be available here: https://www.webarchive.org.uk/en/ukwa/collection/2315 

Access to the collections 

All of the archived content in the IIPC CDG 2022 Winter Olympics and Paralympics collection is open access. CDG collaborative collections are archived using the Archive-It platform, so their content is openly available, although a publisher may request its removal under the Internet Archive’s general terms and conditions.

All CDG collections can be viewed here: https://archive-it.org/home/IIPC 

UK Web Archive content has a mix of on-site and remote access due to the Non-Print Legal Deposit Regulations implemented in 2013. The full manifest of content selected for UK Web Archive collections is visible on the website, but access to individual archived websites depends on permission being granted by website publishers. A note under each title informs users whether they can view the archived website online or whether they need to visit a UK Legal Deposit Library to view the archived content. 

All curated collections can be found on the Topics and Themes page of the UK Web Archive website: https://www.webarchive.org.uk/en/ukwa/category 

Get involved

The UK Web Archive is a partnership of the six UK Legal Deposit Libraries and works with other external partners in order to expand our subject expertise. We can’t curate the whole of the UK web on our own, however - we need your help to ensure that information, discussions and creative output related to sports are preserved for future generations.

Anyone can suggest UK published websites to be included in the UK Web Archive by filling in our nomination form: https://www.webarchive.org.uk/nominate 

07 September 2022

GLAM Workbench update

By Andy Jackson, Web Archive Technical Lead, British Library

In 2020, we led a project funded by the International Internet Preservation Consortium (IIPC) called Asking questions with web archives – introductory notebooks for historians, developing a set of Jupyter notebooks to introduce researchers to the potential and possibilities of web archives. In collaboration with the National Library of Australia and National Library of New Zealand, this funding enabled Tim Sherratt to create the Web Archives section of the GLAM Workbench.

Screenshot of GLAM workbench website

We were very happy with how this project worked out, and we think collaborating with someone like Tim opens up new ways of supporting researchers working with web archives. If you’d like to know more about the results of the project, check out Tim’s 2020 blog post and his conference presentation from 2021.

While the investment in project funding got the ball rolling, the GLAM Workbench needs ongoing management and maintenance to keep it running.  This should not be taken for granted, so we’re proud to announce that the Web Archives section of the GLAM Workbench is now supported by the British Library.

We hope this will help ensure this critical resource remains available in the future, and we would like to encourage other web archives to look at whether they could pursue project or supporting funding to help maintain and grow the GLAM Workbench.

08 August 2022

Cats on the web and in the web archive

By Jason Webber, Web Archive Engagement Manager, British Library

Domestic cats have featured in human life for thousands of years. They can make an incredible impact on many of our lives and, even if you don’t have one, you may have enjoyed watching, reading or hearing about them online. Have you watched a ‘funny cat’ video, laughed at ‘Grumpy cat’ or simply enjoyed a colleague’s pet making an appearance in a work video call?

It seems appropriate, therefore, on International Cat Day, to show a few ‘cat’ things that we have in the archive.

Larry the No.10 Cat - Political scrapbook

Screenshot of the Political Scrapbook website with an article about Larry the Downing street cat

www.webarchive.org.uk/wayback/archive/20170708173602/https://politicalscrapbook.net/2011/02/larry-the-downing-street-lolcat/

Battersea Dogs and Cats Home

Pictures of cats for adoption on the battersea dogs and cats home website

https://www.webarchive.org.uk/wayback/archive/20110117154300/http://www.battersea.org.uk/cats/new_cats_gallery/index.html

WW1 cats - Durham at War website

Screenshot of the Durham at War website - ww1 cats

www.webarchive.org.uk/wayback/archive/20170702092930/http://ww1countydurham.blogspot.co.uk/2015/02/wwi-cats-on-internet.html

Cat Behaviour - Cat Protection website

Screenshot of the Cat behaviour page on the cat Protection website

www.webarchive.org.uk/wayback/archive/20170215160335/http://www.cats.org.uk/cat-care/cat-behaviour-hub

Teapots, Teapots, Teapots - Cheshire Cats

Screenshot of the Teapots website showing a cheshire cat teapot

www.webarchive.org.uk/wayback/archive/20141030023644/http://teapotsteapotsteapots.blogspot.co.uk/2009/10/cheshire-cat-teapot.html

Pi-powered cat feeder

Screenshot of Pi-powered cat feeder website

www.webarchive.org.uk/wayback/archive/20141021035626/http://www.raspberrypi.org/pi-powered-cat-feeder/

If you haven't had enough cat chat, take a look at our 2020 blog - Cats v Dogs on the archived web.

Fan of dogs? Their day is on 26 August.

29 July 2022

Web archiving the UEFA Women’s Euros in Wigan

By Georgina Bentley, Service Manager Community-based Customer & Cultural Services at Wigan Council

Image of a jersey commissioned for the Around The Match project hanging over the top of a rusty goal post in a sports field with multiple soccer and rugby pitches.

Introduction

The Heritage Fund awarded £500,000 to a programme which is recording the hidden history of women’s football and has launched a celebration of the game, its players, and communities in partnership with the UEFA Women’s EUROs.

Alongside this programme, the UK Web Archive is also archiving UK-published websites about the tournament. In this guest blog post, we hear from Georgina Bentley from Wigan Council about their contribution to the collection.

Wigan Council

Wigan Council is the local authority for the Metropolitan Borough of Wigan in the North West of England. The Council have been one of the 10 host cities for the UEFA Women’s EURO 2022, hosting 4 matches at the Leigh Sports Village.

What did you collect for your museum/archive while working on this project?

From the start we wanted to ensure the stories of our local pioneers were central to our collection approach. Supported brilliantly by our archive volunteers, we established a much deeper understanding of how the game had developed in the borough, whilst a call out for local women and girls to share their stories provided us with the source material from which our heritage projects developed.

We translated this material via a series of creative heritage programmes including temporary exhibitions, contemporary collecting events at the fan parties and projects such as A Place At The Table and Around The Match.

The programme has already increased our existing collection, with more material coming forward. The material collected to date includes a range of oral histories, memorabilia, photographs, news articles and programmes, alongside the output of the creative heritage projects such as the new kit, pin badge and programme developed for Around The Match.

What kind of online content did you select for the UK Web Archive collection?

With our content selection we wanted to try and capture the breadth of the heritage programme in the borough as it has been an incredibly rich experience to celebrate the amazing stories of our women and girls that play and love the game. This includes:

  • Event pages from cultural sites
  • Project websites
  • Online newspapers

What websites are important for telling the story of the UEFA Women’s Euro England 2022 tournament in your area?

The Visit Wigan web page encapsulates the breadth of opportunity the tournament afforded locally to celebrate elite women’s sport and be inspired to participate.

The Around The Match web page tells the story of 11 women and girls brought together to form a new team. Their individual passions and stories are beautifully expressed in a wonderful film on the site, which also has details of the contemporary memorabilia created to mark the tournament in the borough. The memorabilia is currently for sale, with 100% of the proceeds going to support the grassroots game locally as a lasting legacy.

The A Place At The Table web page follows the history of women’s football both locally and in the context of the national and international game. Each table from the project focuses on a point in history that highlights the place of women in football, as well as the parallels with the development of rights for women and wider society at the time.

The archived versions of these web pages can be found in the Cultural Programme subsection of the UEFA Women’s Euro England 2022 collection on the UK Web Archive website.

Get Involved

Browse through the UEFA Women’s Euro England 2022 collection and let us know if there is any UK published content that should be added. Anyone can suggest UK published websites to be included in the UK Web Archive by filling in our nominations form: www.webarchive.org.uk/en/ukwa/info/nominate

26 July 2022

Web archiving the UEFA Women’s Euros in Sheffield

By Dr Justine Reilly, Strategic Director, Sporting Heritage

Four different photos of handmade football flags. There are six flags in total. The image is from a partnership event Sporting Heritage hosted with Sheffield Museums. The event was held on Monday 25 July 2022 at the Museum. There were four different sessions where children came together to make football flags.

Introduction
The Heritage Fund awarded £500,000 to a programme which is recording the hidden history of women’s football and has launched a celebration of the game, its players, and communities in partnership with the UEFA Women’s EUROs.

Alongside this programme, the UK Web Archive is archiving UK-published websites about the tournament. In this guest blog post, we hear from Dr Justine Reilly from Sporting Heritage who supported host city Sheffield, with their contribution to the collection.

Sporting Heritage
Sporting Heritage is a UK-wide organisation that works to support the preservation, collection, access and celebration of the sporting past. Whether that be objects and archives, photographs and videos, oral histories or songs and chants, our role is to support all those who have a sporting story. We deliver a range of activities and events, for example training events, National Sporting Heritage Day, and the Sporting Heritage Awards.

What did you collect for your museum/archive while working on this project?
We supported the host city of Sheffield by developing a number of different programmes including:

How did you collect your archive material?
We reached out to local sports clubs and organisations with links to football across Sheffield to inform both our exhibition and our wider activity. This included a social media campaign to draw in voices which have previously been ignored and hidden in the story of women’s football in Sheffield.

We continue to capture new stories via our web pages, and have worked closely with partner organisations such as Football Unites, Racism Divides (FURD) and academic researchers Dr Fiona Skillen and Dr Gary James to inform our programming. Our aim was to draw on online content, cross-reference historical facts, and hear from lived experience voices which may not have been part of the historical record previously. 

What websites are important for telling the story of the UEFA Women’s Euro England 2022 tournament in your area?
The overarching web pages linking to our heritage content around the Women’s Euro in Sheffield:

https://www.sportingheritage.org.uk/content/category/what-we-do/projects/uefa-womens-euros-22

The linked pages hosted by the City of Sheffield:

https://www.welcometosheffield.co.uk/visit/uefa-women-s-euro-2022/

And FURD pages outlining their work on the physical exhibition plinths and supporting activity:

https://furd.org/news/hidden-history-of-sheffield-womens-football-revealed-in-new-exhibitions

The archived versions of these web pages can be found in the Cultural Programme subsection of the UEFA Women’s Euro England 2022 collection on the UK Web Archive website.

Get Involved
Browse through the UEFA Women’s Euro England 2022 collection and let us know if there is any UK published content that should be added. Anyone can suggest UK published websites to be included in the UK Web Archive by filling in our nomination form: www.webarchive.org.uk/en/ukwa/info/nominate