UK Web Archive blog

Introduction

The UK web is one of the most important aspects of the nation’s digital record. But the web is extremely vulnerable, and websites can and do disappear frequently. Preserving them, and providing access to those preserved versions, have become matters of urgency and strategic importance.


20 November 2019

Militarism and its role in the commemoration of British war dead

By Liam Markey, Collaborative Doctoral Student, University of Liverpool

Mediating Militarism: Chronicling 100 Years of Military Victimhood from Print to Digital, 1918-2018 is an ESRC funded CASE studentship in collaboration with the University of Liverpool and the British Library. The project aims to assess militarism and its role in the commemoration of the British war dead since the end of the First World War.
By taking advantage of unique access to print and digital materials captured and held by the British Library, my aim is to chronicle the changing public portrayal of the British war dead from the print to the digital age, evaluating the role this portrayal plays in the mediation of militarism in the process.

Memorial

What is Militarism?
Militarism, generally defined as the glorifying of war and invasion of the civilian sphere by military ideals, manifests itself in a variety of ways that depend heavily on contemporary politics, alongside both military and social developments. In the case of Britain, national narratives surrounding the First World War have played a key role in the development of the nation’s own form of militarism.

The nature of Britain’s involvement in the First World War meant that following the Armistice of 11th November 1918, a multitude of commemorative practices were developed in order to facilitate the mourning of an entire nation. British soldiers who had died abroad were not repatriated following the war, meaning tangible sites of mourning, such as the Cenotaph in London, were created as focal points of British remembrance. A unique language and symbology surrounding the commemoration of the war dead developed. Fallen soldiers began to be venerated as almost Christ-like figures, and symbols such as the poppy became tangible representations of commemoration. These practices continue into the present day and have saturated British attitudes to the military and the waging of war.

UK Web Archive
How, then, can the UK Web Archive assist in the development of this research project? Websites curated by the archive provide us with a valuable look at how ordinary British people and communities interact with these commemorative practices, and I am interested in looking at how the language and symbols popularised over the past century are reproduced, for example, in amateur websites. One of the big questions I have been asking as I carry out my research is how the First World War leaving living memory has affected the function of these practices.

Using the UK Web Archive to assess the British discourse around those who were killed in the war, be it regarding a family member or a soldier who served in a local regiment, will prove fascinating when interrogating ideas such as the sanitisation and trivialisation of war.

Questions?
Does language steeped in religious rhetoric glorify war, representing the saturation of British commemorative practices with militarism over the past century, or is it instead an insight into more personal and isolated forms of commemoration, distinct from the national narratives we are presented with in the media? Does an excessive use of the poppy on both amateur and media websites reflect this potent symbol’s original meaning or has it been hijacked to serve more nationalistic and militaristic purposes?

Materials collected by the UK Web Archive will prove invaluable in answering these questions.

04 October 2019

UKWA Website Crawl - One hour in One minute

By Jason Webber, Web Archive Engagement Manager, The British Library

Each year we attempt to collect as much of the UK web space as we can. This typically involves millions of websites and billions of individual assets (images, PDFs, CSS files etc.). We send out our robots across the interwebs looking for websites that we can archive. The bots follow links to pages that have more links to follow, and they keep going until we have archived (almost) everything. But what does it look like to 'crawl' the web? Here we have condensed an hour of live web crawling into a one minute video:

Every circle is a different website, and every line represents a link that was followed between websites. The size of the circle represents how many pages we visited from that site, and the width of the line represents the number of links we followed.
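As a rough illustration of how the data behind a graph like this could be put together, the sketch below turns a list of followed links into page counts per site (circle size) and link counts between sites (line width). The input format, the function name and the sample URLs are assumptions made for this example; they are not the crawler's real log format or code.

```python
from collections import Counter
from urllib.parse import urlparse


def build_crawl_graph(followed_links):
    """Aggregate (source_url, target_url) pairs into graph data.

    Returns (pages_per_site, links_between_sites):
    - pages_per_site: host -> number of distinct pages seen (circle size)
    - links_between_sites: (source_host, target_host) -> links followed
      between two different sites (line width)
    """
    pages_seen = set()
    links_between_sites = Counter()
    for source_url, target_url in followed_links:
        source_host = urlparse(source_url).netloc
        target_host = urlparse(target_url).netloc
        pages_seen.add((source_host, source_url))
        pages_seen.add((target_host, target_url))
        # Only links that cross from one site to another become lines.
        if source_host != target_host:
            links_between_sites[(source_host, target_host)] += 1
    pages_per_site = Counter(host for host, _ in pages_seen)
    return pages_per_site, links_between_sites


# Hypothetical sample data, not taken from a real crawl.
sample_links = [
    ("http://example.org.uk/", "http://example.org.uk/about"),
    ("http://example.org.uk/about", "http://example.co.uk/"),
    ("http://example.org.uk/", "http://example.co.uk/news"),
]
pages, links = build_crawl_graph(sample_links)
```

Here each site ends up with a page count and each pair of sites with a link count, which is all a drawing library would need to size the circles and lines.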

If you want to see what we are crawling at the moment, look here (NOTE: this link only works while we are crawling the web): https://jumbled-eggplant.glitch.me/graph.html

You can see what we have captured at our website (www.webarchive.org.uk/ukwa/). However, many of the sites themselves can only be viewed in the reading rooms of UK Legal Deposit Libraries.

Despite our best efforts, we can't collect every UK-owned website, as many are hosted abroad and not under a .uk domain (looking at you, WordPress, Squarespace and Wix). You can nominate a website here: https://www.webarchive.org.uk/en/ukwa/info/nominate

30 September 2019

The Magic of Wimbledon in the UK Web Archive

By Robert McNicol, Librarian at the Wimbledon Lawn Tennis Museum


The magic of Wimbledon is its ability to preserve its history and tradition while simultaneously embracing the future. When you enter the Grounds of The All England Lawn Tennis Club, you know you’re somewhere special. It’s the spiritual home of the sport and you can feel the history all around you. And yet Wimbledon in 2019 is also a thoroughly modern sporting venue with state-of-the-art facilities for players, spectators, officials and broadcasters. While Wimbledon loves its traditions (the grass courts, the all-white clothing, the strawberries & cream), it has always been looking ahead as well, from the very first Lawn Tennis Championships in 1877, to the introduction of Open tennis in 1968, to the building of roofs on Centre and No.1 Courts. Wimbledon is both the past and the future of tennis.

It’s in this same spirit that the Kenneth Ritchie Wimbledon Library has teamed up with the British Library to curate a collection of tennis websites for the UK Web Archive. This is a subsection of the much larger Sports Collection on the UK Web Archive website. Using the latest technology to preserve the past, it’s a project that captures the essence of Wimbledon.

Naturally, one of the first websites we added to the Tennis collection was our own. Wimbledon.com was established in 1995 and is very excited to be celebrating its 25th anniversary next year. This project ensures that, in future, researchers will be able to go back and search the contents of the Wimbledon website from previous years. We have also archived some Wimbledon social media feeds, including the Twitter feed of the Wimbledon Lawn Tennis Museum, of which the Library is part.

However, the ultimate aim is to archive a complete collection of UK-based tennis websites. This will include sites belonging to governing bodies, clubs, media and individual players. One part of the project already completed is to archive the Twitter feeds of all British players with a world ranking. From Andy Murray and Johanna Konta to Finn Bass and Blu Baker, every British player with a Twitter account has had it saved for posterity!

If you want to hear more about the project, you may be interested in attending Wimbledon’s Tennis History Conference on Saturday 9 November, where Helena Byrne (Curator of Web Archiving at the British Library) will be joining me to do a joint presentation.

And if you’d like to know more about the Wimbledon Library, feel free to get in touch. We’re the world’s biggest and best tennis library, holding thousands of books, periodicals and programmes from more than 90 different countries. We’re open by appointment to anyone with an interest in researching tennis history. https://www.wimbledon.com/en_GB/atoz/library_research_enquiries.html

Finally, if you’d like to nominate a tennis or other sporting website for us to archive, go to our Save a UK website form: https://www.webarchive.org.uk/en/ukwa/info/nominate

16 July 2019

Summer Placement with the UK Web Archive

By Isobelle Degale, Masters student, University of Sussex

My summer placement at the British Library is now coming to an end. As a Masters student studying Human Rights, I contacted the UK Web Archiving team based at the British Library as a way to enrich my understanding of the sources available on London policing, specifically looking at stop and search procedure.

BL-porthole

In the first few days of the placement I learnt how to add online content to the UK Web Archive using the 'Annotation and Curation Tool' (ACT). I learnt how to add 'targets' (web addresses) to the web archive using ACT, and the importance of setting the crawl frequency for different sources. Over the last few weeks I have been researching and selecting content to add to the online collections: Black and Asian Britain and Caribbean Communities in the UK.

Having previously studied history, including the impact of the British Empire, during my undergraduate degree, I also have an interest in the Windrush generation and have been selecting content such as websites, podcast links, videos and documentaries. I have also gained hands-on experience in web archiving through emailing website authors requesting permission for open access to their content.

As my summer dissertation discusses discrimination and disproportionality of London stop and searches, I have also been adding related content to the UK Web Archive. I have gathered content such as news articles, Twitter accounts of activists, grassroots websites and publications from racial equality think tanks that highlight the disproportionality of stop and searches on young BME (Black and Minority Ethnic) people and communities, which is the central debate of this topic. My dissertation specifically explores the experiences and perspectives of those stopped and searched. I have noted that there is a gap on the web in content which explores and expresses the opinions of those who are more likely to be stopped, despite the abundance of news reports and statistics on the topic.

My experience with the web archiving team has opened up my thoughts to the value of archiving online content, as with the breadth and depth of the web, socially and culturally important web sites can easily be overlooked if not archived.

I hope that my contribution over the weeks will be useful in documenting the cultural and social celebration of black and Asian communities in Britain, but also in demonstrating that there are negative experiences of black and ethnic minority Britons that make up an important part of daily life and should not be ignored. As a human rights student I feel that it is important to recognise inequality in both past and present Britain. I am, therefore, grateful to the Web Archiving team for the opportunity to add to the UK Web Archive content on the much debated topic of London stop and searches, which will hopefully provide insight and information into the subject.

12 June 2019

Trains, Tea, Depression and Cats – what do UK Interactive Fiction writers write about?

Background
Works of Interactive Fiction (IF) are stories that allow the user to guide or affect the narrative by making choices, clicking links or otherwise navigating the text. As part of the 'Emerging Formats project', I’ve been investigating UK Interactive Fiction in order to help determine potential collecting priorities, and attempting to collect works for the UK Web Archive. This has allowed me to discover who is creating interactive fiction, what kinds they are creating, and what tools they’re using.

As might be expected, IF creators come from a wide variety of backgrounds and create works for a wide variety of reasons. Some create interactive fiction to educate, to experiment with particular IF tools, to create portfolio pieces to support their professional writing and design careers, to sell their work, or simply to share an idea or story. Both the tools used and the approaches to using these tools are extremely diverse, as are the genres and topics covered by the works (see fig 1). However, as I studied the 200+ items found during the course of the project, several recurring themes began to emerge.

Chart

Themes
Public transport features in a number of works, but trains are represented particularly strongly. In both Eric Eve’s Nightfall and Jonathan Laury’s Ostrich, trains are indicative of wider problems in society. Nightfall is a thriller where an unknown threat lurks in the city, and that threat is made all the more ominous by the fact that the protagonist’s only means of escape, the train, is cut off at the opening of the story. In Ostrich, much of the scene-setting takes place on the protagonist’s commutes to and from work. As a totalitarian regime gradually takes over, who is and isn’t on the train, and how they behave during the journey, becomes increasingly crucial. (Ostrich was created following the British Library’s Interactive Fiction Summer School in 2018.)

In Journey Through Your Final Dream by Sammi Narramore and Awake the Mighty Dread by Lyle Skains, trains are presented as a dreamlike (or nightmarish) liminal space. Both works play around with the idea of falling asleep on a train and awaking disoriented and unsure whether the dream is truly over.

Many of the works live up to a particularly British stereotype by foregrounding tea. Joey Jones’ Strained Tea asks the user to perform the simple task of making a cup of tea. However, as this is a parser-based piece, the only commands available are to ‘take’, ‘put’ and move using the compass directions, turning this everyday act into a fiendishly difficult puzzle. Tom Sykes’ Fog Lights and Foul Deeds is a Lovecraftian tale set on a narrowboat, where the player-character and his crew must face the horrors lurking in the canal as they struggle to reach their destination. Tea serves as a resource which bolsters the crew’s resolve, increasing their morale and improving their chances of survival. Providing enough tea to keep the team sane, while also ensuring supplies don’t run out before journey’s end is a careful balancing act throughout the piece. Damon Wakes’ Lovely Pleasant Teatime Simulator begins as a very civilised affair where the reader-player may take tea with scones and compliment their host’s décor, but as the tea keeps coming and the banal chitchat loops around and around, new choices begin to emerge which will end the tea party in a variety of scandalous ways.

A huge number of the works deal with personal issues and experiences, with gender and sexuality occurring often, but mental health being the most prevalent topic. Many creators use the interactive affordances available to them to help convey how it feels to suffer from mental health issues. Joseph J Clark’s Depression Simulator is a short, looping piece in which no matter what option is chosen, text appears which reads: ‘You can’t. You are too sad’. Miles Aijala’s Fatigue takes a similar approach in that when the option ‘go out’ is selected, the text changes to ‘haha yeah right’ becoming greyed out and unselectable. Emma Winston’s What it Feels Like in Here is a poetic, meditative piece in which the reader-player guides a somewhat abstract avatar through a series of rooms which become smaller and smaller and darker and darker, echoing the feelings of anxiety, depression and claustrophobia discussed.

Pepito

Since these works live on the internet, it’s perhaps no surprise that many are replete with cats. Creator Ben Bruce’s work is very cat-oriented, with highlights including Bedtime, Kitties, Said the Witch, where a gathering of talking felines pester their witch owner for a bedtime story, and Something That a Cat Once Told Me About Midnight, a legend translated from the original cat about why time behaves strangely around midnight. Freya Campbell’s Pépito incorporates the Twitterbot @PepitoTheCat into a Bitsy story to imagine a day in the life of an internet cat and reflect on the death of her own pets.

Finally, many of the works are self-reflexive and describe either the experience of writing interactive fiction, reading it, or being involved with its community. The Cat Demands by Adam Hay not only features an attention-seeking cat, it’s also about a Twine author’s struggles to complete their latest piece. A Short Journey by Cameron Home critiques the structure of many interactive narratives and questions whether what they’re offering can really be considered ‘choice’.

While the works themselves may offer only an illusion of choice, the collection as a whole offers a genuine range of works to choose from. I hope you’ll explore them via the UK Web Archive, and Webrecorder, or in their original locations using the links in this blog post. (Please note that as this is an experimental project, some works may not be fully accessible via the Web Archive. For the best viewing experience, visit the live versions of the sites).

By Lynda Clark, Innovation Placement, The British Library - @notagoth

28 May 2019

FIFA Women’s World Cup and the UK Web Archive

The 2019 FIFA Women’s World Cup will take place in France from the 7th June to the 7th July 2019. Although women's world cups date back as far as the early 1970s, the FIFA Women’s World Cup was only established in 1991. This is the fifth time that England have qualified for the FIFA Women’s World Cup, but it is a first for Scotland, who join England in Group D of the competition.

Traditionally, women’s sport, and football in particular, has not been well represented in the mainstream media, but this is slowly starting to change. Coverage of events such as the FIFA Women’s World Cup is increasing, and one way to gauge this is to see how many resources on the .uk web were archived. This trend graph on the UK Web Archive Shine interface, which contains all the archived .uk websites from 1996 to April 2013, shows that coverage on the .uk webspace increased in each of the World Cup years. Clicking on a point in the graph brings up a sample of up to 100 websites below it. There were four competitions (1999, 2003, 2007 and 2011) held between 1996 and 2013, but England was the only country from the UK to qualify in the 2007 and 2011 competitions. Thus, it is not surprising that there are just 11 references to “FIFA Women’s World Cup” in 1999 while there were 4,930 in 2011 on Shine Trends.

FIFA-Shine-01

Link to graph.
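The kind of count that sits behind a trend graph like this can be sketched in a few lines. This is only an illustration under stated assumptions: the record structure (a year and a text string per archived page), the function name and the sample records are all hypothetical, and this post does not describe how Shine itself indexes or queries the archive.

```python
from collections import Counter


def phrase_counts_by_year(records, phrase):
    """Count archived pages per year whose text mentions the phrase.

    `records` is an iterable of (year, text) pairs. Matching is
    case-insensitive and each page is counted at most once.
    """
    needle = phrase.lower()
    counts = Counter()
    for year, text in records:
        if needle in text.lower():
            counts[year] += 1
    # Return the years in chronological order.
    return dict(sorted(counts.items()))


# Hypothetical sample records, not real archive data.
sample_records = [
    (1999, "Preview of the FIFA Women's World Cup in the USA"),
    (2011, "England squad named for FIFA Women's World Cup"),
    (2011, "FIFA Women's World Cup: match report"),
    (2011, "Local council meeting minutes"),
]
trend = phrase_counts_by_year(sample_records, "FIFA Women's World Cup")
```

Plotting the resulting year-to-count mapping gives a simple version of the trend line; a real index would also need to handle variant spellings and duplicate captures of the same page.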

The UK Web Archive aims to archive the UK web space. It does this through curated collections and an annual domain crawl, which has been running since the Non-Print Legal Deposit Regulations came into force in April 2013. Sport is a popular subject on the web; however, it is underrepresented in many traditional libraries and archives. The UK Web Archive works across the six UK Legal Deposit Libraries and with other external partners to try to bridge gaps in our subject expertise. We have three curated collections related to sport, one of which is dedicated to the many codes of football. These collections don’t differentiate by gender, but the balance between male and female representation in the collections will be skewed due to the lack of gender equality that exists in all parts of society, including the news industry. According to a UNESCO report, ‘only 12 percent of sports news is presented by women worldwide, and only four percent of media content is dedicated to women's sports’.


Mega sporting events like the FIFA Women’s World Cup generate a lot of ephemeral material both in print and online. On average the lifespan of a webpage is 100 days and, unless it is archived, it could disappear forever. Have you spotted any UK published web content related to England, Scotland, Germany, USA or the odds-on favourite Japan? Then fill in our Public Nomination Form and it will be added soon after:

Nominate your website.

The only criteria that nominations to the UK Web Archive have to meet are that the content is published from the UK (it doesn’t have to be in English, as there are multiple languages in the archive) and that it is not hosted on a predominantly audio-visual platform like Soundcloud or YouTube. Although social media does fall into scope for Non-Print Legal Deposit, platforms other than Twitter are very difficult to archive, and we haven’t been able to archive Facebook since 2015.

Browse through the UK Web Archive Sports: Football collection and see if we have your local club website or Twitter account, your favourite fan sites and any other football related content you enjoy viewing. Feel free to nominate your website.

The British Library is currently hosting the (FREE) exhibition: 'An Unsuitable Game for Ladies: A Century of Women's Football' (14 May – 1 September 2019).

By Helena Byrne, Curator of Web Archiving, The British Library

29 March 2019

Collecting Interactive Fiction

Intro
Works of interactive fiction are stories where the reader/player can guide or affect the narrative in some way. This can be through turning to a specific page as in 'Choose Your Own Adventure', or clicking a link or typing text in digital works.

Archiving Interactive Fiction
Attempts to archive UK-made interactive fiction began with an exploration of the affordances of two different tools: the British Library’s own ACT (Annotation and Curation Tool) and Rhizome’s Webrecorder. ACT is a system which interfaces with the Internet Archive’s Heritrix crawl engine to provide large scale captures of the UK Web. Webrecorder instead focusses on much smaller scale, higher fidelity captures which include video, audio and other multimedia content. All types of interactive fiction (parser, hypertext, choice-based and multimodal) were tested with both ACT and Webrecorder in order to determine which tools were best suited to which types of content. It should be noted that this project is experimental and ongoing, and as a result, all assertions and suggestions made here are provisional and will not necessarily affect or influence Library collection policy or the final collection. As yet, Webrecorder files do not form part of standard Library collections.

Cat_Simulator

For most parser-based works (those made with Inform 7), Webrecorder appears to work best. It is generally more time-consuming to obtain captures in Webrecorder than in ACT as each page element has to be clicked manually (or at least, the top level page in each branch must be visited) in order to create a fully replayable record. However, this is not the case with most Inform 7 works. For the vast majority, visiting the title page and pressing space bar was sufficient to capture the entire work. The works are then fully replayable in the capture, with users able to type any valid commands in any order. ACT failed to capture most parser works, but there were some successes. For example, Elizabeth Smyth’s Inform 7 game 1k Cupid was fully replayable in ACT, while Robin Johnson’s custom-made Aunts and Butlers also retained full functionality. Unfortunately, games made with Quest failed to capture with either tool.

Another form which currently appears to be unarchivable is work which makes use of live data such as location information, maps or other online resources. Matt Bryden’s Poetry Map failed to capture in ACT, and in Webrecorder, although the poems themselves were retained, the background maps were lost. Similarly, Kate Pullinger’s Breathe was recorded successfully with Webrecorder, but naturally only the default text, rather than the adaptive, location-based information, is present. Archiving alternative resources such as blogs describing the works may be necessary for these pieces until another solution is found. However, even where these works don’t capture as intended, running them through ACT may still have benefits. A functional version of J.R. Carpenter’s This Is A Picture of Wind, which makes use of live wind data, could not be captured, but crawling it obtained a sample thumbnail which indicates how the poems display in the live version – something which would not have been possible using Webrecorder alone.

Choice-based works made with Ink generally captured well with ACT, although Isak Grozny’s dripping with the waters of SHEOL required Webrecorder. This could be due to the dynamic menus, the use of JavaScript, or because Autorun has been enabled on itch.io, all of which can prevent ACT from crawling effectively. ChoiceScript games were difficult to capture with either tool for various reasons. Firstly, those which are paywalled could not be captured. Secondly, the manner in which the files are hosted appears to affect capture. When hosted as a folder of individual files rather than as a single compiled HTML file, the works could only be captured with Webrecorder’s Firefox emulator, and even then, the page crashed frequently. Those which had been compiled appeared to capture equally well with either tool.

Twine works generally capture reasonably well with ACT. ACT is probably the best choice for larger Twines in particular, as capturing a large number of branches quickly becomes extremely time-consuming in Webrecorder. Works which rely on images and video to tell their story, such as Chris Godber’s Glitch, however, retain more of their functionality if recorded in Webrecorder. As the game is somewhat sprawling, a route was planned through which would give a good idea of the game’s flavour while avoiding excessively long capture times. Webrecorder also contains an emulator of an older version of Firefox which is compatible with older JavaScript functions and Flash. This allowed for archiving of works which would have otherwise failed to capture, such as Emma Winston’s Cat Simulator 3000 and Daniel Goodbrey’s Icarus Needs.

As alluded to above, using the two tools in tandem is probably the best way to ensure these digital works of fiction are not lost. However, creators are advised to archive their own work too, either by nominating web pages to the UKWA, capturing content with Webrecorder, or saving pages with the Internet Archive’s Wayback Machine.
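For anyone wanting to keep track of these provisional findings, they can be summarised as a simple lookup. The sketch below is only an illustration: the format labels and the function name are invented for this example, and the suggestions simply restate the results described in this post, which may change as the project continues.

```python
# Provisional findings from this post, keyed by hypothetical format labels.
SUGGESTED_TOOL = {
    "inform7": "Webrecorder",
    "quest": None,  # failed to capture with either tool
    "ink": "ACT",
    "choicescript-compiled": "ACT or Webrecorder",
    "choicescript-folder": "Webrecorder (Firefox emulator)",
    "twine-large": "ACT",
    "twine-multimedia": "Webrecorder",
    "flash-or-older-javascript": "Webrecorder (Firefox emulator)",
}


def suggest_tool(work_format):
    """Return the tool this post found most suitable for a format.

    Returns None where neither tool produced a working capture, and
    raises KeyError for formats with no finding recorded here.
    """
    if work_format not in SUGGESTED_TOOL:
        raise KeyError(f"No finding recorded for format: {work_format}")
    return SUGGESTED_TOOL[work_format]


first_choice = suggest_tool("inform7")
```

Individual works can still behave differently from the general pattern (1k Cupid, for instance, was fully replayable in ACT), so a lookup like this is a starting point rather than a rule.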

By Lynda Clark, Innovation Placement, The British Library - @notagoth

21 March 2019

Save UK Published Google+ Accounts Now!

The fragility of social media data was highlighted recently when Myspace deleted (by accident) users’ audio and video files without warning. This almost certainly resulted in the loss of many unique and original pieces of work. This is another example of how online social media platforms should not be seen as archives and that if things are important to you they should also be stored elsewhere. The UK Web Archive can play a role in this and we do what we can to preserve websites and selected social media. We do, however, need your help!

Google+
If you have a Google+ account you will have seen the warning that the service is shutting down on 2 April 2019 and that users should download any data they want to save by 31 March 2019.

However, it’s not easy to know how to preserve data from social media accounts, and sometimes this information doesn’t give the full picture without the context of the platform it was hosted on. In a previous blog post we outlined the challenges involved in archiving social media. Currently the most popular social media platform in the UK Web Archive is Twitter, followed by Facebook, which we haven’t been able to successfully capture since 2015, and a limited amount of Instagram, Weibo, WeChat and Google+.

Under the 2013 Non-Print Legal Deposit Regulations we can legally only collect digital content published in the UK. As these platforms are hosted outside the UK there is no automated way to identify UK accounts so it requires a person to look through and identify the profiles that are added. In general, these are profiles of politicians, public figures, people renowned in their field of study, campaign groups and institutions.

So far, we only have a handful of Google+ profiles in the UK Web Archive but we are keen to have more.

How to save your Google+ data
If you have a Google+ profile or know of other profiles published in the UK that you think should be preserved, fill in our nomination form before March 29th 2019: https://www.webarchive.org.uk/en/ukwa/info/nominate

If the profiles you want to archive are published outside the UK, you can use the 'save a website now' function on the Internet Archive website: https://archive.org/web/

By Helena Byrne, Curator of Web Archiving, The British Library