UK Web Archive blog

Introduction

News and views from the British Library’s web archiving team and guests. Posts about the public UK Web Archive and, since April 2013, about web archiving as part of non-print legal deposit. Editor-in-chief: Jason Webber.

14 January 2013

Religion, politics and the law: a new special collection

It has been over two years in the making, but I am delighted to be able to say that my own special collection in the UK Web Archive is now online.

A couple of years ago, long before coming to the BL, I joined the Researchers and the UK Web Archive project at the Library which brought together a group of scholars to guest-curate special collections on our own particular research interests. As an historian, I was interested in the marked sharpening of the terms of discourse about the place of religion in British public life, particularly since 9/11 and the London bombings in 2005. It struck me that a good deal of this debate had already shifted online, and so new ways and means of capturing and preserving it were going to be needed. And so, the ‘politics of religion collection’ (as it was then known) was born.

As has been noted many times in this blog, the problem for web archiving is that we’re dealing with other people’s copyright work, and so an individual permission is needed for each site. I have a long list of sites which I would dearly love to add to the collection, but for which (for various reasons) we’ve had no response. So, if you are the owner of Protest the Pope, or Holy Redundant, or Christians in Politics, please get in touch. For now, even if the collection cannot be anything like comprehensive, I do hope that it is at least coherent.

There are particular strengths, and some gaps. It includes many campaigning organisations, both secularist and religious, and is heavy on the conservative Christian organisations about which I myself know most. It is relatively light on non-Christian faiths, since I know the field much less well. It is still very much open, however, and so suggestions of sites that ought to be included are very welcome, via this blog or via the UK Web Archive site.

See a previous post about my progress in 2012.

Peter Webster

07 January 2013

Oral history in the UK: a new special collection

[A guest post from Elspeth Millar, Oral History Archive Assistant in the National Lifestory Collection at the British Library.]

I have been involved in the pilot Curators' Choice project, led by the Digital Curator team. The Curators' Choice project is helping curators within the British Library to establish collections in the UK Web Archive, based on the subject expertise of their curatorial department. As Archive Assistant for Oral History and National Life Stories at the British Library, my natural topic of choice was going to be websites relating to 'Oral History in the UK'. I have nominated organisational or individual project websites which give information about a project (project background, participants, funding information), websites which provide access to finding aids for oral history interviews and, ideally, sites which provide direct online access to oral history archive material (either clips or full interviews).

I was lucky to have existing resources at my disposal to discover relevant websites, in particular our own Oral History section resources page, the Oral History Society website and the Oral History Journal; the journal includes a 'Current British Work' section which helpfully lists current oral history projects around the UK.

Oral History in the UK was traditionally concerned with community history and uncovering 'history from below' although it is now widely used within many academic disciplines.  I hope that the websites so far included in the 'Oral History in the UK' collection demonstrate the variety of ways in which Oral History is now used - from use by community and local history groups, charities but also universities. The range of websites in the collection includes those which document local history (Durham in Time, St. Helier Memories); the experiences of people who have emigrated to the UK (such as the Birmingham Black Oral History Project); disability history (Speaking Up For Disability); health (Testimony - inside stories of mental health care); industry (Songs of Steel); and memories of war (The Workers' War, Captive Memories).

The websites vary widely in the way they present oral history. Many websites, although not all, provide access to extracts from oral history audio or video archive material; and most sites also provide information on the project background, participants and funding arrangements.

There are many more websites I would love to include in the collection; indeed, many more websites have been nominated for inclusion, but the Web Archive team is awaiting permission from the website owners to include them. We'll carry on nominating sites for inclusion, but we welcome nominations from the public as well - if you think there is an important UK oral history website that is not being included in the UK Web Archive at the moment, please contact the Web Archive team.

02 January 2013

Slavery and Abolition in the Caribbean: a new special collection

[A guest post by Dr Philip Hatfield, Curator for Canadian and Caribbean Studies at the British Library.]

Back in July I added a short post to this blog about the first stages of selecting material for the UK Web Archive Special Collection, ‘Slavery and Abolition in the Caribbean’. Now, after much trawling of the web and selection of sites, and brilliant work from my colleagues from the UK Web Archive (whose determination and technical wizardry know no bounds) I’m delighted to say that the first iteration is now live for public use. You can access it here, and I hope you find it of use.

Before I go, though, some further thoughts about web archiving in the context of this collection. The first thing to note is how important this kind of work is for maintaining a record of not just the Web but writing, publishing and commemoration more generally in the early 21st century. There are many websites and pages produced during the 2007 bicentenary of the abolition of the slave trade that have either disappeared or no longer have a contactable administrator who can grant the Web Archive rights to collect and display the site. And so, valuable resources for understanding the UK’s engagement with the history of slavery and the politics of remembrance are lost to us.

Following on from this, it is impossible to overstate the importance of permissions to the construction of viable collections within the UK Web Archive. Permissions allow sites to be archived and made available to the public and are key to providing a comprehensive research resource. Without them, a collection may not reflect completely the selections of the curator or material that is live on the Web, which is partly the case with the ‘Slavery and Abolition in the Caribbean’ collection. We’ll keep trying with those sites for which we have not yet got permission, but I am very grateful to those institutions and individuals who have taken the time to consider our request and grant permission.

Highlighting these problems brings me to my main point: this is an evolving collection driven by the need to continue to archive what already exists on the Web and also relevant sites created in future. This is where, hopefully, readers of this post and users of the collection come in. I hope the process of building and maintaining this collection can become a dialogue between users, myself and the UKWA. If you know or moderate a site you think should be part of the collection please do get in touch with me, at [email protected].

[The image is part of a plan of the slave ship Brookes, found in various archived sites, including that of Brycchan Carey. Originally from Thomas Clarkson’s ‘The history of the rise, progress and accomplishment of the abolition of the African slave trade by the British Parliament’ (BL Shelfmark: 522.f.23).]

19 December 2012

Digital Humanities and the Study of the Web and Web Archives

In early December 2012, I attended a PhD seminar on Digital Humanities and the Study of the Web and Web Archives. It was organised by netLab, a research project for the study of Internet materials affiliated to the Centre for Internet Studies, Aarhus University, Denmark.

Eighteen PhD candidates from different parts of the world attended the seminar. They are all at different stages of their research but together represent a new generation of researchers who have embraced the Internet to study society and culture as “it holds the most multifaceted material documenting contemporary social, cultural and political life”, in the words of the organisers. The workshop drew specific attention to web archives. This is not surprising, as Niels Ole Finneman and Niels Brugger, organisers of the seminar, were not only closely involved in the conception and development of the Danish National Web Archive (NetArchive.dk), but also use web archives as a key source in their own research into the history of the Internet. The purpose of the workshop was two-fold: to explore relevant digital research tools and methods, and to introduce the students to web archives as a corpus for research, including their characteristics and their analytical and methodological consequences.

Presentations from the students painted a diverse picture of research topics and disciplines. Things that struck me included the already creative use of various digital research methods as well as the (almost indispensable) role of social networks such as Twitter and Facebook. Adrian Bertoli of the University of Copenhagen, for example, who studies the online diabetes community, is also investigating how that community relates online to medical professionals, pharmaceutical companies and governments. He produced a hyperlink map to illustrate the interactions between the various actors. Another example is Jacob Ørmen, also of the University of Copenhagen, who investigates the interplay between established media and social media in the coverage of worldwide “media events”, such as the Diamond Jubilee or the 2012 London Olympics, where social media data about the events would be fundamental to the research.

Over time, users of web archives such as those at the seminar are likely to need, more and more, the means to collect or assemble individual research corpora. From our point of view, that of a web archiving service provider whose main users are academic researchers, broad national web archive collections, which often have only limited accessibility for legal and technical reasons, may not meet the dispersed needs of individual researchers, and risk providing a “one-size fits nobody” solution. Archiving and providing access to individual historical web resources is the basic “must-have” of a web archive. To add value beyond that, we should think about collecting and storing those web resources in a way that allows individual researchers to organise and then continually reassemble their own research corpora. We also need to provide the tools for processing and manipulating them using various digital methods.

One of the difficulties in studying web archives highlighted by Niels Brugger is the problematic interoperability between web archives with different scopes and geographical coverage. What we need is a research infrastructure which is capable of supporting the study of the history of the Internet across web archives in different countries, collected using different principles and with content in different languages. There is a funding bid under consideration by the EU to develop this.

Helen Hockx-Yu, December 2012  

 

04 December 2012

Capturing the police authorities

For almost half a century, Police Authorities in England and Wales fulfilled their role of ensuring that the public had an efficient and effective local police force. This system was, however, replaced by a single elected individual (a Police & Crime Commissioner) following the Police Reform and Social Responsibility Act 2011.

Thursday 15th November saw elections for the new Police and Crime Commissioners in the 41 police force areas in England and Wales outside London (The Mayor of London, Boris Johnson, has since January held the equivalent role over the Metropolitan Police Force).

We in the British Library Web Archiving Team were concerned that with the abolition of the Police Authorities and the disappearance of their websites significant documentary material would be lost. Information on the Authority websites typically includes annual reports, statements of accounts, policing plans, public consultations, strategy and delivery plans and newsletters, all of which serve to inform the public of the work of the Authorities and to enable Authority members to scrutinise the constabulary and hold the Chief Constable accountable.

In light of this, we contacted the Police Authorities asking for permission to archive their current websites before they were replaced by the PCCs on 20 November. Some Authorities responded immediately whereas others required further information and (after a little bit of chasing) we received a 100% positive response rate. This is certainly something to be pleased about, as the usual response rate is between 25 and 30%, and so for the first time we have been able to capture a nationwide administrative change comprehensively.

Between two and four snapshots of each website have been taken and reviewed individually for quality and completeness before being submitted to the archive. Typical issues included the need to add supplementary seeds to capture linked documents and style sheets external to the host server; applying filters to prevent crawler traps; and probing crawl logs to identify the reasons for missing content. The final snapshots were taken on 20th November in case of any last-minute changes. See the whole collection.
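
To illustrate the kind of filter mentioned above, here is a minimal sketch in Python. It is not the configuration we used for these crawls: the function name `looks_like_crawler_trap`, both thresholds and the example URLs are assumptions chosen for illustration. It flags URLs whose path is unusually deep or repeats the same segment several times, two common signs of a crawler trap.

```python
from collections import Counter
from urllib.parse import urlsplit


def looks_like_crawler_trap(url, max_repeats=3, max_depth=12):
    """Heuristic check: True if the URL path looks like a crawler trap.

    Both thresholds are illustrative assumptions, not values from a real crawl.
    """
    # Split the path into its non-empty segments.
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) > max_depth:
        return True
    # A segment repeated many times often signals a looping relative link.
    counts = Counter(segments)
    return any(n >= max_repeats for n in counts.values())


# Hypothetical candidate URLs; only the first should survive the filter.
candidates = [
    "http://www.example.org/reports/2012/annual-report.pdf",
    "http://www.example.org/events/events/events/events/index.html",
]
kept = [u for u in candidates if not looks_like_crawler_trap(u)]
print(kept)
```

A heuristic like this will occasionally reject legitimate deep paths, which is one reason the snapshots still need to be reviewed by hand.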

29 November 2012

Monarchy and New Media: bookings open

Bookings are now open for this one-day conference, in London, on Thursday 7 February 2013. We at the UK Web Archive are joint organisers, with the Institute of Historical Research (University of London), and the Royal Archives.

The end of the Diamond Jubilee year affords an opportunity to look back and examine a neglected aspect of the history of the monarchy: the engagement with new forms of media. The event will include reflections on royal engagement with successive new technologies: telegraphy, radio, newsreel and television.

The event will also see the formal launch of our own jubilee collection, with reflections on our experience of creating it in collaboration with the Royal Archives and the IHR, and one historian’s engagement with the collection itself (our very own Peter Webster).

Booking costs a very reasonable £10, and further details, including a programme and a booking form, may be found here.

22 November 2012

Upgrading the Wayback Machine

We're very shortly to upgrade our deployment of the Open Source Wayback Machine, the software made openly available by the Internet Archive to enable browsing of timed snapshots of an archived site. (See it in action in the UK Web Archive.) We're deploying a new version made available by the Internet Archive earlier this year.

Users will immediately see some enhancements. The banner at the top will now include more information about the number of instances of each site that are available, and an easier way of navigating between them. The information will be available in Welsh, in recognition of the remit of the archive for the whole of the UK; and there's also a handy Help link. For now, however, it will no longer be possible to minimise the banner and then reveal it again; it will be necessary to reload the page to see the banner once minimised.
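
For readers curious about what the banner has to work out, the following is a rough sketch in Python, not the Wayback Machine's own code. It assumes captures are identified by 14-digit timestamps (YYYYMMDDhhmmss), the form used in Wayback URLs; the function name `banner_info` and the sample timestamps are hypothetical.

```python
from bisect import bisect_left, bisect_right


def banner_info(timestamps, current):
    """Return the capture count plus the captures either side of `current`.

    `timestamps` are 14-digit strings (YYYYMMDDhhmmss), which sort
    chronologically when sorted as plain strings.
    """
    ordered = sorted(set(timestamps))
    left = bisect_left(ordered, current)    # index of first capture >= current
    right = bisect_right(ordered, current)  # index of first capture > current
    previous = ordered[left - 1] if left > 0 else None
    following = ordered[right] if right < len(ordered) else None
    return {"count": len(ordered), "previous": previous, "next": following}


# Hypothetical captures of one site.
captures = ["20120301120000", "20120615093000", "20121120180000"]
info = banner_info(captures, "20120615093000")
print(info)
```

Keeping the list sorted means the previous and next instances can be found with a binary search rather than a scan, which matters for sites with many captures.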

Behind the scenes, the new version reads directly from our Hadoop Distributed File System (HDFS) which is more cost-effective, simpler to administer, more robust, and easier to scale up to cater for growing levels of usage.

15 November 2012

Non-Print Legal Deposit Regulations 2013: what will they say?

Next year we anticipate that regulations will come into force which provide for legal deposit for non-print works, mirroring the longstanding situation for printed works. The final draft of the regulations, to be laid before Parliament, has recently been published. Here's a summary of their impact in relation to web archiving.

What will we be collecting?

The regulations cover four deposit models, of which the most relevant for the web archiving team are that:

(i) a deposit library may copy UK publications from the open web, including websites, plus open access journals and books, government publications etc.;
(ii) a deposit library may collect other password-protected material by harvesting, subject to giving at least one month’s written notice for the publisher to provide access credentials (with some limited exemptions).

The regulations apply to any digital or other non-print publication, except:

(i) film and recorded sound where the audio-visual content predominates [but, for example, web pages containing video clips alongside text or images are within scope];
(ii) private intranets and emails;
(iii) personal data in social networking sites or that are only available to restricted groups.

The regulations apply to online publications:

(i) that are issued from a .uk or other UK geographic top-level domain; or
(ii) where part of the publishing process takes place in the UK;
(iii) but excluding any which are only accessible to audiences outside the UK.
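
The three territorial tests above can be sketched as a simple decision function. This is an illustration of the summary in this post, not of the regulations' legal wording: the class `Publication`, its field names and the example URLs are assumptions, and only `.uk` is listed because the post does not name the other UK geographic top-level domains.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Assumption: only ".uk" is listed here; any other UK geographic
# top-level domains would need adding to this tuple.
UK_SUFFIXES = (".uk",)


@dataclass
class Publication:
    url: str
    published_in_uk: bool = False   # part of the publishing process is in the UK
    only_outside_uk: bool = False   # accessible only to audiences outside the UK


def in_territorial_scope(pub):
    """Apply the three territorial tests summarised above."""
    if pub.only_outside_uk:
        return False  # test (iii): an exclusion that overrides the others
    host = urlsplit(pub.url).hostname or ""
    issued_from_uk_domain = host.endswith(UK_SUFFIXES)   # test (i)
    return issued_from_uk_domain or pub.published_in_uk  # test (i) or (ii)


print(in_territorial_scope(Publication("http://www.example.co.uk/")))
print(in_territorial_scope(Publication("http://www.example.com/", published_in_uk=True)))
print(in_territorial_scope(Publication("http://www.example.org.uk/", only_outside_uk=True)))
```

In practice test (ii) is the hard part, since where a site is published cannot be read from its URL and has to be established some other way.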

What will the Library be able to do with it?

Deposited material may not be used for at least seven days after it is deposited or harvested.

After that, deposit libraries may:

(i) transfer, lend, copy and share deposited material with each other;
(ii) use deposited material for their own research;
(iii) copy deposited material, including in different formats, for preservation.

What will users be able to do with it?

Users may only access deposited material while on “library premises controlled by a deposit library”.

Users may only print one copy of a restricted amount of any deposited material, for non-commercial research or other defined ‘fair dealing’ purposes such as court proceedings, statutory enquiry, criticism and review or journalism.

No more than one user in each deposit library may access the same material at the same time.

Users may not make any digital copies, except by specific and explicit licence of the publisher.
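
The rule that no more than one user in each deposit library may access the same material at once can be pictured as a simple lock per library and item. The sketch below is a thought experiment in Python, not a description of any access system the deposit libraries run; the class `AccessDesk`, its method names and the identifiers used are all hypothetical.

```python
class AccessDesk:
    """Tracks which items are currently open in which deposit library."""

    def __init__(self):
        # Each entry is a (library, item_id) pair currently in use.
        self._in_use = set()

    def open_item(self, library, item_id):
        """Return True if access is granted, False if the item is already open there."""
        key = (library, item_id)
        if key in self._in_use:
            return False
        self._in_use.add(key)
        return True

    def close_item(self, library, item_id):
        """Release the item so another reader in the same library can open it."""
        self._in_use.discard((library, item_id))


desk = AccessDesk()
first = desk.open_item("library-a", "item-1")    # first reader is admitted
second = desk.open_item("library-a", "item-1")   # second reader, same library, is refused
other = desk.open_item("library-b", "item-1")    # a different library is unaffected
print(first, second, other)
```

The lock is keyed on the library as well as the item because the limit applies within each deposit library, not across all of them.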

What restrictions may publishers request?

The publisher or other rights holders may request at any time an embargo of up to three years, and may renew such a request as many times as necessary. The requested embargo must be granted if the deposit library is satisfied on reasonable grounds that providing access would conflict with the publisher’s or rights holders’ normal exploitation of the work and unreasonably prejudice the legitimate interests of the publisher.

These conditions remain in force forever, including after all intellectual property rights in the deposited material have expired [“perpetual copyright”].