UK Web Archive blog

Introduction

News and views from the British Library’s web archiving team and guests. Posts about the public UK Web Archive, and since April 2013, about web archiving as part of non-print legal deposit. Editor-in-chief: Jason Webber.

14 June 2012

Crowdsourcing and Web Archiving

There has been a long history of members of the public acting as volunteers to refine, enhance and improve the collections of cultural heritage institutions for the benefit of others. Crowdsourcing can be seen as a continuation of this tradition.

The term crowdsourcing can be problematic as it is not necessarily about massive numbers of people or about outsourcing labour but rather about inviting participation from interested and engaged members of the public.

A workshop at the IIPC General Assembly in Washington DC in May 2012 addressed issues around applying crowdsourcing to web archiving. A paper by Trevor Owens, entitled “The Crowd & the Library – the Agony and the Ecstasy of ‘Crowdsourcing our Cultural Heritage’”, was used as a framework for the workshop, and a number of use case scenarios were evaluated by participants on the day.

Several key observations emerged from the discussion. Engaging ‘the crowd’ in web archiving has advantages and disadvantages, both for the institutions running the initiatives and for the members of the public involved.

There may be sensitivity around areas where there is already professional expertise within the organisation (e.g. cataloguing). It is important to design the project in such a way that the crowd and the expert each do what they are best at. Advanced users and regular users should be given different tasks, fully utilising the wisdom of the crowd.

Humans are capable of processing information and making judgements in ways that computers cannot. It is a waste of time to ask the public to do tasks that a computer can do.

Putting the right tools in place will magnify the user’s effort by making it easier to accomplish tasks. There is often a trade-off between richer functionality on a crowdsourcing website and barriers to participation. Requiring users to log in, for example, makes it possible to store information and offer personalised services, but being able to start immediately without logging in is appealing.

It was pointed out that people feel motivated by doing something that matters to them and get a sense of belonging to something bigger than themselves. Crowdsourcing should be engaging, especially when users are asked to carry out repetitive tasks. It is important to provide feedback to users on how they are doing and how their contribution is furthering the overall progress of the project. This helps to keep users engaged.

Key challenges include devising an appropriate project and attracting an audience sizable enough to participate in the work. It was felt that crowdsourcing within web archiving would suit smaller, discrete projects rather than ongoing, open-ended challenges. Suggested areas for crowd involvement include elements of the web archiving workflow such as identifying websites, quality assurance and cataloguing.

 

28 May 2012

Associations and Citizenship: Researching the ‘Big Society’ on the Web

The following is a guest post by Tom Hulme, a final-year PhD researcher at the Centre for Urban History, University of Leicester. His research interests are in associational culture, citizenship, local government, and education in the interwar period, focusing on Manchester and Chicago. He has further interests in contemporary civil society and citizenship. He is also currently undertaking an internship at the British Library in the Social Sciences department.

 

With the 2010 Coalition Government’s vision for a ‘Big Society’ of volunteers, supporting or even running public services, the need for researchers to study associations and civil society has never been more pressing. Volunteering, after all, makes up a large part of the social picture of Britain in the twenty-first century; a 2003 Home Office citizenship survey, for example, found that almost a third of people were engaging in ‘active community participation’, a trend that seems to be rising rather than falling. I am interested in examining the organisations that contribute to this culture, especially the ways in which they contribute to the imagined or real vision of a ‘community’ of citizens. What are their motivations and methods? How do they interact with local and national government? Do volunteering and associating with others for the ‘civic good’ create better citizens?

 

Some of the best known associations are, of course, the biggest and longest running. They are the bedrock of civil society, performing duties that have a never-ending purpose: educating the citizenry, fundraising, and acting as local, national, and even international pressure groups. Charities like the Royal Society for the Prevention of Accidents (1917), or non-governmental organisations like Amnesty International (1961), are still operating today because the problems they tackle are probably interminable. Giants of civil society like these can, however, obscure the myriad of local associations that have fleeting appearances, living for just a couple of years, before fading and sometimes vanishing altogether.

 

For the study of this type of parochial civil society, the web archive is an indubitable boon. Pressure groups have long realised the importance of web interaction for engagement and campaigns, and there is a wide variety of such sites in the Web Archive, like the Bengali Cultural Association of London, the Fernhurst Society in West Sussex, or the Stop Norwich Urbanisation Blog. Websites like these are a treasure-trove of information and opinion; preserving them gives us a vital glimpse into the way that civil society ‘works’. Not just for contemporary researchers but for the historians of the future, it is vital that these small windows into civic life are recorded and maintained for generations to come. After all, their disappearance from the ‘real web’ can tell us as much about their purpose and operation as it does about their demise.

11 May 2012

Scholarly value of the UK Web Archive? (correction)

Tell us what you think about the UK Web Archive

If you are a postgraduate researcher or a university lecturer we would like to get your feedback on the research value of the UK Web Archive. It doesn’t even matter if you have already used the archive or not.

We have commissioned an independent research agency – IRN Research – to gather information on the needs of archive users and potential users. If you would like to help shape the future development of the archive please register your interest.

In the next few weeks you will be contacted by a researcher and emailed an online facilitated walkthrough of the archive which will explain how the site works in just a few minutes. Using this walkthrough, you will be asked to answer questions about the content, functions, and tools available and about your interest in, and likely use of, the archive.

All your answers will be treated in the strictest confidence and all those taking part will have the chance to win one of a number of £20 book tokens.

To take part in the research, please register.

16 April 2012

Scholarly value of the UK Web Archive?

Tell us what you think about the UK Web Archive

If you are a postgraduate researcher or a university lecturer we would like to get your feedback on the research value of the UK Web Archive. It doesn’t even matter if you have already used the archive or not.

We have commissioned an independent research agency – IRN Research – to gather information on the needs of archive users and potential users. If you would like to help shape the future development of the archive please register your interest.

In the next few weeks you will be contacted by a researcher and emailed an online facilitated walkthrough of the archive which will explain how the site works in just a few minutes. Using this walkthrough, you will be asked to answer questions about the content, functions, and tools available and about your interest in, and likely use of, the archive.

All your answers will be treated in the strictest confidence and all those taking part will have the chance to win one of a number of £20 book tokens.

To take part in the research, please register.

13 April 2012

Improved search functionality

We've recently implemented some changes to our search functionality in the UK Web Archive, particularly for full text searching.

We first enabled full text searching in the web archive a few years ago. This was a great leap forwards from title searches alone, but it was often time-consuming to wade through the results. We harvest sites on a recurring basis, so the search results often contained a lot of 'noise' and duplicates, as the same page often appeared several times over.

Search results are now grouped by domain, making it easier to see immediately which websites contain references to the search term(s) and to identify the context in which the search term appears. Within each domain's results we group URLs by date. This eliminates duplicate entries in results but still provides temporal access when there is more than one instance captured.
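As a rough illustration only (this is not the archive's actual code), the grouping described above can be sketched in a few lines of Python. The `group_results` function, the `(url, capture_date)` hit format and the example URLs are all assumptions made for the sketch.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_results(hits):
    """Group (url, capture_date) hits by domain, then by URL.

    Hypothetical helper: repeated captures of the same URL collapse into
    one entry that keeps a sorted list of its distinct capture dates.
    """
    grouped = defaultdict(lambda: defaultdict(set))
    for url, capture_date in hits:
        domain = urlsplit(url).netloc.lower()
        grouped[domain][url].add(capture_date)
    return {
        domain: {url: sorted(dates) for url, dates in urls.items()}
        for domain, urls in grouped.items()
    }


# Made-up hits: the same page harvested more than once, plus another site.
hits = [
    ("http://example.org.uk/news", "2011-03-01"),
    ("http://example.org.uk/news", "2011-09-01"),
    ("http://example.org.uk/news", "2011-03-01"),
    ("http://another.example.co.uk/", "2010-06-15"),
]
result = group_results(hits)
```

Each URL appears once per domain, with its capture dates kept alongside it, so duplicates disappear from the listing without losing access to the different instances.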

We have improved our content type filter, making it quicker and easier to filter by content type(s). Search results are now grouped by content type, separating 'documents' from 'images' and 'multimedia', in recognition of the fact that people will often be searching for a specific type of content. This is still in development and we know that it doesn't always work perfectly - images can appear in the documents tab when they are served from a single HTML page, for example. We're keen to hear from people about this feature, and whether they think it's useful.
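To make the idea concrete, here is a minimal sketch of one way results could be sorted into those three tabs. It assumes, purely for illustration, that each result carries a MIME type and that the tab is chosen from it; the `content_tab` function is hypothetical and the archive's real filter may work quite differently.

```python
def content_tab(mime_type):
    """Pick a results tab from a MIME type string (hypothetical rule)."""
    # Drop parameters such as "; charset=utf-8" and keep the main type.
    main_type = mime_type.split(";")[0].strip().lower().split("/")[0]
    if main_type == "image":
        return "images"
    if main_type in ("audio", "video"):
        return "multimedia"
    # Everything else (HTML, PDF, plain text, ...) counts as a document.
    return "documents"
```

Under this assumption, a result indexed as an HTML page lands in the documents tab whatever it displays, which is one plausible way an image could end up there.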

We've also started to roll out some social media integration. It's now easy to share any of the resources in the search results, using the links provided under each one.

And finally, you can now use the Advanced Search tab to filter by archiving organisation. For example, if you're only interested in sites archived by the Wellcome Library, you can specify this prior to running the search. Only sites selected by this institution will then be included in your search results.

We've lots more development planned over the next few years. If there are any particular features or functionality that you'd like to see, please do get in touch.

16 March 2012

Notice: Planned outage

Between Friday the 23rd and Sunday the 25th of March, we are upgrading the technical infrastructure that the UK Web Archive services rely upon.

To perform this upgrade, a short break in service is required. The UK Web Archive will be unavailable for part or all of this period. We will be back up and running on Monday the 26th of March.

We apologise if this causes you any inconvenience.

13 March 2012

Public Consultation on non-print legal deposit

The British Library has issued a Press Release on the Consultation on the Draft Legal Deposit Libraries (non-print works) Regulations 2013, recognising the importance of the legislation for web archive collections.

Since the introduction of the Legal Deposit Libraries Act in 2003, the Legal Deposit Libraries have been working with the Government and publishers on the necessary regulations to allow the collection of digital material published in the UK, online and offline. Without these regulations, a great deal of digital information about UK life and records of major events of the 21st century are at risk of being lost to the 'digital black hole'.

The regulations are designed to 'ensure the Legal Deposit Libraries provide a national archive of the UK’s non-print published material'. This includes websites and would enable the web archive to begin comprehensive crawls of the UK domain, made accessible from the reading rooms of the Legal Deposit Libraries.

The consultation is open until May 18th.

01 March 2012

A Note on Nominations

Did you know that anyone can nominate websites for the UK Web Archive? We're exploring different ways to make it easier for people to nominate websites.

For several years now we've had a public nominations form on our website. However, we know that filling in a form can be a little daunting sometimes, even when it's only small and especially if you've not much time. So for the past few weeks we've been looking into additional options for accepting or submitting nominations.

Yesterday, we ran a small experiment using Twitter and invited followers to simply tweet the details of their nomination to @ukwebarchive. Our reasoning was simple: 

  1. It's very, very easy to share a link on Twitter
  2. So many people and organisations are already on Twitter and regularly share links with their followers
  3. It's fairly easy for us to monitor nominations coming in this way

We tweeted several times throughout the day about this and are pleased with the response. We had a small number of nominations on the day and several retweets, reaching a wider audience than our followers alone. It will be interesting to see if nominations continue to be tweeted when we aren't actively encouraging them. We need to evaluate the day in more detail, particularly with regard to how (and when) we respond, the types of nominations we receive, and how we can factor this into our current workflow. At the moment though, it's certainly worth more investigation.

We're also thinking about producing a browser plug-in that would automatically populate a small number of fields with details of the site people are visiting, and submit them directly to us as a packaged nomination. This needs further thought, but we'd be interested to hear from people who'd like to use a plug-in like this.

Finally, we're planning to overhaul the nominations form on the UK Web Archive website. This will make sure we're only asking people for information we really need, and which will help us to better assess their nomination.

So why not drop us a line, or a tweet, with your nomination? Alternatively, if you have any other ideas on how you'd like to nominate sites, why not leave a comment below? We're always happy to hear new suggestions.