THE BRITISH LIBRARY

Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

13 August 2015

Fin; or reflections on thirty months of Digital Research

Thirty months ago I joined the British Library Digital Research Team. In that time we (often with the folks from British Library Labs) have achieved a huge amount, not least putting over one million public domain images on Flickr, developing our internal training provision, and repurposing British Library collections to enrich the education and outlook of computer science and game design students. This week I say goodbye.

The Digital Research Team was created in 2010 with a broad mission that covers everything from enabling computational analysis of large-scale digitised collections and creative reuse of openly licensed collections to advocacy of clear data citation and digital skills training. I have often summed up our role by saying that we are here to ensure that the British Library's digital collections are used in ways that go beyond looking at them on a webpage: an open, data- and creativity-orientated approach that is at the forefront of the British Library's vision.

I came to the team from academia and a background in studying long eighteenth-century satirical prints. My data was small and my perspectives narrow, but my eyes, ears, and mind were open. And they needed to be, for in my first month in the job the British Library celebrated enhanced powers to collect non-print materials published in the UK. In effect this meant that this library of around 170 million things had the power to collect the UK web domain. Since then the library has collected over 2 billion web pages, fundamentally changing our collection profile (see the UK Web Archive blog for more) and making the British Library a place full of data as much as books. Even the beloved manuscript, I soon learnt, was not 'safe' from the bitstream: also changing our collection profile was the small but growing volume of floppy disks, CD-ROMs, hard drives, and email archives that are the archives of life in the 'Information Age'. And these personal digital archives are more than just collections of 'proper' born-digital documents typed up on personal computers: they include software, browser caches, spam, and downloads folders. In fact they include every bit on every disk: captures of whole computing environments that can be booted up to offer an experiential window into a person's interaction with their machine.

I say 'can', but in most cases they aren't. For as unpublished material these archives, like their paper counterparts, can only be made available to readers once we are sure we have complied with things like The Data Protection Act, a time-consuming process that requires people to examine each and every digital object. This clash of possibilities speaks to two overarching themes of my thirty months with the Digital Research Team. The first is the gap that often appears between well-thought-out established practice and the demands of large and/or complex digital collections: in the case of born-digital manuscript collections, responsibilities to both readers and depositors compete when faced with hundreds of thousands of files. The second is the important - but often forgotten - role of decisions made by people in the creation, management, and marshalling of large and/or complex digital collections. This role may be self-evident. But data does tend to flatten and depersonalise. And interfaces to data tend to emphasise those qualities in their haste to ensure that experiences are smooth, that tensions recede from view. As someone trained to trace the provenance of evidence and to examine the role of agency and power in humanistic phenomena, I see it as important to put the personal back into our use of data. Why? Well, when you search Explore the British Library and Google Books you don't just search databases of 56 million things and over 30 million books respectively, rather you search accumulations of human labour, expertise, and decision making shaped (and constrained) by local, temporal, and organisational priorities and worldviews. When you browse Wikipedia, Wikimedia Commons, or Wikisource you rely on the production of human labour mediated through community guidelines and practices that - perhaps inevitably - introduce prejudices. When you use any computational process to take data in and push data out, the bit in the middle isn't the work of a machine but the work of people instructing a machine, people - as Mia Ridge, Ramon Amaro and the Software Sustainability Institute, among others, remind us - with opinions, perspectives, fears, and dreams. And when you seek solace in a standard, you seek solace in something that, as a product of human agency, can never wholly be neutral.

This may all sound a bit negative. But my point is that many of the achievements of the Digital Research Team stem from this sort of thinking, an approach that is deeply critical of techno-evangelist perspectives on the role of digital collections, methods, and approaches in society and culture. We don't assume that digital technology is the solution but rather that an approach that sees people using digital technology is one solution among many possible solutions. My job over the last thirty months has been to collaborate with amazing people both inside and outside the British Library to choose the right solutions. As I move to a new position outside the British Library, I look forward to seeing the fruits of these and future decisions appear on the Digital Scholarship Blog.

James Baker -- Curator, Digital Research -- @j_w_baker

05 August 2015

Crowdsourcing as Interesting Decisions: Update from BL Labs 2015 Competition Winner

Posted by Mahendra Mahey (BL Labs Manager) on behalf of Adam Crymble. Adam Crymble, a Lecturer in Digital History at the University of Hertfordshire and one of the winners of the 2015 British Library Labs competition, describes the current progress of his project, ‘Mechanical Curator Arcade’.

When I was nine years old my friend Robbie and I spent an inordinate amount of time in the local video game arcade, and far more money than either of us would like to admit. We watched enviously as the teenagers hogged the Street Fighter II machine near the entrance. Robbie and I retreated deeper into the arcade, where we found a favourite in The Simpsons Arcade Game.

 

We even beat it once.

As for many children of the 1970s, 80s, and 90s, video games were a staple of our formative years. Many of us have developed a superhuman ability to stare at screens for long periods without blinking. We know instinctively that there is something behind this wall, and that some combination of buttons will help us discover it:

Gaming_wall

But how many of us know why a game is fun? I only recently began to ask myself that question, and I came across a quote attributed to renowned video game maker Sid Meier, the creator of the Civilization franchise. Meier noted that 'a game is a series of interesting choices'.

Not everyone agrees with that definition, but it's a surprisingly simple and astute observation. Games lay down a series of rules - they generate the conditions of a virtual universe. We learn the rules, and our challenge is to win the game by making choices that lead us through that world, to victory.

But a game is about more than just choices. A game is about losing. Or at least, the threat of losing. If we make the wrong choice - jump on a prickly enemy, for example - we're punished. We die.

This revelation has been important for me, because for the past few months I've been trying to make crowdsourcing fun. Crowdsourcing is an increasingly common practice amongst historians, whereby a simple but repetitive task - such as transcription or tagging a huge set of images - is shared across a large number of volunteers. It adheres to the adage, 'many hands make light work'. Like games, crowdsourcing is inherently about choices. Depending on the task, the volunteer makes a choice. If they're transcribing handwritten documents, they have to decide what word they see on the screen. If they're asked to tag a historic image, they have to decide the appropriate tag.

In order to make crowdsourcing more fun, some projects have attempted to offer a series of incentives. High scores and leaderboards are popular now in 'gamified' crowdsourcing experiences. But I've yet to come across a crowdsourcing game in which you can REALLY lose. It's all carrot, and no stick, and that's why it's no fun.

Counterintuitive, perhaps, but once you hit the age of 5 and your competitive streak kicks in, it's the threat of losing that makes you want to win. And this is where crowdsourcing faces its biggest challenge if we want users to have a 'fun' experience. Because in order for you to lose, the maker of the game needs to know when you've done something wrong - when you've broken the rules of the virtual universe. That's easy enough for Super Mario, because the game is programmed to check when you've bumped into a bad guy, or fallen down a hole. But in crowdsourcing, we have no idea if you've given us the right answer - if you've tagged the image correctly, or transcribed the word right. If we knew that, we wouldn't have to ask you to do it in the first place. That means we can't punish you consistently. And it means you won't have fun the minute you realise that. Because at that point, your interesting decisions become meaningless and any correct information you provide comes down to your good will rather than your desire to win.

That's where we currently stand in our efforts to make crowdsourcing fun. It's a big challenge, but it's one I believe someone out there can tackle. So in the spirit of crowdsourcing, we're turning to the crowd, and we're hosting a virtual 'Game Jam' from 4-11 September 2015 to engage with amateur video game makers everywhere who think they've got the answer.

To help them get started with an appropriate crowdsourcing task, we've put together a sample set of these historic images - around 100 to 200 illustrations each of people, music, architecture, flora, fauna and even cycling - along with several hundred images that we know very little about. We thought this might help to validate the results of the crowdsourced content.

The sample link is: http://bl-labs.github.io/arcadeinterface/sample_images.html

An ideal game draws a random image from the set and, through gameplay, the player tells us something about the content of the image. Perhaps they choose from our limited set of tags (flora, fauna, mineral, human portrait, landscape, manmade - e.g. machine, buildings, ship, abstract, artistic, music, map), or gamemakers can opt to be more creative.
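For gamemakers who want a starting point, here is a minimal Python sketch of that loop: draw a random image, take a tag from the player, and only 'punish' a wrong answer when the image is one we already know something about. The image ids, the `SAMPLE` dictionary and both function names are hypothetical, made up for illustration. The tag list is one reading of the list above, treating 'machine, buildings, ship' as examples of 'manmade' rather than as tags in their own right.

```python
import random

# Tag vocabulary as read from the list above (an assumption about how the
# "manmade" examples group; adjust to taste).
TAGS = ["flora", "fauna", "mineral", "human portrait", "landscape",
        "manmade", "abstract", "artistic", "music", "map"]

# Hypothetical sample set: image id -> known tag, or None for the images
# we know very little about.
SAMPLE = {
    "img-001": "flora",
    "img-002": "music",
    "img-003": None,
    "img-004": None,
}


def draw_image(sample, rng=random):
    """Pick a random image id from the sample set."""
    return rng.choice(sorted(sample))


def score_tag(sample, image_id, tag):
    """Return 'win', 'lose' or 'unknown' for a player's chosen tag."""
    if tag not in TAGS:
        raise ValueError(f"unknown tag: {tag!r}")
    known = sample[image_id]
    if known is None:
        # No ground truth for this image, so the game cannot punish.
        return "unknown"
    return "win" if tag == known else "lose"


if __name__ == "__main__":
    image_id = draw_image(SAMPLE)
    print(image_id, score_tag(SAMPLE, image_id, "flora"))
```

Mixing images with a known tag in among the unknown ones is one possible way to give players a real chance of losing. It is a sketch of an idea, not the design we have settled on.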

If we like what we see, we've set aside up to £500 (courtesy of the Andrew Mellon Foundation) to work with someone to polish their game and release it as part of our 'Mechanical Curator Arcade Game', a 1980s-style arcade console that we're planning to install in the British Library this autumn. The Game Jam is open to anyone, but only those over the age of 18 are eligible to work for us.

All completed games (whether they fit the crowdsourcing theme or not) will also be eligible to enter the British Library Labs Awards, with a chance to win an additional £500 in prizes, as long as they use British Library digital content such as the sounds and images from the open collections.

If you're up for the challenge, you can find out more on our Game Jam event page. We're looking forward to working with one of you. Get in touch at labs@bl.uk if you'd like to discuss ideas. We're here to listen and learn.

 

28 July 2015

Update on Political Meetings Mapper - BL Labs Competition Winner 2015

Posted on behalf of Katrina Navickas.
Katrina Navickas, Senior Lecturer in History at the University of Hertfordshire, and one of the winners of the 2015 British Library Labs competition, describes the current progress of her project, ‘Political Meetings Mapper’.

Political Meetings Mapper is a project to build a database, website and interactive map of 19th century political meetings, using the Nineteenth Century Newspapers collection and the Maps collection. The meetings will be plotted on a geo-referenced historic map to show the spatial and temporal patterns of the Chartist movement.

You may have noticed the copies of a historic poster outside the entrance to the British Library, advertising a Chartist meeting. What was Chartism and why is it still relevant to us today?

Charter_newspaper

Chartism was the first mass movement campaigning for the vote in the United Kingdom. The Chartists presented three major petitions to parliament calling for the ‘six points’, which included the vote for all men, the secret ballot, so that we can vote anonymously without bribery, and annual parliaments, so that the people can remove corrupt governments quickly. The Chartists campaigned for the constitutional freedoms that we now hold (and perhaps take for granted) in Britain, and remind us that these rights were hard fought for.

We’re focusing on extracting records of meetings advertised in the Northern Star newspaper from 1838 to 1844 for two reasons:

  1. it was the main Chartist newspaper with a national reach;
  2. it had a regular column each week titled ‘forthcoming Chartist meetings’, which is easy to identify.

The British Library Labs team is working on building in the capability to identify and automatically geo-code the places and parse the dates mentioned in the text.

Current progress

We have redone and checked the Optical Character Recognition for the newspaper columns for 1841 to 1843 – we still need volunteers to check the OCR for the other years in the sample, so let us know if you’re interested in participating.

We have extracted about 4000 meetings and other events so far, and are on track to reach the 5000 mark soon!

Over the past couple of weeks I’ve been focusing on Chartists in London. I’ve learned lots about the history of the capital (I’m a historian of the North of England by trade). I was astounded to find well over 50 different sites in London used regularly for Chartist and trade union meetings. I also expected that the venues would concentrate in the East End and docks, where many of the skilled workers who were most attracted to the Chartist movement lived and worked. Yet having plotted the locations, I’ve found that the Chartists met all over London, including in the centre and in places near to the British Library.

Another surprise was that, regardless of all the urban change that has happened in the capital over the last two hundred years, many of the original pubs still exist, with the same names.

Follow the Chartists around London on 21 September 2015!

Join us for a mystery tour around some of the venues and a re-enactment of a Chartist meeting, to bring the BL 19th century newspaper reports to life! ‘Follow the Chartists round London’ takes place on Monday 21 September, and is free and open to the public.

Participants will learn about the history of Chartism and the London venues, and participate in a re-enactment of a Chartist meeting in the actual pub where it took place nearly two hundred years ago. If you fancy dressing up in costume and pretending to be your democratic ancestor, do let us know. Volunteers welcome!

Programme:

Monday 21 September 2015:

1230 - 1300
Registration
Foyle Suite, Centre for Conservation, British Library

1300 - 1400
Lunch

1400 - 1530
Talks

Dr Katrina Navickas, University of Hertfordshire, ‘the Political Meetings Mapper and the history of Chartism’

Dr Matthew Sangster, University of Birmingham, ‘Romantic London’

British Library, ‘Digital collections at the British Library’

1530 - 1730
A 3km walking tour of Chartist sites in the Kings Cross/St Pancras/Somerstown/Camden area, with readings of reports from the Northern Star newspaper at each site. Sites may include:

  • Prince of Wales Feathers, 8 Warren Street, W1T 5LD
  • Archery Rooms*, 26 Bath Place
  • Tillman's Coffee House*, 59 Tottenham Court Road
  • Two Chairmen, 31-32 Dean Street, W1D 3SB
  • Three Crowns*, Richmond Street
  • Three Doves**, 24 Berwick Street, W1V 3RF
  • Red Lion Pub, 14 Kingly Street, W1B 5PR

*doesn't exist anymore
**now an art stationery shop 

1730 - 1830
The walking tour will end at a pub where our group will get a drink. The room will be prepared for a re-enactment of a Chartist meeting that occurred in the pub, beginning at 1800. The meeting will end with the audience voting on various resolutions, followed by some food.

Participants are welcome to continue their discussions into the evening.

Click here for more information about booking.