Digital scholarship blog

Enabling innovative research with British Library digital collections


Tracking exciting developments at the intersection of libraries, scholarship and technology.

18 April 2016

Diamonds are forever. What about research data?

This is a guest post by Dr Kirstie Hewlett, Project Support Officer for the THOR project.

The British Library’s Head of Digital Scholarship Adam Farquhar recently delivered a keynote at Our Digital Future. This conference, which was held in Cambridge on 14–15 March 2016, addressed challenges in long-term preservation and archiving of digital data across a wide range of disciplines. His talk, titled “Diamonds are forever. What about research data?”, looked at some of the challenges to re-using data in the future and the fracture lines in the scholarly record between articles and data. In this talk, Adam considered ways to close these gaps and identified recent technical developments that may help researchers without radically changing their workflows. As Adam proposed, the emerging network of services, catalysed through the EU-funded THOR project coordinated by the Library, promises to help researchers get appropriate credit for the additional work that they do to make data re-usable now and in the future.

The presentation is online at:


15 April 2016

The Georgian Pingbacks Project

Posted by Mahendra Mahey, Manager of BL Labs on behalf of Dr. Melodee Beals, Lecturer in Digital History, Department of Politics, History and International Relations, Loughborough University.

Georgian Pingbacks

In the wild west of the World Wide Web, if you compose a hilarious joke, provide a simple solution to a complex problem or break a major new story, it is almost certain that your work will be copied. Although intellectual property laws exist, they are inconsistently enforced because of the sheer number of sites where reposting occurs - a number that increases with each passing second. If you are lucky, and your re-poster is honest, you may discover how far your ideas have spread through a pingback, an automatically generated comment on your original blog post with a link to its reprint.

In the nineteenth century, reprinting, especially unauthorised reprinting, was the backbone of Atlantic journalism but, unlike modern bloggers, these authors had no effective means of discovering the fate of their quips or queries, except through chance encounters with competing papers or their readers. Although concerns of commercial losses are long past, this lack of attribution continues to plague researchers working with newspapers. Without a precise date of composition or of original publication, and without a specific or even a corporate author, the provenance of these texts remains frustratingly uncertain.

One solution to this problem is to track reprinting through text-matching. Using plagiarism detection software, we can carefully reconnect different versions appearing in a wide range of publications. Yet, however efficient our text-matching processes become, two major problems remain.

First, text-matching requires machine-readable versions of the articles: electronic texts rather than images. While the sheer number of historical newspapers that have been digitised is impressive, the number that have high-quality, searchable text is deceptively limited. Many community sites have uploaded images of their physical or microfilm archives but do not have the resources to create fully searchable transcriptions. Others, created by state or commercial providers, have relied upon optical character recognition (OCR), the accuracy of which is subject to wild variations. Even when OCR texts are excellent, these represent a considerable investment to providers and often remain locked behind subscription fees.
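To give a feel for what text-matching involves, here is a minimal sketch in Python. It is not how Copyfind or any other plagiarism detection tool works internally; it simply assumes that two articles sharing many identical five-word phrases are probably versions of the same text. The function names, the phrase length and the sample sentences are all invented for illustration.

```python
import re


def shingles(text, n=5):
    """Return the set of overlapping n-word phrases found in the text."""
    # Lower-case and keep only letters and digits, so that punctuation
    # and capitalisation differences between printings are ignored.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_score(a, b, n=5):
    """Share of the shorter text's phrases that also appear in the other text."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / min(len(sa), len(sb))


# Invented sample texts, not taken from any newspaper collection.
original = ("The ship arrived at Liverpool on Tuesday last with a cargo "
            "of cotton and tobacco from Charleston.")
reprint = ("We learn that the ship arrived at Liverpool on Tuesday last "
           "with a cargo of cotton, and tobacco.")
unrelated = ("Parliament met yesterday to debate the price of corn and the "
             "state of the poor laws in the northern counties.")

print(overlap_score(original, reprint))    # high: most phrases are shared
print(overlap_score(original, unrelated))  # no shared five-word phrases
```

A score near 1 suggests a reprint and a score of 0 suggests unrelated texts. Note that a single OCR error inside a phrase breaks every five-word window that contains it, which is one reason transcription quality matters so much for this kind of work.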

Reprints within the British Library's 19th Century Newspaper Database, 1818-1819, based on analysis with Copyfind

Thanks to the efforts of public institutions, including the British Library, National Library of Wales, National Library of Australia and the Library of Congress, machine-readable transcriptions for a large number of nineteenth-century newspapers are now available to researchers. But within these collections, a second, more sinister problem arises. No matter how diligently archivists have worked to provide a representative or diverse selection, these digital holdings remain only a slice of the sprawling news network that once existed. Even if we find every single digital copy of a text, how can we know for sure that the original is among them?

It is here that the humble pingback returns to the fore. Whether prompted by the innate honesty of editors or by their desire to establish the authenticity of their materials, a significant minority of newspaper articles contained an attribution. Whether appearing as an introductory dateline or a concluding tagline, these Georgian pingbacks offer tantalising clues as to the true origins of these anonymised texts. Yet, because only a minority of articles contain these attributions, because they can appear in many different forms or locations within the article text and because OCR is frustratingly inconsistent in transcribing italic and gothic typefaces, searching for datelines algorithmically is exceedingly difficult.

A Snippet from the Ipswich Journal, 13 January 1821. Courtesy of the British Library.
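The difficulty of finding datelines algorithmically is easy to demonstrate. The Python sketch below assumes just two attribution shapes, an opening "From the ..." and a closing "- Title." tagline. Both patterns, the function name and the sample sentences are hypothetical; real attributions are far more varied.

```python
import re

# Two hypothetical patterns; real datelines and taglines take many more forms.
ATTRIBUTION_PATTERNS = [
    # Opening dateline, e.g. "From the Glasgow Courier."
    re.compile(r"^\(?From the (?P<source>[A-Z][\w' ]+?)[.,)]", re.MULTILINE),
    # Closing tagline after a hyphen or dash, e.g. "- Morning Chronicle."
    re.compile(r"[-\u2014]\s*(?P<source>[A-Z][\w' ]+?)\.\s*$"),
]


def find_attribution(article):
    """Return the attributed source title, or None if no pattern matches."""
    for pattern in ATTRIBUTION_PATTERNS:
        match = pattern.search(article)
        if match:
            return match.group("source").strip()
    return None


# Invented examples: a clean dateline, a clean tagline, and a garbled
# version of the first of the kind OCR can produce.
clean_dateline = "From the Glasgow Courier. A melancholy accident occurred last week."
clean_tagline = "A melancholy accident occurred on the river last week. - Morning Chronicle."
garbled = "Frorn the Glafgow Courier. A melancholy accident occurred last week."

print(find_attribution(clean_dateline))  # Glasgow Courier
print(find_attribution(clean_tagline))   # Morning Chronicle
print(find_attribution(garbled))         # None
```

The clean examples are found, but a single misread letter in "From" is enough to hide the third from the search entirely, while a human reader would spot it at a glance.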

That is where the crowd come in. Although computers can process data very quickly, the human brain is still more adept at finding patterns when the parameters for those patterns are particularly fuzzy. Because of this, it was easier for astronomers to train volunteers to identify dusty debris disks in nebulae than to train computers to do the same thing. And what is true for nebulae is equally true of these Georgian pingbacks. Using thousands of images from the British Library's 19th-Century Newspapers collection, we have created a new site where you can help spot these attributions and provide researchers with what Georgian authors could only dream of, an in-depth understanding of just who was stealing from whom! The site includes an in-depth tutorial on the structure of nineteenth-century newspaper articles as well as three different ways you can help us tag the database. So, whether you have a smart phone and 5 minutes waiting for your train or want to explore the collection in more depth at your home PC, please visit Georgian Pingbacks and try your hand at uncovering a 200-year-old case of plagiarism.

Dr M. H. Beals is a historian of migration and media at Loughborough University. She would like to thank the following undergraduate students at Loughborough University's Department of Politics, History and International Relations for their work on this project: Will Dickinson, Alice Gilbert, Ollie Luhrs, Alex Mackinder, Pooja Makwana, Matthew McCulloch, Jonny Ord, Emily Stanyard and Rebecca Thompson.

30 March 2016

Exploring Poetic Places: Launching the App

This is a guest post by Sarah Cole, the British Library’s current Creative Entrepreneur-in-Residence. 

Poetic Places is a free app for iOS and Android devices and the main outcome of my work as Creative Entrepreneur-in-Residence at the British Library, funded by CreativeWorks London.

Poetic Places brings poetic depictions of places into the everyday world, helping you to encounter poems and literature in the locations described, accompanied by audiovisual materials drawn from archive collections.

Utilising geolocation services and push notifications, Poetic Places can let you know when you stumble across a place depicted in verse. Alternatively, you can browse the poems and places as a source of inspiration without travelling to them.
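For readers curious about the mechanics, the location trigger can be thought of as a simple distance check. The Python sketch below is not the app's actual code; it assumes each place is stored as a latitude and longitude and that a notification fires when the device comes within a chosen radius. The place names, coordinates and 100 metre radius are all made up for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def places_in_range(lat, lon, places, radius_m=100):
    """Names of the places lying within radius_m metres of the device position."""
    return [
        name
        for name, (plat, plon) in places.items()
        if haversine_m(lat, lon, plat, plon) <= radius_m
    ]


# Hypothetical places with made-up central London coordinates.
PLACES = {
    "Example place A": (51.5299, -0.1270),
    "Example place B": (51.5282, -0.1337),
}

# A device standing at place A is in range of A but not of B.
print(places_in_range(51.5299, -0.1270, PLACES))
```

The choice of radius matters: too small and a slightly inaccurate GPS fix walks straight past a poem, too large and notifications arrive streets away from the place described.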

Poetic Places aspires to give a renewed sense of place, to bring together writings and paintings and sounds to mean more than they do alone, and to bring literature into your everyday life in unexpected moments.


We launched the app on 18 March 2016 and celebrated with a half-day event at the British Library. I talked about the aims and development of Poetic Places before our five other speakers spoke about their work and ideas in the fields of Literary Geographies, location-aware apps, and cultural heritage.

  • Andy Ryan told us about CityRead London and how they’ve been bringing literature off the page and into the city. For 2016 the CityRead London book is Ten Days, by Gillian Slovo, free copies of which were given to the attendees.
  • Dr David Cooper from Manchester Metropolitan University talked about Literary Geographies, digital mapping, and how technology can both connect us to and disconnect us from the landscape.
  • Dr Giasemi Vavoula of the University of Leicester introduced us to the Affective Digital Histories project, demonstrating two apps that explore both creative and historical connections to place.
  • Maya Chowdhry brought an interactive installation with her and spoke about Tales from the Towpath, geocaching, and augmented reality.
  • Jocelyn Dodd, also from the University of Leicester, gave us an insight into how people have interacted with the technology used in the Talking Statues initiative.

All of the speakers were very interesting and have given me more food for thought as I consider ways in which to expand Poetic Places. I’m grateful to them for taking the time to come and talk to us all. 

  Dr Giasemi Vavoula talking about the Affective Digital Histories project

After the conference section of the day a few of us went on a brief walk to Euston via Bloomsbury to try out Poetic Places in action. This was largely successful, though the various Apple and Android devices had quite different sensitivities to the GPS triggers! I may have to tweak the GPS, but it was great to be able to try it out with a group of people and talk about the technical aspects, the development process, and my curatorial choices.

On the Poetic Places walk, image by Dr David Cooper

We’ve already had positive feedback about the app, which is really great, and a few suggestions for poems we should consider including going forward, which is also great because I do intend to go forward.

Poetic Places has been released into the wild, but there’s still a lot of scope for growth. The first, and most obvious, step is to start including materials from outside of London; we limited the launch content to London for several reasons but I’m determined to expand its coverage to the rest of the UK and beyond. I also hope to bring in more contemporary materials. I’ve very much enjoyed working with out-of-copyright literature and showcasing Open collections (such as the British Library Flickr collection) but it’d be good to include more recent works. Audio, too, was out of scope for this pilot stage of Poetic Places, but I think this could be a popular feature in the future.

I’m going to pause before ploughing on with all of this, because I want to write up more of the development and choices that have shaped Poetic Places so far, but I’m enthused with ideas for Poetic Places to grow and inspire people.

If you’d like to find out more about Poetic Places, keep updated about its progress, and give feedback or content suggestions, please visit and follow us on Twitter @poetic_places.