Digital scholarship blog

Enabling innovative research with British Library digital collections

58 posts categorized "LIS research"

14 September 2023

What's the future of crowdsourcing in cultural heritage?

The short version: crowdsourcing in cultural heritage is an exciting field, rich in opportunities for collaborative, interdisciplinary research and practice. It includes online volunteering, citizen science, citizen history, digital public participation, community co-production, and, increasingly, human computation and other systems that will change how participants relate to digital cultural heritage. New technologies like image labelling, text transcription and natural language processing, plus trends in organisations and societies at large mean constantly changing challenges (and potential). Our white paper is an attempt to make recommendations for funders, organisations and practitioners in the near and distant future. You can let us know what we got right, and what we could improve by commenting on Recommendations, Challenges and Opportunities for the Future of Crowdsourcing in Cultural Heritage: a White Paper.

The longer version: The Collective Wisdom project was funded by an AHRC networking grant to bring experts from the UK and the US together to document the state of the art in designing, managing and integrating crowdsourcing activities, and to look ahead to future challenges and unresolved issues that could be addressed by larger, longer-term collaboration on methods for digitally-enabled participation.

Our open access Collective Wisdom Handbook: perspectives on crowdsourcing in cultural heritage is the first outcome of the project; our expert workshops were a second.

Mia (me) and Sam Blickhan launched our White Paper for comment on pubpub at the Digital Humanities 2023 conference in Graz, Austria, in July this year, with Meghan Ferriter attending remotely. Our short paper abstract and DH2023 slides are online at Zenodo.

So - what's the future of crowdsourcing in cultural heritage? Head on over to Recommendations, Challenges and Opportunities for the Future of Crowdsourcing in Cultural Heritage: a White Paper and let us know what you think! You've got until the end of September…

You can also read our earlier post on 'community review' for a sense of the feedback we're after - in short, what resonates, what needs tweaking, what examples could we include?

To whet your appetite, here's a preview of our five recommendations. (To find out why we make those recommendations, you'll have to read the White Paper):

  • Infrastructure: Platforms need sustainability. Funding should not always be tied to novelty, but should also support the maintenance, uptake and reuse of well-used tools.
  • Evidencing and Evaluation: Help create an evaluation toolkit for cultural heritage crowdsourcing projects; provide ‘recipes’ for measuring different kinds of success. Shift thinking about value from output/scale/product to include impact on participants' and community well-being.
  • Skills and Competencies: Help create a self-guided skills inventory assessment resource, tool, or worksheet to support skills assessment, and develop workshops to support their integrity and adoption.
  • Communities of Practice: Fund informal meetups, low-cost conferences, peer review panels, and other opportunities for creating and extending community. They should have an international reach, e.g. beyond the UK-US limitations of the initial Collective Wisdom project funding.
  • Incorporating Emergent Technologies and Methods: Fund educational resources and workshops to help the field understand opportunities, and anticipate the consequences of proposed technologies.

What have we missed? Which points do you want to boost? (For example, we discovered how many of our points apply to digital scholarship projects in general). You can '+1' on points that resonate with you, suggest changes to wording, ask questions, provide examples and references, or (constructively, please) challenge our arguments. Our funding only supported participants from the UK and US, so we're very keen to hear from folk from the rest of the world.

02 May 2023

Detecting Catalogue Entries in Printed Catalogue Data

This is a guest blog post by Isaac Dunford, MEng Computer Science student at the University of Southampton. Isaac reports on his Digital Humanities internship project supervised by Dr James Baker.

Introduction

The purpose of this project has been to investigate and implement different methods for detecting catalogue entries within printed catalogues. For whilst printed catalogues are easy enough to digitise and convert into machine readable data, dividing that data by catalogue entry requires turning the visual signifiers of divisions between entries - gaps in the printed page, large or upper-case headers, catalogue references - into machine-readable information. The first part of this project involved experimenting with XML-formatted data derived from the 13-volume Catalogue of books printed in the 15th century now at the British Museum (described by Rossitza Atanassova in a post announcing her AHRC-RLUK Professional Practice Fellowship project) and trying to find the best ways to detect individual entries and reassemble them as data (given that the text for a single catalogue entry may be spread across multiple pages of a printed catalogue). The next part of this project involved building a complete system based on this approach to take the large number of XML files for a volume and output all of the catalogue entries in a series of desired formats. This post describes our initial experiments with that data, the approach we settled on, and key features of our approach that you should be able to reapply to your catalogue data. All data and code can be found on the project GitHub repo.

Experimentation

The catalogue data was exported from Transkribus in two different formats: an ALTO XML schema and a PAGE XML schema. The ALTO layout encodes positional information about each element of the text (that is, where each word occurs relative to the top left corner of the page), which makes spatial analysis - such as looking for gaps between lines - possible. However, it also creates data files that are heavily encoded, meaning that it can be difficult to extract the text elements from the data files. The PAGE schema, by contrast, makes it easier to access the text elements in the files.
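To give a sense of why the PAGE schema is convenient for getting at the text, here is a minimal sketch (not the project's code) that pulls the line-level text out of a PAGE XML document using only the Python standard library. It assumes the 2013-07-15 PAGE namespace and the usual TextLine / TextEquiv / Unicode nesting; the function name and the tiny sample page are invented for illustration, so check the xmlns declared in your own export before reusing it.

```python
import xml.etree.ElementTree as ET

# Assumption: the export declares the 2013-07-15 PAGE namespace.
PAGE_NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}


def page_text_lines(xml_string):
    """Return the text of each TextLine in a PAGE XML document, in file order."""
    root = ET.fromstring(xml_string)
    lines = []
    for text_line in root.iterfind(".//pc:TextLine", PAGE_NS):
        # Only look at the TextEquiv that is a direct child of the line,
        # not those nested inside any Word elements.
        unicode_el = text_line.find("pc:TextEquiv/pc:Unicode", PAGE_NS)
        if unicode_el is not None and unicode_el.text:
            lines.append(unicode_el.text)
    return lines


# Invented sample page, for illustration only.
SAMPLE_PAGE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="100" imageHeight="100">
    <TextRegion id="r1">
      <Coords points="0,0 100,0 100,100 0,100"/>
      <TextLine id="l1">
        <Coords points="0,0 100,0 100,10 0,10"/>
        <TextEquiv><Unicode>First line of text</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="l2">
        <Coords points="0,10 100,10 100,20 0,20"/>
        <TextEquiv><Unicode>Second line of text</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

text_lines = page_text_lines(SAMPLE_PAGE)
```

Calling page_text_lines on each exported page gives a plain list of strings, which is the form the later content-based steps work on.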

 

An image of a digitised page from volume 8 of the Incunabula Catalogue and the corresponding Optical Character Recognition file encoded in the PAGE XML Schema
Raw PAGE XML for a page from volume 8 of the Incunabula Catalogue

 

An image of a digitised page from volume 8 of the Incunabula Catalogue and the corresponding Optical Character Recognition file encoded in the ALTO XML Schema
Raw ALTO XML for a page from volume 8 of the Incunabula Catalogue

 

Spacing and positioning

One of the first approaches tried in this project was to use size and spacing to find entries. The intuition behind this is that there is generally a larger amount of white space around the headings in the text than there is between regular lines. And in the ALTO schema, there is information about the size of the text within each line as well as about the coordinates of the line within the page.

However, we found that using the size of the text line and/or the positioning of the lines was not effective for three reasons. First, blank space between catalogue entries inconsistently contributed to the size of some lines. Second, whenever there were tables within the text, there would be large gaps in spacing compared to the normal text, which in turn caused those tables to be read as divisions between catalogue entries. And third, even though entry headings were visually further to the left on the page than regular text, and therefore should have had the smallest x coordinates, the materiality of the printed page was inconsistently represented as digital data, and so presented regular lines with small x coordinates that could be read - using this approach - as headings.

Final Approach

Entry Detection

Our chosen approach uses the data in the PAGE XML schema, and is bespoke to the data for the Catalogue of books printed in the 15th century now at the British Museum as produced by Transkribus (and indeed, the version of Transkribus: having built our code around some initial exports, running it over the later volumes - which had been digitised last - threw an error due to some slight changes to the exported XML schema).

The code takes the XML input and finds entries using a content-based approach that looks for features at the start and end of each catalogue entry. After experimenting with different approaches, the most consistent way to detect the catalogue entries was to:

  1. Find the “reference number” (e.g. IB. 39624) which is always present at the end of an entry.
  2. Find a date that is always present after an entry heading.

This gave us the ability to contextually infer the presence of a split between two catalogue entries, the main limitation of which is the quality of the Optical Character Recognition (OCR) at the points at which the references and dates occur in the printed volumes.
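The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in the project repo: the two regular expressions, the function names and the sample lines are all invented here, modelled only on the "IB. 39624" example, and real OCR output would need more forgiving patterns.

```python
import re

# Assumed pattern for a reference number such as "IB. 39624".
REFERENCE_RE = re.compile(r"\bI[A-C]\.\s?\d{3,6}\b")
# Assumed pattern for a date near the heading: any year from 1400 to 1599.
DATE_RE = re.compile(r"\b1[45]\d{2}\b")


def split_entries(lines):
    """Group lines into entries, closing an entry at each reference number."""
    entries, current = [], []
    for line in lines:
        current.append(line)
        if REFERENCE_RE.search(line):
            entries.append(current)
            current = []
    if current:
        # Trailing lines with no reference yet, e.g. an entry that
        # continues on the next page.
        entries.append(current)
    return entries


def has_heading_date(entry, window=3):
    """Check that a date appears within the first few lines of an entry."""
    return any(DATE_RE.search(line) for line in entry[:window])


# Invented sample lines, for illustration only.
SAMPLE_LINES = [
    "HEADING ONE",
    "12 March 1478.",
    "Description of the first book.",
    "IB. 39624.",
    "HEADING TWO",
    "1483.",
    "Description of the second book.",
    "IB. 39630.",
]

entries = split_entries(SAMPLE_LINES)
```

Because the split relies on the reference and the date being recognised correctly, a single misread character in either can merge two entries, which is the OCR limitation described above.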

 

An image of a digitised page with a catalogue entry and the corresponding text output in XML format
XML of a detected entry

 

Language Detection

The reason for dividing catalogue entries in this way was to facilitate analysis of the catalogue data, specifically analysis that sought to define the linguistic character of descriptions in the Catalogue of books printed in the 15th century now at the British Museum and how those descriptions changed and evolved across the thirteen volumes. Segments of each catalogue entry contain text transcribed from the incunabula that was not written by a cataloguer (and is therefore not part of their cataloguing ‘voice’), and those transcribed sections are in French, Dutch, Old English, and other languages that a machine could detect as not being modern English. To further facilitate research use of the final data, one of the extensions we implemented was therefore to label sections of each catalogue entry by language. This was achieved using a Python library for language detection and then - for a particular output type - replacing non-English language sections of text with a placeholder (e.g. NON-ENGLISH SECTION). The language detection model does not recognise Old English, and so assigns those sections labels for a variety of other languages, but it was still able to break the blocks of text in each catalogue entry into English and non-English sections.
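The placeholder step can be sketched like this. The post does not name the detection library, so the sketch takes the detector as a parameter and uses a deliberately crude stand-in (counting common English words) purely so that the example runs on its own; the stand-in, the threshold and the sample sections are assumptions for illustration, not what the project used.

```python
PLACEHOLDER = "NON-ENGLISH SECTION"

# Toy word list for the stand-in detector below (an assumption, not a real model).
ENGLISH_STOPWORDS = {"the", "of", "and", "in", "a", "is", "with", "on", "to", "at"}


def toy_detect(text):
    """Stand-in detector: 'en' if at least 20% of words are common English words."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return "unknown"
    hits = sum(w in ENGLISH_STOPWORDS for w in words)
    return "en" if hits / len(words) >= 0.2 else "other"


def english_only(sections, detect=toy_detect):
    """Replace every section not detected as English with the placeholder."""
    return [s if detect(s) == "en" else PLACEHOLDER for s in sections]


# Invented sample sections, for illustration only.
SAMPLE_SECTIONS = [
    "The first leaf is blank and the type is of the usual size.",
    "Incipit liber primus de consolatione philosophiae.",
]

labelled = english_only(SAMPLE_SECTIONS)
```

In practice you would pass a real language detection function as detect. Since only the English versus non-English distinction matters for this output, it does not matter which other language a transcribed section is labelled with.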

 

Text files for catalogue entry number IB39624 showing the full text and the detected English-only sections.
Text outputs of the full and English-only sections of the catalogue entry

 

Poorly Scanned Pages

Another extension for this system was to use the input data to try and determine whether a page had been poorly scanned: for example, where the lines in the XML input read from one column straight into another as a single line (rather than the XML reading order following the visual signifiers of column breaks). The system detects poorly scanned pages by looking at the lengths of all lines in the PAGE XML file for a page, establishing which lines deviate substantially from the mean line length, and, if sufficient outliers are found, marking the page as poorly scanned.
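A minimal sketch of that check is below. The post does not say how "deviate substantially" or "sufficient outliers" were defined, so the z-score threshold of 2 and the minimum of two outliers are assumed values chosen for illustration, as is the function name.

```python
from statistics import mean, pstdev


def is_poorly_scanned(lines, z_threshold=2.0, min_outliers=2):
    """Flag a page when enough line lengths sit far from the mean line length."""
    lengths = [len(line) for line in lines]
    if len(lengths) < 2:
        return False
    average = mean(lengths)
    spread = pstdev(lengths)
    if spread == 0:
        # Every line has the same length, so nothing can be an outlier.
        return False
    outliers = [n for n in lengths if abs(n - average) / spread > z_threshold]
    return len(outliers) >= min_outliers


# Invented pages: one with even line lengths, and one where two lines
# have run across both columns and are roughly twice as long as the rest.
GOOD_PAGE = ["x" * n for n in [40, 42, 38, 41, 39, 40, 43, 37, 40, 41]]
BAD_PAGE = ["x" * 40] * 18 + ["x" * 85] * 2
```

With these sample pages, GOOD_PAGE is not flagged and BAD_PAGE is. Both thresholds would need tuning against pages you have checked by eye.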

Key Features

The key part of this system that can be taken and applied to a different problem is the method for detecting entries. We expect that the fundamental method of looking for marks in the page content to identify the start and end of catalogue entries in the XML files would be applicable to other data derived from printed catalogues. The only parts of the algorithm that would need changing for a new system would be the regular expressions used to find the start and end of the catalogue entries. And as long as the XML input comes in the same schema, the code should be able to consistently divide up the volumes into the individual catalogue entries.

03 April 2023

Topics in contemporary Digital Scholarship via five years of our Reading Group

Since March 2016, the Digital Scholarship Reading Group at the British Library has discussed articles, videos, podcasts, blog posts and chapters that touch on digital scholarship in libraries. I've shared our readings up to May 2018 and taken a thematic look at our readings at the intersection of digital scholarship and anti-racism in July 2020.

As the Living with Machines project draws to an end this (northern) summer, I thought I'd give an updated list of our readings since June 2018. I started including more pieces on deep learning, machine learning, AI ('artificial intelligence'), big data, data science, digital history, digitised newspapers, and user experience design for digital collections when we began discussing what became Living with Machines in early 2017. This was partly a way for me to catch up with relevant topics, and partly to lay the groundwork for LwM across the organisation. You can see that reflected in our topics up to May 2018 and onward.

Of course, the group continued to cover other topics, and sessions were suggested and/or led by colleagues including Adi Keinan-Schoonbaert, Annabel Gallop, Graham Jevon, Jez Cope, Lucy Hinnie, Mary Stewart, Nora McGregor, Sarah Miles, Sarah Stewart and Stella Wisdom. Especial thanks to Rossitza Atanassova and Deirdre Sullivan who’ve been helping me run the group in recent years. In 2021 we started using the January session to invite colleagues across the Library to look around and pick topics for discussion in the year ahead.

So what did we discuss from June 2018 to the end of 2022?

29 November 2022

My AHRC-RLUK Professional Practice Fellowship: Four months on

In August 2022 I started work on a project to investigate the legacies of curatorial voice in the descriptions of incunabula collections at the British Library and their future reuse. My research is funded by the collaborative AHRC-RLUK Professional Practice Fellowship Scheme for academic and research libraries which launched in 2021. As part of the first cohort of ten Fellows I embraced this opportunity to engage in practitioner research that benefits my institution and the wider sector, and to promote the role of library professionals as important research partners.

The overall aim of my Fellowship is to demonstrate new ways of working with digitised catalogues that would also improve the discoverability and usability of the collections they describe. The focus of my research is the Catalogue of books printed in the 15th century now at the British Museum (or BMC) published between 1908 and 2007 which describes over 12,700 volumes from the British Library incunabula collection. By using computational approaches and tools with the data derived from the catalogue I will gain new insights into and interpretations of this valuable resource and enable its reuse in contemporary online resources. 

Titlepage to volume 2 of the Catalogue of books printed in the fifteenth century now in the British Museum, part 2, Germany, Eltvil-Trier
BMC volume 2 titlepage


This research idea was inspired by a recent collaboration with Dr James Baker, who is also my mentor for this Fellowship, and was further developed in conversations with Dr Karen Limper-Herz, Lead Curator for Incunabula, Adrian Edwards, Head of Printed Heritage Collections, and Alan Danskin, Collections Metadata Standards Manager, who support my research at the Library.

My Fellowship runs until July 2023 with Fridays being my main research days. I began by studying the history of the catalogue, its arrangement and the structure of the item descriptions and their relationship with different online resources. Overall, the main focus of this first phase has been on generating the text data required for the computational analysis and investigations into curatorial and cataloguing practice. This work involved new digitisation of the catalogue and a lot of experimentation using the Transkribus AI-empowered platform that proved best-suited for improving the layout and text recognition for the digitised images. During the last two months I have hugely benefited from the expertise of my colleague Tom Derrick, as we worked together on creating the training data and building structure models for the incunabula catalogue images.

An image from Transkribus Lite showing a page from the catalogue with separate regions drawn around columns 1 and 2, and the text baselines highlighted in purple
Layout recognition output for pages with only two columns, including text baselines, viewed on Transkribus Lite

 

An image from Transkribus Lite showing a page from the catalogue alongside the text lines
Text recognition output after applying the model trained with annotations for 2 columns on the page, viewed on Transkribus Lite

 

An image from Transkribus Lite showing a page from the catalogue with separate regions drawn around 4 columns of text separated by a single text block
Layout recognition output for pages with mixed layout of single text block and text in columns, viewed on Transkribus Lite

Whilst the data preparation phase has taken longer than I had planned due to the varied layout of the catalogue, this has been an important part of the process as the project outcomes are dependent on using the best quality text data for the incunabula descriptions. The next phase of the research will involve the segmentation of the records and extraction of relevant information to use with a range of computational tools. I will report on the progress with this work and the next steps early next year. Watch this space and do get in touch if you would like to learn more about my research.

This blogpost is by Dr Rossitza Atanassova, Digital Curator for Digitisation, British Library. She is on Twitter @RossiAtanassova  and Mastodon @[email protected]

10 November 2022

'Expanding Voices, Expanding Access: Social and Community Centered Metadata'

Digital Curator Mia Ridge writes... Following a Twitter conversation with Jessica BrodeFrank and Isabel Brador in mid-2021, I collaborated with them and Bri Watson on two conference panels. Our first was 'Expanding and Enriching Metadata through Engagement with Communities' for the Museum Computer Network (MCN) conference in October 2021:

'This panel discusses how cultural institutions are engaging various communities to co-create academic research and/or object metadata in order to increase representation and access to collections; highlighting how this is done in different ways to engage specific audiences and goals, i.e. graduate student assistantships, museum interactive experiences, crowdsourcing, and professional action groups'.

Earlier this year we got together again to record a panel for the National Council on Public History (NCPH) conference held in May 2022.

'As social justice movements challenge power structures, the ways in which public historians and cultural institutions create expert knowledge are also under scrutiny. Instead of using traditional top-down approaches to cataloguing, public historians and cultural institutions should be actively co-creating object metadata and research with the public. Discussion centers on how public involvement enriches the narratives we share, building transparency and trust within organizations and the surrounding communities. We hope to present various ways in which institutions are beginning this work and focus on a variety of audiences from graduate students and emerging professionals, to online citizen science communities and onsite museum audiences'.

Panelists:

"Collaboration and Citizen Science Approaches to Enriching Access to Scientific Collections," Jessica BrodeFrank, Adler Planetarium and University of London

"creating names together: homosaurus international thesaurus & the trans metadata collective," B.M. Watson, University of British Columbia iSchool; Homosaurus; Trans Metadata Collective

"Embedding Crowdsourcing in a Collaborative Data Science Project", Mia Ridge, British Library

Isabel Brador Sanz, Wolfsonian-FIU

We're sharing the video we pre-recorded for the NCPH conference so that we can include more people in the discussion: Expanding Voices, Expanding Access: Social and Community Centered Metadata.

 



20 April 2022

Importing images into Zooniverse with a IIIF manifest: introducing an experimental feature

Digital Curator Dr Mia Ridge shares news from a collaboration between the British Library and Zooniverse that means you can more easily create crowdsourcing projects with cultural heritage collections. There's a related blog post on Zooniverse, Fun with IIIF.

IIIF manifests - text files that tell software how to display images, sound or video files alongside metadata and other information about them - might not sound exciting, but by linking to them, you can view and annotate collections from around the world. The IIIF (International Image Interoperability Framework) standard makes images (or audio, video or 3D files) more re-usable - they can be displayed on another site alongside the original metadata and information provided by the source institution. If an institution updates a manifest - perhaps adding information from updated cataloguing or crowdsourcing - any sites that display that image automatically get the updated metadata.
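To make "text files that tell software how to display images" a little more concrete, here is a small sketch that reads a manifest and lists the image address for each page. It assumes the manifest follows version 2 of the IIIF Presentation API (sequences containing canvases containing images); version 3 manifests are structured differently. The function name and the example.org manifest are invented for illustration.

```python
import json


def canvas_images(manifest):
    """Return (canvas label, image URL) pairs from a IIIF Presentation 2.x manifest."""
    pairs = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                pairs.append((canvas.get("label"), image["resource"]["@id"]))
    return pairs


# Invented, heavily trimmed manifest, for illustration only.
SAMPLE_MANIFEST = json.loads("""
{
  "@context": "http://iiif.io/api/presentation/2/context.json",
  "@id": "https://example.org/iiif/volume1/manifest.json",
  "@type": "sc:Manifest",
  "label": "Example volume of playbills",
  "metadata": [{"label": "Title", "value": "Example volume"}],
  "sequences": [{
    "@type": "sc:Sequence",
    "canvases": [{
      "@id": "https://example.org/iiif/volume1/canvas/1",
      "@type": "sc:Canvas",
      "label": "Page 1",
      "images": [{
        "@type": "oa:Annotation",
        "resource": {"@id": "https://example.org/iiif/volume1/page1/full/full/0/default.jpg"}
      }]
    }]
  }]
}
""")

pages = canvas_images(SAMPLE_MANIFEST)
```

Because the manifest only points at images on the source institution's servers, software reading it never needs its own copies of the files.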

Playbill showing the title after other large text

We've posted before about how we used IIIF manifests as the basis for our In the Spotlight crowdsourced tasks on LibCrowds.com. Playbills are great candidates for crowdsourcing because they are hard to transcribe automatically, and the layout and information present varies a lot. Using IIIF meant that we could access images of playbills directly from the British Library servers without needing server space and extra processing to make local copies. You didn't need technical knowledge to copy a manifest address and add a new volume of playbills to In the Spotlight. This worked well for a couple of years, but over time we'd found it difficult to maintain bespoke software for LibCrowds.

When we started looking for alternatives, the Zooniverse platform was an obvious option. Zooniverse hosts dozens of historical or cultural heritage projects, and hundreds of citizen science projects. It has millions of volunteers, and a 'project builder' that means anyone can create a crowdsourcing project - for free! We'd already started using Zooniverse for other Library crowdsourcing projects such as Living with Machines, which showed us how powerful the platform can be for reaching potential volunteers. 

But that experience also showed us how complicated the process of getting images and metadata onto Zooniverse could be. Using Zooniverse for volumes of playbills for In the Spotlight would require some specialist knowledge. We'd need to download images from our servers, resize them, generate a 'manifest' list of images and metadata, then upload it all to Zooniverse; and repeat that for each of the dozens of volumes of digitised playbills.

Fast forward to summer 2021, when we had the opportunity to put a small amount of funding into some development work by Zooniverse. I'd already collaborated with Sam Blickhan at Zooniverse on the Collective Wisdom project, so it was easy to drop her a line and ask if they had any plans or interest in supporting IIIF. It turns out they had, but hadn't had the resources or an interested organisation necessary before.

We came up with a brief outline of what the work needed to do, taking the ability to recreate some of the functionality of In the Spotlight on Zooniverse as a goal. Therefore, 'the ability to add subject sets via IIIF manifest links' was key. ('Subject set' is Zooniverse-speak for 'set of images or other media' that are the basis of crowdsourcing tasks.) And of course we wanted the ability to set up some crowdsourcing tasks with those items… The Zooniverse developer, Jim O'Donnell, shared his work in progress on GitHub, and I was very easily able to set up a test project and ask people to help create sample data for further testing. 

If you have a Zooniverse project and a IIIF address to hand, you can try out the import for yourself: add 'subject-sets/iiif?env=production' to your project builder URL. For example, if your project is number xxx then the URL to access the IIIF manifest import would be https://www.zooniverse.org/lab/xxx/subject-sets/iiif?env=production

Paste a manifest URL into the box. The platform parses the file to present a list of metadata fields, which you can flag as hidden or visible in the subject viewer (public task interface). When you're happy, you can click a button to upload the manifest as a new subject set (like a folder of items), and your images are imported. (Don't worry if it says '0 subjects'.)

 

Screenshot of manifest import screen

You can try out our live task and help create real data for testing ingest processes at https://frontend.preview.zooniverse.org/projects/bldigital/in-the-spotlight/classify

This is a very brief introduction, with more to come on managing data exports and IIIF annotations once you've set up, tested and launched a crowdsourced workflow (task). We'd love to hear from you - how might this be useful? What issues do you foresee? How might you want to expand or build on this functionality? Email [email protected] or tweet @mia_out @LibCrowds. You can also comment on GitHub https://github.com/zooniverse/Panoptes-Front-End/pull/6095 or https://github.com/zooniverse/iiif-annotations

Digital work in libraries is always collaborative, so I'd like to thank British Library colleagues in Finance, Procurement, Technology, Collection Metadata Services and various Collections departments; the Zooniverse volunteers who helped test our first task and of course the Zooniverse team, especially Sam, Jim and Chris for their work on this.

 

23 December 2021

Three crowdsourcing opportunities with the British Library

Digital Curator Dr Mia Ridge writes: In case you need a break from whatever combination of weather, people and news is around you, here are some ways you can entertain yourself (or the kids!) while helping make collections of the British Library more findable, or help researchers understand our past. You might even learn something or make new discoveries along the way!

Your help needed: Living with Machines

Mia Ridge writes: Living with Machines is a collaboration between the British Library and the Alan Turing Institute with partner universities. Help us understand the 'machine age' through the eyes of ordinary people who lived through it. Our refreshed task builds on our previous work, and includes fresh newspaper titles, such as the Cotton Factory Times.

What did the Victorians think a 'machine' was - and did it matter where you lived, or if you were a worker or a factory owner? Help us find out: https://www.zooniverse.org/projects/bldigital/living-with-machines

Your contributions will not only help researchers - they'll also go on display in our exhibition.

Image of a Cotton Factory Times masthead
You can read articles from Manchester's Cotton Factory Times in our crowdsourced task

 

Your help needed: Agents of Enslavement? Colonial newspapers in the Caribbean and hidden genealogies of the enslaved

Launched in July this year, Agents of Enslavement? is a research project which explores the ways in which colonial newspapers in the Caribbean facilitated and challenged the practice of slavery. One goal is to create a database of enslaved people identified within these newspapers. This benefits people researching their family history as well as those who simply want to understand more about the lives of enslaved people and their acts of resistance.

Project Investigator Graham Jevon has posted some insights into how he processes the results to the project forum, which is full of fascinating discussion. Join in as you take part: https://www.zooniverse.org/projects/gjevon/agents-of-enslavement

Your help needed: Georeferencer

Dr. Gethin Rees writes: The community have now georeferenced 93% of 1277 maps that were added from our War Office Archive back in July (as mentioned in our previous newsletter).  

Some of the remaining maps are quite tricky to georeference, so if there is a perplexing map that you would like some guidance with, do get in contact with me and our curator for modern mapping by emailing [email protected] and we will try to help. Please do look forward to some exciting new maps being released on the platform in 2022!

29 September 2021

Sailing Away To A Distant Land - Mahendra Mahey, Manager of BL Labs - final post

Posted by Mahendra Mahey, former Manager of British Library Labs or "BL Labs" for short

[estimated reading time of around 15 minutes]

This is my last day working as manager of BL Labs, and also my final posting on the Digital Scholarship blog. I thought I would take this chance to reflect on my journey of almost 9 years in helping to set up, maintain and enable BL Labs to become a permanent fixture at the British Library (BL).

BL Labs was the first digital Lab in a national library, anywhere in the world, that gets people to experiment with its cultural heritage digital collections and data. There are now several Gallery, Library, Archive and Museum Labs or 'GLAM Labs' for short around the world, with an active community which I helped build, from 2018.

I am really proud I was there from the beginning to implement the original proposal which was written by several colleagues, but especially Adam Farquhar, former head of Digital Scholarship at the British Library (BL). The project was at first generously funded by the Andrew W. Mellon Foundation through four rounds of funding as well as support from the BL. In April 2021, the project became a permanently funded fixture, helped very much by my new manager Maja Maricevic, Head of Higher Education and Science.

The great news is that BL Labs is going to stay after I have left. The position of leading the Lab will soon be advertised. Hopefully, someone will get a chance to work with my helpful and supportive colleague Dr Filipe Bento, Technical Lead of Labs, the bright, talented and very hard-working Maja, and other great colleagues in Digital Research and more widely at the BL.

The beginnings, the BL and me!

I met Adam Farquhar and Aly Conteh (Former Head of Digital Research at the BL) in December 2012. They must have liked something about me because I started working on the project in January 2013, though I officially started in March 2013 to launch BL Labs.

I must admit, I had always felt a bit intimidated by the BL. My first visit was in the early 1980s, as a Psychology student, before the St Pancras site was opened (in 1997). I remember coming up from Wolverhampton on the train to get a research paper about "Serotonin Pathways in Rats when sleeping" by Lidov, feeling nervous and excited at the same time. It felt like a place for 'really intelligent educated people' and for those who were among the intellectual elites in society. It also felt to me a bit like it represented the British empire and its troubled history of colonialism, especially some of the collections, which made me feel uncomfortable as to why they were there in the first place.

I remember thinking that the BL probably wasn't a place for someone like me, a child of Indian Punjabi immigrants from humble beginnings who came to England in the 1960s. Actually, I felt like an imposter and not worthy of being there.

Nearly 9 years later, I can say I learned to respect and even cherish what was inside it, especially the incredible collections, though I also became more confident about expressing stronger views about the decolonisation of some of these. I became very fond of some of the people who work in or use it; there are some really good kind-hearted souls at the BL. However, I never completely lost that feeling of being an imposter and an outsider.

What I remember from that time, going for my interview, was thinking about what would happen if I got the position, and asking myself: 'What would be the one thing I would try and change?'. The answer came easily to me: I would try to get more new people through the doors, literally or virtually, by connecting them to the BL's collections (especially the digital ones). New people like me, who may never have set foot in, or been motivated to step into, the building before. This has been one of the most important reasons for me to get up in the morning and go to work at BL Labs.

So what have been my highlights? Let's have a very quick pass through!

BL Labs Launch and Advisory Board

I launched BL Labs in March 2013, one week after I had started. It was at the launch event organised by my wonderfully supportive and innovative colleague, Digital Curator Stella Wisdom. I distinctly remember in the afternoon session (which I did alone), I had to present my 'ideas' of how I might launch the first BL Labs competition where we would be trying to get pioneering researchers to work with the BL's digital collections.

God, it was a tough crowd! They asked pretty difficult questions, questions I was asking myself too, and to which I still didn't know the answers.

I remember Professor Tim Hitchcock (now at Sussex University, who eventually joined the BL Labs Advisory Board and still sits on it) and Professor Laurel Brake (now Professor Emerita of Literature and Print Culture, Birkbeck, University of London) being in the audience, together with staff from the Royal Library of the Netherlands, who 6 months later launched their own brilliant KB Lab. Subsequently, I became good colleagues with Lotte Wilms, who led their Lab for many years and is now Head of Research support at Tilburg University.

My first gut feeling overall after the event was, this is going to be hard work. This feeling and reality remained a constant throughout my time at BL Labs.

In early May 2013, we launched the competition, which was a really quick and stressful turnaround as I had only officially started in mid March (one and a half months earlier). I remember worrying as to whether anyone would even enter! Pretty much all the final entries were submitted a few minutes before the deadline. I remember being alone that evening on deadline day, near to midnight, waiting by my laptop, thinking: what happens if no one enters? It's going to be a disaster and I will lose my job. Luckily that didn't happen; in the end, we received 26 entries.

I am a firm believer that we can help make our own luck, but sometimes luck can be quite random! Perhaps BL Labs had a bit of both!

After that, I never really looked back! BL Labs developed its own kind of pattern and momentum each year:

  • hunting around the BL for digital collections to make into datasets and make available
  • helping to make more digital collections openly licensed
  • having hundreds of conversations with people interested in connecting with the BL's digital collections in the BL and outside
  • working with some people more intensively to carry out experiments
  • developing ideas further into prototype projects
  • telling the world of successes and failures in person, at meetings and events, and on social media
  • launching a competition and awards in April or May
  • roadshows before and after with invitations to speak at events around the world
  • spending the summer working with competition winners
  • showcasing things from the year at the international symposium in late October/November
  • working on special projects
  • repeat!

The winners were announced in July 2013, and then we worked with them on their entries, showcasing them at our annual BL Labs Symposium in November, around 4 months later.

'Nothing interesting happens in the office' - Roadshows, Presentations, Workshops and Symposia!

One of the highlights of BL Labs was to go out to universities and other places to explain what the BL is and what BL Labs does. This ended up with me pretty much seeing the world (North America, Europe, Asia and Australia, plus giving virtual talks in South America and Africa).

My greatest challenge in BL Labs was always to get people to truly and passionately 'connect' with the BL's digital collections and data in order to come up with cool ideas of what to actually do with them. What I learned from my very first trip was that telling people what you have is great; they definitely need to know what you have! However, once you do that, the hard work really begins, as you often need to guide and inspire many of them, and help and support them to use the collections creatively and meaningfully. It was also important to understand the back story of the digital collection, and to learn about the institutional culture of the BL if people also wanted to work with BL colleagues. For me and the researchers involved, inspirational engagement with digital collections required a lot of intellectual effort and emotional intelligence. Often this meant asking the uncomfortable questions about research such as 'Why are we doing this?', 'What is the benefit to society in doing this?', 'Who cares?', 'How can computation help?' and 'Why is it necessary to even use computation?'.

Making those connections between people and data does feel like magic when it really works. It's incredibly exciting: suddenly everyone has goose bumps and is energised. This feeling I will take away with me; it's the essence of my work at BL Labs!

A full list of over 200 presentations, roadshows, events and 9 annual symposia can be found here.

Competitions, Awards and Projects

Another significant way BL Labs has tried to connect people with data has been through Competitions (tell us what you would like to do, and we will choose an idea and work collaboratively with you on it to make it a reality), Awards (show us what you have already done) and Projects (collaborative working).

At the last count, we have supported and / or highlighted over 450 projects in research, artistic, entrepreneurial, educational, community based, activist and public categories, most through competitions, awards and project collaborations.

We also set up awards for British Library Staff, which have been a wonderful way to highlight the fantastic work our staff do with digital collections and give them the recognition they deserve. I have noticed over the years that the number of staff who have been working on digital projects has increased significantly. Sometimes this was with the help of BL Labs, but often it was because of the significant Digital Scholarship Training Programme, run by my Digital Curator colleagues in Digital Research to help staff understand that the BL isn't just about physical things but digital items too.

Browse through our project archive to get inspiration from the various projects BL Labs has been involved in or highlighted.

Putting the digital collections 'where the light is' - British Library platforms and others

When I started at BL Labs it was clear that we needed to make a fundamental decision about how we saw digital collections. Quite early on, we decided we should treat collections as data to harness the power of computational tools to work with each collection, especially for research purposes. Each collection should have a unique Digital Object Identifier (DOI) so researchers can cite it in publications. Any new datasets generated from them will also have DOIs, allowing us to trace through DOIs what happens to data once it is out there for people to use.

In 2014, https://data.bl.uk was born and today, all our 153 datasets (as of 29/09/2021) are available through the British Library's research repository.

However, BL Labs has not stopped there! We always believed that it's important to put our digital collections where others are likely to discover them (we can't assume that researchers will want to come to BL platforms), 'where the light is' so to speak. We were very open and able to put them on other platforms such as Flickr and Wikimedia Commons, not forgetting that we still needed to do the hard work to connect data to people after they had discovered it, if they needed that support.

Our greatest success by far was placing 1 million largely undescribed images, digitally snipped from 65,000 digitised public domain books from the 19th Century, on Flickr Commons in 2013. The number of images on the platform has grown since then by another 50 to 60 thousand from collections elsewhere in the BL. There has been significant interaction from the public to generate crowdsourced tags that make it easier to find specific images. The number of views has reached a staggering figure of over 2 billion over this time. There has also been an incredible array of projects which have used the images, from artistic use to using machine learning and artificial intelligence to identify them. It's my favourite collection, probably because there are no restrictions on using it.

Read the most popular blog post the BL has ever published by my former BL Labs colleague, the brilliant and inspirational Ben O'Steen, a million first steps and the 'Mechanical Curator' which describes how we told the world why and how we had put 1 million images online for anyone to use freely.

It is wonderful to know that George Oates, the founder of Flickr Commons and still a BL Labs Advisory Board member, has been involved in the creation of the Flickr Foundation which was announced a few days ago! Long live Flickr Commons! We loved it because it also offered a computational way to access the collections, critical for powerful and efficient computational experiments, through its Application Programming Interface (API).

More recently, we have experimented with browser-based programming / computational environments, namely Jupyter Notebooks. We are huge fans of Tim Sherratt, who was a pioneer and brilliant advocate of OPEN GLAM in using them, especially through his GLAM Workbench. He is a one person Lab in his own right, and it was an honour to recognise his monumental efforts by giving him the BL Labs Research Award 2020 last year. You can also explore the fantastic work of Gustavo Candela and colleagues on Jupyter Notebooks and the ones my colleague Filipe Bento created.

Art Exhibitions, Creativity and Education

I am extremely proud to have been involved in enabling two major art exhibitions to happen at the BL, namely:

Crossroads of Curiosity by David Normal

Imaginary Cities by Michael Takeo Magruder

I loved working with artists, it's my passion! They are so creative and often not restricted by academic thinking; see the work of Mario Klingemann for example! You can browse through our archives for various artistic projects that used the BL's digital collections, it's inspiring.

I was also involved in the first British Library Fashion Student Competition, won by Alanna Hilton and held at the BL, which used the BL's Flickr Commons collection as inspiration for the students to design new fashion ranges. It was organised by my colleague Maja Maricevic, the British Fashion Colleges Council and Teatum Jones, who were great fun to work with. I am really pleased to say that Maja has gone from strength to strength working with the fashion industry and continues to run the competition to this day.

We also had some interesting projects working with younger people, such as Vittoria's world of stories and the fantastic work of Terhi Nurmikko-Fuller at the Australian National University. This is something I am very much interested in exploring further in the future, especially around ideas of computational thinking, and I have been trying out a few things.

GLAM Labs community and Booksprint

I am really proud of helping to create the international GLAM Labs community with over 250 members, established in 2018 and still active today. I affectionately call them the GLAM Labbers, and I often ask people to explore their inner 'Labber' when I give presentations. What is a Labber? It's the experimental and playful part of us that we all had as children and that unfortunately many have lost on becoming adults. It's the ability to be fearless, having the audacity and perhaps even naivety to try crazy things even if they are likely to fail! Unfortunately society values success more than it does failure. In my opinion, we need to recognise, respect and revere those who have the courage to try, even when they fail. That courage to experiment should be honoured and embraced and should become the bedrock of our educational systems from the very outset.

Two years ago, many of us Labbers 'ate our own dog food' or 'practised what we preached' when 15 colleagues and I came together for 5 days to produce a book through a booksprint, probably the most rewarding professional experience of my life. The book is about how to set up, maintain, sustain and even close a GLAM Lab and is called 'Open a GLAM Lab'. It is available as public domain content and I encourage you to read it.

Online drop-in goodbye - today!

I organised a 30 minute ‘online farewell drop-in’ on Wednesday 29 September 2021, 1330 BST (London), 1430 (Paris, Amsterdam), 2200 (Adelaide), 0830 (New York) on my very last day at the British Library. It was heart-warming that the session was 'maxed out' at one point with participants from all over the world. I honestly didn't expect over 100 colleagues to show up. I guess when you leave an organisation you get to find out who you actually made an impact on, who shows up, and who tells you, otherwise you may never know.

Those who know me well know that I would have much rather had a farewell do ‘in person’, over a pint and praying for the ‘chip god’ to deliver a huge portion of chips with salt/vinegar and tomato sauce magically and mysteriously to the table. The pub would have been McGlynn's (http://www.mcglynnsfreehouse.com/) near the British Library in London. I wonder who the chip god was? I never found out ;)

The answer to who the chip god was is in the text following this sentence, in white-on-white text... you will be very shocked to know who it was!

Spoiler alert: it was me after all, my alter ego

Mahendra's online farewell to BL Labs, Wednesday 29 September 2021, 1330 BST.
Left: Flowers and wine from the GLAM Labbers arrived in Tallinn, 20 mins before the meeting!
Right: Some of the participants of the online farewell

Leave a message of good will to see me off on my voyage!

It would be wonderful if you would like to leave me your good wishes, comments, memories, thoughts, scans of handwritten messages, pictures, photographs etc. on the following Google doc:

http://tiny.cc/mahendramahey

I will leave it open for a week or so after I have left. Reading positive, sincere, heartfelt messages from colleagues and collaborators over the years has already lifted my spirits. For me it provides evidence that I perhaps did actually make a difference to someone's life. I will definitely be re-reading them during the cold dark Baltic nights in Tallinn.

I would love to hear from you and find out what you are doing, or if you prefer, you can email me, the details are at the end of this post.

BL Labs Sailor and Captain Signing Off!

It's been a blast and lots of fun! Of course there is a tinge of sadness in leaving! For me, it's also been intellectually and emotionally challenging as well as exhausting, with many ‘highs’ and a few ‘lows’ or choppy waters, some professional and others personal.

I have learned so much about myself and there are so many things I am really really proud of. There are other things of course I wish I had done better. Most of all, I learned to embrace failure, my best teacher!

I think I did meet my original wish of wanting to help to open up the BL to as many new people as possible, people who perhaps would never have engaged with the Library before. That was either by using digital collections and data for cool projects and/or simply walking through the doors of the BL in London or Boston Spa, having a look around and being inspired to do something because of it.

I wish the person who takes over my position lots of success! My only piece of advice is if you care, you will be fine!

Anyhow, what a time this has been for us all on this planet! I have definitely struggled at times. I, like many others, have lost loved ones and thought deeply about life and its true meaning. I have also managed to find the courage to know what’s important and act accordingly, even if that has been a bit terrifying and difficult at times. Leaving the BL, for example, was not an easy decision for me, and I wish perhaps things had turned out differently, but I know I am doing the right thing for me, my future and my loved ones.

Though there have been a few dark times for me both professionally and personally, I hope you will be happy to know that I have also found peace and happiness too. I am in a really good place.

I would like to thank the alumni of BL Labs: Ben O'Steen, Technical Lead for BL Labs from 2013 to 2018, and Hana Lewis (2016 - 2018) and Eleanor Cooper (2018 - 2019), both BL Labs Project Officers, as well as the many other people I worked with through BL Labs, in the wider Library and outside it on my journey.

Where am I off to and what am I doing?

My professional plans are 'evolving', but one thing is certain, I will be moving country!

To Estonia to be precise!

I plan to live, settle down with my family and work there. I was never a fan of Brexit, and this way I get to stay a European.

I would like to finish with this final sweet video created by writer and filmmaker Ling Low and her team in 2016, entitled 'Hey there Young Sailor', which they all made as volunteers for the Malaysian band, the 'Impatient Sisters'. It won the BL Labs Artistic Award in 2016. I had the pleasure and honour of meeting Ling over a lovely lunch in Kuala Lumpur, Malaysia, where I had also given a talk at the National Library about my work and looked for remnants of my grandfather, who had settled there many years ago.

I wish all of you well, and if you are interested in keeping in touch with me, working with me or just saying hello, you can contact me via my personal email address: [email protected] or follow my progress on my personal website.

Happy journeys through this short life to all of you!

Mahendra Mahey, former BL Labs Manager / Captain / Sailor signing off!
