Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

18 December 2024

The challenges of AI for oral history: theoretical and practical issues

Oral History Archivist Charlie Morgan provides examples of how AI-based tools integrated into workflows might affect oral historians' consideration of orality and silence, in the second of two posts on a talk he gave with Digital Curator Mia Ridge at the 7th World Conference of the International Federation for Public History in Belval, Luxembourg. His first post proposed some key questions for oral historians thinking about AI, and shared examples of automatic speech recognition (ASR) tools in practice.

While speech to text once seemed at the cutting edge of AI, software designers are now eager to include additional functions. Many incorporate their own chatbots or other AI ‘helpers’ and the same is true of ‘standard’ software. Below you can see what happened when I asked the chatbots in Otter and Adobe Acrobat some questions about other transcribed clips from the ‘Lives in Steel’ CD:

A composite image of chatbot responses to questions about transcribed clips

In Otter, the chatbot does well at answering a question on sign language but fails to identify the accent or dialect of the speaker. This is a good reminder of the limits of these models and how, without any contextual information, they cannot understand the interview beyond textual analysis. Oral historians in the UK have long understood interviews as fundamentally oral sources and current AI models risk taking us away from this.

In Adobe I tried asking a much more subjective question around emotion in the interview. While the chatbot does answer, it is again worth remembering the limits of this textual analysis, which, for example, could not identify crying, laughter or pitch change as emotion. It would also not understand the significance of any periods of silence. On our panel at the IFPH2024 conference in Luxembourg, Dr Julianne Nyhan noted how periods of silence tend to lead speech-to-text models to ‘hallucinate’, so the advice is to take them out; the problem is that oral history has long theorised the meaning and importance of silence.

Alongside the chatbot, Adobe also includes a degree of semantic searching where a search for steel brings up related words. This in itself might be the biggest gift new technologies offer to catalogue searching (shown expertly in Placing the Holocaust) – helping us to move away from what Mia Ridge calls ‘the tyranny of the keyword’.
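As a toy illustration of the difference, compare an exact keyword search with a search expanded by related terms. The related-terms map and catalogue entries below are invented for this sketch; a real system would use embeddings or a controlled vocabulary rather than a hand-written dictionary:

```python
# Toy sketch: keyword search vs. a minimal 'semantic' expansion.
# RELATED_TERMS and CATALOGUE are invented for illustration.

RELATED_TERMS = {
    "steel": ["steel", "furnace", "smelting", "ingot"],
}

CATALOGUE = [
    "Interview about working on the steel furnaces",
    "Memories of the smelting shop floor",
    "A childhood in rural Lincolnshire",
]

def keyword_search(query, records):
    """Return records containing the query word itself."""
    return [r for r in records if query.lower() in r.lower()]

def expanded_search(query, records):
    """Return records containing the query word or any related term."""
    terms = RELATED_TERMS.get(query.lower(), [query.lower()])
    return [r for r in records if any(t in r.lower() for t in terms)]

print(keyword_search("steel", CATALOGUE))   # only the first record
print(expanded_search("steel", CATALOGUE))  # also finds the smelting record
```

The expanded search surfaces the ‘smelting’ record that a plain keyword search for ‘steel’ would miss, which is the essence of moving beyond the keyword.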

However, the important thing is perhaps not how well these tools perform but the fact they exist in the first place. Oral historians and archivists who, for good reasons, are hesitant about integrating AI into their work might soon find it has happened anyway. For example, Zencastr, the podcasting software we have used since 2020 for remote recordings, now has an in-built AI tool. Robust principles on the use of AI are essential then not just for new projects or software, but also for work we are already doing and software we are already using.

The rise of AI in oral history raises theoretical questions around orality and silence, but must also be considered in terms of practical workflows: Do participation and recording agreements need to be amended?​ How do we label AI generated metadata in catalogue records, and should we be labelling human generated metadata too? Do AI tools change the risks and rewards of making oral histories available online? We can only answer these questions through critical engagement with the tools themselves.
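One way to make the labelling question concrete is to attach a provenance field to each catalogue value. A minimal sketch, with invented field names and provenance labels rather than any existing cataloguing standard:

```python
# Sketch: labelling metadata provenance in a catalogue record.
# Field names and provenance values are invented for illustration.
from dataclasses import dataclass

@dataclass
class MetadataField:
    name: str
    value: str
    provenance: str  # e.g. "human", "machine", "machine-corrected-by-human"
    source: str      # which person or tool produced the value

record = [
    MetadataField("title", "Lives in Steel interview", "human", "cataloguer"),
    MetadataField("transcript", "Sign languages for selling the sample...",
                  "machine", "ASR model, 2024"),
]

# Listing machine-generated fields becomes a simple filter.
machine_generated = [f.name for f in record if f.provenance == "machine"]
print(machine_generated)  # ['transcript']
```

Recording provenance per field, rather than per record, would also allow a transcript to be relabelled ‘machine-corrected-by-human’ after editing.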

The challenges of AI for oral history: key questions

Oral History Archivist Charlie Morgan shares some key questions for oral historians thinking about AI, along with examples of automatic speech recognition (ASR) tools in practice, in the first of two posts...

Oral history has always been a technologically mediated discipline and so has not been immune to the current wave of AI hype. Some have felt under pressure to ‘do some AI’, while others have gone ahead and done it. In the British Library oral history department, we have been adamant that any use of AI must align practically, legally and ethically with the Library’s AI principles (currently in draft form). While the ongoing effects of the 2023 cyber-attack have also stymied any integration of new technologies into archival workflows, we have begun to experiment with some tools. In September, I was pleased to present on this topic with Digital Curator Mia Ridge at the 7th World Conference of the International Federation for Public History in Belval, Luxembourg. Below is a summary of what I spoke about in our presentation, ‘Listening with machines? The challenges of AI for oral history and digital public history in libraries’.

The ‘boom’ in AI and oral history has mostly focussed on speech recognition and transcription, driven by the release of Trint (2014) and Otter (2016), but especially Whisper (2022). There have also been investigations into indexing, summarising and visualisation, notably from the Congruence Engine project. Oral historians are interested in how AI tools could help with documentation and analysis, but many also have concerns, including, but not limited to, ownership, data protection/harvesting, labour conditions, environmental costs, loss of human involvement, unreliable outputs and inbuilt biases.

For those of us working with archived collections there are specific considerations: How do we manage AI generated metadata? Should we integrate new technologies into catalogue searching? What are the ethics of working at scale and do we have the experience to do so? How do we factor in interviewee consent, especially since speakers in older collections are now likely dead or uncontactable?

With speech recognition, we are now at a point where we can compare different automated transcripts created at different times. While our work on this topic at the British Library has been minimal, future trials might help us build up enough research data to address the above questions.

Robert Gladders was interviewed by Alan Dein for the National Life Stories oral history project ‘Lives in Steel’ in 1991 and the extract below was featured on the 1993 published CD ‘Lives in Steel’.

The full transcripts for this audio clip are at the end of this post.

Sign Language

We can compare three automatic speech recognition (ASR) transcripts of the first line:

  • Human: Sign language was for telling the sample to the first hand, what carbon the- when you took the sample up into the lab, you run with the sample to the lab​
  • Otter 2020: Santa Lucia Chelan, the sound pachala fest and what cabin the when he took the sunlight into the lab, you know they run with a sample to the lab​
  • Otter 2024: Sign languages for selling the sample, pass or the festa and what cabin the and he took the samples into the lab. Yet they run with a sample to the lab.
  • Whisper 2024: The sand was just for telling the sand that they were fed down. What cabin, when he took the sand up into the lab, you know, at the run with the sand up into the lab

Gladders speaks with a heavy Middlesbrough accent and in all cases the ASR models struggle, but the improvements between 2020 and 2024 are clear. In this case, Otter in 2024 seems to outperform Whisper (‘The sand’ is an improvement on ‘Santa Lucia Chelan’ but it isn’t ‘Sign languages’), but this was a ‘small’ version of Whisper and larger models might well perform better.
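Comparisons like this can be made quantitative with a word error rate (WER) calculation: substitutions, insertions and deletions divided by the number of words in the human reference. A minimal sketch using a word-level edit distance, with the transcripts shortened here for readability:

```python
# Minimal word error rate (WER) sketch: word-level edit distance
# between a human reference transcript and an ASR hypothesis.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # Classic dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

human = "you run with the sample to the lab"
otter_2024 = "they run with a sample to the lab"
print(wer(human, otter_2024))  # two substitutions in eight words: 0.25
```

WER is a blunt instrument, of course: it weights ‘samplepasser’ no more heavily than ‘the’, which is part of why defining accuracy is itself a question.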

One interesting point of comparison is how the models handle ‘sample passer’, mentioned twice in the short extract:

  • Otter 2020: Sentinel pastor / sound the password​
  • Otter 2024: Salmon passer / Saturn passes​
  • Whisper 2024: Santland pass / satin pass

While in all cases the models fail, this would be easy to fix. The aforementioned CD came with its own glossary, which we could feed into a large language model working on these transcriptions. Practically this is not difficult, but it raises some larger questions. Do we need to produce tailored lexicons for every collection? This is time-consuming work, so who is going to do it? Would we label an automated transcript in 2024 that makes use of a human glossary written in 1993 as machine generated, human generated, or both? Moreover, what level of accuracy are we willing to accept, and how do we define accuracy itself?
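Even without a large language model, a glossary can drive simple post-correction by fuzzy-matching suspect words against known terms. A sketch using Python's standard-library difflib; the 0.6 cutoff is an illustrative choice, not a tuned value:

```python
# Sketch: post-correcting ASR output against a collection glossary
# using fuzzy string matching from the standard library.
import difflib

GLOSSARY = ["samplepasser", "sampling", "sintering"]

def correct(word: str, cutoff: float = 0.6) -> str:
    """Return the closest glossary term if one is similar enough,
    otherwise return the word unchanged."""
    matches = difflib.get_close_matches(word.lower(), GLOSSARY, n=1, cutoff=cutoff)
    return matches[0] if matches else word

# Otter 2024's 'Salmon passer' joined into one token, since the
# glossary treats 'samplepasser' as a single term.
print(correct("salmonpasser"))  # -> samplepasser
print(correct("foundry"))       # no close glossary term, left unchanged
```

Note that this only repairs words the model got nearly right; it cannot recover Whisper's ‘satin pass’, which has drifted too far from the glossary term to match.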

 

  • Samplepasser: The top man on the melting shop with responsibility for the steel being refined.
  • Sampling: The act of taking a sample of steel from a steel furnace, using a long-handled spoon which is inserted into the furnace and withdrawn.
  • Sintering: The process of heating crushed iron-ore dust and particles (fines) with coke breeze in an oxidising atmosphere to reduce sulphur content and produce a more effective and consistent charge for the blast furnaces. This process superseded the earlier method of charging the furnaces with iron-ore and coke, and led to greatly increased tonnages of iron being produced.
Sample glossary terms


17 December 2024

Open cultural data - an open GLAM perspective at the British Library

Drawing on work at and prior to the British Library, Digital Curator Mia Ridge shares a personal perspective on open cultural data for galleries, libraries, archives and museums (GLAMs) based on a recent lecture for students in Archives and Records Management…

Cultural heritage institutions face both exciting opportunities and complex challenges when sharing their collections online. This post gives common reasons why GLAMs share collections as open cultural data, and explores some strategic considerations behind making collections accessible.

What is Open Cultural Data?

Open cultural data includes a wide range of digital materials, from individual digitised or born-digital items – images, text, audiovisual records, 3D objects, etc. – to datasets of catalogue metadata, images or text, machine learning models and data derived from collections.

Open data must be clearly licensed for reuse, available for commercial and non-commercial use, and ideally provided in non-proprietary formats and standards (e.g. CSV, XML, JSON, RDF, IIIF).
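As a small illustration of non-proprietary formats, the sketch below writes the same catalogue records as both CSV and JSON using only the Python standard library. The field names, identifiers and licence value are invented for this example:

```python
# Sketch: publishing invented catalogue metadata in two
# non-proprietary formats (CSV and JSON), stdlib only.
import csv
import io
import json

records = [
    {"identifier": "C100/01", "title": "Lives in Steel interview", "licence": "CC0"},
    {"identifier": "C100/02", "title": "Steelworks oral history", "licence": "CC0"},
]

# CSV: widely readable, convenient for spreadsheets.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["identifier", "title", "licence"])
writer.writeheader()
writer.writerows(records)
print(buf.getvalue())

# JSON: convenient for APIs and programmatic reuse.
print(json.dumps(records, indent=2))
```

Because both formats are plain text with open specifications, researchers can parse them decades from now without any particular vendor's software.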

Why Share Open Data?

The British Library shares open data for multiple compelling reasons.

Broadening Access and Engagement: by releasing over a million images on platforms like Flickr Commons, the Library has achieved an incredible 1.5 billion views. Open data allows people worldwide to experience wonder and delight with collections they might never physically access in the UK.

Deepening Access and Engagement: crowdsourcing and online volunteering provide opportunities for enthusiasts to spend time with individual items while helping enrich collections information. For instance, volunteers have helped transcribe complex materials like Victorian playbills, adding valuable contextual information.

Supporting Research and Scholarship: in addition to ‘traditional’ research, open collections support the development of reproducible computational methods including text and data mining, computer vision and image analysis. Institutions also learn more about their collections through formal and informal collaborations.

Creative Reuse: open data encourages artists to use collections, leading to remarkable creative projects including:

Animation featuring an octopus holding letters and parcels on a seabed with seaweed
Screenshot from Hey There Young Sailor (Official Video) - The Impatient Sisters

 

16 illustrations of girls in sad postures
'16 Very Sad Girls' by Mario Klingemann

 

A building with large-scale projection
The BookBinder, by Illuminos, with British Library collections

 

Some Lessons for Effective Data Sharing

Make it as easy as possible for people to find and use your open collections:

  • Tell people about your open data
  • Celebrate and highlight creative reuses
  • Use existing licences for usage rights where possible
  • Provide data in accessible, sustainable formats
  • Offer multiple access methods (e.g. individual items, datasets, APIs)
  • Invest effort in meeting the FAIR, and where appropriate, CARE principles

Navigating Challenges

Open data isn't without tensions. Institutions must balance potential revenue, copyright restrictions, custodianship and ethical considerations with the benefits of publishing specific collections.

Managing expectations can also be a challenge. The number of digitised or born-digital items available may be tiny in comparison to the overall size of collections. The quality of digitised records – especially items digitised from microfiche and/or decades ago – might be less than ideal. Automatic text transcription and layout detection errors can limit the re-usability of some collections.

Some collections might not be available for re-use because they are still in copyright (or are orphan works, where the creator is not known), were digitised by a commercial partner, or are culturally sensitive.

The increase in the number of AI companies scraping collections sites to train machine learning models has also given some institutions cause to re-consider their open data policies. Historical collections are more likely to be out of copyright and published for re-use, but they also contain structural prejudices and inequalities that could be embedded into machine learning models and generative AI outputs.

Conclusion

Open cultural data is more than just making collections available: it's about creating dynamic, collaborative spaces of knowledge exchange. By thoughtfully sharing our shared intellectual heritage, we enable new forms of research, inspiration and enjoyment.

 

AI use transparency statement: I recorded my recent lecture on my phone, then generated a loooong transcription on my phone. I then supplied the transcription and my key points to Claude, with a request to turn it into a blog post, then manually edited the results.