08 January 2025
2024 Year in Review - Digital Scholarship Training Programme
Nora McGregor, Digital Curator and manager of the Digital Scholarship Training Programme, reflects on a year of delivering digital upskilling training to colleagues at the British Library, part of the Digital Research Team's focus on Embedding Digital Humanities in the British Library.
2024 was a strange and difficult year, to say the least, for us and all our lovely colleagues across the whole of the British Library as we contended daily with the ongoing effects of a cyber-attack disrupting just about every aspect of our work. Not to be cowed by criminality however, the Digital Research Team dug in and ensured the Digital Scholarship Training Programme (DSTP) continued without fail.
From our experience during the pandemic, we knew that in times of major disruption, British Library staff do not stand still. They focus on what they can do, including prioritising their upskilling, and have come to count on the DSTP as a kind of refuge whilst temporarily separated from their collections and normal workload.
So it’s with gratitude to my colleagues in the Digital Research Team, and to BL staff for their engagement, that I reflect proudly on a challenging year in which we managed to deliver a whopping 39 individual training events with nearly 900 attendees!
What we learned in 2024
Our training programme this year covered these topic priorities through a variety of talks, hands-on sessions, reading groups and formal workshops & courses:
- State-of-the-art Automatic Text Recognition (ATR) technologies
- Useful data science, machine learning and AI applications for analysing and enhancing GLAM digital collections and data
- The intersection of climate change + Digital Humanities
- Digital tools and methods to support the Library's Race Equality Action Plan
- WikiData, WikiSource, Wikimedia Commons
- OpenRefine for data-wrangling
- Collections as Data
- Making the most of the IIIF standard
We’re especially thankful for all the academics & professionals who contributed to our learning throughout the year by sharing their projects, experience and expertise with us! If you’d like to be part of our programme in 2025 get in touch with us at [email protected] with your idea, we’d love to hear from you.
2024 Year in Review-External Infographic by Nora McGregor
My Personal Highlights
In the coming months I will be interviewing my fellow Digital Curators to get their views on highlights from the 2024 Digital Scholarship Training Programme, either favourite events they attended or programmed in 2024 and topic areas they’re excited about this year. No easy ask actually, as I know they, like me, will have found every event spectacularly interesting and useful, but to highlight just a few for you...
21st Century Talks
Our 21st Century Curatorship talk series is looked after by Digital Curators Stella Wisdom and Adi Keinan-Schoonbaert. These are one-hour invited guest lectures, held once or twice a month, where we learn about exciting, innovative projects and research at the intersection of cultural heritage collections and new technologies. The talks are pitched at complete beginners – we try not to assume knowledge so that anyone from any department can come along! A few of my favourite talks in particular were from these projects:
- DE-BIAS - Detecting and cur(at)ing harmful language in cultural heritage collections | Europeana PRO
Kerstin Herlt and Kerstin Arnold introduced us to the DE-BIAS project, which aims to detect and contextualise potentially harmful language in cultural heritage collections. Working with themes like migration and colonial past, gender and sexual identity, ethnicity and ethno-religious identity, the project collaborates with minority communities to better understand the stories behind the language used - or behind the gaps apparent. We learned about the development of the vocabulary and the tools the project has created.
- The Print and Probability Project: From Restoration Era Printing to an Interim English Short Title Catalogue
Nikolai Vogler gave us an entertaining view of a selection of findings from the University of California’s Print & Probability project, an interdisciplinary research group at the intersection of book history, computer vision, and machine learning that seeks to discover Restoration-era letterpress printers whose identities have eluded scholars for several hundred years. He also presented his work on creating an interim English Short Title Catalogue (ESTC) in response to the cyber-attack on the Library in 2023, a pursuit for which colleagues were incredibly grateful!
- “Dark Matter: X%” - how many early modern Hungarian books disappeared without any trace?
This was such a fascinating talk by Péter Király, software developer and digital humanities researcher at the Göttingen computation centre, Germany. Estimating the unknown is always an interesting endeavour. There is a registry of surviving books, and we have collective knowledge about lost books, but how many early Hungarian printings have been lost without any historical trace? Their research group transformed the analytical bibliography "Régi Magyarországi Nyomtatványok" (Early Hungarian Printings) into a database and used mathematical models from the toolbox of biologists to help estimate the answer. The analysis of the database also highlights unknown or less investigated areas and enables them to extend previous research focusing on a particular time range to the whole period (such as religious trends during the Reformation and Counter-Reformation, and changes of genre over time).
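For readers curious what a model "from the toolbox of biologists" can look like, here is a minimal sketch of Chao1, a well-known unseen-species estimator from ecology. Whether the Hungarian project used this particular estimator is an assumption on my part, and the copy counts below are invented toy data, not figures from the talk. The idea is to treat each edition as a "species" and each surviving copy as a "sighting": editions known from only one or two copies hint at how many editions have left no copies at all.

```python
from collections import Counter


def chao1_estimate(copies_per_edition):
    """Estimate the total number of editions, including ones with no surviving copy.

    copies_per_edition: surviving copy counts, one integer per known edition.
    Uses the classic Chao1 form: S_obs + f1^2 / (2 * f2), where f1 and f2 are
    the numbers of editions surviving in exactly one and exactly two copies.
    """
    counts = [c for c in copies_per_edition if c > 0]
    observed = len(counts)
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    if f2 > 0:
        unseen = f1 * f1 / (2 * f2)
    else:
        # Standard fallback when no edition survives in exactly two copies.
        unseen = f1 * (f1 - 1) / 2
    return observed + unseen


# Invented toy data: eight known editions and their surviving copy counts.
surviving = [1, 1, 1, 1, 2, 2, 3, 5]
total = chao1_estimate(surviving)
lost_share = (total - len(surviving)) / total
print(f"Estimated editions: {total:.0f}, lost without trace: {lost_share:.0%}")
```

With this toy input the estimate is 12 editions against 8 known, so roughly a third would have vanished without trace. Chao1 gives a lower bound and leans heavily on the rarest survivors, so real analyses typically compare several estimators.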
Hack & Yacks
I have the privilege of programming and leading this particular series of events and they are my favourite days in the calendar! These are our casual, two-hour monthly meetups where we all take some time to have a hands-on exploration of new tools, techniques, and applications. No previous experience is ever needed: these are aimed at complete beginners (we’re usually learning something new too!) and we welcome colleagues from across the Library to come have a play! Some sessions are more "yack" than "hack", while others are more quiet hacking depending on the topic, but no matter the balance they're always illuminating.
- Introduction to AI and Machine Learning was great fun for me personally as I had the chance to give staff an interactive, hands-on introduction to concepts around AI and ML as they relate to library work, and to play around with some open machine learning tools. The session was based on much of the text and activities offered in the topic guide AI & ML in Libraries Literacies – Digital Scholarship & Data Science Essentials for Library Professionals and it was a useful way for me to test the content directly with its intended audience!
- Catalogues as Data was a session run by Harry Lloyd, our Research Software Engineer Extraordinaire, and Rossitza Atanassova, Digital Curator, as a two-part guided exploration of printed catalogues as data, working with OCR output and corpus linguistic analysis. In the first half we followed steps in a Jupyter Notebook to extract catalogue entries from OCR text, troubleshoot errors in the algorithm, and investigate Named Entity Recognition techniques. In the second half we explored catalogue entries with corpus linguistic techniques in AntConc, gaining a sense of how cataloguing practice and the importance of different terms change over time.
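To give a flavour of the first half of the Catalogues as Data session, here is a minimal sketch of pulling numbered entries out of OCR text. It is not the notebook used in the session: the entry format (a number, a full stop, then the description), the function name `extract_entries` and the sample text are all assumptions made up for illustration.

```python
import re

# Assumed format: an entry begins with a number and a full stop, e.g. "12. TITLE ..."
ENTRY_START = re.compile(r"^(\d+)\.\s+(.*)$")


def extract_entries(ocr_text):
    """Split OCR text into catalogue entries, merging wrapped lines."""
    entries = []
    current = None
    for raw_line in ocr_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = ENTRY_START.match(line)
        if match:
            current = {"number": int(match.group(1)), "text": match.group(2)}
            entries.append(current)
        elif current is not None:
            if current["text"].endswith("-"):
                # Rejoin a word hyphenated across a line break.
                current["text"] = current["text"][:-1] + line
            else:
                current["text"] += " " + line
        # Lines before the first entry (page headers and the like) are skipped.
    return entries


# Invented sample text standing in for one OCR'd catalogue page.
sample = """CATALOGUE OF BOOKS
1. BIBLIA Latina. Venice, 1475. Folio. With manu-
script notes.
2. CICERO. De officiis. Mainz, 1465.
"""

for entry in extract_entries(sample):
    print(entry["number"], entry["text"])
```

A rule this simple breaks easily, which is where the troubleshooting comes in: a wrapped line that happens to begin with a year such as "1475." would be mistaken for a new entry, and genuine hyphens at line ends would be lost.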
Digital Scholarship Reading Group
These monthly discussions, led by Digital Curators Mia Ridge and Rossitza Atanassova, are always open to any of our BL colleagues & students, regardless of job title or department. Discussions are regularly attended by colleagues from a range of departments including curators, reference specialists, technology, and research services.
My favourite session of the year by far was “No stupid questions, AI in Libraries”, a lovely meandering session we held in December and a great way to wrap up the year. Instead of discussing any particular reading, we all shared bits about what we had read or learned about independently on the topic of AI in Libraries and had some good-natured debate about where we believe it’s all headed for us on personal and professional levels. Though no readings were required, these were offered in case folks wanted to swot up:
- National Film and Sound Archive of Australia (NFSA) Principles for ML and AI
- Approaching AI at the National Library of Scotland
- Creative Australia Principles: Generative Artificial Intelligence and creative work
- (US) National Archives’ New Strategic Framework Emphasizes Building Capacity Through Responsible Use of Artificial Intelligence
- Smithsonian AI Values Statement -- Spring 2022
- Library of Congress (LC) Labs AI Planning Framework
- Generative AI Framework for HMG (which includes the BL!)
- Generative AI at the BBC
Formal Workshops
We also programme formal courses as needed and this year we focussed very much on building our knowledge of the Wikimedia Universe. I thoroughly enjoyed the lessons we got from Lucy Hinnie and Stuart Prior, which covered nearly every aspect of Wikimedia, and we’ll be doing much more with this new knowledge, particularly WikiData, in 2025!