Digital scholarship blog

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

22 June 2023

Explore Windrush Tales in our Digital Storytelling exhibition

On 22 June 1948 the HMT Empire Windrush docked in Tilbury, Essex, bringing people from the Caribbean who had been invited to help rebuild "the motherland" after the devastation of the Second World War.

Seventy-five years later, stories from the Windrush generation are shared in a groundbreaking, illustrated, text-based interactive narrative from 3-Fold Games. Windrush Tales is still in development, but a preview can be read exclusively in the British Library’s current Digital Storytelling exhibition, which is open until 15 October 2023.

Illustration of a boat and story choices

Windrush Tales art by Naima Ramanan © 3-Fold Games

Windrush Tales tells the stories of Rose, an aspiring nurse, and her older brother Vernon, through a beautifully illustrated interactive photo album and branching narrative, in which choices within the game lead to one of many endings. Apprehensive but tenacious, Rose joins her brother in England, intending to start work as a nurse in the newly formed NHS. Vernon has been in Britain for several years but, unknown to his sister, has struggled to find and stay in employment. Through his photography, he documents how they adapt to their new life, from grassroots arts and activism to church and social clubs.

Illustration of a photograph showing the faces of a man and a woman

Windrush Tales art by Naima Ramanan © 3-Fold Games

As part of their extensive research process to develop the game's story lines, creative director Chella Ramanan and writer Corey Brotherson, both descendants of the Windrush generation, have drawn upon their families' experiences, news coverage, books, and exhibitions. They also consulted with people who emigrated from the Caribbean to Britain during the Windrush period from the late 1940s to the early 1970s, and their families, via a workshop organised in collaboration with Jennifer Allsopp, from the University of Birmingham.

Windrush Tales is one of eleven showcased narratives in Digital Storytelling, the first exhibition of its kind at the Library. Curators worked closely with writers, artists and creators to display a range of innovative publications, which reflect the rapidly evolving field of interactive writing, stories that are dynamic, responsive, personalised and immersive.

To accompany this exhibition there is a season of in-person events at the Library. Writer Corey Brotherson from the Windrush Tales team will be speaking about another of his projects, the Clockwork Watch story world, at MIX 2023 Storytelling in Immersive Media, a one-day conference exploring the intersection of writing and technology, on Friday 7 July 2023. Corey is also a guest tutor at the Fiction as Dialogue, Interactive Fiction Summer School, which runs from 21 to 25 August 2023.

Posted by Digital Research Team at 9:00 AM

Tags

Black & Asian Britain, Collaborations, Contemporary Britain, Digital scholarship, Events, Games, Literature

04 May 2023

Webinar on Open Scholarship in GLAMs through Research Repositories

If you work in the galleries, libraries, archives, and museums (GLAM) sector and want to learn more about research repositories, then join us on Thursday 18 May for an online repository training session for cultural heritage professionals.

This event is part of the Library’s Repository Training Programme for Cultural Heritage Professionals. It is designed, based on the input received from previous repository training events (this, this and this), to explore some areas of open scholarship further. These include, but are not limited to, research activities in GLAM, the benefits of research repositories, scholarly publishing, research data management and digital preservation in scholarly communications.

Who is it for?

It is intended for those who are working in cultural heritage or a collection-holding organisation in roles where they are involved in managing digital collections, supporting the research lifecycle from funding to dissemination, providing research infrastructure and developing policies. However, anyone interested in the given topics is welcome to attend!

Programme

13.00 Welcome and introductions

Susan Miles, Scholarly Communications Specialist, British Library

Session 1 Open scholarship in GLAM research

13.15 Repositories to facilitate open scholarship

Jenny Basford, Repository Services Lead, British Library

13.40 Scholarly publishing dynamics in the GLAM environment

Ilkay Holt, Scholarly Communications Lead, British Library

14.05 Q&A

14.20 Break time

Session 2 Building openness in GLAM research

14.40 Research data management

Jez Cope, Data Services Lead, British Library

15.05 Digital preservation and scholarly communications

Neil Jefferies, Head of Innovation, Bodleian Libraries

15.30 Q&A

15.45 Closing

Register!

The event will take place from 13.00 to 15.45 on Thursday 18 May. Please register at this link to receive your access link for the online session.

What is next?

The last training event of the Library’s Repository Training Programme will be held on 31 May in Cardiff, hosted by the National Museum Cardiff. It will be an update and re-run of the previous face-to-face events. More information about the programme and the registration link can be found in this blog post.

Please contact [email protected] if you have any questions or comments about the events.

Previous Events

31 January, in-person, Edinburgh, hosted by the National Museums Scotland

8 March, online, hosted by the British Library

31 March, in-person, York, hosted by the Archaeology Data Service at the University of York

About British Library’s Repository Training Programme

The Library’s Repository Training Programme for cultural heritage professionals is funded as part of AHRC’s iDAH programme to support cultural heritage organisations in establishing or expanding open scholarship activities and sharing their outputs through research repositories. You can read more about the scoping report and the development of this training programme in this blog post.

Posted by Digital Research Team at 3:00 PM

Tags

Collaborations, Digital scholarship, eResources, Events, Research collaboration

02 May 2023

Detecting Catalogue Entries in Printed Catalogue Data

This is a guest blog post by Isaac Dunford, MEng Computer Science student at the University of Southampton. Isaac reports on his Digital Humanities internship project supervised by Dr James Baker.

Introduction

The purpose of this project has been to investigate and implement different methods for detecting catalogue entries within printed catalogues. Whilst printed catalogues are easy enough to digitise and convert into machine-readable data, dividing that data by catalogue entry requires turning the visual signifiers of divisions between entries - gaps in the printed page, large or upper-case headers, catalogue references - into machine-readable information. The first part of this project involved experimenting with XML-formatted data derived from the 13-volume Catalogue of books printed in the 15th century now at the British Museum (described by Rossitza Atanassova in a post announcing her AHRC-RLUK Professional Practice Fellowship project) and trying to find the best ways to detect individual entries and reassemble them as data (given that the text for a single catalogue entry may be spread across multiple pages of a printed catalogue). The next part involved building a complete system based on this approach that takes the large number of XML files for a volume and outputs all of the catalogue entries in a series of desired formats. This post describes our initial experiments with that data, the approach we settled on, and key features of our approach that you should be able to reapply to your catalogue data. All data and code can be found on the project GitHub repo.

Experimentation

The catalogue data was exported from Transkribus in two different formats: an ALTO XML schema and a PAGE XML schema. The ALTO layout encodes positional information about each element of the text (that is, where each word occurs relative to the top left corner of the page), which supports spatial analysis such as looking for gaps between lines. However, it also creates data files that are heavily encoded, meaning that it can be difficult to extract the text elements from them. The PAGE schema, by contrast, makes it easier to access the text elements in the files.
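
As a rough illustration of why PAGE XML is convenient for text extraction, the sketch below pulls the line-level text out of a PAGE file using only the Python standard library. It assumes the usual PAGE nesting of TextLine > TextEquiv > Unicode; the function name page_lines is made up for this example and is not taken from the project code.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any XML namespace so the sketch works across PAGE versions."""
    return tag.rsplit("}", 1)[-1]


def page_lines(page_xml):
    """Return the text of every TextLine in a PAGE XML string, in file order.

    Only the TextEquiv that is a direct child of TextLine is read, so
    word-level TextEquiv elements (if present) are not counted twice.
    """
    root = ET.fromstring(page_xml)
    lines = []
    for element in root.iter():
        if local(element.tag) != "TextLine":
            continue
        for child in element:
            if local(child.tag) != "TextEquiv":
                continue
            for unicode_el in child:
                if local(unicode_el.tag) == "Unicode":
                    lines.append(unicode_el.text or "")
    return lines
```

Matching on local tag names rather than a fixed namespace is a deliberate choice here, since the namespace string changes between schema versions.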

An image of a digitised page from volume 8 of the Incunabula Catalogue and the corresponding Optical Character Recognition file encoded in the PAGE XML Schema

Raw PAGE XML for a page from volume 8 of the Incunabula Catalogue

Raw ALTO XML for a page from volume 8 of the Incunabula Catalogue

Spacing and positioning

One of the first approaches tried in this project was to use size and spacing to find entries. The intuition behind this is that there is generally a larger amount of white space around the headings in the text than there is between regular lines. And in the ALTO schema, there is information about the size of the text within each line as well as about the coordinates of the line within the page.
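
The intuition can be sketched in a few lines of Python. This is a minimal illustration rather than the project's code: it assumes each ALTO TextLine carries VPOS, HPOS and HEIGHT attributes, and the gap_factor threshold and function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from statistics import median


def local(tag):
    """Strip any XML namespace so the sketch works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def line_boxes(alto_xml):
    """Return (vpos, height, hpos) for every TextLine, in file order."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for element in root.iter():
        if local(element.tag) == "TextLine":
            boxes.append((
                float(element.get("VPOS")),
                float(element.get("HEIGHT")),
                float(element.get("HPOS")),
            ))
    return boxes


def candidate_headings(alto_xml, gap_factor=2.0):
    """Return indices of lines preceded by an unusually large vertical gap.

    A gap is the distance from the bottom of one line to the top of the
    next; a line is flagged when its gap exceeds gap_factor times the
    median gap on the page.
    """
    boxes = line_boxes(alto_xml)
    gaps = [
        boxes[i][0] - (boxes[i - 1][0] + boxes[i - 1][1])
        for i in range(1, len(boxes))
    ]
    if not gaps:
        return []
    typical = median(gaps)
    if typical <= 0:
        return []
    return [i + 1 for i, gap in enumerate(gaps) if gap > gap_factor * typical]
```

A sketch like this treats any large gap as a possible entry boundary, so tables and multi-column pages will produce false positives.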

However, we found that using the size of the text line and/or the positioning of the lines was not effective, for three reasons. First, blank space between catalogue entries inconsistently contributed to the size of some lines. Second, whenever there were tables within the text, there would be large gaps in spacing compared to the normal text, which in turn caused those tables to be read as divisions between catalogue entries. And third, even though entry headings were visually further to the left on the page than regular text, and therefore should have had the smallest x coordinates, the materiality of the printed page was inconsistently represented as digital data, and so presented regular lines with small x coordinates that could be read - using this approach - as headings.

Final Approach

Entry Detection

Our chosen approach uses the data in the PAGE XML schema, and is bespoke to the data for the Catalogue of books printed in the 15th century now at the British Museum as produced by Transkribus (and indeed, to the version of Transkribus: having built our code around some initial exports, running it over the later volumes - which had been digitised last - threw an error due to some slight changes to the exported XML schema).

The code takes the XML input and finds entries using a content-based approach that looks for features at the start and end of each catalogue entry. After experimenting with different approaches, we found that the most consistent way to detect the catalogue entries was to:

Find the “reference number” (e.g. IB. 39624) which is always present at the end of an entry.
Find a date that is always present after an entry heading.

This gave us an ability to contextually infer the presence of a split between two catalogue entries, the main limitation of which is quality of the Optical Character Recognition (OCR) at the point at which the references and dates occur in the printed volumes.
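
The two steps above can be sketched roughly as follows. The regular expressions are hypothetical stand-ins modelled on the example reference "IB. 39624" and on a four-digit fifteenth-century year; the project's actual patterns live in its repository and are not reproduced here, and the function names are invented for illustration.

```python
import re

# Hypothetical patterns: adjust both to the catalogue in hand.
REFERENCE = re.compile(r"\bI[A-C]\.\s?\d{3,6}\b")  # e.g. "IB. 39624"
DATE = re.compile(r"\b14\d{2}\b")                   # a year such as 1478


def split_entries(lines):
    """Group text lines into entries, closing an entry at each reference number."""
    entries, current = [], []
    for line in lines:
        current.append(line)
        if REFERENCE.search(line):
            entries.append(current)
            current = []
    if current:
        # Trailing lines with no reference, e.g. an entry that continues
        # on the next page or a reference the OCR failed to read.
        entries.append(current)
    return entries


def looks_complete(entry, heading_window=3):
    """Check that a date appears near the start and a reference on the last line."""
    has_date = any(DATE.search(line) for line in entry[:heading_window])
    has_reference = bool(REFERENCE.search(entry[-1]))
    return has_date and has_reference
```

Entries that fail looks_complete are the ones worth checking by hand, since a missed reference or date usually points to an OCR problem at that spot.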

An image of a digitised page with a catalogue entry and the corresponding text output in XML format

XML of a detected entry

Language Detection

The reason for dividing catalogue entries in this way was to facilitate analysis of the catalogue data, specifically analysis that sought to define the linguistic character of descriptions in the Catalogue of books printed in the 15th century now at the British Museum and how those descriptions changed and evolved across the thirteen volumes. Segments of each catalogue entry contain text transcribed from the incunabula that was not written by a cataloguer (and is therefore not part of their cataloguing ‘voice’). Those transcribed sections are in French, Dutch, Old English, and other languages that a machine could detect as not being modern English. To further facilitate research use of the final data, one of the extensions we implemented was therefore to label sections of each catalogue entry by language. This was achieved using a Python library for language detection and then - for a particular output type - replacing non-English sections of text with a placeholder (e.g. NON-ENGLISH SECTION). Whilst the language detection model does not detect Old English, and as a result assigns those sections labels for various other languages, it was still able to break blocks of text in each catalogue entry into English and non-English sections.
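
The placeholder step might look something like the sketch below. The post does not name the language-detection library used, so the detector is passed in as a function; toy_is_english is a deliberately crude stand-in (it counts common English function words) and would be swapped for a real detector in practice. All names and the threshold are assumptions made for this example.

```python
import re

PLACEHOLDER = "NON-ENGLISH SECTION"

# Toy stand-in for a real language-detection library.
ENGLISH_WORDS = {
    "the", "of", "and", "in", "a", "with", "is", "to",
    "on", "at", "by", "as", "for", "from", "this",
}


def toy_is_english(text, threshold=0.2):
    """Guess whether text is English from its share of common function words."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return True  # nothing to judge, so leave the section alone
    hits = sum(word in ENGLISH_WORDS for word in words)
    return hits / len(words) >= threshold


def english_only(sections, is_english=toy_is_english):
    """Replace every section judged non-English with the placeholder."""
    return [s if is_english(s) else PLACEHOLDER for s in sections]
```

Because only an English versus non-English decision is needed for this output, a detector that mislabels Old English as some other language still gives the right result here.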

Text files for catalogue entry number IB39624 showing the full text and the detected English-only sections.

Text outputs of the full and English-only sections of the catalogue entry

Poorly Scanned Pages

Another extension for this system was to use the input data to try to determine whether a page had been poorly scanned: for example, where the lines in the XML input read from one column straight into another as a single line (rather than the XML reading order following the visual signifiers of column breaks). The system detects poorly scanned pages by looking at the lengths of all lines in the PAGE XML file, establishing which lines deviate substantially from the mean line length, and, if sufficient outliers are found, marking the page as poorly scanned.
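
A minimal sketch of this check is given below. The post does not state how "deviate substantially" or "sufficient outliers" were defined, so the two thresholds (z_cutoff and max_outlier_share) and the function name are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def is_poorly_scanned(lines, z_cutoff=2.0, max_outlier_share=0.05):
    """Flag a page when too many lines have an unusual length.

    A line is an outlier when its length is more than z_cutoff standard
    deviations from the page's mean line length; the page is flagged when
    the share of outliers exceeds max_outlier_share.
    """
    lengths = [len(line) for line in lines if line.strip()]
    if len(lengths) < 2:
        return False
    mu = mean(lengths)
    sigma = pstdev(lengths)
    if sigma == 0:
        return False  # every line has the same length
    outliers = [n for n in lengths if abs(n - mu) > z_cutoff * sigma]
    return len(outliers) / len(lengths) > max_outlier_share
```

Lines that run across two columns are roughly twice the usual length, which is what pushes them outside the cutoff.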

Key Features

The key part of this system that can be taken and applied to a different problem is the method for detecting entries. We expect that the fundamental method of looking for marks in the page content to identify the start and end of catalogue entries in the XML files would be applicable to other data derived from printed catalogues. The only parts of the algorithm that would need changing for a new system are the regular expressions used to find the start and end of each catalogue entry. And as long as the XML input comes in the same schema, the code should be able to consistently divide up the volumes into the individual catalogue entries.

Posted by Digital Research Team at 8:45 PM

Tags

Digital scholarship, LIS research, Projects, Rare books, Research collaboration, Tools

19 April 2023

Repository Training Day in Cardiff: Research in GLAM and research repositories to facilitate open scholarship activities for cultural heritage organisations

If you work in the galleries, libraries, archives, and museums (GLAM) sector and want to learn more about research repositories, then register for a hybrid repository training day for cultural heritage professionals hosted by the National Museum Cardiff in Wales on 31 May 2023.

The British Library’s Repository Training Programme for cultural heritage professionals is funded as part of AHRC’s iDAH programme to support GLAM organisations in establishing or expanding open scholarship activities and sharing their outputs through research repositories.

Manuscript illustration of Cardiff from the 17th Century showing a river, fields, a church and other small buildings

Insert from John Speed’s county maps of Wales, first published in The Theatre of the Empire of Great Britain by George Humble (1610), made available by the National Library of Wales via Flickr Commons

Background

The very first in-person event was in Edinburgh in January, with a follow-up online session in March and a second in-person event in York, hosted by the Archaeology Data Service (ADS) at the University of York on 23 March.

We had attendees from the British Museum, National Museums Scotland, National Portrait Gallery, Towards a National Collection (AHRC) and the ADS in various roles including scholarly communications librarian, digital archivist, project manager and senior researchers in their organisations.

The full programme for this event is available in a previous blog post. During the event, conversations took place on a range of topics, from policy development and embedding research culture in organisations to encouraging staff to be involved in research cycles and the different types of workflows in different institutions. The feedback we received from the audience showed a need to explore more about research data management, scholarly publishing, challenges in smaller organisations, working with emerging formats and building communities of practice.

Now looking forward, the last hybrid repository training event will be hosted by the National Museum Cardiff in Wales on Wednesday 31 May. You can see the details below and register here. We are looking forward to meeting everyone who is interested in learning more about research repositories from cultural heritage organisations.

Who is this training for?

We invite everyone who is working in a cultural heritage or collection-holding organisation in roles where they are involved in managing digital collections, supporting the research lifecycle from funding to dissemination, providing research infrastructure and developing policies. However, anyone interested in the given topics is welcome to attend.

What will you learn?

This one-day training session is designed as a starting point to a broader set of knowledge that will help you to:

Understand the research landscape in cultural heritage organisations, benefits of openness for heritage research, basic concepts of open principles and influencing decision makers
Lay foundations for repository services including stakeholder engagement, policy development, technical overview and project planning
Adopt common principles and frameworks, technical standards and requirements in establishing repository services in a cultural heritage organisation
Explore basics of the scholarly communications ecosystem in the context of cultural heritage practices

Prerequisites

No previous knowledge of topics is required. However, an understanding of open access will maximise the benefit of the taught content for attendees.

Programme

10:30 - Welcome and introductions

10:50 - iDAH Programme

Joanna Dunster, Head of (Research) Infrastructure, AHRC

11:05 - Session 1 Opening up heritage research

This session covers the topics of understanding the research landscape in GLAM organisations, benefits of openness for heritage research, basic concepts of open principles and frameworks.

Ilkay Holt, Scholarly Communications Lead, BL

Susan Miles, Scholarly Communications Specialist, BL

11:45 - Q&A / Discussion

12:00 - Break

12:15 - Session 2 Getting started with heritage GLAM repositories

This session covers topics on the role of repository infrastructure in open access to heritage research and positioning research repositories in an organisation including policy and development.

Ilkay Holt, Scholarly Communications Lead, BL

Susan Miles, Scholarly Communications Specialist, BL

12:45 - Lunch

13:30 - Session continues

13:55 - Q&A / Discussion

14:10 - Session 3: Realising and expanding the benefits

This module covers the technical overview and requirements for running a cultural heritage repository, including an overview of BL’s Shared Research Repository, platforms and software, content administration and technical features.

Graham Jevon, Digital Services Specialist, BL

Nora Ramsey, Assistant to Digital Services Specialist, BL

14:30 - Break

14:40 - Session continues

15:00 - Q&A / Discussion

15:15-15:30 - Closing Remarks

Book your place

In-person sessions are planned for a maximum of 35 people per event and registrants from cultural heritage institutions will be prioritised. Registration for the event is free. Please fill in this form to book your place.

Please note that registrations for in-person attendance will close at 4pm Friday 26th May and confirmation for in-person attendance will be sent to the registered email address.

Registrations for online attendance will close at 6pm on Tuesday 30th May. The Zoom access link will be sent to the registered email address the day before the event.

Members of the Research Infrastructure Services Team at the British Library will be delivering the training programme. The team has over 25 years of broad experience and extensive knowledge in supporting open scholarship across the sector and with international partners. They also provide a Shared Research Repository Service for cultural heritage organisations.

Please contact [email protected] if you have any questions or comments about this training programme.

Posted by Digital Research Team at 11:00 AM

Tags

Collaborations, Data, Digital scholarship, Events

03 April 2023

Topics in contemporary Digital Scholarship via five years of our Reading Group

Since March 2016, the Digital Scholarship Reading Group at the British Library has discussed articles, videos, podcasts, blog posts and chapters that touch on digital scholarship in libraries. Digital Curator Mia Ridge previously shared our readings up to May 2018 and took a thematic look at our readings at the intersection of digital scholarship and anti-racism in July 2020.

As the Living with Machines project draws to an end this (northern) summer, Mia provides an updated list of our readings since June 2018: I started including more pieces on deep learning, machine learning, AI ('artificial intelligence'), big data, data science, digital history, digitised newspapers, and user experience design for digital collections when we began discussing what became Living with Machines in early 2017. This was partly a way for me to catch up with relevant topics, and partly to lay the groundwork for LwM across the organisation. You can see that reflected in our topics up to May 2018 and onward.

Of course, the group continued to cover other topics, and sessions were suggested and/or led by colleagues including Adi Keinan-Schoonbaert, Annabel Gallop, Graham Jevon, Jez Cope, Lucy Hinnie, Mary Stewart, Nora McGregor, Sarah Miles, Sarah Stewart and Stella Wisdom. Especial thanks to Rossitza Atanassova and Deirdre Sullivan who’ve been helping me run the group in recent years. In 2021 we started using the January session to invite colleagues across the Library to look around and pick topics for discussion in the year ahead.

So what did we discuss from June 2018 to the end of 2022?

'Big Data' and Digital Collections – pick one or more of pdf, CHISpaper.pdf, Digital archives as Big data.pdf, 1461444816661553.pdf
Access & Management for Meaningful Collections - Machine Learning & Crowdsourcing – take a look at https://www.theguardian.com/world/2021/jul/18/secrets-of-rebel-slaves-in-barbados-will-finally-be-revealed; https://blogs.bl.uk/living-knowledge/2022/02/behind-the-scenes-at-the-british-library-graham-jevon-cataloguer-for-the-endangered-archives-program.html; https://blogs.bl.uk/endangeredarchives/2021/07/help-trace-the-stories-of-enslaved-people-in-the-caribbean-using-colonial-newspapers.html
Becoming a Desk(top) Profession: Digital Photography and the Changing Landscape of Archival Research
Cancel Culture, Social Media & the Importance of Forgetting - listen, read or watch whatever you are able to, or just come along for the discussion! Podcast (20 mins) - https://www.bbc.co.uk/sounds/play/p08ybt8d; Short article - https://journals.sagepub.com/doi/10.1177/1527476420918828; Video (10 mins) - https://podcasts.ox.ac.uk/emma-smith-forgetting-digital-age
Citation Capture: Enhancing Understanding of the Use of Unique and Distinct Collections within Academic Research.
Climate Change and the Digital Humanities – watch Jo Walton’s Digital Humanities Climate Coalition (appx. 6 minutes long, 24:00-30:00); Helen Hardy’s Collections & Climate (appx. 7 minutes long, 42:24-51:00). Also Digital Humanities Climate Change Manifesto – mentioned in Jo Walton’s talk; Mobilising museums for climate action toolkit – In a nutshell, pp. 4-5; UKRI targets net zero computing blog post; Communicating Climate Risk, a Toolkit – Executive summary and key messages
Cultural Analytics - Data Cultures, Culture as Data, https://eresources.remote.bl.uk:2127/10.22148/16.035.
Data science or data humanities? Opportunities, barriers, and rewards in digitally-led analysis of history, culture and society (in-person excursion to the Turing lecture by Melissa Terras)
Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence
Deepfakes and Misinformation – pick one of 'Deepfakes & Misinformation: How dangerous are deepfakes?' - a blog post from the Dutch AV museum; 'Snapshot Paper - Deepfakes and Audiovisual Disinformation'; 'How do we work together to detect AI-manipulated media?' from the Witness Media Lab
Digital Art History and the Computational Imagination
Digitisation & Data Sovereignty - the CARE Principles (PDF summary); The CARE Principles for Indigenous Data Governance, http://eresources.remote.bl.uk:2240/10.5334/dsj-2020-043; Working with the CARE principles: operationalising Indigenous data governance
Digitizing and Enhancing Description Across Collections to Make African American Materials More Discoverable on Umbra Search African American History
Emerging Formats Research and Collecting - Interactive Storytelling, https://eresources.remote.bl.uk:2127/10.1007/978-3-030-62516-0_27
Emerging Formats: Complex digital media and its impact on the UK Legal Deposit Libraries
Engaging in Collaboration from Trading Zones of Digital History
Enhanced Curation and Contextual Collections - Berendse, Z, 2020. Browsing History: Archiving Video Game Context; also Newman, J (2011) '(Not) playing games: player-produced walkthroughs as archival documents of digital gameplay'; blog post by Shu-Wen Lin: Level Up: Playing to Document and Preserve Video Games.
Exploring the Bloodaxe Archive: a creative and critical dialogue
George Oates: Making and Remaking Collections Online
Ground Truth: In the Archives That Train Machine Learning – online excursion to a lecture at the National Archives by Kate Crawford
How NOT to create a digital media scholarship platform: the history of the Sophie 2.0 project
Infrastructure studies meet platform studies in the age of Google and Facebook
Is Santa Claus Real?
Language Preservation & Discoverability - Bugis manuscripts and Batak manuscripts (and provenance). Also check out Equitable Access: Using Metadata to Level the Playing Field in a Multilingual Country (video, from 0:00-11:40); Challenging legacies slidedeck (Arlis 13 May 2022) – Alan Danskin, Collection Metadata Standards Manager, BL
New Scholarship in the Digital Age: Making, Publishing, Maintaining, and Preserving Non-Traditional Scholarly Objects and Library of Congress Digital Scholarship Working Group Report
Of global reach yet of situated contexts: an examination of the implicit and explicit selection criteria that shape digital archives of historical newspapers
Open Access in Digital Scholarship - Open Humanities: Why Open Science in the Humanities is not Enough
Oral History & COVID Collection – pick from Charlie Morgan’s Oral History Society blog post on the rapid response approach to collecting tricky remote interviews: https://www.ohs.org.uk/general-interest/when-the-crisis-fades-what-gets-left-behind/; references the excellent Eira Tansey blog post: https://eiratansey.com/2020/06/05/no-one-owes-their-trauma-to-archivists-or-the-commodification-of-contemporaneous-collecting/. “First, Do No Harm”: Tread Carefully Where Oral History, Trauma, and Current Crises Intersect, Jennifer A. Cramer, https://www.tandfonline.com/doi/full/10.1080/00940798.2020.1793679; collecting during COVID in this short blog post: https://oralhistoryreview.org/current-events/nhs-70-covid-19/ (Stephanie Snow and Angela Whitecross)
Participation in heritage crowdsourcing
Practical AI Ethics – pick one of Lessons from archives: strategies for collecting sociocultural data in machine learning or Managing Bias When Library Collections Become Data
Queer Criticalities, Instagram, and the Ethics of Museum Display
Reviews in Digital Humanities – pick an article that matches your interests
Serving researchers in a self-service world
Slow librarianship – pick one of https://meredith.wolfwater.com/wordpress/2021/10/18/what-is-slow-librarianship/ or http://www.inthelibrarywiththeleadpipe.org/2017/the-innovation-fetish-and-slow-librarianship-what-librarians-can-learn-from-the-juicero/
Strategies for the Curiosity‐Driven Museum Visitor
The Consequences of Framing Digital Humanities Tools as Easy to Use
The Equivalence of “Close” And “Distant” Reading; Or, toward a New Object for Data-Rich Literary History
The Nightmare of Surveillance Capitalism
Towards a User-Centric Evaluation of UK Non-Print Legal Deposit: A Digital Library Futures White Paper
Triggers, Decontextualisation & Digitised Collections - pick one of On Trigger Warnings: https://www.aaup.org/report/trigger-warnings; Labeling and Rating Systems: http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/labelingrating; What to Know When Watching Gone With the Wind (4 minutes 25 seconds): https://www.youtube.com/watch?v=0DF2FKRToiQ; Digitization Selection Criteria as Anti-Racist Action: https://journal.code4lib.org/articles/14667
User experience design for libraries – pick one of slides for 'UX, ethnography and possibilities: for Libraries, Museums and Archives' by Ned Potter https://www.slideshare.net/thewikiman/ux-ethnography-and-possibilities-for-libraries-museums-and-archives; a short overview article: 'User Experience (UX) in Libraries: Let’s Get Physical (and Digital)' by Leo Appleton https://insights.uksg.org/articles/10.1629/uksg.317/
Visual Style in Two Network Era Sitcoms
Web 3 and Commercialisation of the Web. Take a look at: https://web3isgoinggreat.com/what; https://en.wikipedia.org/wiki/Web3; NFTs Weren’t Supposed to End Like This by Anil Dash https://www.theatlantic.com/ideas/archive/2021/04/nfts-werent-supposed-end-like/618488/; Web3 Is Headed Our Way. Are Cultural Institutions On Board? https://jingculturecrypto.com/we-are-museums-wac-fellowship-web3/
What Can Big Data Teach Us about Eviction? by Jo Guldi
What is Digital Humanities? Read Ain't No Way Around It: Why We Need to Be Clear About What We Mean By “Digital Humanities” and have a look through the Index of Digital Humanities Conferences
Where are we now? A review of research on the history of women's soccer in Ireland
Why Microsoft and Warner Bros. Archived the Original ‘Superman’ Movie on a Futuristic Glass Disc
You and AI – Just An Engineer: The Politics of AI (video)

Posted by Digital Research Team at 4:50 PM

Tags

Data, Digital scholarship, LIS research, Research collaboration

31 March 2023

Mapping Caribbean Diasporic Networks through the Correspondence of Andrew Salkey

This is a guest post by Natalie Lucy, a PhD student at University College London, who recently undertook a British Library placement to work on a project Mapping Caribbean Diasporic Networks through the correspondence of Andrew Salkey.

Project Objectives

The project, supervised by curators Eleanor Casson and Stella Wisdom, focussed on the extensive correspondence contained within Andrew Salkey’s archive. One of the initial objectives was to digitally depict the movement of key Caribbean writers and artists, as evidenced within the correspondence, many of whom travelled between Britain and the Caribbean as well as the United States, Central and South America and Africa. Although Salkey corresponded with a diverse range of people, we focused on the letters in his archive which were from Caribbean writers and academics and which illustrated patterns of movement of the Caribbean diaspora. Much of the correspondence stems from the 1960s and 1970s, a time when Andrew Salkey was particularly active both in the Caribbean Artists Movement and, as a writer and broadcaster, at the BBC.

Photograph of Andrew Salkey

Andrew Salkey was unusual not only for the panoply of writers, artists and politicians with whom he was connected, but also because he sustained those relationships, carefully preserving the correspondence which resulted from those networks. My personal interest in this project stemmed from the fact that my PhD seeks to consider the ways that the Caribbean trickster character, Anancy, has historically been reinvented to say something about heritage and identity. Significant to that question was the way that the Caribbean Artists Movement, a dynamic group of artists and writers formed in London in the mid-1960s, and of which Andrew Salkey was a founder, appropriated Anancy, reasserting him and the folktales to convey something of a literary ‘voice’ for the Caribbean. For this reason, I was also interested in the writing networks which were evidenced within the correspondence, together with their impact.

What is Gephi?

Prior to starting the project, Eleanor, who had catalogued the Andrew Salkey archive, and Digital Curator Stella had identified Gephi, an open-source application for visualising and exploring networks, as a possible tool through which to visualise this data. Gephi has been used in a variety of projects, including several at Harvard University; examples of the breadth and diversity of those initiatives can be found here. Several of these projects have social networks or historical trading routes as their focus, with obvious parallels to this project. Others notably use correspondence as their main data.

Gathering the Data

Andrew Salkey was known as something of a chronicler. He was interested in letters and travel and was also a serious collector of stamps. As such, he had not only retained the majority of the letters he received but categorised them. Eleanor had originally identified potential correspondents who might be useful to the project, selecting writers who travelled widely, whose correspondence had been separately stored by Salkey, partly because of its volume, and who might be of wider interest to the public. These included the acclaimed Caribbean writers, Samuel Selvon, George Lamming, Jan Carew and Edward Kamau Brathwaite and publishers and political activists, Jessica and Eric Huntley.

Our initial intention was to limit the data to simple facts which could easily be gleaned from the letters. Gephi required that we record these on a spreadsheet, which had to conform to a particular format. In the first stages of the project, the data was confined to the dates and locations of the correspondence, information which could suggest the patterns of movement within the diaspora. However, the letters were so rich in detail that we ultimately recorded other information. This included any additional travel undertaken by any of the correspondents which was clearly evidenced in the letters, together with any passages from the correspondence which demonstrated either something of the nature and quality of the friendships or, alternatively, the mutual benefit of those relationships to the careers of so many of the writers.

Creating a visual network

Dr Duncan Hay was invited to collaborate with me on this project, as he has considerable expertise in this field; his research interests include web mapping for culture and heritage and data visualisation for literary criticism. After the initial data was collated, we discussed with Duncan what visualisations could be created. It became apparent early on that creating a visualisation of the social networks, as opposed to the patterns of movement, might be relatively straightforward via Gephi, an application which was particularly useful for this type of graph. I had prepared a spreadsheet, but Gephi requires the data to be presented in a strictly consistent way, which meant that any anomalies had to be eradicated and the data effectively ‘cleaned up’ using OpenRefine. Gephi also requires that information is presented by way of a system of ‘nodes’, ‘edges’ and ‘attributes’ with corresponding spreadsheet columns. In our project, the ‘nodes’ referred to Andrew Salkey, each of the correspondents, and other individuals of interest who were specifically referred to within the correspondence. The ‘edges’ referred to the way that those people were connected which, in this case, was through correspondence. However, what added to the potential of the project was that these nodes and edges could be further described by reference to ‘attributes’. The possibility of assigning a range of ‘attributes’ to each of the correspondents allowed a wealth of additional information to be provided about the networks. As a consequence, and in order to make any visualisation as informative as possible, I also added brief biographical information for each of the writers and artists, to be entered as ‘attributes’, together with some explanation of the nature of the networks that were being illustrated.
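To make the idea of ‘nodes’, ‘edges’ and ‘attributes’ more concrete, here is a minimal Python sketch of how a list of letters might be turned into the two tables that Gephi's spreadsheet import expects. The column names `Id`, `Label`, `Source`, `Target`, `Type` and `Weight` are the ones Gephi recognises; everything else is an assumption made for illustration, including the sample rows, the `relation` attribute column and the function names. It is not the spreadsheet or workflow actually used in the project.

```python
import csv
import io
from collections import Counter

# Hypothetical sample records: (sender, recipient, people mentioned in the letter).
# These rows are illustrative only and are not transcribed from the archive.
letters = [
    ("George Lamming", "Andrew Salkey", ["Rex Nettleford", "Claudia Jones"]),
    ("George Lamming", "Andrew Salkey", []),
    ("Samuel Selvon", "Andrew Salkey", []),
]


def build_tables(letters):
    """Return (nodes, edges) as lists of dicts keyed by Gephi's column names."""
    names = []                # every person, in order of first appearance
    letter_counts = Counter() # (sender, recipient) -> number of letters
    mention_pairs = set()     # (sender, person mentioned in a letter)

    for sender, recipient, mentioned in letters:
        for name in (sender, recipient, *mentioned):
            if name not in names:
                names.append(name)
        letter_counts[(sender, recipient)] += 1
        for person in mentioned:
            mention_pairs.add((sender, person))

    ids = {name: i for i, name in enumerate(names, start=1)}
    nodes = [{"Id": ids[name], "Label": name} for name in names]

    # One weighted edge per correspondent pair; the weight is the letter count.
    # "relation" is a hypothetical attribute column, not a Gephi keyword.
    edges = [
        {"Source": ids[s], "Target": ids[t], "Type": "Directed",
         "Weight": count, "relation": "letter"}
        for (s, t), count in letter_counts.items()
    ]
    # One unweighted edge per person referenced within a letter.
    edges += [
        {"Source": ids[s], "Target": ids[t], "Type": "Directed",
         "Weight": 1, "relation": "mention"}
        for (s, t) in sorted(mention_pairs)
    ]
    return nodes, edges


def to_csv(rows):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


nodes, edges = build_tables(letters)
nodes_csv = to_csv(nodes)
edges_csv = to_csv(edges)
```

Writing `nodes_csv` and `edges_csv` to two files gives one table of people and one table of connections. Further attribute columns, such as a short biography for each person, could be added to the node rows in the same way.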

The visual illustration below shows not only the quantity of letters from the sample of correspondents to Andrew Salkey (the pink lines), but also which other correspondents formed part of those networks and were referenced as friends or contacts within specific items of correspondence. For example, George Lamming references the academic Rex Nettleford and the writer and activist Claudia Jones, the founder of the Notting Hill Carnival, in his correspondence, connections which are depicted in grey.

Data visualisation of nodes and lines representing Andrew Salkey's Correspondence Network

Gephi: Andrew Salkey correspondence network

The aim was, however, for the visualisation also to be interactive. This required considerable further manipulation of the format and tools. In this illustration you can see the information that is revealed about the prominent Barbadian writer George Lamming which, in an interactive format, can be accessed via the ‘i’ symbols beside many of the nodes coloured in green.

Whilst Gephi was a useful tool with which to illustrate the networks, it was less helpful as a way to demonstrate the patterns of movement, one of the primary objectives of the project. A challenge was, therefore, to create a map which could be both interactive and illustrative of the specific locations of the correspondents as well as their movement over time. With Duncan’s input and expertise, we opted for a hybrid approach, utilising two principal ways to illustrate the data: we used Gephi to create a visualisation of the ‘networks’ (above) and another software tool, Kepler.gl, to show the diasporic movement.

A static version of what ultimately will be a ‘moving’ map (illustrating correspondence with reference to person, date and location) is shown below. As well as demonstrating patterns of movement, it should also be possible to access information about specific letters as well as their shelf numbers through this map, hopefully making the archive more accessible.

Data visualisation showing lines connecting countries on a map showing part of the Americas, Europe and Africa

Patterns of diasporic movement from Andrew Salkey's correspondence, illustrated in Kepler.gl
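As a rough illustration of the kind of table a ‘moving’ map needs, the Python sketch below turns letter records into rows with a date and two pairs of coordinates, one for where a letter was written and one for where it was received. Kepler.gl accepts CSV uploads and can draw an arc between two latitude/longitude pairs and filter rows by a date column. The column names, the small gazetteer of approximate coordinates, the anonymous correspondents and the placeholder shelfmarks are all assumptions made for this example and are not taken from the project's data.

```python
import csv
import io
from datetime import date

# Hypothetical gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "London": (51.51, -0.13),
    "Kingston": (17.97, -76.79),
    "Port of Spain": (10.67, -61.52),
}

# Hypothetical letter records; names, dates and shelfmarks are placeholders.
LETTERS = [
    {"correspondent": "Correspondent A", "date": date(1968, 7, 15),
     "sent_from": "Port of Spain", "sent_to": "London", "shelfmark": "PLACEHOLDER-2"},
    {"correspondent": "Correspondent A", "date": date(1965, 3, 1),
     "sent_from": "Kingston", "sent_to": "London", "shelfmark": "PLACEHOLDER-1"},
    {"correspondent": "Correspondent B", "date": date(1971, 1, 9),
     "sent_from": "Unknown place", "sent_to": "London", "shelfmark": "PLACEHOLDER-3"},
]


def to_arc_rows(letters, gazetteer):
    """Return (rows, unresolved): map-ready rows plus letters lacking coordinates."""
    rows, unresolved = [], []
    for letter in sorted(letters, key=lambda item: item["date"]):
        origin = gazetteer.get(letter["sent_from"])
        destination = gazetteer.get(letter["sent_to"])
        if origin is None or destination is None:
            # Keep these aside for manual checking rather than guessing a location.
            unresolved.append(letter)
            continue
        rows.append({
            "correspondent": letter["correspondent"],
            "date": letter["date"].isoformat(),
            "from_lat": origin[0], "from_lng": origin[1],
            "to_lat": destination[0], "to_lng": destination[1],
            "shelfmark": letter["shelfmark"],
        })
    return rows, unresolved


def to_csv(rows):
    """Serialise the rows to CSV text ready to be uploaded to a mapping tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows, unresolved = to_arc_rows(LETTERS, GAZETTEER)
arcs_csv = to_csv(rows)
```

Keeping the shelfmark on every row is what would allow a reader to move from a line on the map back to the letter in the archive. Letters whose place names cannot be matched are listed separately instead of being dropped silently.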

Whilst we are still exploring the potential of this project and how it might intersect with other areas of research and archives, it has already revealed something of the benefits of this type of data visualisation. For example, a project of this type could be used as an educational tool, providing a simple, but dynamic, introduction to the Caribbean Artists Movement. Being able to visualise the project has also allowed us to input information which confirms where specific letters of interest might be found within the archive. Ultimately, it is hoped that the project will offer ways to make a rich, yet arguably undervalued, archive more accessible to a wider audience, with the potential to serve as an introductory model, or ‘pilot’, for further archives in the future.

Posted by Digital Research Team at 11:09 AM

Tags

Americas, Black & Asian Britain, Collaborations, Contemporary Britain, Data, Digital scholarship, Experiments, Manuscripts, Maps, Projects, Research collaboration

28 March 2023

BL Labs Symposium 30 March 2023: AI and GLAM data

Don’t forget to register for the 2023 BL Labs Symposium (https://us02web.zoom.us/webinar/register/WN_oAApT1laSFSCm28Kyfz4bA)

Following the latest advancements in AI is almost a job in itself. The constant excitement sometimes feels almost bewildering, and it leaves us little room to really get stuck into the peculiarities and joys of data and AI methods and tools emerging in Galleries, Libraries, Archives and Museums (GLAM). For the second part of the BL Labs Symposium this year, we will be looking to spend some time with examples of real data, tools and methods emerging in the GLAM AI world.

We will start our Data and AI session with an exciting presentation by Yannis Assael from DeepMind. Yannis will show us Ithaca, the first Deep Neural Network interactive interface built to restore and attribute ancient Greek inscriptions. We expect this to be a real game changer for the use of AI for collections that include complex and incomplete fragments of text.

We will also explore some British Library examples of AI and machine learning, mainly using data derived from our newspaper and map collections. Kalle Westerling will reflect on the latest from the Living with Machines project, a ground-breaking research collaboration between The Alan Turing Institute, the British Library, and the Universities of Cambridge, East Anglia, Exeter, and London (QMUL, King’s College). Gethin Rees will tell us about his work, which is engaging the public with geospatial data and in the process improving our capabilities to locate national collections.

BL Labs are dedicated to opening up the British Library’s data, especially for all researchers who want to use it for different types of computational research. This remains a daunting task. But we have been working on it! Silvija Aurylaite, BL Labs Manager, will share the BL Labs direction of travel, including sharing our new BL Labs website in Beta. The site will be live for the first time, with the Symposium audience kicking off our testing and engagement phase.

We hope that this session will give us some time to share and reflect on the ongoing AI work in GLAM with all its excitement, challenges and opportunities. All going well, there may even be a chance to get your hands on some new datasets.

We hope you can join us at the BL Labs Symposium on Thursday 30 March 2023. For the full programme, and further information on all our speakers, please read our earlier blog post. We are also delighted to be going ahead with an informal drinks and networking drop-in session at the Library between 6.30pm and 7.30pm, and you are all most welcome to join us. Register for this and / or the Symposium here.

Posted by Digital Research Team at 7:38 AM

Tags

BL Labs, Collaborations, Data, Digital scholarship, Events, Research collaboration

20 March 2023

Digital Storytelling at the 2023 BL Labs Symposium

One half of the 2023 British Library Labs Symposium will be dedicated to digital storytelling. This has been a significant part of BL Labs work over the years. We have collaborated with experimental artists, from David Normal, who creatively reused British Library Flickr images for his giant lightbox collage installation Crossroads of Curiosity at the 2014 Burning Man festival, to Michael Takeo Magruder, first runner-up in the BL Labs 2016 competition, on his 2019 exhibition Imaginary Cities.

People looking at lightbox collage artworks

Crossroads of Curiosity by David Normal

In the last few years, due to the COVID-19 pandemic disruption, digital stories and engagement have become mainstream across the Galleries, Libraries, Archives and Museums (GLAM) sector. New types of digital storytelling mixing social media, online exhibitions embedding narratives and digital objects, and interactive online events reaching entirely new audiences, delighted us all. However, we also discovered that there can be a saturation point with online engagement, and that many digital developments have some way to go to reach their full potential.

As we are hopefully entering healthier times, new opportunities to mix virtual and physical worlds are starting to open up. With this in mind, we felt that this is the right moment to explore a new age of digital storytelling at the 2023 BL Labs Symposium.

The idea is to explore what is changing in the world of technological possibilities and how they are continuing to develop. We have envisaged a journey that will take us from the big picture of the arising digital possibilities to more specific examples from the British Library’s work. In true BL Labs spirit we will also celebrate initiatives that creatively reuse the Library’s digital collections.

To help us look into the big trends, we are delighted to be joined by Zillah Watson, whose extraordinary breadth of experience working with the BBC, Meta, BFI and Royal Shakespeare Company, amongst many others, will help us to get a deeper sense of the opportunities of virtual reality (VR). Zillah will look into what it means, not just to be dazzled with technological possibilities, but also to enter the magic of storytelling.

Talking of magic, we are lucky to welcome award-winning Director Anrick Bregman and award-winning Producer Grace Baird. Anrick and Grace will take us deeper into the potential of using VR to uncover hidden stories. Anrick’s film A Convict Story is an interactive VR project built on British Library data that brings to life a story discovered by the linking of data from centuries ago, using data research powered by machine learning.

Even closer to home, our own Stella Wisdom and Ian Cooke will talk about their current work on curating the British Library’s forthcoming Digital Storytelling exhibition (2 June – 15 October 2023), which will explore the ways technology provides opportunities to transform and enhance the way writers write and readers engage. The exhibition draws on the Library’s collection of contemporary digital publications and emerging formats to highlight the work of innovative and experimental writers. It will feature interactive works that invite and respond to user input, reading experiences influenced by data feeds, and immersive story worlds created using multiple platforms and audience participation. This is an exciting development, as we can see how earlier British Library creative digital experiments, collaborations and research projects are building into an exhibition in its own right.

We hope you can join us for discussion at the BL Labs Symposium on Thursday 30 March 2023. For the full programme, and further information on all our speakers, please read our earlier blog post.

You can book your place here.

Posted by Digital Research Team at 9:28 AM

Tags

BL Labs, Collaborations, Contemporary Britain, Data, Digital scholarship, Events, Experiments