Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

02 August 2023

Writing tools for Interactive Fiction - an updated list

In the spring of 2020, during the first UK lockdown, I wrote an article for the British Library English and Drama blog, titled ‘Writing tools for Interactive Fiction’. Quite a few things have changed since then and as the Library launched its first exhibition on Digital Storytelling this June, it seemed like the perfect time to update this list with a few additions.

Interactive fiction (IF), or interactive narrative/narration, is defined as “software simulating environments in which players use text commands to control characters and influence the environment.”

The British Library has been collecting examples of UK interactive fiction as part of the Emerging Formats Project, which is a collaborative effort from all six UK Legal Deposit Libraries to look at the collection management requirements of complex digital publications. Lynda Clark, the British Library Innovation Fellow for Interactive Fiction, built the Interactive Narratives collection on the UK Web Archive (UKWA) during her placement. Because of Legal Deposit Regulations, most of the items in the Interactive Narratives collection can only be accessed on Library premises – which also extends to other collections in the UK Web Archive, such as the New Media Writing Prize collection.

Lynda also conducted analysis on genres, interaction patterns and tools used to build these narratives.

 

Many of these tools are free to use and don’t require any previous knowledge of programming languages. This is not meant to be an exhaustive list, but it might be a useful overview of some of the tools currently available, if you’d like to start experimenting with writing your own interactive narrative. We are also very excited to be able to offer a week-long Interactive Fiction Summer School this August at the Library, running alongside the Digital Storytelling exhibition.

For easier navigation, these are the tools included in this article: Twine, ink/inky & inklewriter, Bitsy, Inform 7, ChoiceScript and Downpour.

 

Twine

Twine is an open-source tool for writing text-based, non-linear narratives. Created by Chris Klimas in 2009, Twine is well suited to writing Choose Your Own Adventure-like stories without knowing how to code. The output is an HTML file, which facilitates publishing and distribution, as it can be run on any computer with an Internet connection and a web browser. If you have any knowledge of CSS or JavaScript, it’s possible to add extra features and specific designs to your Twine story, but the standard Twine structure only requires you to type text and put brackets around the phrases that will become links in the story (linking to another passage or branching into different directions). There is an online version and a downloadable version that runs on Windows, macOS and Linux. Twine has multiple story formats, with different features and ways to write the interactive bits of your story. The Twine Reference is a good place to start, but there is also a Twine Cookbook (containing ‘recipes’, instructions and examples to do a variety of things).
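As a rough illustration of the bracket convention described above, the Python sketch below pulls the links out of passage text. It assumes the double-square-bracket forms `[[Passage]]` and `[[label->Passage]]`; story formats differ in which other link forms they accept, and the passage names and text here are invented for the example. It is not how Twine itself is implemented.

```python
import re

# Matches [[target]] or [[label->target]]; other link forms are ignored here.
LINK_PATTERN = re.compile(r"\[\[(.+?)\]\]")


def extract_links(passage_text):
    """Return (label, target) pairs for each bracketed link in a passage."""
    links = []
    for match in LINK_PATTERN.finditer(passage_text):
        inner = match.group(1)
        if "->" in inner:
            # The text before the arrow is shown to the reader,
            # the text after it names the passage to jump to.
            label, target = inner.split("->", 1)
        else:
            # A bare link uses the same text as label and passage name.
            label = target = inner
        links.append((label.strip(), target.strip()))
    return links


# Hypothetical two-passage story, keyed by passage name.
story = {
    "Nap": "The sofa is warm and the afternoon is long. [[Wake up]]",
    "Wake up": "Nothing has changed. [[Go back to sleep->Nap]]",
}

for name, text in story.items():
    print(name, "links to", extract_links(text))
```

Each passage becomes a node and each bracketed phrase an edge to another passage, which is why a branching story can be written as plain text with a few brackets.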

Example of text from Cat Simulator 3000. 'You dream of mice. You dream of trout. You dream of balls of yarn. You dream of world domination. You dream of opening your own bank account. You dream of the nature of sentience.' Followed by the prompt 'Wake up'.
Some quality cat dreams.
(from Emma Winston’s Cat Simulator 3000)

 

Twine is the most used tool in the UKWA collection, so there are many examples of IF written in it, from cat and teatime simulators (Emma Winston’s Cat Simulator 3000 and Damon L. Wakes’ Lovely Pleasant Teatime Simulator), to stories that include a mix of video, images and audio (Chris Godber’s Glitch), and horror games made for Gothic Novel Jam using the British Library’s Flickr collection of images (Freya Campbell’s The Tower – NB some content warnings apply). Lynda Clark also authored an original story as a conclusion to her placement: The Memory Archivist incorporates many of the themes that emerged during her research and won the BL Labs Artistic Award 2019.

 

ink/inky & inklewriter

Cambridge-based video game studio inkle is behind another IF tool – or two. Ink is the scripting language used to author many of inkle’s videogames – the idea behind it is to mark up “pure-text with flow in order to produce interactive scripts”. It doesn’t require any programming knowledge and the resulting scripts are relatively easy to read. Inky is the editor to write ink scripts in – it’s free to download and lets you test your narrative as you write it. Once you’re happy with your story, you can export it for the web or as a JSON file. There’s a quick tutorial to walk you through the basics, as well as a full manual on how to write in ink. Ink was also used to write 80 Days, another work collected by the British Library as part of the Emerging Formats Project and currently exhibited as part of the Digital Storytelling exhibition.

A side by side showing the back end and front end of what writing in ink looks like.
A page from 80 Days, written using ink.

 

inklewriter is an open-source, ready-to-use, browser-based IF “sketch-pad”. It is meant to be used to sketch out narratives more than to author fully-developed stories. There is no download required and the fact that it is a simple and straightforward tool to experiment with IF makes it a good fit for educators. Tutorials are included within the platform itself so that you can learn while you write.

This year’s Interactive Fiction Summer School at the British Library will teach attendees how to write interactive fiction using ink, with a focus on dialogue and writing with the player in mind. Dr. Florencia Minuzzi will lead the 5-day course, together with a number of guest speakers whose work is featured in the Digital Storytelling exhibition – including Corey Brotherson, Destina Connor, Dan Hett and Meghna Jayanth. The school runs from Monday 21st to Friday 25th August – no previous coding experience necessary!

A screenshot from 80 Days Ⓒ inkle. Two men facing each other with the prompt 'begin conversation'.
A screenshot from 80 Days Ⓒ inkle.

 

Bitsy

Bitsy is a browser-based editor for mini games developed by Adam Le Doux in 2016. It operates within clear constraints (8x8 pixel tiles, a 3-colour palette, etc.), which is actually one of the reasons why it is so beloved. You can draw and animate your own characters within your pixel grid, write the dialogue and define how your avatar (your playable character) will interact with the surrounding scenery and with other non-playable characters. Again, no programming knowledge is necessary. Bitsy is especially good for short narratives and vignette games. After completing your game, you can download it as an HTML file and then share it however you prefer. There is Bitsy Docs, as well as some comprehensive tutorials and even a one-page pamphlet covering the basics.

GIF animation from the Bitsy game 'British Library Simulator'
Shout-out to the Emerging Formats Project
(from Giulia Carla Rossi’s The British Library Simulator)

 

To play (and read) a Bitsy work you should use your keyboard to move the avatar around and interact with the ‘sprites’ (interactive items, characters and scenery – usually recognisable as sporting a different colour from the non-interactive background). You can wander around a Zen garden reflecting on your impending wedding (Ben Bruce’s Zen Garden, Portland, The Day Before My Wedding), light the village fires to welcome the midwinter spirits (Ash Green’s Midwinter Spirits), experience a love story through mixtapes (David Mowatt’s She Made Me A Mix Tape), or if you’re still craving a nice cuppa you can review some imaginary tea shops (Ben Bruce’s Five Great Places to Get a Nice Cup of Tea When You Are Asleep). You can even visit a pixelated version of the British Library and discover more about our contemporary and digital collections with The British Library Simulator.

 

Inform 7

While Twine allows you to write hypertext narratives (where readers can progress through the story by clicking on a link), Inform 7 lets you write parser-based interactive fiction. Parser-based IF requires the reader to type commands (sometimes full sentences) in order to interact with the story.
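To make the difference concrete, here is a minimal Python sketch of what a parser does with a typed command. It is not Inform 7 code and does not reflect how Inform works internally; the objects, replies and the simple VERB NOUN rule are all assumptions made for illustration.

```python
# Hypothetical world state: each object maps the verbs it understands to a reply.
ROOM_OBJECTS = {
    "door": {"open": "The door creaks open."},
    "button": {"push": "Something hums behind the wall."},
    "pie": {"eat": "Delicious. The pie is gone."},
}

# Words the parser drops before looking for a verb and a noun.
FILLER_WORDS = {"the", "a", "an", "to", "at"}


def respond(command):
    """Interpret a typed command as VERB NOUN and return the story's reply."""
    words = [w for w in command.lower().split() if w not in FILLER_WORDS]
    if len(words) != 2:
        return "Try a short command such as OPEN DOOR."
    verb, noun = words
    actions = ROOM_OBJECTS.get(noun)
    if actions is None:
        return f"You can't see any {noun} here."
    return actions.get(verb, f"You can't {verb} the {noun}.")


for typed in ["open the door", "eat pie", "push pie", "sing"]:
    print(">", typed)
    print(respond(typed))
```

Unlike a hypertext story, nothing on screen tells the reader which commands exist, so working out what can be typed is part of the play.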

A how to guide showing what text options are available for playing text based explorer games in Inform. Helpful tips like 'Try the commands that make sense! Doors are for opening; buttons are for pushing; pie is for eating!'
How to Play Interactive Fiction (An entire strategy guide on a single postcard)
Written by Andrew Plotkin -- design by Lea Albaugh. This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.

 

Inform 7 is a free-to-use, open-source (as of April 2022) tool to write interactive fiction. Originally created as Inform by Graham Nelson in 1993, the current Inform 7 was released in 2006 and uses natural language (based on the English language) to describe situations and interactions. The learning curve is a bit steeper than with Twine, but the natural language approach allows users with no programming experience to write code in a simplified language that reads like English text. Inform 7 also has a Recipe Book and a series of well-documented tutorials. Inform runs on Windows, macOS and Linux and lets you output your game as HTML files.

While the current version of Inform is Inform 7, narratives using previous versions of the system are still available – Emily Short’s Galatea is always a good place to start. You could also explore mysterious ruins with your romantic interest (C.E.J. Pacian’s Love, Hate and the Mysterious Ocean Tower), play a gentleman thief (J.J. Guest’s Alias, the Magpie) or make more tea (Joey Jones’ Strained Tea).

 

ChoiceScript

ChoiceScript is a JavaScript-based scripting language developed by Adam Strong-Morse and Dan Fabulich of Choice of Games. It can be used to write choice-based interactive narratives, in which the reader has to select among multiple choices to determine how the story will unfold. The simplicity of the language makes it possible to create Choose-Your-Own-Adventure-style stories without any prior coding knowledge. The ChoiceScript source is available to download for free on the Choice of Games website (it also requires writers to have Node.js installed on their machine). Once your story is complete, you can publish it for free online. Otherwise, Choice of Games offer the possibility of publishing your work with them (they publish to various platforms, including iOS, Android, Kindle and Steam) and earning royalties from it. There is a tutorial that covers the basics, including a Glossary of ChoiceScript terms. The Choice of Games blog also includes some articles with tips on how to design and write interactive stories, especially long ones.

Genres of works built using ChoiceScript are again quite varied – from sci-fi stories exploring the relationships between writers and readers (Lynda Clark’s Writers Are Not Strangers), to crime/romantic dramas (Toni Owen-Blue’s Double/Cross) and fantasy adventures (Thom Baylay’s Evertree Inn).

 

Downpour

Downpour is a game-making tool for phones currently in development. Created by v buckenham, Downpour is a tool that will allow users to make interactive games in minutes, only using their phone’s camera and linking images together. There is no expectation of previous programming knowledge and by removing the need to access a computer, Downpour promises to be a very approachable tool. Release is currently planned for 2023 on iOS and Android – if you want to be notified when it launches you can sign up here.

Downpour banner (purple writing over pink background)
Downpour banner.

 

More resources

As I mentioned before, this is in no way a comprehensive list – there are a lot of other tools and platforms to write IF, both mainstream and slightly more obscure (Ren’Py, Quest, StoryNexus, Raconteur, Genarrator, just to mention a few). Try different tools, find the one that works best for you or use a mix of them if you prefer! Experiment as much as you like.

If you’d like to discover even more tools to build your interactive project, Everest Pipkin has an excellent list of Open source, experimental, and tiny tools.

Emily Short’s Interactive Storytelling blog also offers a round-up of very interesting links about interactive narratives.

If you want to be inspired by more independent games and interactive stories, Indiepocalypse offers a curated selection of video and/or physical games in the form of a monthly anthology.

To conclude, I’ll leave you with a quote by Anna Anthropy from her book Rise of the Videogame Zinesters:

“Every game that you and I make right now [...] makes the boundaries of our art form (and it is ours) larger. Every new game is a voice in the darkness. And new voices are important in an art form that has been dominated for so long by a single perspective. [...]

There’s nothing to stop us from making our voices heard now. And there will be plenty of voices. Among those voices, there will be plenty of mediocrity, and plenty of games that have no meaning to anyone outside the author and maybe her friends. But [...] imagine what we’ll gain: real diversity, a plethora of voices and experiences, and a new avenue for human beings to tell their stories and connect with other human beings.”

This post is by Giulia Carla Rossi, Curator for Digital Publications

14 July 2023

Share Family: British National Bibliography (Beta) service is live

Contents

Introduction

Share Family and National Bibliographies

       What is a National bibliography?

       BNB in the Share Family

Benefits

Future developments

Beta service

Further information

 

Introduction

The British National Bibliography (BNB), first published in January 1950, is a weekly listing of new books and journals published or distributed in the United Kingdom and the Republic of Ireland.  Over the last seventy-three years, the BNB has adapted to changing customer needs by embracing new technologies, from cards in the 1950s to mark-up languages for data exchange in the 1970s and CD-ROM in the 1980s. The BNB now provides online access to details of over 5 million publications and forthcoming titles, ranging in scope from computer science to history, from novels to textbooks.

 

Two examples of bibliographies including information like title, author, place of publication, year, description, prices etc.
1. Examples of British National Bibliography records, April 19th 2023.

In 2011, the Library launched the Linked Open Data BNB.  At that time, linked data was an emerging technology using Web protocols to link data sets, as envisaged in Sir Tim Berners-Lee’s concept of a Semantic Web[1].  Our initial foray into linked data was successful from a technical perspective. We were able to convert BNB data held in Machine Readable Cataloging (MARC) format into linked data structures and make it available in a variety of schemas under an open licence.  Nevertheless, we lacked the capacity to re-model our data in order to realise the potential of linked data.  As the technology matured, we began to look around for partners with whom we could collaborate to take BNB forward.
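For readers unfamiliar with the idea, the Python sketch below shows in miniature what turning a catalogue record into linked data statements can look like. It is an assumption-laden illustration, not the conversion the Library ran: the record is a simplified stand-in for MARC (field 100 for the main personal name, field 245 for the title), the subject URI is a placeholder, and Dublin Core Terms is used only as one familiar vocabulary.

```python
# Simplified stand-in for a MARC record: field tag -> subfield code -> value.
marc_like_record = {
    "100": {"a": "Levy, Andrea"},  # main entry, personal name
    "245": {"a": "Small Island"},  # title statement
}

# Mapping from (tag, subfield) to a Dublin Core Terms property URI.
FIELD_TO_PROPERTY = {
    ("100", "a"): "http://purl.org/dc/terms/creator",
    ("245", "a"): "http://purl.org/dc/terms/title",
}


def record_to_triples(subject_uri, record):
    """Turn the mapped fields of a record into (subject, predicate, object) triples."""
    triples = []
    for (tag, code), property_uri in FIELD_TO_PROPERTY.items():
        value = record.get(tag, {}).get(code)
        if value is not None:
            triples.append((subject_uri, property_uri, value))
    return triples


# The subject URI is a placeholder, not a real BNB identifier.
for triple in record_to_triples("http://example.org/resource/1", marc_like_record):
    print(triple)
```

A field-by-field mapping like this keeps the shape of the original record, which hints at why re-modelling the data around entities is a separate and larger task.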

As described in my September 2020 blogpost, British Library Joins Share-VDE Linked Data Community, the British Library joined the Share Community (now the Share Family) to develop our linked data service. The Share Linked Data Environment is “a global family built on collaboration that brings libraries, archives and museums together with a common goal and joins their knowledge in an ever-widening network of inter-connected bibliographic data.” (Share Family, 2022).

 

Share Family and National Bibliographies

“The Share Family is a suite of innovative tools and services, developed and driven by libraries, for libraries, in an international collaborative, consortial effort. Share-VDE enables the discovery of knowledge to increase user engagement with library and cultural heritage collections.”[2]

Screenshot: Share family components showing layers like Advanced API, Advanced Entity Model, Authority Service, Deliverables etc.
2. Share family components[3].

The Share Family has supported us through the transition from our traditional MARC data to linked open data.  We provided a full copy of the British National Bibliography to the Share team for identification and clustering of entities, e.g. works, publications, persons. Working with colleagues from other institutions on Share-VDE working groups, we contribute to the development of the underlying data structures and the presentation of data.  This collaborative approach has enabled delivery of the British National Bibliography as the first institutional tenant of the Share Family National Bibliographies Portal.
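The clustering step can be pictured with a small Python sketch. This is not the Share team's method, which is not described here; it assumes a deliberately crude rule (same normalised author and title means same work), and the records and their fields are simplified, hypothetical examples.

```python
from collections import defaultdict

# Hypothetical, simplified publication records; real BNB records carry far more fields.
publications = [
    {"title": "Small Island", "author": "Levy, Andrea", "format": "paperback"},
    {"title": "Small island", "author": "Levy, Andrea", "format": "large print"},
    {"title": "Small Island", "author": "Edmundson, Helen", "format": "paperback"},
]


def work_key(record):
    """Build a crude matching key from the normalised author and title."""
    return (record["author"].casefold().strip(), record["title"].casefold().strip())


def cluster_by_work(records):
    """Group publication records that share the same author and title key."""
    clusters = defaultdict(list)
    for record in records:
        clusters[work_key(record)].append(record)
    return dict(clusters)


for key, members in cluster_by_work(publications).items():
    print(key, [m["format"] for m in members])
```

Here the two Andrea Levy records fall into one cluster while the Helen Edmundson record forms its own, which is the kind of work-level grouping an entity-centred view relies on.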

What is a National bibliography?

“National bibliographies are a permanent record of the cultural and intellectual output of a nation or country, which is witnessed by its publishing output. They gather the bibliographic information of current publications to preserve and provide ongoing access to this record.”

IFLA Bibliography Section

The IFLA (International Federation of Library Associations and Institutions) Register of national bibliographies contains 52 entries, ranging from Andorra to Vietnam.  National bibliographies vary in scope, but each provides insights into the intellectual and cultural history of society, literature and publishing.  The Share Family National Bibliographies Portal offers the potential for clustering and searching multiple national bibliographies on a single platform.

BNB in the Share Family

Screenshot of the BNB home screen stating 'Search for people, original works and publications'
3. Screenshot BNB home screen.

The British Library is proud that the British National Bibliography is the first tenant selected for the Share Family National Bibliographies Portal.

BNB is now available to explore in Beta: https://bl.natbib-lod.org. You can search for publications, original works and people, as illustrated by these examples:

You can use the national bibliography to search for a specific publication, such as a large print edition of the novel Small island by Andrea Levy.

4. Screenshot: Bibliographic description of large print edition of Small Island by Andrea Levy.

 

You can also find original works inspired by earlier works:

5. Screenshot: Results set for publication of the work, Small island by Helen Edmundson.

 

Alternatively, you can search for works by a specific author… 

Screenshot showing original works by Douglas Adams
6. Screenshot: Original works by Douglas Adams.

 

…or about a specific person

Screenshot showing original works about Douglas Adams
7. Screenshot: Original works about Douglas Adams.

 

…or by organisation

Screenshot showing results set for BBC
8. Screenshot: Results set for BBC.

 

Benefits

What benefit do we expect to gain from this collaboration?

  • We profit from practical experience our collaborators have gained through other linked data initiatives
  • We gain access to a state of the art, extensible infrastructure designed for library data
  • We gain a new channel for dissemination of the BNB, in aggregation with other national bibliographies

We are able to re-tool our metadata for the 21st Century:

  • Our data will be remodelled and clustered making it more compatible with current data models, including the IFLA Library Reference Model, RDA: Resource Description and Access, and Bibframe
  • Our data will be enriched with URIs that will make it more effective in linked data environments
  • The entity-centred view of the British National Bibliography offers new perspectives for researchers

 

Future developments

Conversion of the BNB and publication in the National Bibliographies Portal is only the beginning. 

  • The BNB data from the Cluster Knowledge base will also be published in the triple store
  • Original records will be available to the British Library as Bibframe 2.0, for dissemination or reuse as linked data
  • Users will be provided with access to the data via data dumps and a SPARQL endpoint
  • Our MARC records will be enriched with original Share URIs and URIs from external sources
  • Other national bibliographies will join BNB in the national bibliographies portal

The British National Bibliography represents only a fraction of the Library’s data.   You can explore the British Library’s collection through our catalogue, which we plan to contribute to Share-VDE in future.

 

Beta service

The British National Bibliography in the Share Family is being made available in Beta. The service is still being tested. The interface and the functionality are subject to change and may not work for everyone.  You can tell us what you think about the service or report problems by contacting [email protected].

 

Further information:

British National Bibliography https://bnb.bl.uk  

Share VDE http://www.share-family.org/

Share Family wiki https://wiki.share-vde.org/wiki/Main_Page

Share VDE Virtual Discovery Environment in linked open data https://svde.org/

National Bibliographies in Linked Open Data https://natbib-lod.org

British National Bibliography Linked Open Data Portal https://bl.natbib-lod.org

 

Footnotes

[1] Berners-Lee, Tim; James Hendler; Ora Lassila (May 17, 2001). "The Semantic Web". Scientific American, 284(5): 34-43 (May 2001).

[2] Share-VDE: supporting the creation, management and discovery of linked open data for libraries: executive summary. Share-VDE Executive Committee. December 7th, 2022. Share-VDE Website (viewed 19th June 2023)

[3] Share Family – Linked data ecosystem. How does it work?  http://www.share-family.org/  (viewed on 23rd June 2023)

06 July 2023

Our team at Digital Humanities 2023 Conference, 10-14 July

Three of us from the British Library Digital Research Team will be attending Digital Humanities 2023 in Graz, Austria next week. The last DH Conference we attended was in Utrecht in 2019 and we can’t wait to participate again in person this year. 

We are looking forward to meeting new and old DH-ers and to having exciting in-person conversations in between the conference sessions throughout the week. 

In particular we want to invite you to come and visit us during the conference poster session on Wednesday 12 July from 18:00. There will be drinks and nibbles on offer and ample time for discussions.

Here is a list of our posters and we look forward to talking to you about our collaborations and projects:

Rossitza will present a poster about collaborations as part of her AHRC-RLUK Professional Practice Fellowship project Datafication and reuse of the descriptions of the incunabula collection at the British Library (pp.505-506)

As part of the Living with Machines project Mia contributed to the poster about Metadata Enrichment in the Living with Machines Project: User-focused Collaborative Database Development in a Digital Humanities Context (pp.553-555)

Stella will present the UK Digital Comics: Challenges and Opportunities of a Collaborative Doctoral Partnership. A Co-designed Comic Poster (pp.596-597)

We will also be at some of the pre-conference workshops on Tuesday. Rossitza will attend the all-day OCR4all - Open-Source OCR and HTR Across the Centuries, and Stella will participate in a couple of half-day workshops: the AudiAnnotate Workshop with Radio Venceremos, Rebel Radio Station and SpokenWeb: Using IIIF with AV to Build Editions and Exhibits, and Creating, storing, and sharing your own web archives with open source Webrecorder tools.

The pre-conference communications have been great and you can find out more about the conference programme in the impressive Digital Humanities 2023: Book of Abstracts on Zenodo. We are thrilled to be joining this exciting event held in this stunning Austrian city.

Wir kommen, Graz. Bis bald!

04 July 2023

MIX 2023 Storytelling in Immersive Media

This Friday we are looking forward to hosting MIX 2023 at the Library. Presented in partnership with Bath Spa University’s Centre for Cultural and Creative Industries, The Writing Platform and MyWorld, this conference explores the intersection of writing and technology, creating an opportunity for scholars and practitioners to share and discuss research and practice in the rapidly evolving field of storytelling in immersive environments.

Text on image says "MIX 2023 Storytelling in Immersive Media, 7th July 2023" underneath are partner logos

Our opening keynote speaker is Adrian Hon, co-founder and CEO at Six to Start, creators of the world’s best-selling smartphone fitness game, Zombies, Run!, which is currently showcased in the Library’s Digital Storytelling exhibition (2 June – 15 October 2023). Following Adrian’s talk is a jam-packed programme of presentations and panel discussions examining where and how creative writing and emerging technologies meet. MIX 2023 sessions will cover a range of themes and topics including interactive and locative works, text in immersive media, digital and film poetry, narrative games, digital preservation, archiving and curation, and storytelling with AI. There will also be an area at this event for attendees to experience VR works of poetry and literature, including The Abandoned Library by Dreaming Methods, led by Andy Campbell and Judi Alston.

If this event sounds up your street, there is still time to book a place for MIX 2023 on Friday 7th July, 09:00 - 17:00. It will take place in person, in the Library’s Knowledge Centre, and will not be live-streamed. The ticket price covers a sandwich lunch and refreshments during the day, and includes access to an evening performance of An Island of Sound by award-winning poet J.R. Carpenter and audiovisual composer Jules Rawlinson.

Artwork from 80 Days showing profile faces of characters Passepartout and Phileas Fogg

If you would like to develop your interactive writing skills, then you may also be interested in signing up for our Fiction as Dialogue Interactive Fiction Summer School, which will run from Monday 21st to Friday 25th August 2023. Led by Dr. Florencia Minuzzi, veteran games writer and narrative designer, this course will teach participants how to create interactive narratives using ink, a writer-friendly open-source scripting language that does not require programming knowledge.

This summer school will also feature expert guest speakers: Corey Brotherson, the writer for in-development interactive narrative Windrush Tales and the adapting writer/editor of Yomi Ayeni’s acclaimed steampunk transmedia series, Clockwork Watch; Meghna Jayanth, a video game writer and narrative designer known for her writing on inkle's 80 Days, for which she won the UK Writers’ Guild Award for Best Writing in a Video Game; Dan Hett, a prolific digital artist and writer from Manchester, whose work c-ya-laterrrr won the 2020 New Media Writing Prize; and game designer Destina Connor, Co-director of Tea-Powered Games, an independent game company dedicated to telling interesting stories in innovative ways.

Text on image says "Everything Forever, 10 years of electronic legal deposit"

2023 marks the 50th anniversary of the British Library and also 10 years since the introduction of non-print (electronic) legal deposit, providing a perfect moment to reflect on our achievements and also to look forward to the future. Our Digital Storytelling exhibition events, including MIX 2023, create opportunities to celebrate pioneering and experimental writing, and also to consider what new forms of digital storytelling may arise in the coming years. We hope you can join us.

22 June 2023

Explore Windrush Tales in our Digital Storytelling exhibition

On 22 June 1948, HMT Empire Windrush docked in Tilbury, Essex, bringing people from the Caribbean who had been invited to help rebuild "the motherland" after the devastation of the Second World War.

Seventy-five years later, stories from the Windrush generation are shared in a groundbreaking illustrated text-based interactive narrative from 3-Fold Games. Windrush Tales is still in development, but a preview can be read exclusively in the British Library’s current Digital Storytelling exhibition, which is open until 15 October 2023.

Illustration of a boat and story choices
Windrush Tales art by Naima Ramanan © 3-Fold Games

Windrush Tales tells the stories of characters Rose, an aspiring nurse, and her older brother Vernon, through a beautifully illustrated interactive photo album and branching narrative, allowing choices within the game to lead to one of many endings. Apprehensive but tenacious, Rose joins her brother in England with the intention to start work as a nurse in the newly formed NHS. Vernon has been in Britain for several years but, unknown to his sister, has struggled to find and stay in employment. Through his photography, he documents how they adapt to their new life, from grassroots arts and activism, to church and social clubs.

Illustration of a photograph showing the faces of a man and a woman
Windrush Tales art by Naima Ramanan © 3-Fold Games

As part of their extensive research process to develop the game's story lines, creative director Chella Ramanan and writer Corey Brotherson, both descendants of the Windrush generation, have drawn upon their families' experiences, news coverage, books, and exhibitions. They also consulted with people who emigrated from the Caribbean to Britain during the Windrush period from the late 1940s to the early 1970s, and their families, via a workshop organised in collaboration with Jennifer Allsopp, from the University of Birmingham.

Windrush Tales is one of eleven showcased narratives in Digital Storytelling, the first exhibition of its kind at the Library. Curators worked closely with writers, artists and creators to display a range of innovative publications, which reflect the rapidly evolving field of interactive writing: stories that are dynamic, responsive, personalised and immersive.

To accompany this exhibition there is a season of in-person events at the Library. Writer Corey Brotherson from the Windrush Tales team will be speaking about another of his projects, the Clockwork Watch story world, at MIX 2023 Storytelling in Immersive Media, a one-day conference exploring the intersection of writing and technology on Friday 7 July 2023. Corey is also a guest tutor at the Fiction as Dialogue Interactive Fiction Summer School, which runs from 21st to 25th August 2023.

04 May 2023

Webinar on Open Scholarship in GLAMs through Research Repositories

If you work in the galleries, libraries, archives, and museums (GLAM) sector and want to learn more about research repositories, then join us on Thursday 18th May for an online repository training session for cultural heritage professionals.

Image of man looking at a poster that says 'Open Scholarship in GLAMs through Research Repositories - Webinar on 18 May, Thursday - Register at bit.ly/BLrepowebinar'

This event is part of the Library’s Repository Training Programme for Cultural Heritage Professionals. It builds on the input received at previous repository training events (this, this and this) and explores some areas of open scholarship further. These include, but are not limited to, research activities in GLAM, the benefits of research repositories, scholarly publishing, research data management and digital preservation in scholarly communications.

 

Who is it for?

It is intended for those who are working in cultural heritage or a collection-holding organisation in roles where they are involved in managing digital collections, supporting the research lifecycle from funding to dissemination, providing research infrastructure and developing policies. However, anyone interested in the given topics is welcome to attend!

 

Programme

13.00                  Welcome and introductions

      Susan Miles, Scholarly Communications Specialist, British Library

Session 1          Open scholarship in GLAM research  

13.15                  Repositories to facilitate open scholarship

     Jenny Basford, Repository Services Lead, British Library

13.40                 Scholarly publishing dynamics in the GLAM environment

     Ilkay Holt, Scholarly Communications Lead, British Library

14.05                  Q&A

14.20                 Break time

Session 2          Building openness in GLAM research  

14.40                  Research data management

      Jez Cope, Data Services Lead, British Library

15.05                  Digital preservation and scholarly communications

      Neil Jefferies, Head of Innovation, Bodleian Libraries

15.30                  Q&A

15.45                  Closing

 

Register!

The event will take place from 13.00 to 15.45 on 18 May, Thursday. Please register at this link to receive your access link for the online session.

 

What is next?

The last training event of the Library’s Repository Training Programme will be held on 31 May in Cardiff, hosted by the National Museums Cardiff. It will be an update and re-run of the previous face-to-face events. More information about the programme and registration link can be found in this blog post.

Please contact [email protected] if you have any questions or comments about the events.

 

Previous Events

31 January, in-person, Edinburgh, hosted by the National Museums Scotland

8 March, online, hosted by the British Library

31 March, in-person, York, hosted by the Archaeology Data Service at the University of York

 

About British Library’s Repository Training Programme

The Library’s Repository Training Programme for cultural heritage professionals is funded as part of AHRC’s iDAH programme to support cultural heritage organisations in establishing or expanding open scholarship activities and sharing their outputs through research repositories. You can read more about the scoping report and the development of this training programme in this blog post.

02 May 2023

Detecting Catalogue Entries in Printed Catalogue Data

This is a guest blog post by Isaac Dunford, MEng Computer Science student at the University of Southampton. Isaac reports on his Digital Humanities internship project supervised by Dr James Baker.

Introduction

The purpose of this project has been to investigate and implement different methods for detecting catalogue entries within printed catalogues. Whilst printed catalogues are easy enough to digitise and convert into machine-readable data, dividing that data by catalogue entry requires turning the visual signifiers of divisions between entries - gaps in the printed page, large or upper-case headers, catalogue references - into machine-readable information. The first part of this project involved experimenting with XML-formatted data derived from the 13-volume Catalogue of books printed in the 15th century now at the British Museum (described by Rossitza Atanassova in a post announcing her AHRC-RLUK Professional Practice Fellowship project) and trying to find the best ways to detect individual entries and reassemble them as data (given that the text for a single catalogue entry may be spread across multiple pages of a printed catalogue). The next part of this project involved building a complete system based on this approach that takes the large number of XML files for a volume and outputs all of the catalogue entries in a series of desired formats. This post describes our initial experiments with that data, the approach we settled on, and key features of our approach that you should be able to reapply to your catalogue data. All data and code can be found on the project GitHub repo.

Experimentation

The catalogue data was exported from Transkribus in two different formats: an ALTO XML schema and a PAGE XML schema. The ALTO layout encodes positional information about each element of the text (that is, where each word occurs relative to the top left corner of the page), which makes spatial analysis - such as looking for gaps between lines - possible. However, it also creates data files that are heavily encoded, meaning that it can be difficult to extract the text elements from the data files. The PAGE schema, by contrast, makes it easier to access the text elements in the files.
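As a rough illustration of why the PAGE export is convenient for text extraction, here is a minimal sketch using only Python's standard library. It is not the project's code: the sample document is hand-written for illustration, the function name page_lines is hypothetical, and the element names (TextLine, TextEquiv, Unicode) are assumed to match the PAGE schema as exported by Transkribus.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the general shape of a PAGE XML export (illustrative only).
SAMPLE_PAGE_XML = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="sample.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <TextLine id="l1">
        <TextEquiv><Unicode>BIBLIA LATINA.</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="l2">
        <TextEquiv><Unicode>IB. 39624.</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def page_lines(xml_text):
    """Return the text of every TextLine in a PAGE XML document, in file order."""
    root = ET.fromstring(xml_text.strip())
    lines = []
    # "{*}" matches any namespace (Python 3.8+), so the schema version
    # does not need to be hard-coded.
    for line in root.findall(".//{*}TextLine"):
        # Only look at the line-level TextEquiv, not any word-level ones.
        unicode_el = line.find("{*}TextEquiv/{*}Unicode")
        if unicode_el is not None and unicode_el.text:
            lines.append(unicode_el.text.strip())
    return lines
```

Calling page_lines(SAMPLE_PAGE_XML) yields one string per printed line, which is the form that content-based checks such as regular expressions can work on directly.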

 

An image of a digitised page from volume 8 of the Incunabula Catalogue and the corresponding Optical Character Recognition file encoded in the PAGE XML Schema
Raw PAGE XML for a page from volume 8 of the Incunabula Catalogue

 

An image of a digitised page from volume 8 of the Incunabula Catalogue and the corresponding Optical Character Recognition file encoded in the ALTO XML Schema
Raw ALTO XML for a page from volume 8 of the Incunabula Catalogue

 

Spacing and positioning

One of the first approaches tried in this project was to use size and spacing to find entries. The intuition behind this is that there is generally a larger amount of white space around the headings in the text than there is between regular lines. And in the ALTO schema, there is information about the size of the text within each line as well as about the coordinates of the line within the page.

However, we found that using the size of the text line and/or the positioning of the lines was not effective, for three reasons. First, blank space between catalogue entries inconsistently contributed to the size of some lines. Second, whenever there were tables within the text, there would be large gaps in spacing compared to the normal text, which in turn caused those tables to be read as divisions between catalogue entries. And third, even though entry headings were visually further to the left on the page than regular text, and therefore should have had the smallest x coordinates, the materiality of the printed page was inconsistently represented as digital data, and so presented regular lines with small x coordinates that could be read - using this approach - as headings.
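For readers who want to try the spacing heuristic on their own data, the following is a minimal sketch of the idea rather than the code used in the project. It assumes ALTO TextLine elements carrying VPOS and HEIGHT attributes; the sample XML is invented, and the function name and the factor threshold are hypothetical.

```python
import xml.etree.ElementTree as ET
from statistics import median

# Invented ALTO fragment: four lines, with extra white space before line l3.
SAMPLE_ALTO_XML = """
<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine ID="l1" HPOS="100" VPOS="100" WIDTH="600" HEIGHT="30"/>
    <TextLine ID="l2" HPOS="100" VPOS="135" WIDTH="600" HEIGHT="30"/>
    <TextLine ID="l3" HPOS="80" VPOS="230" WIDTH="620" HEIGHT="30"/>
    <TextLine ID="l4" HPOS="100" VPOS="265" WIDTH="600" HEIGHT="30"/>
  </TextBlock></PrintSpace></Page></Layout>
</alto>
"""


def lines_after_large_gaps(xml_text, factor=2.0):
    """Return IDs of lines preceded by a vertical gap larger than factor x the median gap."""
    root = ET.fromstring(xml_text.strip())
    boxes = [
        (line.get("ID"), float(line.get("VPOS")), float(line.get("HEIGHT")))
        for line in root.findall(".//{*}TextLine")
    ]
    boxes.sort(key=lambda box: box[1])  # top-to-bottom order
    pairs = list(zip(boxes, boxes[1:]))
    # Gap = top of the next line minus bottom of the current line.
    gaps = [nxt[1] - (cur[1] + cur[2]) for cur, nxt in pairs]
    if not gaps:
        return []
    typical = median(gaps)
    return [nxt[0] for (cur, nxt), gap in zip(pairs, gaps) if gap > factor * typical]
```

On clean input like the sample this flags l3 as a candidate heading, but as described above, tables and inconsistent coordinates in real exports produce the same signal, which is why a content-based approach was preferred.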

Final Approach

Entry Detection

Our chosen approach uses the data in the PAGE XML schema, and is bespoke to the data for the Catalogue of books printed in the 15th century now at the British Museum as produced by Transkribus (and indeed, to the version of Transkribus: having built our code around some initial exports, running it over the later volumes - which had been digitised last - threw an error due to some slight changes to the exported XML schema).

The code takes the XML input and finds entries using a content-based approach that looks for features at the start and end of each catalogue entry. After experimenting with different approaches, we found that the most consistent way to detect the catalogue entries was to:

  1. Find the “reference number” (e.g. IB. 39624) which is always present at the end of an entry.
  2. Find a date that is always present after an entry heading.

This gave us the ability to infer from context the presence of a split between two catalogue entries. The main limitation of this approach is the quality of the Optical Character Recognition (OCR) at the points at which the references and dates occur in the printed volumes.
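The two steps above can be sketched roughly as follows. This is a simplified illustration, not the code in the project repo: both regular expressions are assumptions modelled on the single example reference given above (IB. 39624) and on fifteenth-century imprint years, and the function names and the window parameter are hypothetical.

```python
import re

# Assumed shape of a reference number such as "IB. 39624"; adjust for other catalogues.
REFERENCE_RE = re.compile(r"\bI[A-C]\.\s?\d{3,6}\b")
# Assumed shape of a fifteenth-century year such as "1478".
DATE_RE = re.compile(r"\b14\d{2}\b")


def split_entries(lines):
    """Group lines into entries, closing an entry at each reference-number line."""
    entries, current = [], []
    for line in lines:
        current.append(line)
        if REFERENCE_RE.search(line):
            entries.append(current)
            current = []
    if current:
        # Trailing text with no reference yet, e.g. an entry continuing on the next page.
        entries.append(current)
    return entries


def has_date_after_heading(entry, window=3):
    """Check that a date occurs within the first few lines of a candidate entry."""
    return any(DATE_RE.search(line) for line in entry[:window])


# Invented sample lines, for illustration only.
SAMPLE_LINES = [
    "BIBLIA LATINA.",
    "Venice, 1478.",
    "Folio. 320 leaves.",
    "IB. 39624.",
    "PSALTERIUM.",
    "Cologne, 1481.",
    "IA. 1234.",
]
```

Because lines are passed in as one flat list, the same function can be run over the concatenated lines of several pages, so an entry that spans a page break is still closed only when its reference number is reached. An OCR error in a reference number would cause two entries to be merged, which is the limitation noted above.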

 

An image of a digitised page with a catalogue entry and the corresponding text output in XML format
XML of a detected entry

 

Language Detection

The reason for dividing catalogue entries in this way was to facilitate analysis of the catalogue data, specifically analysis that sought to define the linguistic character of descriptions in the Catalogue of books printed in the 15th century now at the British Museum and how those descriptions changed and evolved across the thirteen volumes. Segments of each catalogue entry contain text transcribed from the incunabula that was not written by a cataloguer (and is therefore not part of their cataloguing ‘voice’). Those transcribed sections are in French, Dutch, Old English, and other languages that a machine could detect as not being modern English. To further facilitate research use of the final data, one of the extensions we implemented was therefore to label sections of each catalogue entry by language. This was achieved using a Python library for language detection and then - for a particular output type - replacing non-English sections of text with a placeholder (e.g. NON-ENGLISH SECTION). The language detection model does not recognise Old English, and as a result varies between assigning those sections labels for different languages. Even so, the language detection was still able to break blocks of text in each catalogue entry into English and non-English sections.
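The placeholder step might look something like the sketch below. The post does not name the language detection library, so the detector is passed in as a parameter: any function that maps a string to a language code such as "en" will do (the detect function from the langdetect package is one candidate, offered here as an assumption). The function name mask_non_english and the choice to keep text when detection fails are also illustrative assumptions.

```python
def mask_non_english(sections, detect, placeholder="NON-ENGLISH SECTION"):
    """Replace each section whose detected language is not English with a placeholder.

    `detect` is any callable that takes a string and returns a language code.
    """
    masked = []
    for section in sections:
        try:
            language = detect(section)
        except Exception:
            # Detectors can fail on very short or symbol-only text;
            # in that case the original text is kept unchanged.
            masked.append(section)
            continue
        masked.append(section if language == "en" else placeholder)
    return masked
```

Note that this only needs the detector to separate English from everything else, which is why inconsistent labels for Old English sections do not prevent the split from working.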

 

Text files for catalogue entry number IB39624 showing the full text and the detected English-only sections.
Text outputs of the full and English-only sections of the catalogue entry

 

Poorly Scanned Pages

Another extension for this system was to use the input data to try to determine whether a page had been poorly scanned: for example, where the lines in the XML input read from one column straight into another as a single line (rather than the XML reading order following the visual signifiers of column breaks). This system detects poorly scanned pages by looking at the lengths of all lines in the PAGE XML schema, establishing which lines deviate substantially from the mean line length, and, if sufficient outliers are found, marking the page as poorly scanned.
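A minimal sketch of this kind of outlier check is given below. The post does not state how "deviate substantially" or "sufficient outliers" were defined, so the two thresholds here (two standard deviations, and more than 10% of lines) are assumptions chosen purely for illustration, as is the function name.

```python
from statistics import mean, pstdev


def is_poorly_scanned(lines, z_threshold=2.0, max_outlier_fraction=0.1):
    """Flag a page when too many line lengths lie far from the mean line length."""
    lengths = [len(line) for line in lines if line.strip()]
    if len(lengths) < 2:
        return False  # not enough lines to judge
    average = mean(lengths)
    spread = pstdev(lengths)
    if spread == 0:
        return False  # every line has the same length
    # Count lines more than z_threshold standard deviations from the mean.
    outliers = sum(1 for n in lengths if abs(n - average) > z_threshold * spread)
    return outliers / len(lengths) > max_outlier_fraction
```

A page where several lines are roughly twice the usual length, as happens when two columns are read as one line, would be flagged, while a single long line on an otherwise regular page would not.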

Key Features

The key part of this system that can be taken and applied to a different problem is the method for detecting entries. We expect that the fundamental method of looking for marks in the page content to identify the start and end of catalogue entries in the XML files would be applicable to other data derived from printed catalogues. The only parts of the algorithm that would need changing for a new system would be the regular expressions used to find the start and end of the catalogue entries. And as long as the XML input comes in the same schema, the code should be able to consistently divide up the volumes into the individual catalogue entries.

19 April 2023

Repository Training Day in Cardiff: Research in GLAM and research repositories to facilitate open scholarship activities for cultural heritage organisations

If you work in the galleries, libraries, archives, and museums (GLAM) sector and want to learn more about research repositories, then register for a hybrid repository training day for cultural heritage professionals hosted by the National Museum Cardiff in Wales on 31 May 2023.  

The British Library’s Repository Training Programme for cultural heritage professionals is funded as part of AHRC’s iDAH programme to support GLAM organisations in establishing or expanding open scholarship activities and sharing their outputs through research repositories.  

Manuscript illustration of Cardiff from the 17th Century showing a river, fields, a church and other small buildings
Inset from John Speed’s county maps of Wales, first published in The Theatre of the Empire of Great Britain by George Humble (1610), made available by the National Library of Wales via Flickr Commons

Background 

The very first in-person event was in Edinburgh in January, with a follow-up online session in March and a second in-person event in York, hosted by the Archaeology Data Service (ADS) at the University of York on 23 March.  

We had attendees from the British Museum, National Museums Scotland, National Portrait Gallery, Towards a National Collection (AHRC) and the ADS in various roles including scholarly communications librarian, digital archivist, project manager and senior researchers in their organisations.  

The full programme for this event is available in a previous blog post. During the event, conversations took place on a range of topics, from policy development and embedding research culture in organisations to encouraging staff to be involved in research cycles and the different types of workflows in different institutions. The feedback we received from the audience points to a need to explore more about research data management, scholarly publishing, challenges in smaller organisations, working with emerging formats and building communities of practice.  

Now looking forward, the last hybrid repository training event will be hosted by the National Museum Cardiff in Wales on Wednesday 31 May. You can see the details below and register here. We are looking forward to meeting everyone who is interested in learning more about research repositories from cultural heritage organisations.  

 

Who is this training for? 

We invite everyone who is working in a cultural heritage or a collection-holding organisation in roles where they are involved in managing digital collections, supporting the research lifecycle from funding to dissemination, providing research infrastructure and developing policies. However, anyone interested in the given topics is welcome to attend. 

 

What will you learn? 

This one-day training session is designed as a starting point to a broader set of knowledge that will help you to: 

 

  • Understand the research landscape in cultural heritage organisations, benefits of openness for heritage research, basic concepts of open principles and influencing decision makers 
  • Lay foundations for repository services including stakeholder engagement, policy development, technical overview and project planning 
  • Adopt common principles and frameworks, technical standards and requirements in establishing repository services in a cultural heritage organisation 
  • Explore basics of the scholarly communications ecosystem in the context of cultural heritage practices. 

 

Prerequisites 

No previous knowledge of topics is required. However, an understanding of open access will maximise the benefit of the taught content for attendees.  

 

Programme  

10:30 - Welcome and introductions

10:50 - iDAH Programme 

    Joanna Dunster, Head of (Research) Infrastructure, AHRC

11:05 - Session 1 Opening up heritage research 

This session covers the topics of understanding the research landscape in GLAM organisations, benefits of openness for heritage research, basic concepts of open principles and frameworks. 

    Ilkay Holt, Scholarly Communications Lead, BL

    Susan Miles, Scholarly Communications Specialist, BL

11:45 - Q&A / Discussion

12:00 - Break  

12:15 - Session 2 Getting started with heritage GLAM repositories  

This session covers topics on the role of repository infrastructure in open access to heritage research and positioning research repositories in an organisation including policy and development. 

    Ilkay Holt, Scholarly Communications Lead, BL

    Susan Miles, Scholarly Communications Specialist, BL

12:45 - Lunch

13:30 - Session continues 

13:55 - Q&A / Discussion

14:10 - Session 3: Realising and expanding the benefits 

This session covers a technical overview and the requirements for running a cultural heritage repository, including an overview of BL’s Shared Research Repository, platforms and software, content administration and technical features.   

    Graham Jevon, Digital Services Specialist, BL

    Nora Ramsey, Assistant to Digital Services Specialist, BL

14:30 - Break

14:40 - Session continues

15:00 - Q&A / Discussion 

15:15-15:30 - Closing Remarks

 

Book your place 

In-person sessions are planned for a maximum of 35 people per event and registrants from cultural heritage institutions will be prioritised. Registration for the event is free. Please fill in this form to book your place.

Please note that registrations for in-person attendance will close at 4pm Friday 26th May and confirmation for in-person attendance will be sent to the registered email address.

Registrations for online attendance will close at 6pm on Tuesday 30th May. The Zoom access link will be sent to the registered email address the day prior to the event. 

Members of the Research Infrastructure Services Team at the British Library will be delivering the training programme. The team has over 25 years of broad experience and extensive knowledge in supporting open scholarship across the sector and with international partners. They also provide a Shared Research Repository Service for the cultural heritage organisations.  

Please contact [email protected] if you have any questions or comments about this training programme.