THE BRITISH LIBRARY

Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

07 December 2018

Introducing an experimental format for learning about content mining for digital scholarship

This post by the British Library’s Digital Curator for Western Heritage Collections, Dr Mia Ridge, reports on an experimental format designed to provide more flexible and timely training on fast-moving topics like text and data mining.

This post covers two topics – firstly, an update to the established format of sessions on our Digital Scholarship Training Programme (DSTP) to introduce ‘strands’ of related modules that cumulatively make up a ‘course’, and secondly, an overview of subjects we’ve covered related to content mining for digital scholarship with cultural heritage collections.

Introducing ‘strands’

The Digital Research team have been running the DSTP for some years now. It’s been very successful, but we know that it's hard for people to get away for a whole day, so we wanted to break courses that might previously have taken 5 or 6 hours of a day into smaller modules. Shorter sessions (talks or hands-on workshops), only an hour or at most two long, seemed to fit more flexibly into busy diaries. We can also reach more people with talks than with hands-on workshops, which are limited by the number of training laptops and the need to offer more individual support.

A 'strand' is a new, flexible format for learning and maintaining skills, with training delivered through shorter modules that combine to build attendees’ knowledge of a particular topic over time. We can repeat individual modules (for example, a shorter ‘Introduction to’ session might run more often) or aim more advanced sessions at people with some existing knowledge. I haven’t formally evaluated it, but I suspect that the ability to pick and choose sessions means that attendees for each module are more engaged, which makes for a better session for everyone. We've seen a lot of uptake – in some cases the 40 or so places available go almost immediately – so offering shorter sessions seems to be working.

Designing courses as individual modules makes it easier to update individual sections as technologies and platforms change. This format has several other advantages: staff find it easier to attend hour-long modules, and they can try out methods on their own collections between sessions. It takes time for attendees to collect and prepare their own data for processing with digital methods (not to mention preparation time and complexity for the instructor), so we've stayed away from this in traditional workshops.

New topics can be introduced on a 'just in time' basis as new tools and techniques emerge. This seemed to address lots of issues I was having in putting together a new course on content mining. It also makes it easier to tackle a new subject than the established 5-6 hour format, as I can pilot short sessions and use the lessons learnt in planning the next module.

The modular format also means we can invite international experts and collaborators to give talks on their specialisms with relatively low organisational overhead, as we regularly run ‘21st Century Curatorship’ talks for staff. We can link relevant staff talks, or our monthly ‘Hack and Yack’ and Digital Scholarship Reading Group sessions, to specific strands.

We originally planned to start each strand with an introductory module outlining key concepts and terms, but in reality we dived into the first one as we already had talks that'd fit lined up.

Content mining for digital scholarship with cultural heritage collections

Tom and Nora trying out AntConc

From the course blurb: ‘Content mining (sometimes ‘text and data mining’) is a form of computational processing that uses automated analytical techniques to analyse text, images, audio-visual material, metadata and other forms of data for patterns, trends and other useful information. Content mining methods have been applied to digitised and digital historic, cultural and scientific collections to help scholars answer new research questions at scale, analysing hundreds or hundreds of thousands of items. In addition to supporting new forms of digital scholarship that apply content mining methods, methods like Named Entity Recognition or Topic Modelling can make collection items more discoverable. Content mining in cultural heritage draws on data science, 'distant reading' and other techniques to categorise items; identify concepts and entities such as people, places and events; apply sentiment analysis and analyse items at scale.’

An easily updatable mixture of introductory talks, tutorial sessions, hands-on workshops and case studies from external experts fits perfectly into the modular format, and it's worked out well, with a range of topics and formats offered so far. Sessions have included: an Introduction to Machine Learning; Computational models for detecting semantic change in historical texts (Dr Barbara McGillivray, Alan Turing Institute); Computer vision tools with Dr Giles Bergel, from the University of Oxford's Visual Geometry Group; Jupyter Notebooks/Python for simple processing and visualisations of data from In the Spotlight; Listening to the Crowd: Data Science to Understand the British Museum's Visitors (Taha Yasseri, Turing/OII); Visualising cultural heritage collections (Olivia Fletcher Vane, Royal College of Art); An Introduction to Corpus Linguistics for the Humanities (Ruth Byrne, BL and Lancaster PhD student); Corpus Analysis with AntConc.
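
As a rough illustration of what corpus analysis involves at its very simplest, the sketch below builds a word-frequency list and a keyword-in-context (KWIC) concordance in plain Python. It is not taken from the course materials and does not reproduce how AntConc works internally; the sample text, the function names and the context window size are all assumptions made up for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_frequencies(text, top_n=5):
    """Return the top_n most common tokens with their counts."""
    return Counter(tokenize(text)).most_common(top_n)


def concordance(text, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence.

    `window` is the number of tokens of context kept on either side.
    """
    tokens = tokenize(text)
    keyword = keyword.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


# Hypothetical sample text, standing in for a digitised collection item.
SAMPLE = (
    "The library holds maps and manuscripts. "
    "Readers visit the library to study manuscripts, "
    "and the library digitises maps for readers online."
)

if __name__ == "__main__":
    print(word_frequencies(SAMPLE, top_n=3))
    for left, kw, right in concordance(SAMPLE, "library", window=2):
        # Right-align the left context so the keyword lines up in a column.
        print(f"{left:>25} [{kw}] {right}")
```

Real collections would need more careful tokenisation (punctuation, historical spelling, OCR errors), which is one reason dedicated tools are covered in the sessions above.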

What’s next?

My colleagues Nora McGregor, Stella Wisdom and Adi Keinan-Schoonbaert have some great ‘strands’ planned for the future, including Stella’s on ‘Emerging Formats’ and Adi’s on ‘Place’, so watch this space for updates!

19 November 2018

The British Library / Qatar Foundation Partnership Imaging Hack Day

The BL/QFP is digitising archive material related to Persian Gulf History as well as Arabic scientific manuscripts; in the past four years we have added in excess of 1.5 million images to the Qatar Digital Library. Our team of ~45 staff includes a group of eight dedicated imaging professionals, who between them produce 30,000 digitised images each month, to exacting standards that focus on presenting the information on the page in a visually clear and consistent manner.

Our Imaging team are a highly skilled group, with a variety of backgrounds, experiences and talents, and we wished to harness these. Therefore, we decided to set aside a day for the team to use their creative and technical skills to ‘hack’ the material in our collection.

By dedicating a whole day for our Imaging team to experiment with different ways of capturing the material we are digitising, we hoped to reveal some interesting aspects of the collection that are not seen through our standardised capture process. It also gave the Imaging team a chance to show off and share their skills amongst themselves and the wider BL/QFP team.

This was how we conceived of our first Imaging Hack Day, and the rest of this blog post outlines how we promoted and organised it.

From its conception the Imaging team were keen for the wider team to be involved, so we asked them to nominate material from the collections we are digitising that they thought could be ‘hacked’ and to state their reasons why.

To begin with, it was mostly members of the Imaging team that nominated items, so we decided to wage a PR campaign. Firstly, the Imaging team delivered a presentation on the 9th of October at one of BL/QFP’s all-staff meetings. The presentation outlined some of the techniques and ideas they had for the hack day, in order to appeal to the rest of the team for nominations. Additionally, on the morning of the 9th, members of the Imaging team snuck into the office and planted some not-so-subtle propaganda:

Posters

The impact of the posters and presentation was really pronounced. We had received only a handful of nominations from people outside the Imaging team before 9th October; within days the number had increased by a factor of more than five (see graph below). The posters also became highly sought after amongst the team.

Nominations
Graph showing how many shelfmarks were nominated each day, with cumulative totals for members of the imaging team vs non-imaging teams.

The day before the Hack Day, anyone who had nominated an item was invited to a prep session with the Imaging team. Here the nominated items were presented, as well as the ideas for hacks. Extra judicious use of Post-Its and Sharpies facilitated feedback, and by the end of the session the Imaging team were armed with lots of ideas and encouragement, and knew they had curatorial expertise from the rest of the BL/QFP team to call upon if necessary.

Postits

As a final surprise and a sign of appreciation, Hack Sacks filled with goodies were secreted into the imaging studio late on the eve of the Hack Day:

Hacksacks

The images and hacks produced on the Hack Day will be covered in an upcoming post by our studio manager Renata Kaminska. The non-material results, however, were manifold. Throughout the lead-up and on the actual day there was a palpable buzz amongst the Imaging team, evidence of the positive impact on their morale. It also led to a greater exchange of knowledge between the Imaging team and their colleagues throughout the BL/QFP. The day allowed different areas of the team to come together, combine their expertise and find new ways of working and innovative ways of capturing our collections. Finally, it also demonstrated the fantastic experience and skills of our imaging technicians, many of which had not previously been exposed to the rest of the team. It was a real celebration of both the material that we are digitising and our talented imaging studio.

This is a guest post by Sotirios Alpanis, Head of Digital Operations for the British Library's Qatar Project, on Twitter as @SotiriosAlpanis

02 November 2018

Digital Conversation: History and Games

It is very nearly International Games Week; this is an initiative run by volunteers from around the world to reconnect communities through their libraries around the educational, recreational, and social value of all types of games. Here at the British Library we are excited to be hosting the narrative games convention AdventureX on Saturday 10th and Sunday 11th November, and to get the party started on Thursday 8th November we are delighted to run, in partnership with The National Archives and Wellcome, a Digital Conversation event on the topic of History and Games.


Our star Digital Conversation panel features:
  • Toni Brasting, Creative Partnerships Manager at Wellcome Trust, who collaborates with games studios, designers and scientific researchers to create games that inspire conversations about health.
  • Andrew Burn, Professor of Media Education at the UCL Institute of Education, who will launch MissionMaker Beowulf, a digital platform which empowers students to make 3-D adventure games.

A video showing the process of making a game in MissionMaker Beowulf, followed by a video capture of the game
  • James Delaney, founder and Managing Director of BlockWorks, who built Minecraft maps for Great Fire 1666 at the Museum of London, to mark the 350th anniversary of London's Great Fire. Furthermore, this summer they teamed up with English Heritage on a castle building project.

Kenilworth Castle in Minecraft

Trailer of Winter Hall by Lost Forest Games

  • Nick Webber, Associate Professor at Birmingham City University, whose research explores the impact of virtual worlds and online games on the practice of history.
  • Stella Wisdom, Digital Curator for Contemporary British Collections at the British Library, who has collaborated on multiple games initiatives.

The Digital Conversation event takes place in The Knowledge Centre at the British Library on Thursday 8th November, 18.30–20.30; for more details including booking, visit: https://www.bl.uk/events/digital-conversation-history-and-games. Hope to see you there.

This post is by Digital Curator Stella Wisdom, on Twitter as @miss_wisdom