Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

27 January 2020

How historians can communicate their research online

This blog post is by Jonathan Blaney (Institute of Historical Research), Frances Madden (British Library), Francesca Morselli (DANS), Jane Winters (School of Advanced Study, University of London)

This blog will be published in several other locations including the FREYA blog and the IHR blog

Large satellite receiver
Source: Joshua Hoehne, Unsplash

On 4 December 2019, the FREYA project, in collaboration with the UCL Centre for Digital Humanities, the Institute of Historical Research, the British Library and DARIAH-EU, organized a workshop in London on identifiers in research. Aimed mainly at historians and humanities scholars, the workshop focused on ways in which they can build and manage an online profile as researchers, using tools such as ORCID IDs. It also covered best practices and methods of citing digital resources to make humanities researchers' work connected and discoverable to others. The workshop had 20 attendees, mainly PhD students from the London area but also curators and independent researchers.

Presentations

Frances Madden from the British Library introduced the day, which was supported by the FREYA project, funded under the EU’s Horizon 2020 programme. FREYA aims to increase the use of persistent identifiers (PIDs) across the research landscape by building up services and infrastructure. The British Library is leading on the humanities and social sciences aspect of this work.

Frances described how PIDs are central to scholarly communication becoming effective and easy online. We will need PIDs not just for publications but for grey literature, for data, for blog posts, presentations and more. This is clearly a challenge for historians to learn about and use, and the workshop is a contribution to that effort.

PIDs: some historical context

Jonathan Blaney from the Institute of Historical Research said that there is a context to citation and the persistent identifiers which have grown up around traditional forms of print citation. These are almost invisible to us because they are deeply familiar. He gave an example of a reference to the gospel story of the woman taken in adultery:

John 7:53-8:11

There are three conventions here: the name ‘John’ (attached to this gospel since about the 2nd century), the chapter divisions (medieval and ascribed to the English bishop Stephen Langton) and the verse divisions (from the middle of the 16th century).

When learning new forms of referencing, such as the ones under discussion at the workshop, Jonathan suggested that historians should remember their implicit knowledge has been learned. He finished with an anecdote about Harry Belafonte, retold in Anthony Grafton’s The Footnote: A Curious History. As a young sailor Belafonte wanted to follow up on references in a book he had read. The next time he was on shore leave he went to a library and told the librarian:

“Just give me everything you’ve got by Ibid.”

People in conference room watching a presentation

Demonstrating the benefits

Prof Jane Winters from the School of Advanced Study introduced what she claimed was her most egotistical presentation by explaining her own choices in curating her online presence and also what was beyond her control. She showed the different results of web searches for herself using Google and DuckDuckGo and pointed out how things she had almost forgotten about can still feature prominently in results.

Jane described her own use of Twitter, and highlighted both the benefits and challenges of using social media to communicate research and build an online profile. It was the relatively rigid format of her institutional staff profile that led her to create her own website. Although Jane has an ORCID ID and a page on Humanities Commons, for example, there are many online services she has chosen not to use, such as academia.edu.

This is all very much a matter of personal choice, dependent upon people’s own tastes and willingness to engage with a particular service.

How to use what’s available

Francesca Morselli from DANS gave a presentation aiming to provide useful resources about identifiers for researchers as well as explaining in a simple yet exhaustive way how they "work" and the rationale behind them.

Most importantly PIDs ensure:

  1. Citability and discoverability (both for humans and machines)
  2. Disambiguation (between similar objects)
  3. Linking to related resources
  4. Long-term archiving and findability

Francesca then introduced the support provided by projects and infrastructures: FREYA, DARIAH-EU and ORCID. Among the FREYA project pillars (PID graph, PID Commons, PID Forum), the last is available for anyone interested in identifiers.

The DARIAH-EU infrastructure for Arts and Humanities has recently launched the DARIAH Campus platform which includes useful resources on PIDs and managing research data (i.e. all materials which are used in supporting research). In 2018 DARIAH also organized a winter school on Open Data Citation, whose resources are archived here.


A Publisher’s Perspective

Kath Burton from Routledge Journals emphasised how much use publishers make of digital tools to harvest content, including social media crawlers, data harvesters and third party feeds.

The importance of maximising your impact online when publishing was explained, both before publishing (filling in the metadata, giving a meaningful title) and afterwards (linking to the article from social media and websites), as well as how publishers can help support this.

Kath went on to give an example of Taylor & Francis’s interest in the possibilities of online scholarly communication by describing its commitment to publishing 3D models of research objects, which it does via its Sketchfab page.

Breakout Groups

After the presentations and a coffee break there were group discussions about what everyone had just heard. During the first part, the groups were asked what was new to them in the presentations. It was clear from discussions around the room that attendees had heard much which was new to them. For example, some attendees had ORCID IDs but many were surprised at the range of things for which they could be used, such as in journal articles and logging into systems. They were also struck by the range of things in which publishers were interested such as research data. Many were really interested in the use of personal websites to manage their profile.

When asked what tallied with their experiences, it became clear that they were keen to engage with these systems, setting up ORCID IDs and Humanities Commons profiles but that they felt that they were too early on in their careers to have anything to contribute to these platforms and felt they were designed for established researchers. Jane Winters stressed that one could adopt a broad approach to the term ‘publications’, including posters, presentations and blog posts and encouraged all to share what they had.

Lastly, discussion turned to how the group cites digital resources. This led to an interesting conversation around the citation of archived web pages and how to cite webpages which might change over time, with tools such as the Internet Archive being mentioned. There was also discussion about whether one can cite resources such as Wikipedia and it was clear that this was not something which had been encouraged. Jonathan, who has researched this subject, mentioned that he had found established academics are happier to cite Wikipedia than those earlier in their career.

Conclusions

The workshop effectively demonstrated the sheer range of online tools, social media forums and publishing venues (both formal and informal) through which historians can communicate their research online. This is both an opportunity and a problem. It is a challenge to develop an online presence - to decide which methods are most appropriate for different kinds of research and different personalities - but that is just the first step. For research communication to be truly valuable, it is necessary to focus your effort, manage your online activities and take control of how you appear to others in digital spaces. PIDs are invaluable in achieving this, and in helping you to establish a personal research profile that stays with you as you move through your career. At the start of the day, the majority of those who attended the workshop did not know very much about PIDs and how you can put them to use, but we hope that they came away with an enhanced understanding of the issues and possibilities, the awareness that it does not take much effort or skill to make a real difference to how you are perceived online, and some practical advice about next steps.

It was apparent that, with some admirable exceptions, neither higher education institutions nor PID organisations are successfully communicating the value and importance of PIDs to early career researchers. Workshop attendees particularly welcomed the opportunity to hear from a publisher and senior academic about how PIDs are used to structure, present and disseminate academic work. The clear link between communicating research online and public engagement also emerged during the course of the day, and there is obvious potential for collaboration between PID organisations and those involved with training focused on impact and public engagement. We ended the day with lots of ideas for further advocacy and training, and a shared appreciation for the value of PIDs for helping historians to reach out to a range of different audiences online.

20 January 2020

Using Transkribus for Arabic Handwritten Text Recognition

This blog post is by Dr Adi Keinan-Schoonbaert, Digital Curator for Asian and African Collections, British Library. She's on Twitter as @BL_AdiKS.

 

In the last couple of years we’ve teamed up with PRImA Research Lab in Salford to run competitions for automating the transcription of Arabic manuscripts (RASM2018 and RASM2019), in an ongoing effort to identify good solutions for Arabic Handwritten Text Recognition (HTR).

I’ve been curious to test our Arabic materials with Transkribus – one of the leading tools for automating the recognition of historical documents. We’ve already tried it out on items from the Library’s India Office collection as well as early Bengali printed books, and we were pleased with the results. Several months ago the British Library joined the READ-COOP – the cooperative taking up the development of Transkribus – as a founding member.

As with other HTR tools, Transkribus’ HTR+ engine cannot start automatic transcription straight away, but first needs to be trained on a specific type of script and handwriting. This is achieved by creating a training dataset – a transcription of the text on each page, as accurate as possible, and a segmentation of the page into text areas and lines, demarcating the exact location of the text. Training sets therefore comprise a set of images and an equivalent set of XML files, containing the location and transcription of the text.
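
As a rough illustration of the image and XML pairing described above, the following Python sketch checks that every page image in a training set has an XML file with the same base name. The file names, the list of image formats and the same-name convention are all assumptions made for this example; the post does not describe how the files in the dataset are named.

```python
from pathlib import PurePath

# Assumed image formats; adjust to whatever the dataset actually contains.
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg", ".png"}


def pair_training_files(filenames):
    """Pair each page image with the XML file sharing its base name.

    Returns (pairs, unmatched): `pairs` maps a base name to (image, xml),
    and `unmatched` lists base names that are missing either half.
    """
    images, xmls = {}, {}
    for name in filenames:
        path = PurePath(name)
        suffix = path.suffix.lower()
        if suffix in IMAGE_SUFFIXES:
            images[path.stem] = name
        elif suffix == ".xml":
            xmls[path.stem] = name

    pairs = {stem: (images[stem], xmls[stem]) for stem in images if stem in xmls}
    unmatched = sorted(set(images) ^ set(xmls))
    return pairs, unmatched


# Hypothetical file names, for illustration only.
example_files = ["page_001.tif", "page_001.xml", "page_002.tif"]
pairs, unmatched = pair_training_files(example_files)
```

A check like this is useful before training, since a page image without its ground-truth XML (or the reverse) cannot contribute to a model.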

A screenshot from Transkribus, showing the segmentation and transcription of a page from Add MS 7474.

 

This process can be done in Transkribus, but in this case I already had a training set created using PRImA’s software Aletheia. I used the dataset created for the competitions mentioned above: 120 transcribed and ground-truthed pages from eight manuscripts digitised and made available through QDL. This dataset is now freely accessible through the British Library’s Research Repository.

Transkribus recommends creating a training set of at least 75 pages (between 5,000 and 15,000 words), but I was interested to find out a few things. First, the methods submitted for the RASM2019 competition worked on a training set of 20 pages, with an evaluation set of 100 pages, so I wanted to see how Transkribus’ HTR+ engine dealt with the same scenario. It should be noted that the RASM2019 methods were evaluated using PRImA’s evaluation methods, whereas the results here come from Transkribus’ own evaluation method. The two sets of results are therefore not directly comparable, but they give some idea of how Transkribus performed on the same training set.

I created four different models to see how Transkribus’ recognition algorithms deal with a growing training set. The models were created as follows:

  • Training set of 20 pages, and evaluation set of 100 pages
  • Training set of 50 pages, and evaluation set of 70 pages
  • Training set of 75 pages, and evaluation set of 45 pages
  • Training set of 100 pages, and evaluation set of 20 pages
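
The four splits listed above can be sketched in a few lines of Python. This is only an illustration: the post does not say which of the 120 pages went into each training set, so taking the first pages of a numbered list is an assumption, as are the function and variable names.

```python
TOTAL_PAGES = 120  # size of the ground-truthed dataset described in the post
TRAINING_SIZES = [20, 50, 75, 100]


def make_splits(total_pages, training_sizes):
    """Return one (training_pages, evaluation_pages) pair per training size.

    Assumption: the first n pages are used for training and the rest for
    evaluation. The actual page selection is not described in the post.
    """
    pages = list(range(1, total_pages + 1))
    splits = []
    for size in training_sizes:
        if not 0 < size < total_pages:
            raise ValueError(f"training size {size} leaves no evaluation pages")
        splits.append((pages[:size], pages[size:]))
    return splits


splits = make_splits(TOTAL_PAGES, TRAINING_SIZES)
for training, evaluation in splits:
    print(f"train on {len(training)} pages, evaluate on {len(evaluation)} pages")
```

Because the training and evaluation pages never overlap, each model is always evaluated on pages it has not seen during training.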

The graphs below show each of the four iterations, from top to bottom:

CER of 26.80% for a training set of 20 pages

CER of 19.27% for a training set of 50 pages

CER of 15.10% for a training set of 75 pages

CER of 13.57% for a training set of 100 pages

The results can be summed up in a table:

Training Set (pp.)   Evaluation Set (pp.)   Character Error Rate (CER)   Character Accuracy
20                   100                    26.80%                       73.20%
50                   70                     19.27%                       80.73%
75                   45                     15.10%                       84.90%
100                  20                     13.57%                       86.43%

 

Indeed, the accuracy improved with each iteration of training – the more training data the neural networks in Transkribus’ HTR+ engine have, the better the results. With a training set of 100 pages, Transkribus managed to automatically transcribe the remaining 20 pages with an accuracy rate of 86.43% – which is pretty good for historical handwritten Arabic script.
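
For readers unfamiliar with the metric, Character Error Rate is commonly defined as the edit (Levenshtein) distance between the ground-truth text and the recognised text, divided by the length of the ground truth; character accuracy is then one minus the CER. The Python sketch below follows that common definition. It is not Transkribus’ own evaluation code, which may differ in details such as text normalisation, and the function names are chosen for this example.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def character_accuracy(cer: float) -> float:
    """Character accuracy as the complement of the CER."""
    return 1.0 - cer


# A CER of 13.57% corresponds to the 86.43% accuracy reported above.
print(f"{character_accuracy(0.1357):.2%}")
```

Python strings are sequences of Unicode code points, so the same functions work on Arabic text, although combining marks count as separate characters unless the text is normalised first.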

As a next step, we could consider (1) adding more ground-truthed pages from our manuscripts to increase the size of the training set, and by that improve HTR accuracy; (2) adding other open ground truth datasets of handwritten Arabic to the existing training set, and checking whether this improves HTR accuracy; and (3) running a few manuscripts from QDL through Transkribus to see how its HTR+ engine transcribes them. If accuracy is satisfactory, we could see how to scale this up and make those transcriptions openly available and easily accessible.

In the meantime, I’m looking forward to participating in the OpenITI AOCP workshop entitled “OCR and Digital Text Production: Learning from the Past, Fostering Collaboration and Coordination for the Future,” taking place at the University of Maryland next week, and catching up with colleagues on all things Arabic OCR/HTR!

 

13 December 2019

Do you want to see my butterfly collection?

Posted on behalf of Sara Lucas Agutoli, artist, associate professor at the Accademia di Belle Arti di Bologna, BL Labs Artist in residence and runner up in the BL Labs Artistic Award 2019.

Sara Lucas Agutoli
Artist: Sara Lucas Agutoli
(Copyright: Ilenia Arosio)

Sara Lucas Agutoli lives and works between London and Bologna.  Her academic research focuses on the concepts of true and false in art, in particular in photography. In her art S. L. Agutoli merges popular themes with a learned and symbolic system of citations. Working with different media, she reflects on the idea of ongoing transformation – of the spaces, of the body, as well as of aesthetics – and creates personal architectures drawing on her inner experiences, knowledge and visions.

When occupied with my full-time job, I often spend the time wandering on the net, looking for pictures that trigger my interest, either because they are odd and curious or aesthetically pleasing and elegant.

Since 2011 I’ve enjoyed calling myself a cyber-flâneur [1]: unlike the Parisian strollers described by Baudelaire, I walked through cyber avenues, getting lost amid different digital archives. I glanced through collections of images instead of windows, stared at close-ups of manuscripts instead of sunsets on rivers. The net was my city and I just followed my nose walking through it. I wanted to make my curiosity an aesthetic operation. In doing so I’ve come to believe that online archives are my personal church of Saint-Julien-le-Pauvre, the chosen venue for my cyber-dadaist performances (see: https://www.moma.org/collection/works/184056).

For years my working activity followed a pattern: a few months of research, during which I spent hours and hours on Flickr Commons browsing the online archives of museums and institutions and saving selected images on my hard disk, followed by months in the studio working creatively with the pictures accumulated.

I accumulated images and emotions, from advertising to family album pictures. I wanted to explore how photography was used in different parts of the world, in different eras and in different economic contexts.

In 2011, while in Montreal for my first art residency, I analysed the different uses of vernacular photography in the 50s in North America and Italy. To do so, I used the open archives of many North American libraries and institutions (New York Public Library, Congregation of Sisters of St. Joseph in Canada, California Historical Society and many others) and a private physical archive located in a tin box in my grandmother's house.

This led to a series of pictures inspired by this contrast. The series was exhibited in a solo show called Fermez les yeux.

Sara Mickey: Fermez les yeux

The vastness and the richness of topics of the images I accumulated constantly triggered my creativity and my sense of humour. They often made me ask myself “why do those pictures exist?”

The images – especially those more vernacular, random and unforeseen – became the objets trouvés I could rework using my imagination and reality.

During this dadaist-inspired net-surfing, the most fertile encounter of the last years has been the one with the collections of two of the major London institutions: the British Library and the Wellcome Collection digital archives.  

I was about to move from Italy to London and so my artistic research was about to change, inspired by this encounter.

I started to become interested in the aesthetics of the Victorian era and in the concept of the museum as an extension of a wunderkammer.

I started collecting images of naturalia [2] and decided to transform them into artificialia in my studio. And so I did, creatively merging and morphing these images. In 2013 I produced a digital collage of a scientific illustration of a butterfly and a medical lithograph of a vulva, and it was exhibited in public space in Bologna during the CHEAP poster festival.

Cheap Poster Festival
Posters as part of the CHEAP poster festival

This collage of images from the British Library and the Wellcome Collection became the first piece of the larger project Il muro delle meraviglie – the wall of wonders – for which I chose to use the wall of my living room in my home/atelier in NW London.

Il muro delle meraviglie started as a joke to mock the colonialist aesthetic of Victorian museum collections and it became a work of art. Among the wonders I added subsequently, you can find that first collage of the butterfly and the vulva, which I decided to call “Do you want to see my butterfly collection?” to make my queer/feminist perspective encounter the delicacy of the naturalistic illustration of butterflies.

The title, in Italian, refers to an apparently naïve question which has an explicit sexual allusion.

The person who asks “come see my butterfly collection” might be suggesting it to obtain something more, as the butterfly is used as a metaphor for the female sex.

Sara Deep Thrash
Installation at DEEP THRASH

This work criticises the male chauvinist obsession with cataloguing, intended as an activity aimed more at showing off than simply showing.

It represents a feminist critique and re-appropriation of such images.

Here the butterflies become proper “c*nts” and give visibility to the female genitalia.

It was first exhibited in 2013 on the streets of Bologna (IT) during the CHEAP festival and at a Queer demonstration thanks to C*ntemporary.

If I didn’t have access to the BL and the Wellcome digital archives, all of this wouldn’t have been possible.

Finally, I would like to thank BL Labs for the support I have received, and I am excited about the new experiments and projects waiting for me around the corner.

Footnotes

  1. Flâneur: a French term meaning ‘stroller’ or ‘loafer’, used by the nineteenth-century French poet Charles Baudelaire to identify an observer of modern urban life. Dada raised the tradition of flânerie to the level of an aesthetic operation. The Parisian walk described by Walter Benjamin in the 1920s is utilized as an art form that inscribes itself directly in real space and time, rather than on a medium.
  2. Naturalia: a category that includes creatures and natural objects, with a particular interest in monsters.

29 November 2019

Introducing Filipe Bento - BL Labs Technical Lead

Posted by Filipe Bento, BL Labs Technical Lead

I am passionate about libraries and digital initiatives within them, and am particularly interested in Open Knowledge, scholarly communication, scientific information dissemination, (Linked) Open Data, and all the innovative services that can be offered to promote their ultimate dissemination and usage, not only within academia, but also within the wider community such as industry and society. I have over twenty years' experience in developing and supporting library tools, some of which have replaced manual methods with automation to make life easier for people who work in or use libraries.

Before working at the British Library, I was an independent consultant in the areas of digital strategies and initiatives, library technologies, information management, digital policies, Software as a Service (SaaS) and Open Source Software (OSS). Before that, I worked at EBSCO Information Services in several roles, firstly as the Discovery Service Engineering Support Team Manager (Europe and Latin America) and for three years as the Software Services, Application Programming Interfaces (API) and Applications (Apps) manager. My last role at EBSCO was implementing and managing the EBSCO App Store, which involved working with several departments within the organisation such as marketing and legal.

Giving a talk at the National Congress of BAD (Portuguese Librarians, Archivists and Documentalists Association), in the Azores

I helped the University of Aveiro's Library become the first Portuguese adopter of the reference Open Source Software (OSS) OJS [Open Journal Systems] and implemented the institutional digital repository DSpace for the university (which included a massive data transformation and records deposit, often from citations exported from Scopus). I started my career as a lecturer and then as a computer specialist at the University of Aveiro’s Library, coordinating the development of information systems for its many branches for over fifteen years.

My PhD research in Information and Communication in Digital Platforms gave me the opportunity to connect with my professional interests in libraries, especially in the areas of information discovery. In my PhD, I was able to implement VuFind with innovative community features, as a proposal for the university, which involved engaging actively in its developer community, providing general and technical support in the process. My thesis is available via the link "Search 4.0: Integration and Cooperation Confluence in Scientific Information Discovery".

University of Aveiro (main campus), Portugal

I have also been very active in a number of communities: I am a former chairman of the board of USE.pt, the Portuguese Ex Libris Systems’ Users Association, and a former member of the DigiMedia Research Center - Digital Media and Interaction at the University of Aveiro.

In my personal life I have been a radio and club DJ and worked on a number of personal music projects. I enjoy photography and video and am a keen traveller. I especially like being behind the wheels of cars / motorbikes and the propellers of drones.

I am really excited to be joining the BL Labs team as I believe it provides an excellent opportunity to apply my skills, knowledge and expertise in library digital collections development, systems, data and APIs in a digital scholarship and wider context. I am really looking forward to offering practical advice and implementations in providing access to data, data curation, data visualisation, text and data mining and interactive web based computing environments such as Jupyter Notebooks, to name a few. BL Labs and the British Library offer a rich, innovative and stimulating environment to explore what its staff and users want to do with its incredible and diverse digital collections.

26 November 2019

The British Library / Qatar Foundation Partnership Project Hack Day - Theme: Collaboration

Introduction

On October 16th 2019, the BL/QFP Project opened its doors to its third and biggest Hack Day to date. After the success of the first two, it was decided to extend participation beyond the Imaging Team and invite everyone from the rest of the project to get involved. We wanted to utilise the unique way the BL/QFP is set up within the Library, with different teams and specialities all working in one place, and emphasise collaboration across these different teams. The diverse people and teams within the BL/QFP bring a wide variety of skills and experience, from language and collections knowledge to artistic and technical expertise. By bringing these together, we were able to learn from and teach each other whilst engaging with the collections and producing a variety of fascinating and thought-provoking hacks.

 

The Hacks

During the launch workshop the vast array of skills and expertise within the team was evident, as were the abundant enthusiasm and ambition people had. After the proposed projects were raised and discussed, five teams were formed focusing on a wide range of ideas. On the day, almost half of the BL/QFP staff participated in some way, proving the collaboration objective was well and truly met. The diverse outputs, from animations and games to pinhole cameras and data visualisations, were presented at a “show and tell”, and some were displayed on the BL/QFP Twitter page.

Below is a summary of the day, including descriptions of the hacks from each of the five teams. Enjoy!

 

Games

Team: Renata Kaminska, Mariam Aboelezz, Anne Courtney, Susannah Gillard and our Quality Assurance Officer

For our Hack Day project, we wanted to make the Qatar Digital Library (QDL) more accessible to non-experts, or people who might not be looking at it with a specific research aim. With this in mind, we decided to develop some quick and easy games to engage users with the collections. Using free browser-based software, we created a word search, jigsaw puzzles, crosswords, and a game of hangman. These all drew on the collections which were already digitised. Where possible, we tried to include links to the items on the QDL. Although the free software had some limitations, we feel that these games offer a foundation to build on in the future.


Crossword using information from a letter from Lieutenant William Bruce, Resident, Bushire, 1814 (IOR/R/15/1/14, ff 125v-127)
Play: https://tinyurl.com/yespowj7

Jigsaw using image from Tarjumah-ʼi ʻAjā’ib al-makhlūqāt (Or 1621, f 391v)
Play: https://tinyurl.com/yjuf9d7o

Jigsaw using image of ‘Persia and Afghanistan. Map I’ (IOR/R/15/1/730, f 87)
Play: https://tinyurl.com/yhpv3erw

Hangman game using words from the QDL collection
Play: https://tinyurl.com/yftf5vca

Word Search using words from the QDL collection
Play: https://tinyurl.com/yj9nugb9

 

Photogrammetry: Astrolabe Quadrant

Team: Darran Murray, Tony Grant, Nick Krebs, Matt Griffin, Rebecca Harris, Matthew Lee, Daniel Loveday and Annie Ward

Our Hack Day project centred on what the Library's Imaging Services can do with the technology and expertise they offer, in particular photogrammetry. Photogrammetry is a photographic process where any type of object can be rendered into a 3D image for display on a 2D screen. This process is beneficial as it displays the complete item, allowing users to see and understand collection items without having to handle them.

We chose an Astrolabe Quadrant from the collection and created both a rendition and an animation of the same object. As you can see, both have certain advantages over a single photograph of a 3D object, bringing the object to life.

 

Camera view during photogrammetry creation

 

British Library St Pancras Studio during photogrammetry creation

 

Photogrammetry rendition of astrolabe quadrant

 

 

Animation created using images from astrolabe photogrammetry rendering

 

Obscura / Pinhole / Cyanotype

Team: Rebecca Harris, Matthew Lee, Daniel Loveday, Darran Murray and Annie Ward

Our team explored early types of photography. First off, we created a camera obscura in one of the bays in our Imaging Studio by blocking out the light and creating a small hole by the window. An optical phenomenon, this simple hack allowed us to create an upside-down projection of the London skyline, complete with moving clouds and the Shard. Viewing sessions were arranged throughout the day and resulted in a stream of curious visitors.

As well as the camera obscura, we each built our own pinhole cameras using card, stripy tape and light-sensitive paper. Once we can set up a temporary darkroom we plan to take and develop photographs illustrating different aspects of the BL/QFP Project. Lastly, an experiment using cyanotype paper in an old Brownie camera is still in progress, taking a long-exposure still life of a spirit level borrowed from the Conservation Team.

Upside-down projection of the London skyline created using the camera obscura

 

Pinhole cameras

 

Time-lapse video showing creation of pinhole cameras

 

Behind The Scenes: Visualisations

Team: Jordi Clopes Masjuan and Sotirios Alpanis, with translation assistance from George Samaan

Our Hack Day project aimed to illuminate some aspects of our digitisation workflow that are not directly represented in the material displayed on the QDL. We wanted to represent and celebrate some of the hard work that goes into the creation of digitised material, particularly the tasks and processes that most people wouldn’t necessarily think about. The 45+ people involved in our workflow have a huge variety of skills and expertise, and it was some of this that we wanted to capture. We decided to use ‘everyday’ objects from our workflow and picture them in interesting ways. We then picked out some interesting facts and figures about the processes they represent. Using Photoshop, we combined the two to present the facts in their ‘everyday’ setting.

QDL Homepage combined with a picture of the BL/QFP Team

 

Permission letters sent by Rights Clearance Team to people identified as Rights Holders for material being digitised

 

A book undergoing conservation treatment

 

The British Library’s digital servers

 


 

The Workflow Team’s Kanban Board

 

A Foliator’s desk drawer

 

Visualising Data

Team: David Woodbridge, Sotirios Alpanis, Laura Parsons, with assistance from Anna Waghorn

We had the idea to display data about the Project in visually dynamic and appealing ways. We thought we could experiment with displaying authority terms used in the Project’s catalogue records and see what we could learn about data manipulation and data visualisation along the way.

Whilst we were able to export data about the Project and collections from SharePoint (the platform we use to manage items through the digitisation workflow) and IAMS (the Library's cataloguing system for archives, manuscripts, photographs and other visual materials), we needed to tidy up the data to make it useful. For example, the IAMS data is exported as XML, so Sotirios experimented with extracting particular elements from it. This work highlighted how data visualisations rely on having well-organised and complete data.
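To give a flavour of what extracting particular elements from an XML export can involve, here is a minimal Python sketch. It is not the code written on the Hack Day, and the post does not describe the IAMS export schema, so the element names (`Record`, `Shelfmark`, `Authority`), the `type` attribute and the sample values are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: the real IAMS schema is not described in the post,
# so every element name, attribute and value below is invented.
SAMPLE_EXPORT = """
<Records>
  <Record>
    <Shelfmark>EXAMPLE/1</Shelfmark>
    <Authority type="person">Person A</Authority>
    <Authority type="place">Place B</Authority>
  </Record>
  <Record>
    <Shelfmark>EXAMPLE/2</Shelfmark>
    <Authority type="person">Person A</Authority>
  </Record>
</Records>
"""


def extract_authorities(xml_text):
    """Return one (shelfmark, authority type, authority term) row per term."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for record in root.iter("Record"):
        # Missing or empty elements become empty strings rather than errors.
        shelfmark = (record.findtext("Shelfmark") or "").strip()
        for authority in record.findall("Authority"):
            term = (authority.text or "").strip()
            rows.append((shelfmark, authority.get("type", ""), term))
    return rows


rows = extract_authorities(SAMPLE_EXPORT)
for row in rows:
    print(row)
```

Flattening nested records into simple rows like this is one way to get data into a shape that spreadsheet or dashboard tools can read, and it quickly exposes gaps such as records with no shelfmark.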

Using Microsoft Power BI, we tried a variety of ways of displaying the data, including network and force-directed graphs. These graphs show relationships between data points, such as the authority terms assigned to different shelfmarks. We also created other visualisations, such as pie charts, that quantified specific aspects of the data, for example showing the numbers of person authorities according to gender, or the language of their name. The challenge was to create something visually appealing but still meaningful.

 

Dashboard displaying data visualisations including network, force-directed and pie graphs.
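A network graph of this kind needs a list of links between items. The sketch below is only an illustration in Python (the Hack Day team worked in Power BI): it counts how many authority terms each pair of shelfmarks has in common, which could then serve as the link weights. The shelfmarks and terms are invented placeholders.

```python
from collections import Counter
from itertools import combinations

# Invented (shelfmark, authority term) pairs standing in for real catalogue data.
ASSIGNMENTS = [
    ("EXAMPLE/1", "Person A"),
    ("EXAMPLE/1", "Place B"),
    ("EXAMPLE/2", "Person A"),
    ("EXAMPLE/3", "Place B"),
    ("EXAMPLE/3", "Person A"),
]


def shared_term_edges(assignments):
    """Count, for each pair of shelfmarks, how many authority terms they share."""
    shelfmarks_by_term = {}
    for shelfmark, term in assignments:
        shelfmarks_by_term.setdefault(term, set()).add(shelfmark)

    edges = Counter()
    for shelfmarks in shelfmarks_by_term.values():
        # Every pair of shelfmarks carrying the same term gets one more link.
        for pair in combinations(sorted(shelfmarks), 2):
            edges[pair] += 1
    return edges


edges = shared_term_edges(ASSIGNMENTS)
for (left, right), weight in sorted(edges.items()):
    print(left, right, weight)
```

Incomplete data shows up directly here: a shelfmark with no authority terms never appears in any link, so it would be missing from the graph altogether.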

 

Weaponry on Walls

Team: Hannah Nagle & the British Library’s BAME Staff Network

Working in collaboration with the Library’s BAME Staff Network, we wanted to investigate people’s perceptions of weaponry displayed in our offices. We prepared a survey and sent it out to the BL/QFP Project staff, asking them to fill it in. A work in progress, the results, quotes and related imagery will be collated into a zine illustrating the survey responses to the weapons. We will be using original photographs, images from the QDL and public domain images found on Flickr, including from the British Library’s Flickr account. With this, we hope to start a conversation about what the weaponry can represent to different people and why this is important to keep in mind.

Example of images created to respond to the weaponry on the walls

 

Further information

If you would like to explore the photographs and documents used in our Hack Day projects from the Qatar Digital Library or find out more about the India Office Records please follow the links below:

 

You can also read about the previous Hack Days in the blog posts below:

 

This is a guest post by the British Library Qatar Foundation Partnership, compiled by Rebecca Harris and Laura Parsons. You can follow the British Library Qatar Foundation Partnership on Twitter at @BLQatar.

The BL/QFP Project’s Imaging Team won the Staff Award at the British Library Labs Symposium 2019 for their Hack Days.

 

20 November 2019

Hacking Web Maps-T

This is a guest blog post by Dr Gethin Rees, Lead Curator for Digital Map Collections at the British Library. It was originally posted on the Pelagios Commons Blog.

 

The Web Maps-T working group aims to enhance the ability to visualise geospatial and temporal Linked Open Data on web maps. The group is coordinated by Gethin Rees and Adi Keinan-Schoonbaert; see this previous post for more details. As a first step, we held a hack workshop in September at the British Library to scope applications to visualise and work with the GeoJSON-T standard. Participants at the workshop included Rainer Simon from the Austrian Institute of Technology, Karl Grossner from the University of Pittsburgh, Neil Jakeman from King’s College Digital Lab, and Alex Butterworth and Simon Wibberley from Sussex University’s Humanities Lab, alongside Mia Ridge and Olivia Vane from Digital Scholarship at the British Library. They came to the workshop with a variety of datasets, derived for example from the art, archaeology and travelogues of the classical world, as well as from the collections of the British Library, including contributions from the Endangered Archives Programme, Georeferencer and Two Centuries of Indian Print.

Matching datasets to visualisation types

 

Throughout the event we collaborated using repositories in the Pelagios GitHub organisation and Google Docs. After introductory talks, first by Gethin Rees on user stories and use cases for Web Maps-T, and second by Karl Grossner on GeoJSON-T and the Linked Traces app, participants introduced their own projects and use cases for geospatial and temporal visualisation.

Karl and Alex discuss the finer points of “The Prelude Timeline: On the Growth of My Own Mind” by Alex Butterworth and Stephanie Posavec, for The Wordsworth Trust

 

Participants then coalesced around several tasks and divided into groups to work. These tasks included:

  • MoSCoW assessment of Web Maps-T app. This prioritisation technique divided potential features of the app into Must, Should, Could and Won’t. For example, our app must have a timeline, a map, and work with GeoJSON-T. On the other hand it could use animation for visualisation, and won’t include a text box for plain-English queries.
  • Classification of datasets and visualisation types. We wanted to explore several forms of visualisation and it quickly became clear that some were better suited to certain types of datasets than others. For histogram, linear journey, period horizontal bars, valid time intervals and beeswarm visualisations we recorded data type, size, examples and pros and cons.

These activities were documented in detail and will form the basis of the white paper. Participants who wrote code worked on:

  • WhenJSON: a front-end JavaScript utility library for manipulating temporal data in the GeoJSON-T format (the ‘when’ object). The utility could, for example, help in calculating the interval that separates two data sources.
  • A Minimum Viable Product (MVP) web map with a time-slider component to test visualisation types for different dataset classifications. This work borrows heavily from the work of Jonathan Skeate. The MVP allowed us to see how datasets looked when visualised using different methods such as histogram or time bars.
Visualisation of Endangered Archives Programme data

 

  • Scoping the adaptation of Karl Grossner’s Linked Traces app to be used with any dataset and to add a time slider. This could be a more robust and presentable solution than our adaptation of Skeate’s Leaflet Timeline.
Itinerary of the Bordeaux pilgrim in Linked Traces
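Two of the coding tasks above, calculating the interval between data sources and filtering a map with a time slider, can be sketched in a few lines. The Python below is an illustration only: it is not WhenJSON (a JavaScript library) and does not reproduce its API, and it assumes a much simplified ‘when’ object in which each timespan is a pair of integer years, which is not the full GeoJSON-T specification. The function names and sample features are hypothetical.

```python
def overall_span(when):
    """Return (earliest start, latest end) across all timespans of a 'when' object."""
    starts = [timespan["start"] for timespan in when["timespans"]]
    ends = [timespan["end"] for timespan in when["timespans"]]
    return min(starts), max(ends)


def gap_between(when_a, when_b):
    """Years separating two 'when' objects, or 0 if their overall spans overlap."""
    a_start, a_end = overall_span(when_a)
    b_start, b_end = overall_span(when_b)
    if a_end < b_start:
        return b_start - a_end
    if b_end < a_start:
        return a_start - b_end
    return 0


def active_in_year(features, year):
    """Features with at least one timespan containing the year (a time-slider filter)."""
    return [
        feature
        for feature in features
        if any(ts["start"] <= year <= ts["end"] for ts in feature["when"]["timespans"])
    ]


# Invented features; real GeoJSON-T features also carry geometry and richer dates.
FEATURES = [
    {"properties": {"title": "Dataset A"},
     "when": {"timespans": [{"start": 1500, "end": 1600}]}},
    {"properties": {"title": "Dataset B"},
     "when": {"timespans": [{"start": 1650, "end": 1700},
                            {"start": 1720, "end": 1750}]}},
]

gap = gap_between(FEATURES[0]["when"], FEATURES[1]["when"])
visible = [f["properties"]["title"] for f in active_in_year(FEATURES, 1725)]
print(gap, visible)
```

Moving the slider to a year that falls between timespans, such as 1710 here, returns no features, which is the kind of behaviour the MVP made it possible to check against different dataset types.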

 

Next steps for the group are to write a white paper and to attend the Linked Pasts conference in Bordeaux. All the workshop participants had projects where they were keen to apply Web Maps-T and there must be many more use cases out there. However, the path from hack event to an open-source, production-ready component that can visualise any GeoJSON-T data we throw at it is not straightforward. Like any open-source project, success rests on people and projects requiring the component and standard for their day-to-day work. There are plenty of potential applications in the Linked Pasts community and beyond; the main resources that we now require are developer time and GeoJSON-T datasets. We will continue to encourage others to contribute. Perhaps you have potential applications? Get in touch at visualisation[at]pelagios.org.

 

31 October 2019

Digital Conversation: Games, Literature and Learning

Happy Halloween! If you aren't heading out trick or treating this evening, and prefer a quiet night in, then you may like to play some of the wonderful games and interactive fiction that were created in last year's Gothic Novel Jam. My personal favourite is The Lady's Book of Decency, by Sean S. LeBlanc. If you feel like a longer read, Lynda Clark's first novel Beyond Kidding is launched today. I can't wait to get my teeth into this!

It is also very nearly International Games Week (3-9 November 2019). This is an initiative run by volunteers from around the world to reconnect communities through their libraries around the educational, recreational, and social value of all types of games. Check out this map to see if there is an event near you!

If you are based in the UK and work in libraries, you may be interested in coming along to the next Game Library Camp, which is being held at Leeds Central Library on Saturday 9th November. More details can be found at https://librarycamp.game.blog/. I'll be there to lead the session "I hope you like jammin’ too", a discussion for sharing advice on running online interactive fiction writing jams.

Here at the British Library we have a couple of International Games Week (IGW) events. We are excited to be hosting the narrative games convention AdventureX on Saturday 2nd and Sunday 3rd November. All tickets are completely sold out, but the talks will be live streamed online from http://adventurexpo.org/livestream/; check the schedule for times and watch from the comfort of your sofa.

Continuing our IGW events at the British Library, on the evening of Monday 4th November we are holding a Digital Conversation on Games, Literature and Learning. This will be a panel discussion exploring how video games, such as Minecraft, can be used to engage learners of all ages with literature, libraries and museums.

Child playing Litcraft on an ipad
Child trying Litcraft at SPARK: The Science and Art of Creativity, a festival of ideas organised by the British Council in Hong Kong, image credit ATUM Images

Jordan Erica Webber will be our chair for the evening. Jordan is a writer and presenter, co-author of Ten Things Video Games Can Teach Us, host of the Guardian's digital culture podcast Chips With Everything and resident games expert on The Gadget Show.

Our panellists include: 

  • Keith Stuart, Guardian journalist, writer and author of A Boy Made of Blocks, a Richard and Judy Book Club pick and a major bestseller. He has written about how Minecraft has helped his son Zac and will be talking from a parent’s perspective. Keith will also be available to sign books after the panel discussion.
Book cover for A Boy Made of Blocks
A Boy Made of Blocks, by Keith Stuart
  • Dr Lissa Holloway-Attaway and Dr Björn Berg Marklund from the University of Skövde in Sweden, whose research specialisms are digital game-based learning, educational games, 'Serious Games' and how games are used in classrooms. They have collaborated with museums and cultural organisations to run workshops with young people to co-create their communities within the Baltic Sea Region by constructing imaginative simulations of their cities and neighbourhoods in Minecraft. Another of their projects is the augmented reality children's book app KLUB, which stands for Kiras och Luppes Bestiarium (Kira and Luppe's Bestiary), where mythical beings from books come to life in 3D.
The Kiras och Luppes Bestiarium app being demonstrated on a smartphone
The Kiras och Luppes Bestiarium app 
  • Professor Sally Bushell from the Department of English Literature & Creative Writing at Lancaster University. Sally is the Principal Investigator on the Litcraft project, which uses the popular Minecraft gaming platform to build accurate scale models of authorial maps from classic works of literature. Impact is achieved by re-engaging children with literature in a model of positive reinforcement that makes works accessible in entirely new ways, combining the textual and the digital. Reading and writing are integrated with an immersive experience of the literary worlds.

Promotional video for the third Litcraft release - the first pairing of connected texts. This build features the original and iconic castaway tales: Swiss Family Robinson and Robinson Crusoe.

The Digital Conversation event takes place in The Knowledge Centre at the British Library on Monday 4th November, 18.30-20.30. For more details, including booking, visit: https://www.bl.uk/events/digital-conversation-games-literature-and-learning. Hope to see you there.

This post is by Digital Curator Stella Wisdom, on Twitter as @miss_wisdom

30 October 2019

Workshop on “Digitisation Workflows & Digital Research Studies Methodologies”

In this post, Nicolas Moretto, Metadata Systems Analyst at the British Library, reflects on his work trip to India.

Earlier this year I was given the opportunity to attend a workshop on “Digitisation Workflows & Digital Research Studies Methodologies” held at the National Centre for Biological Sciences (NCBS) in Bangalore, India.

The workshop, which was held on the NCBS campus in the northern part of Bangalore, was jointly organised by Tom Derrick (Two Centuries of Indian Print - 2CIP) and our host Venkat Srinivasan who is the archivist at NCBS. Tom represented the 2CIP project while I attended to cover different metadata aspects. The event was attended by colleagues from 26 different institutions. Tom and I were kindly provided with accommodation on the campus.

a photo showing the workshop participants sitting outside the main building at NCBS campus

Attendees of the workshop outside the NCBS main building                                                                                                         

The workshop was intended as an opportunity to learn more about cataloguing, digitisation and OCR, and for the Indian participants to meet colleagues from Bangalore and other parts of India, share experiences, exchange ideas and discuss common standards and best practices. The chance to meet with colleagues working on similar activities – and encountering similar challenges – was an important aspect of the workshop. Most attendees were not professional archivists but had come into archives from academic and other backgrounds and had been exposed to archives and cultural heritage in different ways. All participants shared a high level of enthusiasm for archives and a passion for preserving cultural heritage and the memory of their communities.

workshop participants sitting at desks during the workshop one group of workshop participants in discussion
On the left: The Safeda Room at NCBS. On the right: the NCBS campus offered space for discussions during the breaks

 

The topics of the two-day workshop ranged from talks on the description and arrangement of material (archival and related discovery standards) and presentations on specific projects to digitisation workflows and OCR. Tom gave a practical demo of OCR tools for Indic scripts. I gave a presentation on each day, covering metadata description as well as reuse and discovery.

Ten of the Indian institutions presented five-minute lightning talks covering a diverse range of initiatives and describing their archival collections. The Ashoka Archives of Contemporary India presented their collection, which includes the Mahatma Gandhi papers as well as material from other Indian politicians and academics. The Keystone Foundation gave an overview of the opportunities and challenges around their work with indigenous communities in India. Their aim is to challenge traditional portrayals of indigenous culture by employing oral history interviews, which give a voice to parts of the culture that would otherwise remain unheard. The French Institute of Pondicherry featured material that had been digitised for several Endangered Archives Programme (EAP) projects, including ceiling murals and glass frames. The participants from FLAME University presented a project of digitising Indian cookbooks, showing the interdependencies between caste and cooking. The multimedia resource Sahapedia (https://www.sahapedia.org/) was presented as a way of curating Indian heritage in an online environment. All participants were looking for ways to make cultural heritage more accessible using digital tools. On the afternoon of the second day, the participants had an opportunity to undertake a hands-on activity testing OCR tools using their own material.

The workshop was well received and feedback was overall positive. The participants voiced interest in receiving more in-depth practical training and how-to guides around cataloguing and metadata capture, setting up systems as well as preservation and conservation.

Maya Dodd speaking during her presentation Venkat shows a group of participants some documents inside the NCBS archive
On the left: Maya Dodd from FLAME University presents the Indian recipes project. On the right: Venkat giving a tour of the NCBS archive

 

On the evening of the first day, Venkat gave us a tour of the NCBS archives, which he had built up from scratch, working with NCBS researchers and with the help of student volunteers. The archive was remarkably open, inviting in students and staff even if they did not have an explicit research interest. Venkat was very interested in maintaining it as an open space. His archive is accompanied by an open and evolving exhibition space, which students can contribute to.

Setting up archives in India is not an easy undertaking, and Venkat has put in a tremendous effort to make it work. Even the essentials can be difficult to come by: there is no supplier of archival materials in India, for example, and Venkat had to import all his acid-free boxes from Germany.

On my last day, I accompanied Tom on a visit to the Karnataka State Central Library. The Director of the Department of Public Libraries, Dr. Satish Kumar Hosamani, was not present, but his team kindly offered to give us a tour of the library. The Librarian showed us the round reading room, the newspaper reading room and the collection of rare books and manuscripts. The State Library is planning to digitise these in the near future. This activity is currently awaiting approval and funding from the Karnataka state government.

A view outside the front of the State Central Library  A view of the reading room inside the State Central Library

On the left: Karnataka State Central Library in Cubbon Park. On the right: the round reading room in the State Central Library

 

Trying to find our way to the library, we discovered the existence of a “British Library Road” in Bangalore but were unable to reach it because of the city's customarily heavy traffic. Getting to and from destinations usually took a long time. The best way to get around over short distances was by “Tuk-tuk”, the ever-present means of transport in Indian cities.

A screenshot of Google Maps centred on British Library Road, Bangalore A photo taken from a tuk tuk of congested traffic in Bangalore
On the left: British Library Road in Bangalore. On the right: view from a Tuk-Tuk - the traffic in Bangalore was eternally gridlocked!