Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

11 February 2020

New project! 'From crowdsourcing to digitally-enabled participation: the state of the art in collaboration, access, and inclusion for cultural heritage institutions'

[Update, March 2020: like so much else, our plans for the 'Collective Wisdom' project have been thrown out by the COVID-19 pandemic. We have an extension from our funders and will look to confirm dates when the global situation (especially around international flights) becomes clearer. In the meantime, the JISCMail Crowdsourcing list has some discussion on starting and managing projects in the current context.]

We - Mia Ridge (British Library), Meghan Ferriter (Library of Congress) and Sam Blickhan (Zooniverse) - are excited to announce that we've been awarded an AHRC UK-US Partnership Development Grant. Our overarching goals are:

  • To foster an international community of practice in crowdsourcing in cultural heritage
  • To capture and disseminate the state of the art and promote knowledge exchange in crowdsourcing and digitally-enabled participation
  • To set a research agenda and generate shared understandings of unsolved or tricky problems that could lead to future funding applications

How will we do that?

We're holding a five-day collaborative 'book sprint' (or writing workshop) at the Peale Center for Baltimore History and Architecture in April 2020. Working with up to 12 other collaborators, we'll write a high-quality book that provides a comprehensive, practical and authoritative guide to crowdsourcing and digitally-enabled participation projects in the cultural heritage sector. We want to provide an effective road map for cultural institutions hoping to use crowdsourcing for the first time and a resource for institutions already using crowdsourcing to benchmark their work.

In the spirit of digital participation, we'll publish a commentable version of the book online with an open call for feedback from the extended international community of crowdsourcing practitioners, academics and volunteers. We're excited about including the expertise of those unable to attend the book sprint in our final open access publication.

The book sprint will close with a short debrief session to capture suggestions about gaps in the field and sketch the agenda for the closing workshop. 

In October 2020 we're holding a workshop at the British Library for up to 25 participants to interrogate, refine and advance questions raised during the year and identify high priority gaps and emerging challenges in the field that could be addressed by future research collaborations. We'll work with a community manager to ensure that remote participants are as integrated into the event as possible, which will lower our carbon footprint and let people contribute without getting on a plane.

We'll publish a white paper reporting on this workshop, outlining emerging, intractable and unsolved challenges that could be addressed by further funding for collaborative work. 

Finally, we want this project to help foster the wonderful community of crowdsourcing practitioners, participants and researchers by hosting events and online discussion. 

Why now?

For several years, crowdsourcing has provided a framework for online participation with, and around, cultural heritage collections. This popularity leads to increased participant expectations while also attracting criticism such as accusations of ‘free labour’. Now, the introduction of machine learning and AI methods, together with co-creation and new models of ownership and authorship, presents significant challenges for institutions used to managing interactions with collections on their own terms.

How can you get involved?

Our call for participants in our April Book Sprint is now open!

Our final workshop will be held in mid- or late-October. The easiest way to get updates such as calls for contributors and links to blog posts is to sign up for the British Library's crowdsourcing newsletters or join the Crowdsourcing group on Humanities Commons.

03 February 2020

2019 Winners of the New Media Writing Prize

On Wednesday 15 January 2020 it was the 10th Anniversary Awards Evening of the New Media Writing Prize (NMWP) at Bournemouth University. This international prize encourages and promotes the best in new media writing, showcasing innovative digital fiction, poetry and journalism. These are the types of interactive writing that we have been examining and researching in the emerging formats work at the Library.

New Media Writing Prize logo

Before the NMWP winners were announced there was a fun hands-on session in the afternoon, for guests to experience Digital Fiction Curios. This is an immersive experience, re-imagining selected Flash-based digital fiction by the One to One Development Trust in Virtual Reality, made in collaboration with Sheffield Hallam University. Here in the Library we are interested in their playful and innovative approach to preserving the experiences of reading their digital works, and last October the project team were invited to showcase this work to British Library staff, who were able to try it in VR.

Dreaming Methods: Digital Fiction Curios Teaser from One to One Development Trust 

On to the main NMWP awards event: as in previous years, the 2019 competition attracted strong entries from many parts of the world. With submissions from six continents, the event’s host Jim Pope pointed out that Antarctica was the only geographic area not to have participated yet.

Congratulations to all the 2019 winners:

  • The if:book UK New Media Writing Prize, the main category, was won by Maria Ivanova and her team of volunteers (Anna Gorovaya, Alexey Logvinov, Mike Stonelake, Anton Zayceve and Ekaterina Polyakova) from Belarus for ‘The Life of Grand Duchess Elizabeth’. This is a stunning biographical narrative, featuring open source archive photographs and quotations from the memoirs of generous philanthropist Grand Duchess Elizabeth Feodorovna of Russia. A granddaughter of Queen Victoria, she lived through several key events in the history of Russia, including the Russo-Japanese War, the First World War and the revolutions of 1905 and 1917, and became one of Russia’s most prominent philanthropists.
  • The Future Journalism award was won by Mahmoud El Wakea’s ‘Made in Prison’, an investigation of Jihadi radicalisation in Egypt.
  • The Unicorn Training Student award was won by Kenneth Sanchez for ‘Escaping the Chaos’, an emotive portrayal of Venezuelan migrants in Peru, with video footage of individuals telling their personal stories.
  • The Dot award for 2019 went to Clare Pollard, editor of Modern Poetry in Translation; the award will enable them to digitise their magazine and to grow it internationally.
Still image from 'The Life of Grand Duchess Elizabeth', Winner of the 2019 if:book UK New Media Writing Prize.

It was gratifying to see that Lynda Clark featured on the main prize shortlist for her work ‘The Memory Archivist’, which was made during her Innovation Placement at the British Library in 2019. Also shortlisted was previous Eccles Centre Fellow J.R. Carpenter, for the hydro-graphic novel ‘The Pleasure of the Coast’, created in partnership with the Archives Nationales in Paris.

Full shortlists were: 

The 2019 if:book main prize shortlist:

 The Unicorn Student Award 2019 shortlist:

Still image from 'Escaping the Chaos', Winner of the 2019 Unicorn Training Student award

The Future Journalism Award 2019 shortlist for the best digital interactive journalism, awarded by Future PLC:

Still image from 'Made in Prison', Winner of the 2019 Future Journalism award 

If reading this blog post is inspiring you to consider entering the Prize in 2020, please do keep your eyes peeled for their call for submissions later in the year. You can follow NMWP on Twitter and Facebook. Also do check out the Competition Rules and the FAQs to make sure your creative output fits the competition's criteria.

This post is by Digital Curator Stella Wisdom (@miss_wisdom).

27 January 2020

How historians can communicate their research online

This blog post is by Jonathan Blaney (Institute of Historical Research), Frances Madden (British Library), Francesca Morselli (DANS) and Jane Winters (School of Advanced Study, University of London).

This blog will be published in several other locations including the FREYA blog and the IHR blog

Large satellite receiver
Source: Joshua Hoehne, Unsplash

On 4 December 2019, the FREYA project, in collaboration with the UCL Centre for Digital Humanities, the Institute of Historical Research, the British Library and DARIAH-EU, organized a workshop in London on identifiers in research. Aimed mainly at historians and humanities scholars, the workshop focused on ways in which they can build and manage an online profile as researchers, using tools such as ORCID IDs. It also covered best practices and methods of citing digital resources to make humanities researchers' work connected and discoverable to others. The workshop had 20 attendees, mainly PhD students from the London area but also curators and independent researchers.

Presentations

Frances Madden from the British Library introduced the day, which was supported by the FREYA project, funded under the EU’s Horizon 2020 programme. FREYA aims to increase the use of persistent identifiers (PIDs) across the research landscape by building up services and infrastructure. The British Library is leading on the humanities and social sciences aspect of this work.

Frances described how PIDs are central to scholarly communication becoming effective and easy online. We will need PIDs not just for publications but for grey literature, for data, for blog posts, presentations and more. This is clearly a challenge for historians to learn about and use, and the workshop is a contribution to that effort.

PIDs: some historical context

Jonathan Blaney from the Institute of Historical Research said that there is a context to citation and the persistent identifiers which have grown up around traditional forms of print citation. These are almost invisible to us because they are deeply familiar. He gave an example of a reference to the gospel story of the woman taken in adultery:

John 7:53-8:11

There are three conventions here: the name ‘John’ (attached to this gospel since about the 2nd century), the chapter divisions (medieval and ascribed to the English bishop Stephen Langton) and the verse divisions (from the middle of the 16th century).

When learning new forms of referencing, such as the ones under discussion at the workshop, Jonathan suggested that historians should remember their implicit knowledge has been learned. He finished with an anecdote about Harry Belafonte, retold in Anthony Grafton’s The Footnote: A Curious History. As a young sailor Belafonte wanted to follow up on references in a book he had read. The next time he was on shore leave he went to a library and told the librarian:

“Just give me everything you’ve got by Ibid.”

People in conference room watching a presentation

Demonstrating the benefits

Prof Jane Winters from the School of Advanced Study introduced what she claimed was her most egotistical presentation by explaining her own choices in curating her online presence, and also what was beyond her control. She showed the different results of web searches for herself using Google and DuckDuckGo and pointed out how things she had almost forgotten about can still feature prominently in results.

Jane described her own use of Twitter, and highlighted both the benefits and challenges of using social media to communicate research and build an online profile. It was the relatively rigid format of her institutional staff profile that led her to create her own website. Although Jane has an ORCID ID and a page on Humanities Commons, for example, there are many online services she has chosen not to use, such as academia.edu.

This is all very much a matter of personal choice, dependent upon people’s own tastes and willingness to engage with a particular service.

How to use what’s available

Francesca Morselli from DANS gave a presentation aiming to provide useful resources about identifiers for researchers as well as explaining in a simple yet exhaustive way how they "work" and the rationale behind them.

Most importantly PIDs ensure:

  1. Citability and discoverability (both for humans and machines)
  2. Disambiguation (between similar objects)
  3. Linking to related resources
  4. Long-term archiving and findability
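
As one concrete illustration of how an identifier supports disambiguation, the final character of an ORCID iD is a check character, which lets software catch most mistyped iDs before they are attached to the wrong person. The sketch below is not from the workshop; it assumes the ISO 7064 MOD 11-2 scheme that ORCID documents for its iDs, and the function names are hypothetical. The sample iD is the one ORCID publishes for the fictitious researcher Josiah Carberry.

```python
def orcid_check_character(base_digits: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape and check character of an iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "")
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15].upper()
    if not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_character(base) == check


if __name__ == "__main__":
    print(is_valid_orcid("0000-0002-1825-0097"))
```

A check like this only confirms that an iD is well formed; it says nothing about whether the iD has actually been registered or to whom it belongs.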

Francesca then introduced the support provided by projects and infrastructures: FREYA, DARIAH-EU and ORCID. Among the FREYA project pillars (PID graph, PID Commons, PID Forum), the last is available for anyone interested in identifiers.

The DARIAH-EU infrastructure for Arts and Humanities has recently launched the DARIAH Campus platform which includes useful resources on PIDs and managing research data (i.e. all materials which are used in supporting research). In 2018 DARIAH also organized a winter school on Open Data Citation, whose resources are archived here.

Dariah

 

A Publisher’s Perspective

Kath Burton from Routledge Journals emphasised how much use publishers make of digital tools to harvest content, including social media crawlers, data harvesters and third party feeds.

The importance of maximising your impact online when publishing was explained, both before publishing (filling in the metadata, giving a meaningful title) and afterwards (linking to the article from social media and websites), as well as how publishers can help support this.

Kath went on to give an example of Taylor & Francis’s interest in the possibilities of online scholarly communication by describing its commitment to publishing 3D models of research objects, which it does via its Sketchfab page.

Breakout Groups

After the presentations and a coffee break there were group discussions about what everyone had just heard. During the first part, the groups were asked what was new to them in the presentations. It was clear from discussions around the room that attendees had heard much which was new to them. For example, some attendees had ORCID IDs but many were surprised at the range of things for which they could be used, such as in journal articles and logging into systems. They were also struck by the range of things in which publishers were interested such as research data. Many were really interested in the use of personal websites to manage their profile.

When asked what tallied with their experiences, it became clear that they were keen to engage with these systems, setting up ORCID IDs and Humanities Commons profiles but that they felt that they were too early on in their careers to have anything to contribute to these platforms and felt they were designed for established researchers. Jane Winters stressed that one could adopt a broad approach to the term ‘publications’, including posters, presentations and blog posts and encouraged all to share what they had.

Lastly, discussion turned to how the group cites digital resources. This led to an interesting conversation around the citation of archived web pages and how to cite webpages which might change over time, with tools such as the Internet Archive being mentioned. There was also discussion about whether one can cite resources such as Wikipedia and it was clear that this was not something which had been encouraged. Jonathan, who has researched this subject, mentioned that he had found established academics are happier to cite Wikipedia than those earlier in their career.

Conclusions

The workshop effectively demonstrated the sheer range of online tools, social media forums and publishing venues (both formal and informal) through which historians can communicate their research online. This is both an opportunity and a problem. It is a challenge to develop an online presence - to decide which methods are most appropriate for different kinds of research and different personalities - but that is just the first step. For research communication to be truly valuable, it is necessary to focus your effort, manage your online activities and take control of how you appear to others in digital spaces. PIDs are invaluable in achieving this, and in helping you to establish a personal research profile that stays with you as you move through your career. At the start of the day, the majority of those who attended the workshop did not know very much about PIDs and how you can put them to use, but we hope that they came away with an enhanced understanding of the issues and possibilities, the awareness that it does not take much effort or skill to make a real difference to how you are perceived online, and some practical advice about next steps.

It was apparent that, with some admirable exceptions, neither higher education institutions nor PID organisations are successfully communicating the value and importance of PIDs to early career researchers. Workshop attendees particularly welcomed the opportunity to hear from a publisher and senior academic about how PIDs are used to structure, present and disseminate academic work. The clear link between communicating research online and public engagement also emerged during the course of the day, and there is obvious potential for collaboration between PID organisations and those involved with training focused on impact and public engagement. We ended the day with lots of ideas for further advocacy and training, and a shared appreciation for the value of PIDs for helping historians to reach out to a range of different audiences online.

20 January 2020

Using Transkribus for Arabic Handwritten Text Recognition

This blog post is by Dr Adi Keinan-Schoonbaert, Digital Curator for Asian and African Collections, British Library. She's on Twitter as @BL_AdiKS.

 

In the last couple of years we’ve teamed up with PRImA Research Lab in Salford to run competitions for automating the transcription of Arabic manuscripts (RASM2018 and RASM2019), in an ongoing effort to identify good solutions for Arabic Handwritten Text Recognition (HTR).

I’ve been curious to test our Arabic materials with Transkribus – one of the leading tools for automating the recognition of historical documents. We’ve already tried it out on items from the Library’s India Office collection as well as early Bengali printed books, and we were pleased with the results. Several months ago the British Library joined the READ-COOP – the cooperative taking up the development of Transkribus – as a founding member.

As with other HTR tools, Transkribus’ HTR+ engine cannot start automatic transcription straight away, but first needs to be trained on a specific type of script and handwriting. This is achieved by creating a training dataset – a transcription of the text on each page, as accurate as possible, and a segmentation of the page into text areas and lines, demarcating the exact location of the text. Training sets are therefore composed of a set of images and an equivalent set of XML files, containing the location and transcription of the text.
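
To give a feel for what such an XML file holds, here is a minimal sketch that reads line coordinates and transcriptions from a small, made-up sample. The element and attribute names are assumptions loosely modelled on the PAGE XML format; real ground-truth files carry XML namespaces and much more metadata, and the sample text is placeholder English rather than a real transcription.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified ground-truth file for one page image.
SAMPLE_PAGE = """<Page imageFilename="page_001.jpg">
  <TextRegion id="r1">
    <TextLine id="l1">
      <Coords points="10,10 200,10 200,40 10,40"/>
      <TextEquiv><Unicode>first line of text</Unicode></TextEquiv>
    </TextLine>
    <TextLine id="l2">
      <Coords points="10,50 200,50 200,80 10,80"/>
      <TextEquiv><Unicode>second line of text</Unicode></TextEquiv>
    </TextLine>
  </TextRegion>
</Page>"""


def read_lines(xml_text):
    """Return the image filename and a list of (line id, polygon, text)."""
    root = ET.fromstring(xml_text)
    lines = []
    for line in root.iter("TextLine"):
        raw_points = line.find("Coords").get("points")
        # Each point is "x,y"; points are separated by spaces.
        polygon = [tuple(int(v) for v in p.split(",")) for p in raw_points.split()]
        text = line.findtext("TextEquiv/Unicode", default="")
        lines.append((line.get("id"), polygon, text))
    return root.get("imageFilename"), lines


if __name__ == "__main__":
    image, page_lines = read_lines(SAMPLE_PAGE)
    for line_id, polygon, text in page_lines:
        print(image, line_id, polygon[0], text)
```

Each line pairs a polygon on the image with the text a human has confirmed for it, which is what lets the engine learn the mapping from pixels to characters.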

A screenshot from Transkribus, showing the segmentation and transcription of a page from Add MS 7474.

 

This process can be done in Transkribus, but in this case I already had a training set created using PRImA’s software Aletheia. I used the dataset created for the competitions mentioned above: 120 transcribed and ground-truthed pages from eight manuscripts digitised and made available through QDL. This dataset is now freely accessible through the British Library’s Research Repository.

Transkribus recommends creating a training set of at least 75 pages (between 5,000 and 15,000 words). However, I was interested to find out a few things. First, the methods submitted for the RASM2019 competition worked on a training set of 20 pages, with an evaluation set of 100 pages, so I wanted to see how Transkribus’ HTR+ engine dealt with the same scenario. It should be noted that the RASM2019 methods were evaluated using PRImA’s evaluation methods, which is not the case with Transkribus’ evaluation method; the results shown here are therefore not directly comparable, but give some idea of how Transkribus performed on the same training set.

I created four different models to see how Transkribus’ recognition algorithms deal with a growing training set. The models were created as follows:

  • Training model of 20 pages, and evaluation set of 100 pages
  • Training model of 50 pages, and evaluation set of 70 pages
  • Training model of 75 pages, and evaluation set of 45 pages
  • Training model of 100 pages, and evaluation set of 20 pages

The graphs below show each of the four iterations, from top to bottom:

CER of 26.80% for a training set of 20 pages

CER of 19.27% for a training set of 50 pages

CER of 15.10% for a training set of 75 pages

CER of 13.57% for a training set of 100 pages

The results can be summed up in a table:

Training Set (pp.)   Evaluation Set (pp.)   Character Error Rate (CER)   Character Accuracy
20                   100                    26.80%                       73.20%
50                   70                     19.27%                       80.73%
75                   45                     15.10%                       84.90%
100                  20                     13.57%                       86.43%

 

Indeed the accuracy improved with each iteration of training – the more training data the neural networks in Transkribus’ HTR+ engine have, the better the results. With a training set of 100 pages, Transkribus managed to automatically transcribe the remaining 20 pages with an accuracy rate of 86.43% – which is pretty good for historical handwritten Arabic script.
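
For readers unfamiliar with the metric, Character Error Rate is conventionally the edit distance between the recognised text and the ground truth, divided by the length of the ground truth, and the character accuracy figures above are simply 100% minus the CER. The sketch below is a generic implementation of that definition; it is an assumption that it approximates what Transkribus reports, since the tool's own evaluation may normalise or tokenise text differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as a fraction of the reference (ground truth) length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    cer = character_error_rate("abcd", "abxd")  # one substitution in four characters
    print(f"CER {cer:.2%}, accuracy {1 - cer:.2%}")
```

Because insertions count as errors, a CER computed this way can exceed 100% when the recognised text is much longer than the ground truth, which is one reason figures from different evaluation tools are hard to compare directly.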

As a next step, we could consider (1) adding more ground-truthed pages from our manuscripts to increase the size of the training set, and by that improve HTR accuracy; (2) adding other open ground truth datasets of handwritten Arabic to the existing training set, and checking whether this improves HTR accuracy; and (3) running a few manuscripts from QDL through Transkribus to see how its HTR+ engine transcribes them. If accuracy is satisfactory, we could see how to scale this up and make those transcriptions openly available and easily accessible.

In the meantime, I’m looking forward to participating in the OpenITI AOCP workshop entitled “OCR and Digital Text Production: Learning from the Past, Fostering Collaboration and Coordination for the Future,” taking place at the University of Maryland next week, and catching up with colleagues on all things Arabic OCR/HTR!

 

13 December 2019

Do you want to see my butterfly collection?

Posted on behalf of Sara Lucas Agutoli, artist, associate professor at the Accademia di Belle Arti di Bologna, BL Labs Artist in residence and runner up in the BL Labs Artistic Award 2019.

Sara Lucas Agutoli
Artist: Sara Lucas Agutoli
(Copyright: Ilenia Arosio)

Sara Lucas Agutoli lives and works between London and Bologna.  Her academic research focuses on the concepts of true and false in art, in particular in photography. In her art S. L. Agutoli merges popular themes with a learned and symbolic system of citations. Working with different media, she reflects on the idea of ongoing transformation – of the spaces, of the body, as well as of aesthetics – and creates personal architectures drawing on her inner experiences, knowledge and visions.

When occupied with my full time job, I often spend the time wandering on the net, looking for pictures that trigger my interest, either because they are odd and curious or aesthetically pleasant and elegant.

Since 2011 I’ve enjoyed calling myself a cyber-flâneur [1]: unlike the Parisian strollers described by Baudelaire, I walked through cyber avenues, getting lost amid different digital archives. I glanced through collections of images instead of windows, stared at close-ups of manuscripts instead of sunsets on rivers. The net was my city and I just followed my nose walking through it. I wanted to make my curiosity an aesthetic operation. In doing so I’ve come to believe that online archives are my personal church of Saint-Julien-le-Pauvre, the chosen venue for my cyber-dadaist performances (see: https://www.moma.org/collection/works/184056).

For years my working activity followed a pattern: a few months of research, during which I spent hours and hours on Flickr Commons browsing the online archives of museums and institutions and saving selected images on my hard disk, followed by months in the studio working creatively with the pictures accumulated.

I accumulated images and emotions, from advertising to family album pictures. I wanted to explore how photography was used in different parts of the world, in different eras and in different economic contexts.

In 2011, while in Montreal for my first art residency, I analysed the different uses of vernacular photography in the 50s in North America and Italy. To do so, I used the open archives of many North American libraries (New York Public Library, Congregation of Sister of St. Joseph in Canada, California Historical Society and many others) and a private physical archive located in a tin box in my grandmother’s house.

This led to a series of pictures inspired by this contrast. The series was exhibited in a solo show called Fermez les yeux.

Sara Mickey: Fermez les yeux

The vastness and the richness of topics of the images I accumulated constantly triggered my creativity and my sense of humour. They often made me ask myself “why do those pictures exist?”

The images – especially those more vernacular, random and unforeseen – became the objects trouvés I could rework using my imagination and reality.

During this dadaist-inspired net-surfing, the most fertile encounter of the last years has been the one with the collections of two of the major London institutions: the British Library and the Wellcome Collection digital archives.  

I was about to move from Italy to London and so my artistic research was about to change, inspired by this encounter.

I started to become interested in the aesthetics of the Victorian era and in the concept of the museum as an extension of a wunderkammer.

I started collecting images of naturalia [2] and decided to transform them into artificialia in my studio. And so I did, creatively merging and morphing these images. In 2013 I produced a digital collage of a scientific illustration of a butterfly and a medical lithograph of a vulva, which was exhibited in a public space in Bologna during the CHEAP poster festival.

Cheap Poster Festival
Posters as part of the CHEAP poster festival

This collage of images from the British Library and the Wellcome Collection became the first piece of the larger project Il muro delle meraviglie – the wall of wonders – for which I chose to use the wall of my living room in my home/atelier in NW London.

Il muro delle meraviglie started as a joke to mock the colonialist aesthetic of Victorian museum collections and it became a work of art. Among the wonders I subsequently added, you can find that first collage of the butterfly and the vulva, which I decided to call “Do you want to see my butterfly collection?” to make my queer/feminist perspective encounter the delicacy of naturalistic illustrations of butterflies.

The title, in Italian, refers to an apparently naïve question which has an explicit sexual allusion.

The person who asks “come see my butterfly collection” might be suggesting it to obtain something more, as the butterfly is used as a metaphor for the female sex.

Sara Deep Thrash
Installation at DEEP THRASH

This work criticises the male chauvinist obsession with cataloguing, understood as an activity aimed more at showing off than simply showing.

It represents a feminist critique and re-appropriation of such images.

Here the butterflies become proper “c*nts” and give visibility to the female genitalia.

It was first exhibited in 2013 on the streets of Bologna (IT) during the CHEAP festival and at a queer demonstration, thanks to C*ntemporary.

If I didn’t have access to the BL and the Wellcome digital archives, all of this wouldn’t have been possible.

Finally, I would like to thank BL Labs for the support I have received, and I am excited about the new experiments and projects waiting for me around the corner.

Footnotes

  1. Flâneur: a French term meaning ‘stroller’ or ‘loafer’, used by the nineteenth-century French poet Charles Baudelaire to identify an observer of modern urban life. Dada raised the tradition of flânerie to the level of an aesthetic operation. The Parisian walk described by Walter Benjamin in the 1920s is utilized as an art form that inscribes itself directly in real space and time, rather than on a medium.
  2. Naturalia: a category that includes creatures and natural objects, with a particular interest in monsters.

29 November 2019

Introducing Filipe Bento - BL Labs Technical Lead

Posted by Filipe Bento, BL Labs Technical Lead

I am passionate about libraries and digital initiatives within them, and am particularly interested in Open Knowledge, scholarly communication, scientific information dissemination, (Linked) Open Data, and all the innovative services that can be offered to promote their ultimate dissemination and usage, not only within academia, but also within the wider community such as industry and society. I have over twenty years' experience in developing and supporting library tools, some of which have replaced manual methods with automation to make the lives of people who work in or use libraries easier.

Before working at the British Library, I was an independent consultant in the areas of digital strategies and initiatives, library technologies, information management, digital policies, Software as a Service (SaaS) and Open Source Software (OSS). Previous to that, I worked at EBSCO Information Services in several roles, firstly as the Discovery Service Engineering Support Team Manager (Europe and Latin America) and for three years as the Software Services, Application Programming Interfaces (API) and Applications (Apps) manager. My last role at EBSCO was implementing and managing the EBSCO App Store which involved working with several departments within the organisation such as marketing and legal.

Giving a talk at the National Congress of BAD (Portuguese Librarians, Archivists and Documentalists Association), in the Azores

I helped the University of Aveiro's Library become the first Portuguese adopter of reference Open Source Software (OSS)  - OJS [Open Journal Systems] and implemented the institutional digital repository DSpace for the university (which included a massive data transformation and records deposit, often from citations exported from Scopus). I started my career as a lecturer and then as a computer specialist at the University of Aveiro’s Library, coordinating the development of information systems for its many branches for over fifteen years.

My PhD research in Information and Communication in Digital Platforms gave me the opportunity to connect with my professional interests in libraries, especially in the areas of information discovery. In my PhD, I was able to implement VuFind with innovative community features, as a proposal for the university, which involved engaging actively in its developer community, providing general and technical support in the process. My thesis is available via the link "Search 4.0: Integration and Cooperation Confluence in Scientific Information Discovery".

University of Aveiro (main campus), Portugal

I have also been very active in a number of communities: I am a former chairman of the board of USE.pt, the Portuguese Ex Libris Systems’ Users Association, and a former member of the DigiMedia Research Center - Digital Media and Interaction at the University of Aveiro.

In my personal life I have been a radio and club DJ and worked on a number of personal music projects. I enjoy photography and video and am a keen traveller. I especially like being behind the wheels of cars / motorbikes and the propellers of drones.

I am really excited to be joining the BL Labs team, as I believe it provides an excellent opportunity to apply my skills, knowledge and expertise in library digital collections development, systems, data and APIs in a digital scholarship and wider context. I am really looking forward to offering practical advice and implementations for providing access to data, data curation, data visualisation, text and data mining, and interactive web-based computing environments such as Jupyter Notebooks, to name a few. BL Labs and the British Library offer a rich, innovative and stimulating environment in which to explore what staff and users want to do with the Library's incredible and diverse digital collections.

26 November 2019

The British Library / Qatar Foundation Partnership Project Hack Day - Theme: Collaboration

Introduction

On October 16th 2019, the BL/QFP Project opened its doors to its third and biggest Hack Day to date. After the success of the first two, it was decided to extend participation beyond the Imaging Team and invite everyone from the rest of the project to get involved. We wanted to utilise the unique way the BL/QFP is set up within the Library, with different teams and specialities all working in one place, and emphasise collaboration across these different teams. The diverse people and teams within the BL/QFP bring a wide variety of skills and experience, from language and collections knowledge to artistic and technical expertise. By bringing these together, we were able to learn from and teach each other whilst engaging with the collections and producing a variety of fascinating and thought-provoking hacks.

 

The Hacks

During the launch workshop the vast array of skills and expertise within the team was evident, as was the abundant enthusiasm and ambition people had. After the proposed projects were raised and discussed, five teams were formed focusing on a wide range of ideas. On the day, almost half of the BL/QFP staff participated in some way, proving the collaboration objective was well and truly met. The diverse outputs, from animations and games to pinhole cameras and data visualisations, were presented at a “show and tell”, and some were displayed on the BL/QFP Twitter page.

Below is a summary of the day, including descriptions of the hacks from each of the five teams. Enjoy!

 

Games

Team: Renata Kaminska, Mariam Aboelezz, Anne Courtney, Susannah Gillard and our Quality Assurance Officer

For our Hack Day project, we wanted to make the Qatar Digital Library (QDL) more accessible to non-experts, or people who might not be looking at it with a specific research aim. With this in mind, we decided to develop some quick and easy games to engage users with the collections. Using free browser-based software, we created a word search, jigsaw puzzles, crosswords, and a game of hangman. These all drew on the collections which were already digitised. Where possible, we tried to include links to the items on the QDL. Although the free software had some limitations, we feel that these games offer a foundation to build on in the future.


Crossword using information from a letter from Lieutenant William Bruce, Resident, Bushire, 1814 (IOR/R/15/1/14, ff 125v-127)
Play: https://tinyurl.com/yespowj7

Jigsaw using image from Tarjumah-ʼi ʻAjā’ib al-makhlūqāt (Or 1621, f 391v)
Play: https://tinyurl.com/yjuf9d7o

Jigsaw using image of ‘Persia and Afghanistan. Map I’ (IOR/R/15/1/730, f 87)
Play: https://tinyurl.com/yhpv3erw

Hangman game using words from the QDL collection
Play: https://tinyurl.com/yftf5vca

Word Search using words from the QDL collection
Play: https://tinyurl.com/yj9nugb9

 

Photogrammetry: Astrolabe Quadrant

Team: Darran Murray, Tony Grant, Nick Krebs, Matt Griffin, Rebecca Harris, Matthew Lee, Daniel Loveday and Annie Ward

Our Hack Day project centred on what the Library's Imaging Services can do with the technology and expertise they offer, in particular photogrammetry. Photogrammetry is a photographic process where any type of object can be rendered into a 3D image for display on a 2D screen. This process is beneficial as it displays the complete item, allowing users to see and understand collection items without having to handle them.

We chose an Astrolabe Quadrant from the collection and created both a rendition and an animation of the same object. As you can see, both have certain advantages over a single photograph of a 3D object, bringing the object to life.

 

Camera view during photogrammetry creation

 

British Library St Pancras Studio during photogrammetry creation

 

Photogrammetry rendition of astrolabe quadrant

 

 

Animation created using images from astrolabe photogrammetry rendering

 

Obscura / Pinhole / Cyanotype

Team: Rebecca Harris, Matthew Lee, Daniel Loveday, Darran Murray and Annie Ward

Our team explored early types of photography. First off, we created a camera obscura in one of the bays in our Imaging Studio by blocking out the light and creating a small hole by the window. An optical phenomenon, this simple hack allowed us to create an upside-down projection of the London skyline, complete with moving clouds and the Shard. Viewing sessions were arranged throughout the day and resulted in a stream of curious visitors.

As well as the camera obscura, we each built our own pinhole cameras using card, stripy tape and light-sensitive paper. Once we can set up a temporary darkroom, we plan to take and develop photographs illustrating different aspects of the BL/QFP Project. Lastly, an experiment using cyanotype paper in an old Brownie camera is still in progress, taking a long-exposure still life of a spirit level borrowed from the Conservation Team.

Upside-down projection of the London skyline created using the camera obscura

 

Pinhole cameras

 

Time-lapse video showing creation of pinhole cameras

 

Behind The Scenes: Visualisations

Team: Jordi Clopes Masjuan and Sotirios Alpanis, with translation assistance from George Samaan

Our Hack Day project aimed to illuminate some aspects of our digitisation workflow that are not directly represented in the material displayed on the QDL. We wanted to represent and celebrate some of the hard work that goes into the creation of digitised material, particularly the tasks and processes that most people wouldn’t necessarily think about. The 45+ people involved in our workflow have a huge variety of skills and expertise, and it was some of this that we wanted to capture. We decided to photograph ‘everyday’ objects from our workflow in interesting ways, and then pick out some interesting facts and figures about the processes they represent. Using Photoshop, we combined the two to present the facts in their ‘everyday’ setting.

QDL Homepage combined with a picture of the BL/QFP Team

 

Permission letters sent by Rights Clearance Team to people identified as Rights Holders for material being digitised

 

A book undergoing conservation treatment

 

The British Library’s digital servers

 

A Leading Library Assistant’s trolley in the lift carrying collection items

 

The Workflow Team’s Kanban Board

 

A Foliator’s desk drawer

 

Visualising Data

Team: David Woodbridge, Sotirios Alpanis, Laura Parsons, with assistance from Anna Waghorn

We had the idea to display data about the Project in visually dynamic and appealing ways. We thought we could experiment with displaying authority terms used in the Project’s catalogue records and see what we could learn about data manipulation and data visualisation along the way.

Whilst we were able to export data about the Project and collections from SharePoint (the platform we use to manage items through the digitisation workflow) and IAMS (the Library's cataloguing system for archives, manuscripts, photographs and other visual materials), we needed to tidy up the data to make it useful. For example, the IAMS data is exported as XML, so Sotirios experimented with extracting particular elements. This work highlighted how data visualisations rely on having well-organised and complete data.
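
As a rough illustration of what extracting particular elements from an XML export can involve, here is a minimal Python sketch using the standard library. It is not the code written on the day. The element names (record, shelfmark, authority), the type attribute and the sample values are all made up for the example, and a real IAMS export will be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: element names, attributes and values are invented
# for illustration and do not reflect the real IAMS schema.
SAMPLE_EXPORT = """\
<records>
  <record>
    <shelfmark>EXAMPLE/1</shelfmark>
    <authority type="person">Example Person</authority>
    <authority type="place">Example Place</authority>
  </record>
  <record>
    <shelfmark>EXAMPLE/2</shelfmark>
    <authority type="place">Example Place</authority>
  </record>
</records>
"""


def extract_authorities(xml_text: str) -> list[dict]:
    """Flatten each authority term into one row alongside its shelfmark."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter("record"):
        shelfmark = record.findtext("shelfmark", default="").strip()
        for authority in record.iter("authority"):
            rows.append(
                {
                    "shelfmark": shelfmark,
                    "type": authority.get("type", ""),
                    # Guard against empty elements, whose text is None
                    "term": (authority.text or "").strip(),
                }
            )
    return rows


if __name__ == "__main__":
    for row in extract_authorities(SAMPLE_EXPORT):
        print(row)
```

Flattening to one row per term and shelfmark gives a simple table that most visualisation tools can import, and it makes gaps such as missing shelfmarks or empty terms easy to spot.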

Using Microsoft Power BI, we tried a variety of ways of displaying the data, including network and force-directed graphs. These graphs show relationships between data points, such as the authority terms assigned to different shelfmarks. We also created other visualisations, such as pie charts, that quantified specific aspects of the data, for example showing the numbers of person authorities according to gender, or the language of their name. The challenge was to create something visually appealing but still meaningful.
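
The visualisations themselves were built in Power BI, but the data behind a network graph or pie chart has a simple shape that can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the rows and field names (shelfmark, type, term) are hypothetical, not real catalogue data.

```python
from collections import Counter

# Hypothetical rows: one authority term attached to one shelfmark per row.
ROWS = [
    {"shelfmark": "EXAMPLE/1", "type": "person", "term": "Example Person"},
    {"shelfmark": "EXAMPLE/1", "type": "place", "term": "Example Place"},
    {"shelfmark": "EXAMPLE/2", "type": "place", "term": "Example Place"},
]


def build_edges(rows):
    """Weighted edge list for a network graph: (shelfmark, term) -> count."""
    return Counter((row["shelfmark"], row["term"]) for row in rows)


def shared_terms(rows):
    """Terms linked to more than one shelfmark; these connect the graph."""
    shelfmarks_by_term = {}
    for row in rows:
        shelfmarks_by_term.setdefault(row["term"], set()).add(row["shelfmark"])
    return {
        term: sorted(marks)
        for term, marks in shelfmarks_by_term.items()
        if len(marks) > 1
    }


def counts_by_type(rows):
    """Category totals of the kind a pie chart would display."""
    return Counter(row["type"] for row in rows)


if __name__ == "__main__":
    print(build_edges(ROWS))
    print(shared_terms(ROWS))
    print(counts_by_type(ROWS))
```

An edge list like this is the usual input for network and force-directed layouts, while the per-category counts are what a pie chart summarises.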

 

Dashboard displaying data visualisations including network, force-directed and pie graphs.

 

Weaponry on Walls

Team: Hannah Nagle & the British Library’s BAME Staff Network

Working in collaboration with the Library’s BAME Staff Network, we wanted to investigate people’s perceptions of weaponry displayed in our offices. We prepared a survey and sent it out to the BL/QFP Project staff, asking them to fill it in. A work in progress, the results, quotes and related imagery will be collated into a zine illustrating the survey responses to the weapons. We will be using original photographs, images from the QDL and public domain images found on Flickr, including from the British Library’s Flickr account. With this, we hope to start a conversation about what the weaponry can represent to different people and why this is important to keep in mind.

Example of images created to respond to the weaponry on the walls

 

Further information

If you would like to explore the photographs and documents used in our Hack Day projects from the Qatar Digital Library or find out more about the India Office Records please follow the links below:

 

You can also read about the previous Hack Days in the blog posts below:

 

This is a guest post by the British Library Qatar Foundation Partnership, compiled by Rebecca Harris and Laura Parsons. You can follow the British Library Qatar Foundation Partnership on Twitter at @BLQatar.

The BL/QFP Project’s Imaging Team won the Staff Award at the British Library Labs Symposium 2019 for their Hack Days.

 

20 November 2019

Hacking Web Maps-T

This is a guest blog post by Dr Gethin Rees, Lead Curator for Digital Map Collections at the British Library. It was originally posted on the Pelagios Commons Blog.

 

The Web Maps-T working group aims to enhance the ability to visualise geospatial and temporal Linked Open Data on web maps. The group is coordinated by Gethin Rees and Adi Keinan-Schoonbaert; see this previous post for more details. As a first step, we held a hack workshop in September at the British Library to scope applications to visualise and work with the GeoJSON-T standard. Participants at the workshop included Rainer Simon from the Austrian Institute of Technology, Karl Grossner from the University of Pittsburgh, Neil Jakeman from King’s College Digital Lab, and Alex Butterworth and Simon Wibberley from Sussex University’s Humanities Lab, alongside Mia Ridge and Olivia Vane from Digital Scholarship at the British Library. They came to the workshop with a variety of datasets derived from, for example, the art, archaeology and travelogues of the classical world, as well as from the collections of the British Library, including contributions from the Endangered Archives Programme, Georeferencer and Two Centuries of Indian Print.

Matching datasets to visualisation types

 

Throughout the event we collaborated using repositories in the Pelagios GitHub organisation and Google Docs. After introductory talks, first by Gethin Rees on user stories and use cases for Web Maps-T, and second by Karl Grossner on GeoJSON-T and the Linked Traces app, participants introduced their own projects and use cases for geospatial and temporal visualisation.

Karl and Alex discuss the finer points of “The Prelude Timeline: On the Growth of My Own Mind” by Alex Butterworth and Stephanie Posavec, for The Wordsworth Trust

 

Participants then coalesced around several tasks and divided into groups to work. These tasks included:

  • MoSCoW assessment of Web Maps-T app. This prioritisation technique divided potential features of the app into Must, Should, Could and Won’t. For example, our app must have a timeline, a map, and work with GeoJSON-T. On the other hand it could use animation for visualisation, and won’t include a text box querying for plain English.
  • Classification of datasets and visualisation types. We wanted to explore several forms of visualisation and it quickly became clear that some were better suited to certain types of datasets than others. For histogram, linear journey, period horizontal bars, valid time intervals and beeswarm visualisations we recorded data type, size, examples and pros and cons.

These activities were documented in detail and will form the basis of the white paper. Participants who wrote code worked on:

  • WhenJSON: a front-end JavaScript utility library for manipulating temporal data in the GeoJSON-T format (the ‘when’ object). The utility could, for example, help in calculating the interval that separates two data sources.
  • A Minimum Viable Product (MVP) web map with a time-slider component to test visualisation types for different dataset classifications. This work borrows heavily from the work of Jonathan Skeate. The MVP allowed us to see how datasets looked when visualised using different methods such as histogram or time bars.
Visualisation of Endangered Archives Programme data

 

  • Scoping the adaptation of Karl Grossner’s Linked Traces app so that it can be used with any dataset, and adding a time slider. This could be a more robust and presentable solution than our adaptation of Skeate’s Leaflet Timeline.
Itinerary of the Bordeaux pilgrim in Linked Traces
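
To give a flavour of the kind of calculation a utility like WhenJSON could help with, here is a minimal sketch of working out the interval that separates two temporal extents. WhenJSON itself is a JavaScript library, and this Python example does not reproduce its API. The shape of the ‘when’ object used here (a list of timespans, each with a start and end year under an "in" key) is a simplifying assumption for illustration, and the function names are hypothetical.

```python
def span_bounds(when: dict) -> tuple[int, int]:
    """Return (earliest start year, latest end year) across all timespans.

    Assumes a simplified 'when' object:
    {"timespans": [{"start": {"in": "1800"}, "end": {"in": "1850"}}]}
    """
    starts = [int(span["start"]["in"]) for span in when["timespans"]]
    ends = [int(span["end"]["in"]) for span in when["timespans"]]
    return min(starts), max(ends)


def gap_in_years(when_a: dict, when_b: dict) -> int:
    """Years separating two 'when' objects; 0 if they overlap or touch."""
    a_start, a_end = span_bounds(when_a)
    b_start, b_end = span_bounds(when_b)
    if a_end < b_start:
        return b_start - a_end
    if b_end < a_start:
        return a_start - b_end
    return 0


if __name__ == "__main__":
    # Made-up temporal extents for two datasets
    earlier = {"timespans": [{"start": {"in": "1800"}, "end": {"in": "1850"}}]}
    later = {"timespans": [{"start": {"in": "1900"}, "end": {"in": "1950"}}]}
    print(gap_in_years(earlier, later))
```

Reducing each dataset to a single earliest and latest year keeps the comparison simple. A fuller utility would also need to handle uncertain or open-ended dates, which this sketch ignores.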

 

Next steps for the group are to write a white paper and to attend the Linked Pasts conference in Bordeaux. All the workshop participants had projects where they were keen to apply Web Maps-T, and there must be many more use cases out there. However, the path from hack event to an open-source, production-ready component that can visualise any GeoJSON-T data we throw at it is not straightforward. Like any open-source project, success rests on people and projects requiring the component and standard for their day-to-day work. There are plenty of potential applications in the Linked Pasts community and beyond; the main resources that we now require are developer time and GeoJSON-T datasets. We will continue to encourage others to contribute, and perhaps you have potential applications of your own. Get in touch at visualisation[at]pelagios.org.