THE BRITISH LIBRARY

Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

21 August 2019

Chevening British Library Fellowship working with Chinese historical texts

Chevening is the UK government’s international awards programme aimed at developing global leaders. Since 2015, the Foreign and Commonwealth Office (FCO) has partnered with the British Library to offer professionals two new fellowships every year. These fellowships are unique opportunities for one-year placements at the Library, working with exceptional collections under the Library’s custodianship. Past and present Chevening Fellows at the Library have focused on geographically diverse collections, from Latin America through Africa to South Asia, with themes as different as “Nationalism, Independence, and Partition in South Asia, 1900-1950” and “Big Data and Libraries”.

We are thrilled to announce that one of the two placements available for the 2020/2021 academic year will focus on automating the recognition of historical Chinese handwritten texts. This is a special opportunity to work in the Library’s Digital Scholarship Department, and engage with unique historical collections digitised as part of the International Dunhuang Project and the Lotus Sutra Manuscripts Digitisation Project. Focusing on material from Dunhuang (China), part of the Stein collection, this Fellowship will engage with new digital tools and techniques in order to explore possible solutions to automate the transcription of these handwritten texts.

Chinese Lotus Sutra scroll with Tibetan divination texts on the back (Shelfmark: Or.8210/S.155). Digitised as part of the Lotus Sutra Manuscripts Digitisation Project.

 

The context for this fellowship is the Library’s efforts towards making its collection items available in machine-readable format, to enable full-text search and analysis. The Library has been digitising its collections at scale for over two decades, with digitisation opening up access to rich and diverse collections. However, it’s important for us to further support discovery and digital research by unlocking the huge potential in automatically transcribing our collections. Until recently, Western language print collections have been the main focus, especially newspaper collections. A flagship collaboration with the Alan Turing Institute, a project called “Living with Machines,” is underway to apply Optical Character Recognition (OCR) to UK newspapers, design and implement new methods in data science and artificial intelligence, and analyse these materials at scale.

Taking a broader perspective on Library collections, we have started to explore opportunities with non-Latin collections too. Members of the Digital Scholarship team are engaging closely with the exploration of OCR and Handwritten Text Recognition (HTR) systems for Bangla and Arabic. Digital Curators Tom Derrick, Nora McGregor and Adi Keinan-Schoonbaert have teamed up with PRImA Research Lab and the Alan Turing Institute to run four competitions in 2017-2019, inviting providers of text recognition methods to try them out on our historical material. Another initiative which Tom is engaged with is exploring Transkribus for Bengali printed texts. He trained Transkribus’ HTR+ recognition engine, which ended up transcribing this material at 94% character accuracy! Tom and Adi’s recent blog post in EuropeanaTech Insight (issue on OCR) summarises these initiatives.
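As a rough illustration of what a figure such as 94% character accuracy can mean, the sketch below compares a ground-truth transcription with a machine transcription. It assumes a common convention, one minus the character error rate, where the error rate is the edit distance divided by the length of the ground truth; the post does not say how the figure above was calculated. The function names are hypothetical and are not part of Transkribus.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `reference` into `hypothesis`."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # drop a reference character
                current[j - 1] + 1,      # add a hypothesis character
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - (edit distance / length of the reference)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return 1.0 - levenshtein(reference, hypothesis) / len(reference)
```

Under this assumed definition, a ten-character line with one wrongly recognised character scores 90%, so 94% corresponds to roughly six character errors per hundred characters of ground truth.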

Regions and text lines demarcated as ground truth for RASM2019 ICDAR2019 Competition on Recognition of Historical Arabic Scientific Manuscripts (Shelfmark: Add MS 7474). Digitised and available on Qatar Digital Library.

 

The Chevening Fellow will contribute to our efforts to identify OCR/HTR systems that can tackle digitised historical collections. They will explore the current landscape of Chinese handwritten text recognition, look into methods, challenges, tools and software, use them to test our material, and demonstrate digital research opportunities arising from the availability of these texts in machine-readable format.

This fellowship programme will start in September 2020 for a 12-month period of project-based activity at the British Library. The successful candidate will receive support and supervision from Library staff, and will benefit from professional development opportunities, networking and stakeholder engagement, gaining access to a range of organisational training and development opportunities (such as the Digital Scholarship Training Programme), as well as staff-level access to unique British Library collections and research resources.

For more information and to apply, please visit the Chevening British Library Fellowship page: https://www.chevening.org/fellowship/british-library/, and the “Automating the recognition of historical Chinese handwritten texts” Fellow page: https://www.chevening.org/fellowship/british-library-chinese-handwritten-texts/.

Applications close at 12pm (GMT), 5 November 2019. Good luck!

 

This blog post is by Dr Adi Keinan-Schoonbaert, Digital Curator for Asian and African Collections, British Library. She's on Twitter as @BL_AdiKS.

20 August 2019

Innovation Labs and the digital divide

Guest posting by Milena Dobreva-McPherson, Associate Professor, Library and Information Studies, UCL Qatar, with contributions from Tuesday Bwalya, Lecturer, Library and Information Science Department, The University of Zambia (UNZA), and Fidelity Phiri, Visiting Researcher, UCL Qatar.

Can you recall seeing an interesting digital cultural heritage object from Zambia lately? If you search the Europeana Collections portal, you will find some 2500 digital objects coming from European heritage institutions. Alongside these items, you can enjoy the sound recording of a grunting and splashing hippopotamus captured on 2 July 1985 on the Luangwa River in Zambia. This object was aggregated from the British Library’s sound collection.

Digitisation efforts at various Zambian institutions date back to 2002, for example at the National Archives of Zambia (which does not have its own website at the time of writing this post). Even so, finding digital content originating from Zambian institutions is currently a challenge unless you visit these institutions in person. One possible reason is that institutions in Zambia, like many other organisations, digitise for the purposes of internal collection management, preservation, and on-site use. A rare exception is the digitised collection of the records of the United National Independence Party (UNIP) of Zambia, which was created in 2007 in collaboration with the Endangered Archives Programme of the British Library. While it cannot be accessed on any Zambian digital platform, it is available on the website of the British Library.

Is this situation (of very little accessible digital material online in the archives) common for all cultural sectors? Let us have a look at museums. In this domain, the Livingstone Museum was the first to carry out digitisation activities in 2009. The National Museum Board of Zambia, an umbrella organisation for 5 national and 2 community museums, also has an online presence with digitised images. However, trying to explore the Photo gallery or Audio/video files in the Multimedia section on the website returns the ominous “404 Page not found” error, although the Board definitely has plenty of objects to share.

Certainly, one could argue that a poor institutional online digital presence is to be expected in a country within the Global South where a digital divide still exists. After all, even finding data to assess the scale of this digital divide is a challenge, and the body of publications on the digital divide in Africa has been quite limited, with some 100 works identified over a 12-year period (2000-2012). There is also a lack of recent estimates on the state of technological use in museums. Back in 2002, Lorna Abungu suggested that "[a]t present, out of 357 known museums throughout the African continent (including the Indian Ocean islands), only seventy-five have – on an institutional level – at least basic Internet access for e-mail."

And, while tackling the digital divide is one of the big challenges of the Global South, when we look at it specifically from the digital cultural heritage perspective it has a global effect. Those within the divide are not able to use modern information and communication technologies to their full advantage. This is one of the reasons digitisation is either delayed or caters only for on-site use in Zambia, for example. But for those on the other side of the divide it results in impaired access to the digital heritage currently being accumulated in the regions affected by the digital divide. This is why the users searching for the sounds of hippopotamus splashing will have a chance to discover them only if they are deposited in a collection on the other side of the divide. 

To help change this lack of access to the digital cultural heritage of Zambia, UCL Qatar joined forces with the National Museums Board of Zambia to deliver a day-long workshop on Innovation Labs in Cultural Heritage Institutions, which was hosted on 1 August 2019 by the Livingstone Museum. You can read more about this event in the 'Reflections from the First Sub-Saharan African Workshop on Digital Innovation Labs in Cultural Heritage Institutions' blog post.

Fig. 1. After discussing how to overcome some of the disadvantages of the digital divide: participants in the Innovation Labs in Cultural Heritage Institutions workshop, hosted on 1 August 2019 by the Livingstone Museum

There was a clear message from Mahendra Mahey, of British Library Labs, that innovation in user engagement can start small, with the use of open source tools and popular web platforms. This event provided useful insights into the questions newcomers to the Innovation Lab community have to ask. In September, a Book Sprint to develop the first guide for setting up, running and maintaining a Digital Cultural Heritage Innovation Lab will be held in Doha, Qatar.

Here are some of these interesting questions for the wider labs community:

  • Keeping in mind that the level of technological innovation differs between the two sides of the divide, what should an innovation lab within the divide offer? Incremental innovation relative to the state of the technology around it, or advanced innovation to match the global leaders?
  • How much can open platforms support innovation for these labs?
  • Can the route of using predominantly open tools and platforms for innovation labs be used also as a way to enhance open science in the Global South? 

Until a shift in the digital access happens, we will continue browsing some digital content on Zambian heritage coming from other cultural heritage organisations outside Zambia, beyond the digital divide.

Dr Milena Dobreva-McPherson is Associate Professor, Library and Information Studies, at UCL Qatar, with international experience of working in Bulgaria, Scotland and Malta. Since graduating with an M.Sc. (Hons) in Informatics in 1991, Milena has specialised in digital humanities and digital cultural heritage at the Bulgarian Academy of Sciences, where she earned her PhD in Informatics and Applied Mathematics in 1999 and served as the Founding Head of the first Digitisation Centre in Bulgaria (2004); she was also a member of the Executive Board of the National Commission of UNESCO. Milena’s research interests are in the areas of innovation diffusion in the cultural heritage sector; citizen science; and users of digital libraries. Milena is a member of the editorial boards of the IFLA Journal (Sage) and the International Journal on Digital Libraries (IJDL, Springer), and a member of the steering committees of the three biggest conference series in digital libraries: JCDL, TPDL and ICADL. She is also a consultant to the Europeana Task Force on Research Requirements.

Mr Tuesday Bwalya is a Lecturer in the Library and Information Science Department, The University of Zambia (UNZA). He holds a Master’s Degree in Information Science from China. In addition, Mr Bwalya has received training in India and Belgium in Library Automation with Free and Open Source Library Management Systems such as Koha and ABCD. His research interests include free and open source library management systems; open access publishing; database systems; web development; records management; cataloguing and classification.

Fidelity Phiri is currently employed as Librarian at Moto Moto Museum and is a visiting researcher at UCL Qatar. He has worked for the National Museums Board of Zambia since 2001. He holds a Bachelor's degree in Library and Information Science from the University of Zambia. Fidelity also graduated in April 2019 from UCL Qatar with a Master’s degree in Library and Information Studies. His research interests are in bibliometric studies and digital humanities/units that provide access to digital collections.

Acknowledgements: We would like to thank Fred Nyambe for the photos and Dania Jalees for the editing.

Reflections from the First Sub-Saharan African Workshop on Digital Innovation Labs in Cultural Heritage Institutions

Guest posting by Milena Dobreva-McPherson, Associate Professor, Library and Information Studies, UCL Qatar, with contributions from Tuesday Bwalya, Lecturer, Library and Information Science Department, The University of Zambia (UNZA), and Fidelity Phiri, Visiting Researcher, UCL Qatar.

Recently UCL Qatar joined forces with the National Museums Board of Zambia to deliver a day-long workshop on Innovation Labs in Cultural Heritage Institutions, which was hosted on 1 August 2019 by the Livingstone Museum, Zambia. This workshop was the first of its kind in Sub-Saharan Africa and was made possible with the support of the Africa and the Middle East Teaching Fund of the UCL Global Engagement Office. Initially planned for 15 professionals from the cultural heritage sector, it attracted 27 participants (see Fig. 1) coming from six towns located in four out of the ten provinces in Zambia (see map).

Fig. 1. Participants by sector and gender in the First Sub-Saharan Workshop on Innovation Labs in Cultural Heritage Institutions in Zambia, 1 August 2019

After two vibrant events about Digital Innovation Labs in Cultural Heritage organisations, this was the first event to bring together a higher proportion of participants from museums and archives in addition to the libraries represented. The Building Library Labs event, the first of its kind, was held at the British Library in September 2018 and was followed by a second workshop in Copenhagen (March 2019); both attracted mostly library professionals, though there were a few attendees from Archives, Galleries and Museums.

Innovation Labs emerged in the mid-2000s as specialised library units supporting a variety of users in experimenting with digital content. However, engaging users with digital content is equally important for museums, archives and galleries. And the exchange of institutional experience across the digital cultural heritage sector is essential for professionals who work there, especially when the number of Innovation Labs around the world is growing steadily. The presenters at the event in Zambia included Milena Dobreva-McPherson (UCL Qatar), Fidelity Phiri, Mr Tuesday Bwalya (University of Zambia), Mr Fred Nyambe (Registrar of Collections, Livingstone Museum) and Mr Brian Mwale (Chief Librarian, National Archives of Zambia). Fiona Clancy (Digitisation Workflow Manager, British Library), Mahendra Mahey (BL Labs Manager, British Library), and Somia Salim, who is an MA student in Library and Information Studies at UCL Qatar, also contributed online (see full programme with links to some of the presentations).

The call for innovation in the heritage sector was clearly communicated in the welcome address delivered on behalf of the Livingstone district acting commissioner Harriet Kawina; this was duly reported in several Zambian national newspapers (see, for example, Fig. 2).

Fig. 2. Article on the event in the MAST independent newspaper, 5 August 2019

The mixture of presentations discussing current trends in user engagement with digital content and local examples of how digitisation projects work in reality created a great opportunity to discuss the stumbling blocks in opening content for wider access and use. For some Zambian institutions, the main issue is a lack of coherent and systematic digitisation efforts, and there was a shared feeling amongst attendees that they needed more guidance and clear policies about digitisation to follow, which are not yet in place. Other institutions have accumulated digital content but keep it available only internally, without systematically considering access and use by external audiences through online platforms.

The workshop discussions were lively and engaged; they identified that there is definitely a larger scope to learn from each other locally. In addition, there was a growing realisation amongst organisations that opening their digital content for use by an external audience is now the next step on the agenda of those who have already accumulated it. The feedback of one of the participants, which perhaps summarised this most clearly, suggested what needs to happen after this workshop in three steps:

  • Put the knowledge acquired in the workshop to use ASAP.
  • Conduct a follow-up workshop to determine progress in the innovation labs created.
  • Organise a massive awareness campaign to introduce potential users to the innovation labs created.

The workshop participants also experienced the traditional scheduled power outage for the day, which explains why the photo illustrating the presentation of certificates is a bit dark (but hey, in the digital world we can easily fix such glitches!).

Fig. 3. Participant receiving a certificate from Associate Professor Milena Dobreva

Bringing knowledge about innovation labs to the Sub-Saharan region for the first time, fostering dialogue between representatives of different cultural heritage institutions, and discussing how to improve access to digital content are just a humble first step in what we hope will help local institutions to improve user engagement and overcome the current digital divide, which keeps available digital content hidden from the world. Read more about Innovation Labs and the digital divide.


Acknowledgements: We would like to thank Fred Nyambe for the photos and Dania Jalees for the infographic and the editing.