Living Knowledge blog

Behind the scenes at the British Library

Introduction

Experts and directors at the British Library blog about strategy, key projects and future plans.

26 November 2019

British Library Shared Research Repository launched in beta

Research undertaken by British Library staff is often reported – even celebrated – on these pages. Imagine the careful research that goes into interpreting the manuscript fragments of medieval bibliophile and bookseller John Bagford, or putting on an exhibition such as Karl and Eleanor Marx: Life in the Reading Room, or indeed in supporting our contribution to UK library infrastructure activities, such as our recent Open and Engaged Conference.

As a national library, we rely on research to inform and support almost every aspect of our work, be it curation, conservation, preservation, digital innovation, cultural programming or learning. Whether it’s a major exhibition or a new way to discover or understand a unique part of our collections, it has been enabled by staff research.

Virtually all major museums, galleries, archives and libraries are in the same position. Although research is not our primary function, we all undertake significant amounts of research, often based on our collections, and it’s important we make the outputs of that research as open as possible to allow future researchers to take advantage of and build on our work.

To make our research more visible, discoverable and reusable for further research, we’re excited to announce the launch of our Shared Research Repository.

The Shared Repository, currently a beta service, brings together the openly available research outputs produced by staff and research associates of six cultural and heritage organisations: the British Library; the British Museum; MOLA (Museum of London Archaeology); National Museums Scotland; Royal Botanic Gardens, Kew; and Tate. Each partner has their own repository and is responsible for their own content, but users can also explore the combined content using the shared search from the homepage. Articles, book chapters, datasets, exhibition texts, conference presentations, blogs and many more types of our research are now discoverable and downloadable by researchers worldwide. The repository currently holds just a selection of outputs to give a flavour of our research activities, with many more to be added in the coming months.

While UK Higher Education institutions have well-established repositories (which are often essential to help manage their research submissions to the Research Excellence Framework research funding process), the research produced by cultural and heritage organisations is often not as visible as we’d like it to be, or indeed as it should be, since much of it is undertaken with at least some public funding in our role as Independent Research Organisations.

Even within our six current Shared Repository organisations our research is varied and wide-ranging. But browsing the first items already in the repositories also reveals interesting parallels and shared research interests, as in these examples:

If all goes well, we’ll be looking at how we can extend the service both in the volume of content available and in the number and range of partner organisations, including those beyond the cultural sector.

Do visit our beta Shared Research Repository and explore the research outputs currently deposited. We’d love to have your feedback so please get in touch with our Repository Services team if you’d like to find out more: openaccess@bl.uk.

Sara Gould

Repository Services Lead

28 October 2019

Open and Engaged: Open Access Week at the British Library

HelenHardy1-smaller

There are opportunities and benefits for growth in open access and open scholarship when experience and knowledge are shared between higher education institutions and cultural heritage organisations.

On Tuesday 22nd October, the British Library celebrated Open Access Week with the event Open and Engaged - Forging links between higher education and cultural heritage to foster open scholarship (see #OpenEngaged19). This one-day event brought together representatives from universities, museums, research organisations, libraries and the private sector to examine how these links can be forged.

Liz Jolly, the Library’s Chief Librarian, opened the event with the hope that the day would allow participants to gain insights into a variety of points of view, and commence a dialogue to open up the cultural heritage sector and improve the user experience for researchers across the globe.

Our first keynote was from Helen Hardy, Digital Collections Programme Manager at the Natural History Museum (pictured above). She explained the opportunities and challenges NHM face in releasing their data and the impact that digitisation is having. Helen outlined the industrial scale processes involved in creating digital images from their very varied collection of 80 million items, which range from pinned insects to dinosaur bones. Creative solutions they are using include AI for shape recognition and colour analysis in images, as well as using Lego to create custom hardware for digitisation.

Dr Mark Sweetnam, Assistant Professor in English with Digital Humanities, Trinity College Dublin.

The second keynote was a view from higher education by Dr Mark Sweetnam, Assistant Professor in English with Digital Humanities, Trinity College Dublin. Drawing on his experience with the Cultura EU project, Mark highlighted how the project developed an interface to access very different digital collections, and how it worked with cultural heritage institutions, which are a conduit to researchers and users based outside academia.

Three parallel sessions presented attendees with a choice of topics to engage with in more detail.

The accessibility and inclusive access session saw Tom Scott (Wellcome Trust) and Ben Watson (University of Kent) share the ways in which they are supporting researchers of diverse backgrounds and abilities, some with complex digital access needs. Both highlighted that designing with accessibility in mind improves engagement for all. Whether placing the full text of digitised images into alt-text or appropriately marking up headings in documents, these simple interventions that support assistive technologies such as screen readers also increase visibility of content to search engines.

Art for All: In this session Dr Andrea Wallace and Professor Simon Tanner highlighted the risk that the UK is falling behind on OA in the heritage sector, specifically referring to the ‘Survey of GLAM open access policy and practice’. In response Simon announced the launch of ‘Art for All’, a community action group working to support UK cultural heritage organisations to open digital collections for unrestricted public reuse. The group takes the view that no new rights should arise in faithful reproductions of public domain works, and they will advocate for an increase in public funding for digitisation.

The open collections and impact session looked at the experience of two organisations who have increased the openness of their collections. Linda Spurdle (Birmingham Museums Trust) had identified that image charges are a barrier to academics’ use, meaning that the Museums’ collection is “missing” from publications. Jason Evans (National Library of Wales) explained how their use of Wikidata is improving access to data as well as enriching it.

The concluding panel session on Plan S and open scholarship.

The day concluded with a panel session ‘How can higher education and cultural heritage institutions better work together to ensure the success of Plan S and open scholarship?’. Plan S launched in September 2018 and will require that, from 2021 onwards, all publications resulting from research funded by public grants be published in compliant Open Access journals or platforms. Dr Kathryn Eccles (Oxford Internet Institute) advocated that to be open and engaged is to do more than say that there is a route through digitised material. Engagement with open collections can lead to a greater range of novel and more playful outputs such as artist engagements or infographics. In response, Dr Torsten Reimer (British Library) observed that whilst higher education institutions focus on making research publications open, there is not the same focus on increasing engagement, while for cultural heritage organisations the opposite is true.

Finally, JD Hill (British Museum) raised some concerns about the impact of Plan S on humanities publishing. The fundamental issue he identified was that given the small percentage of research funding that arts and humanities researchers receive, the combined costs of image rights and article processing charges are discriminatory. However, he did challenge arts and humanities researchers to grow their own open ecosystem for a more radical route to open scholarship.

Susan Miles

Scholarly Communications Specialist

 

15 October 2019

Why we collect digital publications


Recently there have been a number of discussions on Twitter and elsewhere about access to e-books at the British Library. We know some Readers have concerns specifically about the way that EPUB format books are displayed on our computer terminals in the Reading Rooms. We also know that there is some confusion around how and why the Library is collecting some books in digital and some in print.

From our own testing of access to e-books in our Reading Rooms, we are aware that there are problems in some cases, particularly for certain formats of e-books and also for Readers who want to read a whole book or significant portion of a book.

Readers asked why the British Library didn’t also purchase print copies of books and give Readers the option to request a work in either print or digital. Others pointed out difficulties with citing eBooks created in the EPUB format that do not have page numbers, and with print and digital versions having different citation standards. Readers asked why they cannot use their own devices to access content or take photographs or screen captures. There were also questions about whether the Library could improve the access service for Readers who have a disability and require a print copy or a digital experience that is more accessibility-friendly.

I personally think it’s important for libraries to be transparent about how they collect, preserve, and make accessible their collections: libraries exist to provide people with access to information and to preserve a record of published works for the future. Libraries also benefit from feedback, which informs improvement to services. Additionally, Reader comments are a great way of proving that preservation is successful and alerting library staff to potential issues.

As someone who works with Non-Print Legal Deposit (NPLD) and digital publications on a daily basis, this blog post is an opportunity to provide some context to the points raised by Readers on social media. It will also touch upon some broader points about digital publishing in relation to the UK’s legal deposit collection and the Legal Deposit Libraries' (LDLs) work to manage this collection. 

 

What is legal deposit?

Legal Deposit is not specific to the United Kingdom; many countries have legislation in place that allows memory organisations to collect, preserve, and provide access to published works. Legal deposit in the UK has existed in some form for print materials since 1662 with the involved libraries changing over time. 

The current UK Legal Deposit Libraries consist of the British Library; the National Library of Scotland; the National Library of Wales; the Bodleian Libraries at the University of Oxford; Cambridge University Libraries; and the Library of Trinity College, Dublin. Together they share responsibility for the national collection of print and digital works published or distributed in the UK.

In 2013, UK Parliament updated existing legal deposit legislation to reflect the digital age and respond to the continual evolution of publishing by allowing the deposit of published works in digital formats, thereby ensuring their long-term preservation. This update also mitigated the possibility of a 'digital black hole' in the national collection for works created in digital-only formats. Where a publication exists in print and digital form, the LDLs can agree with publishers to collect the digital publication instead of print.

In many cases, the LDLs and publishers have agreed to move from a model of print deposit to digital deposit, as this provides benefits for the LDLs and publishers. We believe that there are also significant benefits for Readers.

However, the LDLs do continue to collect some publications in print where a digital version also exists. We do this where the print medium is crucial for the delivery and understanding of the content. Examples include loose leaf format books as well as books that have high-resolution images. Representatives in collection management from each of the LDLs determine which publishers could be transitioned from print to digital deposit, and which works are best suited for deposit in print.

 

Benefits of digital deposit

There are benefits to both publishers and libraries for moving from deposit of print books to deposit of digital books. Many publishers have established digital distribution systems, to which the Legal Deposit Libraries can be added with relative simplicity. Digital deposit also requires only one copy to be deposited for all six libraries, which can again help to simplify the effort required from publishers.

For the LDLs, there are a number of benefits. Our experience has been that the supply of books increases following a move from print to electronic deposit. In some cases, we have been able to acquire digital copies of books that we had previously been unsuccessful in claiming in printed form.

The costs associated with managing digital collections are different to those for print collections, and direct comparisons are hard to make. It is not, as many people expect, always more cost effective to collect digital publications compared to print. However, digital legal deposit has allowed all the LDLs to work more collectively and collaboratively to share resources and solve problems.

We believe that there are benefits to Readers too, beyond the increase in publications that we now receive. Publications that we receive in digital form are automatically processed and are much more quickly discoverable and accessible following receipt compared to print. Once in our digital repository, the publications can be delivered almost immediately, provided it is at least seven days after publication by the publisher. For the British Library, this means immediate access for our Readers at Boston Spa as well as in London. We don’t need to put restrictions on the numbers of items ordered in one day, so Readers can request as many digital publications as they need.

Despite these benefits, we know that we haven’t solved all the problems with access to e-books in particular. We are working to improve access, as well as to anticipate the needs that Readers will have in the near future, as research topics, practices, and tools change.

Finally, we know that publishing is changing as well as research practice. Digital technologies have changed what can be published, who can publish and how publications are distributed and read (including reading by machines). The collections we are building now, and the systems and services we develop to support those collections, need to be fit for purpose for now and in the future. Collecting digital publications at large scale helps us to build our capabilities, and understand how we need to respond to change.

 

Digital infrastructure

There’s a lot of work that happens behind the scenes to allow Readers to access digital works within the collection. This section provides an overview of what’s needed to support the ongoing implementation of legal deposit.

The legal deposit collection grew exponentially with the ability to collect published works in digital formats, and implementation is ongoing. This work involves helping publishers make the transition from print to digital deposit; taking on board new publishers that currently do not deposit their published works; ensuring that deposit is ongoing, sustainable, and that there are no gaps; as well as building and maintaining a network of systems and knowledge needed to support the collection.

At the beginning of implementation, the LDLs prioritised collecting e-books and e-journals from larger publishers, tackling the “short tail” of available publications. Collecting the UK web domain as well as building thematic collections of websites was also in scope, and web archiving became a large focus for preservation and access. In recent years, the Libraries have started collecting other content types, including sheet music and geospatial data. 

To support these digitally published works, the LDLs built a digital collection management infrastructure. This infrastructure is primarily located at the British Library and comprises a complex network of systems, workflows, processes, tools, and policies that work together to support an end-to-end collection management lifecycle. This lifecycle includes acquisition and deposit; ingest into a digital repository and active preservation once securely stored; cataloguing; discovery; and access at each of the LDLs.

The content files, as well as any associated files (e.g. metadata, cover images), are securely stored on four geographically separate nodes located at the British Library’s locations in London and Boston Spa as well as at the National Library of Scotland and the National Library of Wales.

Where possible, workflow steps are automated, but they rely heavily on the knowledge of Library staff to build and support. And this knowledge is growing as publishers create their works in an array of formats—some of which might only be suitable for certain software applications and hardware—and apply a range of approaches for structuring and supplying metadata.

Non-Print Legal Deposit and its ongoing implementation speak to the changing nature of digital publishing and specifically to how digital technology affects the creation of digital publications as well as to how Readers consume and access content. As digital technology continues to evolve, and publishers apply whichever technology they choose to create their works, the Libraries will need to continue to develop their digital collection management service.

 

File formats and content types

A main objective for allowing the Libraries to collect published works in digital formats was to ensure comprehensive collecting. Another main driver is preservation, to ensure the longevity of the collection.

For a publication to be deposited with the British Library, it must be the version that is made publicly available and created in a format that is suitable for long-term preservation.

At present, the British Library accepts the following formats for content types collected under NPLD:

  • eBooks (EPUB, PDF)
  • eJournals (PDF)
  • Web archive (WARC)
  • Geospatial data (raster and vector formats)
  • Sheet music (PDF)

This list of formats will change over time as the collection continues to grow and file formats evolve or even become obsolete. The Libraries actively monitor the stability of file formats already represented in the collection as well as remain aware of trends in digital publishing to understand what is on the horizon.

 

eBooks created in the EPUB file format

Digital preservation is a crucial discipline needed to inform how digital files can be preserved and made accessible to current and future generations of Readers.

Many of the comments we have received concern e-books and the EPUB file format, so it’s important to spend some time here explaining why the LDLs collect this content type in this format.

In comparison to PDF, EPUB is the more suitable preservation format. This is for a few reasons: 

  • It is widely used and supported within the publishing community
  • It is based on open standards
  • It is community supported and the specification is openly available
  • A considerable number of software applications and hardware devices support access

Whilst EPUB is preferred from a preservation perspective, Readers might experience challenges with citing content from publications created in this format since it commonly does not include page numbers. A feature of EPUBs is that they can be reflowable, where content appears as one long document and the presentation adapts to the viewing software. Readers can change the size of the font, as well as the font itself in some cases. If a reflowable EPUB did support page numbers, these could be different with each individual viewing and the appearance would adjust to how the Reader has chosen to view the content within the viewer.
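To illustrate why page numbers are unstable in reflowable content, here is a minimal sketch in Python. It is not how any EPUB reading system works internally; it simply assumes that changing the font size changes how many characters fit on a line, and the function name `page_of_word` and the sample text are hypothetical.

```python
import textwrap


def page_of_word(text, word_index, chars_per_line, lines_per_page):
    """Return the 1-based page holding the word at `word_index` (0-based)."""
    # Wrap without splitting words, so every line holds whole words only.
    lines = textwrap.wrap(
        text,
        width=chars_per_line,
        break_long_words=False,
        break_on_hyphens=False,
    )
    words_seen = 0
    for line_number, line in enumerate(lines):
        words_seen += len(line.split())
        if word_index < words_seen:
            return line_number // lines_per_page + 1
    raise IndexError("word_index is beyond the end of the text")


# Hypothetical sample text: 300 identical five-letter words.
sample = " ".join(["lorem"] * 300)

# Same word, same text, two display settings (smaller vs larger font).
small_font_page = page_of_word(sample, 250, chars_per_line=59, lines_per_page=10)
large_font_page = page_of_word(sample, 250, chars_per_line=29, lines_per_page=10)
print(small_font_page, large_font_page)
```

The same word lands on a different "page" under each setting, which is why a page reference taken from one viewing of a reflowable EPUB cannot be relied on in another.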

The EPUB 3 version of this format supports fixed layout, which resembles more closely the layout of a print book, and metadata helps specify the orientation and position of the pages. This means that the page will stay the same no matter how it is viewed and can support page numbers. A citation challenge remains, however, since the page numbers in the EPUB version might not be the same as in its print counterpart where both exist.

For all types of EPUBs, citation guidance recommends using paragraph numbers to reference content and counting the paragraphs from the beginning of an eBook's chapter in which the cited content appears. While this solution is not ideal, the challenge of citing eBooks is not specific to Readers using legal deposit publications but exists for all publications created in this format.
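The paragraph-counting approach can be sketched in a few lines of Python. This is only an illustration, not part of any Library system or citation standard: it assumes a chapter is available as a well-formed XHTML string (EPUB content documents are XHTML), and the function name `paragraph_number` and the sample chapter are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace used by XHTML elements in EPUB content documents.
XHTML = "{http://www.w3.org/1999/xhtml}"


def paragraph_number(chapter_xhtml: str, quote: str) -> Optional[int]:
    """Return the 1-based number of the first paragraph containing `quote`."""
    root = ET.fromstring(chapter_xhtml)
    for number, para in enumerate(root.iter(f"{XHTML}p"), start=1):
        # Join text across inline markup and normalise whitespace.
        text = " ".join("".join(para.itertext()).split())
        if quote in text:
            return number
    return None


# Hypothetical chapter for illustration only.
chapter = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1>Chapter 3</h1>
    <p>First paragraph of the chapter.</p>
    <p>Second paragraph, with <em>inline</em> markup.</p>
    <p>Third paragraph holds the cited passage.</p>
  </body>
</html>"""

print(paragraph_number(chapter, "cited passage"))
```

A citation would then give the chapter and the paragraph number rather than a page. Real chapters may need extra handling, for example for named entities or for paragraphs marked up with elements other than `p`.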

 

Access to digital publications

The LDLs endeavour to provide as good an access experience as possible, but it must also be in compliance with the restrictions outlined in the Legal Deposit Libraries (Non-Print Works) Regulations 2013.

Access to Non-Print Legal Deposit publications is restricted to use onsite at computer terminals at each of the Libraries. There are additional restrictions that apply:  

  • Single concurrent access per item per library
    • Readers at different LDLs can access the same eBook at the same time, but not if they are at the same site. 
  • No digital copies can be removed from a reading room
  • No digital sharing or screenshots

Whilst some Readers might not find the access experience to be ideal, the Libraries must enforce access conditions that are, in most cases, unique to them. A bespoke access solution was built to accommodate the access restrictions as outlined in the Regulations. This solution comprises commercially available and open source software and browsers (where these exist) as well as software that has been either developed or configured by British Library staff or external specialists.


What's next for UK legal deposit?

The Libraries are actively reviewing access to legal deposit publications to identify how to improve the access experience. The recent discussions are therefore topical and comments from Readers help us better understand what a better experience would look like.

In 2018, the UK’s Department for Digital, Culture, Media and Sport also undertook a post-implementation review of the NPLD Regulations and made the following recommendations:

  • Accessibility for disabled users to be brought in line with the Equality Act 2010
  • Ability to collect newspapers in the form of digital facsimiles
  • Understand how access to the UK Web Archive can be increased while protecting rights holders
  • Understand how NPLD regulations can better align with UK copyright law

This review, as well as responses from library and publisher representatives, is publicly available on DCMS’ website, and these recommendations will be reviewed by library and publisher representatives as well as external subject matter experts and the general public in the coming months.

As I mentioned earlier in this post, the implementation of Non-Print Legal Deposit is ongoing. It is important for the Libraries to build this collection and ensure that it is preserved and made accessible now and in the future. The Libraries welcome feedback from Readers about their experience of using this collection; Readers can email Customer-Services@bl.uk. The Twitter hashtags #UKNPLD and #UKLegalDeposit publicise news about this collection, including publications and ongoing research to support its collection management.

Caylin Smith
Legal Deposit Libraries Senior Project Manager