THE BRITISH LIBRARY

Collection Care blog

4 posts from January 2014

22 January 2014

Digital Preservation Training Programme: snuggling up with OAIS

An important aspect of digital preservation advocacy in recent years has been the recognition of the importance of developing professional knowledge and skills at all stages of a career. To support this, research projects like DigCurV (Digital Curator Vocational Education Europe) and APARSEN have spent considerable time developing frameworks and curricula for digital preservation training, aimed at cultural heritage professionals at different levels of seniority and career stage. In the UK, the Digital Preservation Coalition (DPC), of which the British Library is a member, provides a range of opportunities for digital preservation training and continuing professional development.

It was thanks to a scholarship from the DPC that I was able to attend the Digital Preservation Training Programme (DPTP) at the University of London Computer Centre last November. Led by Ed Pinsent and Stephanie Taylor, the course aimed to provide delegates with the skills and knowledge necessary to respond to emerging issues in digital preservation from an organisational perspective.

The attendance was international, with representatives from a diverse range of institutions and from library, archive, records management, technical and business backgrounds. There were many motivations for attending the course, from delegates who were taking responsibility for initiating a digital preservation strategy for their entire organisation, to others who were preparing for digital archives to be ingested into their collections. As a recent graduate in archive administration, I have some foundation knowledge of digital preservation; however, I could recognise gaps in my understanding of issues such as preservation systems, cost models and risk assessments.

The core modules of the DPTP revolved around understanding the OAIS (Open Archival Information System) model, existing tools developed specifically for practical digital preservation purposes, and the current legislation and standards of this sector. The DPTP was well managed and engaging, with presentations broken up by short individual tasks, which allowed time to examine and absorb tools and concepts, and by group activities, which were more collaborative and focused on problem solving.

The first rule of digital preservation club… OAIS. OAIS photograph courtesy of Flickr user wlef70 / Creative Commons Licensed

Group work encouraged an imaginative but analytical approach to developing digital preservation strategies within an institution, evolving in complexity over the duration of the course. By day three the group tasks had broadened to critically analysing a genuine institution’s digital preservation implementations and mapping them against the OAIS model. This was an excellent exercise as it encouraged groups to think about the different stages, people and status of records throughout digital preservation activities and to examine the strengths and weaknesses of these within a particular organisation. This was then balanced on the ‘three-legged stool’ model of resources, technology and organisation to pinpoint where improvement should be focused. Finally, this holistic approach considered how all of this tied into ISO 16363, Audit and certification of trustworthy digital repositories.

One of the highlights of the course was a talk by Sharon McMeekin of the DPC on cost models and risk management. This session explored the financial risks of data loss and benefit realisation of successful digital preservation practice, evidencing the value of digital records so that institutions perceive them as valuable.

The course delivered an excellent introduction to many aspects of digital preservation in just three days. For those who would prefer a shorter introduction to digital preservation, I would also highly recommend the DPC’s ‘Getting started…’ and ‘What I wish I knew before I started…’ conferences. As an early career information professional, I am noticing an increase in job advertisements listing digital preservation as an essential or desirable proficiency. This discipline is still emerging and evolving, and it is a great time to develop skills in digital preservation.

For further information on the DPTP see: http://www.dptp.org/
And for DPC events: http://www.dpconline.org/events

Ann MacDonald, Internship Digital Preservation

16 January 2014

From a caterpillar to a butterfly: the story of a rolled Thai painting

When a rolled rectangular panel painting arrived in the studio, it resembled a caterpillar in more ways than one. It was long, brown, spotty and unattractive! One could only glimpse, peering round the edges, the beauty it would become!

CC by The object before conservation

The painting, dated 1850-1880, was acquired at an auction in 2007. When it was first assessed for conservation, it was tightly rolled, and the assessment was based on viewing the exterior of the roll only. However, the curator, Jana Igunma, told us that the painting had been pulled open once for viewing at the auction.

Looking round the edges it was possible to see that the painting was executed in water-based paints on a thick layer of gesso and backed with linen. The panel was additionally stiffened by another layer of very discoloured board. It looked as if, at some point in its life, it had been forcibly pulled from that board and the painting, in self-defence, curled up into a tight tube! The painting was now waiting to be opened again, but the stiff and acidic ‘cocoon’ had to go first if the butterfly was to be born!

We proceeded with caution, making sure that the cracked paint layers were not damaged in the process.

CC by The backing was gradually removed and then pressed using glass weights

Extensive trials and tests were carried out to determine the most efficient method for removing the backing; only some of the paper could be removed dry, with the rest firmly attached. An alcohol and water solution proved most efficient in allowing the gradual removal of the backing without too much penetration of the paint layers, which were additionally protected by the thick layer of gesso. The paint layers and moisture penetration were checked regularly for any indication that the procedure was having a detrimental effect on the paint.

The removal of the backing, coupled with gentle pressure applied from the verso, relaxed the painting enough to allow it to be opened gradually.

CC by Glass weights were used to press and flatten the painting

When the backing board was completely removed, the paint layers were reassessed. Although the surface was covered in tiny cracks resembling a spider web, most of the pigments appeared to be in good condition, the exception being the white pigment applied on gold. A few small chunky flakes were re-attached using isinglass (a fish-based adhesive) applied by brush. The most ‘dramatic’ of those flakes involved an eye, as illustrated below! The white on gold areas were additionally consolidated by applying another adhesive as a fine mist over the areas.

CC by Left: The deity on the left with no eye, and right: the eye re-attached in place

After conservation, the object was hinged onto a lightweight board and boxed. This completed the ‘chrysalis-like’ transformation of the object from a roll to a beautiful painting that can now be fully appreciated!

CC by The painting after conservation

The rectangular painted linen panel depicts scenes from the life of the Buddha, which is a very rare subject in Thai manuscript or miniature painting. The painting style is clearly Rattanakosin (1782 - present). Similar scenes from the life of the Buddha are depicted on mural paintings in Thai Buddhist temples built during the 19th century. The fact that this painting was produced in order to be framed and mounted on a wall indicates that it was possibly done to cater for Western tastes, and might even have been a copy of a mural.

A more thorough study of the painting now that it can be fully accessed may reveal more than we know about it at present.

Iwona Jurkiewicz

I would like to thank Cordelia Rogerson for providing the initial assessment of the painting, Jana Igunma for giving us the information about the object and painting conservators: Nicola Costaras, Audrey Marko, Beatrice Villemin and Odile Hubert, who with their expertise, time and encouragement helped me bring this project to a successful conclusion.

13 January 2014

Read All About It #2 - Building a Future

This is the second in a series of blog posts discussing the challenges of caring for the national newspaper collection - how we’ve worked to preserve it and keep it accessible in the past and how we are going to do so in the future.

The national newspaper collection is on the move. Its current home at Colindale is no longer fit for purpose – either as a repository able to offer long term sustainability to the collection; or as a facility for readers to experience the modern, dynamic newspaper and news service that we want to offer. This recent BBC News report paints a vivid picture.

We know the collection is vulnerable, and if we don’t act now to move it into better conditions, we risk more of it falling into such bad condition that we will be unable to issue it without increased damage or loss, if at all.

Our survey says…

In 2001, as part of a three year project to survey all of the Library’s collections on all of its sites, we surveyed the newspaper collections at Colindale using the PAS (Preservation Needs Assessment Survey) methodology. The results showed that the newspaper collection is the most vulnerable of all of the Library’s collections and gave us a statistically sound picture of the state of this national collection. Our results showed that 34% of the collection at Colindale was unstable – 19.4% in poor condition, 14.6% unusable.

We know that improved storage is the best way of preserving the whole collection for the long term, and our new Newspaper Storage Building (NSB) is undergoing its final testing as I type.

However, this is just the latest – and most ambitious – effort to strike a balance between the long-term preservation needs of the collection and our duty to make it available to users.

The ties that bind

To the bindery workshop!  

When reader facilities were added to the original Colindale repository in 1932, a bindery was also created on the 3rd floor. Here, new legal deposit intake was bound, and older papers were conserved – pulled down, de-acidified, repaired and re-sewn and re-bound. Treatment and binding styles varied depending on the age, type and size of newspaper - machine sewn; hand-sewn on tapes or cords, buckram and leather, half and quarter; finished in foils, mostly, but occasionally gold leaf.

As the conservation and binding of newspapers proved to be less and less cost- and time-effective over the years, benefiting only a small part of a vast collection, the bindery was closed in 2001. However, because of the work that was done, there are many thousands of volumes in perfectly good condition today that otherwise wouldn’t be.

Below, the bindery at Colindale in full production in the 1980s.

CC by Newspapers ready for sewing, by machine and by hand

CC by Forwarding and finishing

Lights! Camera! Microfilm!

We know that not everyone is a massive fan of microfilm. From a user’s point of view it has few of the advantages of digital and it’s not the real thing. But for the long-term preservation of content it has proved its worth, and without the large-scale microfilming programmes undertaken from the 1970s onwards, a significant portion of our content would simply be unavailable today in any form.

CC by Microfilming at Colindale began in the 1950s. In 1971 a dedicated microfilm unit was completed. At its height the unit operated 20 cameras and the BL produced (internally and externally) approximately 13 million frames of newspaper content annually

For we are living in a digital world, and I am a digital girl...(sorry, Madonna)

We still copy newspapers today, to increase access to content and to preserve the originals, but the format tends to be digital rather than microfilm. For instance, the Library is working in partnership with DC Thomson Family History to digitise 40 million pages of 19th and early 20th century newspapers and make them available on the British Newspaper Archive website. Interestingly, where we can’t scan the original newspapers, the microfilm we created over the last 50 years is proving an invaluable alternative scanning source.

“What are you able to build with your blocks? Castles and palaces, temples and docks.” (from Block City by Robert Louis Stevenson)

CC by The new storage building, with the main void at the back and the support building in front

Well, what we’ve been able to build with our blocks is a brand new storage facility for the national newspaper collection at Boston Spa, known lovingly as NSB – Newspaper Storage Building (we love to tell it like it is!). This state-of-the-art building will secure the long-term future of the collection. In a complete (improved) reversal of storage fortune for the collection, it will be stored in the dark, which will protect it from the damaging light levels that could not be controlled at Colindale.

The temperature will be 14°C and relative humidity 55%, a vast improvement on what could be achieved at Colindale. More importantly, both will be maintained at a steady level, providing an environment that will slow down the rate of deterioration of the collection. Crucially, the oxygen level is purposely low at 14-15%, eliminating the risk of fire (ignition is impossible). The ingest and retrieval of newspapers is automated, which in turn means that the storage can be high density.

Lying down on the job

Not us – the collection! If you read our first post, you’ll know that the collection varies in size enormously, from volumes no bigger than a pocket diary to volumes weighing nearly 20 kg. Storing these large and heavy volumes vertically is causing physical damage, particularly where the boards are no longer attached and providing support, so in the new building the collection will be stored horizontally in stacks, which will ease the pressure on the bindings and stabilise the text block. A ‘stack’ consists of a bottom board, a stack of volumes, and a top board. The boards and the stack are secured by straps. The stacks are stored on huge carrier trays in the storage racking, each holding various permutations of stack sizes.

It all stacks up

We’ve set a maximum height of 400 mm for each stack. Volumes will be grouped together by condition and stacked by size, with bound volumes being alternated spine to foredge to provide a stable stack with an even weight distribution. In order to do this, we’ve undertaken a massive data gathering exercise, determining the size of every item in the collection and assigning a condition rating of good, poor, or unusable.

Size Footprint plot

The collection was divided into seven sizes or footprints, relating to the board sizes on which items will be stacked. Footprint 1 is any volume up to 380 mm (h) x 310 mm (w), while footprint 7 caters for volumes between 820-1012 mm (h) x 680-770 mm (w) – we have several hundred of these. 
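For readers who like to see rules written down as logic, here is a minimal sketch of how the stacking rules described above could be expressed. It is not the Library's actual software: the `Volume` fields, the shelfmarks, the thickness values and the orientation labels are all hypothetical, and it assumes the 400 mm limit applies to the combined thickness of the volumes only (boards and straps are ignored).

```python
from dataclasses import dataclass
from itertools import groupby

MAX_STACK_HEIGHT_MM = 400  # maximum stack height stated in the post


@dataclass(frozen=True)
class Volume:
    shelfmark: str     # hypothetical identifier
    footprint: int     # size class 1 to 7, assumed to be assigned elsewhere
    condition: str     # "good", "poor" or "unusable"
    thickness_mm: int  # hypothetical measurement of the closed volume


def group_key(volume):
    """Volumes may only share a stack if condition and footprint match."""
    return (volume.condition, volume.footprint)


def build_stacks(volumes):
    """Fill stacks greedily within each (condition, footprint) group.

    A single volume thicker than the limit would still get a stack of its own.
    """
    stacks = []
    for _, group in groupby(sorted(volumes, key=group_key), key=group_key):
        current, height = [], 0
        for volume in group:
            if current and height + volume.thickness_mm > MAX_STACK_HEIGHT_MM:
                stacks.append(current)
                current, height = [], 0
            current.append(volume)
            height += volume.thickness_mm
        if current:
            stacks.append(current)
    return stacks


def orientations(stack):
    """Alternate spine to foredge up the stack (labels are illustrative)."""
    return ["spine-left" if i % 2 == 0 else "spine-right" for i in range(len(stack))]


example = [
    Volume("A", 1, "good", 150),
    Volume("B", 1, "good", 200),
    Volume("C", 1, "good", 100),
    Volume("D", 1, "poor", 50),
    Volume("E", 2, "good", 390),
]
stacks = build_stacks(example)
```

A greedy fill is the simplest possible choice here; a real planning exercise would presumably also weigh tray layout and retrieval patterns, which this sketch does not attempt.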

It’s a wrap

Knowing the condition of each item in the collection is important if we are to direct our resources appropriately and effectively. For this project, it was even more crucial because of the handling and transport logistics involved in moving from one building to the other. To protect items that are particularly vulnerable, we are shrink-wrapping those in poor and unusable condition.

CC by A stack of three shrink-wrapped volumes, being tested for stability

Construction

CC by One of the giant cranes is lifted into place. These will run up and down each aisle delivering carrier trays through a sealed air lock to the work stations in the support building

CC by The workstations in the support building

CC by Stacks being built in a dedicated test facility

It’s no small undertaking to move such a large and vulnerable collection half way up the country, so in our third post on this topic we’ll spend some time with Moves Manager Sarah Jane Newbery to find out what the challenges are – and how it’s all progressing.

For more information on the newspaper moves see: www.bl.uk/newspaper-moves and follow us @BL_CollCare.

Sandy Ryan

06 January 2014

Scalable Preservation Environments: the nuts and bolts of digital preservation software tools

The British Library is a partner in the SCAPE Project, a Seventh Framework Programme (FP7) project co-funded by the European Union. Its aim is to enhance the state of the art of digital preservation in three ways: by developing infrastructure and tools for scalable preservation actions; by providing a framework for automated, quality-assured preservation workflows and by integrating these components with a policy-based preservation planning and watch system. Other partners include leading European libraries, universities and companies. A full list is available on the SCAPE website.

Digital preservation tools

CC BY-NC 3.0

The British Library's Digital Preservation Team undertakes the R&D necessary to ensure the Library is able to implement the right technology and best practices to support digital preservation, at the right time. We have previously blogged here about our “Twelve Principles of Digital Preservation”.

Staff from the Digital Preservation Team - whilst representing the British Library’s interests within the project - lead the project in two key areas: we chair the technical coordination committee responsible for all technical developments within the project, and we lead a work package on creating and evaluating the execution of workflows for large scale digital repositories. We are also involved in two other “testbed” work packages related to web archiving and research datasets, as well as work packages surrounding the take-up of project outputs involving dissemination, demonstrations and training.

Our technical work within the project includes development and enhancement of characterisation and quality assurance tools and associated large scale workflows for characterisation of content within web archives, file format validation & identification of DRM in ebooks, and quality assured file format migration of TIFF files to JP2. Similar work by other partners includes characterisation of large audio/video files, audio migration, large scale ingest to a repository, arc to warc migration and other types of file format migration.

For execution of these tools and workflows across large scale data sets, the project uses Apache Hadoop. At the tool level, however, the software is discrete and can be used separately or within other large scale processing frameworks. The project is also creating services around policy-based preservation planning (Plato) and watch (Scout), and defining the necessary interfaces to enable all these entities to work together.

Some of the digital preservation tools and services that have been developed within the project include:

Tools:

xcorrSound  - a suite of tools for automated quality assurance of audio migration processes.

The tools can:

  • Find overlaps between sequential audio files
  • Find occurrences of a smaller section of audio within a larger dataset
  • Compare two audio files to see how they correlate
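To give a feel for the idea behind the second and third bullets, here is a toy sketch of locating a short run of samples inside a longer one by sliding a normalised correlation along it. It is an illustration only and says nothing about how xcorrSound itself is implemented or invoked; the function names are made up, the "audio" is just a list of random numbers, and a brute-force loop like this would be far too slow for real recordings.

```python
import math
import random


def normalised_correlation(a, b):
    """Pearson correlation of two equal-length sample sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return numerator / denominator if denominator else 0.0


def find_snippet(haystack, needle):
    """Return (offset, score) of the window in haystack that best matches needle."""
    best_offset, best_score = 0, -1.0
    for offset in range(len(haystack) - len(needle) + 1):
        window = haystack[offset:offset + len(needle)]
        score = normalised_correlation(window, needle)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


# Stand-in for audio samples: 200 pseudo-random values with a fixed seed.
rng = random.Random(0)
signal = [rng.uniform(-1.0, 1.0) for _ in range(200)]
snippet = signal[57:97]  # a 40-sample excerpt cut from a known position

offset, score = find_snippet(signal, snippet)
```

Because the excerpt was cut from position 57, that is where the best match lands, with a score very close to 1. A score near 1 between two whole files is the same idea applied to "how well do these correlate".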

Matchbox can automatically find duplicate images, for example duplicate scans, or match images from two separate scans of a book.

Jpylyzer  - a JP2 (JPEG2000 part 1) validator and properties extractor.

This tool can be used to:

  • Verify if JP2 files conform to the JP2 specification
  • Extract information about the encoding profile used for the file. This can be compared to an institutional encoding profile for verification
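The second use, checking extracted properties against an institutional encoding profile, can be sketched as a simple comparison. The sketch below assumes the properties have already been extracted into a plain dictionary by some tool; it does not call Jpylyzer, and the property names and values are hypothetical rather than Jpylyzer's own output fields.

```python
def compare_to_profile(properties, profile):
    """Return (name, expected, actual) for every property that breaks the profile."""
    mismatches = []
    for name, expected in profile.items():
        actual = properties.get(name)  # None if the property was not reported
        if actual != expected:
            mismatches.append((name, expected, actual))
    return mismatches


# Hypothetical institutional profile and hypothetical extracted properties.
institutional_profile = {
    "transformation": "9-7 irreversible",
    "layers": 12,
    "levels": 6,
}
extracted = {
    "transformation": "9-7 irreversible",
    "layers": 8,
    "levels": 6,
    "tiles": 1,  # extra properties not named in the profile are ignored
}

problems = compare_to_profile(extracted, institutional_profile)
```

Here `problems` holds a single entry, for `layers`, because the file reports 8 where the profile asks for 12.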

c3po (screencast) - a software tool for visualising and investigating the content types contained within a collection

Nanite can characterise files contained in web archives (arc/warc) without first extracting the files. The tool can be used on a Hadoop cluster.
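As a rough illustration of what characterising content "without first extracting the files" can mean, the sketch below identifies payloads held in memory by their leading bytes and tallies the results. It is not Nanite's method or interface: the function names are invented, only a handful of well-known file signatures are checked, and the records are assumed to have been read out of an archive by some other code.

```python
from collections import Counter

# A few widely documented file signatures (magic numbers).
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff(payload):
    """Guess a media type from the first bytes of an in-memory payload."""
    for magic, media_type in SIGNATURES:
        if payload.startswith(magic):
            return media_type
    return "application/octet-stream"  # unknown


def characterise(payloads):
    """Count media types across payloads without writing any of them to disk."""
    return Counter(sniff(payload) for payload in payloads)


# Hypothetical record payloads, as if already read from an archive file.
records = [
    b"%PDF-1.4\n...",
    b"GIF89a\x01\x00\x01\x00",
    b"<html><body>hello</body></html>",
]
summary = characterise(records)
```

Real identification tools use much richer signature sets and container-aware parsing; the point here is only that the bytes never need to leave memory.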

Pagelyzer - visual, structural and hybrid comparison of web pages.

Services:

Plato is a preservation planning tool that integrates content characterisation, preservation actions and automated object comparison.

Scout is a preservation watch system that consolidates information from several sources (web, content, registries, policies) and monitors that information against a defined policy.

As you can see there is a wide variety of tools being produced or enhanced within the project. There are many more that are not listed. If you are interested in finding out more about any of these tools take a look at http://www.scape-project.eu/tools. More in-depth blog posts can be found on the Open Planets Foundation blog: http://www.openplanetsfoundation.org/blog.

William Palmer

Digital Preservation Technical Lead, SCAPE Project.