Digital scholarship blog

Enabling innovative research with British Library digital collections

207 posts categorized "Experiments"

23 January 2018

Using Transkribus for handwritten text recognition with the India Office Records

In this post, Alex Hailey, Curator, Modern Archives and Manuscripts, describes the Library's work with handwritten text recognition.

National Handwriting Day seems like a good time to introduce the Library’s initial work with the Transkribus platform to produce automatic Handwritten Text Recognition models for use with the India Office Records.

Transkribus is produced and supported as part of the READ project, and provides a platform 'for the automated recognition, transcription and searching of historical documents'. Users upload images and then identify areas of writing (text regions) and lines within those regions. Once a page has been segmented in this way, users transcribe the text to produce a 'ground truth' transcription – an accurate representation of the text on the page. The ground truth texts and images are then used to train a recurrent neural network to produce a tool to transcribe texts from images: a Handwritten Text Recognition (HTR) model.

Page segmented using the automated line identification tool. The document structure tree can be seen in the left panel.

After hearing about the project at the Linnean Society’s From Cabinet to Internet conference in 2015, we decided to run a small pilot project using material digitised as part of the Botany in British India project.

Producing ground truth text and Handwritten Text Recognition (HTR) models

We created an initial set of ground truth training data for 200 images, produced by India Office curators and with the help of a PhD student. This data was sent to the Transkribus team to produce our first HTR model. We also supplied material for the construction of a dictionary to be used alongside the HTR, based on the text from the botany chapter of Science and the Changing Environment in India 1780-1920 and contemporary botanical texts.

The accuracy of an HTR model can be determined by generating an automated transcription, correcting any errors, and then comparing the two versions. The Transkribus comparison tool calculates a Character Error Rate (CER) and a Word Error Rate (WER), and also provides a handy visualisation. With our first HTR model we saw an average CER of 30% and WER of 50%, which reflected the small size of the training set and the number of different hands across the collections.
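As a rough illustration of what these two measures capture, the sketch below computes CER and WER as an edit distance divided by the length of the corrected text, counted in characters and in words respectively. This is the standard textbook definition, not a description of how the Transkribus comparison tool works internally, and the function names and sample strings are invented for the example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits / characters in the reference."""
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits / words in the reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical corrected text and automated transcription.
reference = "the botanical garden at Calcutta"
hypothesis = "the botanical gardon at Calcuta"

print(f"CER: {cer(reference, hypothesis):.2%}")  # CER: 6.25%
print(f"WER: {wer(reference, hypothesis):.2%}")  # WER: 40.00%
```

A single wrong character is enough to make a whole word count as an error, which is why WER is typically much higher than CER for the same page.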

(Transkribus recommends using collections with one or two consistent hands, but we thought we would push on regardless to get an idea of the challenges when using complex, multi-authored archives).

WER and CER are quite unforgiving measures of accuracy. The image above has 18.5% WER and 9.5% CER

For our second model we created an additional 500 pages of ground truth text, resulting in a training set of 83,358 words over 14,599 lines. We saw a marked improvement in results with this second HTR model – an average WER of 30%, and CER of 15%.

Graph showing the learning curve for our second HTR model, measured in CER

Improvements in the automatic layout detection and the ability to run the HTR over images in batch mean that we can now generate ground truth more quickly by correcting computer-produced transcriptions than we could through a fully manual process. We have since generated and corrected an additional 200 pages of transcriptions, and have expanded the training dataset for our next HTR model.

Lessons learned and next steps

We have now produced over 800 pages of corrected transcriptions using Transkribus, and have a much better idea of the challenges that the India Office material poses for current HTR technologies. Pages with margins and inconsistent paragraph widths prove challenging for the automatic layout detection, although the line identification has improved significantly, and tends to require only minor corrections (if any). Faint text, numerals, and tabulated text appeared to pose problems for our HTR models, as did particularly elaborate or lengthy ascenders and descenders.

More positively, we have signed a Memorandum of Understanding with the READ project, and are now able to take part in the exciting conversations around the transcription and searching of digitised manuscript materials, which we can hopefully start to feed into developments at the Library. The presentations from the recent Transkribus Conference are a good place to start if you want to learn more.

The transcriptions will be made available to researchers via data.bl.uk, and we are also planning to use them to test the ingest and delivery of transcriptions for manuscript material via the Universal Viewer.

By Alex Hailey, Curator, Modern Archives and Manuscripts

If you liked this post, you might also be interested in The good, the bad, and the cross-hatched on the Untold Lives blog.

22 January 2018

BL Labs 2017 Symposium: Data Mining Verse in 18th Century Newspapers by Jennifer Batt

Dr Jennifer Batt, Senior Lecturer at the University of Bristol, reported on an investigation into finding verse, using text and data-mining methods, in the digitised eighteenth-century newspapers of the British Library’s Burney Collection. The aim is to recover a complex, expansive, ephemeral poetic culture that has been lost to us for well over 250 years. The collection amounts to around 1 million pages: some 700 bound volumes covering 1,271 titles of newspapers and news pamphlets published in London, together with some English provincial, Irish and Scottish papers, and a few examples from the American colonies.

A video of her presentation is available below:

Jennifer's slides are available on SlideShare by clicking on the image below or following the link:

Datamining for verse in eighteenth-century newspapers

https://www.slideshare.net/labsbl/datamining-for-verse-in-eighteenthcentury-newsapers 

 

 

30 December 2017

The Flitch of Bacon: An Unexpected Journey Through the Collections of the British Library

Digital Curator Dr. Mia Ridge writes: we're excited to feature this guest post from an In the Spotlight participant. Edward Mills is a PhD student at the University of Exeter working on Anglo-Norman didactic literature. He also runs his own (somewhat sporadic) blog, ‘Anglo-Normantics’, and can be found Tweeting, rather more frequently, at @edward_mills.

Many readers of [Edward's] blog will doubtless be familiar with the work being done by the Digital Scholarship team, of which one particularly remarkable example is the ‘In the Spotlight’ project. The idea behind the project, for anyone who may have missed it, is absolutely fascinating: to create crowd-sourced transcriptions of part of the Library’s enormous collection of playbills. The part of the project that I’ve been most involved with so far is concerned with titles, and it’s a two-part process: first, the title is identified out of the (numerous) lines of text on the page, and once this has been verified by multiple volunteers, it is then fed back into the database as an item for transcription.

In the Spotlight interface

Often, though, the titles alone are more than sufficient to pique my interest. One such intriguing morsel came to light during a recent transcribing stint, when I found myself faced with a title that raised even more questions than Love, Law, & Physic:

Playbill for a performance of The Flitch of Bacon

In my day-job, I’m actually a medievalist, which meant that any play entitled The Flitch of Bacon was bound to pique my interest. The ‘flitch’ refers to an ancient – and certainly medieval –  custom in Dunmow, Essex, wherein couples who could prove that they had never once regretted their marriage in a year and a day would be awarded a ‘flitch’ (side) of bacon in recognition of their fidelity. I first came across the custom of these ‘flitch trials’ while watching an episode of the excellent Citation Needed podcast, and was intrigued to learn from there that references to the trials existed as far back as Chaucer (more on which later). The trials have an unbroken tradition stretching back centuries, and videos from 1925, 1952 and 2012 go some way towards demonstrating their continuing popularity. What the British Library project revealed, however, was that the flitch also served as the driver for artistic creation in its own right. A little bit of digging revealed that the libretto to the 1776 Flitch of Bacon farce has been digitised as part of the British Library’s own collections, and the lyrics are every bit as spectacular as one might expect them to be.

Rev. Henry Bate, The Flitch of Bacon: A Comic Opera in Two Acts (London: T. Evans, 1779), p. 24.

So far, so … unique. But, of course, the medievalist that dwells deep within me couldn’t resist digging into the history of the tradition, and once again the British Library’s collections came up trumps. The official website for the Dunmow Flitch Trials (because of course such a thing exists) proudly asserts that ‘a reference … can even be found within Chaucer’s 14th-century Canterbury Tales’, which of course can easily be checked with a quick skim through the Library’s wonderful catalogue of digitised manuscripts. The Wife of Bath’s Prologue opens with the titular wife describing her attitude towards her first three husbands, whom she ‘hadde […] hoolly in myn honde’. She keeps them so busy that they soon come to regret their marriage to her, forfeiting their right to ‘the bacoun … that som men fecche in Essex an Donmowe’ in the process:

‘The bacoun was nought fet for hem I trowe / That som men fecche in Essex an Donmowe’. From the Wife of Bath’s Prologue (British Library, MS Harley 7334, fol. 89r).

Chaucer’s reference to the flitch custom is frequently taken, along with William Langland’s allusion in Piers Plowman to couples who ‘do hem to Donemowe […] To folwe for the fliche’, to be the earliest reference to the tradition that can be found in English literature. Once again, though, the British Library’s collections can help us to put this particular statement to the test; as you’ve probably guessed by now, they show that there is indeed an earlier reference to the custom waiting to be found.

Baconanglonorman

Our source for this precocious French-language reference is MS Harley 4657. Like many surviving medieval manuscripts, this codex is often described as a ‘miscellany’: that is, a collection of shorter works brought together into a single volume. In the case of Harley 4657, the book appears to have been designed as a coherent whole, with the texts copied together at around the same time and sharing quires with each other; this is perhaps explained by the fact that the texts contained within it are all devotional and didactic in nature. (Miscellanies that were, by contrast, put together at a later date are known as recueils factices – another useful term, along with the ‘flitch of bacon’, to slip into conversation with friends and family members.) The bulk of the book is taken up by the Manuel des pechez, a guide to confession that was later translated into English by Robert Manning as Handling Synne. It’s in this text that the flitch custom makes an appearance, as part of a description of how many couples do not deserve any recompense for loyalty on account of their mutual mistrust (fol. 21):

image from https://s3.amazonaws.com/feather-client-files-aviary-prod-us-east-1/2017-12-21/7f90c385-77be-4cdb-9160-94c0aa7ce873.png

17 October 2017

Imaginary Cities – Collaborations with Technologists

Posted by Mahendra Mahey (Manager of BL Labs) on behalf of Michael Takeo Magruder (BL Labs Artist/Researcher in Residence).

In developing the Imaginary Cities project, I enlisted two long-standing colleagues to help collaboratively design the creative-technical infrastructures required to realise my artistic vision.

The first area of work sought to address my desire to create an automated system that could take a single map image from the British Library’s 1 Million Images from Scanned Books Flickr Commons collection and from it generate an endless series of ever-changing aesthetic iterations. This initiative was undertaken by the software architect and engineer David Steele, who developed a server-side program to realise this concept.

David’s server application links to a curated set of British Library maps through their unique Flickr URLs. The high-resolution maps are captured and stored by the server, and through a pre-defined algorithmic process are transformed into ultra-high-resolution images that appear as mandala-esque ‘city plans’. This process of aesthetic transformation is executed once per day, and is affected by two variables. The first is simply the passage of time, while the second is based on external human or network interaction with the original source maps in the digital collection (such as changes to metadata tags, view counts, etc.).


Time-lapse of algorithmically generated images (showing days 1, 7, 32 and 152) constructed from a 19th-century map of Paris

The second challenge involved transforming the algorithmically created 2D assets into real-time 3D environments that could be experienced through leading-edge visualisation systems, including VR headsets. This work was led by the researcher and visualisation expert Drew Baker, and was done using the 3D game development platform Unity. Drew produced a working prototype application that accessed the static image ‘city plans’ generated by David’s server-side infrastructure, and translated them into immersive virtual ‘cityscapes’.

The process begins with the application analysing an image bitmap and converting each pixel into a 3D geometry that is reminiscent of a building. These structures are then textured and aligned in a square grid that matches the original bitmap. Afterwards, the camera viewpoint descends into the newly rezzed city and can be controlled by the user.
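To make the pixel-to-building idea concrete, here is a minimal sketch in Python rather than in Unity itself. The post does not say which property of a pixel drives the geometry, so mapping brightness to building height is purely an assumption, as are the function name, its parameters and the dictionary layout.

```python
def bitmap_to_buildings(bitmap, max_height=10.0, cell_size=1.0):
    """Turn a greyscale bitmap (rows of 0-255 values) into box-shaped 'buildings'.

    Each pixel becomes one box on a square grid that mirrors the bitmap.
    Assumption for illustration only: brighter pixels become taller boxes.
    """
    buildings = []
    for row, line in enumerate(bitmap):
        for col, value in enumerate(line):
            buildings.append({
                "x": col * cell_size,   # grid position follows the pixel column
                "z": row * cell_size,   # and the pixel row
                "width": cell_size,
                "height": max_height * value / 255,
            })
    return buildings


# A tiny hypothetical 2x2 'city plan'.
plan = [
    [0, 255],
    [128, 64],
]
for building in bitmap_to_buildings(plan):
    print(building)
```

In a game engine, each dictionary would correspond to one textured mesh placed at the given grid position.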

Analysis and transformation of the source image bitmap
View of the procedurally created 3D cityscape

At present I am still working with David and Drew to refine and expand these amazing systems that they have created. Moving forward, our next major task will be to successfully use the infrastructures as the foundation for a new body of artwork.

You can see a presentation from me at the British Library Labs Symposium 2017 at the British Library Conference Centre Auditorium in London, on Monday 30th of October, 2017. For more information and to book (registration is FREE), please visit the event page.

About the collaborators:

David Steele

David Steele is a computer scientist based in Arlington, Virginia, USA specialising in progressive web programming and database architecture. He has been working with a wide range of web technologies since the mid-nineties and was a pioneer in pairing cutting-edge clients to existing corporate infrastructures. His work has enabled a variety of advanced applications from global text messaging frameworks to re-entry systems for the space shuttle. He is currently Principal Architect at Crunchy Data Solutions, Inc., and is involved in developing massively parallel backup solutions to protect the world's ever-growing data stores.

Drew Baker

Drew Baker is an independent researcher based in Melbourne, Australia. Over the past 20 years he has worked in visualisation of archaeology and cultural history. His explorations in 3D digital representation of spaces and artefacts as a research tool for both virtual archaeology and broader humanities applications laid the foundations for the London Charter, establishing internationally-recognised principles for the use of computer-based visualisation by researchers, educators and cultural heritage organisations. He is currently working with a remote community of Indigenous Australian elders from the Warlpiri nation in the Northern Territory’s Tanami Desert, digitising their intangible cultural heritage assets for use within the Kurdiji project – an initiative that seeks to improve mental health and resilience in the nation’s young people through the use of mobile technologies.

26 September 2017

BL Labs Symposium (2017), Mon 30 Oct: book your place now!


Posted by Mahendra Mahey, BL Labs Manager

The BL Labs team are pleased to announce that the fifth annual British Library Labs Symposium will be held on Monday 30 October, from 9:30 to 17:30, in the British Library Conference Centre, St Pancras. The event is FREE, although you must book a ticket in advance. Don't miss out!

The Symposium showcases innovative projects which use the British Library’s digital content, and provides a platform for development, networking and debate in the Digital Scholarship field.

Josie Fraser will be giving the keynote at this year's Symposium

This year, Dr Adam Farquhar, Head of Digital Scholarship at the British Library, will launch the Symposium and Josie Fraser, Senior Technology Adviser on the National Technology Team, based in the Department for Digital, Culture, Media and Sport in the UK Government, will be presenting the keynote. 

There will be presentations from BL Labs Competition (2016) runners up, artist/researcher Michael Takeo Magruder about his 'Imaginary Cities' project and lecturer/researcher Jennifer Batt about her 'Datamining verse in Eighteenth Century Newspapers' project.

After lunch, the winners of the BL Labs Awards (2017) will be announced, followed by presentations of their work. The Awards celebrate researchers, artists, educators and entrepreneurs from around the world who have made use of the British Library's digital content and data, in each of the Awards’ categories:

  • BL Labs Research Award. Recognising a project or activity which shows the development of new knowledge, research methods or tools.
  • BL Labs Artistic Award. Celebrating a creative or artistic endeavour which inspires, stimulates, amazes and provokes.
  • BL Labs Commercial Award. Recognising work that delivers or develops commercial value in the context of new products, tools or services that build on, incorporate or enhance the British Library's digital content.
  • BL Labs Teaching / Learning Award. Celebrating quality learning experiences created for learners of any age and ability that use the British Library's digital content.
  • BL Labs Staff Award. Recognising an outstanding individual or team who have played a key role in innovative work with the British Library's digital collections.  

The Symposium's endnote will be followed by a networking reception which will conclude the event, at which delegates and staff can mingle and network over a drink.  

Tickets are going fast, so book your place for the Symposium today!

For any further information please contact [email protected]

04 August 2017

BL Labs Awards (2017): enter before midnight Wednesday 11th October!

Posted by Mahendra Mahey, Manager of British Library Labs.

The BL Labs Awards formally recognise outstanding and innovative work that has been created using the British Library’s digital collections and data.

The closing date for entering the BL Labs Awards (2017) is midnight BST on Wednesday 11th October. Submit your entry, and help us spread the word to all interested parties over the next few months or so. This will ensure we have another year of fantastic digital-based projects highlighted by the Awards!

This year, BL Labs is commending work in four key areas:

  • Research - A project or activity which shows the development of new knowledge, research methods, or tools.
  • Commercial - An activity that delivers or develops commercial value in the context of new products, tools, or services that build on, incorporate, or enhance the Library's digital content.
  • Artistic - An artistic or creative endeavour which inspires, stimulates, amazes and provokes.
  • Teaching / Learning - Quality learning experiences created for learners of any age and ability that use the Library's digital content.

After the submission deadline of midnight BST on Wednesday 11th October for entering the BL Labs Awards has passed, the entries will be shortlisted. Selected shortlisted entrants will be notified via email by midnight BST on Friday 20th October 2017.

A prize of £500 will be awarded to the winner and £100 to the runner up of each Awards category at the BL Labs Symposium on 30th October 2017 at the British Library, St Pancras, London.

The talent of the BL Labs Awards winners and runners up over the last two years has led to the production of a remarkable and varied collection of innovative projects. In 2016, the Awards commended work in four main categories – Research, Artistic, Commercial and Teaching & Learning:

  • Research category Award (2016) winner: 'Scissors and Paste', by M. H. Beals. Scissors and Paste utilises the 1800-1900 digitised British Library Newspapers collection to explore the possibilities of mining large-scale newspaper databases for reprinted and repurposed news content.
  • Artistic Award (2016) winner: 'Hey There, Young Sailor', written and directed by Ling Low with visual art by Lyn Ong. Hey There, Young Sailor combines live action with animation, hand-drawn artwork and found archive images to tell a love story set at sea. The video draws on late 19th century and early 20th century images from the British Library's Flickr collection for its collages and tableaux and was commissioned by Malaysian indie folk band The Impatient Sisters and independently produced by a Malaysian and Indonesian team.
  • Commercial Award (2016) winner: 'Curating Digital Collections to Go Mobile', by Mitchell Davis. BiblioBoard is an award-winning e-content delivery platform, with online curatorial and multimedia publishing tools that make it simple for subject-area experts to create visually stunning multimedia exhibits for the web and mobile devices without any technical expertise. The example used a collection of digitised 19th-century books.
  • Teaching and Learning (2016) winner: 'Library Carpentry', founded by James Baker and involving the international Library Carpentry team. Library Carpentry is software skills training aimed at the needs and requirements of library professionals. It takes the form of a series of modules that are available online for self-directed study, or for adaptation and reuse by library professionals in face-to-face workshops using British Library data / collections. Library Carpentry is in the commons and for the commons: it is not tied to any institution or person. For more information, see http://librarycarpentry.github.io/.
  • Jury’s Special Mention Award (2016): 'Top Geo-referencer - Maurice Nicholson'. Maurice leads the effort to georeference over 50,000 maps that were identified through Flickr Commons; read more about his work here.

BL Labs Award Winners 2016
Image: 'Scissors and Paste', by M. H. Beals (top-left); 'Curating Digital Collections to Go Mobile', by Mitchell Davis (top-right); 'Hey There, Young Sailor', written and directed by Ling Low with visual art by Lyn Ong (bottom-left); 'Library Carpentry', founded by James Baker and involving the international Library Carpentry team (bottom-right)

For further information about BL Labs or our Awards, please contact us at [email protected].

21 July 2017

Russian Language Books Research Project by Nadya Miryanova

Finding digitised books in the Russian language in a collection of 65,000 books

Posted by Nadya Miryanova, BL Labs School Work Placement Student, currently studying at Lady Eleanor Holles, working with Mahendra Mahey, Manager of BL Labs.

Background

Contrary to popular belief, only 1-2% of the 200 million items in the British Library are digitised. The ‘Microsoft’ books are 65,000 volumes (about 22.5 million pages) published between 1789 and 1914 and digitised in partnership with Microsoft. They cover a wide range of subject areas, including philosophy, poetry and history, and they include Optical Character Recognition (OCR) text for the millions of pages.

In discussion with Mahendra Mahey, Project Manager of BL Labs, we explored making a ‘sub collection’ from this larger set which will hopefully be of use to the library in the future. At first, I simply brainstormed possible ideas and looked at different possibilities for this project, and I thought that since 2017 celebrates a century since the Russian Revolution, I would do some research into the concept of ‘revolution’.

Revolution

Definition - A forcible overthrow of a government or social order, in favour of a new system.

Etymology - Late Latin ‘revolvere’, meaning to roll back, which turned into the Old French or Late Latin ‘revolutio’, from which our contemporary English word ‘revolution’ derives.

Revolutions date back to as early as 2730 BC, when the Set rebellion took place against the reign of the pharaoh Seth-Peribsen of the Second Dynasty of Egypt. The most recent revolution happened only last year, in 2016, when there was an attempted coup d'état in Turkey.

About the Russian Revolution

The British Library has recently opened an exhibition that captures not only the events of this particularly intense period in history, but also the atmosphere that was omnipresent at the time. On my very first day here at the British Library, I got the chance to explore and study this fascinating exhibition in great depth.

The Russian Revolution was initiated by Lenin and the Bolsheviks, who hoped to create a socialist government, and in 1917, they successfully dismantled Tsarist autocracy in the hope of making society less stratified. The revolution resulted in the rise of the USSR and in the words of Karl Liebknecht, “The Russian revolution was to an unprecedented degree the cause of the proletariat of the whole world becoming more revolutionary”. However, this revolution also led to months of social and political turmoil and provoked the tragedy of the Russian Civil War on an unforeseeable scale, in which 10 million lives were lost. The revolution also produced myths that entered the artistic and intellectual fabric of the modern world, which the exhibition uncovers and investigates. Learn more about the Russian Revolution by booking your tickets for the Russian Revolution Exhibition at the British Library on the website http://goo.gl/FL9FFt.

Russian Revolution Poster
Russian Revolution Exhibition Poster at the British Library

As part of my research project, I also wanted to incorporate some of the other subjects that I had studied at GCSE, and so I thought this would be a brilliant opportunity to compare the Russian Revolution to the French Revolution, both French and Russian being subjects that I wish to study at A-level. The French Revolution was a period of far-reaching social and political upheaval in France that lasted from 1789 until 1799, and was partially carried forward by Napoleon during the later expansion of the French Empire.

Below is a mind-map I made detailing the differences and similarities between the French and the Russian Revolution.

Russian and French Revolution Research
French and Russian Revolution Comparison

Although my initial focus for the project was revolution, we soon established that it was too specific a topic and it would be more beneficial to focus on something broader, that would be useful to a larger group of researchers.

I soon discovered that the Russian titles within the digitised collection had never previously been separated and categorised, and being a native Russian speaker, I thought that this would be a better avenue to go down and explore. This would be a project in commemoration of the 100th anniversary of the Russian Revolution, which would hopefully help researchers looking at books in the Russian language in the future.

Facts about the Russian Language

  • Largest European native language.
  • 7th most spoken language in the world.
  • There are only 200,000 words in the Russian language in comparison to 1,000,000 in English.
  • The stress pattern in a word can drastically change its meaning, e.g.:
    • я плачу (emphasis on second syllable) - I pay.
    • я плáчу (emphasis on first syllable) - I cry.

Approach

My first task was to examine a huge spreadsheet containing information about the 65,000 books in the collection.

  • In order to make this task a little less daunting, I first used the ‘Filter’ function in the language column of my Excel spreadsheet, and selected the Russian language. As a result, I found 583 books in total that were written in the Russian Language.
  • I now had to think of a way to organise these books. The possibilities seemed endless, should I sort them into history books? Science books? Books about Russia?
  • In the end, I decided to establish two broad categories as a starting point, fiction vs non-fiction, as this seemed like a logical place to start.
  • In order to access the Russian keyboard, I went onto the site translit.net, which turns normal Latin letters into Cyrillic.
  • I typed in a Russian word, using the English keyboard, that related to one of my two categories, e.g. for non-fiction, I wanted to find history related books, so used the simple word ‘history’, which translates as история.
  • I then copied this word, and pasted it into my spreadsheet.
  • I used the filter function on the 'Titles' section, and this would hopefully produce a number of books that included the word history in their title.
Screenshot of my spreadsheet.
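The filtering steps above were done by hand in Excel, but the same idea can be sketched in a few lines of Python. The column names (`Title`, `Language`) and the three sample rows are invented for illustration and are not taken from the real spreadsheet.

```python
import csv
import io

# Invented rows standing in for the real spreadsheet; the column names
# 'Title' and 'Language' are assumptions made for this sketch.
SAMPLE_CSV = """Title,Language
История России,Russian
A History of England,English
Сборник стихотворений,Russian
"""


def filter_books(rows, language, title_keyword=None):
    """Keep rows in the given language, optionally requiring a word in the title."""
    matches = [row for row in rows if row["Language"] == language]
    if title_keyword:
        needle = title_keyword.lower()
        matches = [row for row in matches if needle in row["Title"].lower()]
    return matches


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
russian_books = filter_books(rows, "Russian")
history_books = filter_books(rows, "Russian", "история")
print(len(russian_books), len(history_books))  # 2 1
```

A plain substring match like this only finds titles that contain the keyword exactly as typed, so differently inflected or differently spelled forms of a word will be missed.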


Challenges

In this project, I found that I had to overcome a number of difficulties.

  • In Russian, nouns can have up to 12 inflections and adjectives can have as many as 16. This clearly shows that looking up different versions of the same word was necessary.
  • As I said previously, I first experimented with simple words, such as history. You would think that there would definitely be books relating to history lurking somewhere in a collection of nearly 600 Russian titles. However, when I conducted my search, the spreadsheet returned no results. Confused, I tried another simple word, and once again had no definitive results.

Scanning more closely through the list of books, I soon noticed that there were certain spellings and letters that I did not recognise. I decided to research this matter more closely, looking at the history of the Russian language, and found out that the Russian of the 19th century does not directly resemble the Russian language used today. Why? Because of the Russian Revolution, of course.

1918 Spelling Reform Research
Bolshevik Spelling Reform of 1918 Research, detailing the causes for the reform and the changes made to the Russian language

Suddenly, everything made a lot more sense.

This discovery meant that I had to change my approach a little bit, so rather than typing in the Russian words in the spelling that I knew today, I would have to go for a sort of hunt throughout the spreadsheet, looking for words in the titles of the books that could encompass a number of books. In a way, this made the process of my project even more interesting, despite the fact that it took longer.
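For anyone who would rather search automatically than hunt by eye, one option is to convert pre-reform spellings to modern ones before matching. The sketch below is not what I did in this project. It covers only the best-known changes of the 1918 reform (the letters ѣ, і, ѳ and ѵ, and the hard sign at the end of a word) and ignores others, such as the older adjective endings, so treat it as a starting point only. The function name is made up for the example.

```python
import re

# Letters removed by the 1918 reform, mapped to their modern replacements.
# Unicode escapes are used so the look-alike characters cannot be confused.
_LETTER_MAP = str.maketrans({
    "\u0463": "\u0435",  # ѣ -> е
    "\u0462": "\u0415",  # Ѣ -> Е
    "\u0456": "\u0438",  # і -> и
    "\u0406": "\u0418",  # І -> И
    "\u0473": "\u0444",  # ѳ -> ф
    "\u0472": "\u0424",  # Ѳ -> Ф
    "\u0475": "\u0438",  # ѵ -> и
    "\u0474": "\u0418",  # Ѵ -> И
})

# A hard sign (ъ or Ъ) at the very end of a word was dropped by the reform;
# inside a word it is kept.
_FINAL_HARD_SIGN = re.compile(r"[\u044a\u042a]\b")


def modernise(text):
    """Approximate modern spelling for a pre-1918 Russian title."""
    return _FINAL_HARD_SIGN.sub("", text.translate(_LETTER_MAP))


print(modernise("По Сѣверо-Западу Россіи"))
```

Applying `modernise` to every title before filtering would let a search for a modern word such as история also find titles printed in the older spelling.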

As I mentioned in my previous blog, the majority of the Russian language books were actually non-fiction. As a result, I decided to create sub-categories for the non-fiction set, which can be seen in the speech-bubble I created below.

Non-fiction categories
Speech bubble containing non-fiction categories

To help me in this task, I decided to create a colour-coding system for classification, so that I could keep track of my progress.

  • Yellow=Classified
  • Purple= латиница (Latin letters)- quite often I found titles which were written in Russian but using Latin letters. Purple was also used for titles written in another language
  • Blue=unknown classification
  • Orange= near classification
Colour coding system
Screenshot of my spreadsheet showing the colour coding system that I used.

Evaluation

In conclusion, I managed to categorise the Russian language books into two broad categories, fiction and non-fiction, and I created 25 sub-collections within the non-fiction category. This project has been extremely enjoyable to work on, and although there were many challenges involved in the process, I have learnt lots during my research journey. In order to improve this project, I would definitely say that more work needs to be done on splitting up the 'history' sub-collection of my non-fiction title, since it is very broad and covers political accounts, as well as books about Russian History. Additionally, I think that this project would also considerably benefit from undergoing a thorough check with curators, in order to help classify some of the books I have not organised into separate collections yet. 

Picture from Russian Book
An illustration from one of the Russian books, По Сѣверо-Западу Россіи, available in the digitised collections. Image can be accessed on British Library Flickr Commons.

 

 

Through the British Library Looking Glass - A Continuation of Nadya Miryanova's Work Experience

Posted by Nadya Miryanova BL Labs School Work Placement Student, currently studying at Lady Eleanor Holles, working with Mahendra Mahey, Manager of BL Labs.

Day 6

Despite the fact that a week of my work experience here has already elapsed, I still can’t quite believe that I am lucky enough to find myself in this magnificent institution, let alone have access to ‘staff-only’ areas and actually be able to work here. One thing I particularly love is that I can enter the library in the early morning, before official opening hours, and see it evolve from a certain peaceful stillness to its usual excited buzz of activity as the day progresses and watch as the library is brought to life once more by the people that visit it.

Photo of me at the book tower
A photograph of me by the book tower in the British Library

Previously, in a very serious and sophisticated catch-up session (including, of course, only work-related matters), Mahendra had discovered that I was a huge fan of the Harry Potter series. Although this subject may seem quite unexpected and completely out of context in this blog, it is actually very relevant, since the next day Mahendra informed me that I would be able to meet the Harry Potter curator. This caught me completely by surprise, but it also shamelessly sparked a child-like excitement within me, as I have loved the franchise ever since I was seven. A meeting was set for Monday morning, and I waited, with some impatience, to meet Julian Harrison, the curator of medieval manuscripts and also the man who was involved in the organisation of the Harry Potter exhibition.

People looking at exhibition
People looking at an exhibition in the British Library

During the meeting, I was able to gain an insight into the working life of a curator. Julian explained the sorts of things involved in this role, and also talked more about the exhibitions themselves, where inspiration comes from, as well as previous exhibitions and their structure. 

In addition to this, I was able to find out lots of details about the Harry Potter exhibition (it’s fascinating and definitely worth a visit, trust me!). Furthermore, we had an in-depth discussion about the Harry Potter series itself, and we talked about some of the key themes as well as key characters in the books. You’ll soon be able to find out more about the exhibition too: be sure to book your tickets early and visit the British Library to be part of what will truly be a magical experience!

Phoenix
A preview of the "Harry Potter: A History of Magic" Exhibition, coming soon on 27th October 2017

In the afternoon, I went to a classical music concert at the British Museum. As I stepped into the light interior of the museum, I felt a hundred memories instantly come to mind, dating back to various visits with my family and numerous school projects over the years. The British Library and British Museum singers presented a concert performance of ‘Trial by Jury’, an opera in one act, with music by Arthur Sullivan and libretto by W. S. Gilbert. ‘Trial by Jury’ is set at a Court of Justice in 1876. The defendant, Edwin, has recently promised to marry a beautiful woman, Angelina, but has since changed his mind, for which reason Angelina is now suing him for Breach of Promise. After a multitude of entertaining events, involving the Jury, the public, the Usher, and many comic disagreements over the issue, a decision is finally reached. The Judge decides the only real logical solution to the problem is to marry Angelina himself, resulting in happiness for all parties. The choir then performed Te Deum, op 103, by Dvorak, a true choral masterpiece, and the performance itself was very moving.

Although the choir was relatively small in number, their bright and beautiful voices resonated across the room, creating a light-hearted and friendly atmosphere, upheld by the choir’s energy and enthusiasm. I always love seeing how music can unite people to interpret a piece together, and each member was fully involved in this collaborative effort to create stunning music, making the performance an unquestionable success.  

Choir
The British Museum and British Library Singers

When I returned to the office, I checked my e-mails and saw that Laurence Roger, Project Support Officer in the Collections Division, had very kindly offered to help me examine a book about Catullus’ poetry. The book that I eventually saw dated back to the 18th century, and I spent the last part of my day looking at it with Laurence, who was very kind, and I felt extremely lucky to have access to it.

Book pic
One of the books that Laurence herself had lent me to look at.

Day 7

My seventh day of work experience arrived, and almost as soon as I got into the office, I set up my desk and eagerly launched straight into my working day. My morning consisted of independent work, where I further developed my research project and carried on with the interview storyboard for Hannah-Rose Murray, a finalist of the BL Labs competition in 2016. Her project was centred on black American activists in the 19th century, particularly their speeches and lectures from the 1830s to the 1890s. This was a period of history that I previously knew little about, and so I enjoyed learning about the influence that black Americans had on British society and seeing the way Hannah went about creating her project, bringing history to life. Read more about her project here.

Locations of Frederick Douglass
Map displaying the locations of Frederick Douglass’ lectures in the United Kingdom and Ireland, a small section of Hannah-Rose Murray's project

At 12:30, I attended a Welcome Day at the British Library, and this presented me with an excellent opportunity to not only find out more about the different departments of the library, but also to tell some new members of staff about some of the work the Digital Scholarship Department does (I was also provided with a free lunch, always a bonus!). I talked to a variety of departments, ranging from Human Resources to Publishing and Retail, and everyone was extremely friendly, helpful and accommodating.

In the afternoon, I worked independently once again, more specifically on a YouTube transcription of an interview with Melodee Beals, a 2016 research award winner, who created an amazing project entitled ‘Scissors and Paste’. This project utilises the 1800-1900 British Library Newspapers collection to explore the possibilities of mining large-scale newspaper databases for reprinted and repurposed news content.

Melodee presenting her project
Melodee Beals presenting her project, 'Scissors and Paste'

After finishing my working day, I decided to wander around and explore the British Library. The amazing thing about this place is that it really does resemble a maze: I constantly find myself discovering new places and rooms, with each day presenting something new and different to the previous one.

Day 8

As I entered the lift, I looked at the hard copy of my schedule, and I noticed that a meeting with a fashion company and members of the British Fashion Council was fixed for that very morning. Feeling suddenly a little more self-conscious than usual about my appearance, I glanced cautiously in the mirror that was in the lift and my reflection stared back, wondering if anything could be done to cover the consequences that a malfunctioning alarm clock and getting ready in five minutes that morning could bring. After a few fruitless attempts to tame my hair, I finally accepted defeat and entered the meeting room.

The meeting at 9 o’clock was with a luxury womenswear brand. During the meeting, Mahendra introduced BL Labs, showing a presentation that informed the company about Digital Scholarship and detailed previous projects that the department had worked on, including ‘Burning Man’. A project with the fashion company was then initiated, which would involve the Library's collections, and some possible ideas for the project were also brainstormed. The fashion company talked more about their collections and how ideas for projects generally come about. It is inspiring to think how each individual collection, whether an assortment of garments or a literary exhibition of novels, tells its own unique story, and I found out that in many ways the research for the project is itself a sensational journey.

After this meeting, I returned to my desk and had a quick catch-up with Mahendra, where we evaluated the YouTube transcription work, and the general progress made over the first half of this week. To finish off, I was whisked off to another meeting, this time with Wayne Boucher, a photographer who has a very big interest in beautiful stained-glass windows, and will be keeping in contact with the British Library to promote this stunning artwork.

Tiffany stained-glass window
A Tiffany stained-glass window

Day 9

In the morning, I hurriedly entered the British Library through the staff entrance, as usual, but instead of walking over to the doors of the lift, I took a sharp right turn, and walked over to the Post Room. Mahendra had previously organised for me to visit the Post Room with Peter Clarke, Service Delivery Manager, Messenger/Post Service, and today I would be having a tour of certain sections of the building that are out of bounds not only to the general public, but also to many members of staff. I was able to see the process of delivery take place, and even help with this crucial procedure, without which many of the library books that researchers and readers need would not be available. I was shown the delivery room by Keiran Duncan-Johnson, Late Team Leader LMS, Messenger/Post Service, Finance Division, and this was a huge, open space, which once more reminded me of the sheer scale of the place.

Keiran also kindly showed me round other areas of the library that I was previously unfamiliar with, such as the modern languages sector and the Alan Turing Institute, both of which are incredible departments that work tirelessly to make great leaps in their corresponding fields of study to change the world for the better.

Alan Turing institute
The Alan Turing Institute

The afternoon commenced with a meeting with the music curator, Chris Scobie. For the second time that day, I was lucky enough to visit a new area of the library that is of limited access, and Chris showed me the music reading room, and most notably, the basement. The basement is where all the music scores and manuscripts lie, and needless to say, I was incredibly excited. As we browsed through the shelves of the collections, I saw multiple familiar names of composers, such as Bach, Beethoven and Brahms, and I even got to read and touch some of Elgar’s letters to Vaughan Williams and look at his original manuscript for his Enigma Variations!  

Elgar Manuscript
A digitised version of the original Elgar manuscript for the theme of the Enigma Variations

Day 10

As I walked down the second floor corridor, I soon came to face the wooden door of the office for what seemed to be the last time. I sighed and a miserable thought came into my head, as I began to contemplate what on earth I was going to do with myself on Monday, when I was no longer going to work here. However, I soon brushed it off, and decided to make the most of my final day at the British Library.

Door to office
The door to the office of the Digital Scholarship Department

My final day consisted of making concluding touches to my numerous projects, including refining and making last minute edits to some of the transcriptions I had done. I then met Christin Hoene from the University of Kent, who was working on a project that was based on the concept of sound within novels. I was able to show her some of the work that I did on Excel with my independent research project, which can be accessed here.

At lunchtime, rather than eating in the staff canteen as usual, I decided to eat my lunch in a free reading space in the centre of the library, whilst reading my book, ‘Mother Tongue’ by Bill Bryson. What I love most about libraries is that there are so many untold stories hiding in the shelves, and I feel like I could sit comfortably in here for hours. In fact, in the space of an hour, you could travel to as many as 10 countries, should you only have the will to open a few different books and immerse yourself in their stories. As Lloyd Alexander once said, “Books can truly change our lives: the lives of those who read them, the lives of those who write them. Readers and writers alike discover things they never knew about the world and about themselves”.

Lloyd Alexander quotation
Another great Lloyd Alexander quotation

Lastly, and most importantly, I would like to say a huge thank you to everyone who has made this experience a possibility for me, especially Mahendra, who has not only been very kind and patient, but has also provided me with so many wonderful opportunities and has helped me hugely with a multitude of different things. I have loved books from a young age, and to be surrounded by so many was in itself very special, but to be able to work in the library and help the Digital Scholarship Department was just incredible. My experience here has taught me many valuable things, for which I am eternally grateful.

The same way I would never judge a book by its front cover, I will not judge a building by its name, for the British Library is infinitely more than just a residence for books. It is a museum in which there are many exhibitions, it is a research centre, and most importantly, it is an institution that stores the world’s knowledge behind its brick walls.

The-British-Library
The British Library

Inspiration can really come from absolutely anywhere, and from something small you can make something incredibly vast. It makes you think what you could do and what a difference it could make, if only you just choose to try. Inevitably, in life, you have to take risks, but more often than not, lots of these are worth taking in an attempt to brighten and bring artistic colour as well as creativity to the world. In the words of Stephen King, “books are a uniquely portable magic”, something which certainly rings true within the walls of this institution, where so many items are kept and so many new ones are constantly being acquired and discovered.

So, I send a big thank you to the British Library and all who work here, for turning what was essentially a childhood dream into a reality. This will truly be a chapter of my life that I will always remember.

Nadya Miryanova
