A Geographer’s Initiation Into Digital Humanities: Part 1
A post by Dr Huw Rowlands on his Coleridge Fellowship 2025, 'Cross-cultural Encounters in the Survey of India in the Mid-nineteenth Century'.
To begin at the beginning.
My 2021 doctoral thesis focused on cross-cultural encounters in Aotearoa – New Zealand. I started with an overview of the 18th-century voyage of the Endeavour, led by James Cook, to Te Moana nui a Kiwa – the Pacific Ocean. I went on to examine the histories that continue to be created about those encounters in official reports, academic research, museum exhibitions, and documentary film.
Since then, I have been working with the many thousands of maps produced by the Survey of India held in the India Office Records (IOR) Map Collection. I soon became aware of the virtual invisibility of work by the Indian, Burmese and other staff on the maps themselves. With this tucked away at the back of my mind, I have followed my curiosity about digital humanities in British Library and other seminars and workshops, and actively followed the Library’s work on its Race Equality Action Plan. When I came across three series of printed annual reports produced by Survey of India Survey Parties, which listed all survey staff, including those they called ‘Native Surveyors’, these strands quickly came together in my mind and eventually led to my Coleridge Fellowship proposal. The Coleridge Fellowship offers British Library staff the opportunity to pursue a piece of original research and further understanding of the Library’s collections. It was established in 2017 through the generosity of Professor Heather Jackson and her late husband Professor J.R. de J. Jackson, and is named after Samuel Taylor Coleridge (1772-1834).
My aims with the Fellowship are to show the opportunities in the IOR Map Collection to identify a range of individuals involved in mapping what is called in the reports ‘British India’, to learn and demonstrate how data can be extracted and managed, and to reveal its potential in understanding cross-cultural relationships in this context.
Black Boxes
With great support from the Library’s Digital Research and Heritage Made Digital teams among others, particularly Harry Lloyd, Mia Ridge, and Valentina Vavassori, I drew up a plan for the project. The first step was to evaluate the series of reports and choose one set. The next stages are focused on digital methods: first to acquire and verify digital images of the chosen reports, then to use OCR (Optical Character Recognition) to create text files, next to extract and structure the information I need from them, and lastly to visualise that information to create a foundation to help answer my research questions. Each of these stages looked to me like a black box – something clear and present but whose internal workings are a bit of a mystery. At an early planning meeting with the team, we started to explore each black box stage. Black boxes were unpacked onto three white boards: Inputs/Sources, Process, and Results. These initial sketches have become the foundations of my detailed research plan for the digital stages of the project.
Potentially hidden away in or between each black box were what Mia called ‘magic elves’, imaginary creatures who undertake essential but unresourced tasks such as converting information from one form to another. We unpacked the boxes and set out a series of smaller steps, banishing numerous phantom elves.
My work is currently focused on learning the skills needed to achieve each smaller step. I have been getting to grips with the OCR application Transkribus, ably guided by Valentina. Crucial to making the most of such tools is referring forwards to the next digital stage and its own tools, as well as backwards to my research questions. In doing so, the image of a series of discrete black boxes has now given way to a relay race, passing a baton of information on from one stage to the next. The way I use one tool can make the transition onto the next easier or harder. So, while I remain firmly focused on Transkribus, Harry has been guiding me through the stage that follows, so that the data baton can be passed on as smoothly as possible.
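For readers who think in code, the relay race can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the project's actual workflow: the Baton structure, the function names, and the "report_A" identifier are all hypothetical, and each stage is a placeholder for work that really happens in tools such as Transkribus.

```python
from dataclasses import dataclass, field


@dataclass
class Baton:
    """Information handed from one stage to the next (hypothetical structure)."""
    source: str                                   # identifier for one report
    payload: object = None                        # output of the previous stage
    history: list = field(default_factory=list)   # names of stages already run


def acquire_images(baton):
    # Placeholder: stands in for acquiring and verifying page images.
    pages = [f"{baton.source}_page_{n}.jpg" for n in (1, 2)]
    return Baton(baton.source, pages, baton.history + ["acquire_images"])


def run_ocr(baton):
    # Placeholder: real text recognition would happen in an OCR tool.
    texts = [f"text of {page}" for page in baton.payload]
    return Baton(baton.source, texts, baton.history + ["run_ocr"])


def extract_data(baton):
    # Placeholder: turn each page of text into a structured record.
    records = [{"page": i + 1, "text": t} for i, t in enumerate(baton.payload)]
    return Baton(baton.source, records, baton.history + ["extract_data"])


def visualise(baton):
    # Placeholder: a one-line summary stands in for a chart or map.
    summary = f"{len(baton.payload)} records from {baton.source}"
    return Baton(baton.source, summary, baton.history + ["visualise"])


def run_pipeline(source, stages):
    """Pass the baton through each stage in order."""
    baton = Baton(source)
    for stage in stages:
        baton = stage(baton)
    return baton


STAGES = [acquire_images, run_ocr, extract_data, visualise]
result = run_pipeline("report_A", STAGES)
print(result.history)
print(result.payload)
```

The point of the sketch is that every stage receives exactly what the previous one produced, so a choice made early on (for example, how the text is divided up during OCR) shapes what the later stages have to work with.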
As well as relying on some unsophisticated metaphors, my vocabulary has been changing, with some new words and some old words taking on different or more specific meanings. Regions and tags are two from Transkribus. Regions are a way of segregating areas of the original image so that Transkribus organises the text into separate sections. I have been using the pre-existing Heading and Marginalia, for example, and have added a new Region, Credit, where staff are credited with work undertaken during the year. Using regions should help the data extraction stage by enabling me to focus on areas of text where the data most useful for my research questions is to be found. Tags label individual words or phrases as entities such as People, Places and Organisations. ‘Tag’ is a short word but using tags involves a careful examination of what I need to tag and why, as well as consideration of each tag’s attributes. Transkribus’ default Person tag, for example, includes the Attributes First Name, Last name and dates of Birth and Death. To track promotion over time, I have added a new attribute – Title. Tagging is an intriguing, interpretive process and I expect to have more to say about it later in the project.
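To make regions and tags a little more concrete, here is a minimal Python sketch of how they might look once the text has left Transkribus. It does not reproduce Transkribus's own export format: the Region and Tag classes, the attribute keys, and the sample name and title are all invented placeholders for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """A labelled word or phrase, with optional attributes."""
    kind: str                                        # e.g. "Person", "Place"
    text: str                                        # the tagged words
    attributes: dict = field(default_factory=dict)   # e.g. name parts, title


@dataclass
class Region:
    """A section of a page, such as Heading, Marginalia or Credit."""
    kind: str
    text: str
    tags: list = field(default_factory=list)


def people_in_regions(regions, region_kind="Credit"):
    """Return the Person tags found in regions of one kind only."""
    return [
        tag
        for region in regions
        if region.kind == region_kind
        for tag in region.tags
        if tag.kind == "Person"
    ]


# Invented sample data: the name and title below are placeholders,
# not taken from any report.
page = [
    Region("Heading", "Annual report heading"),
    Region(
        "Credit",
        "Work credited to A. Example",
        tags=[
            Tag(
                "Person",
                "A. Example",
                {"first_name": "A.", "last_name": "Example",
                 "title": "Example Title"},
            )
        ],
    ),
    Region(
        "Marginalia",
        "Note mentioning B. Sample",
        tags=[Tag("Person", "B. Sample")],
    ),
]

credited = people_in_regions(page)
for person in credited:
    print(person.text, person.attributes.get("title"))
```

Filtering by region first is what lets the extraction stage ignore people mentioned in passing elsewhere on the page, and keeping the title as an attribute is what would allow the same person's entries to be compared from one year's report to the next.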
As I move onto the data extraction stage, I will no doubt be acquiring and understanding more vocabulary. I have so far spotted entities, triples, NLP, Python, LLM, and NER, to name a few. I also expect to need a new metaphor or two.
Dr Huw Rowlands
British Library Coleridge Fellow 2025
Processing Coordinator and Cataloguer
India Office Records Map Project