DHBN 2024 - Digital Humanities in the Nordic and Baltic Countries Conference Report
This is a joint blog post by Helena Byrne, Curator of Web Archives, Harry Lloyd, Research Software Engineer, and Rossitza Atanassova, Digital Curator.
This year’s Digital Humanities in the Nordic and Baltic countries conference took place at the University of Iceland School of Education in Reykjavik. It was the eight conference which was established in 2016, but the first time it was held in Iceland. The theme for the conference was “From Experimentation to Experience: Lessons Learned from the Intersections between Digital Humanities and Cultural Heritage”. There were pre-conference workshops from May 27-29 with the main conference starting on the afternoon of May 29 and finishing on May 31. In her excellent opening keynote Sally Chambers, Head of Research Infrastructure Services at the British Library, discussed the complex research and innovation data space for cultural heritage. Three British Library colleagues report highlights of their conference experience in this blog post.
Helena Byrne, Curator of Web Archives, Contemporary British & Irish Publications.
I presented in the Born Digital session held on May 28. There were four presentations in this session and three were related to web archiving and one related to Twitter (X) data. I co-presented ‘Understanding the Challenges for the Use of Web Archives in Academic Research’. This presentation examined the challenges for the use of web archives in academic research through a synthesis of the findings from two research studies that were published through the WARCnet research network. There was lots of discussion after the presentation on how web archives could be used as a research data management tool to help manage online citations in academic publications.
The conference programme was very strong and there were many takeaways that relate to my role. One strong theme was ‘collections as data’. At the UK Web Archive we have just started to publish some of our inactive curated collections as data. So these discussions were very useful. One highlight was the ‘Panel: Publication and reuse of digital collections: A GLAM Labs approach’. What stood out for me in this session was the checklist for publishing collections as data. It was very reassuring to see that we had pretty much everything covered for the release of the UK Web Archive datasets.
Rossitza and I were kindly offered a tour of the National and University Library of Iceland by Kristinn Sigurðsson, Head of Digital Projects and Development. We enjoyed meeting curatorial staff from the Special Collections who showed us some of the historical maps of Iceland that have been digitised. We also visited the digitisation studio to see how they process periodicals, and spoke to staff involved with web archiving. Thank you to Kristinn and his colleagues for this opportunity to learn about the library’s collections and digital services.
Harry Lloyd, Research Software Engineer, Digital Research.
DHNB2024 was a rich conference from my perspective as a research software engineer. Sally Chambers’ opening keynote on Wednesday afternoon demonstrated an extraordinary grasp of the landscape of digital cultural heritage across the EU. By this point there had already been a day and a half of workshops, including a session Rossitza and I presented on Catalogues as Data.
I spent the first half using a Jupyter notebook to explain how we extracted entries from an OCR’d version of the catalogue of the British Library’s collection of 15th century books. We used an explainable algorithm rather than a ‘black-box’ machine learning one, so we walked through the steps involved and discussed where it worked well and where it could be improved. You can follow along by clicking the ‘launch notebook’ button in the ReadMe here.
Handing over to Rossitza in the second half to discuss her corpus linguistic analysis worked really well by giving attendees a feel for the complete workflow. This really showed in some great conversations we had with attendees over the following days about tricky problems like where to store the ‘true’ results of OCR.
A few highlights from the rest of the conference were Clelia LaMonica’s work using Latin large language model to analyse kinship in texts from Medieval Burgundy. Large language models trained on historic texts are important as the majority are trained on modern material and struggle with historical language. Jørgen Burchardt presented some refreshingly quantitative work on bias across a digitised newspaper collection, very reminiscent of work by Kaspar Beelen. Overall it was a productive few days, and I very much enjoyed my time in Reykjavik.
Rossitza Atanassova, Digital Curator, Digital Research.
This was my second DHNB conference and I was looking forward to reconnecting with the community of researchers and cultural heritage practitioners, some of whom I had met at DHNB2019 in Copenhagen. Apart from the informal discussions with attendees, I contributed to DHNB2024 in two main ways.
As already mentioned, Harry and I delivered a pre-conference workshop showcasing some processes and methodology we use for working with printed catalogues as data. In the session we used the corpus tool AntConc to perform computational analysis of the descriptions for the British Library’s collection of books published in the 15th century. You can find out more about the project here and reuse the workshop materials published on Zenodo here.
I also joined the pre-conference meeting of the international GLAM Labs Community held at the National and University Library of Iceland. This was the first in-person meeting of the community in five years and was a productive session during which we brainstormed ‘100 ideas for the GLAM Labs Community’. Afterwards we had a sneak peak of the archive of the National Theatre of Iceland which is being catalogued and digitised.
The DHNB community is so welcoming and supportive, and attracts many early career digital humanists. I was particularly interested to hear from doctoral students researching the use of AI with digitised archives, and using NLP methods with historical collections. One of the projects that stood out for me was Johannes Widegren’s PhD research into the ethical use of AI to enable access and discovery of Sami cultural heritage, and to develop library and archival practice.
I was also interested in presentations that discussed workflows for creating Named Entity Recognition resources for historical archives and I plan to try out the open-source Label Studio tool that I learned about. And of course, the poster session is always a highlight and I enjoyed finding out about a range of projects, including computational analysis of Scandinavian runic-texts, digital reconstruction of Gothenburg’s 1923 Jubilee exhibition, and training large language models to track semantic variation in climate change vocabulary in Danish news articles.
We are grateful to all DHNB24 organisers for the warm welcome and a great conference experience, with special thanks to the inspirational and indefatigable Olga Holownia!