The Arts and Sciences (BASc) department at University College London has been at the forefront of pioneering a renascence of liberal arts and sciences degrees in the UK. As part of its Core modules offering, students select an interdisciplinary elective in Year 2 of their academic programme – from a range of modules specially designed for the department by University College London academics and researchers.
When creating my own module – Information Through the Ages (BASC0033) – as part of this elective set, I was keen to ensure that the student learning experience was both supported and developed in tandem with professional practices and standards. Enabling students to take the skills developed on the module beyond its own assignments would, I knew, not only aid them in their own unique academic degree programmes but also provide substantial evidence to future employers of their employability and skills base. Partnering with the British Library to design a data science and data curation project as part of the module’s core assignments therefore seemed to me an excellent opportunity to provide both a research-based educative framework for students and a valuable chance for them to engage in a real-world collaboration. Providing students with external industry partners to collaborate with can give an important fillip to their motivation and to the learning experience overall, as they see their assessed work move beyond the confines of the academy to have an impact out in the wider world.
Through discussions with my British Library co-collaborators, Mahendra Mahey and Stella Wisdom, we alighted on the Microsoft Books/BL 19th Century collection dataset as providing excellent potential for student groups to work with for their data curation projects. With its 60,000 public domain volumes, associated metadata and 1 million+ extracted images, it presented as exciting, undiscovered territory across which our student groups might roam and rove, with the results of their work having the potential to benefit future British Library researchers.
We therefore felt that structuring the group project around wrangling a subset of this data (discovering, researching, cleaning and refining it, with each group’s output a curated version of the original dataset) presented a number of significant benefits. Students were able to explore and develop skills such as data curation, software knowledge, archival research, report writing, project development and collaborative working practices, alongside a real-world digital scholarship learning experience – with the outcomes in turn supporting the British Library’s Digital Scholarship remit of enabling innovative research based on the British Library digital collections.
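To give a flavour of what curating such a subset can involve, here is a minimal Python sketch of a clean-then-select step. It is an illustration only, not the students' workflow (their training used OpenRefine). The field names `title` and `place`, the keyword list and the sample records are hypothetical, and do not reflect the actual British Library metadata schema.

```python
import re

# Hypothetical records: the field names and the messy values are invented
# for illustration and are not taken from the real dataset.
records = [
    {"title": "  Travels in France and Italy,  in 1817 and 1818 ", "place": "new york"},
    {"title": "Four Months in Persia, and a Visit to Trans-Caspia", "place": "London"},
    {"title": "A Sermon on Charity", "place": " london"},
]

# Assumed keyword list for spotting travel-related titles.
TRAVEL_KEYWORDS = ("travels", "visit", "voyage", "journey")


def clean(value):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", value).strip()


def curate(rows):
    """Clean each record, then keep those whose title suggests travel writing."""
    curated = []
    for row in rows:
        title = clean(row["title"])
        place = clean(row["place"]).title()  # normalise capitalisation
        if any(word in title.lower() for word in TRAVEL_KEYWORDS):
            curated.append({"title": title, "place": place})
    return curated


subset = curate(records)
for row in subset:
    print(row["place"], "|", row["title"])
```

A simple keyword match like this will miss relevant titles and let in irrelevant ones, which is one reason the research around each subset mattered as much as the cleaning itself.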
Students observed that “working with the data did give me more practical insight to the field of work involved with digitisation work, and it was an enriching experience”, including how they “appreciated how involved and hands-on the projects were, as this is something that I particularly enjoy”. Data curation training was provided on site at the British Library, with the session focused on the use of OpenRefine, “a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.”[1] Student feedback also told us that we could have provided further software training, and more guided dataset exploration/navigation resources, with groups keen to learn more nuanced data curation techniques – something we will aim to respond to in future iterations of the module – but overall, as one student succinctly noted, “I had no idea of the digitalization process and I learned a lot about data science. The training was very useful and I acquired new skills about data cleaning.”
Overall, we had five student groups wrangling the BL 19th Century collection, producing final data subsets in the following areas: Christian and Christian-related texts; Queens of Britain 1510-1946; female authors, 1800-1900 (here's a heatmap this student group produced of the spread of published titles by female authors in the 19th century); Shakespearean works, other authors’ adaptations of those works, and any commentary on Shakespeare or his writing; and travel-related books.
In particular, it was excellent to see students fully engaging with the research process around their chosen data subset – exploring its cultural and institutional contexts, as well as navigating metadata/data schemas, requirements and standards.
For example, the Christian texts group considered the issue of different languages as part of their data subset of texts, following this up with textual content analysis to enable accurate record querying and selection. In their project report they noted that “[u]sing our dataset and visualisations as aids, we hope that researchers studying the Bible and Christianity can discover insights into the geographical and temporal spread of Christian-related texts. Furthermore, we hope that they can also glean new information regarding the people behind the translations of Bibles as well as those who wrote about Christianity.”
Similarly, the student group focused on travel-related texts noted in their team project summary that “[t]he particular value of this curated dataset is that future researchers may be able to use it in the analysis of international points of view. In these works, many cities and nations are being written about from an outside perspective. This perspective is one that can be valuable in understanding historical relations and frames of reference between groups around the world: for instance, the work “Travels in France and Italy, in 1817 and 1818”, published in New York, likely provides an American perspective of Europe, while “Four Months in Persia, and a Visit to Trans-Caspia”, published in London, might detail an extended visit of a European in Persia, both revealing unique perspectives about different groups of people. A comparable work, that may have utilized or benefitted from such a collection, is Hahner’s (1998) “Women Through Women’s Eyes: Latin American Women in Nineteenth Century Travel Accounts.” In it, Hahner explores nineteenth century literature written to unearth the perspectives on Latin American women, specifically noting that the primarily European author’s writings should be understood in the context of their Eurocentric view, entrenched in “patriarchy” and “colonialism” (Hahner, 1998:21). Authors and researchers with a similar intent may use [our] curated British Library dataset comparably – that is, to locate such works.”
Data visualisation by travel books group
Over the ten weeks of the module, alongside their group data curation projects, students covered lecture topics as varied as “Is a Star a Document?”, “‘Truthiness’ and Truth in a Post-Truth World”, “Organising Information: Classification, Taxonomies and Beyond!”, and “Information & Power”. They also worked on an individual archival GIF project, which drew on an institutional archival collection to create (and publish on social media) an animated GIF. In classroom discussions they considered questions such as: What happens when information is used for dis-informing or mis-informing purposes? How do the technologies available to us in the 21st century potentially impact on the (data) collection process and its outputs and outcomes? How might ideas about collections and collecting be transformed in a digital context? What exactly do we mean by the concepts of Data and Information? And, given that choosing how to classify or group something first requires a series of "rules" or instructions which determine the grouping process, who decides what the rules are, and how might such decisions in fact influence our very understandings of the information the system is supposedly designed to facilitate access to?
These dialogues were all situated within the context of both "traditional" collections systems and atypical sites of information storage and collection. The module aimed to enable students to gain an in-depth knowledge, understanding and critical appreciation of the concept of information, from historical antecedents to digital scientific and cultural heritage forms, in the context of libraries, archives, galleries and museums (including alternative, atypical and emergent sources), and of how technological, social, cultural and other changes fundamentally affect our concept of “information.”
“I think this module was particularly helpful in making me look at things in an interdisciplinary light”, one student observed in module evaluation feedback, with others going on to note that “I think the different formats of work we had to do was engaging and made the coursework much more interesting than just papers or just a project … the collaboration with the British Library deeply enriched the experience by providing a direct and visible outlet for any energies expended on the module. It made the material seem more applicable and the coursework more enjoyable … I loved that this module offered different ways of assessment. Having papers, projects, presentations, and creative multimedia work made this course engaging.”
Situating the module’s assessments within such contexts, I hope, encouraged students to understand the critical, interdisciplinary focus of the field of information studies, in particular the use of information in the context of empire-making and consolidation, and how histories of information, knowledge and power intersect. The module also took a collaborative, interdisciplinary approach to curriculum design, which encouraged and supported students to gain technical abilities and navigate teamwork practices. Together, we hope, these elements can point some useful ways forward in creating and developing engaging learning experiences that have real-world impact.
This blog post is by Sara Wingate-Gray (UCL Senior Teaching Fellow & BASC0033 module leader), Mahendra Mahey (BL Labs Manager) and Stella Wisdom (BL Digital Curator for Contemporary British Collections).