30 April 2013
Biohumanities Symposium at University of Maryland
An excellent and exciting symposium took place just over a fortnight ago at the Maryland Institute for Technology in the Humanities (MITH): Shared Horizons: Data, Biomedicine and the Digital Humanities, 10-12 April 2013. The Twitter hashtag #DHbio succinctly signifies its outlook.
It felt like a special moment on the timeline of interdisciplinary research, and as its title hints it brought together active researchers from digital humanities and art technology centres, biochemistry laboratories, bioinformatics computational units, complex science and visualisation institutes, and notably medical and national libraries.
The conference was strengthened by the presence of senior representatives from research councils in medicine, arts and humanities and biology in both the UK and the USA. A sense of occasion was further heightened by the reception at the Deputy’s Residence of the UK Embassy in Washington DC. The British Deputy Head of Mission, Mr Philip Barton, welcomed us by noting that one of the two key priorities for him and his staff based in the USA is the promotion of collaboration between US and UK researchers. The reception was supported by Research Councils UK.
The University of Maryland was certainly a suitable venue. Not only does MITH bring technology and humanities together with a ground-breaking verve unsurpassed anywhere, but the university is also the base of Kari Kraus, who has written one of the definitive papers on the interface between biology and humanities, Conjectural Criticism: Computing Past and Future Texts, and who adopts the term ‘biohumanities’ in her writings. It is one of those papers that rewards the reader who returns to it from time to time.
One way to characterise the symposium is simply to note the first and the last of the talks. The first was by a historian, Tom Ewing, who has been employing segmentation and network analyses to understand the relationship between newspaper reporting on influenza and the spread of the virus itself. The last was by a mathematical physicist, Simon DeDeo, who highlighted coarse-graining, renormalisation and semantic analysis in identifying the key junctures in the history of verdicts at the Old Bailey (the Central Criminal Court of England and Wales) over the centuries, a possible route to assessing judicial procedures.
The phrase ‘interdisciplinary research’ points to an aspiration which is often expressed but less commonly attained. It was not used as much as you might expect at the symposium, probably because the significance of the conjunction of biology and humanities was already well understood by the people attending. Instead of bemusement and hesitation there was a beaming sense of anticipation.
Sometimes when technology and humanities are brought together in the same sentence it reflects a desire to apply scientific technologies and computer techniques for the benefit of humanities scholarship. This is an important and longstanding goal. John Unsworth of Brandeis University in his opening address reminded us just how long humanities researchers and historians have been using computers: since the 1940s with the Jesuit scholar Roberto Busa (see for example a paper by Willard McCarty).
But what made this meeting different from my perspective was the opportunity it presented to explore the intellectual interface of science and humanities. The use of technology in a new context often produces novel and overlapping understanding in the course of the research, but some research goes further and actively aims to find shared and parallel concepts: shared horizons, indeed.
One person whose research has certainly done so was the keynote speaker: David B. Searls of the University of Pennsylvania.
The title of his talk was “With a Wild Surmise: Intimations of Computational Biology in Keats, Carroll, and Joyce”.
It began with reference to the poet John Keats, making a comparison between, on the one hand, Romanticism and its reaction to the rationalism of the Enlightenment, and, on the other hand, some of the changes occurring in science with the emergence of very large datasets, systems biology and computational modelling. It noted an increasing interest in understanding whole systems and complexity, combined with a broad exploration of information and less emphasis on specific hypotheses. Of course, scientists have always been drawn to feasible observations and available data. In the words of the late British immunologist Sir Peter Medawar, science is the art of the soluble; but it is clear that some careful statistical thought needs to be given to the way that big datasets are being compiled and widely used. (A brief presentation by Sabina Leonelli of the University of Exeter contemplated a similar philosophical theme.)
A riveting portion of the talk mapped the extraordinary demonstrations of styles of writing and esoteric sublanguage by the author James Joyce to the challenges found in modern natural language processing and text mining.
The part of the talk that had the greatest resonance in drawing comparisons with biology was that which addressed Lewis Carroll. It is closest to the bioinformatic research that Professor Searls has conducted over many years. A very enjoyable and informative paper is entitled: “From Jabberwocky to Genome: Lewis Carroll and Computational Biology”.
Images of Alice from British Library collection items: a Shower of Cards and a manuscript "Alice's Adventures Under Ground"
This paper looks at the work of the mathematician Charles Lutwidge Dodgson, otherwise known as Lewis Carroll, the writer of Alice’s Adventures in Wonderland (1865) and Through the Looking-Glass, and What Alice Found There (1871). In reflecting on Carroll’s delight in wordplay and combinatorial nonsense, Searls playfully shows some parallels with bioinformatic and evolutionary concepts such as mutation, recombination, the occurrence of indels (insertions and deletions), segmentation and even sequence alignment.
Examples include the derivation of the word SLITHY in the poem Jabberwocky from a recombination of the words SLIMY and LITHE; likewise CHORTLE, another of Carroll’s neologisms, is obtained by recombining CHUCKLE and SNORT. By contrast the word BRILLIG (said by Humpty Dumpty to mean four o’clock in the afternoon, when broiling things for dinner begins) might be seen as a mutational product of BRoIL[L]InG. Searls observes that Carroll’s puzzles known as Syzygies seem to have “anticipated key elements of multiple alignment, minimum distance alignment, and local alignment that are now central to biological sequence analysis”.
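For readers who would like to see what "minimum distance" means in practice, here is a minimal sketch in Python. It is my own illustration rather than anything taken from the Searls paper: it computes the standard edit (Levenshtein) distance between two words, counting each substitution, insertion or deletion as one step. The function name edit_distance and the unit costs are assumptions made for the example; real sequence alignment tools use more elaborate scoring schemes.

```python
def edit_distance(a, b):
    """Minimum number of single-letter substitutions, insertions and
    deletions (indels) needed to turn word a into word b."""
    # prev[j] holds the distance between the letters of a read so far
    # and the first j letters of b; start with the empty prefix of a.
    prev = list(range(len(b) + 1))
    for i, letter_a in enumerate(a, start=1):
        curr = [i]  # turning i letters of a into nothing takes i deletions
        for j, letter_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if letter_a == letter_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    for pair in [("BRILLIG", "BROILING"), ("SLIMY", "SLITHY")]:
        print(pair, edit_distance(*pair))
```

On these unit costs BRILLIG sits three steps from BROILING (one indel and two substitutions), and SLIMY is two steps from SLITHY.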
These two diagrams are from the article published by David B. Searls in the Journal of Computational Biology: From Jabberwocky to Genome: Lewis Carroll and Computational Biology
As with nearly all play there is a serious side to these explorations that is made plain in other papers by Searls, with titles such as “The linguistics of DNA”, “The language of genes” and “A primer in macromolecular linguistics”.
Molecular biology has absorbed many linguistic metaphors: genetic code, gene expression, reading frames, transcription of DNA into RNA, translation of RNA into proteins, and even the editing of RNA by some enzymes. Beyond noting these, Searls and colleagues have argued for many years that a number of bioinformatic techniques are akin to those in linguistics, and that understanding of the genome and biological sequences can be enhanced through careful examination of linguistic formalisms and methods.
The approach is proving useful in understanding structural aspects of nucleic acids (DNA and RNA) as well as the proteins that are produced, and has been extended to the identification and understanding of genes themselves as informational entities. The Chomsky hierarchy of language has played a prominent role in the approaches outlined, whereby the syntactic nature of a sentence may be depicted as a branching structure (actually, a kind of tree visualisation; see earlier blog posts on 3D trees and phylogenetic visualisation). (Noam Chomsky is also known for the theory of a universal grammar, which is the subject of debate; but no matter where one stands on that topic, the Chomsky hierarchy is a beautifully effective way of elucidating and visualising aspects of language, as illustrated in "The linguistics of DNA", American Scientist 80(6): 579-591, 1992, available from JSTOR.)
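To give a flavour of how a grammar can describe nucleic acid structure, here is a toy sketch in Python. It is a simplified illustration of my own, not a method from the papers cited: it applies a tiny context-free grammar for an idealised RNA hairpin, in which complementary bases pair from the two ends inward around an unpaired loop. The grammar rules, the function name fold_hairpin and the minimum loop size of three bases are all assumptions chosen for the example. Nested pairings of this kind are the sort of dependency that a regular grammar cannot capture, which is why the context-free level of the hierarchy is needed.

```python
# Toy grammar for an idealised hairpin (illustrative assumption only):
#   S -> A S U | U S A | G S C | C S G | loop
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def fold_hairpin(seq, min_loop=3):
    """Return a dot-bracket string: '(' and ')' mark paired bases in the
    stem, '.' marks unpaired bases in the loop."""
    seq = seq.upper()
    structure = ["."] * len(seq)
    left, right = 0, len(seq) - 1
    # Apply a pairing rule while the two ends are complementary and at
    # least min_loop bases would remain between them for the loop.
    while right - left - 1 >= min_loop and (seq[left], seq[right]) in PAIRS:
        structure[left], structure[right] = "(", ")"
        left += 1
        right -= 1
    return "".join(structure)


if __name__ == "__main__":
    print(fold_hairpin("GGGAAAUCCC"))
```

Each matched pair of brackets corresponds to one application of a pairing rule, so the nesting of the brackets mirrors the branching derivation of the grammar.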
A new avenue was brought to the bioinformatic way of thinking by a presentation at Shared Horizons on the study of metre in Urdu poetry through sequential analysis, a collaboration between an Urdu expert (Sean Pue) and a researcher of microbial ecology (Tracy Teal) and their genomics colleague (Titus Brown).
The UK was well represented at the symposium. Not only was I able to give an invited talk on behalf of the Department of Digital Scholarship at the British Library, but delegates also included academics from the University of Cambridge, the University of Exeter, the University of Manchester and Imperial College London.
Andrew Prescott, Head of Digital Humanities at King’s College London (and former Curator of Manuscripts at the British Library), did some quick and proactive thinking, gathering together three prominent speakers from the UK for an impromptu, brief but richly useful session, including a presentation by Christopher Howe of University of Cambridge on the use of phylogenetic methods to study manuscripts.
This blog post has barely touched on the symposium and its outcomes. I hope to write more another time. In the meantime some of the presentations can be found on the Shared Horizons website.
Many thanks to Professor Neil Fraistat, Director of MITH, and his colleagues, especially Jennifer Guiliano and Trevor Muñoz.
Jeremy Leighton John, @emsscurator