16 December 2015
BL Labs Awards (2015): Creative/Artistic category Award winning project
The winners of the British Library Labs Awards were announced at the British Library Labs Symposium, held on Monday 2nd November 2015, at the British Library. The Awards were launched in 2015 by the British Library Labs team in order to formally recognise outstanding and innovative work that has been created using the British Library’s digital collections and content.
This year, the Awards honoured projects within three key categories: Research, Creative/Artistic and Entrepreneurship. The winner of the Creative/Artistic Award (2015) was “The Order of Things” by Mario Klingemann.
The project involves the use of semi-automated image classification and machine learning techniques in order to add meaningful tags to the images and create thematic collections of the British Library’s one million Flickr Commons images.
Below, Mario’s guest blog discusses the award-winning project for us:
The advent of over one million images from the British Library added by the Mechanical Curator on the Flickr Commons opened up a treasure trove of imagery for creative use. But it has come with one problem: initially the only way to find interesting images was to browse the collection either randomly or linearly since the only useful metadata available was the title of the book, the publishing year and the author. This motivated me to begin using semi-automated image classification and machine learning techniques in order to add meaningful tags to the images and to start the creation of thematic collections.
Over the course of a year I have tagged tens of thousands of images in the British Library Flickr collection, and I have also added many collections to the "Lost Visions" project by Cardiff University, which also uses British Library material.
Furthermore, I have created various artworks with the images I have found and tagged on the Flickr Commons. You can view some of my work here:
Through my project, I wanted to investigate what kinds of material are contained in the Mechanical Curator’s selection and to what extent image classification and machine learning techniques can add useful metadata to such a huge, unsorted collection of images.
The biggest challenge is the sheer amount of data – one million doesn't sound like much these days, but if you do not have access to academic computer resources or a sponsor of computation time, then even seconds of calculation time can add up to weeks or months.
In order to address this, I have used a hybrid approach that mixes automatic classification with manual confirmation. At the base is a 128-dimensional feature vector that is calculated for each of the images in the collection. The classifier calculates histogram statistics as well as Haralick texture features over various representations of an image, turning various aspects of the image into numbers: colours, contrast, edges, structure and information content.
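As a rough illustration of what such a feature vector can look like, here is a minimal Python sketch. It is not the code used in the project: the function names are hypothetical, the image is assumed to be a small greyscale grid given as a list of rows of integers, and it produces only five numbers (three histogram statistics plus two simplified Haralick-style texture measures, contrast and energy, taken from a co-occurrence count of horizontally adjacent pixels) rather than the 128 dimensions described above.

```python
import math
from collections import Counter


def histogram_stats(pixels):
    """Mean, standard deviation and Shannon entropy (in bits) of the grey levels."""
    flat = [p for row in pixels for p in row]
    n = len(flat)
    mean = sum(flat) / n
    variance = sum((p - mean) ** 2 for p in flat) / n
    counts = Counter(flat)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return [mean, math.sqrt(variance), entropy]


def texture_features(pixels):
    """Contrast and energy from co-occurrence counts of horizontal neighbours."""
    pairs = Counter()
    for row in pixels:
        for left, right in zip(row, row[1:]):
            pairs[(left, right)] += 1
    total = sum(pairs.values())
    if total == 0:  # image is only one pixel wide, so there are no neighbour pairs
        return [0.0, 0.0]
    contrast = sum((c / total) * (a - b) ** 2 for (a, b), c in pairs.items())
    energy = sum((c / total) ** 2 for c in pairs.values())
    return [contrast, energy]


def feature_vector(pixels):
    """Concatenate the histogram and texture numbers into one vector."""
    return histogram_stats(pixels) + texture_features(pixels)


# Hypothetical 2x2 test images: a flat grey patch and a tiny checkerboard.
flat_patch = [[5, 5], [5, 5]]
checkerboard = [[0, 1], [1, 0]]
print(feature_vector(flat_patch))
print(feature_vector(checkerboard))
```

A uniform patch has no spread, no entropy and no contrast, while the checkerboard scores higher on all three, which is the kind of difference a classifier can then pick up on.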
Using the feature vectors, the classifier calculates distances and similarities between images, which allows it either to cluster them by various aspects or to find the nearest neighbours of a particular image. With the help of visual tools that I have produced specifically for this purpose, I can then very quickly, and in a playful way, create thematic collections or find images that fit into certain collections or share a similar style.
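The nearest-neighbour idea can be sketched in a few lines of Python. This is an illustration under assumptions rather than the project's actual tooling: the post does not say which distance measure is used, so plain Euclidean distance is assumed here, and the image identifiers and two-dimensional vectors are made up for the example.

```python
import math


def euclidean(u, v):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbours(query_id, vectors, k=3):
    """Return the ids of the k images whose vectors lie closest to the query image."""
    query = vectors[query_id]
    ranked = sorted(
        (euclidean(query, vec), image_id)
        for image_id, vec in vectors.items()
        if image_id != query_id  # do not return the query image itself
    )
    return [image_id for _, image_id in ranked[:k]]


# Hypothetical feature vectors keyed by made-up image ids.
vectors = {
    "portrait_a": [0.0, 0.0],
    "portrait_b": [0.1, 0.0],
    "map_a": [5.0, 5.0],
}
print(nearest_neighbours("portrait_a", vectors, k=1))
```

In practice the individual features would usually be rescaled to comparable ranges first, so that no single feature dominates the distance. A brute-force search like this compares the query against every other image, which is simple but becomes slow for a collection of a million images.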
My classification approach works well in some cases but not in others. For example, it can distinguish a portrait from a map, but only in rare cases can it say whether a picture depicts a man or a woman. So in the next phase of my work I am planning to apply deep learning techniques to the same material in order to extract more detailed metadata for certain classes of images.
Over the course of my project and research, I have posted about my progress on Twitter, Flickr, Tumblr and Facebook, where I have received very positive feedback and probably got many more people interested in exploring the British Library collections themselves. I've also talked about my research and art practice at various conferences, including FITC, codemotion, Reasons to be Creative and Eyeo. In 2014, I was invited by the Museum of Modern Art (MoMA), New York to give a talk at "Archives as Instigator" about my work with the British Library archives, which was followed by a one-day workshop.
You can find out more about Mario Klingemann and his projects online: https://www.rebelmouse.com/quasimondo/ ; https://twitter.com/quasimondo ; https://www.flickr.com/photos/quasimondo ; http://mario-klingemann.tumblr.com/