Mining the FT

We're pleased to announce a partnership with the Financial Times to open up its archives to new kinds of research. The daily business newspaper has been running since 1888, and has a wealth of information on national and international economic news, as well as, in recent years, reporting on general news, the arts and society. Its digital archive is available in the standard search-and-browse manner to institutional subscribers via Cengage Gale, but the newspaper is interested in exploring different ways to make its archives available, with an emphasis on what can be done with its data.

The full digital archive runs 1888-2010 and comprises 903,029 pages from 37,464 print editions. However, the collaboration is starting off with a relatively small amount of content, which may expand later. The FT has agreed a licence which permits use of the data for academic research purposes, either onsite at the British Library or via controlled remote access.

Four complete sample years of FT page images (as JPEGs) and data (XML) are being made available to research teams: 1888, 1939, 1966 and 1991. The licence runs to the end of 2015, when we will review what has been learned and see how access and use may be extended thereafter. The sample years would therefore be ideal for researchers developing data-driven projects who need some test content to scope future plans, or to test tools or applications that they may be developing.

Anyone who is interested should get in touch with Luke McKernan, Lead Curator News & Moving Image at the British Library, who can provide further details. Research teams may also be interested in taking part in the Library's first news hackathon, scheduled for November 16th, which will include FT data alongside data derived from the Library's own news collection. More news on this will be published soon.

The collaboration with the Financial Times is one part of emerging plans for British Library news data. The structure of news content offers numerous opportunities for analysing, interrogating, visualising and rethinking what news archives are today, as well as creating new kinds of newspaper and other news media history. We held a news data workshop on September 7th, where we brought together researchers, developers and content owners to look at ways we might develop plans for news data that would best benefit researchers. There's a report on the workshop on our Digital Scholarship blog.

We hope to issue news in the near future on further news archive datasets that we can make available for research.

The Newsroom blog