Scaling up to preserve the UK web

Thursday 28 November 2013, 6.00pm - 9.00pm

BCS, 1st Floor, The Davidson Building, 5 Southampton Street, London, WC2E 7HA

BCS Members £10, Non BCS Members £15, Students £7.50 (plus VAT @ 20%)

Helen Hockx-Yu


The British Library has been archiving UK websites since 2004, aiming to understand the challenges involved and to build the capability to preserve the UK's digital heritage for future generations. This work has intensified significantly since 6 April 2013, when the UK government put the Non-print Legal Deposit Regulations in place. This legal framework enables the UK Legal Deposit Libraries to collect digital publications in the most cost-effective and comprehensive way, as they have long done with printed publications such as books and journals. The British Library undertakes periodic crawls of the entire UK web on behalf of the five other Legal Deposit Libraries. This talk will give an overview of the end-to-end web archiving process, which involves crawling, indexing, storing and providing access to 4-5 million UK websites. It will also touch upon the key technical, legal and business challenges.


Helen Hockx-Yu is Head of Web Archiving at the British Library, where she leads the Library's web archiving operations and services. A key part of her role is to implement Legal Deposit of UK internet publications, which involves crawling, indexing, storing and providing access to 4-5 million websites. Previously, she was Project Manager of the Planets project, a four-year project co-funded by the European Union under the Sixth Framework Programme to address core digital preservation challenges. Before joining the British Library, she worked as a Programme Manager at the UK Joint Information Systems Committee (JISC), overseeing JISC's research and development activities in the area of digital preservation.