Building Infrastructure for Open Science

Thursday 10 December 2015, 6.00pm - 8.30pm

The Davidson Building, 5 Southampton Street, London WC2E 7HA. The nearest underground stations are Covent Garden and Charing Cross.

Free, but pre-booking is required.

Dr Rob Davey


The life sciences are experiencing a "coming of age" in data generation. Bioinformatics is not a new field, and researchers continue to develop command-line algorithms and tools for analysing the wide range of datasets now being produced at an unprecedented rate. Increasingly, though, they are looking towards the web for the next generation of analytical tools. A great deal of effort is also going into building data management platforms, as the amount of openly available data expands exponentially. Developing infrastructure for infrastructure's sake, however, is not the answer: we need to expose interconnected, decentralised software that can help push big data science into the next era.

About the speaker

Robert joined The Genome Analysis Centre (TGAC), Norwich, in February 2010 as the lead software engineer on the MISO laboratory information management system (LIMS) project, which was released to the community in 2012 as an open source framework for tracking sequencing experiments. He went on to become the Core Bioinformatics Project Leader, managing a team of developers to advance MISO as well as new projects in data infrastructure and management, and the genomic data visualisation tool, TGAC Browser. Robert was appointed as Data Infrastructure and Algorithms Group Leader in late 2012.

Prior to joining TGAC, he was a post-doctoral researcher at the Institute of Food Research (IFR) in the National Collection of Yeast Cultures (NCYC) group, providing analytical tools and bioinformatics support to help drive this important national capability. He completed his degree in Microbiology and his PhD in Bioinformatics at the University of East Anglia (UEA); his doctoral work developed algorithms for assessing the gene content of bacterial organisms using Comparative Genomic Hybridisation microarrays. Robert's main interests are in enterprise-grade software development, data management and associated HPC infrastructure, sequence analysis and quality control pipelines, novel visualisation strategies for sequencing and biological data, metadata and the Semantic Web, and the open source ethos.

The Data Infrastructure and Algorithms group researches how best to manage, represent and analyse data for open science, and explores new hardware, algorithms and methodologies for tools that push the boundaries of data-driven informatics in the life sciences. The team applies its research expertise to develop infrastructure platforms for data and software dissemination and publication, assembly algorithms for viral and microbial metagenomics, large-scale data visualisation, and best practice and training in bioinformatics.


Attendance lists will normally be finalised on the Monday preceding each meeting, but late attendees may be admitted by signing in to the Davidson Building as a visitor.


Building Infrastructure for Open Science - Dr Rob Davey (PDF)

Watch the YouTube video