Bio-IT World and Cambridge Healthtech Institute’s Fifth Annual
High-Scale Computing
Turning Big Genomics Data into Smart Data
4-5 December 2013 | Sheraton Lisbon Hotel & Spa | Lisbon, Portugal
Thursday, 5 DECEMBER
08:00 Morning Coffee
08:30 Chairperson’s Opening Remarks
Maria Burpee, Marketing Manager, EMEA Healthcare, Dell
08:35 Preprocessing of High-Throughput Sequencing Data Speeds Up Targeted Research
Tomasz Konopka, Ph.D., Research Scientist, Sebastian Nijman Laboratory, Informatics and Synthetic Biology, CeMM, Research Center for Molecular Medicine, Austrian Academy of Sciences, Austria
Information encoded in high-throughput sequencing reads can be valuable for targeted hypothesis-driven research questions. It is thus worthwhile to access relevant portions of large datasets in fastq format without costly processing of the remainder. The TriageTools suite, available on SourceForge, provides fast utilities for preprocessing fastq data. It includes a method for extracting reads likely to map onto predefined regions of interest, which can extract data on a few target genes from DNA- or RNA-seq samples with speedup factors of up to ~90.
09:05 Compression Models for DNA Sequences
Armando J. Pinho, Ph.D., Director, Instituto de Engenharia Electrónica e Telemática de Aveiro (IEETA) and Associate Professor, Departamento de Electrónica, Telecomunicações e Informática (DETI), Universidade de Aveiro, Portugal
Research in the genomic sciences is confronted with volumes of sequencing and resequencing data that are growing faster than data storage and communication resources, shifting a significant part of research budgets from the sequencing component of a project to the computational one. Hence, being able to efficiently store sequencing and resequencing data is a problem of paramount importance. In this talk, we will present and discuss DNA sequence compression models that we have been developing over the past few years.
09:35 HPC for Life Sciences: Accelerate Discovery and Insights through Optimized Infrastructure and Support
Kris Buggenhout, HPC Solutions Architect, Dell EMEA
Proliferation of low-cost, efficient Next Generation DNA Sequencing (NGS) systems is accelerating the creation of large scale databases, driving increasing demand for compute and storage resources to aggressively analyze their content. This presentation will outline how a high performance computing environment can be optimized for the needs of genomic analysis, enabling organizations to accelerate time-to-insight with easy-to-deploy, open standards-based architectures designed for performance, scalability and efficiency.
10:05 Coffee Break in the Exhibit Hall with Poster Viewing
10:45 Management of Genomic Big Data in a Country-Wide Collaborative Initiative for Rare Disease Gene Finding
Joaquin Dopazo, Ph.D., Head, Computational Genomics, Centro de Investigacion Principe Felipe, Spain
About 1000 exomes were analyzed in a nationwide initiative to find disease genes in many inherited diseases. This flood of DNA and RNA-seq data led to pipeline optimization for NGS data analysis, including accelerating runtimes and increasing sensitivity in mapping and variant calling processes, the development of new visualization tools and the development of new systems-biology-based candidate gene prioritization methods. The Medical Genome Project and the Spanish network for Rare Diseases constitute an example of a collaborative nationwide genome project.
11:15 Bringing the Big Brain Computer to the Cloud: SGI UV for Cloud-Based Genomics Workflows
James Reaney, Senior Director, Research Markets, SGI
Building on years of experience with Cyclone™, SGI announces a new collaborative project to bring cloud-based computational resources to genomics workflows worldwide. SGI will showcase several of its computational and storage technologies in the project, but chief among these is the SGI UV platform: the “Big Brain” supercomputing system which already powers several large genomics research facilities worldwide. A very brief, high-level overview of the project and its collaborative approach will be given, along with a discussion of the initial goals.
PLENARY KEYNOTE
11:45 From Genome Annotation to Genome Medicine
Timothy Hubbard, Ph.D., Senior Group Leader, Wellcome Trust Sanger Institute, United Kingdom
12:35 Close of Conference