Data Infrastructure and High Performance Computing



The management and analysis of the massive amounts of data generated by the life science industry demand best practices for provisioning, using and improving the critical systems and services in high-performance computing, networking, and storage. The flexibility, sustainability, and scalability of IT infrastructure are vital to support life science research and its translation to medicine.

Monday, 8 October

Pre-Conference Short Courses 1 & 2 


Tuesday, 9 October

7:30 Conference Registration and Morning Coffee

Plenary Keynotes

8:00 Chairperson’s Opening Remarks

Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World

8:10 Knowledge Management for Public-Private Partnerships in Translational Science: An Introduction to the IMI eTRIKS Consortium

Yike Guo, Ph.D., Professor, Computing Science, Imperial College London

Cross-organizational Translational Research (TR) studies, as featured in many IMI projects, require the management and analysis of a wide range of clinical, pre-clinical and, typically, molecular assay data (e.g., genetic and omic data). The objective of the cross-pharma eTRIKS consortium is to provide a central knowledge management (KM) service in IMI to support such TR studies, building on the open source tranSMART platform. This presentation will outline the need, the formation of the consortium and the objectives of the project as it initiates in 2012.

8:55 The Cloud for Distributing and Annotating Genomic Information

Paul Flicek, D.Sc., Principal Investigator & Head, Vertebrate Genomics Team, EMBL-European Bioinformatics Institute

Providing large genomic data sets such as those produced by the 1000 Genomes Project or Ensembl within the cloud dramatically expands the number of researchers who can access these data and frees them to undertake projects that would be impossible otherwise. We are adapting increasingly complex analysis and annotation pipelines for the cloud in an effort to democratize high quality genome assembly and annotation in the same way that NGS technology made the production of the sequence itself available to anyone.

9:40 EU Strategy for High Performance Computing – Leveraging Bio-IT

Aniyan Varghese, Ph.D., Programme Manager, C1 - e-Infrastructure, DG-CONNECT, European Commission

The European Commission published its HPC strategy in 2012, "High-Performance Computing: Europe's Place in a Global Race." The strategy recognises the significance of HPC for the competitiveness of European industries and research. It calls for action both in the supply of technologies, application codes and services and in their use for solving major scientific, industrial and societal problems. HPC is a crucial enabler for Bio-IT. The presentation will address how the Commission intends to implement the recommendations.

10:25 New Technologies, Investments in Data Collaborations/Personalized Medicine

Hermann Hauser, CBE, Partner, Amadeus Capital Partners

11:10 Coffee Break

Open to Big Data 

11:30 OpenBEL: Knowledge Engineering for the Life Sciences

Ted Slater, CTO, OpenBEL Consortium, Selventa

The recent emphasis on big data and cloud computing has brought with it a sharper focus on data-centricity and infrastructure convergence. While these are excellent goals in principle, they are very difficult to achieve in large part because of legacy knowledge representation and architecture choices that work primarily to create data silos. Data silos, in turn, are brittle, non-interoperable solutions that can severely hinder modern data infrastructure efforts. OpenBEL is an open source knowledge representation standard, together with a set of software tools, that can help eliminate data silos and fully enable knowledge-based life sciences research.

12:00 The Big Data and the Worldview Paradigm of Biological Science

Simon Berkovich, Ph.D., Professor of Engineering and Applied Science, Member of the European Academy of Sciences, Department of Computer Science, The George Washington University

The Big Data immensity begets Bounded Rationality since explicit utilization of all available information is not feasible. Aimed at knowledge discovery, a new computational model is introduced employing implicit data selection determined holistically by context. This functionality is paralleled with Google’s PageRank yet with an online realization that can be effectively actuated through on-the-fly clustering. The suggested approach allows emulating the brain as a Big Data machine with Cloud Computing shifting the general worldview paradigm towards the presentation of biological objects in terms of the Internet of Things.

12:30 Lunch on Your Own

Make Computing Scale 

14:00 Chairperson’s Remarks

Folker Meyer, Computational Biologist, Argonne National Lab

14:05 SHOCK: A Cloud-Enabled, Federated Data Sharing Mechanism for Microbial Biology and Microbial Ecology Supporting HPC and Individual Researchers

Folker Meyer, Computational Biologist, Argonne National Lab

Large-scale metagenomics projects like the Earth Microbiome Project will create vast amounts of data and analysis results. However, creating, storing and accessing this data will be a challenge for the biology community as a whole. I will present an overview of SHOCK, an emerging data management system for large-scale bioinformatics projects.


14:35 XworX - Cloud-Ready High Performance NGS Data Analysis Framework for Biomarker Screening & Disease Diagnostics

Albert Kriegner, Ph.D., Head of Bioinformatics & Software Development Group, Health & Environment, Austrian Institute of Technology

High-throughput molecular profiling technologies such as Next-Generation Sequencing (NGS) play an increasingly important role in disease diagnostics. They will also allow medicinal products to be used more efficiently, enabling cost-efficient remuneration based on drug performance. We present XworX, a cloud-ready, user-friendly NGS data analysis framework that enables even small laboratories to accomplish standardized, state-of-the-art diagnostic data analysis, interpretation and reporting of highly complex genomics data on affordable desktop servers behind local firewalls.

15:05 Parallel Storage: Addressing the Bio and Life Sciences Big Data Challenge

Terry Rush, Account Executive, Panasas, Inc.

The exponentially growing volumes of data generated by Bio-IT applications compound the challenge of selecting a storage infrastructure capable of linearly scaling capacity and performance. Panasas will discuss how to address this big data storage challenge with high-performance parallel storage and the parallel NFS protocol.

15:20 Mastering the Data Flood in Life Sciences with Quantum StorNext

Volker Flegel, System Administrator, IT, Swiss Institute of Bioinformatics

The advent of next-generation sequencers is contributing orders of magnitude more data to store, analyze and share, increasing the complexity of genomic sequencing and data analysis workflows. Here we present a case study of life science data management at the Vital-IT High-Performance Computing centre, using StorNext Storage Manager and the StorNext cluster filesystem.

15:35 Refreshment Break in the Exhibit Hall with Poster Viewing

16:15 Turning Big Data into Knowledge: Techniques and Tools for Parallel Computing on Online Data Streams in Systems Biology and Epidemiology

Marco Aldinucci, Assistant Professor, Computer Science Department, University of Torino

Parallel computing is central to Systems Biology. We describe streaming and online data filtering in parallel computing as a way both to ameliorate the I/O bottleneck and to raise the abstraction level in software construction, enhancing performance portability and time-to-solution. The design of a parallel simulator with parallel online data analysis is discussed as a paradigmatic example to demonstrate the effectiveness of the approach in multi-scale problems such as modeling infectious diseases (e.g., HIV, flu and tuberculosis) at individual and population levels.

16:45 Running Scientific Workflows on Cloud Infrastructures without Learning Cloud Technology

Peter Kacsuk, Ph.D., Professor, MTA SZTAKI and University of Westminster

The European SCI-BUS project developed a generic workflow-oriented cloud platform by which scientific workflows can be developed for various clouds. Based on the generic platform, application-oriented science gateways can easily be derived. Using such science gateways, end-user scientists can run scientific workflows on different cloud infrastructures without learning cloud technology. The scientific workflows are automatically portable between the different cloud systems; moreover, different nodes of a workflow can run on different clouds in parallel.

17:15 Panel Discussion: Scale-Up and Scale-Out: Virtual Machines for Big-Data Analysis

Opening presentation: Benzi Galili, Executive Vice President, ScaleMP


Yike Guo, Ph.D., Professor, Computing Science, Imperial College London


Folker Meyer, Computational Biologist, Argonne National Lab

Benzi Galili, Executive Vice President, ScaleMP

Marco Aldinucci, Assistant Professor, Computer Science Department, University of Torino

Peter Kacsuk, Ph.D., Professor, MTA SZTAKI and University of Westminster

17:45 Welcome Reception with Exhibit and Poster Viewing

18:45 Close of Day
