Monday, 4 October


10:00 – 13:30
Making Sense of Next-Gen Sequencing Data

With the widespread deployment of second-generation sequencing and the emergence of new third-generation sequencing platforms, the extraordinary throughput of next-generation sequencing (NGS) technology is outpacing our ability to analyze and interpret the data. Researchers need productive strategies to cope with this deluge of data. This workshop will focus on practical informatics methods, strategies and software tools for transforming NGS data into usable information.

What you will learn:

  • NGS assembly and annotation methods
  • Tools for data analysis
  • An appraisal of commercial software and freeware
  • Case studies of analysis to support research

Short Course Instructors:

Michele Clamp, Senior Consultant, BioTeam, Inc.
Daniel Blankenberg, Ph.D., Postdoctoral Research Associate, Biochemistry & Molecular Biology, Pennsylvania State University
Qi Wang, Ph.D., Postdoctoral Scholar, German Cancer Research Center



9:30 – 10:00 am Short Course Registration

10:00 Opening Remarks
Stan Gloss, Founding Partner and Managing Director, BioTeam, Inc.

10:10 Next-Generation Sequencing Analysis and Beyond
Michele Clamp, Senior Consultant, BioTeam, Inc.
DNA sequencing technology is moving at a lightning pace. The last few years have seen the cost of sequencing drop by an order of magnitude, and the next two years seem likely to deliver another huge change. So much change in such a short period of time calls for a radical rethinking of how we store, mine and analyze data. This talk will include:
- Designing effective sequencing strategies
- Dealing with increasing amounts of sequencing data
- Efficient first pass analysis methods for read mapping and assembly
- Getting the most from RNAseq data
- Downstream annotation

10:55 Galaxy: Making NGS Analyses Accessible for All
Daniel Blankenberg, Ph.D., Postdoctoral Research Associate, Biochemistry & Molecular Biology, Pennsylvania State University
The recent rapid proliferation of DNA sequencing technology has enabled any investigator, for a modest cost, to produce enormous amounts of sequence data; however, working with this large-scale sequencing data poses significant challenges for even the largest institutions, let alone individual investigators and small labs. Here, we present Galaxy, an open-source analysis framework that is available as a free public service and can be readily deployed on either private hardware or cloud resources. The Galaxy platform empowers transparent and reproducible research by providing interactive access to popular tools, including those that allow manipulation of raw sequencing reads, mapping, peak calling, genomic interval operations, visualization in genome browsers and more, as well as a point-and-click workflow system. Using Galaxy, a user without computational expertise can, for example, perform a complete ChIP-Seq analysis beginning with raw sequencing reads and continuing through visualizing called peaks in reference or custom-built genome browsers, all without leaving the familiar interface of a web browser.

11:40 Coffee Break

12:10 Dissecting Cancer Development Using Whole-Genome Sequencing
Qi Wang, Ph.D., Postdoctoral Scholar, German Cancer Research Center
Cancers are diseases of the genome; they result from changes in the DNA sequence of the cancer-cell genomes. Recent advances in sequencing technologies present us with a unique opportunity to survey a large number of whole cancer genomes. This presentation will focus on how we use whole-genome sequencing data to study how individual cancers have developed.

12:55 Closing Panel Discussion

13:30 End of Short Course


14:00 – 17:00
Creating Synergy – Introduction to Biomedical Data Fusion
Systems biology and personalized medicine increasingly require a synergistic consideration of different molecular or clinical data sets. Making such heterogeneous data available is only the first step for obtaining the big picture through a coherent analysis, i.e. data fusion. This introductory tutorial will provide a broad overview of the different options and methodologies for making the most of your data through data fusion.

  • A principled approach to data fusion
  • Powerful methods from machine learning, multivariate statistics and pattern recognition
  • How to deal with any kind of data
  • QTL mapping of omics data
  • Application examples in cancer and diabetes


13:30 – 14:00 Short Course Registration

14:00 Opening Remarks

14:05 Principles of Data Fusion

Jürgen von Frese, Ph.D., Managing Director, Data Analysis Solutions DA-SOL GmbH
The fusion of complementary biomedical data can be used to obtain a comprehensive characterization for each sample – the “big picture” in terms of systems biology. The data could comprise clinical and patient data, microarray, proteomics and metabolomics data or even histological images. This talk will offer a comprehensive overview of data fusion, ranging from the principles and an overall workflow to a discussion of the analytical options and approaches. It will provide a conceptual understanding of some of the major issues, pitfalls and opportunities. Powerful approaches from various disciplines such as bioinformatics, chemometrics and pattern recognition will be introduced.

14:40 Kernel Methods for Fusing Diverse Biomedical Data
Gunnar Rätsch, Ph.D., Friedrich Miescher Laboratory, Max Planck Society
Kernel methods, in particular support vector machines, have established themselves as a very powerful and versatile paradigm for learning from high-dimensional data. Kernels have been developed to deal not only with numerical data but also with sequence information or even graphs representing, e.g., protein-protein interaction data. Their widespread use for developing molecular signatures, as well as the large number and diversity of bioinformatics applications, testifies to the power of this approach. In addition, the ability to combine various kernels irrespective of their underlying data type, and to learn optimal combinations from the data itself, provides a unique tool for achieving optimal prediction performance and data understanding through data fusion.

15:15 – 15:45 Refreshment Break

15:45 Data Fusion and Network Biology of Metabolic Profiles
Marc-Emmanuel Dumas, Ph.D., Lecturer in Systems Biomedicine, Imperial College
Integration of metabolic phenotyping with other -omics data provides a systems biology approach to identify biomarkers and susceptibility genes related to the cardio-metabolic syndrome as well as other diseases. In particular, approaches such as metabolomic Quantitative Trait Locus (mQTL) mapping, or Metabolomic Genome-Wide Association Studies, consist of the robust and accurate statistical integration of genome-wide genotyping (single nucleotide polymorphisms, microsatellites) and metabolome-wide profiling by NMR spectroscopy and mass spectrometry. New signal processing and statistical developments for enhancing signal recovery, locus detection and biomarker identification will be shown. Mechanistic insights derived from this systems biology approach clarify the influence of gene variants on metabolic profiles and result in a better understanding of disease phenotypes and identification of potential drug targets.

16:20 Interactive Panel Discussion

Who Should Attend:

Researchers with a basic understanding of omics data analysis who want to combine data from different sources to extract maximal information.


14:00 – 17:00
Cloud Computing for Life Sciences

Cycle Computing is leading the efforts for many life science organizations in using the cloud, helping research labs and companies leverage internal and external clouds for collaboration, calculations, and storage. We’ll cover real-world use cases across drug discovery & design, collaboration, next-generation sequencing, proteomics, software as a service, and bioinformatics, to explore how the life sciences are using cloud computing, its challenges and effectiveness, how organizations can save money, and regulatory compliance. Join thought leaders in this day-long workshop to examine how cloud computing can be used effectively as an external IT service and an internal computing model.

Short Course Instructors:

Richard Holland, Director, Operations and Delivery, Eagle Genomics Ltd.
David Powers, Senior Analyst, Business Development, Cycle Computing
Glenn Proctor, Ensembl Software Coordinator, EMBL-EBI




13:30 – 14:00 Short Course Registration


14:00 Opening Remarks

Jason Stowe, CEO, Cycle Computing

David Powers, Evangelist, Business Development, Cycle Computing


14:05 Presentation 1

Glenn Proctor, Ensembl Software Coordinator, EMBL-EBI


14:40 Presentation 2

David Powers, Senior Analyst, Business Development, Cycle Computing


15:15 – 15:45 Refreshment Break


15:45 Presentation 3

Richard Holland, Director, Operations and Delivery, Eagle Genomics Ltd.


16:20 Panel Discussion


16:55 Closing Remarks

Jason Stowe, CEO, Cycle Computing

David Powers, Evangelist, Business Development, Cycle Computing




14:00 – 17:00
Visualization of Large-Scale Biological Data
Data visualization has become increasingly important for life scientists as the amount of data generated in biomedical studies continues to grow rapidly. Visual representations are powerful tools for exploring large quantities of data quickly, helping to detect patterns and generate hypotheses that guide further analyses. This practical course will provide a comprehensive view of how to use visualization to support the analysis of large biological data sets, and will cover interaction networks and biochemical pathways, as well as transcriptomics, proteomics and metabolomics data.

  • Visualization principles and pitfalls
  • The roles of visualization in data analysis
  • Key methods and software tools
  • Integration of visualization with automated methods: Visual Analytics
  • Future technologies


14:00 Introduction

14:10 Principles of Visualization and Pitfalls

We will discuss perceptual principles relevant for visualization, how data can be encoded using a range of different visual channels, and point out problems that might arise.

14:40 Role of Visualization in Data Analysis

In this part we will give a brief overview of how visualization is used in data analysis and clarify what visualization can deliver and what it cannot deliver.

14:50 Key Methods and Software Tools

We will introduce visualization methods and software tools implementing these methods for the major data types used in biology.

15:30 Refreshment Break

16:00 Live Demo Session

In this part of the course we will demonstrate how transcriptomics, network and genome data can be visualized and explored with Mayday, Cytoscape, the Integrative Genomics Viewer (IGV) and other tools.

16:30 Visual Analytics for Biological Data

We will briefly introduce the field of “visual analytics” and its promise for biological data analysis.

16:40 Future Technologies

In this final part of the course we will look at emerging technologies both for the generation and visualization of biological data that are expected to have an impact on how analysts visualize and interact with large data sets.

16:50 Q&A

17:00 Close of Course

Short Course Instructors:

Nils Gehlenborg, M.Sc., Functional Genomics Group, EMBL-EBI

Nils is a Ph.D. student at the University of Cambridge and the European Bioinformatics Institute and recently submitted his dissertation. His research interests are in the areas of information visualization and machine learning, and he is particularly interested in the application of techniques from these fields to the exploration and interpretation of large, high-throughput biological data sets. Over the last seven years, he has developed methods and tools for visualization and analysis of transcriptomics and mass spectrometry data, such as Mayday, Prequips and the Space Maps method. Nils co-organized the first EMBO Workshop on Visualization of Biological Data (VizBi), and is currently serving on the Board of Directors of the International Society for Computational Biology (ISCB) as the student representative.

Kay Nieselt, Ph.D., Proteomics Algorithms and Simulation, Center for Bioinformatics Tübingen

Kay has a doctorate in mathematics from the University of Bielefeld. She has worked in the area of computational biology for over 20 years. She is a group leader at the Center for Bioinformatics Tübingen. For the last 8 years, her main research interests have been computational transcriptomics and visual analytics of large-scale life science data. Her group has developed Mayday, a software framework for integrative transcriptomics, and SpRay, a visual analytics program for life science data. Kay has organized two EMBO Workshops on Computational RNA Biology. She is associate editor of the journal Algorithms for Molecular Biology.

*Separate Registration Required.
