Track 4


Transform and manage data from -omics technologies, biomarker discovery and systems biology to improve R&D productivity and decision making across the organization.


14:00-17:30 Pre-Conference Workshop: Sequencing Data Storage
Co-organized by CHI and BioTeam
Matthew Trunnell, Manager, Research Computing, Broad Institute
Chris Dagdigian, Founding Partner and Director of Technology, BioTeam, Inc.
Guy Coates, Ph.D., Group Leader, Informatics System Group, The Wellcome Trust Sanger Institute
Additional Speakers to be Announced
BioTeam has been on the frontlines of next-generation sequencing integration, having helped several organizations with the unique next-gen IT, storage, and data management challenges. This workshop will present real-world customer experiences straight from the trenches. You’ll get practical information about the storage support needed by research organizations in the next-gen world.

* Separate registration required



Leveraging Informatics to Drive Innovation &
Productivity in Drug Discovery

13:50 Chairperson’s Opening Remarks

14:05 Drug Innovation 2.0 – Why We Need Knowledge Metrics for Democratic Action

Jörg Kurt Wegner, Ph.D., Scientist, Integrative Chem-/Bio-Informatics, Tibotec (J&J, Belgium); Blogger, Mining Drug Space; Project Administrator, Open Source Development

In this presentation we share three current challenges for innovation in drug design. 1) Group dynamics – Scientific collaboration typically occurs through networks. Inefficient collaboration and networking can lead to tunnel vision; innovation opportunities through more distant networks may be overlooked. Knowledge metrics can encourage new and productive networks by appropriately rewarding contributing network members. 2) Information overload – Information and knowledge are not the same thing. We have too much (unstructured) information and we lack the time required to structure this information into knowledge. Knowledge metrics can encourage the ranking, structuring, and accessing of information on a reward-per-use basis. 3) Data silos – Data silos caused by technical or license hurdles can reduce the number of efficient collaboration options. Knowledge metrics should clearly reward barrier-breaking and silo-bridging efforts, and favor new and diverse over redundant information. Drug designers, lawyers and computer scientists have a unique opportunity to take on these challenges – drug innovation 2.0!

14:35 Has Knowledge Management’s Time Finally Come?

Jerry Lanfear, Ph.D., Head, Data Support and Management, Research CoEs, Pfizer

Knowledge Management is a discipline that emerged some 12-15 years ago, promising a structured way to ensure companies extracted maximum value from all their data. However, it’s true to say that it has never really delivered on that promise. This talk will discuss the possible reasons for this and why emerging technologies such as Web 2.0, together with the increasingly sophisticated means by which we can capture, store and manage our data, may mean that all that is about to change. This will be a forward-looking talk illustrating Pfizer’s experience in implementing Knowledge Management strategies.

Leveraging Databases and Data Mining to
Improve the Drug Discovery Process

15:05 Centralizing Discovery Information: From Logistics to Knowledge

Julen Oyarzabal, Ph.D., Head, Drug Discovery Informatics Section, Spanish National Cancer Research Centre (CNIO)

This presentation will discuss a successful implementation and integration of a chemical and biological information management system in a publicly funded research center. This platform, which centralizes discovery information, is used on a daily basis by more than 50 researchers via a web GUI to retrieve data for decision-making and assay planning. Currently, data from more than 250 different assays are available for a proprietary database containing more than 40,000 unique compounds. This drug discovery informatics platform, which covers the whole process and centralizes the information in an academic/publicly funded environment, is quite attractive for many starting and ongoing research initiatives in this area and will be applicable for attendees whether they are from an academic, governmental or private institution.

15:35-16:00 Refreshment Break

Sponsored by BIOBASE
16:00 Analysis of Integrated Networks to Identify Biomarker-Relevant Components

Edgar Wingender, Ph.D., President & CSO, BIOBASE; Professor and Director, Dept. Bioinformatics, UMG, Univ. of Goettingen

Characterizing the specifics of different biological networks (e.g. gene regulatory, signal transduction and metabolic networks) is a prerequisite for their proper integration in order to achieve a comprehensive representation of a eukaryotic cell. Using such an integrated view on cellular networks, it is possible to map large –omics datasets and to infer the causes of the molecular and the associated clinical phenotype under study. We will discuss the required knowledge bases as well as novel concepts for identifying those network components that are relevant to the etiology, the diagnostics and eventually the therapy of a given disease.

16:15 Sponsored Presentation, Sponsored by NextBio
Gene and Sequence-Centric Framework For Mining Large Collections of Public and Proprietary Data
Ilya Kupershmidt, Co-founder and Vice President Products, NextBio
Microarrays and next-generation sequencing technologies produce data at an unprecedented rate. Most organizations now expect their research and informatics programs to leverage a combination of both proprietary and public data. Bringing this data together in a way that offers researchers a unified view of transcriptomic, proteomic, epigenetic, copy-number and sequence variation events requires a platform that can interrogate data from a gene as well as a sequence perspective. This type of integrated platform is required to support today’s target selection, biomarker discovery, and drug repositioning research processes. In this session we will present strategies for integrating public and proprietary data from modern high-throughput technologies and applying computational methods to explore it at the level of genes, SNPs, sequence regions and pathways.

16:30 Automated Document-Based Text Annotation, Data Linking and Network Generation

Reinhard Schneider, Ph.D., Team Leader, Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL)

We will present web-based applications that apply biological named entity recognition to enrich any web page or Microsoft Office, PDF and plain text documents. The input files are converted into the HTML format and then sent to the Reflect tagging server, which highlights biological entity names like genes, proteins and chemicals, and attaches to them JavaScript code to invoke a summary popup window. The window provides an overview of relevant information about the entity, such as a protein description, the domain composition, a link to the 3D structure and links to other relevant online resources. The services are also able to extract the bioentities mentioned in a set of files and to produce a graphical representation of the networks of the known and predicted associations of these entities.

17:00 Information Silo Breaker -- An AstraZeneca Approach to Leverage Information & Knowledge Sharing

Mats Sundgren, Ph.D., Principal Scientist, Global Clinical Development, AstraZeneca R&D

This presentation will discuss how Clinical Development in AZ has implemented a new information sharing model that moves towards more proactive information sharing, with transparent processes, language and principles to provide safe and secure sharing of information (including data and knowledge) across projects and beyond in AstraZeneca R&D. Topics will include: assumptions and prerequisites of information sharing, the information sharing model and new roles (information advocates), experiences from the implementation process, change management aspects, and communication and stakeholder management aspects.

17:30 Close of Day’s Sessions

18:30 BIOTECHNICA Night: Original Bavarian Beer Hall, full dinner reception, a traditional German band




Enabling Translational Research and
Biomarker Discovery through Informatics

08:50 Chairperson’s Remarks

09:05 Integrated R&D: Bridging Research and Development to Drive Innovation in Pharma

Jakob DeVlieg, Ph.D., Global Head, Molecular Design & Informatics, Schering-Plough

The availability of complete genome sequences and vast amounts of structural information on targets and target-ligand complexes has stimulated many efforts to rationalize the drug design process. It is believed that ‘omics’ and informatics may create many opportunities to speed up the multidisciplinary drug discovery process, and provide novel approaches to the design of drugs that would otherwise not be possible. However, low productivity and high late-stage attrition continue to challenge the pharmaceutical industry. Integrated R&D research approaches and genomics-based methods are needed to address the attrition problem and to increase productivity. The role of translational sciences and advances in bioinformatics, cheminformatics & genomics technologies in drug discovery and development will be discussed.

09:35 Co-Presentation with Ortho-Biotech Oncology R&D, Inc. & Centocor R&D, Inc.: Two Sides of the Coin – Adventures in Translational Science

Hans Winkler, Ph.D., Senior Director, Global Head, Oncology Biomarkers, Ortho-Biotech Oncology R&D, Inc.

Sándor Szalma, Ph.D., Senior Research Fellow, R&D Informatics, Centocor R&D, Inc.

We will present our experience in developing and applying informatics strategies in the context of translational science. Special attention will be paid to crucial scientific questions of drug discovery and development such as biomarker discovery, indication selection and pros and cons of applying animal model data for decision support. The informatics approaches applied in our organization will be extensively discussed including data warehousing, data curation, ontologies, semantic technologies, cloud computing, and high-performance computational algorithms.

Sponsored by Tessella
10:05 Effective TM Deployment: How You Can Make the Most of What You Already Know

Andrew Chadwick, Ph.D., Principal Consultant, Life Sciences and Healthcare, Tessella

Translational medicine (TM) has great potential for saving lives and reducing wasted development costs. However, the new options that TM brings can complicate R&D planning: extra tasks, designed to ‘fail fast’, can actually add elapsed time to successful projects. As pharmaceutical companies blur the boundaries between research and development, and set up more independent, commercially aware development teams, there is increasing pressure to show cost-effective deployment of this new way of working. This presentation will show how it is possible to make better use of existing knowledge to improve the effectiveness of experimentation, decision-making and planning. It will compare the current ‘best-practice’ approaches in drug R&D and in satellite design.

10:35 Coffee Break

Integrating Chemogenomics
and Systems Biology Data

11:00 Case Study: "Virtual" Data Integration using Babylon Enterprise

Christian Hauck, Knowledge Management and Competitive Intelligence, Novartis Pharma AG

Data integration, that is, overcoming isolated silos, is an ambitious undertaking. There are various approaches that try to harmonize and connect data within an enterprise and with the outside world. Babylon Enterprise is an unusual complementary approach: the tool recognizes text content from the screen, and then looks up related information extracted from glossaries based upon data stored in primary, even transactional, systems. It has no semantic knowledge, but it behaves as if it did. You will learn about the basic concepts, and about the experience of introducing an unusual tool in a big company.

11:30 What is a Drug Target? Development of a Molecular Interaction Ontology

Richard Bickerton, Medicinal Informatics, Division of Biological Chemistry and Drug Discovery, College of Life Sciences, University of Dundee

The advent of large-scale chemogenomics resources provides an invaluable, global view of pharmacological space explored to date. However, the representation of biological activity as a compound-protein pair over-simplifies the biological reality. Several key target families present multiple distinct binding sites in the form of non-competitive, allosteric or co-factor sites, either on the same or on different structural domains, each of which may have been the subject of medicinal chemistry studies. Furthermore, many key molecular targets comprise multiple biological macromolecules, often with stoichiometric contributions from the products of multiple genes. Available structural and chemogenomic information has been harnessed to describe these relationships in the form of a molecular interaction ontology for drug targets, the development and application of which will be described.

12:00 Chemogenomics - Data Integration and Analysis

John Overington, Ph.D., Team Leader, EMBL-EBI

The drug discovery industry and academic community have been largely unsuccessful in delivering on the therapeutic promise of the genomic revolution. The reasons for this are hotly debated, but are clearly of central importance in considering allocation of resources to new therapeutic programs. We will outline our core data resources, now placed in the public domain, that allow unbiased informatics annotation and analysis of genome, pathway, and target data with drug discovery relevant information. Various applications of these resources to ‘target prioritization’, ‘lead optimization’ and ‘drug reuse’ will be described.

12:30-13:45 Lunch for Purchase in the Exhibit Hall and Exhibit Viewing

Knowledge Management for Pathway Analyses

13:50 Chairperson’s Remarks
Yuriy Gankin, Ph.D., Chief Scientific Officer, GGA Software Services, LLC

14:05 Novel Information Integration Approaches to Current Challenges of Analyzing and Visualizing Network Data and Pathways

Dr. Mario Albrecht, Group Leader, Molecular Networks in Medical Bioinformatics, Max Planck Institute for Informatics

Sophisticated bioinformatics methods are required to integrate, analyze, and visualize rapidly increasing amounts of molecular data for medical systems biology. This talk will present novel computational approaches on how to deal with the avalanche of high-throughput datasets and how to transform them into biological knowledge, providing valuable insight into cellular processes. Possible solutions to current challenges of integrative and visual data exploration will be demonstrated for large-scale interaction networks and specific pathways.

14:35 Sponsored By Ontotext 
A Reason-Able View to the Web of Pathway Data
Vassil Momtchev, Group Leader, Semantic Life Science Applications, Ontotext
Linked Data standards and technologies have already gained popularity as a platform for data integration and analysis in the life science and health care domain. In this talk we present the Linked Life Data project and the integration of a large pathway knowledge base with other public resources by semantic instance mapping and text-mining. With this approach, you can interlink proprietary sources with linked data from the public cloud and put the internal interpretation of information into different contexts, which allows you to make more interesting queries.


15:05 Emerging Cross Pharma Collaboration - How Can Services Make the Difference?

Nick Lynch, Ph.D., Chemistry Domain Expert, AstraZeneca

An Open Source initiative has been established to provide the foundation of data standards, ontologies and associated web-services to enable the discovery workflow through common business terms, relationships and processes. Initial focus has been on chemistry, biological screening and sample logistics. Everyone is challenged by the technical inter-conversion, collation and interpretation of discovery data. As a result, there is a vast amount of duplication, conversion and testing that could be reduced if a common foundation of data standards, ontologies and web-services could be promoted. This would allow interoperability between a traditionally diverse set of technologies to benefit the healthcare sector. We will describe current progress, lessons learned, and how companies, academics and others can participate in this approach.

15:35 Refreshment Break

Ontologies and Semantic Web

* Shared Session with Track 3: IT Software for the Life Sciences

16:00 The W3C Health Care and Life Sciences Interest Group: Semantic Web in Action

M. Scott Marshall, Ph.D., W3C HCLS IG Co-chair, Informatics Institute, University of Amsterdam

The W3C Semantic Web for Health Care and Life Sciences Interest Group (HCLS) has the mission of developing, advocating for, and supporting the use of Semantic Web technologies for biological science, translational medicine and health care. HCLS covers hot topics including data integration and federation, bridging commonly used domain standards such as CDISC and HL7, and the applications of medical terminologies. This talk will introduce the HCLS, as well as provide an overview of the activities that are currently ongoing within the BioRDF, Linking Open Drug Data, and Scientific Discourse tasks. Some new developments and the recent Face2Face meeting will also be discussed. The audience will gain an understanding of how actual Semantic Web applications work and how they are being applied to the areas of Health Care and Life Sciences.

16:30 Pfizerpedia Patents - a Semantic Wiki Database of Patent Information

Andrew Berridge, B.Sc., Delivery Advisor, Informatics, Pfizer

Pfizerpedia Patents is a project using Semantic MediaWiki to store competitor patent information. For the first time, this has enabled a searchable database of patents to be built in Pfizer. This presentation shows how the class-leading MediaWiki software can be used to produce custom applications with little or no software development, minimal support costs, and zero licensing cost. Learn how the innovative use of this technology can inspire your organization to create its own information repository.

17:00 BIODIVER – A Federated Query Engine Integrating with DAS

Oliver Karch, Ph.D., Bio- and Chemoinformatics, Merck Serono

The BIODIVER federated query engine dynamically aggregates information from various biomedical data sources in a flexible way by simultaneously querying disparate data repositories, e.g. relational databases, ontologies, full-text and sequence indexes, and web-services. Execution plans specifying a biomedical concept’s retrieval process are evaluated at runtime, allowing dynamic search strategies to be realized. Query results are represented in a simple and uniform way based on XML. This talk describes an integration layer which allows BIODIVER query strategies to be exposed as data sources of a Distributed Annotation System (DAS), enabling query results to be consumed seamlessly by DAS-compliant feature viewers.

17:30 Conference Adjourns
