Data Integration and Knowledge Management

Too few new drugs are emerging to replace older ones coming off patent, and increased investment in R&D has not resulted in a higher number of approvals. Many companies are seeking ways to increase productivity in R&D, control spending and improve outcomes. Finding better ways to integrate data sources and make better decisions faster is being given top priority across the industry. Data Integration and Knowledge Management focuses on the transformation and management of data from -omics technologies, biomarker discovery and systems biology in order to improve R&D productivity and decision making across an organization. It’s a unique forum to learn how others are breaking down silos, integrating data sources and making R&D decisions faster and more effectively. Topics include: leveraging informatics and science 2.0 to drive innovation and productivity in drug discovery, integrating chemistry and biology data, data mining, enabling translational research through informatics tools, knowledge management, and pathway analysis and integration.


13:00 Conference Registration


Data Integration and Collaboration: Tearing Down the Walls Between the Knowledge Silos in the Pharmaceutical Industry

14:00 Chairperson’s Remarks

14:05 Knowledge Engineering: Integrating the External with the Internal

Ian Dix, Ph.D., Capability Lead for Knowledge Engineering & Information Science, Discovery Information, AstraZeneca

Decision-making within R&D is often compromised by the sheer volume and heterogeneity of decision-relevant biomedical & commercial information available to project teams: project teams struggle to access critical information in the necessary time, at an acceptable quality, coverage, and cost. This challenge is compounded by the formats and accessibility of biomedical information, with the most valuable information being ‘locked’ within textual formats such as the literature, patents and internal reports. This BioIT presentation will describe AZ’s response to this challenge, focusing on external content delivery.

14:35 The Reality of Web Services in the Life Sciences

Carole Goble, Professor of Computer Science, University of Manchester

Finding and using web services for third-party online bioinformatics resources is much harder than it should be. This talk will describe the current reality of Web Services in the Life Sciences and how better practices can be encouraged. It will also describe BioCatalogue, which provides a common interface for registering, finding, and monitoring bio-web services.

15:05 The New Information Landscape: Towards a New Logic and Practice to Enhance Interoperability and Collaboration in Life Science

Mats Sundgren, Ph.D., Principal Scientist, Global Clinical Development, AstraZeneca R&D

Today the pharmaceutical industry needs to transform itself on many levels to enhance scientific research and innovation output. For the industry to succeed in new research areas like personalized medicine and predictive medicine, one key factor is how it connects its internal information with the external world, including health care and patients, to share and make use of different information types or systems. From a new drug development perspective, the competitive edge is the ability to connect all information domains.

15:35 Refreshment Break


Knowledge Management for Target Discovery: Enhancing Biological Knowledge by Utilizing Large Scale Datasets

16:00 The Unforeseen Consequences of Opportunistic Deuterated Drug Claims, Patent Extraction Feeds and the PubChem Chemistry Rules

Christopher Southan, Ph.D., Knowledge Engineering, ChrisDS Consulting

16:30 Novel Knowledge Integration Approaches to Mining Genome-Wide Data Sets

Mario Albrecht, Ph.D., Research Group Leader, Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics

Novel computational approaches are required to integrate and mine rapidly increasing amounts of molecular data for translational systems research. This talk will present new bioinformatics methods for transforming gene and protein sets into biological knowledge and potential drug targets. Possible solutions to current challenges in prioritizing large-scale data sets generated by genome-wide association studies, RNA interference screens, and various techniques for detecting protein interactions and complexes will utilize comprehensive network and pathway information.

17:00 In silico Target Discovery Based on Flexible Pipelining and Meta-Analysis

James Cai, Ph.D., Head, Biomedical Informatics, Pharma Research & Early Development (pRED) Informatics, Hoffmann-La Roche

Target discovery relies heavily on the intelligent use of scientific information. Recently, high-dimensional genomics data have become increasingly critical in discovering new targets and biomarkers. However, it is difficult to combine findings from the large number of genomics studies and integrate across multiple data types. This presentation will describe a new in silico target discovery approach based on flexible pipelining and quantitative data integration using a meta-analysis framework. Examples will be given to show how results from multiple gene expression studies can be combined with mutation, disease association and competitor information to prioritize potential targets in oncology.

18:30 – 21:00 BIOTECHNICA Night: Beer Hall, Full Dinner Reception, Live Band


9:00 Conference Registration and Morning Coffee


Translational Informatics: Knowledge Fertilization Between Bench and Bedside

9:30 Chairperson’s Opening Remarks

9:35 Managing Change Across Interdisciplinary Scientific Integration

Paul Konstant, Manager, Informatics Center of Excellence, Janssen Pharmaceutical Companies of Johnson & Johnson

Interdisciplinary scientific data integration efforts have faced numerous challenges as they attempt to keep pace with the evolution of both science and the technology that supports it. This talk will present a framework for identifying equivalence classes of protocols that reaches across disciplines based on easily understood scientific principles. Through the identification of the independent variables in a given experiment, patterns of parameter estimation, and the dependent variable outcomes, we will discuss how seemingly synonymous terminology. The principles of this framework allow well-structured expansion and graceful change management.

10:05 Integrating Clinical and Molecular Data: CASI Project

Thomas Y. Gan, Application Services Senior Analyst, IT, Merck Research Labs

10:35 Coffee Break

11:00 Understanding the Business Value of Dynamic Data Aggregation in Life Sciences Manufacturing and Process Development

Joe Rothman, VP Engineering and Professional Services, Aegis Analytical Corporation

For some life science organizations, the road to data aggregation begins with a traditional data warehouse. But with traditional approaches come traditional problems. Dynamic data aggregation offers several advantages over the data warehouse, including improved usability of the aggregated data, lower Total Cost of Ownership (TCO) and improved network performance. Delegates attending this presentation will learn:
• Pitfalls of Using a Data Warehouse, Spreadsheets or Manual Data Collection Methods
• Description of Dynamic Data Aggregation and The Technologies that Support It
• Benefits of Dynamic Data Aggregation
• How to Achieve Dynamic Data Aggregation
• Case Study of Benefits Achieved Using Dynamic Data Aggregation

11:30 Visualization and Statistical Pharmacovigilance Methods Using Informatics for Integrating Biology and the Bedside (i2b2)

Shawn Murphy, M.D., Assistant Professor, Neurology, Harvard Medical School; Medical Director of Research Computing, Information Services, Partners Healthcare Inc.

The goal of i2b2 is to provide clinical investigators broadly with the software tools necessary to collect and manage hospital-based clinical research data in the genomics age as a cohesive entity—a software suite to construct and manage the modern clinical research chart. In the i2b2 software framework (the Hive), we are developing components (cells) to allow clinical investigators to compare medical outcomes on two matched groups of subjects. The components can be used for pharmacovigilance studies to detect associations between drugs and adverse events using Electronic Health Record data.

12:00 Translational Drug Discovery and Development: An Expedition of Data Integration and Utilization from Preclinical to Clinical

Dongzhou (Jeffery) Liu, Ph.D., Principal Clinical Investigator/Core Project Leader, Medical Affairs/Clinical Development, GlaxoSmithKline

This presentation will discuss contemporary reform and transformation in the drug industry, the latest strategies and approaches to renovate the process of drug R&D, and applications of combined approaches for efficient drug development.

12:30 Lunch for Purchase in the Exhibit Hall

13:45 Dedicated Poster Viewing in the Exhibit Hall


Ontologies and State-of-the-Art Web Technologies: Bring the Knowledge to the User

14:30 Chairperson’s Remarks

14:35 Keeping Track of Data and What the Data are About: An Exploration in Realism-Based Ontology for Translational Research

Werner Ceusters, Ph.D., Director, Ontology Research Group, NYS Center of Excellence in Bioinformatics & Life Sciences

The Innovative Medicines Initiative (IMI) is a unique pan-European public and private sector collaboration between large and small biopharmaceutical and healthcare companies, regulators, academia and patients. One specific aim is to create a Knowledge Management Platform to arrive at effective data integration and analysis tools. We will discuss the exact place of ontology in such an endeavor and its relationship to terminologies and information models.

Sponsored by Infosys

15:05 Shared Molecular Network Models of Unrelated Diseasomes – Unraveling the Conundrum of Network Medicine

Dr. Reena Gollapudy, Lead Consultant, IHL, Infosys Technologies Ltd.

An understanding of the functionally relevant genetic, regulatory, metabolic and protein-protein interactions in a cellular network plays an important role in deciphering the pathophysiology of human diseases. Several diseases share common pathophysiological mechanisms. Identification of common pathophysiological pathways between seemingly unrelated diseases could offer new approaches to drug discovery. The use of network and systems biology in generating common diseasomes and identifying shared molecular pathways holds great promise for advancing drug discovery approaches, with specific relevance to molecular repurposing, predictive toxicology and discovery of diagnostic biomarker properties of pharmaceutical drugs and candidate molecules. This treatise attempts to explore common diseasome attributes and applications using a proliferasome as a case study.

15:35 Refreshment Break

16:00 Sponsored Presentation (Opportunity Available)

16:30 How Can a “Normal End User” Become a Web Service Consumer?

Christian Hauck, Ph.D., Knowledge Management and Competitive Intelligence, Novartis Pharma AG

Web services exist in various variants, but are mostly used by experts to build machine-to-machine interfaces. Without enough consumers, providers don’t add value. Without enough providers, consumers get no value. We try to build web service consumer “frontends” for educated scientists and business people to help overcome this “chicken-and-egg” bootstrap problem. Attendees will learn about how to re-use web services that might already exist in their organizations, thereby creating more value out of what is already available.

17:00 Efficient Web-Browsing for the Life Scientist

Reinhard Schneider, Ph.D., Team Leader, Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL)

Anyone who regularly reads life science literature often comes across names of genes, proteins, or small molecules that they would like to know more about. We have developed a new service called Reflect that can be installed as a plug-in into browsers. Reflect tags gene, protein, and small molecule names in any web page, typically within a few seconds. Clicking on a tagged gene or protein name opens a popup showing a concise summary that includes synonyms, database identifiers, sequence, domains, 3D structure, interaction partners, subcellular location, and related literature. The popups also allow navigation to commonly used databases.

17:30 A Data Warehousing Approach for the Management of Clinical and Operational Information: Specific Challenges within Pharma Development

Norbert Fritz, Ph.D., Development Lead, Business Intelligence Warehouse, Information Management - Product Development Operations, F. Hoffmann-La Roche Ltd

The development of new medicines implies more and more extensive generation and usage of medical information. Management of this broad scope of information requires integration of data generated in heterogeneous data sources and presentation of clinical and operational data in one consolidated data model. Specific challenges for Pharma Development include: evolving and changing medical concepts, disparate systems generating source data, disparate data models and standards within various data sources, transient data quality issues, and the need for frequent data refreshes. We will analyze the specific challenges, present approaches to designing adequate solutions and discuss real-life experiences.

18:00 Close of Conference
