This track explores information technology & business process integration; open and collaborative platforms; data modeling, storage management, and analysis tools; as well as ontologies and semantic web. Featured case studies will explore how to leverage computer systems and database management in support of drug discovery and development processes.
14:00-17:30 Pre-Conference Workshop: Sequencing Data Storage
Co-organized by CHI and BioTeam
Matthew Trunnell, Manager, Research Computing, Broad Institute
Chris Dagdigian, Founding Partner and Director of Technology, BioTeam, Inc.
Guy Coates, Ph.D., Group Leader, Informatics System Group, The Wellcome Trust Sanger Institute
Additional Speakers to be Announced
BioTeam has been on the frontlines of next-generation sequencing integration, having helped several organizations with the unique next-gen IT, storage, and data management challenges. This workshop will present real-world customer experiences straight from the trenches. You’ll get practical information about the storage support needed by research organizations in the next-gen world.
* Separate registration required
WEDNESDAY, 7 OCTOBER
13:50 Chairperson’s Remarks
14:05 FEATURED SPEAKER
IT-Enabled Process Integration Risk Management
Chris Asakiewicz, Ph.D., Affiliate Professor of Information Management, Wesley J. Howe School of Technology Management, Stevens Institute of Technology
Today, business organizations invest hundreds of millions of dollars in IT infrastructure and applications with the intention of realizing significant benefit from their investment. However, business investment in IT-enabled processes can often be jeopardized by a failure to identify and manage the risks associated with inter- and intra-IT/business process integration. This presentation discusses both a process and a tool for proactively identifying, addressing, and mitigating integration risks associated with IT-enabled business process change efforts. By identifying and addressing the risks associated with IT-enabled business process integration, business organizations will better ensure progress and alignment on critical enterprise-wide programs and strategic initiatives, enhance the ownership and accountability around their change efforts, and ensure the successful implementation and realization of benefits from their investments in IT-enabled business processes.
14:35 Building a Flexible Infrastructure for Life Science with Federated Services, Open Source, and Bioclipse
Lars Carlsson, Ph.D., Global Safety Assessment, AstraZeneca R&D
Ola Spjuth, M.Sc., Bioclipse Coordinator, Uppsala University
Bioclipse is a free, open source workbench and development platform that delivers advanced functionality available both from an intuitive GUI and an extensible scripting language. Bioclipse is the result of an international academic collaboration between Uppsala University, European Bioinformatics Institute (EBI), Cambridge University, Cologne University, and Ludwig-Maximilians-Universität. This presentation describes how Bioclipse can be utilized as a cost-effective integration platform and how well this combines with open standards and a federated service infrastructure with respect to custom development, integration, maintenance, provisioning, reporting, and end user experience. The presentation also demonstrates ongoing work at AstraZeneca towards establishing services and tools based on open standards and open source in pharmaceutical R&D.
15:05 Nature Network and Connotea: Social-Software Projects
Ian Mulvany, Product Development Manager, Nature Publishing Group; Project Manager, Nature Network and Connotea
Nature Publishing Group has created a number of tools to help scientists in their daily work get to the information they need. These tools include Connotea, an online bookmarking system for sharing links and a free online reference management tool for all researchers, clinicians and scientists; and Nature Network, a blog- and forum-driven social network for scientists to discuss research and to identify communities of interest. Other tools include Nature Precedings, Nature Blogs and Scintilla. These tools will hopefully help to address the looming problem of information overload that science will face. This presentation will discuss these tools in depth as well as future possibilities of building onto and integrating with these services.
15:35 Refreshment Break
16:00 Translational Research Informatics Meets Cloud Computing (Sponsored Presentation)
James DeGreef, Ph.D., Vice President Market Strategy, GenoLogics
Christina Schroeder, Manager Customer Solutions, GenoLogics
As translational approaches continue to gain momentum in academic health sciences and in pharmaceutical and biotechnology research enterprises, they call for new adaptive, integrative, and innovative approaches to informatics in place of traditional large-scale enterprise software deployments. Web 2.0 technologies are transforming consumer Internet computing, with new business models and collaborative network approaches transforming the software business. Learn how GenoLogics is providing innovative translational research informatics solutions to leading pharmaceutical companies, academic medical centers, and research institutions, networks, and universities. Real customer examples will be reviewed, focusing on clinical data integration, cloud computing platforms, genomics, proteomics, systems biology, biorepositories, LIMS, query portals, translational data repositories, federating architectures, biomarker ontologies, and semantic web.
16:15 Sponsorship Presentation (Sponsorship Opportunity Available)
16:30 How myExperiment Supports Social Curation, Workflow & Protocols
Carol Goble, Ph.D., Professor of Computer Science, University of Manchester, UK; Principal Investigator (PI), myGrid Project
myExperiment (www.myexperiment.org) is a collaborative environment where scientists can safely publish and share data and analytical pipelines, computational workflows and experiment plans. Workflows and other scientific objects and collections can be swapped, sorted and searched for, making it straightforward for our next generation of scientists to draw upon and contribute to a pool of scientific assets and hence share expertise and avoid reinvention. Our public myExperiment.org website has nearly 1000 publicly deposited workflows from more than 10 workflow systems including Taverna, Trident, Pipeline Pilot and Triana. Our approach is to support and encourage social curation - helping the community deposit and curate the content themselves - by understanding the incentives to share and annotate and addressing concerns about sharing. This presentation describes why myExperiment is an extraordinary resource for bio-developers developing workflows and protocols, how we have incentivised and protected contributors of content, and how myExperiment can be embedded in a workflow platform, specifically Taverna (www.taverna.org.uk), or in other applications. We use the same principles of crowd-sourced content and social curation and collaboration in related projects within the myGrid consortium, notably BioCatalogue (www.biocatalogue.org) - a curated catalogue of web services in the Life Sciences. The presentation will also refer to this crucial, newly released Life Science resource. myExperiment and BioCatalogue are both projects from the UK's myGrid consortium (www.mygrid.org.uk), which includes the University of Southampton and EMBL-EBI as partners.
17:00-17:30 Federating Computing for the Life Sciences: Towards a Universal Platform
Karim Chine, Software Architect and Coordinator, Biocep
Bernd Bischl, Fakultät Statistik, Technische Universität Dortmund
R, the open-source software environment for statistical computing and graphics, is becoming the lingua franca of data analysis. Repositories of contributed R packages related to a variety of problem domains in life sciences are growing at an exponential rate. Scilab, the open-source software package for numerical computations, is becoming more and more widely used for scientific applications. The ubiquitous Java technologies allow the building of highly effective platform-independent distributed systems and graphical user interfaces. Virtualization technologies allow the creation, distribution and reuse in any environment of snapshots of operating systems, computing software stacks and data sets. Those snapshots (or virtual images) can be run and used either locally, or on public clouds (Amazon EC2) or on private clouds (Eucalyptus System). From these ingredients and others, Biocep builds a universal open-source computing platform that creates an open environment for the production, sharing and reuse of all the artifacts of computing. It enables centralized and strictly controlled access within the organization to hardware and software computational resources, and it provides frameworks and tools for the rapid creation of easily maintainable and highly scalable analytical software applications.
18:30 BIOTECHNICA Night – Original Bavarian Beer Hall, full dinner reception, a traditional German Band
THURSDAY, 8 OCTOBER
08:50 Chairperson’s Remarks
09:05 Converging Computer Science and Systems Biology
Corrado Priami, Professor of Computer Science, University of Trento; President and CEO, The Microsoft Research - University of Trento Centre for Computational and Systems Biology (CoSBi)
CoSBi, a partnership between Microsoft, the Italian Government and University of Trento, was formed four years ago and has grown steadily. There are over 25 researchers who have developed a number of prototype tools that are freely available to the public. CoSBi is working on algorithmic systems biology where computing and systems biology converge. They are using computer science tools to try and develop a new programming language, syntax, and toolset to model and simulate living systems. During the next 18 months, CoSBi hopes to develop an interface to the platform that is useable by biologists. This talk will present the technology challenges and limitations of this model as well as its scalability.
09:35 Genetics, Genomics and Systems Biology
Hans Lehrach, Ph.D., Founder, Alacris Pharmaceuticals; Director, Molecular Genetics, Max Planck Institute; Professor in Biochemistry, Free University of Berlin
Biological processes are driven by complex networks of interactions between molecular and cellular components. Predicting the outcome of potential disturbances is of prime importance to be able to prevent disease, as well as to identify possible therapies for diseases that are already present. To predict the behaviour of such complex networks, we will have to develop general models of the processes involved, based on information on pathways derived from genetic and molecular approaches, to ‘individualise’ these by applying ‘genomics’ scale analysis techniques (e.g. genome and/or transcriptome analysis by next-gen sequencing techniques; genomics), and to explore the behaviour of these models computationally (systems biology). We are using a combination of high-throughput sequencing of the genome and transcriptome of both tumor and patient to establish predictive models (virtual patients), which ultimately will reflect the response of real patients to specific therapies in oncology and other areas of medicine.
10:05 ISA: Standards and Infrastructure for Managing Experimental Metadata
Susanna-Assunta Sansone, Coordinator, EMBL-EBI and NERC-NEBC
Philippe Rocca-Serra, Technical Coordinator, EMBL-EBI and NERC-NEBC
Composed of a set of freely available software components, the ISA infrastructure (http://isatab.sf.net) empowers communities to aggregate their own sets of multi-omics studies, complying with the relevant reporting standards, store them locally and/or submit them to public repositories, where applicable. It has been widely recognized that data should be accompanied by contextual information (experimental metadata) to provide the wider scientific community, which may wish to examine and use these datasets to underpin other bio-investigations, with any necessary details on the origin of the data. Such details include experimental design, related publications, sample source(s) and treatment(s), preparation of a sample for analytical assay, the processes and instruments used throughout, and sample-data file relations. To enable a thorough and regularized description of this experimental metadata (and the associated data), a growing number of reporting standards (minimum information checklists: http://www.mibbi.org; ontologies: http://www.obofoundry.org) are being developed. This presentation will introduce the synergistic standards activity and illustrate how these standards are implemented in the ISA infrastructure for managing experimental metadata from a variety of multi-omics studies.
10:35 Coffee Break
11:00 Integration of Data in BioModels Database Using MIRIAM
Camille Laibe, Software Developer, European Bioinformatics Institute
BioModels Database is a data resource that allows biologists to store, search and retrieve published mathematical models of biological interest. BioModels Database is heavily used by the systems biology community and provides much more than a simple catalogue of models. This talk will describe how data integration is achieved in BioModels Database using MIRIAM. MIRIAM Resources is a growing tool that answers the needs of anybody having to deal with perennial identification of information in life sciences.
11:30 Molwind - Mapping Molecule Spaces to Geospatial Worlds
Christian Herhaus, Ph.D., Bio- and Chemoinformatics, Merck Serono
Molwind is a server based on NASA’s World Wind concept, a browser to navigate data interactively on planetary terrain. Molwind extends the applicability of the proven and elaborated navigation concepts and tools available in Geoinformatics very generally to other disciplines for easier analysis of complex network- or tree-like graphs. Molwind makes it easier to “move” within a dataset, rapidly changing complexity levels, and with lots of options for annotation of additional data where desired, irrespective of the source of the data. Learn how the Molwind tool addresses the limitations that currently exist with graph and data analysis tools.
12:00 Development of an Automated Information System for Cell Therapy Manufacturing
Fabio Triolo, D.d.R., M.Phil., Ph.D., Technical Director, Regenerative Medicine and Cell Therapy Unit, Mediterranean Institute for Transplantation and Advanced Specialized Therapies (ISMETT)
Tommaso Piazza, Eng.D., Chief Information Officer, Mediterranean Institute for Transplantation and Advanced Specialized Therapies (ISMETT)
This talk will discuss a sophisticated, research-friendly, paperless electronic system created to document cell therapy manufacturing for cGMP cell processing facilities. One of the most innovative features of the system is its information coding that allows it to have standardized and retrievable data and to perform statistical analysis and data mining on retrieved data. This allows it to carry out multi-dimensional analyses on the database, in order to compare data from different isolations, find significant correlations between parameters involved in the isolation process and the success of the transplant, and ultimately use this information to facilitate the identification of those factors which contribute more effectively to the process outcome.
12:30-13:45 Lunch for Purchase in the Exhibit Hall and Exhibit Viewing
13:50 Chairperson’s Remarks
14:05 CASTOR QC - A Database Approach for Handling Large Genomic Data Sets
Tibor van Rooij, Bioinformatics Director, Génome Québec & Montreal Heart Institute Pharmacogenomics Centre
CASTOR QC (Comprehensive Analysis and STORage) uses a novel database-centric approach that leverages data structures and database technologies to enable rapid analysis of genotypic and phenotypic data. Learn how both genome-wide and candidate gene association analyses can be conducted in a highly parallel and efficient manner. CASTOR QC addresses the three main challenges in streamlining this type of data for rapid processing: 1) reduce the size of the output files, 2) transform the data into an analysis-friendly format and 3) eliminate the need for sequential access to the data.
14:35 CBIP: Successfully Implementing a Large-Scale Automation and Informatics Platform for Academic Probe Discovery
Raza Shaikh, Associate Director of Informatics, Chemical Biology Platform, The Broad Institute
Over the past two years, The Broad Institute set out to completely rebuild its high throughput screening and probe discovery platform using state of the art automation and informatics tools. The platform uses an electronic notebook as the rich front-end for recording chemical and biological information integrated with a web-based LIMS system. Our innovative approach to engineering the solution included informatics participating in Factory Acceptance Tests (FAT) and building functionality to align with Site Acceptance Tests (SAT). This approach, coupled with a just-in-time approach to software engineering, has allowed us to build a comprehensive platform. Attendees will learn the issues and decisions involved in building an informatics platform that caters to the needs of a diverse set of processes.
15:05 Sponsored Presentation (Opportunity Available)
15:35 Refreshment Break
* Shared Session with Track 4: Data Integration and Knowledge Management
16:00 The W3C Health Care and Life Sciences Interest Group: Semantic Web in Action
M. Scott Marshall, Ph.D., W3C HCLS IG Co-chair, Informatics Institute, University of Amsterdam
The W3C Semantic Web for Health Care and Life Sciences Interest Group (HCLS) has the mission of developing, advocating for, and supporting the use of Semantic Web technologies for biological science, translational medicine and health care. HCLS covers hot topics including data integration and federation, bridging commonly used domain standards such as CDISC and HL7, and the applications of medical terminologies. This talk will introduce the HCLS, as well as provide an overview of the activities that are currently ongoing within the BioRDF, Linking Open Drug Data, and Scientific Discourse tasks. Some new developments and the recent Face2Face meeting will also be discussed. The audience will gain an understanding of how actual Semantic Web applications work and how they are being applied to the areas of Health Care and Life Sciences.
16:30 Pfizerpedia Patents - a Semantic Wiki Database of Patent Information
Andrew Berridge, B.Sc., Delivery Advisor, Informatics, Pfizer
Pfizerpedia Patents is a project using Semantic MediaWiki to store competitor patent information. For the first time, this has enabled a searchable database of patents to be built in Pfizer. This presentation shows how the class-leading MediaWiki software (which also powers Wikipedia.org) can be used to produce custom applications with little or no software development, minimal support costs, and zero licensing cost. Learn how the innovative use of this technology can inspire your organization to create its own information repository.
17:00 BIODIVER – A Federated Query Engine Integrating with DAS
Oliver Karch, Ph.D., Bio- and Chemoinformatics, Merck Serono
The BIODIVER federated query engine dynamically aggregates information from various biomedical data sources in a flexible way by simultaneously querying disparate data repositories, e.g. relational databases, ontologies, full-text, sequence indexes, web-services, etc. Execution plans specifying a biomedical concept’s retrieval process are evaluated at runtime, allowing dynamic search strategies to be realized. Query results are represented in a simple and uniform way based on XML. This talk describes an integration layer which allows BIODIVER query strategies to be exposed as data sources of a Distributed Annotation System (DAS), enabling query results to be consumed seamlessly by DAS-compliant feature viewers.
17:30 Conference Adjourns