Integration of Third Party Genetic Analysis Software Into a Clinical Next Generation Sequencing Data...


description

Integration of a third party statistical library (SciClone) into VarBase for the analysis of next generation sequencing data.

Transcript of Integration of Third Party Genetic Analysis Software Into a Clinical Next Generation Sequencing Data...

  • Content

    Advancements in next generation sequencing (NGS) technology have allowed researchers and clinicians to generate genome-wide datasets. However, current information systems, which predate the arrival of NGS technology, lack adequate methods for NGS analysis and remain isolated from the systems which generate the data. In order to add meaningful annotations to clinical NGS data, we developed a system that allows us to request and dynamically format sequencing data for a variety of software applications.

    Design

    Overall Program Design

    Results

    • Easily consumable web services support NGS interpretation in clinical and research scenarios without adversely impacting system performance
    • Third-party applications can be easily integrated through web service interfaces
    • Minimal knowledge of the data model is required
    • Server-based authorization allows patient- and field-based data restriction
    • Results are easily returned to the data management system

    Conclusion

    Generating genome-wide sequences and analyzing the data in a clinically meaningful way are two distinct capabilities that have evolved at different rates. With NGS technology evolving at a rapid pace, the technology used to interpret it struggles to keep up. To increase efficiency and broaden data access, mechanisms such as web services that allow easy and secure access to third-party research applications are important for furthering our understanding of sequencing data.

    Integration of Third Party Genetic Analysis Software into a Clinical Next Generation Sequencing Data Platform

    THOMAS JS DURANT MD, WADE L. SCHULZ MD, PhD
    YALE SCHOOL OF MEDICINE, DEPARTMENT OF LABORATORY MEDICINE, NEW HAVEN CT, 06520

    • Authenticate user to web service
    • Web service call to VarBase
    • Deserialize returned data model
    • Parse VCF data with Python script
    • Analyze output with SciClone
    • Send result to VarBase web service

    [Figure panels: VCF Data Parser; VarBase Architecture; Integration Workflow; Data Parsing; Data Analysis]
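The "Parse VCF data with Python script" step in the workflow above could look roughly like the sketch below. It is not the authors' parser: the function name `parse_vcf_for_sciclone` is hypothetical, the code is written for Python 3 rather than the Python 2.7 named on the poster, and it assumes each record carries an `AD` (allelic depth) FORMAT field. The five-value rows (chromosome, position, reference reads, variant reads, variant allele fraction in percent) follow the input layout that SciClone is generally documented to expect, which should be checked against the installed package version.

```python
def parse_vcf_for_sciclone(vcf_text, sample_index=0):
    """Turn VCF text into (chrom, pos, ref_reads, var_reads, vaf_percent) rows.

    Hypothetical helper: assumes an AD FORMAT field holding "ref,alt" depths.
    """
    rows = []
    for line in vcf_text.splitlines():
        # Skip blank lines and meta/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Need the 8 fixed columns, FORMAT, and the requested sample column.
        if len(fields) < 10 + sample_index:
            continue
        chrom, pos, alt = fields[0], int(fields[1]), fields[4]
        # Assumption: multi-allelic sites are skipped to keep the sketch simple.
        if "," in alt:
            continue
        sample = dict(zip(fields[8].split(":"),
                          fields[9 + sample_index].split(":")))
        if "AD" not in sample:
            continue
        try:
            ref_reads, var_reads = (int(x) for x in sample["AD"].split(",")[:2])
        except ValueError:
            continue  # missing ("." ) or malformed depth values
        depth = ref_reads + var_reads
        if depth == 0:
            continue
        vaf = round(100.0 * var_reads / depth, 2)
        rows.append((chrom, pos, ref_reads, var_reads, vaf))
    return rows


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text that R can load with read.table."""
    return "\n".join("\t".join(str(v) for v in row) for row in rows)
```

The tab-separated output can then be read into an R data frame on the analysis machine and handed to SciClone.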

    Technology

    • Data management system (VarBase-Galileo) hosted on Windows Server 2012 (Microsoft, Redmond, WA; USA)
    • RESTful web service interface
    • Research software packages run on Linux-based virtual machines (CentOS)
    • Web service responses consumed and parsed using Python 2.7
    • Visualized in R (version 3.1.2) using the SciClone R package
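As a rough illustration of how a Python client might consume a RESTful interface like the one listed above, the sketch below builds an authenticated request and deserializes a JSON response. Everything specific here is an assumption, since the poster does not describe the VarBase API: the `/variants` path, the `patient_id` query parameter, the bearer-token header, and the JSON response format are all hypothetical. The code uses only the Python 3 standard library.

```python
import json
import urllib.parse
import urllib.request


def build_variant_request(base_url, patient_id, token):
    """Build (but do not send) a GET request for one patient's variants.

    Hypothetical endpoint and auth scheme; the real VarBase API may differ.
    """
    query = urllib.parse.urlencode({"patient_id": patient_id})
    url = "{}/variants?{}".format(base_url.rstrip("/"), query)
    request = urllib.request.Request(url, method="GET")
    # Assumption: the server accepts a bearer token and returns JSON.
    request.add_header("Authorization", "Bearer " + token)
    request.add_header("Accept", "application/json")
    return request


def deserialize_variants(payload):
    """Decode a JSON array of variant records into a list of dicts."""
    records = json.loads(payload)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of variant records")
    return records
```

Sending the request would be a call to `urllib.request.urlopen(request)`, with the response body passed to `deserialize_variants`. Because authorization is enforced on the server, the client only sees the patients and fields it is permitted to access.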