Module development

Module Development for Araport. Jason Miller, J. Craig Venter Institute (JCVI). Plant & Animal Genomes Conference (PAG) 2016.

Transcript of Module development

Page 1: Module development

Module Development for Araport

Jason Miller, J. Craig Venter Institute (JCVI)

Plant & Animal Genomes Conference (PAG) 2016

Page 2: Module development

www.ARAPORT.org

Community
• Jobs
• Meetings
• Curation

ThaleMine
• Genes pages
• Protein pages
• Analysis tools

Modules
• Science apps
• Web services
• Education

JBrowse
• Annotation
• RNAseq
• SNPs

Arabidopsis Information Portal

Page 4: Module development

Modules Proposed

“The design of the AIP will provide core functionality while remaining flexible to encourage multiple contributors and constant innovation.” IAIC Whitepaper (2012), Plant Cell: “Taking the next step”.

Page 7: Module development

Modules Realized

Core web applications for integration and indexing.

In-house Science Apps

Community Science Apps

Page 8: Module development

Interactions Module

Contributed by BAR (2015)

Page 9: Module development

Science App by Asher Pasha & Nicholas Provart from BAR at the University of Toronto.

Page 10: Module development

How It Works

JavaScript in the Browser (https://www.araport.org)…

if ( gene && gene.length > 0 )

… calls Araport web services by URL…

$.get('https://api.araport.org…

… which in turn call BAR web services by URL…

http://bar.utoronto.ca/webservices/get_expressologs.php

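Note: This pass-through middle layer exists because browsers enforce a same-origin policy. JavaScript served from one site is blocked from reading responses from an unrelated site unless that site opts in through “cross-origin resource sharing” (CORS) headers. Routing the call through Araport avoids those cross-origin errors.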
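The pass-through idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Araport's implementation: the function names, the injected `fetch` callable, and the `gene` parameter name are hypothetical, and only the BAR URL comes from the slide.

```python
import json
from urllib.parse import urlencode

# Upstream BAR service named on the slide.
BAR_URL = "http://bar.utoronto.ca/webservices/get_expressologs.php"


def build_upstream_url(base_url, params):
    """Append the caller's query parameters to the upstream service URL."""
    return base_url + "?" + urlencode(sorted(params.items()))


def pass_through(params, fetch, base_url=BAR_URL):
    """Forward a request to the upstream service and relay its JSON reply.

    `fetch` is any callable that takes a URL and returns the response body
    as text; injecting it keeps this sketch runnable without a network.
    """
    body = fetch(build_upstream_url(base_url, params))
    return json.loads(body)


def fake_fetch(url):
    """Stand-in for a real HTTP GET; the payload is invented."""
    return json.dumps({"requested": url, "status": "ok"})


# The parameter name "gene" is an assumption, not the documented BAR API.
reply = pass_through({"gene": "AT2G46830"}, fake_fetch)
print(reply["requested"])
```

Because the browser only ever talks to the middle layer, the upstream host never has to appear in the page's JavaScript.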

Page 11: Module development

How It Works

https://www.araport.org

The graph is interactive.
• Users can rearrange nodes by dragging.
• Users can get details by clicking.

This is Cytoscape.
• The graph is drawn by Cytoscape.js.
• This is a free library for JavaScript.

There are many libraries to choose from!
• jsPhyloSVG: phylogenetic trees
• HighCharts: statistical charts
• jQuery DataTables: interactive tables
• d3.js: all sorts of cool stuff
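As a rough illustration of what a graph library such as Cytoscape.js consumes, the sketch below turns interaction pairs into a list of node and edge elements. The function name and the example pairs are hypothetical. The `group`/`data`/`id`/`source`/`target` layout follows the Cytoscape.js elements JSON format.

```python
import json


def to_cytoscape_elements(interactions):
    """Convert (source, target) pairs into Cytoscape.js element dicts."""
    # One node per distinct gene, in a stable order.
    nodes = sorted({gene for pair in interactions for gene in pair})
    elements = [{"group": "nodes", "data": {"id": gene}} for gene in nodes]
    # One edge per interaction pair.
    for source, target in interactions:
        elements.append({
            "group": "edges",
            "data": {
                "id": f"{source}-{target}",
                "source": source,
                "target": target,
            },
        })
    return elements


# Illustrative pairs only; real data would come from a web service.
pairs = [("AT2G46830", "AT1G01060"), ("AT2G46830", "AT5G61380")]
print(json.dumps(to_cytoscape_elements(pairs), indent=2))
```

The resulting JSON could be handed to the browser, where the library handles layout, dragging, and click events.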

Page 12: Module development

Code re-use

http://bar.utoronto.ca

The Araport science app (left) reuses code from the pre-existing BAR app (right). These apps look different by choice, but they could be made identical.

https://www.araport.org

Page 13: Module development

Key Points
• The Interactions Science App
  – Example of a visualization module
  – Uses the Cytoscape library for JavaScript apps
  – Displays data from BAR web services via an Araport pass-through
  – Developed at BAR by developers who attended an Araport Workshop
  – Similar code deployed at Araport and BAR
• We invite you to develop a visualization module
  – Araport engineers are available to provide technical support
• Featured speaker:
  – Xinbin Dai, The Samuel Roberts Noble Foundation
    • “HRGRN: enabling graph search and integrative analysis of Arabidopsis signaling transduction, metabolism and gene regulation networks”

Page 14: Module development

Subcellular Localization Module

Contributed by SUBA (2015)

Page 15: Module development

Web Service (SUBA3)

Web service by Cornelia Hooper and Ian Castleden from the University of Western Australia.

This URL

Returns this data

Page 16: Module development

Auto Doc (SUBA3)

Automatic documentation at Araport.

These are the service endpoints
• The endpoint is the verb in the URL.
• The verb is followed by a parameter.
• Example: araport/suba/search?locus=AT2G46830

Standard service endpoints at Araport
• /list = which IDs work with this service?
• /search = what are the details for a given ID?
• /prov = who provided this data and how?

Automatic documentation is generated by Araport based on provided metadata. Implemented with server-side components: Adama and Swagger.
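The verb-plus-parameter pattern can be sketched as a small URL builder. This is an illustration only: the helper name is hypothetical, and the base URL is borrowed from the deployment command shown later in the talk. Whether service URLs hang directly off that base is an assumption.

```python
from urllib.parse import urlencode

# Base URL borrowed from the deployment command in the talk; its use as
# the prefix for service endpoints is an assumption of this sketch.
API_BASE = "https://api.araport.org/community/v0.3"

# The three standard endpoint verbs listed on the slide.
STANDARD_VERBS = ("list", "search", "prov")


def endpoint_url(namespace, service, verb, **params):
    """Build namespace/service/verb?param=value for a standard endpoint."""
    if verb not in STANDARD_VERBS:
        raise ValueError(f"unknown endpoint verb: {verb}")
    url = f"{API_BASE}/{namespace}/{service}/{verb}"
    if params:
        url += "?" + urlencode(sorted(params.items()))
    return url


print(endpoint_url("araport", "suba", "search", locus="AT2G46830"))
print(endpoint_url("araport", "suba", "list"))
```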

Page 17: Module development

Auto Doc (SUBA3)

Easy to use “try it out!” button.

Result: this transcription factor localizes to the nucleus.

Results in JavaScript-friendly JSON format.

Page 18: Module development

A Web Service Module
• SUBA provides a web services module
  – URL query takes an Arabidopsis locus as parameter
  – URL responds with a page full of data
    • The data is not formatted for display to humans (e.g. HTML)
    • The data is formatted for JavaScript parsers (in JSON format)
  – The service is RESTful in that the data exchange is achieved with just the standard web protocol, HTTP
• We can all use this module…
  – Build a Science App that colors genes and pathways
  – Build a Science App that scores predicted interactions
  – ThaleMine could add subcellular localization to gene lists
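To show why JSON output makes the module easy to reuse, here is a minimal sketch of a client that parses a response and indexes localization by locus. The response body and its field names (`result`, `locus`, `location`) are invented for illustration. The real SUBA payload may be shaped differently.

```python
import json

# Invented response body; the real SUBA field names may differ.
SAMPLE_BODY = """
{"result": [{"locus": "AT2G46830", "location": "nucleus"}]}
"""


def locations_by_locus(body):
    """Parse a JSON response and map each locus to its localization."""
    records = json.loads(body).get("result", [])
    return {record["locus"]: record["location"] for record in records}


print(locations_by_locus(SAMPLE_BODY))
```

A Science App that colors genes by compartment, for example, would only need a lookup table like this one.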

Page 19: Module development

How They Did It (SUBA3)

1. SUBA created a RESTful web service at their university.
• Added local URLs that return JSON instead of HTML.
• Re-used their existing database and web server.

2. SUBA wrote an Araport adapter.
• Wrote an adapter program in Python.
• Adapter calls their URL and prints results in JSON format.
• Added metadata in YAML format (for auto documentation).
• Saved code to a source code repository on Bitbucket.

3. SUBA deployed the adapter to Araport.
• Used ‘curl’ to send Araport the URL of the source code repository.
• Araport checks out the code, compiles it, containerizes it, and deploys it.
• Araport generates interactive documentation using Swagger.

http://suba.plantenergy.uwa.edu.au/suba-app…

https://bitbucket.org/athaliana/suba-araport

$ curl -kL -X POST -H "$BEARER_TOKEN" -F "git_repository=https://bitbucket.org/athaliana/suba-araport" https://api.araport.org/community/v0.3
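Step 2, an adapter that calls the lab's URL and prints JSON, might look roughly like the sketch below. It is not the SUBA adapter and does not claim to match the Adama interface: the `search(args, fetch)` signature, the output shape, and the stand-in fetcher are all assumptions made so the example runs without a network.

```python
import json
import sys


def format_record(locus, raw):
    """Reshape one upstream record into the JSON text the adapter prints."""
    return json.dumps({"locus": locus, "data": raw}, sort_keys=True)


def search(args, fetch, out=sys.stdout):
    """Handle a search request: call the upstream service, print JSON.

    `args` holds the query parameters (here just `locus`); `fetch` is a
    callable that performs the upstream call and returns a parsed record.
    """
    locus = args["locus"]
    raw = fetch(locus)
    out.write(format_record(locus, raw) + "\n")


def fake_fetch(locus):
    """Stand-in for the HTTP call to the lab's own JSON URL."""
    return {"location": "nucleus"}


search({"locus": "AT2G46830"}, fake_fetch)
```

Keeping the adapter this thin is what lets the existing database and web server stay untouched.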

Page 20: Module development

Key Points
• Example of a web services module
  – Developed at SUBA, University of Western Australia
  – Deployed independently without Araport intervention
• We invite you to develop a web services module
  – Araport will provide documentation, indexing
  – Araport will promote auto-discovery, interoperability
• Featured speaker:
  – Manhoi Hur, Iowa State University, Ames, IA
    • “PMR metabolomics and transcriptomics database and its RESTful web APIs: A data sharing resource”

Page 21: Module development

JBrowse Module

Contributed by EPIC CoGe (2015)

Page 22: Module development

JBrowse Tracks

Track selected.

Track displayed.

Data provided by EPIC CoGe web services. Thanks to Eric Lyons, University of Arizona.

Page 23: Module development

Key Points
• A web services module for JBrowse display
  – Data + metadata provided by EPIC CoGe web services
  – Exposed at Araport using pass-through adapters
    • No code, just metadata
• We invite you to contribute JBrowse track data
  – Support for mapped reads in indexed BAM files
  – Support for genomic variants in VCF files
• Featured speaker:
  – Beth Rowan, Max Planck Institute for Developmental Biology, Tübingen, Germany
    • “User friendly tools for the Arabidopsis thaliana 1001 Genomes”
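“No code, just metadata” can be pictured as a small description of each data file. The sketch below generates JBrowse-style track entries for a BAM file and a VCF file. The `storeClass` and `type` values follow JBrowse 1 track configuration conventions. The helper name, labels, and file URLs are hypothetical, and the metadata format Araport actually expects is not shown in the slides.

```python
import json

# Store and track classes as used in JBrowse 1 track configuration.
TRACK_CLASSES = {
    "bam": ("JBrowse/Store/SeqFeature/BAM",
            "JBrowse/View/Track/Alignments2"),
    "vcf": ("JBrowse/Store/SeqFeature/VCFTabix",
            "JBrowse/View/Track/HTMLVariants"),
}


def track_metadata(label, url, file_type):
    """Describe one remote data file as a JBrowse-style track entry."""
    if file_type not in TRACK_CLASSES:
        raise ValueError(f"unsupported file type: {file_type}")
    store_class, track_type = TRACK_CLASSES[file_type]
    return {
        "label": label,
        "key": label,
        "storeClass": store_class,
        "type": track_type,
        "urlTemplate": url,
    }


# Hypothetical file locations, for illustration only.
tracks = [
    track_metadata("example_reads",
                   "https://example.org/data/reads.bam", "bam"),
    track_metadata("example_variants",
                   "https://example.org/data/variants.vcf.gz", "vcf"),
]
print(json.dumps({"tracks": tracks}, indent=2))
```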

Page 24: Module development

BLAST Module

Contributed by TACC (2015)

Page 25: Module development

BLAST APP

The BLAST app provides basic search against TAIR10 and Araport11 databases. Future versions will provide gene report page hyperlinks and other Science App integrations.

Page 26: Module development

How It Works (BLAST App)

JavaScript in the Browser: https://www.araport.org

function submitBlastJob(Agave){ Agave.api.jobs.submit(…

Docker

Araport Servers:

filesDocker
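The job submission made from the browser carries a small JSON description of the work to run. As a hedged illustration, the sketch below builds such a request body in Python. The top-level fields (`name`, `appId`, `archive`, `inputs`, `parameters`) follow the Agave job request layout. The app identifier, the `query` and `database` keys, and the file path are hypothetical and are not taken from the BLAST app's source.

```python
import json


def blast_job_request(query_path, database, app_id="blast-example-0.1"):
    """Build the JSON body for an Agave-style job submission."""
    return {
        "name": f"blast-vs-{database}",
        "appId": app_id,                       # hypothetical app identifier
        "archive": False,
        "inputs": {"query": query_path},       # hypothetical input key
        "parameters": {"database": database},  # hypothetical parameter key
    }


request = blast_job_request("queries/example.fasta", "Araport11")
print(json.dumps(request, indent=2))
```

The server side then runs the named app inside its container and writes results to files that the browser can fetch once the job completes.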

Page 27: Module development

How It Works (BLAST App)

The source code is public:

https://www.github.org

◀ JavaScript Science App for browsers (upload this using an Araport form)

◀ BLAST database build script for servers (submit this with Agave tools)

◀ BLAST software build script for servers (submit this with Agave tools)

Page 28: Module development

Key Points
• The BLAST Science App
  – Built by Araport staff with tools available to end users
  – Code is open source (can be re-used)
  – Code is portable (can be installed at your site, too)
• We invite you to contribute a computational module
  – Develop in almost any programming language
  – Define the Docker container for running it
  – Use Agave for scheduling jobs, storing files, etc.
  – Deploy a JavaScript user interface to Araport
• Featured speaker:
  – Michael Hamilton, Colorado State University
    • “Predicting differential intron retention with iDiffIR”

Page 29: Module development

Summary
• The Araport platform
  – Hosts modules from community members
    • Members gain visibility, accessibility, discoverability
    • Members benefit from documentation, tech support
  – Hosts many forms of community modules
    • Visualization Science Apps using JavaScript libraries
    • Computation Science Apps and back end software
    • Pure data interchange as RESTful web services
    • JBrowse tracks as RESTful web services
• On-going infrastructure development
  – Federated search, ontology-based interoperation
  – User workspaces, drag & drop combinations

Page 30: Module development

Araport Developer Workshops

Deploying the Atted Science App Tutorial at the AIP Developer Workshop, TACC, Nov 2014.

The Atted Science App Tutorial is available as open source on GitHub.

Sign up now for the 2016 workshop

Page 31: Module development

Acknowledgements

Araport Data Sources

Page 32: Module development

Acknowledgements

J. Craig Venter Institute
• Chris Town (PI)
• Jason Miller
• Agnes Chan
• Erik Ferlanti
• Irina Belyaeva
• Chia-Yi Cheng
• Vivek Krishnakumar
Former team members: Konstantinos Krampis, Svetlana Karamycheva, Maria Kim, Ben Rosen, Christopher Nelson, Seth Schobel

University of Cambridge
• Gos Micklem
• Sergio Contrino

Funding Agencies

Texas Advanced Computing Center
• Matt Vaughn
• Steve Mock
• Matt Hanlon
• Walter Moreira
• Rion Dooley
• Joe Stubbs
• Josue Balandrano Coronel
• Alex Rocha

Page 33: Module development

Abstract

The Arabidopsis Information Portal (www.Araport.org) is an integrated resource for Arabidopsis genomics data, web-based genome browsing, and data mining. Araport is also a community-extensible platform for growth. Community labs are invited to contribute modules devoted to specific experimental data types. Araport’s community modules provide databases, computations, and visualizations. These are exposed as user-friendly web applications (“Science Apps”) or programmer-friendly web services or both. Araport modules are custom branded, auto-documented, and portable to other web sites. Module deployment is automated and developer-driven. Provenance tracking, usage reporting, data indexing and data integration will be automated soon. We will explain the process of developing a module and deploying it to Araport.

Page 34: Module development

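Requisite Architectural Diagram

[Diagram. Labels recoverable from the slide: external programs and portal programs (www.araport.org: Science Apps, ThaleMine, JBrowse) sit in front of the API (api.araport.org), which provides authentication, metering, logging, versioning, and security. The API layer shows Agave Core (keep metadata, enroll users; Apps, Jobs, Systems) and ADAMA (format data, enroll services; services a through f). Back-end resources are labeled Computing, Storage, and Databases. External services are labeled CGI, SOAP, and REST, with InterMines, Tripal, and Others as examples.]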