Introduction to Web Services

Peter Fischer Hallin, Center for Biological Sequence Analysis

Comparative Microbial Genomics Workshop

Bangkok, Thailand

June 2nd 2008

Background - why worry...

• Increasing size and number of public sequence databases - faster sequencing methods

• Increasing number of bioinformatic prediction tools

• Requirements for data integration, spanning several types of data, different databases and physical locations.

• Solution: Web Services

Topics covered

• Web Services vs. Web Applications

• Why WS? Advantages / disadvantages

• Design of web services: interoperability

• Implementation case stories

• Invoking a web service

• WS and its future role

Web applications

• HTML-based software designed to interact with the user - in most cases this involves human interpretation and navigation.

• If designed well, they are user friendly

• Do not require special skills to operate

• In nearly every case, web pages do not make sense to computers (wget?)

Web Applications ... you use them every day

... plus a billion more ...

Web Services (on the other hand)

- are software designed to enable computer-to-computer interaction

- should aim to enhance interoperability between different systems

- exchange objects which are well defined

- consist of methods / operations which are well defined

- exchange data using SOAP over HTTP.

Interoperability

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getProductDetails xmlns="http://warehouse.example.com/ws">
      <productID>827635</productID>
    </getProductDetails>
  </soap:Body>
</soap:Envelope>

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getProductDetailsResponse xmlns="http://warehouse.example.com/ws">
      <getProductDetailsResult>
        <productName>Toptimate 3-Piece Set</productName>
        <productID>827635</productID>
        <description>3-Piece luggage set. Black Polyester.</description>
        <price currency="NIS">96.50</price>
        <inStock>true</inStock>
      </getProductDetailsResult>
    </getProductDetailsResponse>
  </soap:Body>
</soap:Envelope>
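The request and response envelopes above can be produced and read by a program with nothing more than an XML library. The following is a minimal sketch in Python using only the standard library; the function names `build_request` and `parse_response` are illustrative assumptions, not part of any real warehouse service, and no network call is made.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
WS_NS = "http://warehouse.example.com/ws"  # namespace taken from the example envelopes
NS = {"soap": SOAP_NS, "ws": WS_NS}

# The response envelope from the slide, used here as sample input.
SAMPLE_RESPONSE = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getProductDetailsResponse xmlns="http://warehouse.example.com/ws">
      <getProductDetailsResult>
        <productName>Toptimate 3-Piece Set</productName>
        <productID>827635</productID>
        <description>3-Piece luggage set. Black Polyester.</description>
        <price currency="NIS">96.50</price>
        <inStock>true</inStock>
      </getProductDetailsResult>
    </getProductDetailsResponse>
  </soap:Body>
</soap:Envelope>
"""


def build_request(product_id: str) -> str:
    """Build a getProductDetails request envelope (hypothetical helper).

    ElementTree may pick its own prefix (e.g. ns1) for the service namespace;
    that is equivalent XML to the default-namespace form shown on the slide.
    """
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{WS_NS}}}getProductDetails")
    ET.SubElement(operation, f"{{{WS_NS}}}productID").text = product_id
    return ET.tostring(envelope, encoding="unicode")


def parse_response(xml_text: str) -> dict:
    """Extract the well-defined fields from a getProductDetails response."""
    root = ET.fromstring(xml_text)
    result = root.find(
        "soap:Body/ws:getProductDetailsResponse/ws:getProductDetailsResult", NS
    )
    if result is None:
        raise ValueError("response does not contain getProductDetailsResult")
    price = result.find("ws:price", NS)
    return {
        "productName": result.findtext("ws:productName", namespaces=NS),
        "productID": result.findtext("ws:productID", namespaces=NS),
        "price": float(price.text),
        "currency": price.get("currency"),
        "inStock": result.findtext("ws:inStock", namespaces=NS) == "true",
    }
```

Because every element is named and namespaced, the client never has to guess where a value sits on a page, which is the point of the contrast with HTML-based web applications.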

Basic concepts of the web service:

‣ XML (Extensible Markup Language): Format for input and output.

‣ Request: An XML message generated by the user/client and sent to a service.

‣ Response: An XML message generated by the server as an answer to the request.

‣ WSDL (Web Services Description Language) file: Most often published by the service provider. Describes all aspects of the service

‣ Message: A detailed declaration of how input and output looks.

‣ XSD (XML Schema Definitions) is the language for defining the types that form the messages.

‣ Types: Declarations of input and output objects can be within the WSDL file itself, or linked externally.

‣ Endpoint: A URL pointing the client to a location which will read the request and respond.

‣ Documentation: Every object element/attribute can be documented
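To make the concepts listed above concrete, here is a hedged sketch that reads a deliberately tiny WSDL 1.1 document and lists its operations, messages, documentation, and endpoint. The WSDL text is invented for illustration (it reuses the warehouse namespace from the earlier example and is not a real published service), and `describe_wsdl` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
WSDL_SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/"
NS = {"wsdl": WSDL_NS, "soap": WSDL_SOAP_NS}

# Hypothetical, heavily trimmed WSDL; real files also declare types and message parts.
MINIMAL_WSDL = """
<definitions name="Warehouse"
    targetNamespace="http://warehouse.example.com/ws"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://warehouse.example.com/ws">
  <message name="getProductDetailsRequest"/>
  <message name="getProductDetailsResponse"/>
  <portType name="WarehousePortType">
    <operation name="getProductDetails">
      <documentation>Look up one product by ID.</documentation>
      <input message="tns:getProductDetailsRequest"/>
      <output message="tns:getProductDetailsResponse"/>
    </operation>
  </portType>
  <binding name="WarehouseBinding" type="tns:WarehousePortType">
    <soap:binding style="document"
        transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="WarehouseService">
    <port name="WarehousePort" binding="tns:WarehouseBinding">
      <soap:address location="http://warehouse.example.com/ws/endpoint"/>
    </port>
  </service>
</definitions>
"""


def _message_of(operation, tag):
    """Return the message reference of an <input>/<output> child, if present."""
    element = operation.find(tag, NS)
    return element.get("message") if element is not None else None


def describe_wsdl(wsdl_text: str):
    """Return (operations, endpoints) found in a WSDL 1.1 document."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for op in root.findall("wsdl:portType/wsdl:operation", NS):
        operations[op.get("name")] = {
            "input": _message_of(op, "wsdl:input"),
            "output": _message_of(op, "wsdl:output"),
            "documentation": op.findtext(
                "wsdl:documentation", default="", namespaces=NS
            ),
        }
    endpoints = [
        address.get("location")
        for address in root.findall("wsdl:service/wsdl:port/soap:address", NS)
    ]
    return operations, endpoints
```

Calling `describe_wsdl(MINIMAL_WSDL)` yields the operation name, its request and response messages, and the endpoint URL, which is essentially what tools do when they generate default requests from a WSDL.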

How does the WSDL look?

• Written in XML

• Looks complicated - most developers are reluctant to write them manually

• Personally, I believe they should be written by hand in a simple text editor...

• But let’s first look at an XSD file defining the objects used for the EasyGene service, which is an ORF finder

The XSD: Declaring types used for messages

Data types

Detailed documentation

The WSDL: Importing types, defining messages

The WSDL: Linking operations and messages

The WSDL: Linking operations to endpoints

The WSDL: Service documentation

The WSDL: Endpoint

Web Services design considerations

• Common data types

• Granularity

• Typing

Granularity, Granularity, and Granularity

• Our choice of technology sets standards for typing any element in input/output - and these standards should be used!

• To a certain extent, Web Services is all about plumbing - connecting objects (pipes) from different operations to build a workflow and finally to generate a result for you

• The poorer the granularity, the more difficult this plumbing gets

Figure: Tropomyosin isoforms (panels 1, 2, 3)

Granularity

Typing

array

id: string

sequence: string - restrictions? /^[ACDEFGHIKLMNPQRSTVWY]+$/
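The restriction above can be enforced before a request ever leaves the client. The sketch below mirrors the slide's typing (an array of entries, each with a string `id` and a string `sequence`) and applies the same regular expression; the class name `SequenceEntry` and the function `validate_entries` are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List

# Pattern copied from the slide: one or more of the 20 standard amino acid letters.
PROTEIN_PATTERN = re.compile(r"^[ACDEFGHIKLMNPQRSTVWY]+$")


@dataclass(frozen=True)
class SequenceEntry:
    """One element of the typed array: an identifier plus a protein sequence."""

    id: str
    sequence: str

    def __post_init__(self):
        # fullmatch rejects a trailing newline, which a plain match with `$` would accept.
        if not PROTEIN_PATTERN.fullmatch(self.sequence):
            raise ValueError(f"entry {self.id!r}: sequence violates the restriction")


def validate_entries(pairs: Iterable[tuple]) -> List[SequenceEntry]:
    """Build the typed array, failing on the first entry that breaks the restriction."""
    return [SequenceEntry(id=entry_id, sequence=seq) for entry_id, seq in pairs]
```

For example, `validate_entries([("p1", "MKTAYIAK")])` succeeds, while a sequence containing `B`, `X`, lowercase letters, or nothing at all raises `ValueError`.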

Requirements

• You will not need to examine the WSDL and XSD files during this workshop.

• You should just know that they exist and understand the idea behind the WSDL.

3 methods to invoke web services:

• Workflow editors: Graphical. Supposedly less user-unfriendly.

• Text/XML editors: Easy to use. Fewer features. For exploring services.

• Programmatic access: For writing clients and automating tasks.

Graphical workflows

E.g. Taverna: http://taverna.sourceforge.net/

Raw text/XML

SoapUI: www.soapui.org

Perl/Python/C

SOAP modules exist for most programming languages: Perl, Python, C, Java ...

Demonstration using SoapUI

SoapUI is ...

- Java-based stand-alone software for inspecting, invoking, and testing Web Services.

- Downloadable from http://www.soapui.org free of charge.

- A ‘pro’ edition with extended features is available for purchase.

Why SoapUI

• Ideal for development/testing of Web Services: Provides the raw XML request/response to/from your server. It allows you to view HTTP headers, attachments etc.

• Constructs template/default request messages based on the WSDL/XSD

• SoapUI's strength is inspecting and manually invoking operations; however, the Pro edition supports workflows/test cases!

• Handles multiple WSDLs within a project, and multiple projects within a session

Example: Inspecting two services

• Genome Atlas (http://www.cbs.dtu.dk/ws/GenomeAtlas/): Various database tools accessing prokaryotic genome sequences.

• RNAmmer (http://www.cbs.dtu.dk/ws/RNAmmer): Predicts ribosomal RNA genes in full genome sequences.

Creating a new project

Labeling the project and adding WSDL

Default requests are created automatically

We provide a GenBank accession to ‘getSeq’

http://www.cbs.dtu.dk/ws/RNAmmer

RNAmmer online documentation

Adding a new WSDL to the project

Copy the genome sequence from ‘getSeq’

Paste genome sequence in RNAmmer ‘runService’

Submit and acquire jobid

Poll queue until job finished

Fetch the result, using ‘fetchResult’

Asynchronous operations

EasyGene - the programmatic way

• Similar example, this time programmed in Perl

• Using Genome Atlas ‘getSeq’ to obtain genome sequence, and submit this to the ORF finder EasyGene.

• Poll for the job status and obtain result.
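The submit, poll, and fetch steps above follow a common asynchronous pattern. The Perl client from the original slides is not preserved in this transcript, so the following is only a Python sketch of the control flow against an in-memory stand-in. `FakeJobService`, the method name `poll_queue`, and the status strings are assumptions; only the operation names ‘runService’ and ‘fetchResult’ come from the slides, and the real services are not contacted.

```python
import time


class FakeJobService:
    """In-memory stand-in for a remote asynchronous service (illustration only)."""

    def __init__(self, polls_until_done: int = 3):
        self._polls_until_done = polls_until_done
        self._remaining = {}  # jobid -> polls left before the job counts as finished
        self._inputs = {}     # jobid -> submitted sequence
        self._next_id = 0

    def run_service(self, sequence: str) -> str:
        """Submit a job and return its job ID immediately (cf. 'runService')."""
        self._next_id += 1
        jobid = f"job-{self._next_id}"
        self._remaining[jobid] = self._polls_until_done
        self._inputs[jobid] = sequence
        return jobid

    def poll_queue(self, jobid: str) -> str:
        """Report job status; the status names are hypothetical."""
        if self._remaining[jobid] > 0:
            self._remaining[jobid] -= 1
            return "QUEUED"
        return "FINISHED"

    def fetch_result(self, jobid: str) -> str:
        """Return the result of a finished job (cf. 'fetchResult')."""
        if self._remaining[jobid] > 0:
            raise RuntimeError("job has not finished yet")
        return f"result for {len(self._inputs[jobid])} characters"


def run_and_wait(service, sequence, interval=1.0, max_polls=100, sleep=time.sleep):
    """Submit, poll until finished, then fetch; give up after max_polls polls."""
    jobid = service.run_service(sequence)
    for _ in range(max_polls):
        if service.poll_queue(jobid) == "FINISHED":
            return service.fetch_result(jobid)
        sleep(interval)  # wait between polls so the server is not flooded
    raise TimeoutError(f"{jobid} did not finish after {max_polls} polls")
```

Capping the number of polls and sleeping between them keeps a client from hanging forever or hammering the queue; with a real SOAP client, only the three service methods would need replacing.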

First, obtaining a Genome Sequence

EasyGene client : load fasta from STDIN
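The code shown on this slide is not preserved in the transcript. As a rough Python equivalent of the step named in the title, the sketch below reads FASTA records from any iterable of lines, such as `sys.stdin`; the function name `read_fasta` and the choice to key records by the first word of the header are assumptions.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {record id: sequence}.

    The record id is the first whitespace-delimited word after '>'.
    Sequence lines are concatenated; blank lines are ignored.
    """
    records: Dict[str, str] = {}
    current_id = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current_id is not None:
                records[current_id] = "".join(chunks)
            words = line[1:].split()
            current_id = words[0] if words else ""
            chunks = []
        elif current_id is not None:
            chunks.append(line)
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records
```

To read from standard input, pass the stream directly: `records = read_fasta(sys.stdin)` after `import sys`.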

EasyGene client : function to predict ORFs

Case story: metagenomic application of BLAST atlases

A service which allows visualization of homology between a reference genome and any number of genomes, metagenomic samples, or sequence databases.

Case story: the BLASTatlas WS

Example: Seven ocean samples from various depths (surface to 4km): 63,837,557 nucleotides, in 65,674 sequences.

24,978 proteins from 12 fully sequenced Prochlorococcus marinus genomes

Figure: BLAST atlas for P. marinus str. MIT 9303 (2,682,675 bp; genome axis 0M to 2.5M). Green = P. marinus genomes, blue = ocean samples (surface to 4 km).

Conclusion: pros and cons

• Web services are (almost guaranteed to be) more cumbersome to access than their web application counterparts.

• Web services allow automation and integration into your programming language.

• You need only a single (WSDL) file to allow your client to connect to the service

Pros and cons

• Input (request) can be validated - remember the XSD!

• Should failures occur

Looking into the crystal ball ...

• Will WS replace the bioinformatician or require more of them?

• Will WS ease the access to tools and databases?

• Will WS save time?

• Will WS allow more complex analyses that were impossible before?

• Will WS take over - and what will they take over?

Acknowledgements

David W. Ussery

Craig Benham

Karin Lagesen

Jan Christian Bryne

Francisco Roque

Kristoffer Rapacki

Hans Henrik Stærfeldt