Integration of BioInformatics tools at NUS Meena K Sakharkar Training Manager, BioInformatics Centre...

Post on 16-Jan-2016


Integration of BioInformatics tools at NUS

Meena K Sakharkar

Training Manager, BioInformatics Centre

National University of Singapore

BioInformatics Tools at NUS

• BioPortal

• BioKris (formerly called BioKleisli)

• BioWebWatch (Bio-Eyes)

• BioJake

• Other projects

BioPortal

User-friendly gateways to GCG and other software.

Prior Scenario

Command line interface.

• Basic OS knowledge.

• Separate user interface (UI) for each program.

• Platform dependent.

Transient phase

• Hypertext menu-based interface.

Present Scenario

• User-Friendly web interface to Biosoftware.

• Platform Independent.

• Depends on client-server technology.

• Common Gateway Interface (CGI) programs written in Perl and C.
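
The form-to-program step behind such a web interface can be sketched as follows. This is a minimal illustration in Python, not BioPortal's actual Perl/C code; the program name `exampletool` and the option spellings are hypothetical and are not real GCG or CLUSTALW flags.

```python
import shlex

# Hypothetical allow-list: program name -> form fields it accepts.
# Names and option spellings are illustrative only.
ALLOWED = {
    "exampletool": {"infile", "outfile", "matrix"},
}

def build_command(program, form):
    """Turn validated web-form fields into an argument list for one program."""
    if program not in ALLOWED:
        raise ValueError(f"unknown program: {program}")
    unknown = set(form) - ALLOWED[program]
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    # Build a list (not a shell string) so user input is never parsed by a shell.
    argv = [program]
    for name in sorted(form):
        argv.append(f"--{name}={form[name]}")
    return argv

cmd = build_command("exampletool", {"infile": "seq.fasta", "matrix": "blosum62"})
print(shlex.join(cmd))
```

The user fills in a form; the server validates the fields and assembles the command line, so the user never has to learn it.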

Programs available

• GCG

• PHYLIP

• CLUSTALW

• GNUPLOT

• SIGNALSCAN

• LOCAL

• ALIGN

• TULLA

Graphical Representation of BioPortal Functionality

[Diagram components: User; Client Browser (Netscape…); WWW Server; BioPortal; Bio Software; Bio Databases]

The web interface allows you to focus on

"How will I solve my sequence analysis problems?"

rather than on

"How do I use this software?"

Special features

User Friendly Interface

• The form based approach simplifies the task.

• It guides you through the whole program.

• No need to install and learn command line interface of all software.

Format Independent

Hyperlinks for Results

Formats supported

Results for Signal Scan

Special features

Online Help

• Speeds up your data entry and analysis.

Conduit

• Output can be piped to other programs for further analysis.

• Example: Pileup → Distances → Growtree.
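
The conduit idea, where each program's output becomes the next program's input, can be sketched in Python. The three step functions below are toy stand-ins invented for illustration; they do not reproduce the GCG Pileup, Distances or Growtree algorithms.

```python
from itertools import combinations

def toy_align(seqs):
    """Stand-in for an alignment step: pad sequences to equal length with '-'."""
    width = max(len(s) for s in seqs.values())
    return {name: s.ljust(width, "-") for name, s in seqs.items()}

def toy_distances(aligned):
    """Stand-in for a distance step: fraction of mismatched columns per pair."""
    dist = {}
    for a, b in combinations(sorted(aligned), 2):
        mismatches = sum(x != y for x, y in zip(aligned[a], aligned[b]))
        dist[(a, b)] = mismatches / len(aligned[a])
    return dist

def toy_closest_pair(dist):
    """Stand-in for a tree-building step: report only the closest pair."""
    return min(dist, key=dist.get)

def run_pipeline(data, steps):
    """Feed each step's output into the next step."""
    for step in steps:
        data = step(data)
    return data

seqs = {"s1": "ACGT", "s2": "ACGA", "s3": "TTGA"}
result = run_pipeline(seqs, [toy_align, toy_distances, toy_closest_pair])
print(result)
```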

Special features

• Tmp file usage:

(a) For programs whose input is the output of other programs.

(b) For programs which take a lot of time.

• Improved Operation Control: A fully customisable interface provides a flexible environment.

• Shared Access: This interface allows several distinct users to use the same program simultaneously.

• Remote retrieval and analysis: This interface allows you to access your program from anywhere in the world.

Special features of JAVA ver.

Navigation Panel :

Java navigation lets you view the tree structure of the programs.

File handling Capability:

Local desktop file loading.

Bioportal vs WWW2GCG

• File handling capability.

• Format independence.

• Work flow approach.

Bioportal vs SeqWeb

• Not all GCG programs have been implemented (about half), and not all options.

• Temp file usage feature is not there.

Bioportal is a test bed software in:

Schering Plough (NJ)

Wistar Institute, UPenn

Wayne State University, Michigan, USA. (Prof. David Womble's Lab)

Boston University, Boston, USA. (Prof. Temple Smith's Lab).

The Signal Scan WWW Interface is in use at:

• NIH

• TIGEM, Italy

• Japan

Future Directions

Linking Bioportal to Kleisli

BioKleisli or BioKris

Introduction

A BIC-KRDL collaboration

Open query system for broad-scale integration of heterogeneous distributed databanks

Flexible access for existing biological databanks

Features and Benefits

Features

Provides a query system that can integrate all types of biology databanks and analysis software

Expresses high-level views and transformations that go beyond flat tables

Flexible access = migrate + integrate + restructure

Benefits

More effective and easier access to world wide data for biologists

Enhances productivity and simplifies programming tasks

BioKleisli ... Sequence Analysis ..

Functional Approach

• Computational analysis of biological data was predominantly ad hoc, i.e. a given piece of software was applied to the data according to the need of the moment.

• This functional approach often resulted in piecemeal computational analysis with a large amount of intervening "dead-time".

BioKris…Protocol Approach

• The present high-throughput availability of experimental biological data requires a more streamlined and integrated protocol approach

• BioKris starts with a sequence and builds a substantial custom data warehouse of information from various sources, e.g. structure, sequence and MEDLINE databases.
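
The protocol approach can be sketched in Python as a fixed sequence of lookups that starts from one sequence and collects everything into a single nested record. The three dictionaries are hypothetical in-memory stand-ins for remote databanks; their keys and contents are made up, and this is not BioKris or CPL code.

```python
# Hypothetical stand-ins for remote sequence, structure and literature databanks.
SEQUENCE_DB = {"P1": {"hits": ["P2", "P3"]}}
STRUCTURE_DB = {"P2": {"fold": "example-fold"}}
LITERATURE_DB = {"P2": ["example citation A"], "P3": ["example citation B"]}

def build_warehouse(query_id):
    """Run one protocol: find similar sequences, then attach structure and literature."""
    hits = SEQUENCE_DB.get(query_id, {}).get("hits", [])
    warehouse = {"query": query_id, "hits": []}
    for hit in hits:
        warehouse["hits"].append({
            "id": hit,
            "structure": STRUCTURE_DB.get(hit),  # None when no structure is known
            "literature": LITERATURE_DB.get(hit, []),
        })
    return warehouse

record = build_warehouse("P1")
print(record)
```

The result is a nested record rather than a flat table: each hit carries its own structure and literature entries.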

Architecture

[Diagram labels:]

• Remote Servers: GenBank, GSDB, GDB, NCBI-BLAST

• Net

• Process: Drivers (Sybase, ASN.1, OPM, ACeDB, BLAST)

• Process: CPL, Type Module, Optimizer, NRC, Driver Manager, Primitive Manager, Complex Object Library

• Connections: pipe, shared memory

CPL-Kleisli

Blast Query against Protein DB

Blast Results

Analysis

BioWebWatch or Bio-Eyes

Problems

• Too much Information?

• Many sites to cover despite automatic notification.

• Revisit your site of interest at least once a month.

BioWebWatch

• Allows users to automate queries.

• Channels.

• Users

• Keywords

• Websites

Goals

• Collecting, aggregating, managing and integrating information.

• Integration with existing software tools.

• Sharing information across the organisation.

• Collaboration with other information servers.

What’s out there?

• Electronic Journals

• Web Pages

• News Groups

• Databases

Cyberspace is a big place

New Query

Search

Good Hits!

Useless Hits

An Information Butler

Information Management

Automated Query

Search

Results

Automated Dispersal

Good Hits!

Useless Hits

Creating a Query

• Enter search string

• Choose source

• Preview search

• Create Channel
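
The four steps above can be sketched in Python. The `Channel` fields, the keyword-matching rule and the sample titles are assumptions made for illustration; the slides do not describe BioWebWatch's internal data model.

```python
from dataclasses import dataclass, field

@dataclass
class Channel:
    """A saved query plus the items it currently matches (hypothetical layout)."""
    owner: str
    query: str
    source: str
    items: list = field(default_factory=list)

def preview(query, documents):
    """Return the titles that contain every keyword in the query (case-insensitive)."""
    words = query.lower().split()
    return [d for d in documents if all(w in d.lower() for w in words)]

def create_channel(owner, query, source, documents):
    """Create a channel holding the items the preview step matched."""
    return Channel(owner, query, source, preview(query, documents))

docs = ["Intron movement in yeast", "Peptide vaccine design", "Intron loss in plants"]
channel = create_channel("alice", "intron", "journals", docs)
print(channel.items)
```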

Gathering Information

Sources: News Services, Headlines Grabber, …

Query Broker

• Broadcast queries to multiple search engines

• Grab headlines from multiple sources

• Dispatch agents (spiders) to target web sites (Bookmarked Web Site A, Bookmarked Web Site B, Bookmarked Web Site C)
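
Broadcasting one query to several search engines can be sketched in Python with a thread pool. `engine_a` and `engine_b` are hypothetical local stand-ins; a real broker would call remote services at that point.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search back ends returning canned results.
def engine_a(query):
    return [f"A: {query} result 1", f"A: {query} result 2"]

def engine_b(query):
    return [f"B: {query} result 1"]

def broadcast(query, engines):
    """Send one query to every engine concurrently and merge hits in engine order."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    return [hit for hits in result_lists for hit in hits]

hits = broadcast("intron", [engine_a, engine_b])
print(hits)
```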

Working with Information

[Diagram: Channel 1 (Title 1.1 to Title 1.4, each with a summary) and Channel 2 (Title 2.1 to Title 2.3, each with a summary) become Channel 1’ and Channel 2’, in which Title 2.1 carries an annotation. Operations shown:]

• Delete / Filter

• Comment / Annotate

• Search / Rank

[Channel 3 then lists Title 1.2, Title 2.1 and Title 1.1, each with a summary.]
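
The three channel operations can be sketched in Python. The item layout (`title`, `summary`, `note`) and the ranking rule (keyword count in the summary) are assumptions chosen for illustration, not the ranking BioWebWatch uses.

```python
def filter_items(items, unwanted_text):
    """Delete / Filter: drop items whose title contains the unwanted text."""
    return [i for i in items if unwanted_text.lower() not in i["title"].lower()]

def annotate(items, title, note):
    """Comment / Annotate: return copies, with a note attached to the matching title."""
    return [dict(i, note=note) if i["title"] == title else dict(i) for i in items]

def search_and_rank(channels, keyword):
    """Search / Rank: merge channels, keep matches, sort by keyword count (highest first)."""
    merged = [i for items in channels for i in items]
    matches = [i for i in merged if keyword in i["summary"].lower()]
    return sorted(matches, key=lambda i: i["summary"].lower().count(keyword), reverse=True)

channel_1 = [
    {"title": "Title 1.1", "summary": "intron loss"},
    {"title": "Title 1.2", "summary": "intron gain and intron loss"},
]
channel_2 = [
    {"title": "Title 2.1", "summary": "intron phase"},
    {"title": "Title 2.2", "summary": "vaccine design"},
]

channel_2 = annotate(filter_items(channel_2, "2.2"), "Title 2.1", "follow up")
channel_3 = search_and_rank([channel_1, channel_2], "intron")
print([i["title"] for i in channel_3])
```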

Distributed Processing for scalable performance

Community Server: Collating information from multiple WebWatch Servers

WebWatch Community

WebWatch Server 1

WebWatch Server 2

WebWatch Server 3

Enterprise-wide Information Gathering

[Diagram labels:]

WebWatch Server

• Query Broker

• Community Server

• Gateway to External Access

Internet; Corporate Firewall

Departments with a WebWatch subServer, serving multiple internal departmental users. Some channels are shared outside of the department. Each WebWatch subServer uses the WebWatch Server as

• Proxy for Internet (Gateway)

• Community Server (Collaboration)

WebWatch User Community

WebWatch SuperServer

• Updates: Software, Broker Scripts

Enterprise; Enterprise Workgroup Server; Information Consumers within Enterprise; Remote Access to WebWatch

Other projects

• Molecular modelling

• Peptide based vaccine design

• Secondary structure prediction

• Crystallisation, in conjunction with other NUS departments and the Stanford synchrotron centre.

• Genome analysis with specific reference to intron movements.

Conclusion

• BIC has explored ways to organise and query information and resources in a biologist-friendly way.

• The protocol approach will be the new paradigm in bioinformatics on the Internet.

• Our aim is a step towards utilising the resources available to uncover a treasure trove.

Computers in Peptide Vaccine Design

Prediction of T-cell epitopes

1. Statistics (Parker et al.)

2. ANN (Brusic et al.)

Computer Modeling of the predictions

CARA/SCEO

Biological Testing

(Lee C. & Subbiah, 1991)