Code camp 2015 visual programming mm

Post on 08-Apr-2017

Visual Programming Environments for Science and Business

MITCH MILLER

SCIENTIFIC THINKING

CODE CAMP 2015

SEPTEMBER 19, 2015

Disclaimer

This talk represents my opinion and personal experience using two fine software systems developed by third parties.

The software systems shown are very complex and have hundreds of components. I have only worked with a small number.

Every task shown today can be accomplished in multiple ways. I’m only showing one of those ways.

Overview

Introduction: first demo

What is a ‘visual programming environment’

The two systems we’ll look at today

What are these systems capable of?

Second set of demos (in-depth)

Demo 1: set-up

Task: produce report of all compounds registered during January

Visual Programming: informal definition

Drag functional components onto canvas to create program

Configure most components by setting parameters

Connect components to route data from one to another

Run and observe data traveling down the lines
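
The four steps above can be imitated in a few lines of ordinary code. This is only a rough sketch of the idea, not how KNIME or Pipeline Pilot work internally: the names reader, filter_by, add_field and connect are made up for illustration, and the sample records are invented. Each "component" is configured with parameters, and connect routes the output of one into the next.

```python
from typing import Callable, Iterable, Iterator

Record = dict
Component = Callable[[Iterable[Record]], Iterator[Record]]


def reader(rows: list) -> Component:
    """Source component (hypothetical): emits the records it was configured with."""
    def run(_upstream=()):
        yield from rows
    return run


def filter_by(field: str, value) -> Component:
    """Filter component (hypothetical): passes only records where field == value."""
    def run(upstream):
        for rec in upstream:
            if rec.get(field) == value:
                yield rec
    return run


def add_field(name: str, fn) -> Component:
    """Calculator component (hypothetical): adds a computed field to each record."""
    def run(upstream):
        for rec in upstream:
            yield {**rec, name: fn(rec)}
    return run


def connect(*components: Component):
    """Wire components left to right, like drawing lines on the canvas."""
    def run():
        stream = ()
        for component in components:
            stream = component(stream)
        return list(stream)
    return run


# Invented sample data, for illustration only.
pipeline = connect(
    reader([
        {"id": 1, "kind": "salt", "mw": 58.44},
        {"id": 2, "kind": "acid", "mw": 36.46},
    ]),
    filter_by("kind", "salt"),
    add_field("mw_rounded", lambda r: round(r["mw"])),
)
result = pipeline()
print(result)
```

Because each component is a generator, records flow through one at a time, which loosely mirrors watching data travel down the lines.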

Component types

File I/O

Read/write text files

Read/write MS Office documents

XML

JSON

PDF

Database access

Connect

Query

Update
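
For comparison, here is roughly what the connect, query and update steps look like when written by hand. This is a sketch under assumptions: it uses Python's built-in sqlite3 module with an in-memory database, and the compounds table, its columns, the sample rows and the year 2015 are all invented for illustration. The query echoes the first demo's task of finding compounds registered during January.

```python
import sqlite3

# Connect: an in-memory database stands in for a real registration system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE compounds ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " registered TEXT,"          # ISO date string, e.g. 2015-01-12
    " reviewed INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO compounds (name, registered) VALUES (?, ?)",
    [
        ("aspirin", "2015-01-12"),      # invented sample rows
        ("caffeine", "2015-02-03"),
        ("ibuprofen", "2015-01-28"),
    ],
)

# Query: compounds registered during January (year assumed to be 2015).
january_range = ("2015-01-01", "2015-02-01")
january = conn.execute(
    "SELECT name FROM compounds"
    " WHERE registered >= ? AND registered < ?"
    " ORDER BY name",
    january_range,
).fetchall()

# Update: flag those same rows as reviewed.
updated = conn.execute(
    "UPDATE compounds SET reviewed = 1"
    " WHERE registered >= ? AND registered < ?",
    january_range,
).rowcount
conn.commit()

print(january, updated)
```

In a visual environment, each of these three steps would typically be a separate configured component rather than hand-written SQL plumbing.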

Component types (continued)

Web service consumption

Domain-specific processing

Chemical structure I/O

Chemical structure processing and analysis

Sequence processing

Extensibility

Add your own libraries for more sophisticated processing

Component types (continued)

Visualization

Graphing

Statistical calculations

Scripting

Tip: aim for brief scripts

Data transformation

If/else processing

Filtering

Column selection

And many more…
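
The three data transformation components listed above correspond to very small pieces of code. The sketch below is plain Python, not the scripting language of either product; the field names, the 6.0 threshold and the sample rows are invented for illustration.

```python
# Invented sample rows, for illustration only.
rows = [
    {"id": 1, "name": "cmpd-A", "activity": 7.2, "source": "lab1"},
    {"id": 2, "name": "cmpd-B", "activity": 4.9, "source": "lab2"},
    {"id": 3, "name": "cmpd-C", "activity": 6.0, "source": "lab1"},
]


def label(row: dict) -> str:
    """If/else processing: classify a row by an assumed activity threshold."""
    if row["activity"] >= 6.0:
        return "active"
    else:
        return "inactive"


# If/else processing: add a computed label to every row.
labeled = [{**row, "label": label(row)} for row in rows]

# Filtering: keep only the rows labeled active.
active = [row for row in labeled if row["label"] == "active"]

# Column selection: keep only the columns needed downstream.
keep = ("id", "name", "label")
selected = [{col: row[col] for col in keep} for row in active]

print(selected)
```

This also illustrates the tip about brief scripts: each step is a one-liner that could sit inside a single scripting component.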

KNIME

Originally developed at the University of Konstanz, Germany, in 2004

Currently produced by KNIME.com AG, a company in Zurich, Switzerland

KNIME stands for KoNstanz Information MinEr

Pronounced “Nighm”

A general purpose data analytics platform

Free version available for download

For-sale version available with added extensions

KNIME (continued)

Java based

Written in Java

Scripted, extensible in Java

URL: https://www.knime.org/

Pipeline Pilot

Developed and sold by BIOVIA, San Diego, CA

Originally developed by SciTegic, San Diego, in 1999

Designed for scientists to “rapidly create, test and publish scientific services that automate the process of accessing, analyzing and reporting scientific data” (http://accelrys.com/products/collaborative-science/biovia-pipeline-pilot/)

Client-server system

Commercial product

Extensible using .NET and Java

Scripted using an original language, ‘PilotScript’

KNIME Terminology

Components are called “Nodes”

Programs are “Workflows”

Reusable sets of Nodes are “Metanodes”

Groups of related Nodes are “Extensions”

Pipeline Pilot Terminology

Components are called “Components”

Programs are “Protocols”

Reusable sets of Components are “Subprotocols”

Groups of related Components are “Packages”

Different protocols can be combined

One protocol provides the initial UI, including a Web form

A second protocol handles form data processing (‘work protocol’)
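
The split between a UI protocol and a work protocol can be pictured as two cooperating functions. This is a loose analogy only, not Pipeline Pilot code: form_protocol, work_protocol, the form layout and the "month" field are all hypothetical names chosen for illustration.

```python
def form_protocol() -> dict:
    """Hypothetical UI step: describe the fields shown on the Web form."""
    return {
        "fields": [
            {"name": "month", "min": 1, "max": 12},
        ]
    }


def work_protocol(form_data: dict, form_spec: dict) -> str:
    """Hypothetical work step: validate the submitted form data, then process it."""
    values = {}
    for field in form_spec["fields"]:
        value = int(form_data[field["name"]])   # form values arrive as strings
        if not field["min"] <= value <= field["max"]:
            raise ValueError(f"{field['name']} out of range: {value}")
        values[field["name"]] = value
    return f"Report for month {values['month']}"


spec = form_protocol()                           # first protocol: build the form
report = work_protocol({"month": "1"}, spec)     # second protocol: handle the submission
print(report)
```

Keeping the two steps separate means the form can change without touching the processing logic, and the processing step can be reused behind a different form.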

Different systems shown today serve different populations

KNIME can be used ad hoc on the desktop of a power user. It is also used by companies in a variety of industries.

Pipeline Pilot is geared towards scientists. It is part of an enterprise system and requires a server installation.

Programs can be deployed outside the development client

Give users a URL to access your program

Users of BIOVIA Electronic Lab Notebook and other software can access Pipeline Pilot protocols outside the Pipeline Pilot UI

Users access a Web application that shows them the data they’re looking for in a purpose-built user interface

The application does not look like the system with which it was built

For-sale version of KNIME Server provides similar functionality

Server Features

User access configuration

Shared data sources

Automatic jobs

Etc.

Second demo

Exploration of data set using KNIME and Pipeline Pilot

Data set comes from the National Cancer Institute (NCI)’s Developmental Therapeutics Program (DTP)

Results of laboratory tests for activity against 60 human cancer cell lines

Data freely available:

https://dtp.cancer.gov/discovery_development/nci-60/default.htm

Additional demos

Pipeline Pilot Web Port sample

Suggestions for getting started

Download the KNIME software (knime.org)

Install on your computer

Look at the sample workflows

Start simple; build up

Types of applications

Reporting

Data set comparisons

ETL

Data Analysis

References

Scholarly article on KNIME and Pipeline Pilot

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3414708/

www.knime.org

https://www.youtube.com/user/KNIMETV

http://accelrys.com/products/collaborative-science/biovia-pipeline-pilot/

https://dtp.cancer.gov/

Who is your speaker?

Mitch Miller, Ph.D. in Chemistry and 20+ years of IT experience

Independent consultant: Scientific Thinking, LLC

mitch.miller@thinkscience.us

Some recent projects

Ongoing custodian of one chemical database implementation for ChemIDplus project within the National Library of Medicine

Upgraded a 10-year-old Java Servlet lab workflow application to the latest version of the JDK and Internet Explorer 11, and implemented enhancements

Windows service to handle communication between 2 legacy applications

Import wizard for chemical array designer

Merged a set of chemical databases and harmonized data