
Clemson NextNet

SDN Use Cases for Life Sciences Research

Kuang-Ching “KC” Wang

Associate Professor, Clemson University

Sponsored by NSF grant OCI-1245936

KC Wang, Clemson University, July 17, 2013

Clemson NextNet: An NSF CC-NIE Project


Objectives:
• Direct access to I2 100G Innovation Platform
• Science DMZ from anywhere, w/o manual plumbing
• Campus production, end-to-end support
• Flexible, optimized 10~40G access to resources on campus and other universities
• Software defined network (SDN)

What is the Fuss About SDN?


Network Researchers:

Industry:

Traditional networks are getting unmanageable (not about bandwidth)!

Traditional Network SDN

What Do Our (Life Sciences) Folks Need?


Real-time medical imaging

Two Clemson life sciences researchers in attendance today:
• Alex Feltus
  – Associate Professor in Genetics & Biochemistry
  – Faculty Consultant in Clemson University Genomics Institute
  – Research: Rapid crop design with massive gene interaction networks
• David Kwartowitz
  – Assistant Professor in Bioengineering
  – Research: Rapid processing of stereo laparoscopic data for real-time pre- and intra-surgery support

Palmetto HPC Cluster

Data Store

N…

The Feltus Lab Builds Massive Gene Interaction Networks Using RNA Expression Profiles From Next-Generation Sequence (NGS) and Microarray Experiments.

Rice (Oryza sativa)

Goal: Rapidly design new crop varieties for a specific environment including “old” environments with a changed climate…

Personalized Agriculture

Slide prepared by Alex Feltus

Massive amounts of DNA/RNA/Genetic Data in Databases

1.64 Quadrillion base pairs in 5 yrs!

http://www.ncbi.nlm.nih.gov/Traces/sra/

Slide prepared by Alex Feltus

An NGS Biomarker Example Dataset

RAW DATA (uncompressed)
5.7G  Sample_Feltus1_L006_R1.cat.fastq
5.7G  Sample_Feltus1_L006_R2.cat.fastq
5.8G  Sample_Feltus1_L007_R1.cat.fastq
5.8G  Sample_Feltus1_L007_R2.cat.fastq
6.7G  Sample_Feltus2_L006_R1.cat.fastq
6.7G  Sample_Feltus2_L006_R2.cat.fastq
6.8G  Sample_Feltus2_L007_R1.cat.fastq
6.8G  Sample_Feltus2_L007_R2.cat.fastq
6.5G  Sample_Feltus3_L006_R1.cat.fastq
6.5G  Sample_Feltus3_L006_R2.cat.fastq
6.6G  Sample_Feltus3_L007_R1.cat.fastq
6.6G  Sample_Feltus3_L007_R2.cat.fastq
7.3G  Sample_Feltus4_L006_R1.cat.fastq
7.3G  Sample_Feltus4_L006_R2.cat.fastq
7.4G  Sample_Feltus4_L007_R1.cat.fastq
7.4G  Sample_Feltus4_L007_R2.cat.fastq
5.6G  Sample_Feltus5_L006_R1.cat.fastq
5.6G  Sample_Feltus5_L006_R2.cat.fastq
5.7G  Sample_Feltus5_L007_R1.cat.fastq
5.7G  Sample_Feltus5_L007_R2.cat.fastq
8.8G  Sample_Feltus6_L006_R1.cat.fastq
8.8G  Sample_Feltus6_L006_R2.cat.fastq
8.9G  Sample_Feltus6_L007_R1.cat.fastq
8.9G  Sample_Feltus6_L007_R2.cat.fastq

PROCESSED DATA (compressed)
2.4G  Sample_Feltus1_L007_R1.MERGED.BAM
2.4G  Sample_Feltus1_L007_R1.MERGED.BAM
2.7G  Sample_Feltus2_L006_R1.MERGED.BAM
2.7G  Sample_Feltus2_L007_R1.MERGED.BAM
2.6G  Sample_Feltus3_L006_R1.MERGED.BAM
2.6G  Sample_Feltus3_L007_R1.MERGED.BAM
3.0G  Sample_Feltus4_L006_R1.MERGED.BAM
3.0G  Sample_Feltus4_L007_R1.MERGED.BAM
2.2G  Sample_Feltus5_L006_R1.MERGED.BAM
2.2G  Sample_Feltus5_L006_R1.MERGED.BAM
2.9G  Sample_Feltus6_L006_R1.MERGED.BAM
2.9G  Sample_Feltus6_L007_R1.MERGED.BAM

6 RNA Samples in Duplicate
163.6 GB (raw) + 31.8 GB (processed) = 195.4 GB of critical data files
(<6 hours to process on cluster)

Does not include:
• Intermediate processing files
• Reference genome (0.72 GB)
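To put the 195.4 GB total in network terms, the following is a minimal sketch of an idealized transfer-time estimate. The rates are the ones quoted elsewhere in this deck (2.5 Mbps and 120 Mbps on the SOS slide; 10G, 40G, and 100G on the objectives slide). The function names are hypothetical, and the assumptions (1 GB = 10^9 bytes, a fully utilized link, no protocol overhead) are not from the slides.

```python
# Idealized transfer-time estimate for the dataset total quoted on the slide.
# Assumptions (not from the slides): 1 GB = 10**9 bytes, the link is fully
# utilized, and protocol overhead is ignored.

DATASET_GB = 195.4  # raw + processed total stated on the slide

# Rates mentioned elsewhere in the deck, in megabits per second.
RATES_MBPS = {
    "2.5 Mbps": 2.5,
    "120 Mbps": 120.0,
    "10G": 10_000.0,
    "40G": 40_000.0,
    "100G": 100_000.0,
}


def transfer_seconds(size_gb: float, rate_mbps: float) -> float:
    """Return the ideal time in seconds to move size_gb at rate_mbps."""
    bits = size_gb * 8e9
    return bits / (rate_mbps * 1e6)


def format_duration(seconds: float) -> str:
    """Render a duration in the largest convenient unit."""
    if seconds >= 86_400:
        return f"{seconds / 86_400:.1f} days"
    if seconds >= 3_600:
        return f"{seconds / 3_600:.1f} hours"
    if seconds >= 60:
        return f"{seconds / 60:.1f} minutes"
    return f"{seconds:.1f} seconds"


if __name__ == "__main__":
    for label, rate in RATES_MBPS.items():
        print(f"{label:>9}: {format_duration(transfer_seconds(DATASET_GB, rate))}")
```

Under these assumptions, moving the dataset at 120 Mbps takes roughly 3.6 hours, which is the same order as the "<6 hours" processing time, while at 10G it takes under three minutes.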

Slide prepared by Alex Feltus

The CUTTERS (Kwartowitz) lab is working to enable remote processing of stereo laparoscopic data for real-time feedback with surgical robot systems at partner sites (Vanderbilt, Mayo Clinic).


Clemson, SC

Vanderbilt, TN

Mayo Clinic, MN

Palmetto HPC Cluster

How Does It Work Today


Network diagram labels: ISP 1 Internet; ISP 2 Internet; R&E net; R&E net 1; Data Center; Campus Network; Research Network

Down the road:
• Compliances
• User-specific privileges
• Access control

What Are We Building NOW


Porting GENI Research Prototype to Production

SOS: Seamless Large Data Transport


Perceived point-to-point or multi-point connection

Diagram labels: SOS Controller; SOS-enabled switch (two); SOS agent (two); SOS pipe; TCP (on each side); steps 1, 2, 3.1, 3.2, 4.1, 4.2

Deployment map labels: SOS UW-Madison; SOS Clemson; SOS Stanford; SOS SCinet; GENI core

Steroid OpenFlow Service (SOS) by Aaron Rosen and KC Wang

• Seamless TCP throughput upgrade, e.g., from 2.5 Mbps to 120 Mbps
• Multipath support
• Automatic site agent detection

Upcoming demos of SOS:
• NSF 12th GENI conference, Kansas City, MO
• Supercomputing 2011, Seattle, WA
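The slide quotes a TCP throughput gain from 2.5 Mbps to 120 Mbps but does not say how SOS achieves it. As background only, the sketch below shows the general window/RTT bound on a single TCP connection and how several independent connections raise the aggregate bound until the path capacity is reached. It is not a description of SOS internals. The function names, the 200 ms round-trip time, and the stream count are hypothetical values chosen for illustration.

```python
# Generic TCP window/RTT bound; this is NOT a description of SOS internals.
# Assumptions: a fixed window per connection, no loss, and hypothetical
# window size, RTT, and stream count chosen only for illustration.


def window_limited_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP connection's throughput: window / RTT."""
    return window_bytes * 8 / rtt_seconds / 1e6


def aggregate_mbps(window_bytes: int, rtt_seconds: float, streams: int) -> float:
    """Bound for several independent connections, ignoring link capacity."""
    return streams * window_limited_mbps(window_bytes, rtt_seconds)


if __name__ == "__main__":
    window = 65_535  # largest TCP window without window scaling, in bytes
    rtt = 0.2        # hypothetical 200 ms round-trip time
    streams = 48     # hypothetical stream count
    print(f"one connection: {window_limited_mbps(window, rtt):.2f} Mbps")
    print(f"{streams} connections: {aggregate_mbps(window, rtt, streams):.1f} Mbps")
    print(f"gain quoted on the slide: {120 / 2.5:.0f}x")
```

The point of the sketch is only that long round-trip times cap a single connection well below link speed, which is why relaying or parallelizing TCP can help on wide-area paths.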

Condo of Condos: Connecting Campus HPC with SDN


Significance of IT Support Team to Bootstrap Researcher Use of HPC and SDN


Chart: New Palmetto Cluster Users (y-axis: Number of Users)

Chart annotation: May 2010: Galen joins CITI and begins recruiting & training users

And to Create a Transformative University
• A unique coalition among academy, IT, and industrial partners within and beyond Clemson
• Synergy with other university research centers: Cyberinstitute, ICAR, and Watts Innovation Center


Synergy with Cross-Communities Momentum


• Research Communities
• Companies
• Open Source Communities
• IT Communities
• Universities
• . . .

FURTHER [email protected]
