System Software OptIPuter System Software Andrew A. Chien SAIC Chair Professor, Computer Science and Engineering, UCSD Director, Center for Networked Systems September 2003

Page 1

System Software

OptIPuter System Software

Andrew A. Chien

SAIC Chair Professor, Computer Science and Engineering, UCSD

Director, Center for Networked Systems

September 2003

Page 2

OptIPuter System Software Team

• Challenge
  – ~20 Lead Researchers, Many More in Entire Team
  – Diverse Researcher Backgrounds and Focus
  – Broad Research Agenda, Abstract Shared Perspective
• Process
  – Innumerable Phone Calls and 1-on-1 Meetings, Fall 2002-Spring 2003
  – Team Meeting with UCSD and UCI Teams (October 4, 2002)
  – Straw Man OptIPuter System Software Architecture (January 2003)
    – Goals, Context, Organization, Relationship of Efforts
  – OptIPuter All Hands Meeting, February 6-7, 2003
    – First Presentation to Entire Team
    – Feedback, Revision, Improvement, Deeper Understanding, Shared Perspective
  – Optical Signaling and Network Management Meeting (May 22, 2003)
    – Mambretti Organized
  – OptIPuter Software Architecture Version 1.0 (July 2003)
    – Structure Stabilized, Interfaces Becoming Concrete

Page 3

Lambdas Transform Distributed Systems

• Key Technology Changes
  – Massive Bandwidth
    – 100-1000x Increases in Wide-Area Systems
  – “End To End” Lambda-Connections
    – Private Networks, Guaranteed Bandwidth
  – Endpoints are Parallel Clusters
  – Large-Scale Network-Attached
    – Storage
    – Instruments
    – Displays
    – Other Peripherals
  – Grids and Flexible Wide-Area Sharing
• Opportunities
  – Communication
  – Tight Wide-area Resource Coupling
  – Simpler Distributed Applications
  – Proactive Computing and Communication

Challenge is Abstractions, Technologies, and Protocols (SOFTWARE!) to Deliver these Capabilities to Applications

Page 4

Towards Middleware for Lambda-Networked Systems

Globus Architecture (layers):

Fabric: Resource Access and Control: Computers, Storage, Networks
Connectivity: Globus_IO/XIO & GSI
Resource: GRAM, GridFTP, GRIS, Co-allocation
Collective: DUROC, GARA, Replica Catalogs, Metadata Servers, Brokers, Workflow
Application

• Leverage Investment and Capabilities (e.g. Globus 2.2 and 3.0)
  – Carl Kesselman, OptIPuter Participant
  – Ian Foster, OptIPuter Frontier Advisory Board
• Explore What Must Change
  – New Software/Protocols for Managing Lambdas
  – Simplify, Deliver Higher Performance and New Capabilities

Page 5

OptIPuter Software Architecture for Distributed Virtual Computers v1.1

[Figure: layered architecture diagram. Box labels: OptIPuter Applications; DVC #1, DVC #2, DVC #3; Visualization; Higher Level Grid Services; Security Models; Data Services: DWTP; Real-Time Objects; Grid and Web Middleware – (Globus/OGSA/WebServices/J2EE); Layer 5: SABUL, RBUDP, Fast, GTP; Layer 4: XCP; Node Operating Systems; Lambda-configuration, Net Management; Physical Resources. Side labels: DVC/Middleware; High-Speed Transport; Optical Signaling/Mgmt]

Page 6

OptIPuter Links Three Major Sets of Technology Activities

• Distributed Virtual Computers
  – Provide Simple Abstractions
  – Aggregate Component Technology Capabilities
  – Surface Novel Capabilities
• High Speed Transport Protocols [Bannister’s Talk]
  – Long Thread of High Bandwidth-Delay Product Network Protocols
  – Span The Range “Reach” For Dedicated Optical Connections
    – Complete Integration with IP Network Management
    – Hybrid – to Local Packet-Switched Networks
    – Separate – End-to-end
• Optical Network Signaling and Management [Mambretti’s Talk]
  – Single Domain and Inter-Domain
  – Hybrid Circuit and Packet-Switched Networks
  – Planning and Execution
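
The phrase "High Bandwidth-Delay Product" can be made concrete with a little arithmetic. The sketch below is an illustration only: the 10 Gb/s rate, the 60 ms round trip, and the 64 KiB window are assumed example figures, not numbers from the talk. It computes how many bytes must be in flight to fill a path, and how far a fixed window falls short.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a path of this rate and RTT full."""
    return bandwidth_bps * rtt_seconds / 8.0


def window_limited_throughput_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput when at most window_bytes may be unacknowledged."""
    return window_bytes * 8.0 / rtt_seconds


if __name__ == "__main__":
    # Assumed example path: 10 Gb/s dedicated connection, 60 ms round trip.
    rate_bps, rtt = 10e9, 0.060
    bdp = bandwidth_delay_product_bytes(rate_bps, rtt)
    print(f"bytes in flight needed to fill the path: {bdp:,.0f}")
    # A fixed 64 KiB window on the same path caps throughput far below the line rate.
    capped = window_limited_throughput_bps(64 * 1024, rtt)
    print(f"64 KiB window limit: {capped / 1e6:.2f} Mb/s")
```

The gap between the two numbers is one way to see why dedicated wide-area optical connections motivate transport protocols other than a conventionally windowed one.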

Page 7

Distributed Virtual Computers

Page 8

Exploiting Lambdas for an Application

• Network View: Ad Hoc Connections
  – Applications Request Lambda-Connections
  – Network Recognizes High BW Flows and Configures
• System View: Enclave of Resources and Connections
  – a Distributed Virtual Computer (a SYSTEM)
  – How to Specify, Implement, and Exploit?

Page 9

DVC Examples

• Virtual Cluster (Hide Complexity of Grid; Resource Flexibility)
  – Shared Single Domain (Spans Multiple)
  – Private Connections; Simple Network Naming
  – Simple Resource Discovery and Access
  – Uniform Performance Characteristics
  – Direct Access to Everything (Storage, Displays, etc.)
• Real-Time Virtual Cluster for Distributed Collaborative Visualization
  – Grid Resources + Real-Time (TMO)
• Collaborative Visualization Cluster
  – Grid Resources + Photonic Multicast or LambdaRAM (Leigh)

[Figure: site labels: SIO/NCMIR; UCI or UIC; SDSC; UCSD CSE]

Page 10

Realizing Distributed Virtual Computers

• Research Challenges
  – Application-driven Definition of Abstractions
    – Useful Collections which Match Application Paradigms and Needs
    – Incorporates New Collective Models
  – DVC Description
    – Namespaces, Communication, Performance, Real-Time, …
    – Standard Specifications; Most Applications Parameterize
  – Integration of Component Technologies
• Executing the DVC on a Grid
  – Planner That Identifies Resources
    – Selects from Virtual Grid Resources
    – Negotiates with Resource Managers and Brokers
  – Executor and Monitor for DVC
    – Acquires and Configures
    – Monitors for Failures and Performance
    – Adapts and Reconfigures
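
The describe, plan, and adapt steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the OptIPuter design: the `Resource`, `DVCSpec`, and `DVCPlan` types, the greedy selection rule, and every resource name and capacity are hypothetical. Only the site names come from the deck.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """A grid resource as a planner might see it (all fields hypothetical)."""
    name: str
    kind: str        # e.g. "compute", "storage", "display"
    site: str
    capacity: float  # units depend on kind (nodes, TB, ...)


@dataclass
class DVCSpec:
    """Application-side DVC description: which kinds of resource, and how much."""
    name: str
    needs: dict[str, float]  # kind -> minimum total capacity


@dataclass
class DVCPlan:
    spec: DVCSpec
    chosen: list[Resource] = field(default_factory=list)

    def sites(self) -> set[str]:
        return {r.site for r in self.chosen}


def plan_dvc(spec: DVCSpec, available: list[Resource]) -> DVCPlan:
    """Greedy planner: per kind, take the largest resources until the need is met."""
    plan = DVCPlan(spec)
    for kind, needed in spec.needs.items():
        candidates = sorted(
            (r for r in available if r.kind == kind),
            key=lambda r: r.capacity,
            reverse=True,
        )
        got = 0.0
        for r in candidates:
            if got >= needed:
                break
            plan.chosen.append(r)
            got += r.capacity
        if got < needed:
            raise LookupError(f"cannot satisfy {kind!r}: need {needed}, found {got}")
    return plan


def replan_after_failure(plan: DVCPlan, failed: str, available: list[Resource]) -> DVCPlan:
    """Adapt step: drop the failed resource and plan again from what remains."""
    remaining = [r for r in available if r.name != failed]
    return plan_dvc(plan.spec, remaining)
```

A real planner would negotiate with resource managers and brokers rather than read a static list. The point of the sketch is only the separation between the application's description (`DVCSpec`) and the system's choice of resources (`DVCPlan`).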

Page 11

OptIPuter Component Technologies

Page 12

Current Storage Views

• Network-attached Storage (NAS)
  – Filesystem Protocols; Integrated Access-Control and Security
  – Low Performance; Little Aggregation and Parallelism
• Grid View: High-Level Storage Federation
  – GridFTP (Distributed File Sharing)
  – GSI-based Access/Authentication
  – Put/Get, Third-Party Transfers, Whole File and Segments
• Single-System View: Lower-Level Storage Federation
  – Secure Single System View
  – SAN – Block Level Disk and Controller Protocols
  – High Performance, Efficient Sharing
• Research Areas
  – Network-Attached Secure Disk
  – Direct Access File Systems

Page 13

We Need a Distributed Storage Solution for e-Science Distributed Data Generators

• BIRN: Distributed Data, Intensive Analysis
  – 100GB Data Elements; Petabyte Data Sets
  – Comparative and Collective Analysis across Data Elements
  – Visualization of Multi-Scale Data Objects

Page 14

Storage Research Directions

• From Performance to Performability
  – Manage and Exploit Multi-Latency Performance
  – Parallel Performance, Stability, and Isolation
  – Integration of Device, Network, Site Reliability Concerns
• OptIPuter Storage Directions
  – Application-Driven Design
    – Needs, Performance, Device/Site/Network Flexibility, Coding and Selection
  – Integrate Dynamic Lambdas and SAN Networks
    – Peering, Protocol Interfacing, Performance
  – Performance Robust Storage
    – Erasure/Other Redundancy; Large-Scale Parallelism; Statistical Approaches to Performance Isolation
  – Secure Shared Storage: Threshold Cryptography Approach
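
"Erasure/Other Redundancy" is not tied to a particular code in the slides. As a minimal sketch, assuming the simplest possible scheme (one XOR parity block over equal-sized data blocks, as in RAID-style parity), the following shows how a block held at a slow or failed site can be rebuilt from the others. The function names are illustrative only.

```python
from __future__ import annotations

from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def encode(data_blocks: list[bytes]) -> list[bytes]:
    """Return the data blocks followed by one parity block."""
    return data_blocks + [xor_blocks(data_blocks)]


def recover(stored: list[bytes | None]) -> list[bytes]:
    """Rebuild at most one missing block (marked None) and return the data blocks."""
    missing = [i for i, b in enumerate(stored) if b is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair at most one missing block")
    blocks = list(stored)
    if missing:
        # XOR of all surviving blocks equals the missing one.
        blocks[missing[0]] = xor_blocks([b for b in stored if b is not None])
    return blocks[:-1]  # drop the parity block


if __name__ == "__main__":
    stored = encode([b"abcd", b"efgh", b"ijkl"])
    stored[1] = None  # pretend one site is unreachable
    print(recover(stored))
```

The same trick also serves performance: a reader can stop waiting for the slowest of the blocks once all the others have arrived, which is one reading of "Performance Robust Storage". Codes that tolerate several losses need more than XOR parity.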

Page 15

OptIPuter Security Considerations

• OptIPuter as a Computing Platform
  – Information Assurance and Security Needed for Applications
    – Current Plan: use Globus Security Infrastructure
• OptIPuter as a Research Platform
  – Current Efforts
    – Distributed Security Services (Goodrich & Tamassia)
    – Incremental IP Trace-Back via Packet Marking for DoS Defense (Goodrich)
    – Enhanced Forensic Analysis By Design (Karin & Peisert)
  – Planned Efforts
    – Minimum Round Trip Latency Control (Goodrich)
    – Hardening Against Attacks by Multi-Path Routing (Goodrich, Karin)
    – End-to-End Application and Session Security Through Dedicated Lambdas (Karin)

Source: Karin, UCSD and Goodrich, UCI

Page 16

Multi-Lambda Security Opportunities

• Security Frequently Defined Through Three Measures:
  – Integrity, Confidentiality, and Reliability (“Uptime”)
• Can These Measures be Enhanced by Employing Multiple Lambdas?
• Can Confidentiality be Improved by Dividing the Transmission Over Multiple Lambdas?
  – Fundamentally or Using “Cheap” Encryption?
• Can Integrity be Ensured or Reliability Improved by Exploiting Redundancy?
  – Source Coding and Performance
  – Adaptive Techniques

Source: Goodrich, Karin
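
One well-known way to read "Fundamentally" in the confidentiality question above is XOR secret splitting: send random pads on all but one lambda and the message XORed with those pads on the last. The sketch below assumes that technique for illustration; it is not presented in the slides as the team's method, and the function names are hypothetical.

```python
import secrets


def split_across_lambdas(message: bytes, n_lambdas: int) -> list[bytes]:
    """Split message into n shares, one per lambda; all n are needed to rebuild it."""
    if n_lambdas < 2:
        raise ValueError("need at least two lambdas to split")
    # n-1 shares are uniformly random pads of the message length.
    shares = [secrets.token_bytes(len(message)) for _ in range(n_lambdas - 1)]
    # The last share is the message XORed with every pad.
    last = bytearray(message)
    for pad in shares:
        for i, byte in enumerate(pad):
            last[i] ^= byte
    shares.append(bytes(last))
    return shares


def combine(shares: list[bytes]) -> bytes:
    """XOR all shares together to recover the message."""
    out = bytearray(len(shares[0]))
    for share in shares:
        for i, byte in enumerate(share):
            out[i] ^= byte
    return bytes(out)


if __name__ == "__main__":
    shares = split_across_lambdas(b"session key material", 3)
    print(combine(shares))
```

An eavesdropper who taps fewer than all of the lambdas sees only uniformly random bytes. The costs are n times the bandwidth, and the loss of any one lambda loses the message, so this sits in tension with the reliability question on the same slide.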

Page 17

Vision – Real-Time Tightly Coupled Wide-Area Distributed Computing

Goals

• High-precision Timings of Critical Actions
• Tight Bounds on Response Times
• Ease of Programming
  – High-Level Prog
  – Top-Down Design
• Ease of Timing Analysis

[Figure labels: Real-Time Object network; Dynamically formed Distributed Virtual Computer]

Source: Kim, UCI

Page 18

Real-Time: from LAN to WAN

• Time-Triggered Message-Triggered Object (TMO) Middleware Subsystem Model that can be Easily Implemented on Both Windows and Linux Platforms
• Developed a Global Time-Based Coordination for use in Fair and Efficient Distributed On-Line Game Systems and LAN Feasibility Demonstration
  – a Step towards Distributed OptIPuter Environment Demonstration
  – Paper will be Presented at IDPT 2003 Conference, December 2003

[Figure: Components of a C++ object. Labels: var; TT Method 1; TT Method 2; AAC (twice); Service Method 1; Service Method 2; Deadlines. Annotation: High-level Programming Style; No thread, No priority]

Source: Kim, UCI
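
The object structure in the figure (data, time-triggered methods with an AAC, service methods, and no visible threads or priorities) can be imitated in a short sketch. Everything here is an assumption made for illustration: it is not the TMOSM API, the `AAC` fields are a simplification, deadlines are recorded but not enforced, and time is simulated rather than read from a clock.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class AAC:
    """Simplified activation condition for a time-triggered method."""
    start: float     # first activation, in global time
    period: float    # interval between activations
    end: float       # no activations after this time
    deadline: float  # completion bound per activation (not enforced in this sketch)


class CounterTMO:
    """Toy object with one time-triggered method and one service method."""

    def __init__(self) -> None:
        self.count = 0  # the object's data ("var" in the figure)
        self.log: list[tuple[float, str]] = []

    def tt_sample(self, now: float) -> None:
        """Time-triggered: runs because the clock says so, not because a client called."""
        self.count += 1
        self.log.append((now, "tt_sample"))

    def sv_reset(self, now: float) -> None:
        """Service method: runs when a request message arrives."""
        self.count = 0
        self.log.append((now, "sv_reset"))


def run(tt: list[tuple[AAC, Callable[[float], None]]],
        messages: list[tuple[float, Callable[[float], None]]],
        horizon: float) -> None:
    """Single-threaded executor over simulated global time.

    Builds the activation timeline up front and runs it in time order.
    At equal times, time-triggered work runs before service requests.
    """
    events = []
    for aac, method in tt:
        t = aac.start
        while t <= min(aac.end, horizon):
            events.append((t, 0, method))
            t += aac.period
    events += [(t, 1, m) for t, m in messages if t <= horizon]
    for t, _, method in sorted(events, key=lambda e: (e[0], e[1])):
        method(t)


if __name__ == "__main__":
    obj = CounterTMO()
    run([(AAC(start=0, period=1, end=3, deadline=0.5), obj.tt_sample)],
        [(2, obj.sv_reset)],
        horizon=10)
    print(obj.log)
```

The application code names no thread and no priority. When each method runs is declared as data (the `AAC`) or implied by message arrival, which is the programming style the figure annotates.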

Page 19

TMO and OptIPuter Software

• TMO will be Integrated into the Overall OptIPuter Software Architecture
• Begin Design of TMO Programming Framework for the OptIPuter
• Prototype Implementation of TMO Support on Linux Platforms, Including OptIPuter Visualization Cluster (UIC – Leigh, UCI – Jenks)
• An API Wrapping the Services of the RT Middleware Enables High-Level RT Programming Without a New Compiler

[Figure: two nodes, each with the stack Kernel, TMOSM, FT Support, Middleware, Lambda mux / demux, exchanging data. Captions: "Let us start a chorus at 2pm"; "e-Science"]

Source: Kim, UCI
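
The "Let us start a chorus at 2pm" caption suggests coordinating by agreed global time rather than by a "start now" message. The arithmetic below is a sketch under assumed numbers (the clock errors and latencies are invented for illustration). It compares how far apart the nodes actually start in each case.

```python
def time_triggered_starts(global_start: float, sync_errors: list[float]) -> list[float]:
    """Actual start instants when every node waits for the agreed global time.

    sync_errors[i] is node i's clock error in seconds relative to true global time.
    """
    return [global_start + e for e in sync_errors]


def message_triggered_starts(send_time: float, latencies: list[float]) -> list[float]:
    """Actual start instants when each node starts on receipt of a 'start now' message."""
    return [send_time + d for d in latencies]


def skew(starts: list[float]) -> float:
    """Spread between the earliest and latest node."""
    return max(starts) - min(starts)


if __name__ == "__main__":
    # Assumed figures: clocks within half a millisecond, WAN latencies of 1 to 80 ms.
    tt = time_triggered_starts(100.0, [0.0005, -0.0005, 0.0002])
    mt = message_triggered_starts(100.0, [0.001, 0.030, 0.080])
    print(f"time-triggered skew:    {skew(tt) * 1e3:.2f} ms")
    print(f"message-triggered skew: {skew(mt) * 1e3:.2f} ms")
```

With time-triggered starts the skew depends on clock synchronization quality, not on how far apart the sites are, which is what makes the approach attractive when moving from LAN to WAN.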

Page 20

Prophesy: Application Performance Modeling

• Performance Modeling of Applications on OptIPuter
• Cross Platform Comparison (vs. Traditional Grid & Parallel)
• Yr1: Completed Data Analysis Module
• Yr2: Work with Applications and High Speed Transport Protocols
• Target Applications Include:
  – SIO Geophysical Data Visualization
  – NCMIR/BIRN Neuroscience Applications

[Figure: Prophesy framework. DATA COLLECTION: Profiling & Instrumentation, Actual Execution. DATABASES: Performance Database, Template Database, Systems Database. DATA ANALYSIS: Model Builder, Symbolic Predictor. Also labeled: Web-based GUI]

Source: Taylor, TAMU
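
The slides do not say which modeling techniques the Model Builder uses. As a stand-in, the sketch below assumes the simplest case: measured (problem size, run time) pairs from a performance database are fitted by least squares to time = a + b * size, and the fit is then used to predict an unmeasured size. The function names and sample values are hypothetical.

```python
def fit_linear(samples: list[tuple[float, float]]) -> tuple[float, float]:
    """Least-squares fit of time = a + b * size over (size, time) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("need at least two distinct sizes")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(model: tuple[float, float], size: float) -> float:
    """Predicted run time for a problem size under the fitted model."""
    a, b = model
    return a + b * size


if __name__ == "__main__":
    # Invented measurements: (problem size, seconds).
    measured = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
    model = fit_linear(measured)
    print(model, predict(model, 10.0))
```

Fitting the same application's measurements on two platforms yields two models whose predictions can be compared directly, which is the kind of cross-platform comparison the second bullet describes.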

Page 21

Summary

• OptIPuter System Software Team Organization
  – Development of a Concrete, Shared Perspective
  – Organization into Tightly-Coupled Teams
• OptIPuter Software Architecture 1.0 (July 2003)
  – Provides Focus on Key Problems, Clusters Related Activities
  – Framework for Integrating Diverse Capabilities, Identifying Gaps, Integrating and Delivering Solutions
• Research Activity Clusters
  – Distributed Virtual Computers
    – Including Real-Time, Security, Storage, Performance Modeling
  – High Speed Transport Protocols
  – Optical Signaling and Network Management