Enabling Data-Intensive Science with Tactical Storage...


Transcript of Enabling Data-Intensive Science with Tactical Storage...

Page 1: Enabling Data-Intensive Science with Tactical Storage Systemscern.ch/Computing.Seminars/2006/0125/slides.pdf · The Cooperative Computing Lab Our model of computer science research:

Enabling Data-Intensive Science
with Tactical Storage Systems

Prof. Douglas Thain
University of Notre Dame
http://www.cse.nd.edu/~dthain

Page 2:

The Cooperative Computing Lab

Our model of computer science research:
– Understand how users with complex, large-scale applications need to interact with computing systems.
– Design novel computing systems that can be applied by many different users == basic CS research.
– Deploy code in real systems with real users, suffer real bugs, and learn real lessons == applied CS.

Application Areas:
– Astronomy, Bioinformatics, Biometrics, Molecular Dynamics, Physics, Game Theory, ... ???

External Support: NSF, IBM, Sun

http://www.cse.nd.edu/~ccl

Page 3:

Abstract

Users of distributed systems encounter many practical barriers between their jobs and the data they wish to access.

Problem: Users have access to many resources (disks), but are stuck with the abstractions (cluster NFS) provided by administrators.

Solution: Tactical Storage Systems allow any user to create, reconfigure, and tear down abstractions without bugging the administrator.

Page 4:

The Standard Model

[Figure labels: Transparent Distributed Filesystem; shared disk]

Page 5:

The Standard Model

[Figure labels: Transparent Distributed Filesystem and shared disk (twice); private disk (four times); FTP, SCP, RSYNC, HTTP, ...]

Page 6:

Problems with the Standard Model

Users encounter partitions in the WAN.
– Easy to access data inside cluster, hard outside.
– Must use different mechanisms on diff links.
– Difficult to combine resources together.

Different access modes for different purposes.
– File transfer: preparing system for intended use.
– File system: access to data for running jobs.

Resources go unused.
– Disks on each node of a cluster.
– Unorganized resources in a department/lab.

A global file system can't satisfy everyone!

Page 7:

What if...

Users could easily access any storage?
I could borrow an unused disk for NFS?
An entire cluster can be used as storage?
Multiple clusters could be combined?
I could reconfigure structures without root?
– (Or bugging the administrator daily.)

Solution: Tactical Storage System (TSS)

Page 8:

Outline

Problems with the Standard Model
Tactical Storage Systems
– File Servers, Catalogs, Abstractions, Adapters

Applications:
– Remote Database Access for BaBar Code
– Remote Dynamic Linking for CDF Code
– Logical Data Access for Bioinformatics Code
– Expandable Database for MD Simulation

Improving the OS for Grid Computing

Page 9:

Tactical Storage Systems (TSS)

A TSS allows any node to serve as a file server or as a file system client.
All components can be deployed without special privileges – but with security.
Users can build up complex structures.
– Filesystems, databases, caches, ...

Two Independent Concepts:
– Resources – The raw storage to be used.
– Abstractions – The organization of storage.

Page 10:

[Figure: architecture overview. Labels: seven filesystems on UNIX machines ("Cluster administrator controls policy on all storage in cluster"); seven file servers on UNIX machines ("Workstation owners control policy on each machine."); Central Filesystem, Distributed Filesystem Abstraction, and Distributed Database Abstraction, each used by an App through an Adapter; file transfer; 3PT; ??? Adapter]

Page 11:

Components of a TSS:

1 – File Servers
2 – Catalogs
3 – Abstractions
4 – Adapters

Page 12:

1 – File Servers

Unix-Like Interface
– open/close/read/write
– getfile/putfile to stream whole files
– opendir/stat/rename/unlink

Complete Independence
– choose friends
– limit bandwidth/space
– evict users?

Trivial to Deploy
– run server + setacl
– no privilege required
– can be thrown into a grid system

Flexible Access Control

[Figure labels: file server A and file server B, each with a filesystem; Chirp Protocol; owner of server A; owner of server B]

Page 13:

Related Work

Lots of file services for the Grid:
– GridFTP, NeST, SRB, RFIO, SRM, IBP, ...
– Adapter interfaces with many of these!

Why have another file server?
– Reason 1: Must have precise Unix semantics!
  Apps distinguish ENOENT vs EACCES vs EISDIR.
  FTP always returns error 550, regardless of error.
– Reason 2: TSS focused on easy deployment.
  No privilege required, no config files, no rebuilding, flexible access control, ...
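To make Reason 1 concrete, here is a small Python sketch (assuming a POSIX system) of how an application tells failure causes apart by errno, and what is lost when every failure collapses into one code. The helper names `classify_open_error` and `ftp_style_reply` are hypothetical and exist only for this illustration; the second one simply mirrors the slide's claim about FTP and is not an FTP client.

```python
import errno
import os
import tempfile


def classify_open_error(path):
    """Try to open `path` for writing; return the errno name, or "OK"."""
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError as e:
        return errno.errorcode[e.errno]
    os.close(fd)
    return "OK"


def ftp_style_reply(errno_name):
    """Collapse any failure into a single code, as the slide says FTP does."""
    return 550


with tempfile.TemporaryDirectory() as d:
    # A path that does not exist and a path that is a directory fail differently.
    missing = classify_open_error(os.path.join(d, "no-such-file"))
    is_dir = classify_open_error(d)

print(missing, is_dir)
print(ftp_style_reply(missing), ftp_style_reply(is_dir))
```

An application can react differently to the two errno names (create the file, or descend into the directory), but it cannot do so once both have become the same reply code.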

Page 14:

Access Control in File Servers

Unix Security is not Sufficient
– No global user database possible/desirable.
– Mapping external credentials to Unix gets messy.

Instead, Make External Names First-Class
– Perform access control on remote, not local, names.
– Types: Globus, Kerberos, Unix, Hostname, Address

Each directory has an ACL:
globus:/O=NotreDame/CN=DThain RWLA
kerberos:[email protected] RWL
hostname:*.cs.nd.edu RL
address:192.168.1.* RWLA
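A per-directory ACL of this shape can be checked with simple wildcard matching. The following Python sketch is an illustration only, not the file server's actual code: it treats each rights letter as an opaque flag, and it assumes that a subject receives the union of the rights from every entry that matches, which the slide does not state.

```python
from fnmatch import fnmatchcase

# Entries copied from the slide: (subject pattern, rights letters).
ACL = [
    ("globus:/O=NotreDame/CN=DThain", "RWLA"),
    ("hostname:*.cs.nd.edu", "RL"),
    ("address:192.168.1.*", "RWLA"),
]


def rights_for(subject, acl):
    """Collect the rights of every ACL entry whose pattern matches `subject`."""
    granted = set()
    for pattern, rights in acl:
        if fnmatchcase(subject, pattern):
            granted |= set(rights)
    return granted


def allowed(subject, needed, acl):
    """True when `subject` holds every rights letter in `needed`."""
    return set(needed) <= rights_for(subject, acl)


print(allowed("hostname:laptop.cs.nd.edu", "RL", ACL))
print(allowed("hostname:laptop.cs.nd.edu", "W", ACL))
```

Because the subject string carries its own type prefix (`globus:`, `hostname:`, `address:`), no mapping to local Unix accounts is needed, which is the point the slide makes.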

Page 15:

Problem: Shared Namespace

[Figure: a file server directory with the ACL "globus:/O=NotreDame/* RWLAX" holding a.out, test.c, test.dat, and cms.exe]

Page 16:

Solution: Reservation (V) Right

[Figure: a file server directory with the ACL "O=NotreDame/CN=* V(RWLA)" (mkdir only!). A mkdir by /O=NotreDame/CN=Monk yields a directory with the ACL "/O=NotreDame/CN=Monk RWLA" holding a.out and test.c; a mkdir by /O=NotreDame/CN=Ted yields a directory with the ACL "/O=NotreDame/CN=Ted RWLA" holding a.out and test.c]
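The reservation right can be sketched as follows, under the reading suggested by the figure: a subject matching a `V(...)` entry may only create a directory, and the new directory starts with an ACL that grants the rights inside the parentheses to that subject alone. The `Directory` class and the `mkdir` function are hypothetical stand-ins in Python, not the server's API.

```python
from fnmatch import fnmatchcase


class Directory:
    """In-memory stand-in for a directory with an ACL (hypothetical)."""

    def __init__(self, acl):
        self.acl = dict(acl)   # subject pattern -> rights string
        self.children = {}     # name -> Directory


def reserved_rights(directory, subject):
    """Return the rights inside V(...) for `subject`, or None if no entry matches."""
    for pattern, rights in directory.acl.items():
        if fnmatchcase(subject, pattern) and rights.startswith("V(") and rights.endswith(")"):
            return rights[2:-1]
    return None


def mkdir(parent, name, subject):
    """Create a child directory owned by `subject` if the parent grants V."""
    rights = reserved_rights(parent, subject)
    if rights is None:
        raise PermissionError(subject + " has no reservation right here")
    if name in parent.children:
        raise FileExistsError(name)
    child = Directory({subject: rights})   # fresh ACL naming only the caller
    parent.children[name] = child
    return child


root = Directory({"/O=NotreDame/CN=*": "V(RWLA)"})
monk_dir = mkdir(root, "monk", "/O=NotreDame/CN=Monk")
ted_dir = mkdir(root, "ted", "/O=NotreDame/CN=Ted")
print(monk_dir.acl, ted_dir.acl)
```

Each user ends up with a private directory, so the shared-namespace problem from the previous slide does not arise even though both users matched the same wildcard entry.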

Page 17:

2 - Catalogs

[Figure labels: catalog server (twice); periodic UDP updates; HTTP; XML, TXT, ClassAds]

Page 18:

3 - Abstractions

An abstraction is an organizational layer built on top of one or more file servers.
End Users choose what abstractions to employ.
Working Examples:
– CFS: Central File System
– DSFS: Distributed Shared File System
– DSDB: Distributed Shared Database

Others Possible?
– Distributed Backup System
– Striped File System (RAID/Zebra)

Page 19:

CFS: Central File System

[Figure labels: three applications (appl), each with an adapter running CFS; one file server holding three files]

Page 20:

DSFS: Dist. Shared File System

[Figure labels: two applications (appl), each with an adapter running DSFS; three file servers; three pointers (ptr); many files spread over the servers; arrows labeled "lookup file location" and "access data"]
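The two arrows in the DSFS figure suggest a two-step read: look up a pointer that names the file's location, then fetch the data from that location. The Python sketch below models that idea with in-memory stand-ins. The class names, the `host:name` pointer format, and the placement policy (the data server currently holding the fewest files) are all assumptions made for illustration; overwrites and deletes are ignored.

```python
class FileServer:
    """In-memory stand-in for one file server (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.files = {}


class DSFS:
    """One directory server holds pointers; data servers hold file contents."""

    def __init__(self, directory_server, data_servers):
        self.directory = directory_server
        self.data = {s.name: s for s in data_servers}

    def write(self, path, content):
        # Assumed placement policy: the data server holding the fewest files.
        target = min(self.data.values(), key=lambda s: len(s.files))
        data_name = "file%d" % (len(target.files) + 1)
        target.files[data_name] = content
        # The directory entry is only a pointer to where the data lives.
        self.directory.files[path] = "%s:%s" % (target.name, data_name)

    def read(self, path):
        host, data_name = self.directory.files[path].split(":", 1)  # lookup file location
        return self.data[host].files[data_name]                     # access data


directory = FileServer("dirserver")
host_a, host_b = FileServer("hostA"), FileServer("hostB")
fs = DSFS(directory, [host_a, host_b])
fs.write("/results/run1.dat", b"one")
fs.write("/results/run2.dat", b"two")
print(directory.files)
print(fs.read("/results/run2.dat"))
```

The directory server stays small because it stores only pointers, while capacity grows by adding data servers.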

Page 21:

DSDB: Dist. Shared Database

[Figure labels: two applications (appl), each with an adapter running DSDB; two file servers holding files; a database server holding a file index; arrows labeled insert, create file, query, and direct access]

Page 22:

4 - Adapter

Adapter - Parrot

Like an OS Kernel
– Tracks procs, files, etc.
– Adds new capabilities.
– Enforces owner's policies.

Delegated Syscalls
– Trapped via ptrace interface.
– Action taken by Parrot.
– Resources charged to Parrot.

User Chooses Abstr.
– Appears as a filesystem.
– Option: Timeout tolerance.
– Option: Cons. semantics.
– Option: Servers to use.
– Option: Auth mechanisms.

[Figure labels: two process trees (tcsh, cat, vi); system calls trapped via ptrace; process table; file table; Abstractions: CFS – DSFS – DSDB; HTTP, FTP, RFIO, NeST, SRB, gLite; ???]

Page 23:

[Figure: architecture overview, repeated from Page 10. Labels: seven filesystems on UNIX machines ("Cluster administrator controls policy on all storage in cluster"); seven file servers on UNIX machines ("Workstation owners control policy on each machine."); Central Filesystem, Distributed Filesystem Abstraction, and Distributed Database Abstraction, each used by an App through an Adapter; file transfer; ??? Adapter]

Page 24:

Performance Summary

Nothing comes for free!
– System calls: order of magnitude slower.
– Memory bandwidth overhead: extra copies.
– TSS can drive network/switch to limits.

Compared to NFS Protocol:
– TSS slightly better on small operations. (no lookup)
– TSS much better in network bandwidth. (TCP)
– NFS caches, TSS doesn't (today), mixed blessing.

On real applications:
– Measurable slowdown, typically 5 percent.
– Benefit: far more flexible and scalable.

Page 25:

Outline

Problems with the Standard Model
Tactical Storage Systems
– File Servers, Catalogs, Abstractions, Adapters

Applications:
– Remote Database Access for BaBar Code
– Remote Dynamic Linking for CDF Code
– Logical Data Access for Bioinformatics Code
– Expandable Database for MD Simulation

Improving the OS for Grid Computing

Page 26:

Remote Database Access

HEP Simulation Needs Direct DB Access
– App linked against Objectivity DB.
– Objectivity accesses filesystem directly.
– How to distribute application securely?

Solution: Remote Root Mount via TSS:
parrot -M /=/chirp/fileserver/rootdir
DB code can read/write/lock files directly.

[Figure labels: script; sim.exe; libdb.so; Parrot; CFS; WAN; GSI Auth; TSS file server; filesystem; DB data]

Credit: Sander Klous @ NIKHEF
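The remote root mount redirects every path the application opens to a location on a file server. The Python sketch below shows one way such a redirection could work; it is not Parrot's implementation. It assumes a mount is written as `PREFIX=TARGET`, that the longest matching prefix wins, and that a redirected path has the form `/service/host/rest`. The second mount in the example, `/scratch=/chirp/otherhost/tmp`, is invented for illustration.

```python
def parse_mount(spec):
    """Split 'PREFIX=TARGET' at the first '='."""
    prefix, target = spec.split("=", 1)
    return prefix, target


def redirect(path, mounts):
    """Rewrite an absolute `path` using the longest matching mount prefix."""
    best = None
    for prefix, target in mounts:
        p = prefix.rstrip("/")   # the root mount "/" becomes the empty prefix
        if p == "" or path == p or path.startswith(p + "/"):
            if best is None or len(p) > len(best[0]):
                best = (p, target.rstrip("/"))
    if best is None:
        return path
    p, target = best
    return target + path[len(p):]


def split_service(path):
    """Split '/service/host/rest' into (service, host, '/rest')."""
    parts = path.split("/", 3)
    if len(parts) < 3 or parts[0] != "" or not parts[1] or not parts[2]:
        raise ValueError("not a /service/host/... path: " + path)
    rest = "/" + parts[3] if len(parts) == 4 else "/"
    return parts[1], parts[2], rest


mounts = [
    parse_mount("/=/chirp/fileserver/rootdir"),    # from the slide
    parse_mount("/scratch=/chirp/otherhost/tmp"),  # hypothetical extra mount
]
remote = redirect("/etc/passwd", mounts)
print(remote)
print(split_service(remote))
```

With only the root mount in place, every file the database library touches resolves to the remote server, which is why the unmodified DB code can read, write, and lock files directly.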

Page 27:

Remote Application Loading

Modular Simulation Needs Many Libraries
– Devel. on workstations, then ported to grid.
– Selection of library depends on analysis tech.
– Constraint: Must use HTTP for file access.

Solution: Dynamic Link with TSS+HTTP:
– /home/cdfsoft --> /http/dcaf.fnal.gov/cdfsoft

[Figure labels: appl; ld.so; Parrot; HTTP; WAN; HTTP server; filesystem; liba.so, libb.so, libc.so; "select several MB from 60 GB of libraries"]

Credit: Igor Sfiligoi @ Fermi National Lab

Page 28:

Technical Problem

HTTP is not a filesystem! (No directories)
– Advantages: Firewalls, caches, admins.

[Figure: the application (Appl) calls opendir(/home); Parrot passes opendir(/home) to its HTTP Module, which sends "GET /home HTTP/1.0" to the HTTP server and gets back HTML ("<HTML><HEAD> <H1>"). Server tree labels: root; etc, home, bin; alice, cms, babar]

Page 29:

Technical Problem

Solution: Turn the directories into files.
– Can be cached in ordinary proxies!

[Figure: the application (Appl) calls opendir(/home); Parrot passes opendir(/home) to its HTTP Module, which sends "GET /home/.dir HTTP/1.0" to the HTTP server. makehttpfs has written .dir files into the server tree (root; etc, home, bin; alice, cms, babar); the .dir for /home lists alice, babar, cms]
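The directory-as-file trick can be sketched in a few lines of Python. This is an illustration of the idea, not the real makehttpfs tool: the one-name-per-line format of `.dir` is an assumption, and the `fetch` function reads from a local temporary tree as a stand-in for an HTTP GET.

```python
import os
import tempfile


def make_dir_listings(root):
    """Write a '.dir' file listing the entries of every directory under `root`."""
    for dirpath, dirnames, filenames in os.walk(root):
        names = sorted(n for n in dirnames + filenames if n != ".dir")
        with open(os.path.join(dirpath, ".dir"), "w") as f:
            f.write("".join(name + "\n" for name in names))


def opendir_via_get(fetch, path):
    """Emulate opendir(path) by fetching path + '/.dir' as an ordinary file."""
    body = fetch(path.rstrip("/") + "/.dir")
    return body.splitlines()


# Build the tree from the slide: /home holds alice, babar, and cms.
root = tempfile.mkdtemp()
for user in ("alice", "babar", "cms"):
    os.makedirs(os.path.join(root, "home", user))
make_dir_listings(root)


def fetch(url_path):
    """Stand-in for 'GET url_path': read the file from the local tree."""
    with open(os.path.join(root, url_path.lstrip("/"))) as f:
        return f.read()


listing = opendir_via_get(fetch, "/home")
print(listing)
```

Since a `.dir` file is just another static document, any ordinary HTTP proxy can cache it, which is the advantage the slide points out.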

Page 35:

Logical Access to Bio Data

Many databases of biological data in different formats around the world:
– Archives: Swiss-Prot, TrEMBL, NCBI, etc...
– Replicas: Public, Shared, Private, ???

Users and applications want to refer to data objects by logical name, not location!
– Access the nearest copy of the non-redundant protein database, don't care where it is.

Solution: EGEE data management system maps logical names (LFNs) to physical names (SFNs).

Credit: Christophe Blanchet, Bioinformatics Center of Lyon, CNRS IBCP, France
http://gbio.ibcp.fr/cblanchet, [email protected]

Page 36:

Logical Access to Bio Data

[Figure: BLAST runs under Parrot, which has RFIO, gLite, HTTP, and FTP modules. A Chirp server, an FTP server, and a gLite server each hold nr.data; an EGEE File Location Service answers lookups. Labels on the arrows:
– Run BLAST on LFN://ncbi.gov/nr.data
– open(LFN://ncbi.gov/nr.data)
– Where is LFN://ncbi.gov/nr.data?
– Find it at: SFN://ibcp.fr/data/NR
– open(SFN://ibcp.fr/nr.data)
– RETR nr.data]
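The lookup step in this figure amounts to a name translation before the open. The Python sketch below illustrates it with a dictionary standing in for the EGEE File Location Service. The second replica, the `prefer_host` parameter, and the function names are hypothetical; the real service's interface is not shown in the slides.

```python
# Stand-in for the location service: logical name -> known physical replicas.
REPLICA_CATALOG = {
    "LFN://ncbi.gov/nr.data": [
        "SFN://ibcp.fr/data/NR",             # from the slide
        "SFN://example.org/mirror/nr.data",  # hypothetical second replica
    ],
}


def host_of(sfn):
    """Return the host part of 'SFN://host/path'."""
    return sfn.split("://", 1)[1].split("/", 1)[0]


def resolve(lfn, catalog, prefer_host=None):
    """Map a logical name to one physical name, preferring `prefer_host` if given."""
    replicas = catalog.get(lfn)
    if not replicas:
        raise FileNotFoundError(lfn)
    for sfn in replicas:
        if prefer_host is not None and host_of(sfn) == prefer_host:
            return sfn
    return replicas[0]


def logical_open(name, catalog, opener):
    """Translate LFN names before handing them to `opener`; pass others through."""
    if name.startswith("LFN://"):
        name = resolve(name, catalog)
    return opener(name)


opened = logical_open("LFN://ncbi.gov/nr.data", REPLICA_CATALOG, lambda n: n)
print(opened)
```

The application only ever names the logical file, so replicas can be added or moved without changing how BLAST is invoked.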

Page 37:

Appl: Distributed MD Database

State of Molecular Dynamics Research:
– Easy to run lots of simulations!
– Difficult to understand the "big picture"
– Hard to systematically share results and ask questions.

Desired Questions and Activities:
– "What parameters have I explored?"
– "How can I share results with friends?"
– "Replicate these items five times for safety."
– "Recompute everything that relied on this machine."

GEMS: Grid Enabled Molecular Sims
– Distributed database for MD simulation at Notre Dame.
– XML database for indexing, TSS for storage/policy.

Page 38:

GEMS Distributed Database

[Figure labels: database server; two catalog servers; query "XML + Temp>300K Mol==CH4"; index entries "XML -> host1:fileA host7:fileB host3:fileC" and "XML -> host6:fileX host2:fileY host5:fileZ"; data files A, C, B, Y, Z, X; an Adapter running DSFS with "host5:fileZ host6:fileX"]

Credit: Jesus Izaguirre and Aaron Striegel, Notre Dame CSE Dept.
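The split between an index and the storage behind it can be sketched as follows. GEMS uses an XML database for indexing; this Python sketch swaps in an in-memory SQLite table purely to keep the example self-contained, and plain dictionaries stand in for file servers. The column names, the placement policy, and the `DSDB` class are assumptions; only the query shape (temperature above 300 K, molecule CH4) comes from the slide.

```python
import sqlite3


class DSDB:
    """Index in a database; file contents on separate (simulated) file servers."""

    def __init__(self, hosts):
        self.servers = {h: {} for h in hosts}   # host -> {file name: data}
        self.index = sqlite3.connect(":memory:")
        self.index.execute(
            "CREATE TABLE files (molecule TEXT, temp_k REAL, host TEXT, name TEXT)"
        )

    def insert(self, molecule, temp_k, name, data):
        # Assumed policy: create the file on the host holding the fewest files.
        host = min(self.servers, key=lambda h: len(self.servers[h]))
        self.servers[host][name] = data
        self.index.execute(
            "INSERT INTO files VALUES (?, ?, ?, ?)", (molecule, temp_k, host, name)
        )
        return host

    def query(self, molecule, min_temp_k):
        """Return (host, name) pairs for matching records."""
        rows = self.index.execute(
            "SELECT host, name FROM files WHERE molecule = ? AND temp_k > ? ORDER BY name",
            (molecule, min_temp_k),
        )
        return rows.fetchall()

    def fetch(self, host, name):
        """Direct access: read the data from the server that holds it."""
        return self.servers[host][name]


db = DSDB(["host1", "host2"])
db.insert("CH4", 310.0, "fileA", b"trajectory A")
db.insert("CH4", 290.0, "fileB", b"trajectory B")
db.insert("H2O", 350.0, "fileC", b"trajectory C")
hits = db.query("CH4", 300.0)
print(hits)
print(db.fetch(*hits[0]))
```

The query touches only the small index; the bulk data moves directly between the client and the file server that stores it.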

Page 39:

Active Recovery in GEMS

Page 40:

GEMS and Tactical Storage

Dynamic System Configuration
– Add/remove servers, discovered via catalog

Policy Control in File Servers
– Groups can Collaborate within Constraints
– Security Implemented within File Servers

Direct Access via Adapters
– Unmodified Simulations can use Database
– Alternate Web/Viz Interfaces for Users.

Page 41:

Outline

Problems with the Standard Model
Tactical Storage Systems
– File Servers, Catalogs, Abstractions, Adapters

Applications:
– Remote Database Access for BaBar Code
– Remote Dynamic Linking for CDF Code
– Logical Data Access for Bioinformatics Code
– Expandable Database for MD Simulation

Improving the OS for Grid Computing

Page 42:

OS Support for Grid Computing

Grid computing in general suffers because of limitations in the operating system.
Security and permissions:
– No ACLs --> hard to share data
– Root can setuid --> hard to secure services.

Resource allocation:
– Cannot reserve space --> jobs crash
– Hard to clean up procs --> unreliable systems

Page 43:

[Figure: a tree of user identities. Node labels: root; student; httpd; kerberos; alice; bob; visitor (under alice); visitor (under bob); anon1; anon2. Annotations:
– These two users are completely different: root:kerberos:alice:visitor and root:kerberos:bob:visitor
– The web server can create distinct anonymous accounts. No need for global nobody.
– kerberos given to the login server.
– alice created by krb5 login.
– student created at run-time.]
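The naming scheme in this figure, where any identity can create children and a full name is the path from the root, can be modeled in a few lines of Python. The `Identity` class and its methods are hypothetical and only illustrate the naming rule; they say nothing about how an operating system would enforce it.

```python
class Identity:
    """A user identity that can create child identities beneath itself."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def create_child(self, name):
        """Any identity may create a child without involving the root."""
        return Identity(name, parent=self)

    def full_name(self):
        """Join the names on the path from the root, separated by ':'."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return ":".join(reversed(parts))

    def is_ancestor_of(self, other):
        """True when `self` lies on the path from `other` up to the root."""
        node = other.parent
        while node is not None:
            if node is self:
                return True
            node = node.parent
        return False


root = Identity("root")
kerberos = root.create_child("kerberos")
alice = kerberos.create_child("alice")
bob = kerberos.create_child("bob")
alice_visitor = alice.create_child("visitor")
bob_visitor = bob.create_child("visitor")
print(alice_visitor.full_name())
print(bob_visitor.full_name())
```

The two visitors share a short name but have different full names, which is why the slide calls them completely different users.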

Page 44:

Tactical Storage Systems

Separate Abstractions from Resources
Components:
– Servers, catalogs, abstractions, adapters.
– Completely user level.
– Performance acceptable for real applications.

Independent but Cooperating Components
– Owners of file servers set policy.
– Users must work within policies.
– Within policies, users are free to build.

Page 45:

Parting Thought

Many users of the grid are constrained by functionality, not performance.
TSS allows end users to build the structures that they need for the moment without involving an admin.
Analogy: building blocks for distributed storage.

Page 46:

Acknowledgments

Science Collaborators:
– Christophe Blanchet
– Sander Klous
– Peter Kunzst
– Erwin Laure
– John Poirer
– Igor Sfiligoi

CS Collaborators:
– Jesus Izaguirre
– Aaron Striegel

CS Students:
– Paul Brenner
– James Fitzgerald
– Jeff Hemmes
– Paul Madrid
– Chris Moretti
– Phil Snowberger
– Justin Wozniak

Page 47:

For more information...

Cooperative Computing Lab
http://www.cse.nd.edu/~ccl

Cooperative Computing Tools
http://www.cctools.org

Douglas Thain
– [email protected]
– http://www.cse.nd.edu/~dthain