Enabling Data-Intensive Science with Tactical Storage Systems
Enabling Data-Intensive Science
with Tactical Storage Systems
Douglas Thain
http://www.cse.nd.edu/~dthain
Cooperative Computing Lab
http://www.cse.nd.edu/~ccl
Sharing is Hard!
Despite decades of research in distributed systems and operating systems, sharing computing resources is still technically and socially difficult!
Most existing systems for sharing require:
– Kernel level software.
– A privileged login.
– Centralized trust.
– Loss of control over resources that you own.
Example: Grid Computing
Robert Gardner, et al. (102 authors)
The Grid2003 Production Grid: Principles and Practice
IEEE HPDC 2004
The Grid2003 Project has deployed a multi-virtual organization, application-driven grid laboratory that has sustained for several months the production-level services required by…
ATLAS, CMS, SDSS, LIGO…
Grid Computing Experience
The good news:
– 27 sites with 2800 CPUs
– 40985 CPU-days provided over 6 months
– 10 applications with 1300 simultaneous jobs
The bad news:
– 40-70 percent utilization
– 30 percent of jobs would fail
– 90 percent of failures were site problems
– Most site failures were due to disk space.
A Strange Problem
Storage is Plentiful!
– Large disks on every CPU, PDA, and iPod.
– Typ. cluster has unused disks on each node.
– MS filesystem study: most disks 90% free.
– Tools for sharing: AFS, NFS, FTP, SCP...
The problem:
– Users are fixed to the abstractions provided by administrators: e.g. one NFS file system.
– Result: 1000 people share one 40 GB disk.
What if...
Users could use any storage anywhere?
I could borrow an unused disk for NFS?
An entire cluster could be used as storage?
Multiple clusters could be combined?
All this could be done without root?
Solution: Tactical Storage System (TSS)
Outline
Why is Sharing Data so Hard?
Tactical Storage Systems
– File Servers, Abstractions, Adapters
Performance Comparison
Application: High-Energy Physics
Application: Bioinformatics Database
Conclusion
Tactical Storage Systems (TSS)
A TSS allows any node to serve as a file server or as a file system client.
All components can be deployed without special privileges – but with security.
Users can build up complex structures.
– Filesystems, databases, caches, ...
Two Independent Concepts:
– Resources – The raw storage to be used.
– Abstractions – The organization of storage.
[Figure: applications (App) attach through adapters to a central filesystem, a distributed filesystem abstraction, and a distributed database abstraction. The abstractions are built from file servers, each exporting the filesystem of a UNIX machine. The cluster administrator controls policy on all storage in the cluster; workstation owners control policy on each machine.]
Three Components
User-Level File Servers
– Secure Remote File Access w/out root
Storage Abstractions
– Combine several file servers into one.
Application Adapters
– Attach existing applications w/out root.
User-Level File Servers
Unix-Like Access to Existing File Systems
Complete Independence
– choose friends
– limit bandwidth
– evict users?
Trivial to Deploy
– three steps
Flexible Access Control
[Figure: file servers export a filesystem over the Chirp protocol.]
Access Control in File Servers
Unix Security is not Sufficient for the Job
Authentication
– Globus, Kerberos, Unix, Hostname, Address
Authorization
– Each directory has an access control list:
globus:/O=INFN/CN=Paolo_Mazzanti RWLA
kerberos:[email protected] RWL
hostname:*.bo.infn.it RL
address:192.168.1.* RWLA
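The per-directory access control above can be sketched in a few lines of Python. This is only an illustration, not the file server's actual code: the in-memory list, the function names `rights_for` and `allowed`, the use of shell-style wildcard matching, and the rule that rights from all matching entries are combined are assumptions made for the example. The letters are treated as opaque rights flags.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory form of one directory's ACL:
# each entry is (subject pattern, rights letters), copied from the slide.
ACL = [
    ("globus:/O=INFN/CN=Paolo_Mazzanti", "RWLA"),
    ("hostname:*.bo.infn.it", "RL"),
    ("address:192.168.1.*", "RWLA"),
]


def rights_for(subject, acl):
    """Return the set of rights letters granted to an authenticated subject.

    Assumption: rights from every matching entry are combined.
    """
    granted = set()
    for pattern, rights in acl:
        if fnmatchcase(subject, pattern):
            granted.update(rights)
    return granted


def allowed(subject, right, acl):
    """True if the subject holds the given right letter in this directory."""
    return right in rights_for(subject, acl)
```

For example, `allowed("hostname:pc1.bo.infn.it", "W", ACL)` is false because the hostname entry grants only R and L, while any client at an address matching `192.168.1.*` holds all four rights.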
Widely Shared Storage Servers
[Figure: a file server whose ACL is "globus:/O=INFN/CN=* RWLAX", holding a.out, test.c, test.dat, and cms.exe.]
Reservation Right (V)
[Figure: a file server whose top-level ACL is "globus:/O=INFN/CN=* V(RWLA)". /O=INFN/CN=Mazzanti issues mkdir and gets a directory with ACL "/O=INFN/CN=Mazzanti RWLA", holding a.out and test.c. /O=INFN/CN=Berlusconi issues mkdir and gets a directory with ACL "/O=INFN/CN=Berlusconi RWLA", holding a.out and test.c. At the top level: mkdir only!]
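A minimal sketch of the reservation right, assuming the behavior shown in the figure: a subject matching a `V(RWLA)` entry may only mkdir, and the new directory starts with an ACL granting that subject alone the rights named in parentheses. The `Directory` class, `split_rights`, `mkdir`, and the rule that an ordinary W holder's new directory inherits the parent ACL are hypothetical details chosen for the example.

```python
from fnmatch import fnmatchcase


class Directory:
    def __init__(self, acl):
        self.acl = list(acl)      # list of (subject pattern, rights string)
        self.children = {}        # name -> Directory


def split_rights(rights):
    """Split e.g. 'V(RWLA)' into (plain rights, rights granted on reserve)."""
    start = rights.find("V(")
    if start < 0:
        return rights, ""
    end = rights.index(")", start)
    return rights[:start] + rights[end + 1:], rights[start + 2:end]


def mkdir(parent, name, subject):
    """Create a child directory if the subject holds W or a reservation right."""
    for pattern, rights in parent.acl:
        if not fnmatchcase(subject, pattern):
            continue
        plain, reserved = split_rights(rights)
        if "W" in plain:
            # Assumption: an ordinary mkdir inherits the parent's ACL.
            child = Directory(parent.acl)
        elif reserved:
            # Reservation: the fresh directory belongs to the caller alone.
            child = Directory([(subject, reserved)])
        else:
            continue
        parent.children[name] = child
        return child
    raise PermissionError(f"{subject} may not mkdir here")
```

With a root ACL of `globus:/O=INFN/CN=* V(RWLA)`, two INFN users each obtain a private directory, and neither can create anything inside the other's.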
Abstractions
Users Create Higher Level Structures
– Admins do not know/care about abstractions.
Current Abstraction Types:
– CFS – Central File System
– DSFS – Dist Shared File System
– DSDB – Dist Shared Database
Abstractions Under Development:
– Striped File System
– Distributed Time Travel Backup System
CFS: Central File System
[Figure: three applications (appl), each with its own adapter, access files stored on a single file server.]
DSFS: Dist. Shared File System
[Figure: applications (appl) with adapters contact one file server that holds pointers (ptr); the pointers refer to files stored on other file servers.]
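The pointer idea behind DSFS can be sketched as follows, assuming one server holds only small pointer entries and the data lives on other servers. Everything here is hypothetical scaffolding for illustration: the in-memory `FileServer` and `DSFS` classes, the `host:path` pointer format, and the round-robin placement are not taken from the actual implementation.

```python
class FileServer:
    """Stand-in for a remote file server: a named dictionary of files."""

    def __init__(self, name):
        self.name = name
        self.files = {}           # path -> bytes


class DSFS:
    """Directory server stores pointers; data servers store file contents."""

    def __init__(self, directory_server, data_servers):
        self.directory = directory_server
        self.data = {s.name: s for s in data_servers}
        self.count = 0

    def write(self, path, payload):
        hosts = sorted(self.data)
        host = hosts[self.count % len(hosts)]    # assumption: round-robin placement
        remote = f"/data/{self.count}"
        self.count += 1
        self.data[host].files[remote] = payload
        # Only a small pointer goes to the directory server.
        self.directory.files[path] = f"{host}:{remote}".encode()

    def read(self, path):
        pointer = self.directory.files[path].decode()
        host, _, remote = pointer.partition(":")
        return self.data[host].files[remote]     # fetched directly from the data server
```

Because the directory server handles only pointers, bulk data traffic goes straight to the data servers, which is what lets the abstraction spread load across many disks.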
DSDB: Dist. Shared Database
[Figure: applications (appl) with adapters talk to a database server, which keeps a file index, and to file servers, which hold the files. Labeled operations: query, direct access, insert, prepare, create.]
DSDB Authentication
[Figure: the file server's top-level ACL is "hostname:database.infn.it V(RWLA)". The application (appl), through its adapter, asks the database server to insert a file for /O=INFN/CN=Mazzanti. The database server issues mkdir on the file server and gets a directory with ACL "hostname:database.infn.it RWLA". It then issues setacl /O=INFN/CN=Mazzanti RWL, so the directory's ACL becomes "hostname:database.infn.it RWLA" and "globus:/O=INFN/CN=Mazzanti RWL". Finally the data is transferred into file.dat.]
Adapter - Parrot
[Figure: on an ordinary operating system, tcsh, cat, and vi run directly. Under Parrot, the system calls of tcsh, cat, and vi are trapped through the ptrace interface, and Parrot keeps its own file table and process table, acting as an enhanced operating system.]
Like an OS Kernel
– Tracks procs, files, etc.
– Adds new capabilities.
– Enforces owner’s policies.
Delegated Syscalls
– Trapped via ptrace interface.
– Action taken by Parrot.
– Resources charged to Parrot.
Research Platform
– Distributed file systems.
– Grid appl. environments.
– Debugging.
– Easier than OS coding!
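Once a system call such as open has been trapped, the adapter has to decide whether to carry it out locally or delegate it to a remote file server. The sketch below shows only that routing decision, not the ptrace mechanics. It assumes a naming convention of the form /chirp/<host>/<path> for remote files; the function name `route` and the tuple it returns are hypothetical.

```python
def route(path):
    """Decide where a trapped file operation should go.

    Returns ("chirp", host, remote_path) for paths under /chirp/<host>/,
    and ("local", None, path) for everything else.
    """
    parts = path.split("/", 3)    # "/chirp/host/a/b" -> ["", "chirp", "host", "a/b"]
    if len(parts) >= 3 and parts[0] == "" and parts[1] == "chirp" and parts[2]:
        remote = "/" + parts[3] if len(parts) == 4 else "/"
        return ("chirp", parts[2], remote)
    return ("local", None, path)
```

The application itself stays unmodified: it simply opens a path, and the adapter performs the remote access on its behalf.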
Prototype
Storage in Computer Science Dept
– Office Workstations
– Instructional Labs
– Research Clusters
– Storage Bricks
Each Owner Controls Local Storage
– Access Control List
– Evicts Users if Needed.
– Collaborate Offsite
Demo Time!
Performance Considerations
Nothing comes for free!
– System calls: order of magnitude slower.
– Memory bandwidth overhead: extra copies.
Compared to NFS:
– TSS slightly better on small operations.
– TSS much better in network bandwidth.
On real applications:
– Measurable slowdown
– Benefit: far more flexible and scalable.
Performance – System Calls [figure]
Performance – Applications [figure, with label "parrot only"]
Performance – I/O Calls [figure]
Performance – Bandwidth [figure]
Performance – DSFS [figure]
Performance Conclusion
TSS has measurable slowdown.
TSS is comparable to NFS.
TSS can create scalable, parallel filesys.
To do better, must modify kernel.
Application: High-Energy Physics
SP5 Monte Carlo Simulation
– Component of BaBar at SLAC
– Collaboration with Sander Klous at NIKHEF
Difficult to Deploy on a Grid
– Complex Software Structure
– Custom Shared Libraries
– Objectivity Database
– (Similar Difficulties with Other Applications)
SP5 on a Standalone Machine
[Figure: a manually started sp5 application, linked with libobjy, performs file system operations on libobjy, scripts, and data, and database lock operations against a lock server.]
Ideal SP5 Deployment
[Figure: the same sp5 installation (libobjy, scripts, data, lock server) is reached by many remote sp5 instances, each linked with libobjy, through filesystem ops and database lock ops.]
SP5 with Tactical Storage
[Figure: remote sp5 instances, each linked with libobjy, run under an adapter and connect with GSI authentication to a file server in front of the sp5 installation (libobjy, scripts, data, lock server), which carries their filesystem ops and database lock ops.]
Performance on EDG Testbed
Setup      Time to Init     Time/Event
Unix       446 +/- 46       64s
LAN/NFS    4464 +/- 172     113s
LAN/TSS    4505 +/- 155     113s
WAN/TSS    6275 +/- 330     88s
Thoughts on SP5 + TSS
“With this project we have shown that computer scientists can solve the complications of grid computing and physicists can just use it.”
“The most important issue is: Who has to do the work?”
Application: Molecular Dynamics
Researchers in MD are much like HEP:
– Long running simulations, explore space.
– Collaborating/competing on similar siml.
– “What parameters have I explored?”
– “How can I share results with friends?”
– “Replicate these data for safety.”
GEMS: Grid Enabled Molecular Sims
– Distributed database for MD siml at Notre Dame.
– Collaborators: Dr. Jesus Izaguirre, Dr. Aaron Striegel
GEMS Distributed Database
[Figure: a database server and catalog servers track data files spread over many hosts. One XML metadata record maps to host1:fileA, host7:fileB, host3:fileC; another maps to host6:fileX, host2:fileY, host5:fileZ. A query with XML + Temp>300K, Mol==CH4 returns host5:fileZ and host6:fileX.]
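The query path in the figure can be illustrated with a small sketch: metadata records map to the locations of data files, and a query over the metadata returns locations that a client can then read directly from the file servers. The record layout, the field names `mol` and `temp_k`, the metadata values, and the `query` function are invented for illustration; only the host:file locations come from the figure. The figure shows two of the three locations being returned, a selection this sketch does not model.

```python
# Hypothetical metadata index: each record describes one dataset and
# lists where its files live (locations taken from the figure).
RECORDS = [
    {"meta": {"mol": "H2O", "temp_k": 280},        # invented values
     "locations": ["host1:fileA", "host7:fileB", "host3:fileC"]},
    {"meta": {"mol": "CH4", "temp_k": 310},        # invented values
     "locations": ["host6:fileX", "host2:fileY", "host5:fileZ"]},
]


def query(records, predicate):
    """Return the file locations of every record whose metadata matches."""
    hits = []
    for record in records:
        if predicate(record["meta"]):
            hits.extend(record["locations"])
    return hits


# Equivalent of the figure's query: Temp > 300K and Mol == CH4.
matches = query(RECORDS, lambda m: m["temp_k"] > 300 and m["mol"] == "CH4")
```

The database server answers only the metadata question; the bulk data never passes through it.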
GEMS and Tactical Storage
Dynamic System Configuration
– Add/remove servers, discovered via catalog
Policy Control in File Servers
– Groups can Collaborate within Constraints
– Security Implemented within File Servers
Direct Access via Adapters
– Unmodified Simulations can use Database
Survivability [figure]
Tactical Storage Systems
Separate Abstractions from Resources
Components:
– File servers, abstractions, adapters.
– Completely user level.
– Performance acceptable for real applications.
Independent but Cooperating Components
– Owners of file servers set policy.
– Users must work within policies.
– Large numbers of users: V right.
Future Work
More powerful abstractions
– Striping, replicating, indexing, searching.
More fine grained control of storage
– Allocation, accounting, and management of bandwidth and storage space.
Applications and Deployment
Tactical Storage Systems put power in the hands of the users, not administrators!
Collaborators
NIKHEF and Vrije University
– Sander Klous
University of Notre Dame
– Aaron Striegel, Jesus Izaguirre
Hard working students:
– Justin Wozniak, Paul Brenner
– Paul Madrid, Chris Moretti
Publications
Tactical Storage Systems
– UND CSE Dept Tech Report 2005-07, May 2005.
Transparent Access to Grid Resources for User Software
– Accepted to Concurrency and Computation: Practice and Experience, 2005.
Gluttony and Generosity in GEMS: Grid Enabled Molecular Storage
– High Performance Distributed Comp, 2005.
Parrot: Transparent User-Level Middleware for Data-Intensive Computing
– Workshop on Adaptive Grid Middleware, 2003.
For more information...
Cooperative Computing Lab
http://www.cse.nd.edu/~ccl
Cooperative Computing Tools
http://www.cctools.org
Douglas Thain
– [email protected]
– http://www.cse.nd.edu/~dthain