Increasing Intrusion Tolerance Via Scalable Redundancy
Carnegie Mellon
Approved for Public Release, Distribution Unlimited
Michael [email protected]
Anastasia Ailamaki Greg Ganger Priya Narasimhan Chuck Cranor
The Problem Space
Distributed services manage redundant state across servers to tolerate faults
We consider tolerance to Byzantine faults, as might result from an intrusion into a server or client
  A faulty server or client may behave arbitrarily
We also make no timing assumptions in this work
  An “asynchronous” system
Our Goals
To design, implement and evaluate new protocols for implementing intrusion-tolerant services that scale better
  Here, “scale” refers to efficiency as the number of servers and the number of failures tolerated grow
Targeting three types of services
  Read-write data objects
  Custom “flat” object types for particular applications, notably directories for implementing an intrusion-tolerant file system
  Arbitrary objects that support object nesting
Expected Impact
Significant efficiency and scalability benefits over today’s approaches to intrusion tolerance
For example, for data services, we anticipate
  At least a twofold latency improvement over the current best, even at small configurations (e.g., tolerating 3-5 Byzantine server failures)
    And improvements will grow as the system scales up
  A twofold improvement in throughput, again growing with system size
Without such improvements, intrusion tolerance will remain relegated to small deployments in narrow application areas
Outline
  Concepts
  Challenges
  Techniques
  Systems
  Technology transfer
Concepts: Distributed Services
Service, or object, abstraction
Implementation
[Figure: an object with methods push, pop and sort; arrows labeled invocation and response]
Concepts: Linearizability [Herlihy & Wing 1991]
A strong and accepted semantics for shared objects
  Mimics the semantics of a centralized object implementation
  Each method appears to be executed at a distinct point between its invocation and response
[Figure: timeline of object invocations by clients c1 and c2, with the apparent execution point of each]
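The definition above can be illustrated with a brute-force check for a single read-write register. This is only a sketch under stated assumptions: the `Op` record, the function names and the numeric invocation/response times are all hypothetical, and trying every ordering is practical only for tiny histories.

```python
from itertools import permutations
from typing import NamedTuple, Optional


class Op(NamedTuple):
    client: str
    kind: str              # "write" or "read"
    value: Optional[int]   # value written, or value the read returned
    inv: float             # invocation time
    resp: float            # response time


def respects_real_time(order):
    # If one operation responded before another was invoked,
    # it must also come first in the apparent execution order.
    for i, later in enumerate(order):
        for earlier in order[:i]:
            if later.resp < earlier.inv:
                return False
    return True


def legal_sequential(order, initial=None):
    # Replay the order against a centralized register.
    current = initial
    for op in order:
        if op.kind == "write":
            current = op.value
        elif op.value != current:
            return False
    return True


def is_linearizable(history):
    # Linearizable if some ordering is both real-time consistent and legal.
    return any(
        respects_real_time(p) and legal_sequential(p)
        for p in permutations(history)
    )
```

A read that overlaps a write may return either the old or the new value, but a read invoked after the write's response must return the new one.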
Concepts: State Machine Replication
Offers no load dispersion, and degrades as system scales
[Figure: invocations (inv) delivered to the servers]
Concepts: Wait-Freedom [Herlihy 1990]
A liveness property for object invocations
Informally, an implementation is wait-free if any client’s operation is guaranteed to complete
  Assuming a limit on the number of faulty servers [Jayanti et al.]
  But not assuming a limit on the number of faulty clients
Intuitively, wait-freedom precludes synchronization mechanisms that must be “unlocked” by a client
Only read-write objects can be implemented in a wait-free way
  Virtually any other object cannot (in an asynchronous system)
Challenges: Concurrency
Concurrent updates can violate linearizability
[Figure: two concurrent Data updates sent to servers 1-5]
Challenges: Server Failures
Faulty servers can attempt to mislead clients
  Typically addressed by “voting”
[Figure: a client queries servers 1-5; one reply (4’) differs from the others]
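A minimal sketch of the voting idea, assuming at most b servers are faulty and that correct servers all hold the same value (no concurrent write in progress). The function name and the reply format are hypothetical.

```python
from collections import Counter


def vote(replies, b):
    """Return a value reported by at least b + 1 servers, or None.

    replies maps a server id to the value that server returned.
    With at most b faulty servers, any value that b + 1 servers agree on
    was reported by at least one correct server.
    """
    counts = Counter(replies.values())
    for value, count in counts.most_common():
        if count >= b + 1:
            return value
    return None
```

For example, `vote({1: "4", 2: "4", 3: "4", 4: "4'", 5: "4"}, b=1)` accepts "4" and ignores the lone forged reply.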
Challenges: Client Failures
Byzantine client failures can also mislead clients
  Typically addressed by submitting a request via an agreement protocol
[Figure: servers 1-4 left holding inconsistent Data values (2’, 4’)]
Challenges: Object Nesting
Distributed objects have stubs and replicas
[Figure: object stubs and replicas on the servers]
Techniques: Versioning
[Figure: version histories over time on servers 1-5 (initial value Ø, D0 written at T0, D1 written at T1), and a client read operation after T1]
Read steps shown in the figure
  D1 latest candidate
  D1 incomplete (3 writes required)
  D0 latest candidate
  D0 determined complete, returned
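The read steps in the figure can be sketched as a walk backwards through version timestamps. This is a simplified illustration, not the project's protocol: the `histories` layout, the function name and the placement of D0 and D1 on particular servers are assumptions, and Byzantine replies are ignored.

```python
def read_latest_complete(histories, writes_required):
    """Return (timestamp, value) of the newest complete version, or None.

    histories maps a server id to that server's version history,
    itself a dict from timestamp to value.
    """
    # Candidate timestamps, newest first.
    candidates = sorted(
        {ts for history in histories.values() for ts in history},
        reverse=True,
    )
    for ts in candidates:
        holders = [h[ts] for h in histories.values() if ts in h]
        if len(holders) >= writes_required:
            return ts, holders[0]   # complete: enough servers hold it
        # Otherwise the candidate is incomplete; fall back to an older one.
    return None


# Illustrative state after T1: D0 on three servers, D1 on only one.
histories = {
    1: {0: "D0"},
    2: {0: "D0"},
    3: {0: "D0", 1: "D1"},
    4: {},
    5: {},
}
result = read_latest_complete(histories, writes_required=3)
```

Because servers keep old versions, the reader can skip the partially written D1 and still return a complete value without blocking on the writer.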
Techniques: Repair
[Figure: version histories over time on servers 1-5 (initial value Ø, D0 at T0, D1 at T1, D2 at T2), one server unreachable, and a client read operation after T2]
Read steps shown in the figure
  D2 latest candidate
  D2 unclassifiable
  Repair D2
  Return D2
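One way to read "unclassifiable" is that the reader cannot tell whether the candidate reached enough servers, because the missing replies could tip the count either way. The sketch below uses that interpretation, which is an assumption rather than something the slide states; the names and the server layout are hypothetical, and `None` stands for an unreachable server.

```python
def classify(holders, unreachable, writes_required):
    if holders >= writes_required:
        return "complete"
    if holders + unreachable < writes_required:
        return "incomplete"      # even the silent servers could not complete it
    return "unclassifiable"      # the silent servers might hold the version


def read_candidate(servers, ts, writes_required):
    """Classify version ts and repair it if it is unclassifiable."""
    reachable = {sid: h for sid, h in servers.items() if h is not None}
    unreachable = len(servers) - len(reachable)
    holders = [sid for sid, h in reachable.items() if ts in h]
    status = classify(len(holders), unreachable, writes_required)
    if status == "unclassifiable" and holders:
        value = reachable[holders[0]][ts]
        for history in reachable.values():
            history.setdefault(ts, value)   # repair: write the version back
        status = classify(len(reachable), unreachable, writes_required)
    return status


servers = {
    1: {0: "D0", 2: "D2"},
    2: {0: "D0", 2: "D2"},
    3: {0: "D0"},
    4: {0: "D0"},
    5: None,                     # unreachable
}
status = read_candidate(servers, ts=2, writes_required=3)
```

After repair the version is held by every reachable server, so the read can return it as complete.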
Techniques: Quorum Systems
A quorum system is a data redundancy technique that supports load dispersion among servers
  Only a subset of servers are accessed in each operation
Ex: Grid with n=49, b=3

Construction                  Resilience    Quorum size
Threshold [DC 1998]           b < n/4       3n/4
M-Grid [SIAM JoC 2000]        b < √n/2      O(√(bn))
BoostFPP [SIAM JoC 2000]      b < n/4       O(√(bn))
Probabilistic [I&C]           b < n/2       O(max(b, √n))
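As a small worked example for the Threshold row, the sketch below computes a quorum size and the guaranteed overlap between any two quorums. The formula ⌈(n + 2b + 1)/2⌉ and the requirement that two quorums share at least 2b + 1 servers are assumptions taken from the usual masking threshold construction; neither is stated on the slide, and the function names are hypothetical.

```python
def threshold_quorum_size(n, b):
    """Quorum size ceil((n + 2b + 1) / 2) for n servers, up to b faulty."""
    if 4 * b >= n:
        raise ValueError("threshold construction assumes b < n/4")
    return (n + 2 * b + 2) // 2   # integer form of the ceiling


def min_overlap(n, q):
    """Fewest servers that any two quorums of size q out of n must share."""
    return max(0, 2 * q - n)


n, b = 49, 3
q = threshold_quorum_size(n, b)       # 28 of the 49 servers
overlap = min_overlap(n, q)           # 7, which equals 2b + 1
```

At the resilience limit (n = 49, b = 12) the same formula gives quorums of 37 servers, roughly the 3n/4 shown in the table.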
Techniques: Cross Checksums [Gong 1989]
A mechanism for defending against Byzantine servers that attempt to alter data in their possession
  A cross checksum, made up of the hashes of all data fragments, is appended to each data fragment
  When retrieved, hashes are used as “votes” to determine correct data fragments
[Figure: a data-item is split into data-fragments; the hashes of the fragments form the cross checksum]
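A minimal sketch of the mechanism, assuming SHA-256 as the hash and at most b faulty servers. The function names and the reply format are hypothetical, and fragment encoding itself is left out.

```python
import hashlib
from collections import Counter


def make_cross_checksum(fragments):
    # One hash per fragment, in fragment order.
    return tuple(hashlib.sha256(f).hexdigest() for f in fragments)


def store(fragments):
    # Each server would receive one (fragment, cross checksum) pair.
    checksum = make_cross_checksum(fragments)
    return [(fragment, checksum) for fragment in fragments]


def retrieve(replies, b):
    """Return {index: fragment} for fragments that pass the vote.

    replies is a list of (index, fragment, cross_checksum) tuples.
    """
    votes = Counter(checksum for _, _, checksum in replies)
    winner, count = votes.most_common(1)[0]
    if count < b + 1:
        return None   # no cross checksum has enough support
    return {
        i: fragment
        for i, fragment, checksum in replies
        if checksum == winner
        and i < len(winner)
        and hashlib.sha256(fragment).hexdigest() == winner[i]
    }
```

A server that alters its fragment fails the per-fragment hash check, and a server that alters its copy of the cross checksum is outvoted.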
Techniques: Validating Timestamps
A technique for defending against Byzantine clients that attempt to write different data values at the same timestamp
  Cross-checksum of write value recorded in its timestamp
  Read results are used to regenerate all data fragments and compare them to the timestamp
[Figure: read results yield the data-item, from which all data-fragments, their hashes and the cross checksum are regenerated and compared with the hash in the timestamp]
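The check can be sketched as follows, assuming a deterministic encoding so that any reader regenerates the same fragments from the same data-item. The `encode` function here (striping plus one XOR parity fragment) is a hypothetical stand-in for the real encoding scheme, and the `Timestamp` layout is likewise assumed.

```python
import hashlib
from typing import List, NamedTuple


class Timestamp(NamedTuple):
    time: int
    cross_checksum: str   # binds the timestamp to one write value


def encode(item: bytes, k: int = 2) -> List[bytes]:
    # Hypothetical encoding: k equal stripes plus one XOR parity fragment.
    size = -(-len(item) // k)                 # ceiling division
    padded = item.ljust(size * k, b"\0")
    stripes = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytearray(size)
    for stripe in stripes:
        for j, byte in enumerate(stripe):
            parity[j] ^= byte
    return stripes + [bytes(parity)]


def cross_checksum(fragments: List[bytes]) -> str:
    digests = b"".join(hashlib.sha256(f).digest() for f in fragments)
    return hashlib.sha256(digests).hexdigest()


def make_timestamp(time: int, item: bytes) -> Timestamp:
    return Timestamp(time, cross_checksum(encode(item)))


def validate(ts: Timestamp, item_from_read: bytes) -> bool:
    # Regenerate all fragments from the read result and compare.
    return cross_checksum(encode(item_from_read)) == ts.cross_checksum
```

If a faulty client sends fragments of two different values under one timestamp, at most one of the values can match the checksum in that timestamp, so a reader rejects the other.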
Techniques: Replicated Invocation
b or fewer stub replicas cannot invoke a method
More than b stub replicas can
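On the receiving side, this rule amounts to counting distinct stub replicas per request. The sketch below is illustrative only: the class name, the string request format and the omission of authentication are all assumptions.

```python
from collections import defaultdict


class InvocationCollector:
    """Executes a request once more than b distinct stub replicas issue it."""

    def __init__(self, b):
        self.b = b
        self._senders = defaultdict(set)   # request -> replica ids seen
        self._executed = set()

    def receive(self, replica_id, request):
        """Return True exactly once, when the request becomes executable."""
        if request in self._executed:
            return False
        self._senders[request].add(replica_id)
        if len(self._senders[request]) > self.b:
            self._executed.add(request)
            return True
        return False


collector = InvocationCollector(b=1)
first = collector.receive("s1", "push(4)")    # one replica: not enough
repeat = collector.receive("s1", "push(4)")   # same replica again: ignored
second = collector.receive("s2", "push(4)")   # two distinct replicas: execute
```

With at most b faulty stub replicas, any request that passes this check was issued by at least one correct replica.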
Our Research
To summarize, we will explore the use of these techniques for implementing
  Read-write block storage (linearizable, wait-free)
  Specialized metadata objects (e.g., directories) necessary to construct a fully functional file system (linearizable)
  A general framework for arbitrary deterministic objects (linearizable)
Not all techniques will be appropriate for all cases
  “Flat” objects as found in file systems will generally not utilize replicated clients
  Nested objects may not benefit from versioning (TBD)
Systems: PASIS
PASIS is a survivable storage system developed in a DARPA IPTO project
  Funding ended December 2003
Examined the use of encoding schemes for efficiently distributing data storage while protecting confidentiality/integrity
Did not address concurrency control
  Clients would have to handle it explicitly, e.g., using locking
Explored use of versioning for other purposes: recovery from user mistakes, system failures, penetrations
  Showed viability of comprehensive versioning
Systems: Fleet
Fleet is a Java-based distributed object architecture developed in previous projects in DARPA ATO
  Funding ended June 2004
Focused on the use of quorum systems for efficient object replication
Fleet does not support nested objects and nested method invocations
  Nor does it support potentially faulty clients
Technology Transition
Two primary channels are the industry consortia of two research centers at Carnegie Mellon: CyLab and the Parallel Data Lab
CyLab
  A center focused on trustworthy and measurable computing
  Founded in 2003 through the merger of the Center for Computer and Communications Security and the Sustainable Computing Consortium
  Corporate affiliate program includes over fifty companies, including defense suppliers, tech companies and IT-based critical infrastructures
Parallel Data Lab
  A ten-year-old center focused on storage infrastructures
  Corporate affiliates include most major storage vendors
Both have a track record of technology transfer
Questions?