Responsive Security for Stored Data
Subramanian Lakshmanan
Mustaque Ahamad
H. Venkateswaran
College of Computing
Georgia Institute of Technology
Introduction
Problem definition
– A distributed data repository.
– Guarantees availability, integrity and confidentiality of stored data in the face of a limited number of compromised nodes.
– Better performance.
Organization
– Motivation
– Existing techniques
– Our approach
– Related work
– System architecture and protocols
– A simple analysis
Motivation
Hope no one is looking at my tax documents.
Would anyone know about this?
Hope the records have not been tampered with.
I don’t want the data to be lost, ever! It could prove vital!
Get his medical records, fast!
At your service!
Approach
I’m not going to trust any of you guys. No one is going to get all the information.
I will be faithful anyway.
Wish I could capture more nodes!
I don’t want to talk to ALL of you; it takes a hell of a lot of time.
I can’t even wipe or corrupt the data, let alone leak info!
I will pass on what I see to you, real fast!
So will I! Having a copy always helps.
Pure replication
[Figure: client C and servers S1 to S4, each holding a replica of the encrypted data, E(d,k) or E(d,k’).]
1. Periodic re-encryption
2. Client has to be present for re-encryption
3. Compromised server could retain old data
Secret-sharing algorithms
A (b,k) secret-sharing scheme fragments data into k shares so that any b shares give no information, while any b+1 shares give all the information.
A (b,2b+1) scheme guarantees confidentiality, integrity and availability of data in the face of a maximum of b compromised nodes.
Data shares can be renewed and recovered periodically by the servers, even in the absence of the client, in a purely distributed fashion, tolerating a maximum of b malicious nodes.
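The threshold property above can be illustrated with a minimal sketch of a (b,k) scheme based on polynomial interpolation (Shamir's scheme, listed under related work). The field modulus `PRIME` and the function names `split` and `reconstruct` are assumptions made for this example; the slides do not specify an implementation.

```python
import secrets

PRIME = 2**127 - 1  # assumed prime field modulus (a Mersenne prime)

def split(secret, b, k):
    """Split `secret` into k shares so that any b+1 of them reconstruct it."""
    # Random polynomial of degree b whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(b)]
    shares = []
    for x in range(1, k + 1):
        y = 0
        for coef in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + coef) % PRIME
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With b = 2 and k = 5, any three of the five shares returned by `split` are enough for `reconstruct`, which matches the (b,2b+1) configuration described above.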
A pure secret-sharing scheme
[Figure: client C splits data D into shares f1, f2, ..., f5 using a (1,5) scheme and stores one share on each of the servers S1 to S5.]
1. A write involves talking to all servers
2. Suffers from the problem of related attacks
Our approach
Pure replication
– Poor security tolerating malicious faults
– Better performance (access cost, availability)
Pure secret-sharing
– Better security tolerating malicious faults
– Poor performance
Hybrid scheme
– Do limited secret-sharing
– Replicate the shares
– Offer the benefits of both schemes
Related work
Replication for Byzantine fault tolerance
– Schneider’s state machine approach for fault tolerance
– Secure FS, Practical Byzantine fault tolerance at MIT – Castro and Liskov
Quorum systems
– Phalanx and Fleet – Reiter et al.
– Dynamic quorums – Alvisi et al.
Related work (contd.)
Secret-sharing
– Shamir’s scheme based on polynomial interpolation
– Detecting and recovering corrupted shares – Feldman, Pedersen
– Proactive secret-sharing, periodic share renewal and share recovery – Herzberg et al.
PASIS at CMU
Fragmentation-scattering for intrusion tolerance at LAAS, France
Data dissemination
– Epidemic algorithms for non-malicious environments – Demers et al.
– Dissemination in Byzantine environments – Malkhi et al.
Our system
[Figure: servers arranged in a matrix; data D is split into shares f1, f2, f3.]
– Write along a row
– Read along a row
– Disseminate along a column
– Periodic share renewal
• Pure secret sharing: number of rows = 1
• Pure replication: number of columns = 1
Assumptions
– n servers S1..Sn.
– Requests are authenticated and authorized independently at each server; communication channels are secure.
– Compromises of two different servers are not related.
– Chosen threshold value b: the number of server failures to be tolerated.
– Number of columns c, number of rows r, rc = n, c > b.
– Protocols are designed for the chosen matrix dimensions and the chosen threshold value.
Read and write protocols
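Write(x,v) by Client C
1. Split v into shares $v_1, v_2, \dots, v_c$ using a (b,c) secret-sharing scheme.
2. Compute the one-way function $h(v_i) = g^{v_i}$ for each share.
3. Form the verification string $VS = h(v_1) | h(v_2) | \dots | h(v_c)$ and the signature $sig = \{uid(x), ts, v\}_{K_C^{-1}}$.
4. Choose a row k. For m = 1 to c, send {“write”, uid(x), ts, $v_m$, VS, sig} to server $S_{k,m}$.
5. Repeat step 4 for different k until the number of rows contacted, l, satisfies $c - \lfloor b/l \rfloor \ge b+1$.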
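Write protocol
[Figure: client C splits D into shares f1, f2, ..., fc using a (b,c) scheme, forms $VS = h(f_1) | h(f_2) | \dots | h(f_c)$, and sends (f1, VS), (f2, VS), ..., (fc, VS) along each of l rows, where $c - \lfloor b/l \rfloor \ge b+1$.]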
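Steps 2 and 3 of the write protocol can be sketched as follows. The group parameters `P` and `G` are toy values assumed for illustration (the slides only give h(vi) as g raised to vi), and the verification string is kept as a tuple instead of a concatenated string.

```python
# Toy parameters, assumed for this sketch only; not suitable for real use.
P = 2**127 - 1  # prime modulus
G = 3           # assumed base g

def h(share_value):
    """One-way function h(v_i) = g^{v_i}, computed here modulo P."""
    return pow(G, share_value, P)

def verification_string(share_values):
    """Build VS = h(v_1) | h(v_2) | ... as a tuple of per-share values."""
    return tuple(h(v) for v in share_values)

def share_is_valid(index, share_value, vs):
    """Check a received share against its entry in the verification string."""
    return h(share_value) == vs[index]
```

A server or a reading client holding VS can call `share_is_valid` to detect a modified share without learning the other shares.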
Read and write protocols (contd.)
Read(x) by Client C
1. Choose a row k. For m = 1 to c, send {“read”, uid(x)} to server $S_{k,m}$.
2. Get a list of {ts, VS, $v_m$, sig} from each $S_{k,m}$.
3. Choose the highest timestamp that occurs in b+1 or more replies with the same VS.
4. If no such timestamp exists, repeat from step 1 for a different k.
5. Pick the shares corresponding to this timestamp. Pick b+1 shares that are verified successfully by VS. Reconstruct the data value v from these b+1 shares.
6. Return v if sig is valid; otherwise repeat from step 1 for a different k.
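Read protocol
[Figure: client C receives (f1’, VS1), (f2’, VS2), ..., (fc’, VSc) from a row. 1. Check that $VS_{i_1} = VS_{i_2} = \dots = VS_{i_{b+1}}$. 2. Check that h(fi’) = h(fi) in VS. 3. Reconstruct D from $f_{i_1}, f_{i_2}, \dots, f_{i_{b+1}}$ using the (b,*) scheme.]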
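The timestamp selection in the read protocol (highest timestamp that appears in b+1 or more replies with the same VS) can be sketched as below. The reply layout, a dict with keys `ts`, `vs`, `share` and `sig`, is a hypothetical structure chosen for this example.

```python
from collections import Counter

def pick_version(replies, b):
    """Return the (ts, vs) pair with the highest timestamp that is backed by
    at least b+1 replies, or None if no pair has enough support.

    `replies` holds one dict per server in the contacted row; `vs` must be
    hashable (for example a tuple or a string).
    """
    counts = Counter((r["ts"], r["vs"]) for r in replies)
    candidates = [key for key, n in counts.items() if n >= b + 1]
    if not candidates:
        return None  # the caller would retry with a different row
    return max(candidates, key=lambda key: key[0])
```

A newer timestamp reported by b or fewer servers is ignored, since up to b servers may be compromised and could report a forged version.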
Data dissemination
– Disseminate shares along columns
– Increases availability and system performance
– Better data sharing for shared data
– Better support for mobile or roaming clients
– Replicated copies serve as back-ups
Dissemination protocol
[Figure: servers in a row hold (f1, VS), (f2, VS), ..., (fc, VS); within a column, a corrupted copy f’2 is compared against f2.]
1. Detect or suspect corruption
2. Pull the verification string from b+1 servers
3. Check whether the share is valid using VS
4. Do share recovery if the share is corrupted
Remarks:
1. A VS is accepted as valid only if it is either heard directly from the client or b+1 other servers report the same VS.
2. Disseminate to other servers only those VS that are accepted as valid.
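The acceptance rule in remark 1 can be sketched as a small function. The argument names and the dict of per-server reports are assumptions made for this example, and authentication of the client's message is taken as given.

```python
from collections import Counter

def accepted_vs(from_client, reports, b):
    """Return the verification string a server may accept, or None.

    from_client: VS heard directly from the (authenticated) client, or None.
    reports: dict mapping the id of another server to the VS it reported.
    """
    if from_client is not None:
        return from_client
    counts = Counter(reports.values())
    for vs, n in counts.most_common():
        if n >= b + 1:  # at least one of the b+1 reporters is not compromised
            return vs
    return None
```

Only a value returned by this check would be passed on to other servers, in line with remark 2.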
Share renewal
Assumption: in any time frame of length Tv, an adversary can compromise a maximum of b nodes.
Question: what happens over a time interval of length 2Tv?
The adversary may compromise more than b nodes over a longer period of time.
Renew the shares at least once every Tv seconds.
– Shares from before a renewal cannot be combined with the new shares.
– Done by the servers in the absence of the client, in a distributed fashion, secure against b compromised nodes.
Share renewal (contd.)
[Figure: periodic share renewal. The shares (f1, VS), (f2, VS), (f3, VS) are replaced by (f1’, VS’), (f2’, VS’), (f3’, VS’).]
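One way to picture share renewal is the zero-polynomial update used in proactive secret sharing (the Herzberg et al. line of work listed under related work). The sketch below assumes polynomial-based shares over a prime field and omits the verification needed against malicious servers; the names `PRIME`, `eval_poly`, `reconstruct` and `renew` are illustrative, not taken from the slides.

```python
import secrets

PRIME = 2**127 - 1  # assumed prime field modulus

def eval_poly(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) modulo PRIME."""
    y = 0
    for coef in reversed(coeffs):
        y = (y * x + coef) % PRIME
    return y

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

def renew(shares, b):
    """One renewal round: each share holder contributes a random degree-b
    polynomial with zero constant term, and every share is shifted by the
    sum of all contributions evaluated at its own x."""
    xs = [x for x, _ in shares]
    updates = {x: 0 for x in xs}
    for _ in xs:  # one contribution per server
        delta = [0] + [secrets.randbelow(PRIME) for _ in range(b)]
        for x in xs:
            updates[x] = (updates[x] + eval_poly(delta, x)) % PRIME
    return [(x, (y + updates[x]) % PRIME) for x, y in shares]
```

Because every contribution evaluates to zero at x = 0, the secret is unchanged, while old and new shares lie on different polynomials and so do not combine.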
Analysis
– In any time frame of length Tv, a server can be compromised with probability p. Expected number of failures = np.
– The threshold value b and the degree of replication r (or c) determine the level of security and performance offered by the system.
– The time taken to complete a read or write is much less than Tv.
Security Metrics
– Availability: probability that a legitimate client can read a data item that has been written successfully.
– Confidentiality: complement of the probability that an adversary can read a data item that has been written successfully.
– Integrity: complement of the probability that any client could be given corrupted or modified data content when a read on a data item is done.
Security metrics (contd.)
1. Availability = probability of finding at least b+1 non-faulty servers, each from a different column:
$$\mathrm{Availability}(b,c) = \sum_{i=b+1}^{c} \binom{c}{i} (1-p^r)^i (p^r)^{c-i}$$
2. Confidentiality = 1 - probability of finding at least b+1 malicious servers, each from a different column:
$$\mathrm{Confidentiality}(b,c) = 1 - \sum_{i=b+1}^{c} \binom{c}{i} (1-q^r)^i (q^r)^{c-i}, \quad q = 1-p$$
3. Integrity: same as confidentiality, or depends on the strength of the underlying digital signature scheme.
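The two binomial sums can be evaluated directly. This sketch assumes, as the analysis does, that servers fail independently with probability p; the function names are chosen for the example.

```python
from math import comb

def at_least(b, c, col_prob):
    """Probability that at least b+1 of c independent columns satisfy an
    event that holds for each column with probability col_prob."""
    return sum(comb(c, i) * col_prob**i * (1 - col_prob)**(c - i)
               for i in range(b + 1, c + 1))

def availability(b, c, r, p):
    # A column is usable if at least one of its r servers is non-faulty.
    return at_least(b, c, 1 - p**r)

def confidentiality(b, c, r, p):
    q = 1 - p
    # A column leaks its share if at least one of its r servers is malicious.
    return 1 - at_least(b, c, 1 - q**r)
```

Calling both functions over a range of b for fixed c, or over c for fixed b, gives the kind of curves the analysis plots.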
Performance metrics
Read cost
– Expected number of servers a client needs to contact to read a data item successfully.
– Involves collecting b+1 distinct shares that are not corrupted.
– $(2b+1)/p_r$, where $p_r$ is the probability of a read completing successfully after contacting 2b+1 servers.
Write cost
– Number of servers a client needs to contact to write a data item at a confidence level h.
– h = probability of success = probability that at least one server from each of b+1 or more columns receives the write.
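The read cost of (2b+1)/pr follows from a geometric argument, under the assumption (not stated on the slide) that each attempt contacts 2b+1 servers and that attempts succeed independently with probability $p_r$. The expected number of attempts is then $1/p_r$:

$$E[\text{read cost}] = (2b+1) \sum_{k=1}^{\infty} k \, p_r (1-p_r)^{k-1} = \frac{2b+1}{p_r}$$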
[Plot: availability and confidentiality as functions of b for constant c]
[Plot: access costs (read cost, write cost) as functions of b for constant c]
[Plot: availability and confidentiality as functions of c for constant b]
[Plot: access costs (read cost, write cost) as functions of c for constant b]
[Plot: availability and confidentiality against threshold value, c = 2b+1]
[Plot: access costs against threshold value, c = 2b+1]
Remarks
– When access cost or availability is the most important metric to be optimized and confidentiality is not an issue, set r = n, c = 1, b = 0 (pure replication).
– When confidentiality is the most important metric to be optimized and low performance is accepted, set r = 1, c = n, b = (c-1)/2 (pure secret-sharing).
– Requirements on both security and performance need a combination of replication and secret-sharing. Example: $1 - 10^{-3.5}$, access cost 22 servers => b = 10, c = 21.
– Higher confidentiality => higher access costs and lower availability.
– Related attacks: place servers vulnerable to similar attacks in the same column.
Future work
– Per-object customizable security
– Intrusion detection and correction
– Dynamic inclusion and exclusion of servers
– Implementation and experimental evaluation