Bitstate Hashing
Presented by Yunho Kim, Provable Software Lab, KAIST
Contents
Bitstate Hashing, Yunho Kim, Provable Software Lab, KAIST 2/30
• Introduction
• Bitstate hashing
• Multi-bit hashing
• Analysis
• Conclusion
Introduction
• The explicit model checking problem reduces to a reachability problem in a state graph
• An explicit model checker enumerates and explores all states to solve the reachability problem
• To avoid re-exploring previously visited states, all visited states are stored in memory
• To enable fast lookup, the states are stored in a hash table
  – Each item in the table and in the list is a pointer to the corresponding state
[Figure: hash(s) maps state s to one of the slots 0 to h-1 of the hash table; each slot points to a sorted linked list.]
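The exact scheme above can be sketched in Python as follows. This is an illustration only: the class name `StateTable`, the use of Python's built-in `hash`, and states modeled as tuples of integers are assumptions, not details from the slides.

```python
import bisect


class StateTable:
    """Hash table with h slots; each slot keeps its full states in sorted order."""

    def __init__(self, h):
        self.slots = [[] for _ in range(h)]

    def visit(self, state):
        """Store the state; return True if it was already present."""
        bucket = self.slots[hash(state) % len(self.slots)]
        i = bisect.bisect_left(bucket, state)
        if i < len(bucket) and bucket[i] == state:
            return True  # exact match found, so the state really was visited
        bucket.insert(i, state)  # keep the bucket sorted
        return False


# With a single slot every state collides, yet lookups stay exact
# because the full states are compared.
table = StateTable(h=1)
a_new = table.visit((0,))
b_new = table.visit((1,))
a_again = table.visit((0,))
```

Because each bucket holds the complete states, a collision costs extra comparisons but never causes a state to be mistaken for another.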
• All states must also be stored in full, in separate memory, to resolve hash conflicts
  – Otherwise, the search algorithm admits false positives
• An effective way to handle a large state space is required for scalability
• There are two main approaches
  – Reduce the number of states to check
    • Partial order reduction, statement merging
  – Reduce the amount of memory needed to store visited states
    • Lossless methods: collapse compression, minimized automaton
    • Lossy methods: bitstate hashing, hash compact
Bitstate Hashing
• Bitstate hashing uses a single bit to record a state
  – The value 1 in an entry indicates that the state has been visited
• Each entry of a standard hash table takes the space of sizeof(pointer) × 8 bitstate hash entries
[Figure: hash(s) maps state s into a bit table indexed from 0 to (sizeof(pointer) × 8 × h) - 1; the figure also carries the label "Sorted linked list".]
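A minimal Python sketch of the single-bit scheme described above. The class name `BitstateTable`, the table size, and the use of `zlib.crc32` as the hash function are illustrative assumptions, not part of the original slides.

```python
import zlib


class BitstateTable:
    """One bit per hash slot; a set bit means some state hashing here was visited."""

    def __init__(self, m_bits):
        self.m = m_bits
        self.bits = bytearray((m_bits + 7) // 8)  # m bits packed 8 per byte

    def _position(self, state):
        # Assumption: CRC-32 stands in for the model checker's real hash function.
        return zlib.crc32(state) % self.m

    def visit(self, state):
        """Mark the state visited; return True if its bit was already set."""
        pos = self._position(state)
        byte, mask = pos // 8, 1 << (pos % 8)
        seen = bool(self.bits[byte] & mask)
        self.bits[byte] |= mask
        return seen


table = BitstateTable(m_bits=1 << 20)
first = table.visit(b"state-A")   # bit was clear: the state is new
second = table.visit(b"state-A")  # bit is now set: reported as visited
```

No state is stored, only the bit, so two different states that hash to the same position cannot be told apart.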
• Model checking using bitstate hashing is not sound
  – If hash(s2) = hash(s3) and s2 has been visited, s3 is also treated as visited, and s6, the state that violates the property, is never reached
• The main issue in bitstate hashing is estimating and maximizing its coverage
[Figure: state graph with states S1 to S7.]
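The missed-state scenario can be reproduced with a small Python sketch. The edges of the graph and the hash values below are hypothetical, since the original figure is not recoverable; they are chosen only so that s6 is reachable solely through s3 and hash(s2) = hash(s3).

```python
def explore(graph, start, hash_of=None):
    """Depth-first search; return the list of states actually explored.

    If hash_of is given, only hash values are remembered (bitstate style,
    where a set bit is modeled as membership of the hash value in a set).
    Otherwise full states are remembered (exhaustive style).
    """
    key = hash_of if hash_of is not None else (lambda s: s)
    seen = set()
    explored = []
    stack = [start]
    while stack:
        state = stack.pop()
        if key(state) in seen:
            continue  # looks visited, so the state is skipped
        seen.add(key(state))
        explored.append(state)
        # Push successors in reverse so they are explored in listed order.
        stack.extend(reversed(graph[state]))
    return explored


# Hypothetical graph and hash values (assumptions, not from the slides).
GRAPH = {"s1": ["s2", "s3"], "s2": ["s4", "s5"], "s3": ["s6", "s7"],
         "s4": [], "s5": [], "s6": [], "s7": []}
HASH = {"s1": 0, "s2": 1, "s3": 1, "s4": 2, "s5": 3, "s6": 4, "s7": 5}

exhaustive_run = explore(GRAPH, "s1")
bitstate_run = explore(GRAPH, "s1", hash_of=HASH.get)
```

In this sketch the bitstate run stops at s3 because its bit was already set by s2, so s6 and s7 are never generated, while the exhaustive run reaches all seven states.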
• m: the size of the hash table in bits
• Assume that the hash function is uniformly distributed
• After inserting r states into the hash table, the probability that a specific bit of the hash table is still 0 is
  $p = \left(1 - \frac{1}{m}\right)^r$
• The probability of a false positive at the (r+1)th state is
  $1 - p = 1 - \left(1 - \frac{1}{m}\right)^r$
• The expected number of omissions when attempting to add n distinct states is
  $\sum_{i=0}^{n-1} \left(1 - \left(1 - \frac{1}{m}\right)^i\right)$
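The expected-omission sum can be evaluated directly. The following Python sketch does so term by term; the function name and the sample values of n and m are illustrative assumptions.

```python
def expected_omissions(n, m):
    """Sum over i = 0..n-1 of the false-positive probability after i insertions."""
    q = 1.0 - 1.0 / m  # probability that one insertion leaves a given bit at 0
    return sum(1.0 - q ** i for i in range(n))


# Illustrative numbers only: 1,000 states in a table of 2**16 bits.
omitted = expected_omissions(n=1000, m=1 << 16)
```

For n much smaller than m, each term is close to i/m, so the sum grows roughly like n²/(2m).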
• Bitstate hashing achieves more coverage than exhaustive search within a given fixed amount of memory
[Figure: Bitstate hashing compared with exhaustive search. x-axis: maximum amount of memory (bits), from 2 to 34359738368; y-axis: expected coverage, from 0.00% to 100.00%; series: Exhaustive, Bitstate.]
s: the size of a state, 320 bits
n: the number of states, $2^{19}$
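A rough Python sketch of how such a comparison could be computed. It rests on two assumptions that the slides do not state: exhaustive search covers exactly the states that fit in memory at s bits each, and bitstate coverage is 1 minus the expected omissions divided by n (each omission counted as one missed state, which is optimistic because successors of an omitted state may also be lost).

```python
import math


def exhaustive_coverage(m_bits, state_bits, n):
    """Fraction of n states that fit when each state is stored in full."""
    return min(1.0, (m_bits // state_bits) / n)


def bitstate_coverage(m_bits, n):
    """1 - (expected omissions / n) for single-bit hashing, in closed form."""
    # sum_{i<n} (1 - q**i) = n - (1 - q**n) / (1 - q), with q = 1 - 1/m.
    # expm1 and log1p keep 1 - q**n accurate when m is large.
    one_minus_q_to_n = -math.expm1(n * math.log1p(-1.0 / m_bits))
    omissions = n - one_minus_q_to_n * m_bits
    return 1.0 - omissions / n


S_BITS, N = 320, 2 ** 19  # values given on the slide
rows = [(2 ** e, exhaustive_coverage(2 ** e, S_BITS, N), bitstate_coverage(2 ** e, N))
        for e in range(1, 36, 2)]
```

Each entry of `rows` holds a memory size in bits followed by the two coverage estimates, matching the memory range on the figure's x-axis.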
Multi-bit Hashing
• The original bitstate hashing uses one bit to record whether a state has been visited
• Multi-bit hashing uses multiple independent hash functions to minimize hash conflicts
  – A hash conflict occurs only if all the positions given by the hash functions are already set to 1
[Figure: a bit table with positions 0 to 9; s1 sets bits h1(s1)=1 and h2(s1)=3, s2 sets bits h1(s2)=3 and h2(s2)=6; labels: Not conflict, Conflict.]
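The conflict rule can be illustrated with a short Python sketch. The hash values for s1 and s2 are the ones shown in the figure; the class name `MultiBitTable` and the third state s3 are hypothetical additions for illustration.

```python
class MultiBitTable:
    """Bit table probed at k positions per state (a Bloom-filter-style set)."""

    def __init__(self, m_bits, hash_functions):
        self.bits = [0] * m_bits
        self.hash_functions = hash_functions

    def positions(self, state):
        return [h(state) % len(self.bits) for h in self.hash_functions]

    def visit(self, state):
        """Set all k bits; return True only if every one was already set."""
        pos = self.positions(state)
        conflict = all(self.bits[p] for p in pos)
        for p in pos:
            self.bits[p] = 1
        return conflict


# Hash values for s1 and s2 come from the figure; s3 is a hypothetical state.
H1 = {"s1": 1, "s2": 3, "s3": 6}
H2 = {"s1": 3, "s2": 6, "s3": 1}

table = MultiBitTable(10, [H1.get, H2.get])
r1 = table.visit("s1")  # bits 1 and 3 were clear
r2 = table.visit("s2")  # bit 3 is set but bit 6 is clear: no conflict
r3 = table.visit("s3")  # bits 6 and 1 are both set: conflict
```

Sharing one position, as s1 and s2 do at bit 3, is harmless; a state is wrongly treated as visited only when every one of its positions has been set by earlier states.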
• m: the size of the hash table in bits
• k: the number of independent hash functions used
• Assume that all hash functions are uniformly distributed
• After inserting r states, the probability that a specific bit is still 0 is
  $p = \left(1 - \frac{1}{m}\right)^{kr}$
• The probability of a false positive at the (r+1)th state is
  $(1 - p)^k = \left(1 - \left(1 - \frac{1}{m}\right)^{kr}\right)^k$
• m: the size of the hash table in bits
• k: the number of independent hash functions used
• The expected number of omissions when attempting to add n distinct states is
  $\sum_{i=0}^{n-1} \left(1 - \left(1 - \frac{1}{m}\right)^{ki}\right)^k$
• To obtain the best coverage for fixed m and n, an appropriate value of k has to be chosen
• The estimate derived for the best value of k is
  $\frac{m}{n} \cdot \ln 2$
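The estimate for k can be checked numerically. The Python sketch below compares it with a brute-force search over integer k, using the false-positive probability after all n insertions as the objective. That choice of objective is an assumption based on the standard Bloom-filter argument; the slides do not show the derivation. The values of m and n are the ones quoted on the slides for a 1 MB table.

```python
import math


def final_false_positive(m, n, k):
    """False-positive probability after n insertions with k bits per state."""
    p_zero = (1.0 - 1.0 / m) ** (k * n)  # a given bit is still 0
    return (1.0 - p_zero) ** k


def estimated_best_k(m, n):
    return (m / n) * math.log(2)


m, n = 1 << 23, 626_211
estimate = estimated_best_k(m, n)
searched = min(range(1, 31), key=lambda k: final_false_positive(m, n, k))
```

Since k must be an integer, the estimate is rounded in practice. The k that minimizes the full omission sum can differ from the k that minimizes the final false-positive probability, so this check is only a sanity test of the estimate.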
• Multi-bit hashing uses two or more independent hash functions to minimize hash conflicts
  – That is, multi-bit hashing is used to maximize coverage
From Fast and Accurate Bitstate Verification for SPIN, by Peter C. Dillinger and Panagiotis Manolios, in SPIN 2004
m: 1 MB = $2^{23}$ bits
n: 626,211
• Using two independent hash functions per state is inefficient
  – With k=2, hash calculation takes twice as long as with k=1
• Instead, we use the double hashing scheme known from hash collision resolution
  – It gives coverage as good as independent hash functions do
Double hash algorithm
Input
  a, b: hash functions
  d: input value
  m: the size of the hash table
  k: the number of hash functions used
Output
  f: array of bit positions
Procedure
  x := a(d) % m
  y := b(d) % m
  f[0] := x
  i := 1
  while i < k
    x := (x+y) % m
    f[i] := x
    i := i+1
In SPIN, the values of a(d) and b(d) are calculated from Jenkins' hash function
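A runnable Python sketch of the double hash procedure. Two slices of a SHA-256 digest stand in for a(d) and b(d); this is an assumption made to keep the example self-contained, and it is not how SPIN computes them.

```python
import hashlib


def double_hash_positions(d, m, k):
    """Return k bit positions for input d using the double hashing scheme."""
    digest = hashlib.sha256(d).digest()
    # Assumption: two slices of one SHA-256 digest stand in for a(d) and b(d).
    x = int.from_bytes(digest[:8], "big") % m
    y = int.from_bytes(digest[8:16], "big") % m
    positions = [x]
    for _ in range(1, k):
        x = (x + y) % m  # each further position adds the same step y
        positions.append(x)
    return positions


positions = double_hash_positions(b"state-A", m=1 << 24, k=21)
```

Only two hash values are computed per state regardless of k, which is where the time saving over k independent hash functions comes from.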
Triple hashing algorithm extended from the double hashing algorithm
Input
  a, b, c: hash functions
  d: input value
  m: the size of the hash table
  k: the number of hash functions used
Output
  f: array of bit positions
Procedure
  x, y, z := a(d) % m, b(d) % m, c(d) % m
  f[0] := x
  i := 1
  while i < k
    x := (x+y) % m
    f[i] := x
    y := (y+z) % m
    i := i+1
In SPIN, the values of a(d), b(d), and c(d) are calculated from Jenkins' hash function
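A Python sketch of the triple hashing procedure. To keep the arithmetic easy to follow, it takes the three hash values a(d), b(d), c(d) as precomputed integers; the function name and the small sample values are illustrative assumptions.

```python
def triple_hash_positions(a_d, b_d, c_d, m, k):
    """Return k bit positions from three precomputed hash values of one input."""
    x, y, z = a_d % m, b_d % m, c_d % m
    positions = [x]
    for _ in range(1, k):
        x = (x + y) % m
        positions.append(x)
        y = (y + z) % m  # the step itself grows by z each round
    return positions


positions = triple_hash_positions(a_d=5, b_d=3, c_d=2, m=100, k=5)
# positions == [5, 8, 13, 20, 29]
```

Unlike double hashing, the distance between successive positions changes every round (3, 5, 7, 9 in this example), so two states that agree on x and y still diverge unless they also agree on z.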
• Experimental results
• n = 606,211, m = $2^{24}$ bits, k = 21

Implementation    Coverage (%)    Average running time (s)
Independent       93.281          19.88
Double Hashing    92.793          4.43
• Experimental results
• n = 606,211, m = $2^{24}$ bits

Implementation    Hash functions k    Average running time (s)
Double            21                  3.78
Triple            21                  3.84
Double            20                  3.61
Triple            20                  3.73
Independent       21                  19.88
• Experimental results
• n = 606,211, m = 3 × 8 M bits, k = 30

Implementation    Coverage (%)
Double            97.5
Triple            99.99
Conclusion
• Bitstate hashing can give good coverage within a fixed amount of memory
• Model checking using bitstate hashing is not sound, but it does not generate infeasible counterexamples
• Multi-bit hashing using the triple hashing scheme can provide good coverage efficiently
Reference
• The SPIN Model Checker, by Gerard J. Holzmann, Addison-Wesley, 2004
• An Analysis of Bitstate Hashing, by Gerard J. Holzmann, in FMSD 1998
• Fast and Accurate Bitstate Verification for SPIN, by Peter C. Dillinger and Panagiotis Manolios, in SPIN 2004