Combining data-driven and symbolic reasoning for Invariant Synthesis in SMT
(Work in Progress)
Haniel Barbosa
Andrew Reynolds
Cesare Tinelli
Clark Barrett
MVD 2018
2018–09–29, Iowa City, IA, USA
SyGuS Solving
CEGIS [Solar-Lezama et al. 2006]
- Most common technique for SyGuS solving
- Specification: x ≤ f(x, y) ∧ y ≤ f(x, y)
- Expression search space:
  - Combinations of x, y, 0, 1, ≤, +, if-then-else
Learning algorithm
Verification oracle
Counter-examples = {}
Candidate: f(x, y) = x
Counter-example: f(x=0, y=1)
Combining data-driven and symbolic reasoning for Invariant Synthesis (WIP) 1 / 16
Learning algorithm
Verification oracle
Counter-examples = {f(x=0, y=1)}
Candidate: f(x, y) = y
Counter-example: f(x=1, y=0)
Learning algorithm
Verification oracle
Counter-examples = {f(x=0, y=1), f(x=1, y=0), f(x=0, y=0), f(x=1, y=1)}
Candidate: ITE(x ≤ y, y, x)
SUCCESS
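The loop walked through on these slides can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the solver's implementation: the learner is a naive enumerator over the slide's grammar, and the verification oracle is a brute-force check over a small integer range standing in for an SMT query. The names `terms`, `learn`, `verify` and `cegis` are hypothetical, and the counter-examples found depend on the enumeration order, so they need not match the ones on the slides.

```python
from itertools import product

# Terms are nested tuples. ("ite", a, b, c, d) reads: if a <= b then c else d.
LEAVES = [("x",), ("y",), ("const", 0), ("const", 1)]

def evaluate(term, x, y):
    op = term[0]
    if op == "x":
        return x
    if op == "y":
        return y
    if op == "const":
        return term[1]
    if op == "add":
        return evaluate(term[1], x, y) + evaluate(term[2], x, y)
    if op == "ite":
        holds = evaluate(term[1], x, y) <= evaluate(term[2], x, y)
        return evaluate(term[3] if holds else term[4], x, y)
    raise ValueError(op)

def terms(depth):
    """Enumerate grammar terms up to the given nesting depth, leaves first."""
    if depth == 0:
        yield from LEAVES
        return
    smaller = list(terms(depth - 1))
    yield from smaller
    for a, b in product(smaller, repeat=2):
        yield ("add", a, b)
    for a, b, c, d in product(smaller, repeat=4):
        yield ("ite", a, b, c, d)

def satisfies(term, x, y):
    # Specification from the slides: x <= f(x, y) and y <= f(x, y).
    value = evaluate(term, x, y)
    return x <= value and y <= value

def learn(counterexamples, max_depth=1):
    """Learning algorithm: first term consistent with all counter-examples."""
    for term in terms(max_depth):
        if all(satisfies(term, x, y) for x, y in counterexamples):
            return term
    return None

def verify(term, domain=range(-3, 4)):
    """Verification oracle (assumption: bounded brute force, not an SMT call)."""
    for x, y in product(domain, repeat=2):
        if not satisfies(term, x, y):
            return (x, y)
    return None

def cegis():
    counterexamples = []
    while True:
        candidate = learn(counterexamples)
        if candidate is None:
            return None, counterexamples
        cex = verify(candidate)
        if cex is None:
            return candidate, counterexamples
        counterexamples.append(cex)

solution, seen = cegis()
print(solution, seen)
```

Because the oracle here only checks a bounded range, a returned term is only known to be correct on that range; a real oracle would discharge the check for all inputs.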
Scalability issues
For this bit-vector grammar, enumerating
- Terms of size = 1: 0.05 seconds
- Terms of size = 2: 0.6 seconds
- Terms of size = 3: 48 seconds
- Terms of size = 4: 5.8 hours
- Terms of size = 5: ??? (100+ days)
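The blow-up comes from the number of terms growing very quickly with term size. The sketch below counts terms by size for a made-up grammar shape; the operator counts are assumptions chosen for illustration (the grammar from the slide is not reproduced here), and the function does not model the timings above.

```python
from functools import lru_cache

# Hypothetical grammar shape, NOT the bit-vector grammar from the slide.
NUM_LEAVES = 4   # variables and constants
NUM_UNARY = 2    # unary operators
NUM_BINARY = 6   # binary operators

@lru_cache(maxsize=None)
def count_terms(size):
    """Number of distinct terms with exactly `size` nodes."""
    if size < 1:
        return 0
    if size == 1:
        return NUM_LEAVES
    # One node is the root operator; the rest is split among its children.
    total = NUM_UNARY * count_terms(size - 1)
    for left in range(1, size - 1):
        right = size - 1 - left
        total += NUM_BINARY * count_terms(left) * count_terms(right)
    return total

for size in range(1, 10):
    print(size, count_terms(size))
```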
Divide-and-conquer [Alur et al. 2017]
- Generate partial solutions correct on a subset of the inputs
- Combine using conditionals
Only applicable for plainly separable specifications
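As a toy illustration of the two steps above, the following sketch covers a set of input points with partial solutions and then joins them with conditions into a decision list. It is a simplification under stated assumptions, not the algorithm of Alur et al.: the term pool, the predicate pool, the greedy cover and all function names are hypothetical, and the specification is checked point by point, which is only possible because it is plainly separable.

```python
# Specification from the slides, checked at a single input point.
def spec(x, y, out):
    return x <= out and y <= out

# Hypothetical pools of partial solutions and branch conditions.
TERMS = {"x": lambda x, y: x, "y": lambda x, y: y}
PREDICATES = {
    "x <= 0": lambda x, y: x <= 0,
    "x <= y": lambda x, y: x <= y,
    "y <= x": lambda x, y: y <= x,
}

def divide_and_conquer(points):
    """Return a decision list [(condition name or None, term name), ...]."""
    remaining = list(points)
    branches = []
    while remaining:
        # Partial solution: the term correct on the most remaining points.
        name, covered = max(
            ((n, [p for p in remaining if spec(*p, t(*p))]) for n, t in TERMS.items()),
            key=lambda pair: len(pair[1]),
        )
        if not covered:
            return None  # no term in the pool works on the remaining points
        rest = [p for p in remaining if p not in covered]
        if not rest:
            branches.append((None, name))  # final "else" branch
            break
        # Combine: a condition that holds on the covered points only.
        cond = next(
            (c for c, f in PREDICATES.items()
             if all(f(*p) for p in covered) and not any(f(*p) for p in rest)),
            None,
        )
        if cond is None:
            return None
        branches.append((cond, name))
        remaining = rest
    return branches

def run(branches, x, y):
    """Evaluate the decision list as nested if-then-else."""
    for cond, name in branches:
        if cond is None or PREDICATES[cond](x, y):
            return TERMS[name](x, y)
    raise ValueError("no branch applies")

points = [(0, 1), (1, 0), (0, 0), (1, 1)]
print(divide_and_conquer(points))
```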
A new framework for SyGuS solving
CegisUnif: combining CEGIS with unification
- Not limited to plainly separable specifications
- Data-driven: refinement lemmas generate data points
- Divide-and-conquer: each point yields a new function to synthesize
  - Terms assigned to functions must satisfy refinement lemmas
  - SMT solving provides term candidates through constraint solving
Learning algorithm
Verification oracle
Counter-examples = {f(x=0, y=1), f(x=1, y=0), f(x=0, y=0), f(x=1, y=1)}
With unification, each counter-example point yields its own function to synthesize:
f_1(x=0, y=1)
f_2(x=1, y=0)
f_3(x=0, y=0)
f_4(x=1, y=1)
Feature synthesis
- Symbolic approach: derive a minimal number of features that separate conflicting points (i.e. those that cannot be assigned the same term)
  - Optimal fairness criteria?
    Currently: consider terms of size up to log2(#features)
- Heuristic approach: accumulate a “feature pool” and choose separating features based on the information gain heuristic for decision tree learning
  - Select features that maximize information gain
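The information gain heuristic can be made concrete with a small sketch. The points, their labels (which term each point is assigned) and the feature pool below are made-up illustrations, not data from the talk; the functions `entropy`, `information_gain` and `best_feature` are hypothetical names for the textbook Shannon-entropy computation used in decision tree learning.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(points, labels, feature):
    """Entropy reduction obtained by splitting the points on a boolean feature."""
    yes = [l for p, l in zip(points, labels) if feature(*p)]
    no = [l for p, l in zip(points, labels) if not feature(*p)]
    n = len(labels)
    return entropy(labels) - (len(yes) / n) * entropy(yes) - (len(no) / n) * entropy(no)

# Hypothetical data: each point (x, y) is labelled with the term assigned to it.
POINTS = [(0, 1), (1, 0), (2, 5), (3, 1)]
LABELS = ["y", "x", "y", "x"]

# Hypothetical feature pool.
FEATURES = {
    "x <= 0": lambda x, y: x <= 0,
    "y <= 1": lambda x, y: y <= 1,
    "x <= y": lambda x, y: x <= y,
}

def best_feature(points, labels, features):
    return max(features, key=lambda name: information_gain(points, labels, features[name]))

for name, feature in FEATURES.items():
    print(name, round(information_gain(POINTS, LABELS, feature), 3))
print("selected:", best_feature(POINTS, LABELS, FEATURES))
```

On this data the feature `x <= y` separates the two labels completely, so it is the one selected.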
Solving Invariant synthesis with CegisUnif
Invariant Synthesis
Add(Int x, y) {
  z := x; i := 0;
  assume(y > 0);
  while (i < y) {
    z := z + 1;
    i := i + 1;
  }
  return z;
}
Post-condition: Result is the sum of the inputs
Verification:
z = x ∧ i = 0 ∧ y > 0 → Inv(x, y, z, i)
Inv(x, y, z, i) ∧ i < y ∧ z′ = z + 1 ∧ i′ = i + 1 → Inv(x, y, z′, i′)
Inv(x, y, z, i) ∧ i ≥ y → z = x + y
Question: which invariant Inv(x, y, z, i) satisfies all three verification conditions?
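To make the three verification conditions for Add concrete, the sketch below checks a candidate invariant against them by brute force over a small range of integers. Both the candidate (z = x + i ∧ i ≤ y) and the bounded check are assumptions made for illustration: the candidate is not taken from the slides, and a real verifier would pass the conditions to an SMT solver rather than enumerate states.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def candidate_inv(x, y, z, i):
    # Hypothetical candidate invariant: z = x + i and i <= y.
    return z == x + i and i <= y

def check_invariant(inv, domain=range(-3, 4)):
    """Return a state violating one of the three conditions, or None.

    Assumption: states are only checked inside `domain`, so None means
    "no violation found in this range", not a proof.
    """
    for x, y, z, i in product(domain, repeat=4):
        # Initiation: z = x and i = 0 and y > 0 -> Inv(x, y, z, i)
        init = implies(z == x and i == 0 and y > 0, inv(x, y, z, i))
        # Consecution: Inv and i < y and z' = z + 1 and i' = i + 1 -> Inv'
        step = implies(inv(x, y, z, i) and i < y, inv(x, y, z + 1, i + 1))
        # Post-condition: Inv and i >= y -> z = x + y
        post = implies(inv(x, y, z, i) and i >= y, z == x + y)
        if not (init and step and post):
            return (x, y, z, i)
    return None

print(check_invariant(candidate_inv))
print(check_invariant(lambda x, y, z, i: z == x + i))  # too weak: misses i <= y
```

Dropping the conjunct i ≤ y leaves an invariant that still passes the first two conditions but no longer implies the post-condition, which is why the second call reports a violating state.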
Invariant Synthesis in SyGuS
- State of the art: LoopInvGen [Padhi and Millstein 2017]: data-driven loop invariant inference with automatic feature synthesis
  - Precondition inference from sets of “good” and “bad” states
  - Feature synthesis for solving conflicts
  - PAC (probably approximately correct) algorithm for building candidate invariants
- “Bad” states are dependent on the model of the initial condition (no guaranteed convergence)
- No support for implication counterexamples
Invariant Synthesis with CegisUnif
- Refinement lemmas allow derivation of three kinds of data points:
  - “good points” (invariant must always hold)
  - “bad points” (invariant can never hold)
  - “implication points” (if invariant holds in first point it must hold in second)
- No need for restriction to one initial state
- Native support for implication counterexamples
- Straightforward usage of classic information gain heuristic to build candidate solutions with decision tree learning
  - SMT solver “resolves” implication counterexample points as “good” and “bad”
  - Out-of-the-box Shannon entropy
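The three kinds of data points translate directly into a consistency check on a candidate invariant. The sketch below is a minimal illustration; the sample states (x, y, z, i), the candidate invariants and the function name `consistent` are hypothetical, and the step in which the SMT solver resolves implication points into “good” and “bad” is not modelled.

```python
def consistent(inv, good, bad, implications):
    """Check a candidate invariant against the three kinds of data points."""
    if not all(inv(*p) for p in good):
        return False  # a good point is excluded
    if any(inv(*p) for p in bad):
        return False  # a bad point is included
    # Implication points: if the invariant holds at p, it must hold at q.
    return all((not inv(*p)) or inv(*q) for p, q in implications)

# Hypothetical sample states (x, y, z, i).
GOOD = [(1, 2, 1, 0)]                          # an initial state
BAD = [(1, 2, 5, 2)]                           # loop exited with z != x + y
IMPLICATIONS = [((1, 2, 1, 0), (1, 2, 2, 1))]  # one loop iteration

def candidate(x, y, z, i):
    # Hypothetical candidate invariant, not taken from the slides.
    return z == x + i and i <= y

print(consistent(candidate, GOOD, BAD, IMPLICATIONS))
print(consistent(lambda x, y, z, i: i == 0, GOOD, BAD, IMPLICATIONS))
```

The second candidate, i = 0, agrees with the good and bad points but breaks the implication point, which is the kind of information that good/bad states alone cannot express.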
Preliminary results
Invariant generation for Lustre
- Test suite with 487 invariant synthesis benchmarks generated by the Kind 2 model checker from Lustre models
- We evaluate three configurations of CVC4
  - cegis: regular CEGIS
  - c unif: CegisUnif framework with symbolic solution building
  - c unif-infogain: CegisUnif framework with solution building determined by information gain heuristic
- 1800s timeout
[Figure: CPU time (s), log scale from 10^-1 to 10^3, over an x-axis ranging from 50 to 300, for the configurations c unif-infogain, cegis and c unif.]
[Figure: scatter plot comparing cegis (x-axis) and c unif (y-axis); both axes log scale from 10^-1 to 10^3.]
- +38 / -13
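[Figure: scatter plot comparing c unif-infogain (x-axis) and c unif (y-axis); both axes log scale from 10^-1 to 10^3.]
- +63 / -19
[Figure: scatter plot comparing c unif-infogain (x-axis) and cegis (y-axis); both axes log scale from 10^-1 to 10^3.]
- +73 / -42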
Invariants category from SyGuS-Comp 2018
- Test suite with 127 invariant synthesis benchmarks from numerous applications
- We evaluate three configurations of CVC4
  - cegis: regular CEGIS
  - c unif: CegisUnif framework with symbolic solution building
  - c unif-infogain: CegisUnif framework with solution building determined by information gain heuristic
- We also compare against LoopInvGen, the current winner of the invariants category in SyGuS-Comp
- 1800s timeout
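[Figure: four plots. (1) CPU time (s), log scale from 10^-1 to 10^3, over an x-axis ranging from 50 to 120, for loopinvgen, cegis, c unif and c unif-infogain. (2) Scatter plot comparing cegis (x-axis) and c unif (y-axis). (3) Scatter plot comparing c unif (x-axis) and c unif-infogain (y-axis). (4) Scatter plot comparing cegis (x-axis) and c unif-infogain (y-axis). All scatter-plot axes are log scale from 10^-1 to 10^3.]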
Future work
- Adapt ICE [Garg et al. 2016] information gain heuristics to our setting; derive new heuristics
- Extend heuristics to function synthesis [Alur et al. 2017]
- Use data to determine “relevant arguments”
  - f_1(0, 0, 0, 1, 2, 1, 0) � f_2(1, 0, 0, 5, 2, 1, 3)
  - Reducing noise: make points as similar as possible
    f′_1(1, 0, 0, 1, 2, 1, 0) � f′_2(1, 0, 0, 5, 2, 1, 0)
  - Only consider relevant arguments when synthesizing features
    Can drastically reduce search space
References
Alur, Rajeev, Arjun Radhakrishna, and Abhishek Udupa (2017). “Scaling Enumerative Program Synthesis via Divide and Conquer”. In: Tools and Algorithms for Construction and Analysis of Systems (TACAS). Ed. by Axel Legay and Tiziana Margaria. Vol. 10205. Lecture Notes in Computer Science, pp. 319–336.
Garg, Pranav et al. (2016). “Learning invariants using decision trees and implication counterexamples”. In: Symposium on Principles of Programming Languages. Ed. by Rastislav Bodík and Rupak Majumdar. ACM, pp. 499–512.
Padhi, Saswat and Todd D. Millstein (2017). “Data-Driven Loop Invariant Inference with Automatic Feature Synthesis”. In: CoRR abs/1707.02029. arXiv: 1707.02029.
Solar-Lezama, Armando et al. (2006). “Combinatorial sketching for finite programs”. In: Architectural Support for Programming Languages and Operating Systems (ASPLOS). Ed. by John Paul Shen and Margaret Martonosi. ACM, pp. 404–415.