
Exact and Approximate Methods for Proving Unrealizability of Syntax-Guided Synthesis Problems

Qinheping Hu, University of Wisconsin-Madison, USA

John Cyphert, University of Wisconsin-Madison, USA

Loris D’Antoni, University of Wisconsin-Madison, USA

Thomas Reps, University of Wisconsin-Madison, USA

Abstract

We consider the problem of automatically establishing that a given syntax-guided-synthesis (SyGuS) problem is unrealizable (i.e., has no solution). We formulate the problem of proving that a SyGuS problem is unrealizable over a finite set of examples as one of solving a set of equations: the solution yields an overapproximation of the set of possible outputs that any term in the search space can produce on the given examples. If none of the possible outputs agrees with all of the examples, our technique has proven that the given SyGuS problem is unrealizable. We then present an algorithm for exactly solving the set of equations that result from SyGuS problems over linear integer arithmetic (LIA) and LIA with conditionals (CLIA), thereby showing that LIA and CLIA SyGuS problems over finitely many examples are decidable. We implement the proposed technique and algorithms in a tool called nay. nay can prove unrealizability for 70/132 existing SyGuS benchmarks, with running times comparable to those of the state-of-the-art tool nope. Moreover, nay can solve 11 benchmarks that nope cannot solve.

CCS Concepts: • Software and its engineering → Automatic programming; • Theory of computation → Abstraction.

Keywords: Program Synthesis, Unrealizability, Grammar Flow Analysis, Syntax-Guided Synthesis (SyGuS)

ACM Reference Format:

Qinheping Hu, John Cyphert, Loris D’Antoni, and Thomas Reps. 2020. Exact and Approximate Methods for Proving Unrealizability of Syntax-Guided Synthesis Problems. In Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI ’20), June 15–20, 2020, London, UK. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3385412.3385979

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

PLDI ’20, June 15–20, 2020, London, UK

© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.

ACM ISBN 978-1-4503-7613-6/20/06...$15.00

https://doi.org/10.1145/3385412.3385979

1 Introduction

The goal of program synthesis is to find a program in some search space that meets a specification, e.g., satisfies a set of examples or a logical formula. Recently, a large family of synthesis problems has been unified into a framework called syntax-guided synthesis (SyGuS). A SyGuS problem is specified by a regular-tree grammar that describes the search space of programs, and a logical formula that constitutes the behavioral specification. Many synthesizers support a specific format for SyGuS problems [1], and compete in annual synthesis competitions [2]. These solvers are now quite mature and are finding a wealth of applications [9, 12].

While existing SyGuS synthesizers are good at finding a solution when one exists, there has been only a small amount of work on methods to prove that a given SyGuS problem is unrealizable, i.e., that the problem does not admit a solution. The problem of proving unrealizability arises in applications such as pruning infeasible paths in symbolic-execution engines [16] and computing syntactically optimal solutions to SyGuS problems [13]. However, proving that a SyGuS problem is unrealizable is particularly hard and, in general, undecidable [6]. When a SyGuS problem is realizable, any search technique that systematically explores the infinite search space of possible programs will eventually identify a solution to the synthesis problem. In contrast, proving that a problem is unrealizable requires showing that every program in the infinite search space fails to satisfy the specification.

Although we cannot hope to have a complete algorithm for establishing unrealizability, the goal of this paper is to develop a framework for solving the kinds of problems that appear in practice. Our framework can be used in tandem with existing synthesizers that use the counterexample-guided inductive synthesis (CEGIS) approach, in which the synthesizer iteratively builds a set of input examples and finds programs consistent with the examples.

Our approach builds on the observation that unrealizability of a SyGuS problem $sy$ can be proved by showing, for some finite set of examples $E$, that $sy^E$ (the same problem with the weaker specification of merely satisfying the examples in $E$) is unrealizable [11]. We combine this observation with techniques from the abstract-interpretation literature to show that determining realizability of a linear integer arithmetic (LIA) SyGuS problem over a finite set of examples is actually decidable. Our work gives a decision procedure to show unrealizability for a $sy^E$ instance, whereas the prior work by Hu et al. [11] reduced the problem to a program-reachability problem. In their approach, if an assertion inside a constructed program is shown to be valid, then the original problem is unrealizable. The issue with prior work is that the resulting reachability problem is passed to an incomplete solver that may not terminate or may only return unknown.

Even though we consider a finite set of examples, showing realizability is non-trivial because the grammar can still generate an infinite set of terms. The main idea of this paper is to use an abstract domain to overapproximate the possibly infinite set of outputs that the terms derivable from each nonterminal of the grammar of $sy^E$ can produce on examples $E$.

The overapproximation is formalized using grammar-flow analysis (GFA), a method that extends dataflow analysis to grammars [17]. We define a GFA problem whose solution associates an overapproximating abstract-domain value with each nonterminal of the SyGuS grammar. We then use the notion of symbolic concretization [20] to represent the abstract values as logical formulas, which get combined with the SyGuS specification to produce an SMT query whose result can imply that the original problem is unrealizable.

Using this framework, a variety of abstract domains can be used to show unrealizability for arbitrary SyGuS problems. However, we also give a particular instantiation of the framework to obtain a decision procedure for (un)realizability of LIA SyGuS problems over a finite set of examples. The key to this reduction is the use of the abstract domain of semi-linear sets. We show that the GFA problem over semi-linear sets can be solved to yield a semi-linear set that exactly captures the set of possible outputs of the SyGuS grammar. The problem $sy^E$ is unrealizable if and only if the semi-linear set for the start nonterminal of the grammar contains no value that satisfies the specification. We extend this result to SyGuS problems whose grammar contains LIA terms and conditionals (CLIA).

Our work makes the following three contributions:

(1) We reduce the problem of proving unrealizability of a SyGuS problem, where the specification is given by examples, to the problem of solving a set of equations in an abstract domain (§2). The correctness of our reduction is based on the framework of grammar-flow analysis (§3 and §4).

(2) We show that the equations resulting from our reduction can be solved exactly for SyGuS problems in which the grammars only generate terms in LIA (§5) and CLIA (§6), therefore yielding the first decision procedures for LIA and CLIA SyGuS problems over a finite set of examples.

(3) We implement our technique in a tool, nay (§7). nay can prove unrealizability for 70/132 benchmarks that were used to evaluate the state-of-the-art tool nope. In particular, nay can solve 11 benchmarks that nope could not solve (§8).

§9 discusses related work. Proofs and additional details can be found in the supplementary material.

2 Illustrative Examples

SyGuS problems in LIA. Consider the SyGuS problem in which the goal is to create a term $e_f$ whose meaning is $e_f(x) := 2x + 2$, but where $e_f$ is in the language of the following regular tree grammar $G_1$ (see footnote 1):

Start ::= Plus(Var(x), Var(x), Var(x), Start) | Num(0)    (1)

This problem is unrealizable because every term in the grammar $G_1$ is of the form $3kx$ (with $k \geq 0$).

A typical synthesizer tries to solve this problem using a counterexample-guided inductive synthesis (CEGIS) strategy that searches for a program consistent with a finite set of examples $E$. Here, let's assume that the initial input example in $E$ is $i_1$, which has $x$ set to 1, i.e., $i_1(x) = 1$. For this example, the input $i_1$ corresponds to the output $o_1 = 4$.
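To make the claim that every term of $G_1$ means $3kx$ concrete, here is a minimal Python sketch that enumerates the first few terms of $G_1$ and evaluates them on $i_1$. The tuple encoding of terms and the helper names `g1_terms` and `evaluate` are illustrative assumptions, not part of the paper or its tool. A bounded enumeration like this cannot prove unrealizability by itself; it only shows the pattern the analysis has to capture.

```python
from itertools import islice

def g1_terms():
    """Yield the terms of G1 by increasing depth, encoded as nested tuples."""
    term = ("Num", 0)
    while True:
        yield term
        # The only recursive production: Plus(Var(x), Var(x), Var(x), Start)
        term = ("Plus", ("Var", "x"), ("Var", "x"), ("Var", "x"), term)

def evaluate(term, x):
    """Concrete semantics of a term for a single input value x."""
    tag = term[0]
    if tag == "Num":
        return term[1]
    if tag == "Var":
        return x
    # n-ary Plus: sum of the values of all children
    return sum(evaluate(child, x) for child in term[1:])

# Outputs of the six smallest terms on the example i1(x) = 1
outputs_on_i1 = [evaluate(t, 1) for t in islice(g1_terms(), 6)]
print(outputs_on_i1)  # [0, 3, 6, 9, 12, 15]
```

None of the listed outputs equals the required output 4, which is consistent with the unrealizability argument developed in the rest of this section.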

In this particular case, there exists no term in the grammar $G_1$ that is consistent with the example $i_1$. To prove that this grammar does not contain a term that is consistent with the specification on the example $i_1$, we compute for each nonterminal $A$ a value $n_{1,E}(A)$ (see footnote 2) that describes the set of values any term derived from $A$ can produce when evaluated on $i_1$, i.e., $\gamma(n_{1,E}(A)) \supseteq \{ \llbracket e \rrbracket(i_1) \mid e \in L_{G_1}(A) \}$, where, as usual in abstract interpretation, $\gamma$ denotes the concretization function. As we show in §4, for $n_{1,E}(A)$ to be an overapproximation of the set of output values that any term derived from $A$ can produce for the current set of examples $E$, it should satisfy the following equation:

\[
n_{1,E}(\text{Start}) = \llbracket \text{Plus} \rrbracket^{\#}_E\big(\llbracket \text{Var}(x) \rrbracket^{\#}_E, \llbracket \text{Var}(x) \rrbracket^{\#}_E, \llbracket \text{Var}(x) \rrbracket^{\#}_E, n_{1,E}(\text{Start})\big) \oplus \llbracket \text{Num}(0) \rrbracket^{\#}_E. \tag{2}
\]

Footnote 1: For readability, we allow grammars to contain n-ary Plus symbols and trees. In the next sections, we will write the grammar $G_1$ as follows: Start ::= Plus(S1, Start) | Num(0); S1 ::= Plus(S2, Var(x)); S2 ::= Plus(S3, Var(x)); S3 ::= Var(x).

Footnote 2: This section uses a simplified notation for readability. In §4 the term $n_{1,E}(A)$ is written $n_{\mathcal{G}_{1E}}$, where $\mathcal{G}_1$ is used to denote a GFA problem.

For every term $e$, the notation $\llbracket e \rrbracket^{\#}_E$ denotes an abstract semantics of $e$, i.e., $\llbracket e \rrbracket^{\#}_E$ overapproximates the set of values $e$ can produce when evaluated on the examples in $E$, and $\oplus$ denotes the join operator, which overapproximates $\cup$.

In this example, we represent each $n_{1,E}(A)$ using a semi-linear set, i.e., a set of terms $\{l_1, \ldots, l_n\}$, where each $l_i$ is a term of the form $c + \lambda_1 c_1 + \cdots + \lambda_k c_k$ (called a linear set), the values $\lambda_i \in \mathbb{N}$ are parameters, and the values $c_j \in \mathbb{Z}$ are fixed coefficients. We then replace each $\llbracket e \rrbracket^{\#}_E$ with a corresponding semi-linear-set interpretation. For example, $\llbracket \text{Var}(x) \rrbracket^{\#}_E$ is the vector of inputs $E$ projected onto the $x$ coordinate, i.e., $\llbracket \text{Var}(x) \rrbracket^{\#}_E = \{i_1(x)\} = \{1\}$. We rewrite $\llbracket \text{Plus} \rrbracket^{\#}_E$ as $\otimes$, with $x \otimes y$ being the semi-linear set representing $\{a + b \mid a \in x, b \in y\}$. We rewrite Eqn. (2) to use semi-linear sets:

\[
n_{1,E}(\text{Start}) = \big(\{1\} \otimes \{1\} \otimes \{1\} \otimes n_{1,E}(\text{Start})\big) \oplus \{0\}, \tag{3}
\]

where $x \oplus y$ is the semi-linear set representing $\{a \mid a \in x \lor a \in y\}$. These operations can be performed precisely.
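The two operations can be sketched in a few lines of Python. The sketch below assumes a deliberately simple representation that is not taken from the paper's implementation: a linear set is a pair `(base, periods)` of integers, and a semi-linear set is a list of such pairs. With that encoding, $\oplus$ is list concatenation and $\otimes$ adds bases and concatenates periods pairwise. The helper `sample` is only a bounded sanity check that the set $\{0 + 3\lambda\}$ is consistent with the right-hand side of Eqn. (3).

```python
from itertools import product

# A linear set (base, periods) denotes {base + sum(l_i * p_i) | l_i in N}.
# A semi-linear set is a list of linear sets (one-dimensional integers here).

def combine(sl1, sl2):
    """The join: union of two semi-linear sets."""
    return sl1 + sl2

def plus(sl1, sl2):
    """Element-wise sum {a + b | a in sl1, b in sl2}, pairwise on linear sets."""
    return [(b1 + b2, p1 + p2) for (b1, p1), (b2, p2) in product(sl1, sl2)]

def sample(sl, bound):
    """Concrete elements obtained with every parameter taken from range(bound)."""
    values = set()
    for base, periods in sl:
        for lams in product(range(bound), repeat=len(periods)):
            values.add(base + sum(l * p for l, p in zip(lams, periods)))
    return values

one = [(1, ())]            # the singleton {1}
zero = [(0, ())]           # the singleton {0}
candidate = [(0, (3,))]    # the set {0 + 3*lambda}

# Right-hand side of Eqn. (3) evaluated at the candidate
rhs = combine(plus(plus(plus(one, one), one), candidate), zero)
print(rhs)  # [(3, (3,)), (0, ())]
```

On the sampled range, `rhs` and `candidate` describe the same values (0, 3, 6, ...), which is what one expects of a solution to Eqn. (3).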

In this example, an exact solution to this set of equations is the semi-linear set $n_{1,E}(\text{Start}) = \{0 + \lambda 3\}$, which describes the set of all possible values produced by any term in grammar $G_1$ for the set of examples $E = \langle i_1 \rangle$. In particular, such a solution can be computed automatically [10]. This SyGuS problem does not have a solution, because none of the values in $n_{1,E}(\text{Start})$ meets the specification on the given input example, i.e., the following formula is not satisfiable:

\[
\exists \lambda.\, [\, i_1 = 1 \land o_1 = 0 + \lambda 3 \land \lambda \geq 0 \,] \land o_1 = 2 i_1 + 2. \tag{4}
\]
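For a single linear set with one positive period, the satisfiability question in Eqn. (4) reduces to a divisibility test. The following sketch checks it directly; the function name `in_linear_set` is a hypothetical helper, and a real implementation would hand the formula to an SMT solver instead.

```python
def in_linear_set(target, base, period):
    """Is target = base + lam * period for some natural number lam?

    Assumes a single period with period > 0.
    """
    diff = target - base
    return diff >= 0 and diff % period == 0

i1 = 1
required_output = 2 * i1 + 2          # the specification o1 = 2*i1 + 2
eqn4_satisfiable = in_linear_set(required_output, base=0, period=3)
print(eqn4_satisfiable)  # False
```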

SyGuS problems in CLIA. For grammars with a more complex background theory, such as CLIA (LIA with conditionals), it may be more complicated to compute an overapproximation of the possible outputs of any term in the grammar. For example, consider the SyGuS problem where once again the goal is to synthesize a term whose meaning is $e_f(x) := 2x + 2$, but now in the more expressive CLIA grammar $G_2$:

Start ::= IfThenElse(BExp, Exp3, Start) | Exp2 | Exp3
BExp ::= LessThan(Var(x), Num(2)) | LessThan(Num(0), Start) | And(BExp, BExp)
Exp2 ::= Plus(Var(x), Var(x), Exp2) | Num(0)
Exp3 ::= Plus(Var(x), Var(x), Var(x), Exp3) | Num(0)    (5)

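Consider again the input example $i_1 = 1$ with output $o_1 = 4$. The term Plus(Var(x), Var(x), Plus(Var(x), Var(x), Num(0))) in this grammar is correct on the input $i_1$. A SyGuS solver that enumerates all terms in the grammar will find this term, test it on the given specification, see that it is not correct on all inputs, and produce a counterexample. In this case, suppose that the counterexample is $i_2$, where $i_2(x) = 2$ with the corresponding output $o_2 = 6$. There is no term in $G_2$ that is consistent with both of these examples, and we will prove this fact as we did before, that is, by solving the following set of equations (see footnote 3):

\[
\begin{aligned}
n_{2,E}(\text{Start}) &= \llbracket \text{IfThenElse} \rrbracket^{\#}_E\big(n_{2,E}(\text{BExp}), n_{2,E}(\text{Exp3}), n_{2,E}(\text{Start})\big) \oplus n_{2,E}(\text{Exp2}) \oplus n_{2,E}(\text{Exp3}) \\
n_{2,E}(\text{BExp}) &= \llbracket \text{LessThan} \rrbracket^{\#}_E\big(\llbracket \text{Var}(x) \rrbracket^{\#}_E, \llbracket \text{Num}(2) \rrbracket^{\#}_E\big) \\
&\quad \oplus \llbracket \text{LessThan} \rrbracket^{\#}_E\big(\llbracket \text{Num}(0) \rrbracket^{\#}_E, n_{2,E}(\text{Start})\big) \\
&\quad \oplus \llbracket \text{And} \rrbracket^{\#}_E\big(n_{2,E}(\text{BExp}), n_{2,E}(\text{BExp})\big) \\
n_{2,E}(\text{Exp2}) &= \llbracket \text{Plus} \rrbracket^{\#}_E\big(\llbracket \text{Var}(x) \rrbracket^{\#}_E, \llbracket \text{Var}(x) \rrbracket^{\#}_E, n_{2,E}(\text{Exp2})\big) \oplus \llbracket \text{Num}(0) \rrbracket^{\#}_E \\
n_{2,E}(\text{Exp3}) &= \llbracket \text{Plus} \rrbracket^{\#}_E\big(\llbracket \text{Var}(x) \rrbracket^{\#}_E, \llbracket \text{Var}(x) \rrbracket^{\#}_E, \llbracket \text{Var}(x) \rrbracket^{\#}_E, n_{2,E}(\text{Exp3})\big) \oplus \llbracket \text{Num}(0) \rrbracket^{\#}_E
\end{aligned}
\tag{6}
\]

Footnote 3: Note that the $\oplus$ symbol is overloaded. On the right-hand side of $n_{2,E}(\text{BExp})$, $\oplus$ is an operation on an abstract Boolean value, whereas the $\oplus$ on the right-hand side of the other equations is an operation on semi-linear sets. Both operations denote set union, and are handled in a uniform way by operating over a multi-sorted domain of Booleans and semi-linear sets.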

Because we want to track the possible values each term can have for both examples, we need a domain that summarizes vectors of values. Luckily, semi-linear sets can easily be extended to vectors, i.e., each $l_i$ in a semi-linear set $sl$ is a linear set of the form $\{\vec{v}_0 + \lambda_1 \vec{v}_1 + \cdots + \lambda_k \vec{v}_k \mid \lambda_i \in \mathbb{N}\}$ (with $\vec{v}_j \in \mathbb{Z}^k$). Furthermore, because some nonterminals are Boolean-valued and some are integer-valued, we need different representations of the possible outputs of each nonterminal. We will use semi-linear sets for $n_{2,E}(\text{Start})$, $n_{2,E}(\text{Exp2})$, and $n_{2,E}(\text{Exp3})$, and a set of Boolean vectors for $n_{2,E}(\text{BExp})$. For example, $n_{2,E}(\text{BExp})$ could be the set $\{(t, f), (t, t)\}$, which denotes that a Boolean expression generated by BExp can be true for $i_1$ and false for $i_2$, or true for both.

We can now instantiate all constant terminals and variable terminals with their abstraction, e.g., $\llbracket \text{Var}(x) \rrbracket^{\#}_E$ with $\{(1, 2)\}$ and $\llbracket \text{Num}(0) \rrbracket^{\#}_E$ with $\{(0, 0)\}$. We then start solving part of our equations by observing that Exp2 and Exp3 are only recursive in themselves. Therefore, we can compute their summaries independently, obtaining $n_{2,E}(\text{Exp2}) = \{(0, 0) + \lambda(2, 4)\}$ and $n_{2,E}(\text{Exp3}) = \{(0, 0) + \lambda(3, 6)\}$. We can now replace all instances of $n_{2,E}(\text{Exp2})$ and $n_{2,E}(\text{Exp3})$, and obtain the following set of equations:

\[
\begin{aligned}
n_{2,E}(\text{Start}) &= \llbracket \text{IfThenElse} \rrbracket^{\#}_E\big(n_{2,E}(\text{BExp}), \{(0, 0) + \lambda(3, 6)\}, n_{2,E}(\text{Start})\big) \\
&\quad \oplus \{(0, 0) + \lambda(2, 4)\} \oplus \{(0, 0) + \lambda(3, 6)\} \\
n_{2,E}(\text{BExp}) &= \{(t, f)\} \oplus \llbracket \text{LessThan} \rrbracket^{\#}_E\big(\{(0, 0)\}, n_{2,E}(\text{Start})\big) \oplus \llbracket \text{And} \rrbracket^{\#}_E\big(n_{2,E}(\text{BExp}), n_{2,E}(\text{BExp})\big)
\end{aligned}
\tag{7}
\]
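The summaries for Exp2 and Exp3 can be sanity-checked with a short Python sketch that evaluates the first few terms of each nonterminal on both examples and tests membership in the corresponding linear set. The helper names `chain_outputs` and `in_linear_set` are assumptions made for illustration, and the membership test only handles a single period vector, which is all these two summaries need.

```python
EXAMPLES = (1, 2)   # i1(x) = 1, i2(x) = 2

def chain_outputs(copies_of_x, depth):
    """Output vectors of the terms Plus(Var(x), ..., Var(x), <rest>) over Num(0),
    with `copies_of_x` occurrences of Var(x) per level, for nesting 0..depth."""
    vectors = []
    current = (0,) * len(EXAMPLES)                 # the vector for Num(0)
    for _ in range(depth + 1):
        vectors.append(current)
        current = tuple(c + copies_of_x * x for c, x in zip(current, EXAMPLES))
    return vectors

def in_linear_set(vec, base, period):
    """Is vec = base + lam * period for some natural lam (single period vector)?"""
    diffs = [v - b for v, b in zip(vec, base)]
    for d, p in zip(diffs, period):
        if p != 0:
            if d % p != 0 or d // p < 0:
                return False
            lam = d // p
            return all(dd == lam * pp for dd, pp in zip(diffs, period))
    return all(d == 0 for d in diffs)

exp2_outputs = chain_outputs(2, 4)   # (0, 0), (2, 4), (4, 8), ...
exp3_outputs = chain_outputs(3, 4)   # (0, 0), (3, 6), (6, 12), ...
print(all(in_linear_set(v, (0, 0), (2, 4)) for v in exp2_outputs))  # True
print(all(in_linear_set(v, (0, 0), (3, 6)) for v in exp3_outputs))  # True
print(in_linear_set((4, 6), (0, 0), (2, 4)))                        # False
```

The last line reflects that the required output vector (4, 6) is produced by neither Exp2 nor Exp3 alone, so the conditional production has to be analyzed as well.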

We now have to face the problem of solving equations over $n_{2,E}(\text{BExp})$ and $n_{2,E}(\text{Start})$, which represent different types of values and are mutually recursive. Because the domain of $n_{2,E}(\text{BExp})$ is finite (it has at most $2^{|E|}$ elements), we can solve the equations iteratively until we reach a fixed point for both variables. In particular, we initialize all variables to the empty set and evaluate right-hand sides, so $n^0_{2,E}(\text{BExp}) = \{(t, f)\}$ (the superscript denotes the iteration the algorithm is in). We can replace $n_{2,E}(\text{BExp})$ with the value of $n^0_{2,E}(\text{BExp})$ in the equation for $n^1_{2,E}(\text{Start})$ as follows:

\[
n^1_{2,E}(\text{Start}) = \llbracket \text{IfThenElse} \rrbracket^{\#}_E\big(\{(t, f)\}, \{(0, 0) + \lambda(3, 6)\}, n^1_{2,E}(\text{Start})\big) \oplus \{(0, 0) + \lambda(2, 4)\} \oplus \{(0, 0) + \lambda(3, 6)\} \tag{8}
\]

At this point, we face a new problem: we need to express the abstract semantics of IfThenElse using the semi-linear set operators $\oplus$ and $\otimes$. In particular, we would like to produce a semi-linear set in which, for each vector, some components come from the semi-linear set for the then-branch (i.e., values corresponding to inputs for which the IfThenElse guard was true), and some components come from the semi-linear set for the else-branch (i.e., values corresponding to inputs for which the IfThenElse guard was false). We overcome this problem by rewriting the above equations as follows:

\[
\begin{aligned}
n^1_{2,E}(\text{Start}^{(t,t)}) &= \{(0, 0) + \lambda(3, 0)\} \otimes n^1_{2,E}(\text{Start}^{(f,t)}) \oplus \{(0, 0) + \lambda(2, 4)\} \oplus \{(0, 0) + \lambda(3, 6)\} \\
n^1_{2,E}(\text{Start}^{(f,t)}) &= \{(0, 0) + \lambda(0, 0)\} \otimes n^1_{2,E}(\text{Start}^{(f,t)}) \oplus \{(0, 0) + \lambda(0, 4)\} \oplus \{(0, 0) + \lambda(0, 6)\}
\end{aligned}
\tag{9}
\]
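The rewriting in Eqn. (9) rests on a simple identity for concrete output vectors: selecting components by the guard is the same as zeroing out the unselected components of each branch and adding the two results. The Python sketch below checks that identity on one vector; the function names are hypothetical and the code works on single vectors, not on semi-linear sets.

```python
def ite(guard, then_vec, else_vec):
    """Component-wise IfThenElse over one output vector per example."""
    return tuple(t if g else e for g, t, e in zip(guard, then_vec, else_vec))

def mask(vec, keep):
    """Zero out the components of vec whose keep flag is False."""
    return tuple(v if k else 0 for v, k in zip(vec, keep))

def ite_via_masks(guard, then_vec, else_vec):
    """The same result, expressed as a sum of two masked vectors."""
    negated = tuple(not g for g in guard)
    kept_then = mask(then_vec, guard)
    kept_else = mask(else_vec, negated)
    return tuple(a + b for a, b in zip(kept_then, kept_else))

guard = (True, False)        # the Boolean vector (t, f)
then_vec = (3, 6)            # an element of {(0, 0) + lambda(3, 6)}
else_vec = (2, 4)            # an element of {(0, 0) + lambda(2, 4)}
print(ite(guard, then_vec, else_vec))            # (3, 4)
print(ite_via_masks(guard, then_vec, else_vec))  # (3, 4)
```

Masking the then-branch vector (3, 6) with guard (t, f) gives (3, 0), which matches the period $(3, 0)$ that appears in Eqn. (9).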

Intuitively, $n^1_{2,E}(\text{Start}^{(f,t)})$ is the abstraction obtained by only executing the expressions generated by Start on the second example and leaving the output of the first example as 0, to represent the fact that only the example $i_2$ followed the else branch of the IfThenElse statement. Similarly, the semi-linear set $\{(0, 0) + \lambda(3, 0)\}$ zeroes out the second component of the semi-linear set appearing in the then branch. The value of $n^1_{2,E}(\text{Start}^{(t,t)})$ (which is also the value of $n^1_{2,E}(\text{Start})$) is then computed by summing ($\otimes$) together the then and else values. This set of equations is now in a form that we can solve automatically, i.e., it only involves the operations $\oplus$ and $\otimes$ over semi-linear sets, and thus we can compute the value of $n^1_{2,E}(\text{Start})$. We now plug that value into the equation for BExp and compute the value of $n^1_{2,E}(\text{BExp})$:

\[
n^1_{2,E}(\text{BExp}) = \{(t, f)\} \oplus \llbracket \text{LessThan} \rrbracket^{\#}_E\big(\{(0, 0)\}, n^1_{2,E}(\text{Start})\big) \oplus \llbracket \text{And} \rrbracket^{\#}_E\big(n^1_{2,E}(\text{BExp}), n^1_{2,E}(\text{BExp})\big) \tag{10}
\]

Because $n^1_{2,E}(\text{BExp})$ has a finite domain, equations over such a domain can be solved iteratively, in this case yielding the fixed-point value $n^1_{2,E}(\text{BExp}) = \{(t, f), (t, t), (f, f)\}$. We now plug this solution into the equation for Start and compute the value of $n^2_{2,E}(\text{Start})$ similarly to how we computed that of $n^1_{2,E}(\text{Start})$. We then use $n^2_{2,E}(\text{Start})$ to compute $n^2_{2,E}(\text{BExp})$ and discover that $n^2_{2,E}(\text{BExp}) = n^1_{2,E}(\text{BExp})$. Because we have reached a fixed point, we have found the set of possible values the grammar can output on our set of examples, i.e., the abstraction $n^1_{2,E}(\text{Start})$ captures all possible values the grammar $G_2$ can output on $E$. By plugging such values into the original formula, similarly to what we did in Eqn. (4), we get that no output set satisfies the formula on the given input examples, and therefore this SyGuS problem is unrealizable.
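The iterative solution of a finite-domain equation such as Eqn. (10) can be sketched as follows. The sketch makes a strong simplifying assumption: the argument `start_vectors` is a small, hand-picked finite sample of Start output vectors, whereas the analysis described above applies LessThan to the full semi-linear set. All function names are illustrative.

```python
def less_than(left, right):
    """Component-wise comparison of two integer vectors."""
    return tuple(a < b for a, b in zip(left, right))

def conj(g1, g2):
    """Component-wise conjunction of two Boolean vectors."""
    return tuple(a and b for a, b in zip(g1, g2))

def solve_bexp(start_vectors, examples=(1, 2)):
    """Least solution of B = LessThan(x, 2) U LessThan(0, Start) U And(B, B)
    for a fixed finite set of Start output vectors."""
    zero = (0,) * len(examples)
    two = (2,) * len(examples)
    constant_part = {less_than(examples, two)}
    constant_part |= {less_than(zero, s) for s in start_vectors}
    current = set()
    while True:
        updated = constant_part | {conj(a, b) for a in current for b in current}
        if updated == current:       # fixed point reached
            return current
        current = updated

# Hypothetical sample of Start outputs: Num(0), one Exp2 term, one Exp3 term
sample_start = {(0, 0), (2, 4), (3, 6)}
print(solve_bexp(sample_start) == {(True, False), (False, False), (True, True)})  # True
```

The loop terminates because there are only finitely many Boolean vectors of length $|E|$ and each round can only add vectors.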

3 Background

In this section, we recall the definition of syntax-guided synthesis over a finite set of examples.

3.1 Trees and Tree Grammars

A ranked alphabet is a tuple $(\Sigma, rk_\Sigma)$ where $\Sigma$ is a finite set of symbols and $rk_\Sigma : \Sigma \to \mathbb{N}$ associates a rank to each symbol. For every $m \geq 0$, the set of all symbols in $\Sigma$ with rank $m$ is denoted by $\Sigma^{(m)}$. In our examples, a ranked alphabet is specified by showing the set $\Sigma$ and attaching the respective rank to every symbol as a superscript, e.g., $\Sigma = \{\text{Plus}^{(2)}, \text{Var}(x)^{(0)}\}$. (For brevity, the superscript is sometimes omitted.) We use $T_\Sigma$ to denote the set of all (ranked) trees over $\Sigma$, i.e., $T_\Sigma$ is the smallest set such that (i) $\Sigma^{(0)} \subseteq T_\Sigma$, and (ii) if $\sigma^{(k)} \in \Sigma^{(k)}$ and $t_1, \ldots, t_k \in T_\Sigma$, then $\sigma^{(k)}(t_1, \ldots, t_k) \in T_\Sigma$. In what follows, we assume a fixed ranked alphabet $(\Sigma, rk_\Sigma)$.

Definition 3.1 (Regular-Tree Grammar). A regular tree grammar (RTG) is a tuple $G = (N, \Sigma, S, \delta)$, where $N$ is a finite set of nonterminal symbols of arity 0; $\Sigma$ is a ranked alphabet; $S \in N$ is an initial nonterminal; and $\delta$ is a finite set of productions of the form $A_0 \to \sigma^{(i)}(A_1, \ldots, A_i)$, where for $1 \leq j \leq i$, each $A_j \in N$ is a nonterminal.

Given a tree $t \in T_{\Sigma \cup N}$, applying a production $r = A \to \beta$ to $t$ produces the tree $t'$ resulting from replacing the leftmost occurrence of $A$ in $t$ with the right-hand side $\beta$. A tree $t \in T_\Sigma$ is generated by the grammar $G$, denoted by $t \in L(G)$, iff it can be obtained by applying a sequence of productions $r_1 \cdots r_n$ to the tree whose root is the initial nonterminal $S$. $\delta_A \subseteq \delta$ denotes the set of productions associated with nonterminal $A$, and $\Sigma_A := \{\sigma^{(i)} \mid A \to \sigma^{(i)}(A_1, \ldots, A_i) \in \delta_A\}$.
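As an illustration of Definition 3.1, a regular tree grammar can be encoded as a dictionary from nonterminals to productions, and the trees of bounded height in $L(G)$ can be enumerated recursively. The encoding, the example grammar `LIA_LIKE`, and the function `trees` are assumptions chosen for this sketch; they are not taken from the paper.

```python
from itertools import product

# Productions: nonterminal -> list of (symbol, [child nonterminals]).
# The length of the child list is the rank of the symbol.
LIA_LIKE = {
    "E": [("Plus", ["E", "E"]), ("Var(x)", []), ("Num(1)", [])],
}

def trees(grammar, nonterminal, height):
    """All trees derivable from `nonterminal` with height at most `height`,
    encoded as tuples (symbol, child_1, ..., child_k)."""
    if height == 0:
        return []
    result = []
    for symbol, children in grammar[nonterminal]:
        child_options = [trees(grammar, c, height - 1) for c in children]
        for combo in product(*child_options):
            result.append((symbol, *combo))
    return result

print(len(trees(LIA_LIKE, "E", 1)))  # 2
print(len(trees(LIA_LIKE, "E", 2)))  # 6
print(len(trees(LIA_LIKE, "E", 3)))  # 38
```

The counts grow quickly with the height bound, and the full language is infinite because the grammar is recursive; this is why enumeration alone cannot establish unrealizability.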

3.2 Syntax-Guided Synthesis

A SyGuS problem is specified with respect to a background theory $T$ (e.g., linear arithmetic), and the goal is to synthesize a function $f$ that satisfies two constraints provided by the user. The first constraint, $\psi(f(x), x)$, describes a semantic property that $f$ should satisfy. The second constraint limits the search space $S$ of $f$, and is given as a set of terms specified by an RTG $G$ that defines a subset of all terms in $T$.

Definition 3.2 (SyGuS). A SyGuS problem over a background theory $T$ is a pair $sy = (\psi(f, x), G)$, where $G$ is a regular tree grammar that only contains terms in $T$, i.e., $L(G) \subseteq T$, and $\psi(f, x)$ is a Boolean formula constraining the semantic behavior of the synthesized program $f$ (see footnote 4).

A SyGuS problem is realizable if there exists an expression $e \in L(G)$ such that $\forall x.\, \psi(\llbracket e \rrbracket, x)$ is true. Otherwise we say that the problem is unrealizable.

Theorem 3.3 (Undecidability [6]). Given a SyGuS problem $sy$, it is undecidable to check whether $sy$ is realizable.

Many SyGuS solvers do not solve the problem of finding a term that satisfies the specification on all inputs. Instead, they look for an expression that satisfies the specification on a finite example set $E$. If such a term is found, it is then checked whether it generalizes to all inputs. We take a similar approach to show unrealizability.

Footnote 4: In this paper, we focus on single-invocation SyGuS problems for which the formula $\psi$ only contains instances of the function $f$ that are called on the input $x$. We write $\psi(f, x)$ instead of $\psi(f(x), x)$ for brevity.

Definition 3.4. Given a SyGuS problem $sy = (\psi(f, x), G)$ and a finite set of inputs $E = \langle i_1, \ldots, i_n \rangle$, let $sy^E := (\psi^E(f), G)$ denote the problem of finding a term $e \in L(G)$ such that $\llbracket e \rrbracket$ is only required to be correct on the examples in $E$. Let $\llbracket e \rrbracket_E$ denote the vector of outputs $\langle \llbracket e \rrbracket(i_1), \ldots, \llbracket e \rrbracket(i_n) \rangle$ ($= \langle o_1, \ldots, o_n \rangle$) produced by $e$ on $E$. A $sy^E$ problem is realizable if $\psi^E(\llbracket e \rrbracket_E) \stackrel{\text{def}}{=} \bigwedge_{i_j \in E} \psi(\llbracket e \rrbracket(i_j), i_j)$ holds for some $e \in L(G)$, and unrealizable otherwise.

Lemma 3.5 ([11]). If $sy^E$ is unrealizable then $sy$ is unrealizable.

Example 3.6. The regular tree grammar of all linear integer arithmetic (LIA) terms is

T_LIA ::= Plus(T_LIA, T_LIA) | Minus(T_LIA, T_LIA) | Num(c) | Var(x)

where $c \in \mathbb{Z}$, and $x \in V$ is an input variable to the function being synthesized. The semantics of these productions is as expected, and is extended to terms in the usual way.

In the case of a $sy^E$ instance, we consider the restricted semantics of LIA with respect to a set of examples $E = \langle i_1, \ldots, i_n \rangle$, given by a function $\llbracket \cdot \rrbracket_E : T_{LIA} \to \mathbb{Z}^n$. $\llbracket \cdot \rrbracket_E$ maps an LIA term to the corresponding output vector produced by evaluating the term with respect to all of the examples in $E$. Let $\mu_E : V \to \mathbb{Z}^n$ be the function that projects the inputs onto the $x$ coordinate, i.e., $\mu_E(x) = \langle i_1(x), \ldots, i_n(x) \rangle$. The semantics of the LIA operators with respect to an example set $E$ is then defined as follows:

\[
\begin{aligned}
\llbracket \text{Plus} \rrbracket_E(\vec{v}_1, \vec{v}_2) &:= \vec{v}_1 + \vec{v}_2 & \llbracket \text{Num}(c) \rrbracket_E &:= \langle c, \ldots, c \rangle \\
\llbracket \text{Minus} \rrbracket_E(\vec{v}_1, \vec{v}_2) &:= \vec{v}_1 - \vec{v}_2 & \llbracket \text{Var}(x) \rrbracket_E &:= \mu_E(x)
\end{aligned}
\]

where $+$ (resp. $-$) denotes the component-wise addition (resp. subtraction) of two vectors. $\llbracket \cdot \rrbracket_E : T_{LIA} \to \mathbb{Z}^n$ is extended to terms in the usual way. For brevity, we overload the term "LIA" to refer both to the logic LIA and to LIA grammars, i.e., grammars over the alphabet $\{\text{Plus}, \text{Minus}, \text{Num}(c), \text{Var}(x)\}$.
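The example-restricted semantics of Example 3.6 translates almost directly into code. In the sketch below, terms are nested tuples and each example is a dictionary from variable names to integers; both encodings and the function name `eval_on_examples` are assumptions made for illustration.

```python
def eval_on_examples(term, examples):
    """The output vector of an LIA term over a list of input examples."""
    tag = term[0]
    if tag == "Num":
        return tuple(term[1] for _ in examples)        # <c, ..., c>
    if tag == "Var":
        return tuple(ex[term[1]] for ex in examples)   # mu_E(x)
    left = eval_on_examples(term[1], examples)
    right = eval_on_examples(term[2], examples)
    if tag == "Plus":
        return tuple(a + b for a, b in zip(left, right))
    if tag == "Minus":
        return tuple(a - b for a, b in zip(left, right))
    raise ValueError(f"unknown symbol: {tag}")

E = [{"x": 1}, {"x": 2}]
two_x_plus_two = ("Plus", ("Plus", ("Var", "x"), ("Var", "x")), ("Num", 2))
print(eval_on_examples(two_x_plus_two, E))  # (4, 6)
```

The printed vector agrees with the outputs $o_1 = 4$ and $o_2 = 6$ used for the specification $2x + 2$ in §2.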

In §4.3, we present an algorithm based on Counterexample-Guided Inductive Synthesis (CEGIS) to show unrealizability of a SyGuS problem, $sy$, by showing unrealizability of a $sy^E$ problem. The idea is to check unrealizability of $sy^E$ for some set $E$. If $sy^E$ is unrealizable, the algorithm reports unrealizable; otherwise it generates a new example, $i_{n+1}$, adds it to obtain $E' = E \cup \{i_{n+1}\}$, tries to prove unrealizability of $sy^{E'}$, and so on. In §5, we show that the unrealizability problem for a $sy^E$ instance is decidable for LIA grammars. However, we note that there are SyGuS problems for which CEGIS-style algorithms cannot prove unrealizability [11]. Despite this negative result, we will show that a CEGIS algorithm can prove unrealizability for many SyGuS instances (§8).
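The loop just described can be sketched on a toy instance. Everything specific in this sketch is an assumption made for illustration: the search space is restricted to terms whose meaning is $2\lambda x$ (the Exp2 nonterminal of $G_2$ taken alone), the check on $E$ is a bounded search over $\lambda$, and counterexamples are found by testing a fixed range of inputs. The algorithm in §4.3 instead relies on the GFA-based decision procedure and on a verifier.

```python
# Sufficient for this toy only: lam * 2x = 2x + 2 with x >= 1 forces lam <= 2.
LAMBDA_BOUND = 3

def spec(x):
    """Required output on input x."""
    return 2 * x + 2

def find_candidate(examples):
    """Smallest lam whose term (meaning 2*lam*x) matches spec on all examples,
    or None if no such term exists, i.e., the problem on E is unrealizable."""
    for lam in range(LAMBDA_BOUND):
        if all(2 * lam * x == spec(x) for x in examples):
            return lam
    return None

def find_counterexample(lam, inputs=range(1, 20)):
    """An input on which the candidate violates spec, or None if none is found."""
    for x in inputs:
        if 2 * lam * x != spec(x):
            return x
    return None

def cegis(initial_examples, max_rounds=10):
    examples = list(initial_examples)
    for _ in range(max_rounds):
        candidate = find_candidate(examples)
        if candidate is None:
            return "unrealizable", examples
        counterexample = find_counterexample(candidate)
        if counterexample is None:
            return "realizable", examples
        examples.append(counterexample)
    return "unknown", examples

print(cegis([1]))  # ('unrealizable', [1, 2])
```

On $E = \langle 1 \rangle$ the candidate $4x$ fits the example, the input 2 refutes it, and on $E' = \langle 1, 2 \rangle$ no candidate remains, so the loop reports unrealizable.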

4 Proving Unrealizability using Grammar Flow Analysis

In this section, we present a formalism called grammar flow analysis (GFA) [17], which connects regular tree grammars to equation systems, and show how to use GFA to prove unrealizability of SyGuS problems for finitely many examples.

4.1 Grammar Flow Analysis

GFA is a formalism used for equipping the language of a grammar with a semantics in which the meaning of a tree is a value from a (complete) combine semilattice.

Definition 4.1 (Combine Semilattice). A combine semilattice is an algebraic structure $\mathcal{D} = (D, \oplus)$, where $\oplus : D \times D \to D$ is a binary operation on $D$ (called "combine") that is commutative, associative, and idempotent. A partial order, denoted by $\sqsubseteq$, is induced on the elements of $D$ as follows: for all $d_1, d_2 \in D$, $d_1 \sqsubseteq d_2$ iff $d_1 \oplus d_2 = d_2$. A combine semilattice is complete if it is closed under infinite combines.
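A familiar instance of Definition 4.1 is a powerset with union as the combine operation, in which case the induced order is set inclusion. The short Python sketch below checks the three laws and the induced order on a few sample elements; it is an illustration on finite sets only, with names chosen for this example.

```python
from itertools import product

def combine(d1, d2):
    """Combine on the powerset semilattice: set union."""
    return d1 | d2

def below(d1, d2):
    """Induced partial order: d1 is below d2 iff combine(d1, d2) == d2."""
    return combine(d1, d2) == d2

samples = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]

# Commutativity, associativity, and idempotence on all sample triples
laws_hold = all(
    combine(a, b) == combine(b, a)
    and combine(a, combine(b, c)) == combine(combine(a, b), c)
    and combine(a, a) == a
    for a, b, c in product(samples, repeat=3)
)
print(laws_hold)                                    # True
print(below(frozenset({0}), frozenset({0, 1})))     # True
print(below(frozenset({0}), frozenset({1})))        # False
```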

Definition 4.2. [GFA] [17, 19] Let $\mathcal{D} = (D, \oplus)$ be a complete combine semilattice. Recall that in a regular-tree grammar $G = (N, \Sigma, S, \delta)$, $\delta$ is a set of productions of the form $X_0 \to g(X_1, \ldots, X_k)$, with $g \in \Sigma$.

In a GFA problem $\mathcal{G} = (G, \mathcal{D})$, each production is associated with a production function $\llbracket \cdot \rrbracket^\#$ that provides an interpretation of $g$, i.e., $\llbracket g \rrbracket^\# : D^k \to D$. $\llbracket \cdot \rrbracket^\#$ is extended to trees in $L(G)$ in the usual way, by thinking of each tree $e \in L(G)$ as a term over the operations $\llbracket g \rrbracket^\#$. Term $e$ denotes a composition of functions, and corresponds to a unique value in $D$, which we call $\llbracket e \rrbracket^\#_{\mathcal{G}}$ (or simply $\llbracket e \rrbracket^\#$ when $\mathcal{G}$ is understood).

Let $L_G(X)$ denote the trees derivable from a nonterminal $X$. The grammar-flow-analysis problem is to overapproximate, for each nonterminal $X$, the combine-over-all-derivations value $m_{\mathcal{G}}(X)$ defined as follows:

\[ m_{\mathcal{G}}(X) = \bigoplus_{e \in L_G(X)} \llbracket e \rrbracket^\#_{\mathcal{G}}. \]

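We can also associate $\mathcal{G}$ with a system of mutually recursive equations, where each equation has the form

\[ n_{\mathcal{G}}(X_0) = \bigoplus_{X_0 \to g(X_1, \ldots, X_k) \in \delta} \llbracket g \rrbracket^\#(n_{\mathcal{G}}(X_1), \ldots, n_{\mathcal{G}}(X_k)). \tag{11} \]

We use $n_{\mathcal{G}}(X)$ to denote the value of nonterminal $X$ in the least fixed-point solution of $\mathcal{G}$'s equations.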

In essence, GFA is about two ways of folding the semantics of terms onto nonterminals:

Derivation-tree based: $m_{\mathcal{G}}(X)$ defines the semantics of a term in a compositional fashion, and folds all terms in $L_G(X)$ onto nonterminal $X$ by combining ($\oplus$) their values.

Equational: $n_{\mathcal{G}}(X)$ obtains a value for $X$ by using the values of "neighboring" nonterminals, i.e., nonterminals that appear on the right-hand side of productions of $X$.

Furthermore, GFA ensures that for all $X$, $m_{\mathcal{G}}(X) \sqsubseteq n_{\mathcal{G}}(X)$.

The relevance of GFA for showing unrealizability is that whenever an RTG $G$ is recursive, $L(G)$ is an infinite set of trees. Thus, in general, there is not a clear method to compute the combine-over-all-derivations value $m_{\mathcal{G}}(X) = \bigoplus_{e \in L_G(X)} \llbracket e \rrbracket^\#_{\mathcal{G}}$. However, we can employ fixed-point finding procedures to compute $n_{\mathcal{G}}(X)$. Because $m_{\mathcal{G}}(X) \sqsubseteq n_{\mathcal{G}}(X)$, our computed value will be a safe overapproximation.


However, in some cases we have a stronger relationship between $m_{\mathcal{G}}(X)$ and $n_{\mathcal{G}}(X)$. A production function $\llbracket g \rrbracket^\#$ is infinitely distributive in a given argument position if

\[ \llbracket g \rrbracket^\#\Big(\ldots, \bigoplus_{j \in J} x_j, \ldots\Big) = \bigoplus_{j \in J} \llbracket g \rrbracket^\#(\ldots, x_j, \ldots) \]

where $J$ is a finite or infinite index set.

Theorem 4.3. [17, 19] If every production function $\llbracket g \rrbracket^\#$, $g \in \Sigma$, is infinitely distributive in each argument position, then for all nonterminals $X$, $m_{\mathcal{G}}(X) = n_{\mathcal{G}}(X)$.

This theorem is key to our decision procedures for LIA and CLIA grammars, because the domain of semi-linear sets has this property (§5.3).

4.2 Connecting GFA to Unrealizability

In this section, we show how GFA can be used to check whether a SyGuS problem with finitely many examples $E$ is unrealizable. Intuitively, we use GFA to overapproximate the set of values the expressions generated by the grammar can yield when evaluated on a certain set of input examples $E$.

Definition 4.4. Let $sy^E = (\psi^E, G)$ be a SyGuS problem with example set $E$, regular-tree grammar $G = (N, \Sigma, S, \delta)$, and background theory $T$. Let $\llbracket \cdot \rrbracket_E$ be the semantics of trees in $L_G(X)$ obtained via $T$, when $\mu_E(\cdot)$ is used to interpret occurrences of terminals of $G$ that represent arguments to the function to be synthesized in the SyGuS problem.

Let $\mathcal{D} = (D, \oplus)$ be a complete combine semilattice for which there is a concretization function $\gamma : D \to \mathit{Val}^{|E|}$, where $\mathit{Val}$ is the type of the output values produced by the function to be synthesized in the SyGuS problem. Let $\mathcal{G}_E = (G, \mathcal{D})$ be a GFA problem that uses $\mu_E(\cdot)$ to interpret occurrences of terminals of $G$ that represent arguments to the function to be synthesized. Then

1. $\mathcal{G}_E$ is a sound abstraction of the semantics of $L_G(X)$ if

\[ \gamma(m_{\mathcal{G}_E}(X)) \supseteq \{\llbracket e \rrbracket_E \mid e \in L_G(X)\}. \]

2. $\mathcal{G}_E$ is an exact abstraction of the semantics of $L_G(X)$ if

\[ \gamma(m_{\mathcal{G}_E}(X)) = \{\llbracket e \rrbracket_E \mid e \in L_G(X)\}. \]

By using such abstractions, including the one described in §2 based on semi-linear sets (see §5 and §6), the results obtained by solving a GFA problem can imply that a SyGuS problem with finitely many examples $E$ is unrealizable.

The idea is that, given a SyGuS problem $sy^E = (\psi^E, G)$ with example set $E$, regular-tree grammar $G = (N, \Sigma, S, \delta)$, and background theory $T$, we can (i) solve the GFA problem $\mathcal{G}_E = (G, \mathcal{D})$ with some complete combine semilattice $\mathcal{D} = (D, \oplus)$ to obtain an overapproximation of $\gamma(m_{\mathcal{G}_E}(S))$, and then (ii) check if the approximation is disjoint from the specification, i.e., the predicate $\vec{o} \in \gamma(m_{\mathcal{G}_E}(S)) \wedge \bigwedge_{i_j \in E} \psi(o_j, i_j)$ is unsatisfiable.

Checking whether the previous predicate is satisfiable can be operationalized with the use of symbolic concretization [20] and an SMT solver. We view an abstract domain $\mathcal{D}$ as (implicitly) a logic fragment $\mathcal{L}_{\mathcal{D}}$ of some general-purpose logic $\mathcal{L}$, and each abstract value as (implicitly) representing a formula in $\mathcal{L}_{\mathcal{D}}$. The connection between $\mathcal{D}$ and $\mathcal{L}_{\mathcal{D}}$ can be made explicit: we say that $\gamma$ is a symbolic-concretization operation for $\mathcal{D}$ if $\gamma(\cdot, \vec{o}) : D \to \mathcal{L}_{\mathcal{D}}$ maps each $a \in D$ to a formula with free variables $\vec{o}$, such that $[[\gamma(a, \vec{o})]]_{\mathcal{L}} = \gamma(a)$. If $\gamma$ exists, we say that $\mathcal{L}$ supports symbolic concretization for $\mathcal{D}$.

Theorem 4.5. Let $sy^E = (\psi^E, G)$ be a SyGuS problem with example set $E$, regular-tree grammar $G = (N, \Sigma, S, \delta)$, and background theory $T$. Let $\mathcal{D} = (D, \oplus)$ be a complete combine semilattice, and $\mathcal{G}_E = (G, \mathcal{D})$ be a grammar-flow-analysis problem over regular-tree grammar $G$. Assume the theory $T$ supports symbolic concretization of $\mathcal{D}$. Let $P$ be the property

\[ P \stackrel{\text{def}}{=} \gamma(n_{\mathcal{G}_E}(S), \vec{o}) \wedge \bigwedge_{i_j \in E} \psi(o_j, i_j). \]

1. Suppose that $\mathcal{G}_E$ is a sound abstraction of the semantics of $L(G)$ with respect to background theory $T$. Then $sy^E$ is unrealizable if $P$ is unsatisfiable.

2. Suppose that $\mathcal{G}_E$ is an exact abstraction of the semantics of $L(G)$ with respect to background theory $T$. Then $sy^E$ is unrealizable if and only if $P$ is unsatisfiable.

4.3 Algorithm for Showing Unrealizability

Alg. 1 summarizes our strategy for showing unrealizability.

Example 4.6. Recall the SyGuS problem, from §2, of synthesizing a function $e_f(x) = 2x + 2$ using the grammar from Eqn. (1). Suppose that we call Alg. 1 with the example set $E = \{1\}$, and use the abstract domain of semi-linear sets. Alg. 1 first creates a GFA problem $\mathcal{G}_E$, which is shown as the recursive equation system given as Eqn. (3). The solution of the GFA problem then gets assigned to $s$ at line (2). In this example, $s$ is the semi-linear set $\{0 + \lambda 3\}$. This set can be symbolically concretized as the set of models of $\exists \lambda \geq 0.\ o_1 = 0 + \lambda 3$. Then, on line (3) the LIA formula $\exists \lambda \geq 0.\ o_1 = 0 + \lambda 3 \wedge o_1 = 2 i_1 + 2 \wedge i_1 = 1$ is passed to an SMT solver, which will return unsat.
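The control flow of Alg. 1 on Example 4.6 can be sketched as follows. This is only an illustration under stated assumptions: the function names (`check_unrealizable`, `solve_gfa`, `meets_spec`) are hypothetical, the GFA solution is hard-coded from the example rather than computed, and a modular-arithmetic test stands in for the SMT query, which is valid only for a one-dimensional linear set with a single positive period.

```python
def check_unrealizable(solve_gfa, meets_spec, exact):
    """Mirror of Alg. 1: solve the GFA problem, then test it against the spec."""
    s = solve_gfa()                               # line (2)
    if not meets_spec(s):                         # line (3): query is unsatisfiable
        return "Unrealizable"                     # line (4)
    return "Realizable" if exact else "Unknown"   # line (5)

def solve_gfa():
    # Example 4.6 with E = {1}: the 1-D linear set {0 + 3*lambda},
    # stored as (base, period) and copied from the example, not computed.
    return (0, 3)

def meets_spec(s):
    base, period = s
    o1 = 2 * 1 + 2  # output required by the specification on input i1 = 1
    # Stand-in for the SMT query: is there lambda >= 0 with o1 = base + lambda*period?
    return o1 >= base and (o1 - base) % period == 0

result = check_unrealizable(solve_gfa, meets_spec, exact=True)
```

Here the required output 4 is not a multiple of 3, so the check fails and the sketch reports unrealizability on this example set.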

Algorithm 1: Checking whether $sy^E$ is unrealizable

Function: CheckUnrealizable($G$, $\psi$, $E$)
Input: Grammar $G$, specification $\psi$, set of examples $E$
1 $\mathcal{G}_E \leftarrow (G, \mathcal{D})$ // GFA problem from $G$ and $E$ (Def. 4.4)
2 $s \leftarrow n_{\mathcal{G}_E}(\mathit{Start})$ // Compute solution to the GFA problem
3 if $\gamma(s, \vec{o}) \wedge \bigwedge_{i_j \in E} \psi(o_j, i_j)$ is unsatisfiable then
4     return Unrealizable
5 return Realizable if $\mathcal{G}_E$ is an exact abstraction, Unknown otherwise

GFA in Practice. So far we have been vague about how GFA problems are computationally solved. In general, there is no universal method. The performance and precision of a method depend on the choice of abstract domain $\mathcal{D}$.

Kleene iteration. Traditionally, one would employ Kleene iteration to find a least fixed-point, $n_{\mathcal{G}_E}(X)$. However, Kleene iteration is only guaranteed to converge to a least fixed-point if the domain $\mathcal{D}$ satisfies the finite-ascending-chain condition. For example, the domain of predicate abstraction has this property, and therefore Alg. 1 could be instantiated with Kleene iteration and predicate abstraction to attempt to show unrealizability for arbitrary SyGuS problems. However, in this paper we focus on SyGuS problems that use integer arithmetic, whose domain does have infinite ascending chains. Thus, while predicate abstraction, and other domains with finite height, can provide a sound abstraction of LIA problems, they can never provide an exact abstraction. Alternatively, we could still use Kleene iteration on a domain with infinite ascending chains if we provide a widening operator to ensure convergence [7]. The issue with this strategy is that we are not guaranteed to achieve a least fixed-point. Such a method would still be sound, but necessarily incomplete.
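To make the Kleene-iteration option concrete, here is a minimal sketch over a finite-height domain. The domain (sets of residues modulo 3) is an assumption chosen for illustration and is not one of the domains used in the paper; the equation shapes follow the expanded grammar of Eqn. (1) as it appears in Example 5.5, with a single input example $x = 1$.

```python
def kleene_lfp(equations, bottom):
    """Iterate all equations from bottom until no nonterminal's value changes."""
    values = {x: bottom for x in equations}
    while True:
        new = {x: rhs(values) for x, rhs in equations.items()}
        if new == values:
            return values
        values = new

M = 3  # modulus of the hypothetical finite abstract domain

def plus_abs(a, b):
    """Abstract Plus: all sums of residues, taken modulo M."""
    return frozenset((u + v) % M for u in a for v in b)

def num_abs(c):
    """Abstract constant: the residue of c."""
    return frozenset({c % M})

x = 1  # the single input example
equations = {
    "Start": lambda n: plus_abs(n["S1"], n["Start"]) | num_abs(0),
    "S1": lambda n: plus_abs(n["S2"], num_abs(x)),
    "S2": lambda n: plus_abs(n["S3"], num_abs(x)),
    "S3": lambda n: num_abs(x),
}
solution = kleene_lfp(equations, frozenset())
target = (2 * x + 2) % M          # residue of the output demanded by f(x) = 2x + 2
unrealizable = target not in solution["Start"]
```

Because the domain has only finitely many elements, the loop must terminate. For this particular example the coarse abstraction already suffices to separate the grammar's outputs from the specification, although in general a finite domain gives only a sound, not an exact, answer.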

Constrained Horn clauses. Another incomplete, but general, method would employ the domain of constrained Horn clauses, $(\Phi, \vee)$. The set $\Phi$ contains all first-order predicates over some theory. The order of predicates is given by $P_1(\vec{v}) \leq P_2(\vec{v})$ iff $P_1(\vec{v}) \to P_2(\vec{v})$, for all models $\vec{v}$. The production functions $\llbracket \cdot \rrbracket^\#$ of this GFA problem get translated to constraints on the predicates. The advantage of using $(\Phi, \vee)$ is that the resulting GFA problem is a Horn-clause program, which we can then pass to an off-the-shelf, incomplete Horn-clause solver, such as the one implemented in Z3 [8]. In this case, Alg. 1 would be slightly modified. Horn-clause solvers do not provide an abstract description of the nonterminals. Instead, they determine satisfiability of a set of Horn clauses with respect to a particular query. Therefore, in this case Alg. 1 would use the formula in line (3) as the Horn-clause query, instead of having a separate SMT check.

Example 4.7. The GFA problem in Eqn. (2) can be encoded using the following constrained Horn clause:

\[ \forall v, v'.\ \mathit{Start}(v) \leftarrow (v = 1 + 1 + 1 + v' \wedge \mathit{Start}(v')) \vee v = 0 \tag{12} \]

A Horn-clause solver can prove that the LIA SyGuS problem from §2 is unrealizable by showing that the following formula is unsatisfiable: Eqn. (12) $\wedge\ \mathit{Start}(o_1) \wedge o_1 = 2 i_1 + 2$.

Newton's Method. In the next two sections, we provide specialized complete methods to solve GFA problems over LIA and CLIA grammars using Newton's method [10]. Our custom methods are limited to the case of LIA and CLIA grammars, but we show that the resulting solution is exact. No prior method has this property for LIA and CLIA grammars. Consequently, our methods guarantee that not only does the check on line (3) imply unrealizability on a set of examples if the solver returns unsat, but also realizability if the solver returns sat. The latter property is important because it ensures that the current set of examples is insufficient to prove unrealizability, and we must generate more.

5 Proving Unrealizability of LIA SyGuS Problems with Examples

In this section, we instantiate the framework underlying Alg. 1 to obtain a decision procedure for (un)realizability of SyGuS problems in linear integer arithmetic (LIA), where the specification is given by examples (as defined in Ex. 3.6). First, we review the conditions for applying Newton's method for finding the least fixed-point of a GFA problem over a commutative, idempotent, $\omega$-continuous semiring (§5.1). We then show that the domain of semi-linear sets can be formulated as such a problem. This approach provides a method to compute $n_{\mathcal{G}_E}(\mathit{Start})$ for LIA SyGuS problems. We then show that the domain of semi-linear sets is exact and infinitely distributive (§5.3). Finally, we show that semi-linear sets admit symbolic concretization (§5.4). Thus, by Thm. 4.5, we obtain a decision procedure for checking (un)realizability.

5.1 Solving Equations using Newton’s Method

We provide background definitions on semirings and Newton's method for solving equations over certain semirings.

Definition 5.1. A semiring $\mathcal{S} = (D, \oplus, \otimes, 0, 1)$ consists of a set of elements $D$ equipped with two binary operations: combine ($\oplus$) and extend ($\otimes$). $\oplus$ and $\otimes$ are associative, and have identity elements $0$ and $1$, respectively. $\oplus$ is commutative, and $\otimes$ distributes over $\oplus$. For every $x \in D$, $x \otimes 0 = 0 = 0 \otimes x$.

A semiring is commutative if for all $a, b \in D$, $a \otimes b = b \otimes a$.

An $\omega$-continuous semiring has a Kleene-star operator $\circledast : D \to D$ defined as follows: $a^{\circledast} = \bigoplus_{i \in \mathbb{N}} a^i$.

A semiring is idempotent if for all $a \in D$, $a \oplus a = a$.

Recently, Esparza et al. [10] developed a method, called Newtonian Program Analysis (NPA), which solves a set of semiring equations by an iterative computation.

Lemma 5.2. [Newton's Method [10, Theorem 7.7]] For a system of equations over a set of variables $N$ and a commutative, idempotent, $\omega$-continuous semiring, NPA reaches the least fixed point after at most $|N|$ iterations.

Lem. 5.2 is a powerful result because it applies even in cases when the semiring has infinite ascending chains.

5.2 Removing Non-Commutative Operators

Our first step towards using GFA to generate equations that can be solved using Newton's method removes non-commutative operators from the grammar.

We define the language LIA+,

\[ T_{\mathrm{LIA}^+} ::= \mathrm{Plus}(T_{\mathrm{LIA}^+}, T_{\mathrm{LIA}^+}) \mid \mathrm{Num}(c) \mid \mathrm{Var}(x) \mid \mathrm{NegVar}(x) \]

where the semantics of the Plus, Num, and Var operators are the same as for LIA, and $\llbracket \mathrm{NegVar}(x) \rrbracket_E := -\mu_E(x)$. We say a regular-tree grammar is an LIA+ grammar if its alphabet is {Plus, Num(c), Var(x), NegVar(x)}.

The next example shows how our algorithm uses a function $h$ to push negations to the leaves of LIA terms to yield an LIA+ grammar.

Example 5.3. Consider the LIA grammar G:

Start ::= Minus(Start, Start) | 1 | x

The following LIA+ grammar h(G) is equivalent to G:

Start ::= Plus(Start, Start⁻) | Num(1) | Var(x)
Start⁻ ::= Plus(Start⁻, Start) | Num(−1) | NegVar(x).
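The idea behind $h$ can be illustrated at the level of individual terms. The sketch below is a term-level analogue and not the paper's grammar transformation: it rewrites one LIA term into an LIA+ term by tracking whether the current subterm sits under a negation. The tuple encoding of terms and the names `push_neg` and `evaluate` are assumptions made for this example.

```python
def push_neg(term, negate=False):
    """Rewrite an LIA term into an equivalent LIA+ term (no Minus).

    Terms are tuples: ("Plus", l, r), ("Minus", l, r), ("Num", c), ("Var", x).
    `negate` records whether the value of `term` must be negated.
    """
    op = term[0]
    if op == "Num":
        return ("Num", -term[1] if negate else term[1])
    if op == "Var":
        return ("NegVar", term[1]) if negate else term
    if op == "Plus":
        return ("Plus", push_neg(term[1], negate), push_neg(term[2], negate))
    if op == "Minus":
        # l - r = l + (-r), and -(l - r) = (-l) + r
        return ("Plus", push_neg(term[1], negate), push_neg(term[2], not negate))
    raise ValueError(f"unknown operator: {op}")

def evaluate(term, env):
    """Evaluate an LIA or LIA+ term under an assignment of variables to integers."""
    op = term[0]
    if op == "Num":
        return term[1]
    if op == "Var":
        return env[term[1]]
    if op == "NegVar":
        return -env[term[1]]
    left, right = evaluate(term[1], env), evaluate(term[2], env)
    return left + right if op == "Plus" else left - right
```

The two nonterminals Start and Start⁻ in Example 5.3 play the role of the `negate` flag: Start⁻ derives exactly the negations of the terms derived by Start.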

5.3 Grammar Flow Analysis using Semi-Linear Sets

Thanks to §5.2, we can assume that the SyGuS grammar $G$ only produces LIA+ terms. In this section, we use grammar-flow analysis to generate equations such that the solutions to the equations assign a semi-linear set to each nonterminal $X$ that, for the finitely many examples in $E$, exactly describes the set of possible values produced by any term in $L_G(X)$.

We start by defining the complete combine semilattice $(\mathcal{SL}, \oplus)$ of semi-linear sets (see [10, §2.3.3] and [5, §3.4.4]). We then use them, together with the set of examples $E$, to define a specific family of GFA problems: $\mathcal{G}_E = (G, \mathcal{SL})$, where $G = (N, \Sigma, S, \delta)$ is an LIA+ grammar. For simplicity, we use the notation $\mathcal{SL}$ for both the semilattice and its domain.

In the terminology of abstract interpretation, $\mathcal{SL}$ is an abstract domain that we can use to represent, for every nonterminal $X$, the set of possible output vectors produced by evaluating each term in $L_G(X)$ on the examples in $E$. Moreover, the representation is exact; i.e., $\gamma(m_{\mathcal{G}_E}(X)) = \{\llbracket e \rrbracket_E \mid e \in L_G(X)\}$, where $\gamma$ denotes the usual operation of concretization.

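Definition 5.4 (Semi-linear Set). A linear set $\langle \vec{u}, \{\vec{v}_1, \cdots, \vec{v}_n\} \rangle$ denotes the set of integer vectors $\{\vec{u} + \lambda_1 \vec{v}_1 + \cdots + \lambda_n \vec{v}_n \mid \lambda_1, \ldots, \lambda_n \in \mathbb{N}\}$, where $\vec{u}, \vec{v}_1, \ldots, \vec{v}_n \in \mathbb{Z}^d$ and $d$ is the dimension of the linear set. A semi-linear set is a finite union $\bigcup_i \{\langle \vec{u}_i, V_i \rangle\}$ of linear sets, also denoted by $\{\langle \vec{u}_i, V_i \rangle\}_i$.

The concretization of a semi-linear set $sl = \{\langle \vec{u}_i, V_i \rangle\}_i$, denoted by $\gamma(sl)$, is the set of vectors

\[ \bigcup_i \{\vec{u}_i + \lambda_{1,i} \vec{v}_{1,i} + \cdots + \lambda_{n,i} \vec{v}_{n,i} \mid \lambda_{1,i}, \ldots, \lambda_{n,i} \in \mathbb{N}\}. \]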

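To apply Newton's method for solving equations (Lem. 5.2), we need a commutative idempotent semiring over semi-linear sets. Fortunately, such a semiring exists [5, §3.4.4], with the operators $\otimes$, $\oplus$ and $\circledast$ defined as follows:

\[ \begin{aligned}
\{\langle \vec{u}_{1,i}, V_{1,i} \rangle\}_i \oplus \{\langle \vec{u}_{2,j}, V_{2,j} \rangle\}_j &= \{\langle \vec{u}_{1,i}, V_{1,i} \rangle\}_i \cup \{\langle \vec{u}_{2,j}, V_{2,j} \rangle\}_j \\
\{\langle \vec{u}_{1,i}, V_{1,i} \rangle\}_i \otimes \{\langle \vec{u}_{2,j}, V_{2,j} \rangle\}_j &= \bigcup_{i,j} \{\langle \vec{u}_{1,i} + \vec{u}_{2,j}, V_{1,i} \cup V_{2,j} \rangle\} \\
(\{\langle \vec{u}_i, V_i \rangle\}_i)^{\circledast} &= \{\langle \vec{0}, \textstyle\bigcup_i (\{\vec{u}_i\} \cup V_i) \rangle\} \qquad (13)
\end{aligned} \]

The semi-linear sets $0 \stackrel{\text{def}}{=} \emptyset$ and $1 \stackrel{\text{def}}{=} \{\langle \vec{0}, \emptyset \rangle\}$ are the identity elements for $\oplus$ and $\otimes$, respectively. We use $(\mathcal{SL}, \oplus)$ to denote the complete combine semilattice of semi-linear sets.

We define the GFA problem $\mathcal{G}_E = (G, \mathcal{SL})$ by giving the following interpretations to LIA+ operators:

\[ \begin{aligned}
\llbracket \mathrm{Plus} \rrbracket^\#_E(sl_1, sl_2) &= sl_1 \otimes sl_2 && (14) \\
\llbracket \mathrm{Num}(c) \rrbracket^\#_E &= \{\langle \langle c, \cdots, c \rangle, \emptyset \rangle\} && (15) \\
\llbracket \mathrm{Var}(x) \rrbracket^\#_E &= \{\langle \mu_E(x), \emptyset \rangle\} && (16) \\
\llbracket \mathrm{NegVar}(x) \rrbracket^\#_E &= \{\langle -\mu_E(x), \emptyset \rangle\} && (17)
\end{aligned} \]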
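The operators of Eqn. (13) and the interpretations in Eqns. (14)–(17) can be prototyped directly. The sketch below assumes a particular encoding that is not prescribed by the paper: a semi-linear set is a `frozenset` of pairs `(base, periods)`, where `base` is a tuple of integers and `periods` is a `frozenset` of such tuples. All function names are hypothetical.

```python
def linear(base, *periods):
    """Semi-linear set consisting of the single linear set <base, {periods}>."""
    return frozenset({(tuple(base), frozenset(tuple(p) for p in periods))})

def vec_add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def combine(sl1, sl2):
    """The combine operator of Eqn. (13): union of the two collections."""
    return sl1 | sl2

def extend(sl1, sl2):
    """The extend operator of Eqn. (13): add bases, union period sets, pairwise."""
    return frozenset(
        (vec_add(u1, u2), v1 | v2) for (u1, v1) in sl1 for (u2, v2) in sl2
    )

def star(sl, dim):
    """Kleene star of Eqn. (13): zero base; every base and period becomes a period."""
    gens = frozenset()
    for (u, v) in sl:
        gens = gens | {u} | v
    return frozenset({((0,) * dim, gens)})

def num(c, n_examples):   # Eqn. (15)
    return linear((c,) * n_examples)

def var(mu):              # Eqn. (16); mu is the vector of example inputs
    return linear(mu)

def neg_var(mu):          # Eqn. (17)
    return linear(tuple(-a for a in mu))
```

For instance, with $\mu_E(x) = \langle 1, 2 \rangle$, extending `var((1, 2))` with itself twice yields the singleton $\{\langle (3, 6), \emptyset \rangle\}$, and applying `star` to that value yields $\{\langle (0, 0), \{(3, 6)\} \rangle\}$. Note that equality in this encoding is syntactic, so two differently written semi-linear sets with the same concretization compare as unequal.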

Now consider the combine-over-all-derivations value $m_{\mathcal{G}_E}(X) = \bigoplus_{e \in L_G(X)} \llbracket e \rrbracket^\#_E$ for the grammar-flow-analysis problem $\mathcal{G}_E$. For an arbitrary tree $e \in L_G(X)$, in the computation of $\llbracket e \rrbracket^\#_E$ via Eqns. (14)–(17), there is never any use of the $\oplus$ operation of $\mathcal{SL}$. Consequently, the computation of $\llbracket e \rrbracket^\#_E$ produces a semi-linear set that consists of a single vector: the same vector, in fact, that is produced by the computation of $\llbracket e \rrbracket_E$ shown in Ex. 3.6. In particular, $\oplus$ two lines above Eqn. (13) preserves singleton sets, and hence for singleton sets, $\otimes$ one line above Eqn. (13) emulates $\llbracket \mathrm{Plus} \rrbracket_E$. Therefore, the combine-over-all-derivations value $m_{\mathcal{G}_E}(X) = \bigoplus_{e \in L_G(X)} \llbracket e \rrbracket^\#_E$ is exactly the set of vectors $\{\llbracket e \rrbracket_E \mid e \in L_G(X)\}$. In other words, $m_{\mathcal{G}_E}(X)$ is an exact abstraction of the $\llbracket \cdot \rrbracket_E$ semantics of the terms in $L_G(X)$, i.e., $\gamma(m_{\mathcal{G}_E}(X)) = \{\llbracket e \rrbracket_E \mid e \in L_G(X)\}$. Because $\llbracket \mathrm{Plus} \rrbracket^\#_E$ is infinitely distributive over $\oplus$ ([10, Defn. 2.1 and §2.3.3]), $m_{\mathcal{G}_E}(X) = n_{\mathcal{G}_E}(X)$ holds by Thm. 4.3, and thus we can compute $m_{\mathcal{G}_E}(X)$ by solving a set of equations in which, for each $X_0 \in N$, there is an equation of the form

\[ n_{\mathcal{G}_E}(X_0) = \bigoplus_{X_0 \to g(X_1, \ldots, X_k) \in \delta} \llbracket g \rrbracket^\#_E(n_{\mathcal{G}_E}(X_1), \ldots, n_{\mathcal{G}_E}(X_k)). \tag{18} \]

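Example 5.5. Consider again the LIA+ grammar $G_1$ from Eqn. (1), written out in the expanded form given in footnote 1. Let $E$ be $\{1, 2\}$, and thus $\mu_E(x) = \langle 1, 2 \rangle$. The equation system for the GFA problem $\mathcal{G}_{1E}$ is as follows:

\[ \begin{aligned}
n_{\mathcal{G}_{1E}}(\mathit{Start}) &= n_{\mathcal{G}_{1E}}(S1) \otimes n_{\mathcal{G}_{1E}}(\mathit{Start}) \oplus \{\langle (0, 0), \emptyset \rangle\} \\
n_{\mathcal{G}_{1E}}(S1) &= n_{\mathcal{G}_{1E}}(S2) \otimes \{\langle (1, 2), \emptyset \rangle\} \\
n_{\mathcal{G}_{1E}}(S2) &= n_{\mathcal{G}_{1E}}(S3) \otimes \{\langle (1, 2), \emptyset \rangle\} \\
n_{\mathcal{G}_{1E}}(S3) &= \{\langle (1, 2), \emptyset \rangle\}
\end{aligned} \]

which has the solution

\[ \begin{aligned}
n_{\mathcal{G}_{1E}}(\mathit{Start}) &= \{\langle (0, 0), \{(3, 6)\} \rangle\} & n_{\mathcal{G}_{1E}}(S2) &= \{\langle (2, 4), \emptyset \rangle\} \\
n_{\mathcal{G}_{1E}}(S1) &= \{\langle (3, 6), \emptyset \rangle\} & n_{\mathcal{G}_{1E}}(S3) &= \{\langle (1, 2), \emptyset \rangle\}.
\end{aligned} \]

The concretizations of semi-linear sets in the solution are

\[ \begin{aligned}
\gamma(n_{\mathcal{G}_{1E}}(\mathit{Start})) &= \{(0, 0) + \lambda (3, 6) \mid \lambda \in \mathbb{N}\} & \gamma(n_{\mathcal{G}_{1E}}(S1)) &= \{(3, 6)\} \\
\gamma(n_{\mathcal{G}_{1E}}(S2)) &= \{(2, 4)\} & \gamma(n_{\mathcal{G}_{1E}}(S3)) &= \{(1, 2)\}.
\end{aligned} \]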

The following proposition shows that the equations generated in Eqn. (18) can be solved using Newton's method.

Proposition 5.6. $(\mathcal{SL}, \oplus, \otimes, 0, 1)$ is a commutative, idempotent, $\omega$-continuous semiring.


For a semi-linear set $sl = \{\langle \vec{u}_i, V_i \rangle\}_i$, let its size be $\sum_i (|V_i| + 1)$. Given an LIA grammar $G$, a finite set of examples $E$, and a nonterminal $X \in N$, the semi-linear set $n_{\mathcal{G}_E}(X)$ yielded by NPA can contain exponentially many linear sets [15].

5.4 Checking Unrealizability

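We now show how symbolic concretization for $\mathcal{SL}$ can be used to prove that no element $\vec{o}$ in $n_{\mathcal{G}}(\mathit{Start})$ satisfies the specification $\psi^E(\vec{o})$ of the SyGuS problem. The logic LIA supports symbolic concretization for $\mathcal{SL}$. For instance, for a linear set $\langle \vec{u}, \{\vec{v}_1, \ldots, \vec{v}_n\} \rangle$, its symbolic concretization $\gamma(\langle \vec{u}, \{\vec{v}_1, \ldots, \vec{v}_n\} \rangle, \vec{o})$ is defined as follows:

\[ \exists \lambda_1 \in \mathbb{N}, \ldots, \lambda_n \in \mathbb{N}.\ (\vec{o} = \vec{u} + \lambda_1 \vec{v}_1 + \cdots + \lambda_n \vec{v}_n). \]

Thus, the symbolic concretization for a semi-linear set is:

\[ \gamma(\{\langle \vec{u}_i, V_i \rangle\}_i, \vec{o}) \stackrel{\text{def}}{=} \bigvee_i \gamma(\langle \vec{u}_i, V_i \rangle, \vec{o}). \tag{19} \]

Note that $\vec{o}$ is shared among all disjuncts.

Our decidability result follows directly from Thm. 4.5.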
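The shape of Eqn. (19) can be mimicked without an SMT solver by a bounded search over the coefficients $\lambda_k$. The sketch below is a brute-force stand-in and not a decision procedure: the `bound` parameter is an assumption, and a negative answer only means that no witness exists with every coefficient at most `bound`. The encoding of a semi-linear set as a list of `(base, periods)` pairs is also an assumption.

```python
from itertools import product

def in_linear_set(o, base, periods, bound):
    """Search lambda in {0..bound}^n such that o = base + sum(lambda_k * v_k)."""
    for lams in product(range(bound + 1), repeat=len(periods)):
        point = tuple(
            b + sum(lam * v[d] for lam, v in zip(lams, periods))
            for d, b in enumerate(base)
        )
        if point == tuple(o):
            return True
    return False

def in_semilinear_set(o, sl, bound):
    """Disjunction over the linear sets; the same o is tested in every disjunct."""
    return any(in_linear_set(o, base, periods, bound) for (base, periods) in sl)
```

For the solution of Example 5.5, `sl = [((0, 0), [(3, 6)])]`, and the specification $f(x) = 2x + 2$ on $E = \{1, 2\}$ demands the output vector $(4, 6)$; the bounded search finds no witness for it, which agrees with the exact argument that $4$ is not a multiple of $3$.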

Theorem 5.7. Given an LIA SyGuS problem $sy$ and a finite set of examples $E$, it is decidable whether the SyGuS problem $sy^E$ is realizable.

6 Proving Unrealizability of CLIA SyGuS Problems with Examples

In this section, we instantiate the framework from §4 to obtain a decision procedure for realizability of SyGuS problems in conditional linear integer arithmetic (CLIA), where the specification is given by examples. The decision procedure follows the same steps as the one for LIA in §5. The main difference is a technique for solving equations generated from grammars that involve both Boolean and integer operations.

6.1 Conditional Linear Integer Arithmetic

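The grammar of all CLIA terms is the following:

\[ \begin{aligned}
T_{\mathbb{Z}} &::= \mathrm{IfThenElse}(T_{\mathbb{B}}, T_{\mathbb{Z}}, T_{\mathbb{Z}}) \mid \mathrm{Plus}(T_{\mathbb{Z}}, T_{\mathbb{Z}}) \mid \mathrm{Minus}(T_{\mathbb{Z}}, T_{\mathbb{Z}}) \mid \mathrm{Num}(c) \mid \mathrm{Var}(x) \\
T_{\mathbb{B}} &::= \mathrm{And}(T_{\mathbb{B}}, T_{\mathbb{B}}) \mid \mathrm{Not}(T_{\mathbb{B}}) \mid \mathrm{LessThan}(T_{\mathbb{Z}}, T_{\mathbb{Z}})
\end{aligned} \]

where $c \in \mathbb{Z}$ is a constant and $x \in V$ is an input variable to the function being synthesized. Notice that the definitions of $T_{\mathbb{Z}}$ and $T_{\mathbb{B}}$ are mutually recursive. The example grammar presented in Eqn. (5) in §2 is a CLIA grammar.

We now define the semantics of CLIA terms. Given an integer vector $\vec{v} \in \mathbb{Z}^d$ and a Boolean vector $\vec{b} \in \mathbb{B}^d$, let $\mathrm{proj}_{\vec{\mathbb{Z}}}(\vec{v}, \vec{b})$ be the integer vector obtained by keeping the vector elements of $\vec{v}$ corresponding to the indices for which $\vec{b}$ is true, and zeroing out all other elements:

\[ \mathrm{proj}_{\vec{\mathbb{Z}}}(\langle u_1, \ldots, u_d \rangle, \langle b_1, \ldots, b_d \rangle) = \langle \text{if}(b_1) \text{ then } u_1 \text{ else } 0, \ldots, \text{if}(b_d) \text{ then } u_d \text{ else } 0 \rangle \]

The semantics of symbols that are not in LIA is as follows:

\[ \begin{aligned}
\llbracket \mathrm{IfThenElse} \rrbracket_E(\vec{b}, \vec{v}_1, \vec{v}_2) &= \mathrm{proj}_{\vec{\mathbb{Z}}}(\vec{v}_1, \vec{b}) + \mathrm{proj}_{\vec{\mathbb{Z}}}(\vec{v}_2, \neg\vec{b}) \\
\llbracket \mathrm{Not} \rrbracket_E(\vec{b}) &= \neg\vec{b} \\
\llbracket \mathrm{And} \rrbracket_E(\vec{b}_1, \vec{b}_2) &= \vec{b}_1 \wedge \vec{b}_2 \\
\llbracket \mathrm{LessThan} \rrbracket_E(\vec{v}_1, \vec{v}_2) &= \vec{v}_1 < \vec{v}_2
\end{aligned} \]

where the operations $+$, $\wedge$, $<$, and $\neg$ are performed element-wise, e.g., $\vec{u} < \vec{v} = \langle b_1, \ldots, b_n \rangle$ such that $b_i \Leftrightarrow u_i < v_i$.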
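The element-wise semantics above translates almost literally into code. The following sketch represents integer and Boolean vectors as Python tuples with one component per example; the function names are chosen for this illustration and do not come from the paper.

```python
def proj_z(v, b):
    """Keep v[i] where b[i] is true; zero out every other position."""
    return tuple(x if keep else 0 for x, keep in zip(v, b))

def neg(b):
    return tuple(not x for x in b)

def conj(b1, b2):
    return tuple(x and y for x, y in zip(b1, b2))

def less_than(v1, v2):
    return tuple(x < y for x, y in zip(v1, v2))

def plus(v1, v2):
    return tuple(x + y for x, y in zip(v1, v2))

def if_then_else(b, v1, v2):
    """Semantics of IfThenElse: proj(v1, b) + proj(v2, not b), element-wise."""
    return plus(proj_z(v1, b), proj_z(v2, neg(b)))
```

With two examples, `if_then_else((True, False), (1, 2), (10, 20))` selects the first component from the then-branch and the second from the else-branch.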

Similarly to what we did in §5.2, any CLIA grammar $G$ can be rewritten into an equivalent CLIA+ grammar $h(G)$ that does not contain any occurrences of Minus, but may contain the symbol NegVar.

The rest of the section is organized as follows. First, we present the abstract domains used to represent Boolean and integer terms (§6.2). Second, we show how to compute an exact abstraction of Boolean nonterminals in grammars without IfThenElse (§6.3). Third, we show how to solve SyGuS problems with CLIA grammars containing arbitrary operators, in particular IfThenElse and mutual recursion (§6.4).

6.2 Abstract Semantics for CLIA

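We use sets of Boolean vectors as the abstract domain for Boolean nonterminals, and semi-linear sets as the abstract domain for integer nonterminals. We use $\vec{b}$ to denote a Boolean vector and $bset$ to denote sets of Boolean vectors.

Given a semi-linear set $sl \in \mathcal{SL}$ and a Boolean vector $\vec{b} \in \mathbb{B}^d$, let $\mathrm{proj}_{\mathcal{SL}}(sl, \vec{b})$ be the semi-linear set obtained by zeroing out elements at all index positions for which $\vec{b}$ is false:

\[ \begin{aligned}
\mathrm{proj}_{\mathcal{SL}}(\{\langle \vec{u}_i, \Omega_i \rangle\}_i, \vec{b}) &= \{\mathrm{proj}_{S}(\langle \vec{u}_i, \Omega_i \rangle, \vec{b})\}_i \\
\mathrm{proj}_{S}(\langle \vec{u}, \{\vec{v}_1, \ldots, \vec{v}_n\} \rangle, \vec{b}) &= \langle \mathrm{proj}_{\vec{\mathbb{Z}}}(\vec{u}, \vec{b}), \{\mathrm{proj}_{\vec{\mathbb{Z}}}(\vec{v}_i, \vec{b})\}_i \rangle
\end{aligned} \]

Next, we lift the concrete semantics to semi-linear sets and define the abstract semantics of CLIA operators.

\[ \begin{aligned}
\llbracket \mathrm{IfThenElse} \rrbracket^\#_E(bset, sl_1, sl_2) &= \bigoplus_{\vec{b} \in bset} \mathrm{proj}_{\mathcal{SL}}(sl_1, \vec{b}) \otimes \mathrm{proj}_{\mathcal{SL}}(sl_2, \neg\vec{b}) \\
\llbracket \mathrm{LessThan} \rrbracket^\#_E(sl_1, sl_2) &= \{\vec{v}_1 < \vec{v}_2 \mid \vec{v}_1 \in sl_1, \vec{v}_2 \in sl_2\} \\
\llbracket \mathrm{Not} \rrbracket^\#_E(bset) &= \bigcup_{\vec{b} \in bset} \{\neg\vec{b}\} \\
\llbracket \mathrm{And} \rrbracket^\#_E(bset_1, bset_2) &= \bigcup_{\vec{b}_1 \in bset_1,\ \vec{b}_2 \in bset_2} \{\vec{b}_1 \wedge \vec{b}_2\}
\end{aligned} \]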

Operationally, the semantics of the LessThan symbol can be implemented using an SMT solver. As shown in §5.4, a semi-linear set $sl$ can be symbolically concretized as a formula $\gamma(sl, \vec{o})$ in LIA (a decidable SMT theory). Therefore, the set $\llbracket \mathrm{LessThan} \rrbracket^\#_E(sl_1, sl_2) = bset$ can be computed by performing $2^{|E|}$ SMT queries, i.e., for every Boolean vector $\vec{b} = \langle b_1, \ldots, b_{|E|} \rangle$, we have that $\vec{b} \in bset$ iff the following formula is satisfiable: $\gamma(sl_1, \vec{o}_1) \wedge \gamma(sl_2, \vec{o}_2) \wedge \vec{b} = (\vec{o}_1 < \vec{o}_2)$.

Similarly to how we defined $\llbracket \cdot \rrbracket^\#_E$ for multisorted terms, we overload $\oplus$ as the union of sets of Boolean vectors, and define a multisorted semilattice $\mathcal{D}_{\mathrm{CLIA}^+} := (2^{\mathbb{B}} \uplus \mathcal{SL}, \oplus)$ over sets of Boolean vectors and semi-linear sets. We use $\mathcal{G}^{\mathrm{CLIA}^+}_E := (G, \mathcal{D}_{\mathrm{CLIA}^+})$ to denote the GFA problem for a CLIA+ grammar $G$ and finitely many examples $E$. $\mathcal{G}^{\mathrm{CLIA}^+}_E$ is an exact abstraction of the semantics of CLIA+ grammars.
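The abstract Boolean operators of this section can be prototyped as below. This is a sketch under explicit assumptions: instead of issuing $2^{|E|}$ SMT queries, the LessThan operator enumerates a finite sample of each semi-linear set (all coefficients in `0..bound`), so in general it computes only a subset of the true $bset$. The encoding of semi-linear sets as lists of `(base, periods)` pairs and all function names are hypothetical.

```python
from itertools import product

def elements(sl, bound):
    """Finite sample of the concretization: every coefficient ranges over 0..bound."""
    out = set()
    for base, periods in sl:
        for lams in product(range(bound + 1), repeat=len(periods)):
            out.add(tuple(
                b + sum(lam * p[d] for lam, p in zip(lams, periods))
                for d, b in enumerate(base)
            ))
    return out

def less_than_abs(sl1, sl2, bound):
    """Bounded stand-in for the abstract LessThan (may miss Boolean vectors)."""
    return {
        tuple(x < y for x, y in zip(v1, v2))
        for v1 in elements(sl1, bound)
        for v2 in elements(sl2, bound)
    }

def not_abs(bset):
    return {tuple(not x for x in b) for b in bset}

def and_abs(bset1, bset2):
    return {
        tuple(x and y for x, y in zip(b1, b2))
        for b1 in bset1 for b2 in bset2
    }
```

For `sl1 = [((0, 0), [(3, 6)])]` and the constant `sl2 = [((2, 2), [])]`, only the coefficient $\lambda = 0$ makes both comparisons true, and every larger coefficient makes both false, so a small bound already recovers the full result for this instance.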


6.3 CLIA Equations without Mutual Recursion

A CLIA grammar $G$ contains Boolean and integer nonterminals. A nonterminal $X$ is a Boolean nonterminal if $\llbracket X \rrbracket \in \mathbb{B}$, and is an integer nonterminal if $\llbracket X \rrbracket \in \mathbb{Z}$. In this subsection, we assume that there exists no mutual recursion, i.e., $G$ contains no IfThenElse productions. Under this assumption, the only operator that connects Boolean nonterminals and integer nonterminals is LessThan, and hence no Boolean nonterminal appears in the productions of an integer nonterminal. Therefore, we can proceed by first solving the equations that involve integer nonterminals, using the technique presented in §5.1, and then plugging the corresponding values into the equations that involve Boolean nonterminals. After this step, we are left with a set of equations $eqs_{\mathbb{B}}$ that involve only Boolean nonterminals and Boolean symbols. Because the domain of sets of Boolean vectors is finite, the least fixed point of $eqs_{\mathbb{B}}$ can be found using an algorithm that iteratively computes finer under-approximations of $n_{\mathcal{G}^{\mathrm{CLIA}^+}_E}$ as $n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}$ (i.e., the under-approximation at iteration $k$) until it reaches the least fixed point, which, by Thm. 4.3, is an exact abstraction. This algorithm terminates in at most $2^{|E|} |N_{\mathbb{B}}|$ iterations because the set of Boolean vectors has size at most $2^{|E|}$, and each iteration adds at least one Boolean vector to one of the variables until the least fixed point is reached.
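The iteration over $eqs_{\mathbb{B}}$ can be sketched as a plain fixed-point loop over sets of Boolean vectors. The grammar used here (a single Boolean nonterminal `B` with productions LessThan, Not, and And) and the precomputed LessThan value are assumptions made for illustration, as are the function names.

```python
def solve_boolean(equations):
    """Least fixed point of monotone equations over sets of Boolean vectors."""
    values = {b: frozenset() for b in equations}
    iterations = 0
    while True:
        new = {b: frozenset(rhs(values)) for b, rhs in equations.items()}
        if new == values:
            return values, iterations
        values, iterations = new, iterations + 1

def not_abs(bset):
    return {tuple(not x for x in b) for b in bset}

def and_abs(bset1, bset2):
    return {tuple(x and y for x, y in zip(b1, b2)) for b1 in bset1 for b2 in bset2}

# Hypothetical grammar with |E| = 2:  B ::= LessThan(...) | Not(B) | And(B, B).
# The value of the LessThan production is assumed to have been computed
# already from the solved integer nonterminals.
less_than_value = frozenset({(True, False)})
equations = {
    "B": lambda n: less_than_value | not_abs(n["B"]) | and_abs(n["B"], n["B"]),
}
solution, iterations = solve_boolean(equations)
```

In this instance each round adds exactly one new Boolean vector, so the loop makes four value-changing rounds, which matches the bound $2^{|E|} |N_{\mathbb{B}}| = 4$ stated above.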

6.4 CLIA Equations with Mutual Recursion

We have seen how to compute exact abstractions for grammars without mutual recursion, for both integer (§5.3) and Boolean (§6.3) nonterminals. In this section, we show how to handle grammars that involve IfThenElse symbols, which introduce mutual recursion between Boolean and integer nonterminals. See Eqn. (7) in §2 for an example of equations that involve mutual recursion. To solve mutually recursive equations, we cannot simply compute the abstraction for one type and use the corresponding values to compute the abstraction for the other type, like we did in §6.3. However, we show that if we repeat such substitutions in an iterative fashion, we obtain an algorithm SolveMutual that computes an exact abstraction for a grammar with mutual recursion.

At the $k$-th iteration, for every nonterminal $X$, the algorithm computes an under-approximation $n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X)$ of $n_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X)$. Initially, $n^{-1}_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X) = 0$ for all nonterminals $X$ of type $\mathbb{Z}$. At iteration $k \geq 0$ the algorithm does the following:

Step 1. Replace each integer nonterminal $Z$ with the value $n^{k-1}_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(Z)$ from iteration $k-1$ and use the technique in §6.3 to compute $n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(B)$ for each Boolean nonterminal $B$.

Step 2. Replace each Boolean nonterminal $B$ with the value $n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(B)$ from Step 1 and compute $n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(Z)$ for each integer nonterminal $Z$ (see Eqn. (8) in §2 for an example).

The equations obtained at Step 2 only contain integer nonterminals, but they may contain IfThenElse symbols for which the abstract semantics is not directly supported by the equation-solving technique presented in §5.1. In the rest of this section, we present a way to transform the given set of equations into a new set of equations that faithfully describes the abstract semantics of IfThenElse symbols, using only $\otimes$ and $\oplus$ operations over semi-linear sets.

The iterative algorithm SolveMutual is guaranteed to terminate in $|N| 2^{|E|}$ iterations.
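The alternation between Step 1 and Step 2 can be sketched as a small driver loop. Several things here are assumptions rather than the paper's algorithm: the stopping test (stop once the Boolean values no longer change), the toy grammar, the use of one example so that vectors degenerate to scalars, and the use of finite sets of integers in place of semi-linear sets.

```python
def solve_mutual(solve_boolean_given, solve_integer_given, int_bottom):
    """Alternate Step 1 and Step 2 until the Boolean values stop changing."""
    n_int, n_bool = int_bottom, None
    while True:
        new_bool = solve_boolean_given(n_int)    # Step 1: integer values fixed
        if new_bool == n_bool:
            return n_bool, n_int
        n_bool = new_bool
        n_int = solve_integer_given(n_bool)      # Step 2: Boolean values fixed

# Toy instance:  Z ::= Num(0) | IfThenElse(B, Num(5), Z)    B ::= LessThan(Z, Num(3))
def solve_boolean_given(n_int):
    return {"B": frozenset(z < 3 for z in n_int["Z"])}

def solve_integer_given(n_bool):
    z = frozenset({0})
    while True:  # inner fixed point over the (finite) integer values
        new = z | frozenset(5 if b else old for b in n_bool["B"] for old in z)
        if new == z:
            return {"Z": z}
        z = new

n_bool, n_int = solve_mutual(
    solve_boolean_given, solve_integer_given, {"Z": frozenset()}
)
```

In the toy run, the first round yields only the constant 0, the second round discovers that the guard can be true and adds 5, and the third round discovers that the guard can also be false, after which nothing changes.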

$\llbracket \mathrm{IfThenElse} \rrbracket^\#_E$ using Semi-Linear-Set Operations. In this section, we show how to solve equations that involve IfThenElse symbols. Recall the definition of the abstract semantics of IfThenElse symbols:

\[ \llbracket \mathrm{IfThenElse} \rrbracket^\#_E(bset, sl_1, sl_2) = \bigoplus_{b \in bset} \mathrm{proj}_{\mathcal{SL}}(sl_1, b) \otimes \mathrm{proj}_{\mathcal{SL}}(sl_2, \neg b) \]

In the rest of this section, we show how equations that involve the semantics of IfThenElse symbols can be rewritten into equations that involve only $\oplus$ and $\otimes$ operations, so that they can be solved using Newton's method. For every possible Boolean vector $b$, the new set of equations contains a new variable $n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X^b)$, so that the solution to the set of equations for this variable is $\mathrm{proj}_{\mathcal{SL}}(n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X), b)$.

Let $eqs$ be a set of equations over a set of integer nonterminals $N$. We write $x/y$ to denote the substitution of every occurrence of $x$ with $y$. We generate a set of equations $eqs'$ over the set of variables $N^{\mathbb{B}^d}$ as follows. For every equation $n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X) = \bigoplus_i \alpha_i$ in $eqs$ and $b \in \mathbb{B}^d$, there exists an equation $n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X^b) = \bigoplus_i \pi_b(\alpha_i)$ in $eqs'$, where $\pi_b$ applies the following substitutions in this order:

1. For every $X \in N$ and $b' \in \mathbb{B}^d$, $\pi_b$ applies the substitution $\mathrm{proj}_{\mathcal{SL}}(n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X), b') / n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X^{b \wedge b'})$.

2. For every $X \in N$, $\pi_b$ applies $n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X) / n^k_{\mathcal{G}^{\mathrm{CLIA}^+}_E}(X^b)$.

3. For any semi-linear set $sl$ appearing in $eqs$, $\pi_b$ applies the substitution $sl / \mathrm{proj}_{\mathcal{SL}}(sl, b)$. Because $sl$ is a constant, this substitution yields a constant semi-linear set.
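Substitution 1 relies on the fact that projecting twice is the same as projecting once with the conjunction of the two Boolean vectors. The sketch below checks this on a small instance. The encoding (a semi-linear set as a `frozenset` of `(base, periods)` pairs of tuples) and the function names are assumptions made for this example.

```python
def proj_z(v, b):
    """Keep v[i] where b[i] is true; zero out every other position."""
    return tuple(x if keep else 0 for x, keep in zip(v, b))

def proj_sl(sl, b):
    """Project the base and every period of each linear set in sl."""
    return frozenset(
        (proj_z(u, b), frozenset(proj_z(v, b) for v in periods))
        for (u, periods) in sl
    )

def conj(b1, b2):
    return tuple(x and y for x, y in zip(b1, b2))

sl = frozenset({((0, 0), frozenset({(3, 6)}))})
once = proj_sl(sl, (True, False))
twice = proj_sl(proj_sl(sl, (False, True)), (True, True))
```

Here `once` is the set $\{(0, 0) + \lambda (3, 0)\}$, and `twice` agrees with a single projection by $(t, t) \wedge (f, t) = (f, t)$, which is why a projected variable can be renamed to $X^{b \wedge b'}$.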

Example 6.1. Figure 1 illustrates how Eqn. (8) is rewritten into Eqns. (9). We omit equations for variables $n^1_{2,E}(\mathit{Start}^{(f,f)})$ and $n^1_{2,E}(\mathit{Start}^{(t,f)})$ because they do not contribute to the solving of $n^1_{2,E}(\mathit{Start}^{(t,t)})$. After expanding the definition of $\llbracket \mathrm{IfThenElse} \rrbracket^\#$, we apply the substitutions to obtain Eqns. (9). Substitution 2 is not applied because there are no variables of the form $n^1_{2,E}(X)$ after applying substitution 1.

6.5 Checking Unrealizability

Using the symbolic-concretization technique described in §5.4, and the complexities described throughout this section, we obtain the following decidability theorem.


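\[ n^1_{2,E}(\mathit{Start}) = \llbracket \mathrm{IfThenElse} \rrbracket^\#_E(\{(t,f)\}, \{(0,0) + \lambda(3,6)\}, n^1_{2,E}(\mathit{Start})) \oplus \{(0,0) + \lambda(2,4)\} \oplus \{(0,0) + \lambda(3,6)\} \]

⇓ Generate equations for $\mathit{Start}^b$

\[ \begin{aligned}
n^1_{2,E}(\mathit{Start}^{(t,t)}) &= \pi_{t,t}\big(\llbracket \mathrm{IfThenElse} \rrbracket^\#_E(\{(t,f)\}, \{(0,0) + \lambda(3,6)\}, n^1_{2,E}(\mathit{Start}))\big) \oplus \pi_{t,t}(\{(0,0) + \lambda(2,4)\}) \oplus \pi_{t,t}(\{(0,0) + \lambda(3,6)\}) \\
n^1_{2,E}(\mathit{Start}^{(f,t)}) &= \pi_{f,t}\big(\llbracket \mathrm{IfThenElse} \rrbracket^\#_E(\{(t,f)\}, \{(0,0) + \lambda(3,6)\}, n^1_{2,E}(\mathit{Start}))\big) \oplus \pi_{f,t}(\{(0,0) + \lambda(2,4)\}) \oplus \pi_{f,t}(\{(0,0) + \lambda(3,6)\})
\end{aligned} \]

⇓ Expand definition of $\llbracket \mathrm{IfThenElse} \rrbracket^\#$

\[ \begin{aligned}
n^1_{2,E}(\mathit{Start}^{(t,t)}) &= \pi_{t,t}\big(\mathrm{proj}_{\mathcal{SL}}(\{(0,0) + \lambda(3,6)\}, (t,f))\big) \otimes \pi_{t,t}\big(\mathrm{proj}_{\mathcal{SL}}(n^1_{2,E}(\mathit{Start}), (f,t))\big) \oplus \pi_{t,t}(\{(0,0) + \lambda(2,4)\}) \oplus \pi_{t,t}(\{(0,0) + \lambda(3,6)\}) \\
n^1_{2,E}(\mathit{Start}^{(f,t)}) &= \pi_{f,t}\big(\mathrm{proj}_{\mathcal{SL}}(\{(0,0) + \lambda(3,6)\}, (t,f))\big) \otimes \pi_{f,t}\big(\mathrm{proj}_{\mathcal{SL}}(n^1_{2,E}(\mathit{Start}), (f,t))\big) \oplus \pi_{f,t}(\{(0,0) + \lambda(2,4)\}) \oplus \pi_{f,t}(\{(0,0) + \lambda(3,6)\})
\end{aligned} \]

⇓ Apply $\mathrm{proj}_{\mathcal{SL}}$ to constants; apply substitution 1

\[ \begin{aligned}
n^1_{2,E}(\mathit{Start}^{(t,t)}) &= \pi_{t,t}(\{(0,0) + \lambda(3,0)\}) \otimes n^1_{2,E}(\mathit{Start}^{(t,t) \wedge (f,t)}) \oplus \pi_{t,t}(\{(0,0) + \lambda(2,4)\}) \oplus \pi_{t,t}(\{(0,0) + \lambda(3,6)\}) \\
n^1_{2,E}(\mathit{Start}^{(f,t)}) &= \pi_{f,t}(\{(0,0) + \lambda(3,0)\}) \otimes n^1_{2,E}(\mathit{Start}^{(f,t) \wedge (f,t)}) \oplus \pi_{f,t}(\{(0,0) + \lambda(2,4)\}) \oplus \pi_{f,t}(\{(0,0) + \lambda(3,6)\})
\end{aligned} \]

⇓ Apply substitution 3

\[ \begin{aligned}
n^1_{2,E}(\mathit{Start}^{(t,t)}) &= \{(0,0) + \lambda(3,0)\} \otimes n^1_{2,E}(\mathit{Start}^{(f,t)}) \oplus \{(0,0) + \lambda(2,4)\} \oplus \{(0,0) + \lambda(3,6)\} \\
n^1_{2,E}(\mathit{Start}^{(f,t)}) &= \{(0,0) + \lambda(0,0)\} \otimes n^1_{2,E}(\mathit{Start}^{(f,t)}) \oplus \{(0,0) + \lambda(0,4)\} \oplus \{(0,0) + \lambda(0,6)\}
\end{aligned} \]

Figure 1. Rewriting Eqn. (8) into Eqns. (9).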

Theorem 6.2. Given a CLIA SyGuS problem $sy$ and a finite set of examples $E$, it is decidable whether the SyGuS problem $sy^E$ is (un)realizable.

7 Implementation

We implemented a tool nay that can return two-sided answers to unrealizability problems of the form $sy = (\psi, G)$. When it returns unrealizable, no term in $L(G)$ satisfies $\psi$; when it returns realizable, some $e \in L(G)$ satisfies $\psi$; nay can also time out. nay consists of three components: 1) a verifier (the SMT solver CVC4 [3]), which verifies the correctness of candidate solutions and produces counterexamples, 2) a synthesizer (ESolver, the enumerative solver introduced in [2]), which synthesizes solutions from examples, and 3) an unrealizability verifier, which proves whether the problem is unrealizable on the current set of examples.

Alg. 2 shows nay's CEGIS loop. Given a SyGuS problem $sy = (\psi, G)$, nay first initializes $E$ with a random input example with values in the range $[-50, 50]$ (line (1)), and then, in parallel, (1) calls ESolver to find a solution of $sy^E$ (line (4)), and (2) uses grammar flow analysis (Alg. 1) to decide whether $sy^{E \cup E_r}$ is unrealizable (line (11)), where $E_r$ is a set of randomly generated temporary examples. Randomly generated examples are used when the problem is proven to be realizable by GFA, but we do not have a candidate solution $e^*$ (ESolver has not returned yet) that can be used to issue an SMT query to possibly obtain a counterexample. During each CEGIS iteration, the following three events can happen: 1) If GFA returns unrealizable, nay terminates and outputs unrealizable (line (16)). 2) If GFA returns realizable, nay adds a temporary random example to $E_r$ (line (18)), and reruns GFA with $E \cup E_r$. 3) If ESolver returns a candidate solution $e^*$, the problem $sy^E$ is realizable. (ESolver never uses the temporary random examples.) Therefore, nay kills the GFA process and then issues an SMT query to check if $e^*$ is a solution to the SyGuS problem $sy$ (line (6)): if not, nay adds a counterexample to $E$ (line (7)) and triggers the next CEGIS iteration; otherwise, nay returns $e^*$ as a solution to the given SyGuS problem $sy$ (line (10)).

nay currently has two modes: nayHorn and naySL. nayHorn implements the constrained-Horn-clauses technique for solving equations presented in §4.3, and uses Z3’s Horn-clause solver, Spacer [8], to solve the Horn clauses. naySL implements the decision procedures presented in §5 and §6 for solving LIA and CLIA problems. naySL also implements two optimizations: (i) naySL eagerly removes a linear set from a semi-linear set whenever it is trivially subsumed by another linear set; and (ii) naySL uses the optimization presented in the following paragraph.

Algorithm 2: CEGIS with random examples

Function: Nay(G, ψ)
Input: Grammar G, specification ψ

 1  i ← Random(−50, 50); set of examples E ← i
 2  while True do
 3      do in parallel
 4          ① e∗ ← ESolver(G, ψ, E)
 5             kill ②
 6             if ∃ i_cex. ¬ψ(⟦e∗⟧, i_cex) then
 7                 E ← E ∪ i_cex
 8                 continue
 9             else
10                 return e∗
11          ② Er ← ∅
12             while True do
13                 result ← CheckUnrealizable(G, ψ, E ∪ Er)
14                 if result = Unrealizable then
15                     kill ①
16                     return Unrealizable
17                 i ← Random(−50, 50)
18                 Er ← Er ∪ i
19                 continue
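The control flow of Alg. 2 can be illustrated with a much-simplified, sequential sketch. Everything in it is an assumption made for illustration: the search space is a finite list of Python functions rather than the language of a grammar, an exhaustive consistency check stands in for both ESolver and the GFA-based unrealizability verifier, a brute-force scan of a finite input domain stands in for the SMT query, and the temporary random examples Er are omitted because nothing runs in parallel. The name toy_cegis is hypothetical.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Term = Callable[[int], int]
Spec = Callable[[int, int], bool]  # spec(input, output) holds iff the output is acceptable


def toy_cegis(candidates: Sequence[Term], spec: Spec, input_domain: Sequence[int],
              seed: int = 0) -> Tuple[str, Optional[Term], List[int]]:
    """Sequential CEGIS over a finite toy search space (a sketch, not nay's loop)."""
    rng = random.Random(seed)
    examples = [rng.choice(list(input_domain))]  # line (1): one random example
    while True:
        # Stand-in for the unrealizability check: which terms agree with all examples?
        consistent = [t for t in candidates
                      if all(spec(x, t(x)) for x in examples)]
        if not consistent:
            # No term satisfies the spec on the examples, so none satisfies it everywhere.
            return "unrealizable", None, examples
        candidate = consistent[0]  # stand-in for the synthesizer's answer e*
        # Stand-in for the verifier: search the finite domain for a counterexample.
        cex = next((x for x in input_domain if not spec(x, candidate(x))), None)
        if cex is None:
            return "realizable", candidate, examples
        examples.append(cex)  # line (7): add the counterexample and iterate


# Toy search space: the terms a*x for a in -3..3.
space = [lambda x, a=a: a * x for a in range(-3, 4)]
domain = range(-50, 51)
unreal = toy_cegis(space, lambda x, y: y == 2 * x + 1, domain)
real = toy_cegis(space, lambda x, y: y == 2 * x, domain)
```

The loop terminates because each counterexample rules out at least the current candidate, so the set of consistent terms shrinks on every iteration. In nay the search space is infinite, which is why the exhaustive check has to be replaced by the equation-solving techniques of the earlier sections.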


Solving GFA Equations via Stratification. The n_G equations (Eqn. (11)) that arise in a GFA problem are amenable to the standard optimization technique of identifying “strata” of dependences among nonterminals, and solving the equations by finding values for nonterminals of lower strata first, working up to higher strata in an order that respects dependences among the equations.

This idea can be formalized in terms of the strongly connected components (SCCs) of a dependence graph, defined as follows: the nodes are the nonterminals of G; the edges represent the dependence of a left-hand-side nonterminal on a right-hand-side nonterminal. For instance, if G has the productions X0 → g(X1, X2) | h(X2, X3), then the dependence graph has three edges into node X0: X1 → X0, X2 → X0, and X3 → X0. There are three steps to finding an order in which to solve the equations:

• Find the SCCs of the dependence graph.
• Collapse each SCC into a single node, to form a directed acyclic graph (DAG).
• Find a topological order of the DAG.

The set of nonterminals associated with a given node of the DAG corresponds to one of the strata referred to earlier. The equation solver can work through the strata in any topological order of the DAG.
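The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not nay's implementation: it uses Tarjan's SCC algorithm, which emits the components of a graph in reverse topological order, so running it on the "depends on" relation (left-hand side to right-hand-side nonterminals) yields the strata with dependences first. The function name strata and the dictionary encoding of productions are assumptions made for the example.

```python
from typing import Dict, List


def strata(productions: Dict[str, List[List[str]]]) -> List[List[str]]:
    """Return the SCCs of the dependence graph, dependences before dependents.

    productions maps each left-hand-side nonterminal to its right-hand sides,
    each given as the list of nonterminals that occur in it.
    """
    # deps[X] = nonterminals that X's equation mentions (X depends on them).
    deps = {lhs: sorted({x for rhs in rhss for x in rhs})
            for lhs, rhss in productions.items()}
    for xs in list(deps.values()):
        for x in xs:
            deps.setdefault(x, [])  # nonterminals that only occur on a right-hand side

    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    order: List[List[str]] = []

    def visit(v: str) -> None:
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in deps[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of an SCC: pop the whole component
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            order.append(sorted(scc))

    for v in sorted(deps):
        if v not in index:
            visit(v)
    return order


# The example from the text: X0 -> g(X1, X2) | h(X2, X3).
example = strata({"X0": [["X1", "X2"], ["X2", "X3"]]})
# A grammar with a cycle between A and B.
cyclic = strata({"S": [["A"]], "A": [["B"], []], "B": [["A"], ["C"]], "C": [[]]})
```

For the example from the text, X1, X2, and X3 each form their own stratum and all precede the stratum of X0. In the cyclic grammar, A and B fall into a single stratum whose equations must be solved together, after C and before S.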

8 Evaluation

In this section, we evaluate the effectiveness and performance of naySL and nayHorn.⁵

Benchmarks. We perform our evaluation using 132 variants of the 60 CLIA benchmarks from the CLIA SyGuS competition track [2]. These benchmarks are the same ones used in the evaluation of the tool we compare against, nope [11], which like nay only supports LIA and CLIA SyGuS problems.

The benchmarks are divided into three categories, and arise from a tool used to synthesize terms in which a certain syntactic feature appears a minimal number of times [13]. LimitedPlus (resp. LimitedIf) contains 30 (resp. 57) benchmarks in which the grammar bounds the number of times a Plus (resp. IfThenElse) operator can appear in an expression tree to be one less than the number required to solve the original synthesis problem. LimitedConst contains 45 benchmarks that restrict what constants appear in the grammar. In each of the benchmarks, the grammar that specifies the search space generates infinitely many terms.

8.1 Effectiveness of nay

EQ 1. How effective is nay at proving unrealizability?

⁵ All the experiments were performed on an Intel Core i7 4.00GHz CPU, with 32GB of RAM. We used version 1.8 of CVC4 and commit d37c50e of ESolver. The timeout for each individual nay/ESolver call is set at 10 minutes.

Table 1. Performance of nay and nope for LimitedIf and LimitedPlus benchmarks.⁶ The table shows the number of nonterminals (|N|), productions (|δ|), and variables (|V|) in the problem grammar; the number of examples required to prove unrealizability (|E|); and the average running time of naySL, nayHorn, and nope. ✗ denotes a timeout.

               Grammar                    time (s)
Problem      |N|  |δ|  |V|   |E|    naySL  nayHorn     nope

LimitedPlus
guard1         7   24    3     2     0.24        ✗        ✗
guard2         9   34    3     3    12.86        ✗        ✗
guard3        11   41    3     1     0.07        ✗        ✗
guard4*       11   72    3   3.5   147.50        ✗        ✗
plane1         2    5    2     1     0.07     0.55     0.69
plane2        17   60    2   1.6     0.90        ✗        ✗
plane3        29  122    2   1.5    15.73        ✗        ✗
ite1*          7    2    3     2     1.05        ✗        ✗
ite2*          9   34    3     4   294.88        ✗        ✗
sum_2_5       11   40    2     4    15.48        ✗        ✗
search_2       5   16    3     3     1.21        ✗        ✗
search_3       7   25    4     4     2.65        ✗        ✗

LimitedIf
max2           1    5    2     4     0.13     1.13     1.48
max3           3   15    3     -        ✗     9.67    58.57
sum_2_5        1    5    2     3     0.17     0.61     0.69
sum_2_15       1    5    2     3     0.17     0.56     0.87
sum_3_5        3   15    3     -        ✗    17.85   101.44
sum_3_15       3   15    3     -        ✗    16.65   134.87
search_2       3   15    3     -        ✗    25.85   112.78
example1       3   10    2     3     0.14     0.73     1.12
guard1         1    6    2     4     0.13     0.44     0.43
guard2         1    6    2     4     0.22     0.33     0.49
guard3         1    6    2     4     0.16     0.27     0.46
guard4         1    6    2     4     0.11     0.72     0.58
ite1           3   15    3     -        ✗     2.68   369.57

We compare naySL and nayHorn against nope, the state-of-the-art tool for proving unrealizability of SyGuS problems [11]. For each benchmark, we run each tool 5 times on different random seeds, therefore generating different random sets of examples, and report whether a tool successfully terminated on at least one run. This process guarantees that all tools are evaluated on the same final example set that causes a problem to be unrealizable. Table 1 shows the results for the LimitedPlus and LimitedIf benchmarks that at least one of the three tools could solve. Because both tools use a CEGIS loop to produce input examples, only the last iteration of CEGIS is unrealizable. For naySL and nope, that iteration is the one that dominates the runtime. On average, it accounts for 60.4% of the running time for naySL and 90.3% for nope, but only 8.3% for nayHorn. (For nayHorn, counterexample generation is the most costly step.) The LimitedConst benchmarks could be solved by all tools, and results are given in the supplementary material.

⁶ We discovered that three of the benchmarks from [11] were actually realizable (marked with *). Because these benchmarks were created by bounding the number of Plus operators, we further reduced the bound by one to make them unrealizable.


Findings. naySL solved 70/132 benchmarks, with an average running time of 1.97s. nayHorn and nope solved identical sets of 59/132 benchmarks, with an average running time of 0.63s and 15.59s, respectively. All tools can solve all the LimitedConst benchmarks with similar performance. These benchmarks are easier than the other ones.

naySL can solve 11 LimitedPlus benchmarks that nope cannot solve. These benchmarks involve large grammars, a known weakness of nope (see [11]). In particular, naySL can handle grammars with up to 29 nonterminals, while nope can only handle grammars with up to 3 nonterminals. For 8 benchmarks, naySL only terminated for some of the random runs (certain random seeds triggered more CEGIS iterations, making the final problem harder for nay to solve).

nope solved 5 LimitedIf benchmarks that naySL cannot solve. nope solves these benchmarks using between 7 and 9 examples in the CEGIS loop. Because the size of the semi-linear sets computed by naySL depends heavily on the number of examples, naySL only solves benchmarks that require at most 4 examples. §8.2 analyzes the effect of the number of examples on naySL’s performance. When naySL terminated, it took 1 to 15 iterations (avg. 6.6) to find a fixed point for IfThenElse guards, and the final abstract domain of each guard contained 2 to 16 Boolean vectors (avg. 5.9). On average, the running time for computing semi-linear sets is 70.6% of the total running time. On the benchmarks that all tools solved, all tools terminated in less than 2s.

nayHorn and nope solved exactly the same set of benchmarks. This outcome is not surprising because nope uses SeaHorn, a verification solver based on Horn clauses that builds on Spacer, which is the constrained-Horn-clause solver used by nayHorn. nayHorn directly encodes the equation-solving problem, while nope reduces the unrealizability problem to a verification problem that is then translated into a potentially complex constrained-Horn-clause problem. For this reason, nayHorn is on average 19 times faster than nope. On benchmarks for which nope took more than 2 seconds, nayHorn is 82x faster than nope (computed as the geometric mean).

The reason we use random examples in Alg. 2 is that there is a trade-off between the size of solutions and the number of examples when we are proving the realizability of SyGuS-with-examples problems. On the one hand, ESolver is not affected by the number of examples, and can efficiently synthesize a solution when a small solution exists. On the other hand, the time required to prove realizability by naySL only depends on the size of grammars and the number of examples but not on the size of solutions. For the realizable SyGuS-with-examples problems produced during the CEGIS loop of our experiments, ESolver terminates on average in 1.9 seconds when there exists a solution with size no more than 10, but terminates on average in 54.5 seconds when there exists a solution with size greater than 10 (the largest solution has size 24). For the same problems, naySL could not prove realizability for problems with more than 5 examples, but it did prove realizability for 7 problems on which ESolver failed. On the problems both ESolver and naySL solved, ESolver is 87% faster than naySL, calculated as a geometric mean.

Figure 2. Time to compute semi-linear set vs. |N|. [Plot: x-axis |N| (5 to 25); y-axis SL time (s), log scale from 10^−1 to 10^2; one series each for |E| = 1, 2, 3, 4.]

To answer EQ 1: if both nay techniques are considered together, nay solved 11 benchmarks that nope did not solve, and was faster on the benchmarks that both tools solved.

8.2 The Cost of Proving Unrealizability

EQ 2. How do the size of the grammar and the number of examples affect the performance of different solvers?

Finding. First, consider naySL: when we fix the number of examples (different marks in Fig. 2), the time taken to compute the semi-linear set grows roughly exponentially in |N|. Also, the time grows roughly exponentially with respect to 2^{|E|}.

nayHorn and nope (shown in Fig. 3 and Fig. 4, respectively) can only solve benchmarks involving up to 3 nonterminals. When we fix the number of nonterminals, the running time of these two tools grows roughly exponentially with respect to the number of examples.

To answer EQ 2: the running time of naySL grows exponentially with respect to |N| · 2^{|E|}, and the running time of nayHorn and nope grows exponentially with respect to |E|.

8.3 Effectiveness of Grammar Stratification

EQ 3. Is the stratification optimization from §7 effective?

Finding. Using stratification, naySL can compute the semi-linear sets for 9 benchmarks for which naySL times out without the optimization. On benchmarks that take more than 1s to solve, the optimization results on average in a 3.1x speedup. To answer EQ 3: the grammar-stratification optimization is highly effective.


Figure 3. Running time of nayHorn vs. number of examples. [Plot: x-axis |E| (1 to 9); y-axis time (s), log scale from 10^−1 to 10^1; one series each for |N| = 1, 2, 3.]

9 Related Work

Unrealizability in SyGuS. Several SyGuS solvers compete in yearly SyGuS competitions [2], and can produce solutions to SyGuS problems when a solution exists. If the problem is unrealizable, these solvers only terminate if the language of the grammar is finite or contains finitely many functionally distinct programs, which is not the case in our benchmarks.

nope [11], the tool we compare against in §8, is the only tool that can prove unrealizability for non-trivial SyGuS problems. nope reduces the problem of proving unrealizability to one of proving unreachability in a recursive nondeterministic program, and uses off-the-shelf verifiers to solve the unreachability problem. Unlike nay, nope does not provide any insights into how we can devise specialized techniques for solving unrealizability, because nope reduces a constrained SyGuS problem to a full-fledged program-reachability problem. In contrast, the approach presented in this paper gives a characterization of unrealizability in terms of solving a set of equations. Using the equation-solving framework, we provided the first decision procedures for LIA and CLIA SyGuS problems over examples. Moreover, the equation-based approach allows us to use known equation-solving techniques, such as Newton’s method and constrained Horn clauses.

Unrealizability in Program Synthesis. For certain synthesis problems (e.g., reactive synthesis [4]), realizability is decidable. However, SyGuS is orthogonal to such problems.

Mechtaev et al. [16] propose to use unrealizability to prune irrelevant paths in symbolic-execution engines. The synthesis problems generated by Mechtaev et al. are not directly expressible in SyGuS. Moreover, these problems are decidable because they can be encoded as SMT formulas.

Figure 4. Running time of nope vs. number of examples. [Plot: x-axis |E| (1 to 9); y-axis time (s), log scale from 10^0 to 10^2; one series each for |N| = 1, 2, 3.]

Abstractions in Program Synthesis. SYNGAR [22] uses predicate abstraction to prune the search space of a synthesis-from-examples problem. Given an input example i and a regular-tree grammar A representing the search space, SYNGAR builds a new grammar A_α in which each nonterminal is a pair (q, a), where q is a nonterminal of A and a is a predicate of a predicate-abstraction domain α. Any term that can be derived from (q, a) is guaranteed to produce an output satisfying the predicate a when fed the input i. A_α is constructed iteratively by adding nonterminals in a bottom-up fashion; it is guaranteed to terminate because the set α is finite. SYNGAR can be viewed as a special case of our framework in which the set of values n_G(X) is based on predicate abstraction (see §4.3). SYNGAR’s approach is tied to finite abstract domains, while our equational approach extends to infinite domains (e.g., semi-linear sets) because it does not specify how the equations must be solved.

Acknowledgments

Supported, in part, by a gift from Rajiv and Ritu Batra; by ONR under grants N00014-17-1-2889 and N00014-19-1-2318; by NSF under grants 1763871 and 1750965; and by a Facebook fellowship. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. Opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors, and do not necessarily reflect the views of the sponsoring agencies.


References

[1] Rajeev Alur, Rastislav Bodik, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. 2013. Syntax-guided synthesis. In Formal Methods in Computer-Aided Design (FMCAD). IEEE, 1–8.

[2] Rajeev Alur, Dana Fisman, Rishabh Singh, and Armando Solar-Lezama. 2016. SyGuS-Comp 2016: results and analysis. arXiv preprint arXiv:1611.07627 (2016).

[3] Clark Barrett, Christopher L. Conway, Morgan Deters, Liana Hadarean, Dejan Jovanović, Tim King, Andrew Reynolds, and Cesare Tinelli. 2011. CVC4. In International Conference on Computer Aided Verification (CAV). Springer-Verlag, 171–177.

[4] Roderick Bloem. 2015. Reactive Synthesis. In Formal Methods in Computer-Aided Design (FMCAD) (Austin, Texas). 3–3.

[5] A. Bouajjani, J. Esparza, and T. Touili. 2003. A Generic Approach to the Static Analysis of Concurrent Programs with Procedures. In Princ. of Prog. Lang.

[6] Benjamin Caulfield, Markus N. Rabe, Sanjit A. Seshia, and Stavros Tripakis. 2015. What’s Decidable about Syntax-Guided Synthesis? arXiv preprint arXiv:1510.08393 (2015).

[7] P. Cousot and N. Halbwachs. 1978. Automatic Discovery of Linear Constraints Among Variables of a Program. In Princ. of Prog. Lang.

[8] Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (Budapest, Hungary) (TACAS’08/ETAPS’08). Springer-Verlag, Berlin, Heidelberg, 337–340.

[9] Hassan Eldib, Meng Wu, and Chao Wang. 2016. Synthesis of Fault-Attack Countermeasures for Cryptographic Circuits. In Computer Aided Verification, Swarat Chaudhuri and Azadeh Farzan (Eds.). Springer International Publishing, Cham, 343–363.

[10] Javier Esparza, Stefan Kiefer, and Michael Luttenberger. 2010. Newtonian program analysis. J. ACM 57, 6 (2010), 33:1–33:47.

[11] Qinheping Hu, Jason Breck, John Cyphert, Loris D’Antoni, and Thomas Reps. 2019. Proving Unrealizability for Syntax-Guided Synthesis. In International Conference on Computer Aided Verification (CAV). Springer-Verlag.

[12] Qinheping Hu and Loris D’Antoni. 2017. Automatic program inversion using symbolic transducers. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). 376–389.

[13] Qinheping Hu and Loris D’Antoni. 2018. Syntax-Guided Synthesis with Quantitative Syntactic Objectives. In Computer Aided Verification - 30th International Conference (CAV). 386–403.

[14] J.B. Kam and J.D. Ullman. 1977. Monotone Data Flow Analysis Frameworks. Acta Inf. 7, 3 (1977), 305–318.

[15] Eryk Kopczynski and Anthony Widjaja To. 2010. Parikh images of grammars: Complexity and applications. In 2010 25th Annual IEEE Symposium on Logic in Computer Science. IEEE, 80–89.

[16] Sergey Mechtaev, Alberto Griggio, Alessandro Cimatti, and Abhik Roychoudhury. 2018. Symbolic Execution with Existential Second-order Constraints. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 389–399.

[17] U. Möncke and R. Wilhelm. 1991. Grammar Flow Analysis. In Attribute Grammars, Applications and Systems (Int. Summer School SAGA). 151–186.

[18] Rohit J. Parikh. 1966. On Context-Free Languages. J. ACM 13, 4 (Oct. 1966), 570–581. https://doi.org/10.1145/321356.321364

[19] G. Ramalingam. 1996. Bounded Incremental Computation. Springer-Verlag.

[20] T. Reps, M. Sagiv, and G. Yorsh. 2004. Symbolic implementation of the best transformer. In VMCAI.

[21] M. Sharir and A. Pnueli. 1981. Two Approaches to Interprocedural Data Flow Analysis. In Program Flow Analysis: Theory and Applications. Prentice-Hall.

[22] Xinyu Wang, Isil Dillig, and Rishabh Singh. 2018. Program synthesis using abstraction refinement. PACMPL 2, POPL (2018), 63:1–63:30.
