
Logical and Combinatorial Characterizations of Consistency Algorithms

Robert Robere

Memorial University of Newfoundland

August 12, 2011

Abstract

Constraint satisfaction problems of bounded width have been the subject of intense study since being

introduced in Feder and Vardi’s seminal paper [FV98]. Intuitively, a constraint satisfaction problem

has bounded width if the existence of a local solution implies the existence of a global solution. In this

paper, we extend the results of Chen et al. in [CDG11] on certain subclasses of bounded width problems

(those definable using extensions of the arc consistency algorithm) to more closely match the treatment

of the classes of bounded width problems introduced in [CDK08]. We also develop and explore the power

of extended tree duality, which was introduced in [FV98], and determine its strength relative to the

aforementioned algorithms.

Acknowledgements

The road to this paper was paved with the help of two people — Antonina Kolokolova and Todd Wareham.

The enthusiasm, guidance, and friendship from both of them have been absolutely irreplaceable.

Todd is responsible for introducing me to research – and the thought of an academic career – in the first

place. Thank you for this, and your endless amount of advice.

Antonina is responsible for so much at this point that listing it all would be impossible. Thank you for

putting up with my ideas, for introducing me to the community, for reading every manuscript, for every

criticism, and for keeping the fun in research.

To Todd, thanks for getting me into the theory game. To Antonina, thanks for keeping me going.


1 Introduction

Consider the following problem: you are given a set of variables, each of which can take on a single value

from a fixed collection. However, some of these variables are subject to constraints — some sets of variables

are restricted to a certain set of values. We ask: can you find an assignment for each variable such that

every constraint is satisfied?

The constraint satisfaction problem (CSP) captures exactly this theoretical framework. Instances of the

CSP can be found in a multitude of mathematical fields, including artificial intelligence, logic, combinatorics,

and algebra. The CSP has also found widespread usage in industry as a convenient language for expressing

search problems. In fact, many people even solve CSPs recreationally — it would be a difficult task to find

a home without a Sudoku book.

When can one solve a constraint satisfaction problem efficiently? And, given that a particular CSP is

efficiently solvable, how would one go about solving it? These were two of the questions asked by Feder

and Vardi in their seminal paper [FV98]. It was here that the CSPs of bounded width were introduced.

Intuitively, a CSP has bounded width if a local solution for the problem can always be extended to a global

solution. Feder and Vardi defined this class of problems through Datalog, the database query language.

They also showed that if a CSP has bounded width, then it can be characterized in several other equivalent

ways, each interesting in its own right.

There is a class of efficient procedures that can be used to solve CSPs of bounded width: these are

the so-called consistency algorithms. These algorithms work by maintaining a list of possible mappings for

each variable in our instance, and incrementally enforcing the constraints on these mappings until the lists

stabilize. The arc consistency algorithm was the first of these to be introduced [Mac77], and its strength

has been rigorously characterized [DP99]. Chen et al. studied several consistency algorithms using arc

consistency as a subroutine, namely, look-ahead arc consistency, peek arc consistency, and singleton arc

consistency [CDG11]. Here, they introduced an algebraic characterization of the structures that these

algorithms accept and a hierarchy representing their relative power [CDG11].

1.1 The Many Lenses of CSPs

Feder and Vardi originally introduced a multi-pronged approach to classifying the complexity of bounded

width CSPs. They introduced Datalog to capture the class of all bounded-width CSPs, as well as classifying

them in several other combinatorial and logical ways [FV98]. Perhaps most importantly, they showed that

homomorphism dualities are a flexible tool for capturing different subclasses of bounded width problems.

A number of equivalent methods of defining bounded width problems have since evolved, including the

existence of winning strategies in certain pebble games and expressibility in different logics [BKL08]. Another

important method of classification is through the so-called algebraic approach to the CSP, pioneered by


Jeavons [Jea98]. Viewing CSPs under these many different lenses is both a testament to the problem’s

generality and a useful technical tool which introduces many lines of attack.

A beautiful example of defining classes of CSPs through these different lenses is the recent work of Krokhin

et al. [CDK08]. In this paper, they studied the smallest fragments of Datalog, and produced a multifaceted

analysis of the classes of CSPs solvable by these fragments. All in all, they showed that acceptance by these

small fragments of Datalog is equivalent to certain homomorphism dualities, the existence of homomorphisms

from particular structures, and the existence of several algebraic properties. It is with this rigour that we

would like to examine the consistency algorithms studied in [CDG11].

1.2 Our Results

In this article, we follow the lead of Krokhin et al. and match the development of the peek arc consistency

(PAC) algorithm with arc consistency. We introduce a Datalog fragment and a homomorphism duality

that each exactly characterize the class of structures that PAC accepts. Thus, our work can be viewed as

an extension of the work of Chen et al. on the classification of consistency algorithms that solve bounded

width CSPs [CDG11]. We also work in the opposite direction — we introduce an algorithm that exactly

characterizes extended tree duality, which was introduced by Feder and Vardi [FV98]. We determine the

relative strength of our new algorithm, extended arc consistency, with respect to the other consistency

algorithms discussed in [CDG11]. Finally, we compare the power of PAC with other known bounded width

subclasses, and show that it is not equivalent to the class of width two CSPs.

2 Preliminaries

2.1 Relational Structures

A vocabulary τ = {R1, R2, . . . , Rn} is a set of relation symbols, and each relation symbol has an associated

arity, which is a positive integer corresponding to the number of positions in each tuple of the relation.

A relational structure over a vocabulary τ is a tuple $\mathbf{A} = \langle A, R_1^{\mathbf{A}}, \ldots, R_n^{\mathbf{A}} \rangle$, where $A$ is an $m$-element set called the universe or domain of $\mathbf{A}$, and for each relation symbol $R_i$ of arity $k$ in τ, $\mathbf{A}$ has a $k$-ary relation $R_i^{\mathbf{A}} \subseteq A^k$. If we want to introduce a new relation $S$ into a structure $\mathbf{A}$, we will use the notation $[\mathbf{A}, S^{\mathbf{A}}]$,

and we will assume that the relational vocabulary for the expanded structure is expanded to include the

relational symbol S.

For a $k$-ary relation $R \subseteq A^k$, we will denote by $\pi_i(R)$ the projection of $R$ onto coordinate $i$, for $1 \le i \le k$.

In other words, $\pi_i(R) = \{t[i] \mid t \in R\}$.

Let $n \ge 1$ be an integer. We define the structure $\mathbf{A}^n$ as follows:


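• The domain of $\mathbf{A}^n$ is $A^n = \underbrace{A \times A \times \cdots \times A}_{n \text{ times}}$.

• For each $m$-ary relation symbol $R \in \tau$, $(\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_m) \in R^{\mathbf{A}^n}$ if and only if $(\mathbf{a}_1[i], \mathbf{a}_2[i], \ldots, \mathbf{a}_m[i]) \in R^{\mathbf{A}}$ for all $1 \le i \le n$.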

This structure will be called the n-fold product of A. We will also define the power structure of A, denoted

℘(A):

• The domain of $\wp(\mathbf{A})$ is $\wp(A) \setminus \{\emptyset\}$, the set of all non-empty subsets of $A$.

• For each $m$-ary relation symbol $R \in \tau$, $(S_1, S_2, \ldots, S_m) \in R^{\wp(\mathbf{A})}$ if and only if, for each $1 \le i \le m$ and for each $a \in S_i$, there exist $a_j \in S_j$ for each $j \ne i$ such that $(a_1, a_2, \ldots, a_{i-1}, a, a_{i+1}, \ldots, a_m) \in R^{\mathbf{A}}$.
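To make the definition of the power structure concrete, the following Python sketch enumerates it for a small structure. The function names (`nonempty_subsets`, `in_power_relation`, `power_structure`) and the encoding of a structure as a dictionary from relation names to sets of tuples are our own illustrative assumptions, not part of the original development. The sketch also assumes that every relation is non-empty, so that its arity can be read off one of its tuples.

```python
from itertools import combinations, product


def nonempty_subsets(universe):
    """Return all non-empty subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for size in range(1, len(items) + 1)
            for c in combinations(items, size)]


def in_power_relation(sets, relation):
    """Decide whether (S_1, ..., S_m) belongs to the power relation.

    Every element of every S_i must occur in coordinate i of some tuple
    of `relation` whose j-th entry lies in S_j for all j.
    """
    supported = [t for t in relation
                 if all(t[j] in sets[j] for j in range(len(sets)))]
    return all({t[i] for t in supported} == set(sets[i])
               for i in range(len(sets)))


def power_structure(universe, relations):
    """Build the power structure of a finite relational structure."""
    subsets = nonempty_subsets(universe)
    power = {}
    for name, rel in relations.items():
        arity = len(next(iter(rel)))  # assumption: rel is non-empty
        power[name] = {s for s in product(subsets, repeat=arity)
                       if in_power_relation(s, rel)}
    return subsets, power


# Hypothetical example: one symmetric edge on the universe {0, 1}.
subsets, power = power_structure({0, 1}, {"E": {(0, 1), (1, 0)}})
```

For this example the domain has three elements and the relation has three tuples, one of which is the loop ({0, 1}, {0, 1}). The enumeration is exponential in the size of the universe, so it is only meant for very small structures.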

For a structure $\mathbf{A}$, we say that two elements $a, a'$ are adjacent if there exists a triple $(R_l, i, j)$ such that there

exists a tuple $t$ in the relation $R_l^{\mathbf{A}}$ where $a = t[i]$ and $a' = t[j]$. We say that $a, a'$ are connected if there

exists a sequence of elements $a_1, a_2, \ldots, a_m$ such that $a = a_1$, $a_m = a'$, and $a_k, a_{k+1}$ are adjacent for

1 ≤ k ≤ m − 1. If every pair of elements in a relational structure is connected, then we will say that the

structure is connected. In general we will assume that relational structures are connected, but note that this

may not be the case once we consider power structures.

The incidence multigraph of a structure $\mathbf{A}$ (see [CDK08,NT00]), denoted $\mathrm{Inc}(\mathbf{A})$, is the bipartite multigraph with parts $A$ and $\mathrm{Block}(\mathbf{A})$, where $\mathrm{Block}(\mathbf{A})$ consists of all pairs $(R, \mathbf{a})$ such that $R$ is in the vocabulary τ and $\mathbf{a}$ is in $R^{\mathbf{A}}$. The edges $e_{a,i,Z}$ of the graph join $a$ in $A$ to $Z = (R, (a_1, \ldots, a_m))$ when $a_i = a$. We say that $\mathbf{A}$ is a τ-tree if its incidence multigraph is a tree (we will often call $\mathbf{A}$ a tree if the vocabulary is

understood from context). Note that the definition of a τ -tree collapses to the normal definition of a tree

when τ contains only a single binary relation symbol. The incidence multigraph is a generalization of graphs

to arbitrary relational structures, and will be of interest to us when we start speaking about Datalog and

dualities.

2.2 Homomorphisms, Constraint Satisfaction, and Polymorphisms

Let τ be a relational vocabulary. A homomorphism between two structures A and B on τ is a function h

mapping $A$ into $B$ such that the following holds: for each relation symbol $R \in \tau$, if $(a_1, a_2, \ldots, a_n)$ is a tuple in $R^{\mathbf{A}}$,

then $(h(a_1), h(a_2), \ldots, h(a_n))$ is in $R^{\mathbf{B}}$. If there is a homomorphism from $\mathbf{A}$ to $\mathbf{B}$,

we write $\mathbf{A} \to \mathbf{B}$.

For a relational structure B, the constraint satisfaction problem for B (denoted CSP(B)), is defined as

follows. Given an input structure A over the same vocabulary as B, is A homomorphic to B? Note that

this can be effectively described by the language CSP(B) = {A : A→ B}.
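As an illustration of CSP(B) as a homomorphism problem, the following Python sketch decides whether A → B by exhaustive search. The helper names `is_homomorphism` and `find_homomorphism` and the dictionary encoding of relations are assumptions made for this example only. The search tries every map from the universe of A to the universe of B, so it takes exponential time and serves purely as a reference point for the efficient procedures discussed later.

```python
from itertools import product


def is_homomorphism(h, rels_a, rels_b):
    """Check that the map h sends every tuple of A into the matching relation of B."""
    return all(tuple(h[x] for x in t) in rels_b[name]
               for name, tuples in rels_a.items()
               for t in tuples)


def find_homomorphism(dom_a, rels_a, dom_b, rels_b):
    """Return some homomorphism from A to B as a dict, or None if there is none."""
    dom_a = list(dom_a)
    dom_b = list(dom_b)
    for images in product(dom_b, repeat=len(dom_a)):
        h = dict(zip(dom_a, images))
        if is_homomorphism(h, rels_a, rels_b):
            return h
    return None


# Hypothetical inputs: a single symmetric edge as the target,
# a triangle and a 4-cycle as sources.
edge = {"E": {(0, 1), (1, 0)}}
triangle = {"E": {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}}
square = {"E": {(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2), (3, 0), (0, 3)}}

triangle_to_edge = find_homomorphism([0, 1, 2], triangle, [0, 1], edge)
square_to_edge = find_homomorphism([0, 1, 2, 3], square, [0, 1], edge)
```

Here `triangle_to_edge` is `None`, while `square_to_edge` is a proper assignment of the two values to the 4-cycle.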


It will also be useful to consider the complement of a constraint satisfaction problem over a relational

structure B: define co-CSP(B) = {A : A ↛ B}.

A homomorphism from $\mathbf{A}^n$ to $\mathbf{A}$ will be called a polymorphism. In general, we will prefer to use the following definition. Let $f : A^n \to A$ be an $n$-ary operation on $A$, and let $R^{\mathbf{A}}$ be an $m$-ary relation on $A$. We say that $R^{\mathbf{A}}$ is invariant under $f$ if, for any selection of $n$ tuples $t_1, t_2, \ldots, t_n$ in $R^{\mathbf{A}}$, with $t_i = (a_{i1}, a_{i2}, \ldots, a_{im})$, we have that

$$(f(a_{11}, a_{21}, \ldots, a_{n1}), f(a_{12}, a_{22}, \ldots, a_{n2}), \ldots, f(a_{1m}, a_{2m}, \ldots, a_{nm})) \in R^{\mathbf{A}}.$$

We say that f is a polymorphism of A if every relation in A is invariant under f . In [Jea98], Jeavons showed

that the complexity of CSP(B) is determined completely by the polymorphisms of the relational structure

B. This result arguably spurred the development of the algebraic approach to the CSP that has been so

successful in recent years [BL08,BK09,BJK05,LT09].
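The invariance condition can be checked mechanically on small structures. The following Python sketch applies an n-ary operation coordinate-wise to every selection of n tuples from each relation; the name `is_polymorphism` and the two sample operations are assumptions chosen for illustration.

```python
from itertools import product


def is_polymorphism(f, n, relations):
    """Return True if every relation is invariant under the n-ary operation f."""
    for rel in relations.values():
        for rows in product(rel, repeat=n):
            # zip(*rows) yields one column per coordinate of the relation;
            # f is applied to each column of n entries.
            image = tuple(f(*column) for column in zip(*rows))
            if image not in rel:
                return False
    return True


def majority(x, y, z):
    """Ternary majority operation on {0, 1}."""
    return int(x + y + z >= 2)


# Hypothetical example: a single symmetric edge on {0, 1}.
edge = {"E": {(0, 1), (1, 0)}}
majority_ok = is_polymorphism(majority, 3, edge)
min_ok = is_polymorphism(min, 2, edge)
```

On this structure `majority_ok` is true, whereas `min_ok` is false: applying `min` coordinate-wise to (0, 1) and (1, 0) gives (0, 0), which is not in the relation.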

2.3 Datalog

Fix a vocabulary τ . A Datalog program over τ is a set of rules of the form

H :– B1, B2, . . . , Bn,

where each H,B1, . . . , Bn are atomic formulae R(x0, x1, . . . , xm). The formula H is called the head of the

rule, while the formulae B1, . . . , Bn are collectively called the body of the rule. The predicate in the head is

not from τ , and is called an IDB, or intensional database predicate. The predicates in the body of the rule

can be IDBs or can be relations from τ , in which case they are called EDBs (extensional database predicates).

The rule as a whole is interpreted as follows: if each formula in the body B1, B2, . . . , Bn is true, then the

formula H is true. One of the IDBs, which is usually 0-ary, is called the goal predicate. Datalog programs

are therefore recursive specifications of IDBs, and given a structure A we say that the program accepts A if

the goal predicate evaluates to true.

For 0 ≤ j ≤ k, a (j, k)-Datalog program is a Datalog program with at most j variables in the head and at most k

variables in the body of each rule. A Datalog program is called monadic if each IDB has at most

one variable. If B is a relational structure on τ , CSP(B) is said to have width (j, k) if co-CSP(B) is definable

in (j, k)-Datalog. If this is the case, then we say that (j, k)-Datalog solves CSP(B).

Example 1. Consider the structure $\mathbf{B}_{2COL}$ on the universe $B = \{0, 1\}$ with a single binary relation $E^{\mathbf{B}_{2COL}} = \{(0, 1), (1, 0)\}$. The corresponding constraint problem, $\mathrm{CSP}(\mathbf{B}_{2COL})$, is exactly the problem of 2-Colouring: given a graph $G$, is there an assignment of one of two colours to each vertex such that adjacent

vertices have different colours?


The complement problem co-CSP(B2COL) is expressible by the following (2, 4)-Datalog program:

O(X,Y ) :– E(X,Y )

O(X,Y ) :– O(X,Z), E(Z, T ), E(T, Y )

goal :– O(X,X)

Recall that a graph G is two colourable if and only if it is bipartite, and therefore does not contain an odd

cycle. This Datalog program recursively searches through the input graph G for an odd cycle, and thus the

goal predicate will evaluate to true if and only if the graph is not two colourable.

We therefore say that CSP(B2COL) has width (2, 4), although we will often shorten this to just “width

two”.
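The three rules of Example 1 can be evaluated bottom-up until a fixed point is reached. The Python sketch below does this naively; the names `symmetric` and `has_odd_cycle` are our own, and the input graph is assumed to be given as a set of ordered pairs containing both orientations of every undirected edge.

```python
def symmetric(edges):
    """Add the reverse of every edge, so the relation models an undirected graph."""
    return set(edges) | {(v, u) for (u, v) in edges}


def has_odd_cycle(edges):
    """Naive bottom-up evaluation of the Datalog program for co-CSP(B_2COL)."""
    odd = set(edges)  # O(X,Y) :- E(X,Y)
    changed = True
    while changed:
        changed = False
        # O(X,Y) :- O(X,Z), E(Z,T), E(T,Y)
        new = {(x, y)
               for (x, z) in odd
               for (z2, t) in edges if z2 == z
               for (t2, y) in edges if t2 == t}
        if not new <= odd:
            odd |= new
            changed = True
    # goal :- O(X,X)
    return any(x == y for (x, y) in odd)


# Hypothetical inputs: a triangle and a 4-cycle.
triangle = symmetric({(0, 1), (1, 2), (2, 0)})
square = symmetric({(0, 1), (1, 2), (2, 3), (3, 0)})
```

The IDB `O` collects the pairs of vertices joined by a walk of odd length, so `has_odd_cycle(triangle)` is true and `has_odd_cycle(square)` is false. The loop terminates because `odd` only grows and is bounded by the set of all vertex pairs.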

It should be clear that the structures accepted by Datalog programs are closed under extensions. Specifically, if a structure A is accepted by a Datalog program π, then any structure A′ containing A as a

substructure is also accepted by π.

What’s the use in defining co-CSP(B) in Datalog, as opposed to the CSP(B) itself? This can be succinctly

answered by the following proposition:

Proposition 1. Let π be a Datalog program and let B be a relational structure over a vocabulary τ .

1. The set of relational structures accepted by π is closed under homomorphism.

2. co-CSP(B) is closed under homomorphism.

3. CSP(B) is not necessarily closed under homomorphism.

Proof. (1) Let A be a relational structure that is accepted by π, and let C be a relational structure such

that A → C. We need to show that C is also accepted by π. Let h : A → C be the homomorphism from

A to C, and consider the derivation produced by π when given A as input. Each step of the derivation will

look something like this:

I0(x0) :– R1(x1), . . . , Rm(xm), I1(y1), . . . , In(yn),

where each variable in each vector xi and yi is selected from the universe of A. Note that this derivation

will also hold for the structure C if we replace each variable a in each vector xi and yi by h(a). Thus C is

also accepted by π.

(2) Let A be a structure, and suppose A ↛ B but A → C. If C → B, then A → C → B, which is a

contradiction. Therefore, C ↛ B.

(3) Consider the following relational structures over the vocabulary τ = {E} containing a single binary

relation:


• B = 〈{0, 1}, {(0, 1), (1, 0)}〉,

• C = 〈{0, 1, 2}, {(0, 1), (1, 0), (1, 2), (2, 1), (2, 0), (0, 2)}〉

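Consider CSP(B). Clearly B → B and B → C, but C ↛ B, since C is a triangle and B is a single edge. Thus B belongs to CSP(B), while its homomorphic image C does not.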

An important concept that we will use repeatedly is that of a canonical Datalog program. Feder and

Vardi showed the following in their seminal paper [FV98, Theorem 17], which we will restate here.

Theorem 1. For any structure B, there exists a canonical (l, k)-Datalog program π with the following

property: If any (l, k)-Datalog program solves CSP(B), then so does the canonical program.

We will describe the canonical program as in [CDK08]. Denote $\mathbf{B} = \langle B; R_1^{\mathbf{B}}, R_2^{\mathbf{B}}, \ldots, R_n^{\mathbf{B}} \rangle$, and let

S0, S1, . . . , Sp be an enumeration of all (at most) l-ary relations on B that can be expressed by a ∃∧-formula

on B, and assume that S0 is the empty relation. For each Si of arity m, 1 ≤ m ≤ l, we introduce an m-ary

IDB Ii. The canonical program will include each of the EDBs R1, R2, . . . , Rn, each of the IDBs I0, I1, . . . , Ip,

and all rules of size at most k with the following property: if every Ii in a rule is replaced with Si and every

$R_j$ with $R_j^{\mathbf{B}}$, then every assignment of values in the domain of B to variables that satisfies the conjunction

of relations in the body must also satisfy the relation appearing in the head. Finish by adding a rule

G :– I0(· · · ), which will be our goal rule.

We close this section with a quick observation. Consider the canonical (1, k)-Datalog program for some

fixed k. This program works by inferring all possible unary relations, using at most k variables, definable

by the structure B on our input structure A. Since a unary relation on B is representable as a subset B′

of B, it follows that the canonical monadic Datalog program produces a mapping from input variables in A

to subsets of B. Unsurprisingly, this is closely related to the power structure ℘(B) of B. We will make this

connection clear in Section 3.2.

2.4 Homomorphism Dualities

The study of homomorphism dualities and obstruction sets has been an integral part of the complexity of

constraint satisfaction since the area’s inception. To illustrate what we mean by a homomorphism duality,

fix a relational structure B on a vocabulary τ. We say that a class $\mathcal{O}$ of relational structures on τ is an

obstruction set for B if the following holds for any A on τ: A ↛ B if and only if O → A for some O in $\mathcal{O}$.

If a structure B has such an obstruction set, we say that B has a homomorphism duality. It turns out that

homomorphism dualities define several natural classes of constraint satisfaction problems (a nice survey on

homomorphism dualities is [BKL08], see also [FV98,CDK08,HZN96,HN04,Dal05]).

Often in our duality theorems we will only be considering a “nice” set of structures O (for example all

τ -trees). When this happens we will sometimes use the equivalent positive definition of a homomorphism

duality, which is stated as follows: A → B if and only if, for every O in $\mathcal{O}$, O → A implies O → B.


3 Arc Consistency

We now present the arc consistency algorithm, which will be the focus of our work. Arc consistency was

originally introduced as an efficient method of solving certain problems on constraint networks in [Mac77].

It was later investigated by Feder and Vardi through Datalog. Instead of describing arc consistency through

Datalog initially, we will first describe it as an algorithm as in [Mac77]. Then, we will present a number of

equivalent notions to solvability by arc consistency — the existence of a particular homomorphism duality,

homomorphisms from particular structures, and solvability by monadic Datalog. These proofs each originally

appeared in [FV98], unless otherwise stated.

3.1 The Arc Consistency Algorithm

Intuitively, the arc consistency algorithm works as follows. Given a pair of input structures (A,B) on the

same vocabulary, the arc consistency algorithm incrementally infers constraints on the possible mappings of

variables a in the source structure A to the values b in the target structure B. If at any point, we conclude that

any variable has no possible mappings, we stop and reject the input. It follows that consistency algorithms

are inherently rejection procedures. While they will never falsely reject an input, false acceptance of input is

a real possibility. So, what properties must structures have so that consistency procedures will accept them?

Where is the dividing line between false acceptance and correct acceptance? These are important questions,

and to answer them we will give the same presentation as that of Chen, Dalmau and Grußien [CDG11].

First, let’s show exactly when arc consistency accepts an input (A,B). The method of development

below was originally introduced in [BC10], where it was stated without proof with reference to [DP99]. Thus

we will re-prove the results for completeness.

Definition 1. An instance (A,B) has the arc consistency condition (ACC) if there exists a homomorphism

from A to ℘(B).

Proposition 2. The arc consistency algorithm accepts an instance (A,B) if and only if the instance has

the ACC.

Proof. First, suppose that the arc consistency algorithm accepts an instance (A,B), and we will show that

the instance has the arc consistency condition.

Since the instance (A,B) is accepted by the algorithm it follows that when the algorithm terminates,

each element $a$ has an associated set of elements $S_a \subseteq B$. We claim that the mapping $h : A \to \wp(B)$ given by $h(a) = S_a$ is a homomorphism. We need to show, then, that if $(a_1, a_2, \ldots, a_k)$ is a tuple of variables in a relation $R^{\mathbf{A}}$, then $(S_{a_1}, S_{a_2}, \ldots, S_{a_k})$ is a tuple in $R^{\wp(\mathbf{B})}$. The truth of this becomes clear when we consider the last iteration of the outermost loop of the arc consistency algorithm, which will be the iteration in which no set $S_{a_i}$ is changed. Since the projection of $R^{\mathbf{B}} \cap (S_{a_1} \times S_{a_2} \times \cdots \times S_{a_k})$ is nonempty on each


Algorithm 1 Arc Consistency.
Input: Two relational structures A, B over a vocabulary τ.

    for all a ∈ A do
        S_a ← B
    end for
    repeat
        for all relations R^A of A do
            for all tuples (a_1, a_2, ..., a_k) ∈ R^A do
                for all i ∈ {1, 2, ..., k} do
                    S_{a_i} ← π_i(R^B ∩ (S_{a_1} × S_{a_2} × ⋯ × S_{a_k}))
                end for
            end for
        end for
    until no set S_a is changed
    if there exists a ∈ A such that S_a = ∅ then
        Reject
    else
        Accept
    end if


coordinate, the tuple $(S_{a_1}, S_{a_2}, \ldots, S_{a_k})$ is in $R^{\wp(\mathbf{B})}$, since for each $j \in [k]$ and any element $b \in S_{a_j}$, there exist $b_i \in S_{a_i}$ for each $i \ne j$ such that $(b_1, b_2, \ldots, b_{j-1}, b, b_{j+1}, \ldots, b_k) \in R^{\mathbf{B}}$. Thus, for any tuple $(a_1, a_2, \ldots, a_k)$ in any relation $R^{\mathbf{A}}$, the tuple $(S_{a_1}, S_{a_2}, \ldots, S_{a_k})$ is in $R^{\wp(\mathbf{B})}$.

Now suppose that (A,B) has the ACC, and let φ be the homomorphism from A to ℘(B). We will show

that throughout the algorithm, φ(a) ⊆ Sa for each a in A. Note that this is trivially true at the beginning

of the algorithm’s execution, when Sa is set to B for each a.

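Now consider an update $S_{a_i} \leftarrow \pi_i(R^{\mathbf{B}} \cap (S_{a_1} \times \cdots \times S_{a_k}))$ performed for some tuple $(a_1, \ldots, a_k) \in R^{\mathbf{A}}$, and assume inductively that $\varphi(a') \subseteq S_{a'}$ holds for every $a'$ before the update. Let $b$ be an element of $\varphi(a_i)$; we will show that $b$ is not removed from $S_{a_i}$. Since φ is a homomorphism, $(\varphi(a_1), \ldots, \varphi(a_k)) \in R^{\wp(\mathbf{B})}$, so by the definition of the power structure there exist $b_j \in \varphi(a_j)$ for each $j \ne i$ such that $(b_1, \ldots, b_{i-1}, b, b_{i+1}, \ldots, b_k) \in R^{\mathbf{B}}$. By the inductive hypothesis each $b_j$ lies in $S_{a_j}$ and $b$ lies in $S_{a_i}$, so this tuple belongs to $R^{\mathbf{B}} \cap (S_{a_1} \times \cdots \times S_{a_k})$, and $b$ survives the projection onto coordinate $i$. Hence $\varphi(a) \subseteq S_a$ for each $a$ throughout the execution. Since every $\varphi(a)$ is nonempty, no set $S_a$ will ever become empty, and so the algorithm must accept (A,B).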
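Algorithm 1 can be sketched in Python as follows. This is a direct, unoptimized transcription under our own encoding assumptions: structures are dictionaries from relation names to sets of tuples, and the function name `arc_consistency` is hypothetical. It returns both the verdict and the final sets $S_a$, which by Proposition 2 define a homomorphism into the power structure whenever the instance is accepted.

```python
def arc_consistency(dom_a, rels_a, dom_b, rels_b):
    """Run the arc consistency procedure on the instance (A, B).

    Returns (accepted, S), where S maps each element of A to its final set.
    """
    S = {a: set(dom_b) for a in dom_a}
    changed = True
    while changed:  # repeat ... until no set S_a is changed
        changed = False
        for name, tuples in rels_a.items():
            for t in tuples:
                for i, a in enumerate(t):
                    # Project R^B restricted to S_{a_1} x ... x S_{a_k}
                    # onto coordinate i.
                    new = {u[i] for u in rels_b[name]
                           if all(u[j] in S[t[j]] for j in range(len(t)))}
                    if new != S[a]:
                        S[a] = new
                        changed = True
    accepted = all(S[a] for a in dom_a)  # reject if some S_a is empty
    return accepted, S


# Hypothetical instances. The target has a symmetric edge relation E and a
# unary relation Zero that pins a variable to the value 0.
target = {"E": {(0, 1), (1, 0)}, "Zero": {(0,)}}
triangle = {"E": {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}}
pinned_edge = {"E": {("a", "b")}, "Zero": {("a",), ("b",)}}

triangle_result = arc_consistency({0, 1, 2}, triangle, {0, 1}, target)
pinned_result = arc_consistency({"a", "b"}, pinned_edge, {0, 1}, target)
```

The pinned edge is rejected, since both endpoints are forced to 0 and the edge relation then has no supporting tuple. The triangle is accepted with every set equal to {0, 1}, even though a triangle has no homomorphism to a single edge; this is an instance of the false acceptance discussed above. Each update can only shrink a set, so the loop terminates.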

Definition 2. Let B be a structure. We say that AC solves CSP(B) if, for all instances (A,B), the following

holds: (A,B) has the ACC implies that A→ B.

So, we have answered our questions about the dividing line between AC accepting and rejecting an

input (A,B). However, our definition of AC solving a particular CSP is rather bare of intuition. Luckily,

from [FV98] originally, and investigated further in [DP99], we have the following result.

Theorem 2. AC solves CSP(B) if and only if there is a homomorphism from ℘(B) to B.

Proof. Assume that AC solves CSP(B). The identity function on ℘(B) is a homomorphism from ℘(B) to ℘(B), and so the instance (℘(B), B) has the ACC. Since AC solves CSP(B), it follows that there exists a homomorphism from ℘(B) to B.

Now assume that there exists a homomorphism h : ℘(B) → B, and suppose that (A,B) is an input to AC that has the ACC. Denote by g : A → ℘(B) the homomorphism from the ACC. The composition h ∘ g is a homomorphism from A to B, and our proof is complete.

3.2 Monadic Datalog and Tree Duality

In this section, we will show the equivalence of AC solving CSP(B) to definability in monadic Datalog. This

will be done by showing that the definability of co-CSP(B) in monadic Datalog is equivalent to B having

tree duality, and then by showing that tree duality is equivalent to the existence of a homomorphism from

℘(B) to B. These results originally appeared in [FV98], and have been proven and re-proven in countless

papers since. We include proofs for completeness, and to “get a feel” of the language we use.


Recall that we say that a Datalog program is monadic if each IDB appearing in the program is unary

(has a single variable). We will refer to the canonical monadic Datalog program as a canonical tree program,

as each derivation produced by such a program will be a τ -tree. In such a case we will also assume that the

body of each rule contains at most one EDB (we can easily transform any monadic Datalog program to one

of this form by introducing auxiliary IDBs). This assumption will simplify the proofs later on.

Definition 3. Let B be a relational structure. We say that B has tree duality if it has an obstruction set

consisting entirely of trees. Equivalently, a structure A is homomorphic to B if and only if every tree T that

homomorphically maps to A also homomorphically maps to B.

Theorem 3. Let B be a relational structure on a vocabulary τ. The following are equivalent:

1. co-CSP(B) is definable in monadic Datalog.

2. B has tree duality.

3. There exists a homomorphism from ℘(B) to B.

4. CSP(B) is solvable by AC.

Proof. (1⇒ 2) Let π be the canonical tree program for co-CSP(B), and consider an input A to the program.

If the program π accepts A, then its derivation tree will be a τ-tree T using relation symbols from τ and

variables from A. This tree T will homomorphically map to A, but will not map to B (since B is not

accepted by its tree program), and thus tree duality holds. If the program rejects A, then no tree can be

produced that maps to A that doesn’t also map to B, and so tree duality still holds.

(2 ⇒ 1) Suppose that B has tree duality, and let A be a relational structure. Observe that we can

interpret the canonical monadic Datalog program as inferring all possible monadic constraints on elements

of the input structure A. Since this program can infer all possible monadic constraints on any set of variables,

the program must be able to produce any possible τ -tree. It follows that A is accepted by the canonical

program if and only if there exists a tree T that homomorphically maps to A but does not map to B.

This follows from the fact that the program will produce a τ -tree that maps to A, structures accepted

by Datalog programs are closed under homomorphism, and B is explicitly not accepted by the canonical

program. Therefore A is accepted by the canonical monadic program if and only if A is not homomorphic

to B by the definition of tree duality.

(2 ⇒ 3) Suppose that B has tree duality. Every tree T that maps to ℘(B) through a homomorphism φ

has images that are nonempty sets B′. For each set B′ chosen as an image for an element t in the tree, we

can create a mapping from T to B by selecting consistent elements from each of the sets — select an element

from the set mapped to by the root, then select elements consistent with both φ and the structure ℘(B)


from the “children” sets, and continue on in this manner until we have an element chosen for each element

t in T . By the definition of tree duality, we conclude that ℘(B) homomorphically maps to B.

(3⇒ 2) Suppose that there exists a homomorphism h from ℘(B) to B, and let A be a structure such that

every tree T that homomorphically maps to A also homomorphically maps to B. If we use the canonical

tree program on A, then every element a in A will be assigned a non-empty set of possible mappings

by the program (otherwise, we will get a tree T′ that homomorphically maps to A and that does not

homomorphically map to B). Call this program induced mapping φ. Since φ was produced by the canonical

monadic Datalog program, this will be a mapping to ℘(B). The result follows by the composition of φ and

h.

(4 ⇔ 3) Proved in Theorem 2.

We’ll close this section with two examples: one of a CSP that is solvable by AC, and one CSP that isn’t.

Example 2. Consider the following structure, $\mathbf{B}_{3H}$, on the universe $B = \{0, 1\}$ with one unary relation $U^{\mathbf{B}_{3H}} = \{1\}$, and two ternary relations $P^{\mathbf{B}_{3H}} = \{0, 1\}^3 \setminus \{(1, 1, 0)\}$ and $N^{\mathbf{B}_{3H}} = \{0, 1\}^3 \setminus \{(1, 1, 1)\}$. The problem $\mathrm{CSP}(\mathbf{B}_{3H})$ is the 3-Horn-Satisfiability problem, which is well known to be in P (recall that 3-Horn-Satisfiability is simply 3-Satisfiability where the input clauses are restricted to be implications of the form $a \wedge b \to c$ or $a \wedge b \to \neg c$).

We can express co-CSP(B3H) in (1, 3)-Datalog as follows:

T (X) :– U(X)

T (Z) :– P (X,Y, Z), T (X), T (Y )

unsat :– N(X,Y, Z), T (X), T (Y ), T (Z)

This Datalog program expresses the usual unit-propagation algorithm for 3-Horn-Satisfiability: it marks as

true every variable that is forced to be true by a unit clause in U, and then repeatedly follows the implications

in P to mark further forced variables. It then checks whether following this chain of

implications violates one of the “negative” clauses C ∈ N, in which case the instance is unsatisfiable.
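
The propagation carried out by the Datalog program above can be sketched in Python. This is only an illustrative sketch: the function name `horn_unsat` and the encoding of an instance as three sets of variable tuples (`U`, `P`, `N`) are assumptions made here and are not part of the original development.

```python
def horn_unsat(U, P, N):
    """Return True when the goal `unsat` of the Datalog program is derivable.

    U: set of variables x with U(x); P, N: sets of triples (x, y, z).
    (This encoding of an instance is an assumption made for illustration.)
    """
    # Rule 1: T(X) :- U(X)
    forced_true = set(U)
    changed = True
    while changed:
        changed = False
        # Rule 2: T(Z) :- P(X, Y, Z), T(X), T(Y)
        for x, y, z in P:
            if x in forced_true and y in forced_true and z not in forced_true:
                forced_true.add(z)
                changed = True
    # Rule 3: unsat :- N(X, Y, Z), T(X), T(Y), T(Z)
    return any(x in forced_true and y in forced_true and z in forced_true
               for x, y, z in N)


# p and q are unit clauses, and (p, q, r) is a positive clause.
U = {"p", "q"}
P = {("p", "q", "r")}
unsat_instance = horn_unsat(U, P, N={("p", "q", "r")})  # negative clause on p, q, r
sat_instance = horn_unsat(U, P, N={("p", "q", "s")})    # s is never forced true
```

The loop is the least-fixpoint evaluation of the two recursive rules; the final `any` corresponds to the goal rule.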

Example 3. Recall the relational structure B2COL on the vocabulary τ = {E}, containing a single binary

relation, as follows:

• B = {0, 1},

• EB2COL = {(0, 1), (1, 0)}.

This problem defines exactly the 2-Colouring problem, and we can easily show that it is not solvable by AC.

Consider the power structure, ℘(B2COL):

• ℘(B) = {{0}, {1}, {0, 1}}


• E℘(B2COL) = {({0}, {1}), ({1}, {0}), ({0, 1}, {0, 1})}

Recall that a graph G is two colourable if and only if the graph does not contain an odd cycle. Therefore,

G → B2COL if and only if G does not contain an odd cycle. Since E℘(B2COL) contains a self-loop (an odd cycle of length one), it follows

that ℘(B2COL) ↛ B2COL, and so CSP(B2COL) is not solvable by AC.
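
The power structure in this example is small enough to build and check mechanically. The sketch below assumes the usual definition of ℘(B), in which a tuple of nonempty sets belongs to a relation of ℘(B) exactly when it is the tuple of coordinate projections of some nonempty subset of the corresponding relation of B. The helper names `power_structure` and `has_homomorphism` are hypothetical, and the homomorphism test is a brute-force search that is only sensible for very small structures.

```python
from itertools import combinations, product


def power_structure(universe, relations):
    """Build (universe, relations) of the power structure of a small structure."""
    elems = sorted(universe)
    subsets = [frozenset(c) for r in range(1, len(elems) + 1)
               for c in combinations(elems, r)]
    power_rels = {}
    for name, tuples in relations.items():
        tuple_list = sorted(tuples)
        lifted = set()
        # Every nonempty subset Q of the relation contributes its projections.
        for r in range(1, len(tuple_list) + 1):
            for Q in combinations(tuple_list, r):
                arity = len(Q[0])
                lifted.add(tuple(frozenset(t[i] for t in Q) for i in range(arity)))
        power_rels[name] = lifted
    return subsets, power_rels


def has_homomorphism(src_universe, src_rels, dst_universe, dst_rels):
    """Brute-force search for a homomorphism between two small structures."""
    src = list(src_universe)
    for images in product(list(dst_universe), repeat=len(src)):
        h = dict(zip(src, images))
        if all(tuple(h[x] for x in t) in dst_rels[name]
               for name, tuples in src_rels.items() for t in tuples):
            return True
    return False


B = {0, 1}
B_rels = {"E": {(0, 1), (1, 0)}}
PB, PB_rels = power_structure(B, B_rels)

loop = (frozenset({0, 1}), frozenset({0, 1}))
has_loop = loop in PB_rels["E"]                       # the self-loop on {0, 1}
maps_to_B = has_homomorphism(PB, PB_rels, B, B_rels)  # no map can absorb the loop
```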

We make one note: the individual width classes seem to be orthogonal to standard complexity classes.

Horn-Satisfiability is known to be complete for P under log-space reductions, while 2-Colouring is known to

be in NL. However, 2-Colouring has width (2, 4) (see Example 1), while 3-Horn-Satisfiability has width (1, 3).

In fact, the computational complexity of CSPs that have bounded width is more closely aligned with the

structure of the Datalog program than with the numerical width (for more information, see [BKL08,Dal05]).

4 Extended Tree Duality

In this section we will consider a simple extension of tree duality (and thus arc consistency) — namely, that

of extended tree duality. This was originally introduced by Feder and Vardi in [FV98] as a stepping stone to

defining CSPs of bounded width.

Definition 4. Let B be a structure. We say that CSP(B) has extended tree duality if the following holds,

for all structures A on the same vocabulary as B: The structure A, with one element a ∈ A preassigned to

an element b ∈ B, homomorphically maps to B if and only if all trees T that can map to A can be mapped

to B in such a way that all elements t in T that map to a are also mapped to b.

Again, extended tree duality is a simple extension of tree duality. Despite this, the addition of a single

“peek” does increase the power of arc consistency, and thus is worth further investigation. Therefore, we will

develop extended tree duality in the same way that we developed arc consistency: through an algorithm, and

structural definition. As for a Datalog definition, we will argue that defining extended tree duality through

Datalog is difficult without introducing a new type of control operator; this is due to the implicit existential

quantification in the pre-assignment of extended tree duality.

4.1 The Extended Arc Consistency Algorithm

We will now introduce a simple extension of the arc consistency algorithm that captures the “extra mapping”

in the definition of extended tree duality. It will be developed along the same lines that we developed the

arc consistency algorithm above. First, we define an analogous structural definition (introduced in [FV98])

that we can use to describe acceptance conditions for our extended arc consistency algorithm.

Definition 5. We will denote by S(B) the substructure of ℘(B) with the following restriction: S(B) consists

of the union of connected components of ℘(B) which contain a singleton set.


Algorithm 2 Extended Arc Consistency

Input: Two relational structures A,B over a vocabulary τ , an element b in B.

for all a ∈ A, b ∈ B do

if ArcConsistency([A, {a}], [B, {b}]) accepts then

Accept.

end if

end for

Reject.
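
Algorithm 2 can be sketched in Python as follows. The arc consistency subroutine is not reproduced in this section, so the function `arc_consistency` below is an assumed, standard formulation: every variable starts with all of B as its candidate set, candidates are restricted to values supported by tuples of B, and the instance is rejected when some set empties. The function names and the `pinned` argument, which plays the role of the unary constraints in [A, {a}], [B, {b}], are hypothetical.

```python
def arc_consistency(A, A_rels, B, B_rels, pinned=None):
    """Return the stable candidate sets, or None if some set becomes empty."""
    cand = {a: set(B) for a in A}
    for a, allowed in (pinned or {}).items():
        cand[a] &= set(allowed)
    changed = True
    while changed:
        changed = False
        for name, tuples in A_rels.items():
            for tup in tuples:
                # Tuples of B compatible with the current candidate sets.
                support = [s for s in B_rels[name]
                           if all(s[i] in cand[x] for i, x in enumerate(tup))]
                for i, x in enumerate(tup):
                    kept = cand[x] & {s[i] for s in support}
                    if kept != cand[x]:
                        cand[x] = kept
                        changed = True
    return None if any(not c for c in cand.values()) else cand


def extended_arc_consistency(A, A_rels, B, B_rels):
    """Accept as soon as one peek (a, b) survives arc consistency."""
    for a in A:
        for b in B:
            if arc_consistency(A, A_rels, B, B_rels, pinned={a: {b}}) is not None:
                return True
    return False


def cycle(n):
    """Symmetric edge relation of the n-cycle on vertices 0..n-1."""
    edges = {(i, (i + 1) % n) for i in range(n)}
    return set(range(n)), {"E": edges | {(v, u) for u, v in edges}}


B, B_rels = {0, 1}, {"E": {(0, 1), (1, 0)}}
triangle, triangle_rels = cycle(3)
square, square_rels = cycle(4)

ac_on_triangle = arc_consistency(triangle, triangle_rels, B, B_rels) is not None
eac_on_triangle = extended_arc_consistency(triangle, triangle_rels, B, B_rels)
eac_on_square = extended_arc_consistency(square, square_rels, B, B_rels)
```

On the 2-Colouring template, plain arc consistency does not reject the triangle, while every peek on the triangle is rejected and a peek on the 4-cycle survives.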

Definition 6. An instance (A,B) has the extended arc consistency condition (EACC) if there exists a

homomorphism from A to S(B).

Proposition 3. The extended arc consistency algorithm accepts an instance (A,B) if and only if the instance

has the EACC.

Proof. Assume that (A,B) is accepted by the extended arc consistency algorithm. It follows that, for some a

in A and some b in B, the instance ([A, {a}], [B, {b}]) has the ACC. Therefore, there exists a homomorphism

h from A to ℘(B) such that a maps to the singleton set {b}. We can safely assume that A is connected,

and so h must be a homomorphism from A to S(B).

Now assume that (A,B) has the EACC. Let h be our homomorphism from A to S(B), with h(a) = {b}.

It follows that ([A, {a}], [B, {b}]) will have the ACC, and so the EAC algorithm will halt and accept.

Definition 7. Let B be a relational structure. We say that the extended arc consistency algorithm solves

CSP(B) if the following holds: An instance (A,B) has the EACC implies that A→ B.

We have shown exactly when the EAC algorithm accepts an input. It is now a simple matter to

characterize when EAC solves CSP(B).

Theorem 4. EAC solves CSP(B) if and only if there exists a homomorphism from S(B) to B.

Proof. Assume that EAC solves CSP(B). The identity map on S(B) is a homomorphism from S(B) to

S(B), and thus (S(B),B) has the EACC. Since EAC solves CSP(B), it follows that S(B) → B.

Conversely, assume that h is a homomorphism from S(B) to B, and consider an instance (A,B) that has the EACC.

Let g be the homomorphism from A to S(B); composing h and g gives a homomorphism from A to B.

In light of this result, we will use “extended arc consistency” and “extended tree duality” interchangeably.

As for the development of a Datalog fragment, it seems to be difficult without some sort of “cut” operator

à la Prolog. In order to create a canonical Datalog fragment that captures extended tree duality, we would

have to introduce a new variable into each of the IDBs so that the canonical tree program can remember


the current target for our “peeked” variable. Once we find a value for the “peeked” variable that causes the

canonical tree program to accept, we have to somehow cut the derivation at that point. This cut will remove

the possibility of deriving the empty predicate on another variable. This is seemingly impossible in Datalog,

as this control structure for cutting derivations does not exist.

The root issue is that EAC is not a rejection procedure. Instead of “seeking rejection”, it “seeks accep-

tance”, by which we mean it halts and accepts once a favourable condition is satisfied, as opposed to halting

and rejecting when a negative condition is satisfied. Does the lack of definability in Datalog somehow remove

extended arc consistency’s legitimacy? This does not seem to be the case, as it is still more powerful than

arc consistency alone. We show this fact through the next two propositions.

Proposition 4. Let B be a relational structure. If B has tree duality then B has extended tree duality.

Proof. If B has tree duality, then there exists a homomorphism h from ℘(B) to B. Since S(B) is a substruc-

ture of ℘(B), it follows that h restricted to S(B) is a homomorphism from S(B) to B.

Proposition 5. There exists a structure B such that CSP(B) is solvable by extended arc consistency but

not solvable by arc consistency.

Proof. Consider 2-Colouring once again. Our template structure B2COL has a universe B = {0, 1} and a

single binary relation EB2COL = {(0, 1), (1, 0)}. Then ℘(B2COL) has a universe ℘(B) = {{0}, {1}, {0, 1}}

with a relation E℘(B2COL) = {({0}, {1}), ({1}, {0}), ({0, 1}, {0, 1})}. Taking the substructure S(B2COL)

of ℘(B2COL) will give us a universe of S(B2COL) = {{0}, {1}} and an edge relation of ES(B2COL) =

{({0}, {1}), ({1}, {0})}. The structure S(B2COL) is therefore isomorphic to B2COL, by the function

h : S(B2COL) → B2COL defined by h({k}) = k. Since 2-Colouring is not solvable by arc consistency alone

(see Example 3), the claim is proven.
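
The computation in this proof can be replayed mechanically. The sketch below hard-codes ℘(B2COL) as listed for the 2-Colouring template, treats two elements as connected when they occur together in some tuple (an assumption about how connected components are taken), and uses the hypothetical helper name `singleton_components`.

```python
def singleton_components(universe, relations):
    """Return the substructure induced by the components containing a singleton."""
    neighbours = {u: set() for u in universe}
    for tuples in relations.values():
        for t in tuples:
            for u in t:
                neighbours[u].update(t)
    # Search outwards from every singleton set.
    reached = {u for u in universe if len(u) == 1}
    frontier = list(reached)
    while frontier:
        u = frontier.pop()
        for v in neighbours[u] - reached:
            reached.add(v)
            frontier.append(v)
    induced = {name: {t for t in tuples if all(u in reached for u in t)}
               for name, tuples in relations.items()}
    return reached, induced


s0, s1, s01 = frozenset({0}), frozenset({1}), frozenset({0, 1})
power_universe = {s0, s1, s01}
power_rels = {"E": {(s0, s1), (s1, s0), (s01, s01)}}
B_rels = {"E": {(0, 1), (1, 0)}}

S_universe, S_rels = singleton_components(power_universe, power_rels)

# The map h({k}) = k from the proof, checked edge by edge.
h = {s: next(iter(s)) for s in S_universe}
h_is_homomorphism = all((h[u], h[v]) in B_rels["E"] for u, v in S_rels["E"])
```

The element {0, 1} only carries a self-loop, so it is not reached from any singleton and is dropped from S(B2COL).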

Despite extended tree duality not being immediately definable in Datalog, the introduction of the “peek”

is still enough to capture some width two problems. Our next algorithm, peek arc consistency, can be seen as

a natural extension of the peek introduced here. Before we start, let’s summarize the results of this section:

Theorem 5. Let B be a relational structure. The following are equivalent:

1. CSP(B) is solvable by the extended arc consistency algorithm.

2. CSP(B) has extended tree duality.

3. There exists a homomorphism from S(B) to B.


5 Peek Arc Consistency

We now present the peek arc consistency (PAC) algorithm. This algorithm “peeks” ahead, setting a single

variable a in A to a value in B, and then uses arc consistency as a subroutine to empty elements from the

“possible” mappings of a. Peek arc consistency was originally introduced by Bodirsky and Chen in [BC10].

Peek arc consistency is quite similar to the extended arc consistency algorithm. The difference between

the two is that PAC is truly a rejection procedure — it continues operating until a negative condition is

detected, and then halts and rejects.

Algorithm 3 Peek Arc Consistency

Input: Two relational structures A,B over a vocabulary τ .

for all a ∈ A do

Sa ← B

end for

for all a ∈ A, b ∈ B do

if ArcConsistency([A, {a}], [B, {b}]) rejects then

remove b from Sa

end if

end for

if there exists an a ∈ A such that Sa = ∅ then

Reject

else

Accept

end if
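
Algorithm 3 can be sketched in Python in the same spirit. As before, `arc_consistency` is an assumed, standard formulation of the subroutine, and all function names and the `pinned` argument are hypothetical. The sketch rejects as soon as one set S_a is found to be empty, which gives the same answer as computing every S_a first. The usage example runs it on the 2-Satisfiability template with relations R_{ij} = {0, 1}^2 \ {(i, j)}.

```python
from itertools import product


def arc_consistency(A, A_rels, B, B_rels, pinned=None):
    """Return the stable candidate sets, or None if some set becomes empty."""
    cand = {a: set(B) for a in A}
    for a, allowed in (pinned or {}).items():
        cand[a] &= set(allowed)
    changed = True
    while changed:
        changed = False
        for name, tuples in A_rels.items():
            for tup in tuples:
                support = [s for s in B_rels[name]
                           if all(s[i] in cand[x] for i, x in enumerate(tup))]
                for i, x in enumerate(tup):
                    kept = cand[x] & {s[i] for s in support}
                    if kept != cand[x]:
                        cand[x] = kept
                        changed = True
    return None if any(not c for c in cand.values()) else cand


def peek_arc_consistency(A, A_rels, B, B_rels):
    """Accept (True) unless some variable has every peeked value rejected."""
    for a in A:
        S_a = {b for b in B
               if arc_consistency(A, A_rels, B, B_rels, pinned={a: {b}}) is not None}
        if not S_a:
            return False
    return True


pairs = set(product((0, 1), repeat=2))
B = {0, 1}
B_rels = {"R00": pairs - {(0, 0)}, "R01": pairs - {(0, 1)}, "R11": pairs - {(1, 1)}}

# (x or x) and (not x or not x): unsatisfiable.
A1, A1_rels = {"x"}, {"R00": {("x", "x")}, "R11": {("x", "x")}}
# (x or y) and (not x or not y): satisfiable.
A2, A2_rels = {"x", "y"}, {"R00": {("x", "y")}, "R11": {("x", "y")}}

ac_accepts_A1 = arc_consistency(A1, A1_rels, B, B_rels) is not None
pac_accepts_A1 = peek_arc_consistency(A1, A1_rels, B, B_rels)
pac_accepts_A2 = peek_arc_consistency(A2, A2_rels, B, B_rels)
```

On the unsatisfiable instance, arc consistency alone does not reject, while both peeks on x are rejected and so PAC rejects.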

We will follow the same development for PAC as we did for AC and EAC above, and as such will

present proofs more in line with those above from [CDG11], as opposed to the pp-definability approach used

in [BC10].

Definition 8. An instance (A,B) has the peek arc consistency condition (PACC) if for every element

a ∈ A, there exists a homomorphism h : A→ ℘(B) such that h(a) is a singleton.

Proposition 6. The peek arc consistency algorithm does not reject an instance (A,B) if and only if the

instance has the PACC.

Proof. Let (A,B) be an input to PAC, and first suppose that (A,B) has the PACC. We need to show that

the instance ([A, {a}], [B, {b}]) has the ACC for all a ∈ A, and at least one b ∈ B.

To show that ([A, {a}], [B, {b}]) has the ACC, we need to show that there exists a homomorphism h :

[A, {a}]→ ℘([B, {b}]). Notice that in any homomorphism to ℘([B, {b}]), we must have some set of elements


mapping to the singleton {b}. Since (A,B) has the PACC, we know that there exists a homomorphism f :

A→ ℘(B) for every a ∈ A such that f(a) = {c} for some c ∈ B. Thus, if we take f to be our homomorphism

from A to ℘(B) it follows that for each a there will exist some c ∈ B such that ([A, {a}], [B, {c}]) has the

ACC. Thus, no set Sa will be emptied, and we have that the algorithm will accept our instance.

Conversely, suppose that the algorithm accepts (A,B) and, by way of contradiction, that (A,B) does

not have the PACC. Then there exists some element a ∈ A such that for every homomorphism h : A → ℘(B) we

have |h(a)| > 1. It follows that the instance ([A, {a}], [B, {b}]) will not have the ACC for any b ∈ B,

and thus S_a will be empty. Therefore, the algorithm will reject (A,B), which is

our contradiction.

Definition 9. Let B be a structure. We say that PAC solves CSP(B) if the following holds: If an instance

(A,B) has the PACC then A→ B.

As shown in [BC10], we can come up with a structure-homomorphism definition of acceptance by PAC

similar to the definition presented for AC and EAC. We will denote by Sing(℘(B)^n) the induced substructure

of ℘(B)^n whose domain contains an n-tuple if and only if at least one element of the tuple is a singleton.
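
To get a feel for this definition, the domain of Sing(℘(B)^n) can be enumerated for a small universe. The sketch below only builds the domain, not the relations, and the helper names `nonempty_subsets` and `sing_universe` are hypothetical.

```python
from itertools import combinations, product


def nonempty_subsets(B):
    """All nonempty subsets of B, i.e. the universe of the power structure."""
    elems = sorted(B)
    return [frozenset(c) for r in range(1, len(elems) + 1)
            for c in combinations(elems, r)]


def sing_universe(B, n):
    """n-tuples of nonempty subsets of B with at least one singleton entry."""
    return [t for t in product(nonempty_subsets(B), repeat=n)
            if any(len(S) == 1 for S in t)]


# For B = {0, 1} only the all-{0, 1} tuple is excluded, so the sizes are 3^n - 1.
sizes = [len(sing_universe({0, 1}, n)) for n in (1, 2, 3)]  # [2, 8, 26]
```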

Proposition 7. Let B be a structure. PAC solves CSP(B) if and only if for all integers n ≥ 1 there exists

a homomorphism from Sing(℘(B)^n) to B.

Proof. Suppose PAC solves CSP(B), and let n ≥ 1. We will show that (Sing(℘(B)^n),B) has the PACC. Fix

a tuple t ∈ Sing(℘(B)^n); we need to show that there exists a homomorphism h from Sing(℘(B)^n) to ℘(B)

such that h(t) is a singleton. Recall that t has an element {b} for some b ∈ B. Suppose this element lies on

the ith coordinate of t. It follows that the projection π_i onto the ith coordinate is a homomorphism to

℘(B) with π_i(t) = {b}. Since PAC solves CSP(B), we conclude that Sing(℘(B)^n) → B.

Now suppose that there is a homomorphism from Sing(℘(B)^n) to B for all n ≥ 1. We need to

show that PAC solves CSP(B). Suppose that (A,B) has the PACC. Then, for all a ∈ A, there exists

a homomorphism h_a : A → ℘(B) such that h_a(a) is a singleton. Define h : A → Sing(℘(B)^{|A|}) by

h(x) = Π_{a∈A} h_a(x), the tuple whose a-th coordinate is h_a(x). This is clearly a homomorphism to ℘(B)^{|A|}, and we know that for each x, h_x(x) will

be a singleton. Thus h is a homomorphism from A to Sing(℘(B)^{|A|}), and so A is homomorphic to B by

composition of homomorphisms.

An interesting fact about the structure-homomorphism definition of PAC acceptance is that it is not a

finite definition — it requires the existence of infinitely many homomorphisms. As such, this is not the most

useful definition in practice. Therefore, it’s natural to ask whether or not there exists a finite structural

definition of PAC. In fact, this does not seem to be the case, and we will argue as such in the next section.


5.1 A Canonical PAC Datalog Fragment

In this section, we will construct a canonical Datalog program that will define co-CSP(B) if and only if

CSP(B) is solvable by PAC. Unlike the situation that came up in extended arc consistency, PAC is a true

rejection procedure, and as such is definable in Datalog rather easily.

To create our Datalog fragment we first backtrack and investigate the PAC algorithm. Fix relational

structures A and B, and suppose that PAC solves CSP(B). We ask the following: when does PAC reject

the instance (A,B)? Investigating the inner workings of PAC leads to the following observation:

Proposition 8. Let A, B be relational structures on the same vocabulary, and suppose that PAC solves

CSP(B). PAC rejects the instance (A,B) if and only if for some a in A and for each b in B, the instance

([A, {a}], [B, {b}]) does not have a homomorphism from [A, {a}] to ℘([B, {b}]).

Recall that Datalog is inherently a rejection procedure — when computing CSP(B) in Datalog, we

actually define the language co-CSP(B). It follows from Proposition 8 that to define co-CSP(B) in Datalog

when PAC solves CSP(B), we have to modify the canonical width one program to enforce the additional

restrictions in the PACC. Before we do so we need the following fact regarding definability in Datalog, which

we prove in Appendix A:

Lemma 1. Fix a relational structure B, and let b be an element of B. Any unary, singleton relation

R_b = {b} not in B is definable by a monadic, recursion-free IDB I_b in the Datalog program for B.

Using the above lemma, for any instance A of CSP(B), we can constrain any variable a in A to any

single value b in B by adding I_b(a) to the body of any rule in our Datalog program. Moreover, we can do

this without increasing the width of our Datalog program beyond one.

Definition 10. Fix a relational structure B, and denote B = {b_1, b_2, . . . , b_n}. The canonical PAC program

for B is defined as follows: Take the canonical tree program for B, and consider each IDB I_0, I_1, . . . , I_l.

Suppose that

I_j(x) :– I_{j_1}(x_{j_1}), I_{j_2}(x_{j_2}), . . . , I_{j_r}(x_{j_r}), R_j(· · · )

is a rule in the canonical tree program. We modify each such rule by adding a new variable to the IDBs:

I_j(x, y) :– I_{j_1}(x_{j_1}, y), I_{j_2}(x_{j_2}, y), . . . , I_{j_r}(x_{j_r}, y), R_j(· · · ).

For each b_i in B introduce an IDB J_{b_i}(x, y), each of which has a rule of the form

J_{b_i}(x, y) :– I_0(x, y), R_{b_i}(y).

Also introduce an IDB J(x, y) with the rule

J(x, y) :– J_{b_1}(x, y), J_{b_2}(x, y), . . . , J_{b_n}(x, y)


and replace the goal rule

goal :– I_0(x)

with

goal :– J(x, y).

In the above Datalog program, each IDB J_{b_i} intuitively corresponds to saying “The IDB I_0 can be derived

when a fixed element y is set to b_i”. Thus, J(x, y) says that I_0 can be derived when a fixed y is set to any

element in B. This essentially corresponds to the requirement for PAC to reject an instance (A,B), which

we will make formal in the next theorem.

Theorem 6. Let B be a relational structure. PAC solves CSP(B) if and only if the canonical PAC program

for B defines co-CSP(B).

Proof. First, suppose that PAC solves CSP(B). Let A be a structure on the same vocabulary as B, and

suppose that A ↛ B. It follows from Proposition 8 that for some a in A and for each b in B, ([A, {a}], [B, {b}])

does not have a homomorphism from [A, {a}] to ℘([B, {b}]). It follows that the canonical tree program for

[B, {b}] will accept [A, {a}] for any choice of b in B, which means that the empty predicate I_0 will be derived

in the program for some x.

Suppose b ∈ B, and consider the rule for J_b. It follows that if y in J(x, y) is bound to a, then a will be

restricted to being mapped to b in the derivation of J_b, since J_b(x, y) :– I_0(x, y), R_b(y). Moreover, since [A, {a}]

is accepted by the canonical tree program π for [B, {b}], it follows that I_0 will be derived in π only if a is

restricted to being mapped to b. This implies that I_0 will be derived in the above canonical PAC program,

as we can clearly create the exact same derivation tree under the I_0 in the rule for J_b. It follows that J_b will

be derived in the canonical PAC program, and therefore J (and the goal predicate) will be derived.

Now suppose that the canonical PAC program defines co-CSP(B); we will show that PAC solves

CSP(B). Again, let A be a structure on the same vocabulary as B, and suppose that A ↛ B. We will show

that (A,B) does not have the PACC, and the proof will be complete.

Since A is not homomorphic to B, it follows that A is accepted by the canonical PAC program for B.

Therefore, for each b ∈ B, J_b(x, y) must be derived by the program when y is bound to some a ∈ A. By our

discussion above, it follows that the canonical tree program for [B, {b}] must accept [A, {a}]. Since we can

view the canonical tree program for B as mapping variables from A to elements in ℘(B) (as in [FV98]),

if I_0 is derived by the canonical tree program then clearly a cannot map to {b} in any homomorphism

h : A → ℘(B). Since J_b is derived for each b ∈ B in the canonical PAC program, it follows that I_0 is derived

by the canonical tree program for [B, {b}] for any choice of b ∈ B while running on [A, {a}]. It follows that

in any homomorphism h : A → ℘(B), the variable a ∈ A cannot map to {b} for any b ∈ B. It follows that

(A,B) cannot have the PACC, and our proof is complete.


Corollary 1. If CSP(B) is solvable by PAC, then it has width (2, k) for some k.

From this result come two natural questions regarding the relative strength of PAC. First, is PAC

stronger than extended tree duality? And if it is, can PAC solve every width two CSP? We will provide a

positive answer to the first question through the following two propositions.

Proposition 9. Let B be a relational structure. If CSP(B) is solvable by EAC then CSP(B) is solvable by

PAC.

Proof. This follows trivially from the algorithm descriptions.

Proposition 10. There exists a structure B2SAT such that CSP(B2SAT) is solvable by PAC but is

not solvable by EAC.

Proof. The structure B2SAT will be defined on the universe B = {0, 1}, and will have three binary relations

R_{00}^B, R_{01}^B, R_{11}^B, where R_{ij}^B = {0, 1}^2 \ {(i, j)}. The corresponding constraint problem CSP(B2SAT) is exactly

2-Satisfiability. To see this, observe that each relation R_{ij} corresponds to the satisfying assignments of one of

the three distinct types of clauses allowable in a 2-Satisfiability instance.

From [BC10, Theorem 17], we know that PAC solves CSP(B2SAT). To see that EAC does not solve

CSP(B2SAT), we again consider ℘(B2SAT). This structure has a universe of ℘(B) = {{0}, {1}, {0, 1}}, and

three relations R_{00}^{℘(B2SAT)}, R_{01}^{℘(B2SAT)}, R_{11}^{℘(B2SAT)}. We make two observations:

1. Unlike for B2COL, each of the relations R_{ij}^{℘(B2SAT)} is connected, and so S(B2SAT) = ℘(B2SAT).

2. ({0, 1}, {0, 1}) ∈ R_{ij}^{℘(B2SAT)} for each 0 ≤ i ≤ j ≤ 1.

Suppose that a homomorphism h : ℘(B2SAT) → B2SAT exists, and consider h({0, 1}). If h({0, 1}) = 0, then

a contradiction comes from observation 2 above and that (0, 0) is not in R_{00}^{B2SAT}. If h({0, 1}) = 1, then a

contradiction comes from observation 2 and that (1, 1) is not in R_{11}^{B2SAT}. Thus ℘(B2SAT) (and therefore

S(B2SAT)) is not homomorphic to B2SAT, and we have that B2SAT does not have extended tree duality.
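
The two facts used in this argument are easy to check directly. The sketch below assumes the usual definition of ℘(B), under which a tuple of sets lies in a lifted relation when it is the tuple of coordinate projections of a nonempty subset of the original relation; the variable names are hypothetical.

```python
from itertools import product

B = (0, 1)
all_pairs = set(product(B, repeat=2))
# R[(i, j)] is the relation R_ij = {0, 1}^2 \ {(i, j)} of the 2-SAT template.
R = {ij: all_pairs - {ij} for ij in [(0, 0), (0, 1), (1, 1)]}

# Observation 2: taking all of R_ij as the witnessing subset, both coordinate
# projections equal {0, 1}, so ({0, 1}, {0, 1}) lies in the lifted relation.
loop_in_every_relation = all(
    {t[k] for t in rel} == {0, 1} for rel in R.values() for k in range(2))

# A homomorphism h must send that loop to a pair (v, v) present in every R_ij.
blocked_by = {v: [ij for ij, rel in R.items() if (v, v) not in rel] for v in B}
```

Each candidate value v for h({0, 1}) is blocked by at least one relation, which is exactly the case analysis in the proof.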

However, the second question has a negative answer. There does in fact exist a structure B whose

constraint problem is width two and is not solvable by PAC. To show this we will use the singleton arc

consistency algorithm, which is introduced in Section 6.

Proposition 11. There exists a structure B such that CSP(B) is both width two and not solvable by PAC.

5.2 A Duality for Solvability by PAC

In this section we will introduce a duality in the vein of extended tree duality that captures exactly those

structures whose constraint problems are solvable by the PAC algorithm. This peek duality will make explicit

the difference between solvability by PAC and solvability by EAC — it boils down to a change in quantifiers.


Definition 11. Let B be a structure, and let A be a structure on the same vocabulary. We say that B

has peek duality if the structure A is homomorphic to B if and only if for all a in A, there exists b in B

such that for all trees T, if T homomorphically maps to A with elements in T ′ ⊆ T mapping to a then T

homomorphically maps to B with the elements in T ′ mapping to b.

Before we prove that peek duality is equivalent to solvability by the PAC algorithm, let’s first compare the

above definition with that of extended tree duality (see Definition 4). As we remarked before, the difference

boils down to quantifiers: the above definition requires, for every tree T and every element a in A, a

homomorphic map from T to A and from T to B that preserves the mapping a ↦ b for some element b. The

definition for extended tree duality requires that for every tree T, there exists a homomorphic map from T

to A and from T to B that preserves some mapping a ↦ b for elements a in A and b in B. So, the change

of the quantifier on the element a in A from an existential to a universal strongly increases the power of our

duality, as well as transforming our corresponding algorithm from an acceptance procedure to a rejection

procedure.
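
The quantifier change can be made explicit by writing the two acceptance conditions side by side. The display below is only a restatement under the conventions of this paper: the first line is the condition extracted in the proof of Proposition 3 (with A assumed connected), and the second line is Definition 8 (the PACC).

```latex
\begin{align*}
\text{EACC:}\quad & \exists a \in A\ \exists b \in B\ \exists h : A \to \wp(B)
  \ \text{such that}\ h(a) = \{b\},\\
\text{PACC:}\quad & \forall a \in A\ \exists b \in B\ \exists h : A \to \wp(B)
  \ \text{such that}\ h(a) = \{b\}.
\end{align*}
```

The only difference between the two lines is the leading quantifier on a.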

The proof of the equivalence of peek duality with solvability by PAC is straightforward, and we are able

to simplify the proof using the results in Theorem 3.

Theorem 7. Let B be a structure. CSP(B) is solvable by PAC if and only if B has peek duality.

Proof. (⇒) First, assume that CSP(B) is solvable by PAC, and let A be a structure on the same vocabulary

as B. Assume also that for every element a in A there exists b in B such that for all trees T, if T

homomorphically maps to A with elements in T ′ ⊆ T mapping to a, then T also homomorphically maps

to B with elements in T ′ mapping to b. By Theorem 3 (specifically, the equivalence of (2) and (4) in that

theorem), for every element a in A there exists b in B and a homomorphism f : A → ℘(B) such that

f(a) = {b}. This is exactly the PACC, and so by our assumption A homomorphically maps to B.

Now assume that A is homomorphic to B. Since CSP(B) is solvable by PAC, it follows that (A,B) has

the PACC. This means that for every element a in A, there exists b in B and a homomorphism h : A→ ℘(B)

such that h(a) = {b}. By the equivalence of (2) and (4) in Theorem 3, it follows that for every a in A, there

exists a b in B such that A and B have a tree duality which preserves the mapping a ↦ b. It follows that if

CSP(B) is solvable by PAC, then B has peek duality.

(⇐) Now assume that B has peek duality, and we will show that CSP(B) is solvable by PAC. Let A be a

structure on the same vocabulary as B such that (A,B) has the PACC. Then for every a in A, there exists

a b in B and a homomorphism h : A → ℘(B) such that h(a) = {b}. It follows from the equivalence of (2)

and (4) in Theorem 3 that every tree T that maps to A also maps to B while preserving the map a ↦ b.

Using the definition of peek duality, we can conclude that A is homomorphic to B, and so CSP(B) must be

solvable by PAC.


We finish this section with some remarks on the difference between PAC and EAC, and a quick summary

of results to this point. The above theorem shows that a structure B having extended tree duality is

essentially equivalent to a form of restricted PAC acceptance. More specifically, we don’t require that there

is a homomorphism h from A to ℘(B) such that h(a) is a singleton for any a in A. We only require that

there exists a homomorphism to ℘(B) such that some element a in A maps to a fixed element b in B. In this

sense EAC is somewhat similar to look-ahead arc consistency, which was also studied in [CDG11]. We’ve

shown that the distinction between the two cases lies in the quantifier of our elements in A — the first

condition requires us to check that there is a corresponding homomorphism for every element of A, while

the second condition requires that we check that there is a homomorphism for some element of A. Thus

these conditions are, in some loose sense, negations of each other. It follows that this requirement to check

homomorphisms for each element a in A seems to necessitate the infinite definition shown in the previous

section, as the universe of A is finite but unbounded.

Finally, we present a summary of the results on PAC.

Theorem 8. Let B be a structure. The following are equivalent:

1. CSP(B) is solvable by PAC.

2. For each n ≥ 1, there exists a homomorphism h : Sing(℘(B)^n) → B.

3. The canonical PAC program for B defines co-CSP(B).

4. The structure B has peek duality.

6 Singleton Arc Consistency

We will now briefly present the singleton arc consistency (SAC) algorithm. SAC was originally introduced

by Debruyne and Bessiere [DB97], and its power was thoroughly explored by Chen et al. in [CDG11]. We

will introduce it as a natural extension of the PAC algorithm, and use it to show the existence of a structure

B whose constraint problem is width two yet not solvable by PAC.

The SAC algorithm works by enforcing additional constraints on each variable, and uses AC as a pruning

procedure for each possible set of mappings. One can also view SAC as “laying” a list homomorphism

problem on top of the regular homomorphism problem, and continually pruning the lists until they stabilize

for each variable.

We now ask the same questions that we asked for AC: for what structures does SAC accept, and for

what structures does it accept correctly? The answers are understandably more complicated, and presented

in [CDG11] (we refer the reader there for proofs).


Algorithm 4 Singleton Arc Consistency

Input: Two relational structures A,B over a vocabulary τ .

for all a ∈ A do

Sa ← B

end for

Denote A = {a1, a2, . . . , an}.

repeat

for all a ∈ A, b ∈ Sa do

if ArcConsistency([A, {a_1}, {a_2}, . . . , {a_n}, {a}], [B, S_{a_1}, S_{a_2}, . . . , S_{a_n}, {b}]) rejects then

Remove b from Sa.

end if

end for

until no set Sa changes

if there exists a ∈ A such that Sa is empty then

Reject.

else

Accept.

end if
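
Algorithm 4 can be sketched in Python along the same lines. The subroutine `arc_consistency` is again an assumed, standard formulation, and the function names and the `pinned` argument (which here carries every current list S_a, with the peeked variable restricted to {b}) are hypothetical.

```python
def arc_consistency(A, A_rels, B, B_rels, pinned=None):
    """Return the stable candidate sets, or None if some set becomes empty."""
    cand = {a: set(B) for a in A}
    for a, allowed in (pinned or {}).items():
        cand[a] &= set(allowed)
    changed = True
    while changed:
        changed = False
        for name, tuples in A_rels.items():
            for tup in tuples:
                support = [s for s in B_rels[name]
                           if all(s[i] in cand[x] for i, x in enumerate(tup))]
                for i, x in enumerate(tup):
                    kept = cand[x] & {s[i] for s in support}
                    if kept != cand[x]:
                        cand[x] = kept
                        changed = True
    return None if any(not c for c in cand.values()) else cand


def singleton_arc_consistency(A, A_rels, B, B_rels):
    """Prune the lists S_a until they stabilise; accept iff none is empty."""
    lists = {a: set(B) for a in A}
    changed = True
    while changed:
        changed = False
        for a in A:
            for b in list(lists[a]):
                trial = dict(lists)   # every variable keeps its current list
                trial[a] = {b}        # except a, which is peeked at b
                if arc_consistency(A, A_rels, B, B_rels, pinned=trial) is None:
                    lists[a].discard(b)
                    changed = True
    return all(lists[a] for a in A)


def cycle(n):
    """Symmetric edge relation of the n-cycle on vertices 0..n-1."""
    edges = {(i, (i + 1) % n) for i in range(n)}
    return set(range(n)), {"E": edges | {(v, u) for u, v in edges}}


B, B_rels = {0, 1}, {"E": {(0, 1), (1, 0)}}
sac_on_triangle = singleton_arc_consistency(*cycle(3), B, B_rels)
sac_on_square = singleton_arc_consistency(*cycle(4), B, B_rels)
```

Unlike the PAC sketch, each trial here also respects the lists that have already been pruned, and the outer loop repeats until no list changes.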

Definition 12. An instance (A,B) has the singleton arc consistency condition (SACC) if there exists a

mapping s : A → ℘(B) such that, for each a ∈ A, b ∈ s(a), there exists a homomorphism h_{a,b} : A → ℘(B)

with the following properties:

• h_{a,b}(a) = {b} and,

• for each a′ ∈ A, h_{a,b}(a′) ⊆ s(a′).

Proposition 12. The singleton arc consistency algorithm does not reject an instance (A,B) if and only if

the instance has the SACC.

Definition 13. Let B be a structure. We say that SAC solves CSP(B) if, for all instances (A,B), the

following holds: (A,B) has the SACC implies that A→ B.

There is also a structure, like ℘(B), whose homomorphisms define whether or not CSP(B) can be solved

by SAC. To define it, we will use the following notation. We denote by UnionSing(℘(B)^n) the induced

substructure of ℘(B)^n whose universe contains an n-tuple (S_1, S_2, . . . , S_n) of ℘(B)^n if and only if

⋃_{i∈[n]} S_i = ⋃_{i∈[n], |S_i|=1} S_i. In other words, membership of an n-tuple in the universe is determined by the tuple’s

singleton sets.
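
The membership condition can be written as a small predicate. The sketch below only treats the universe of UnionSing(℘(B)^n), not its relations, and the helper names `in_union_sing` and `union_sing_universe` are hypothetical.

```python
from itertools import combinations, product


def nonempty_subsets(B):
    """All nonempty subsets of B, i.e. the universe of the power structure."""
    elems = sorted(B)
    return [frozenset(c) for r in range(1, len(elems) + 1)
            for c in combinations(elems, r)]


def in_union_sing(t):
    """True when the union of all entries equals the union of the singleton entries."""
    union_all = set().union(*t)
    union_of_singletons = set().union(*(S for S in t if len(S) == 1))
    return union_all == union_of_singletons


def union_sing_universe(B, n):
    return [t for t in product(nonempty_subsets(B), repeat=n) if in_union_sing(t)]


f0, f1, f01 = frozenset({0}), frozenset({1}), frozenset({0, 1})
covered = in_union_sing((f0, f1, f01))   # {0, 1} is covered by {0} and {1}
not_covered = in_union_sing((f0, f01))   # the value 1 appears in no singleton
size_n2 = len(union_sing_universe({0, 1}, 2))
```

For B = {0, 1} and n = 2 only the four pairs of singletons qualify, which is a stricter condition than the "at least one singleton" requirement used for Sing(℘(B)^n).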


Theorem 9. Let B be a structure. Then SAC solves CSP(B) if and only if for all n ≥ 1 there exists a

homomorphism from UnionSing(℘(B)^n) to B.

It was shown by Chen et al. that if CSP(B) is solvable by SAC, then CSP(B) has width (r, r + 1)

for r = max(2, maxar(B)), where maxar(B) is the maximum arity of a relation in B [CDG11, Proposition 25]. They also showed that, for the following structure B∗,

CSP(B∗) is solvable by SAC and not by PAC [CDG11, Theorem 22].

Definition 14. We define the structure B∗ as follows. The universe of B∗ is the set B = {0, 1, 2, 3}, and

the structure has the following relations:

U_0^{B∗} = {(0)},

U_1^{B∗} = {(1)},

U_2^{B∗} = {(2)},

U_3^{B∗} = {(3)},

R_1^{B∗} = {0, 1, 2, 3}^2 \ {(0, 0)},

R_2^{B∗} = {(1, 2), (2, 3), (3, 1), (0, 0)}.

Theorem 10. CSP(B∗) is solvable by SAC but not solvable by PAC.

Note that the maximum arity of any relation in B∗ is two. It follows from Proposition 25 in [CDG11] that CSP(B∗) has width (2, 3), and so combining this with the above theorem we have the following corollary.

Corollary 2. There exists a structure B such that CSP(B) has width two and is not solvable by PAC.

We can summarize each of the containment results for each algorithm as follows:

Theorem 11. Let B be a structure. Then

1. If CSP(B) is solvable by AC, then CSP(B) is solvable by EAC.

2. If CSP(B) is solvable by EAC, then CSP(B) is solvable by PAC.

3. If CSP(B) is solvable by PAC, then CSP(B) is solvable by SAC.

4. If CSP(B) is solvable by SAC, then CSP(B) has bounded width.

The containments 1, 2, and 3 are known to be strict.

7 Conclusion

In this paper, we followed the development of different consistency algorithms using arc consistency in the

same way as Chen et al. in [CDG11]. We produced a Datalog fragment and a duality characterization for


PAC, bringing it up to the rigorous development that AC has had for years. We also performed a deeper

analysis of extended arc consistency, studying its power relative to the other consistency algorithms.

The development of extended arc consistency is likely the most interesting group of results — we were

unable to find any mention of it ever being studied before with any real rigour. It would be interesting to

see if there is any way to apply the “seeking acceptance” philosophy of EAC to a class beyond PAC, perhaps

finding a class of structures that is to PAC as EAC is to AC. If anything, our results show that interesting

classes of relational structures certainly exist “between” the standard width classes definable by Datalog,

and it wouldn’t hurt to further investigate them.

Creating a Datalog fragment that captures PAC is a natural result. Most candidly it speaks to how

Datalog is fundamentally a procedure for rejection instead of acceptance, as shown by the comparison between

the existence of a Datalog fragment for PAC and the seeming non-existence of a Datalog fragment for EAC.

As shown by the duality theorem proven for PAC, the existence of an infinite structure-homomorphism

definition for acceptance by a consistency algorithm seems to be tied to this “rejection” property that

consistency algorithms share.

The most important open question from this work is to give SAC the same rigorous treatment as AC

and PAC. It was conjectured in [CDG11] that the CSPs solvable by SAC are exactly the CSPs of bounded

width. If this is true, then having a canonical Datalog fragment for SAC and a duality theorem will give an

interesting new characterization of these languages.

Another line of research would be to extend our results on algorithms to algebra. It would be interesting

to see what algebraic identities could be introduced to capture EAC, PAC and SAC, similar to the existence

of a totally symmetric polymorphism for AC [DP99]. We note that it is simple to prove that if CSP(B) is

solvable by SAC, then it has a 1-immune polymorphism, as defined by Kun and Szegedy [KS09]. Proving the

converse, however, would solve the above conjecture of Chen et al.

References

[BC10] Manuel Bodirsky and Hubie Chen. Peek arc consistency. Theor. Comput. Sci., 411(2):445–453,

2010.

[BJK05] Andrei A. Bulatov, Peter Jeavons, and Andrei A. Krokhin. Classifying the complexity of constraints

using finite algebras. SIAM J. Comput., 34(3):720–742, 2005.

[BK09] Libor Barto and Marcin Kozik. Constraint satisfaction problems of bounded width. In FOCS,

pages 595–603, 2009.


[BKL08] Andrei A. Bulatov, Andrei A. Krokhin, and Benoit Larose. Dualities for constraint satisfaction

problems. In Complexity of Constraints, volume 5250 of Lecture Notes in Computer Science, pages

93–124. Springer, 2008.

[BL08] Benoit Larose and Laszlo Zadori. Bounded width problems and algebras. Algebra Universalis, 56:439–466, 2008.

[CDG11] Hubie Chen, Víctor Dalmau, and Berit Grußien. Arc consistency and friends. CoRR, abs/1104.4993, 2011.

[CDK08] Catarina Carvalho, Víctor Dalmau, and Andrei A. Krokhin. Caterpillar duality for constraint satisfaction problems. In LICS, pages 307–316. IEEE Computer Society, 2008.

[Dal05] Víctor Dalmau. Linear Datalog and bounded path duality of relational structures. CoRR, abs/cs/0504027, 2005.

[DB97] Romuald Debruyne and Christian Bessiere. Some practicable filtering techniques for the constraint satisfaction problem. In Proceedings of IJCAI97, pages 412–417, 1997.

[DP99] Víctor Dalmau and Justin Pearson. Closure functions and width 1 problems. In Joxan Jaffar, editor, CP, volume 1713 of Lecture Notes in Computer Science, pages 159–173. Springer, 1999.

[FV98] Tomas Feder and Moshe Y. Vardi. The computational structure of monotone monadic SNP and constraint satisfaction: A study through Datalog and group theory. SIAM J. Comput., 28(1):57–104, 1998.

[HN04] P. Hell and J. Nesetril. Graphs and Homomorphisms. Oxford University Press, 2004.

[HZN96] P. Hell, X. Zhu, and J. Nesetril. Duality and polynomial testing of tree homomorphisms. Trans. Amer. Math. Soc., 348:1281–1297, 1996.

[Jea98] Peter Jeavons. On the algebraic structure of combinatorial problems. Theor. Comput. Sci., 200(1-2):185–204, 1998.

[KS09] Gabor Kun and Mario Szegedy. A new line of attack on the dichotomy conjecture. In Proceedings

of the 41st annual ACM Symposium on Theory of Computing, STOC ’09, pages 725–734, New

York, NY, USA, 2009. ACM.

[LT09] Benoit Larose and Pascal Tesson. Universal algebra and hardness results for constraint satisfaction

problems. Theor. Comput. Sci., 410(18):1629–1647, 2009.


[Mac77] Alan K. Mackworth. Consistency in networks of relations. Artificial Intelligence, 8(1):99–118, 1977.

[NT00] Jaroslav Nesetril and Claude Tardif. Duality theorems for finite structures (characterising gaps

and good characterisations). J. Comb. Theory Ser. B, 80:80–97, September 2000.


A Proof of Lemma 1

In this appendix, we will prove Lemma 1 (see page 18), regarding the definability of singleton, unary relations

in Datalog. To do that, we will use the following lemma, which was first proven in [LT09]. For a set of

relations Γ, we will denote by Pol(Γ) the set of all polymorphisms that the relations in Γ are each invariant

under. We also remark that by a primitive positive formula, we mean a formula in FO restricted to only

∃∧-sentences.

Lemma 2. Let Γ and Γ′ be finite sets of relations on A. Then the following conditions are equivalent:

1. Pol(Γ) ⊆ Pol(Γ′);

2. for every R ∈ Γ′ of arity k there exists a (primitive positive) formula

$$\varphi(x_1, x_2, \ldots, x_k) \equiv \exists y_1 \exists y_2 \cdots \exists y_m \, \psi(x_1, x_2, \ldots, x_k, y_1, y_2, \ldots, y_m)$$

where ψ is a conjunction of atomic formulae with relations in Γ ∪ {=} such that $(a_1, a_2, \ldots, a_k) \in R$ if and only if $\varphi(a_1, a_2, \ldots, a_k)$ holds;

3. there exists a finite sequence $\Gamma = \Gamma_0, \Gamma_1, \ldots, \Gamma_S = \Gamma'$ such that each set of relations $\Gamma_i$ is obtained from the previous set of relations by one of the following operations:

(a) removing a relation,

(b) adding a relation obtained by permuting the variables of a relation,

(c) adding the intersection of two relations of the same arity,

(d) adding the product of two relations,

(e) adding a relation obtained by projecting an n-ary relation to its first n − 1 variables,

(f) adding the equality relation.
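The closure operations in condition 3 are ordinary set manipulations, and a short Python sketch may help show how they combine into a primitive positive definition. The function names are hypothetical. The example uses the relations R2 and U2 of B∗ from Definition 14 to build the relation defined by φ(x) ≡ ∃y∃z R2(x, y) ∧ U2(z) ∧ y = z.

```python
def permute(rel, perm):
    """Reorder the coordinates of every tuple according to perm."""
    return {tuple(t[i] for i in perm) for t in rel}


def intersect(r, s):
    """Intersection of two relations of the same arity."""
    return r & s


def product(r, s):
    """Product of two relations: concatenate every pair of tuples."""
    return {t + u for t in r for u in s}


def project(rel):
    """Project an n-ary relation to its first n - 1 coordinates."""
    return {t[:-1] for t in rel}


def equality(universe):
    """The binary equality relation on the universe."""
    return {(b, b) for b in universe}


UNIVERSE = {0, 1, 2, 3}
R2 = {(1, 2), (2, 3), (3, 1), (0, 0)}
U2 = {(2,)}

eq = equality(UNIVERSE)
full = project(eq)                  # the full unary relation
y_equals_z = product(full, eq)      # all (x, y, z) with y = z
step = intersect(product(R2, U2), y_equals_z)
defined = project(project(step))    # quantify away z, then y
```

Here `defined` is the unary relation {(1,)}, since (1, 2) is the only tuple of R2 whose second coordinate is 2.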

Using this, we can prove our lemma:

Lemma 1. Fix a relational structure B, and let b be an element of B. Any unary, singleton relation $R_b = \{b\}$ not in B is definable by a monadic, recursion-free IDB $I_b$ in the Datalog program for B.

Proof. Let $\Gamma_B$ be the set of relations in B. It follows from Theorem 4.7 in [BJK05] that $\mathrm{Pol}(\Gamma_B) \subseteq \mathrm{Pol}(\Gamma_B \cup \{R_b \mid b \in B\})$. Therefore, by the above lemma, for each $R_b$ there exists a unary primitive positive formula φ(x) that defines $R_b$ using only relations in $\Gamma_B$ and equality. This can be transformed into a monadic IDB $I_b$ in the straightforward manner, with each relation appearing in φ appearing in the body of the IDB, using the same existential variables. To represent equality between two variables $y_i, y_j$ in Datalog (recall that the pp-formula is defined on $\Gamma_B \cup \{=\}$), we simply replace all occurrences of $y_i$ with $y_j$ in the body of the rule.
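The equality-elimination step at the end of the proof can be sketched in Python. The rule representation (a head atom and a list of body atoms, each a pair of a relation name and a tuple of variable names) and the name `eliminate_equalities` are hypothetical choices for this illustration, not notation from the paper.

```python
def eliminate_equalities(head, body, equalities):
    """Remove equality atoms from a rule by merging the variables they equate."""
    subst = {}

    def find(v):
        # Follow substitutions until a representative variable is reached.
        while v in subst:
            v = subst[v]
        return v

    for yi, yj in equalities:
        ri, rj = find(yi), find(yj)
        if ri != rj:
            subst[ri] = rj  # replace all occurrences of yi with yj

    new_head = (head[0], tuple(find(v) for v in head[1]))
    new_body = [(rel, tuple(find(v) for v in args)) for rel, args in body]
    return new_head, new_body


# Hypothetical rule: I_b(x) :- R(x, y1), S(y2), y1 = y2
rule_head = ("I_b", ("x",))
rule_body = [("R", ("x", "y1")), ("S", ("y2",))]
new_head, new_body = eliminate_equalities(rule_head, rule_body, [("y1", "y2")])
```

After the call, the body no longer mentions y1: the rule reads I_b(x) :- R(x, y2), S(y2), which is expressible in Datalog without an equality atom.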
