On propositional definability



Artificial Intelligence 172 (2008) 991–1017

www.elsevier.com/locate/artint

On propositional definability ✩

Jérôme Lang a, Pierre Marquis b,∗

a IRIT, CNRS / Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse, France
b CRIL, CNRS / Université d’Artois, rue Jean Souvraz, S.P. 18, 62307 Lens, France

Received 21 November 2006; received in revised form 19 December 2007; accepted 28 December 2007

Available online 11 January 2008

Abstract

In standard propositional logic, logical definability is the ability to derive the truth value of some propositional symbols given a propositional formula and the truth values of some propositional symbols. Although appearing more or less informally in various AI settings, a computation-oriented investigation of the notion is still lacking, and this paper aims at filling the gap. After recalling the two definitions of definability, which are equivalent in standard propositional logic (while based on different intuitions), and defining a number of related notions, we give several characterization results, and many complexity results for definability. We also show close connections with hypothesis discriminability and with reasoning about action and change.

© 2008 Elsevier B.V. All rights reserved.

Keywords: Knowledge representation; Propositional logic; Computational complexity; Definability; Hypothesis discriminability; Reasoning about action and change

1. Introduction

When reasoning about knowledge represented in propositional logic, exhibiting structure can be of great help. By “structure” we mean relationships that exist between sets of propositional symbols and/or formulas within a propositional formula Σ. Such relationships are known under various names, including dependency, relevance, novelty, controllability, and some of them have been investigated; see, among others, [1,2].

In this paper we focus on an additional form of dependency, called definability. Definability captures two different intuitions: implicit definability and explicit definability. A propositional symbol y can be implicitly defined in a given formula Σ in terms of a set X of propositional symbols if and only if the knowledge of the truth values of the propositional symbols of X (whatever they are) suffices to determine the truth value of y, while y can be explicitly defined in Σ in terms of X when there exists a formula Φ_X built up from X only, such that Φ_X is equivalent to y in Σ.

✩ This paper is an extended and revised version of some parts of two papers: “Complexity results for independence and definability in propositional logic”, which appeared in the Proceedings of the Sixth International Conference on Principles of Knowledge Representation and Reasoning (KR’98), pages 356–367; and “Two forms of dependence in propositional logic: Controllability and definability”, which appeared in the Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI’98), pages 268–273.

* Corresponding author.
E-mail addresses: [email protected] (J. Lang), [email protected] (P. Marquis).

0004-3702/$ – see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.artint.2007.12.003


Table 1
The complexity of DEFINABILITY

Fragment C               | DEFINABILITY
PROP_PS (general case)   | coNP-c
DNNF                     | in P
q-HornCNF                | in P
IP                       | coNP-c

Definability has been acknowledged as an important logical concept for decades. It is closely related to the Craig/Lyndon interpolation theorem [3]. Many studies in logic are about determining whether a given logic (standard or modal, propositional or first-order) satisfies the “basic” Beth property (whenever a theory implicitly defines a symbol in terms of all others, there is an explicit definition of that symbol in terms of all others), or even the (stronger) projective Beth property (when implicit definability and explicit definability coincide). Thus classical first-order logic satisfies the “basic” Beth property (this is the famous Beth’s theorem [4]), as well as the projective one, while for instance first-order logic on finite structures does not (see e.g., [5]).

Standard propositional logic is known to satisfy the projective Beth property. In this paper, we consider definability in standard propositional logic from a computational point of view. We present several characterization and complexity results which prove useful for several AI applications, including hypothesis discrimination and reasoning about actions and change.

From a computational point of view, our results concern both time and space complexity. As to time complexity, we mainly consider the decision problem DEFINABILITY, which consists in determining whether a given formula Σ defines a given symbol y (or more generally a given set Y of symbols) in terms of a given set X of symbols. We identify its complexity both in the general case and under restrictions induced by a number of propositional fragments (formally defined in Section 2) that have proved of interest in many AI contexts (see [6–9]); the results are summarized in Table 1.

While the table shows that the definability problem is intractable in the general case (unless P = NP), it also shows that:

• The main propositional fragments which are tractable for SAT are also tractable for DEFINABILITY. Indeed, DNNF contains (among others) all DNF formulas and all OBDD “formulas”, while q-HornCNF contains all renamable Horn CNF formulas. The fact that large propositional fragments (including complete ones, i.e., fragments into which any propositional formula can be translated equivalently, as is the case for DNNF) are tractable for DEFINABILITY is of great value from a practical perspective.

• Nevertheless, tractability for SAT is not enough for ensuring tractability for DEFINABILITY. Thus the Blake fragment IP is tractable for SAT but likely not for DEFINABILITY. We also identify some sufficient conditions (referred to as stability conditions) under which a propositional fragment is tractable for SAT if and only if it is tractable for DEFINABILITY.

As to space complexity, we focus on the size of definitions; we show that in the general case, the size of any explicit definition of a symbol y in terms of a set of symbols X in Σ is not polynomially bounded in the input size. We identify some sufficient conditions (polytime conditioning and polytime forgetting) on propositional fragments for ensuring that definitions can be computed in polynomial time (hence are of polynomial size) when such definitions exist. Interestingly, the influential DNNF fragment satisfies them, as well as the Blake fragment IP. The result for IP shows that it can be the case that computing an explicit definition of y on X in Σ is easy when one knows that such a definition exists, while deciding whether it exists is hard.

The rest of the paper is organized as follows. In Section 2, we give some necessary background about propositional logic and computational complexity. In Section 3 the notion of definability is presented, as well as a number of related notions, including the notions of minimal defining family (or base), undefinable symbol, necessary symbol and relevant symbol, as well as the notion of unambiguous definability. We also show how such notions relate to one another and are connected to previous concepts, especially variable forgetting (see [2,10]) as well as the notions of weakest sufficient and strongest necessary conditions [11]. In Section 4, we give a number of complexity results for definability and the related notions. We identify a number of tractable restrictions of the decision problems under consideration. We also report some complexity results about the size of explicit definitions and present an algorithm for computing a base. In Section 5, we show that definability is closely related to hypothesis discriminability. In Section 6, we explain how many important issues in reasoning about action and change can be characterized in terms of definability. In Section 7, we briefly sketch how definability can prove useful to automated reasoning. In Section 8, we relate our results to the literature. Finally, Section 9 concludes the paper.

2. Formal preliminaries

2.1. Propositional logic

Let PS be a finite set of propositional symbols (also called variables). PROP_PS is the DAG-based propositional language built up from PS, the connectives ¬, ∨, ∧, ⇒, ⇔ and the Boolean constants true and false in the usual way. Subsets of PS are denoted X, Y, etc. For every X ⊆ PS, PROP_X denotes the sublanguage of PROP_PS generated from the propositional symbols of X only.

From now on, Σ denotes a finite set of propositional formulas from PROP_PS. Var(Σ) is the set of propositional symbols appearing in Σ and |Σ| is the size of Σ, i.e., the number of symbols used to write it. Elements of PS are denoted x, y, etc. Specific formulas from PROP_PS are of interest: a literal is a symbol x of PS (positive literal) or a negated one ¬x (negative literal). x and ¬x are two complementary literals. A clause (resp. term) is a disjunction (resp. conjunction) of literals, or the constant false (resp. true). A Conjunctive Normal Form formula (for short, a CNF formula) is a conjunction of clauses. A Disjunctive Normal Form formula (for short, a DNF formula) is a disjunction of terms. A CNF formula is Krom [12] if and only if each clause in it contains at most two literals. A Krom formula is also said to be a 2-CNF formula or a quadratic formula. A CNF formula is Horn [13] if and only if each clause in it contains at most one positive literal. A CNF formula Σ is renamable Horn [14] if and only if there exists a Horn renaming for it, i.e., a set V of symbols v such that replacing every occurrence of v ∈ V (resp. ¬v) in Σ by the complementary literal ¬v (resp. v) leads to a Horn CNF formula. A CNF formula Σ has a QH-partition [6] if and only if there exists a partition {Q, H} of Var(Σ) s.t. for every clause δ of Σ, the following conditions hold:

• δ contains no more than two variables from Q;
• δ contains at most one positive literal from H;
• if δ contains a positive literal from H, then it contains no variable from Q.

A CNF formula Σ is q-Horn [6] if and only if there exists a q-Horn renaming for it, i.e., a set V of symbols v such that replacing in Σ every occurrence of a positive literal v (resp. a negative literal ¬v) by the complementary literal ¬v (resp. v) leads to a CNF formula having a QH-partition {Q, H}. The propositional fragment q-HornCNF is the set of all q-Horn formulas from PROP_PS; it includes both the Krom formulas (H = ∅) and the renamable Horn CNF formulas (Q = ∅) as proper subsets.
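The two definitions above can be checked mechanically on small formulas. The following is a minimal brute-force sketch, assuming a hypothetical textual encoding of CNF formulas (clauses separated by `;`, negative literals prefixed by `-`); the helper names are ours. It enumerates all renamings and all partitions, so it is exponential in the number of symbols and is in no way the polynomial recognition algorithm of [6,17].

```python
from itertools import combinations

def parse_cnf(text):
    """Parse e.g. "-a -b c; a -b -c" into a list of clauses.
    A clause is a frozenset of (variable, sign) pairs; sign is True for a positive literal."""
    return [frozenset((lit.lstrip("-"), not lit.startswith("-")) for lit in clause.split())
            for clause in text.split(";")]

def subsets(items):
    """Enumerate all subsets of a collection of symbols."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

def rename(cnf, flipped):
    """Replace every literal built on a symbol of `flipped` by the complementary literal."""
    return [frozenset((v, sign != (v in flipped)) for v, sign in clause) for clause in cnf]

def is_qh_partition(cnf, q):
    """Check the three conditions of a QH-partition {Q, H}, where H is Var(cnf) minus q."""
    for clause in cnf:
        q_vars = {v for v, _ in clause if v in q}
        h_positive = [v for v, sign in clause if v not in q and sign]
        if len(q_vars) > 2 or len(h_positive) > 1 or (h_positive and q_vars):
            return False
    return True

def is_q_horn(cnf):
    """Brute force: try every renaming and every candidate set Q."""
    variables = {v for clause in cnf for v, _ in clause}
    return any(is_qh_partition(rename(cnf, flipped), q)
               for flipped in subsets(variables)
               for q in subsets(variables))
```

For instance, `is_q_horn(parse_cnf("a b; -b c"))` succeeds with Q containing all symbols (the Krom case, H = ∅).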

A Negation Normal Form formula (for short, an NNF formula) is any formula Σ built up from PS, the connectives ¬, ∨, ∧ and the Boolean constants true and false, such that the scope of any occurrence of ¬ in Σ is a symbol or a Boolean constant. Thus, every CNF (resp. every DNF) formula also is an NNF formula. An NNF formula Σ is decomposable (i.e., a DNNF formula) [7,9] if and only if every subformula in Σ of the form ϕ ∧ ψ is such that Var(ϕ) ∩ Var(ψ) = ∅. Obviously, every DNF formula also is a DNNF formula, but the converse does not hold. DNNF is the propositional fragment containing all DNNF formulas from PROP_PS.
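Decomposability is a purely syntactic condition, which the following sketch makes concrete. It assumes a hypothetical encoding of formulas as nested tuples (`("and", f, g)`, `("or", f, g)`, `("not", f)`), with symbols as strings and constants as Python booleans; for conjunctions with more than two conjuncts, decomposability is read as pairwise disjointness of the conjuncts' symbols.

```python
def variables(f):
    """Set of symbols occurring in a formula."""
    if isinstance(f, bool):
        return set()
    if isinstance(f, str):
        return {f}
    return set().union(*(variables(g) for g in f[1:]))

def is_nnf(f):
    """Negation may only be applied to a symbol or a Boolean constant."""
    if isinstance(f, (bool, str)):
        return True
    op, *args = f
    if op == "not":
        return isinstance(args[0], (bool, str))
    return op in ("and", "or") and all(is_nnf(g) for g in args)

def is_dnnf(f):
    """NNF in which the conjuncts of every conjunction share no symbol."""
    if not is_nnf(f):
        return False
    if isinstance(f, (bool, str)) or f[0] == "not":
        return True
    op, *args = f
    if op == "and":
        seen = set()
        for g in args:
            vs = variables(g)
            if seen & vs:
                return False  # two conjuncts share a symbol
            seen |= vs
    return all(is_dnnf(g) for g in args)

# (a ∨ b) ∧ (c ∨ (¬c ∧ d)): the conjuncts of each conjunction share no symbol.
decomposable = ("and", ("or", "a", "b"), ("or", "c", ("and", ("not", "c"), "d")))
# (a ∨ b) ∧ (a ∨ (¬b ∧ c)): both top-level conjuncts mention a.
not_decomposable = ("and", ("or", "a", "b"), ("or", "a", ("and", ("not", "b"), "c")))
```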

Formulas from PROP_PS are interpreted in the standard, usual way. Full instantiations of propositional symbols of PS on BOOL = {0, 1} (worlds) are denoted by ω⃗ and their set is denoted by Ω. Any world satisfying a given formula ϕ is said to be a model of ϕ. Full instantiations of propositional symbols of X ⊆ PS are denoted by x⃗ and called X-worlds; their set is denoted by Ω_X. We shall identify x⃗ with the corresponding canonical conjunction of literals over X in order to simplify the notations; for instance, if X = {a, b} and x⃗ = (a = 1, b = 0) then we also write x⃗ = a ∧ ¬b. We shall also identify any finite set of formulas with the conjunction of all formulas from the set. |= denotes logical entailment and ≡ denotes logical equivalence. If Σ, Φ, Ψ ∈ PROP_PS, Φ and Ψ are said to be Σ-equivalent if and only if Σ |= Φ ⇔ Ψ.

Assuming that worlds are represented by the subsets of all variables they satisfy (i.e., ω⃗ is given by {x ∈ PS | ω⃗(x) = 1}), the Horn envelope of a Horn CNF formula Σ is the smallest set of models (w.r.t. set-inclusion) of Σ (over Var(Σ)) whose intersection closure¹ is the whole set of models of Σ. A q-Horn envelope of a q-Horn CNF formula Σ which has a QH-partition is any smallest set (w.r.t. set-inclusion) of models of Σ (over Var(Σ)) whose QH-convolution closure is the whole set of models of Σ (see [15] for details).

In order to avoid heavy notations, we sometimes abuse notations and write x instead of {x}. For every formula Σ ∈ PROP_PS and every propositional symbol x ∈ PS, Σ_{x←0} (resp. Σ_{x←1}) is the formula obtained by replacing in Σ every occurrence of x by the constant false (resp. true). More generally, if γ is a satisfiable conjunction of literals then the conditioning Σ_γ of Σ by γ is the formula obtained by replacing in Σ every occurrence of each positive literal x of γ by true and every occurrence of each negative literal ¬x of γ by false.

An implicate (resp. implicant) of a formula Σ is a clause δ (resp. a conjunction of literals γ) which is a logical consequence of Σ (resp. such that Σ is a logical consequence of γ). A prime implicate (resp. prime implicant) of Σ is one of its logically strongest implicates (resp. one of its logically weakest implicants). A formula Σ is in prime implicates normal form (or a Blake formula or a prime formula) [16] if and only if it is a CNF formula whose clauses are the prime implicates of Σ (one representative per equivalence class, only). IP is the propositional fragment containing all Blake formulas.
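As an illustration of conditioning, here is a minimal sketch for CNF formulas, assuming clauses are encoded as sets of (symbol, sign) pairs and γ as a dictionary from symbols to truth values (both encodings are ours). Unlike the purely syntactic substitution defined above, the sketch also simplifies the result (clauses that become true are dropped, literals that become false are removed), which preserves logical equivalence.

```python
def condition(cnf, gamma):
    """Condition a CNF by the term gamma (dict symbol -> bool), with simplification.
    The empty clause frozenset() stands for the constant false."""
    result = set()
    for clause in cnf:
        if any(gamma.get(v) == sign for v, sign in clause):
            continue  # some literal of the clause became true: the clause reduces to true
        # literals over symbols of gamma that remain are false: drop them
        result.add(frozenset((v, sign) for v, sign in clause if v not in gamma))
    return result

# (¬a ∨ b) ∧ (a ∨ c)
cnf = [frozenset({("a", False), ("b", True)}), frozenset({("a", True), ("c", True)})]
```

With this encoding, conditioning the formula by a yields the single clause b, and conditioning it by ¬a yields the single clause c.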

Example 1.

• (a ∨ b) ∧ (a ∨ (¬b ∧ c)) is an NNF formula but neither a DNNF formula nor a CNF formula.
• (a ∨ b) ∧ (c ∨ (¬c ∧ d)) is a DNNF formula but neither a DNF formula nor a CNF formula.
• (a ∧ b) ∨ (¬a ∧ d) is a DNF formula.
• (a ∨ b ∨ c) ∧ (¬a ∨ ¬b ∨ ¬c) ∧ (¬a ∨ d) is a CNF formula but neither a DNNF one nor a q-Horn CNF one nor a Blake one.
• (a ∨ b ∨ c) ∧ (¬a ∨ ¬b ∨ ¬c) is a Blake formula but neither a DNNF one nor a q-Horn CNF one.
• (¬a ∨ ¬b ∨ c) ∧ (a ∨ ¬b ∨ ¬c) ∧ (¬a ∨ b ∨ ¬c) ∧ (¬a ∨ ¬d ∨ ¬e) ∧ (¬b ∨ ¬d ∨ e) ∧ (¬c ∨ d ∨ ¬e) ∧ (d ∨ e ∨ ¬f) is a q-Horn CNF formula but neither a DNNF one nor a Blake one nor a renamable Horn CNF one nor a Krom one.
• (a ∨ b) ∧ (¬a ∨ ¬c ∨ d) is a renamable Horn CNF formula but neither a DNNF one nor a Horn CNF one nor a Krom one nor a Blake one.
• (a ∨ ¬b) ∧ (b ∨ ¬c ∨ ¬d) is a Horn CNF formula but neither a DNNF one nor a Krom one nor a Blake one.
• (a ∨ b) ∧ (¬b ∨ c) is a Krom formula but neither a DNNF one nor a Horn CNF one nor a Blake one.

For each of the propositional fragments listed in this section, the recognition problem is tractable (i.e., there exists a (deterministic) polynomial time algorithm for determining whether any given propositional formula belongs to the fragment). This is obvious for most of those fragments, except q-HornCNF (and its subset consisting of all renamable Horn CNF formulas) and, to a lesser extent, IP. For q-HornCNF, see [6,17]; for IP, this comes from the correctness of any resolution-based prime implicates algorithm (like Tison’s [18]): a CNF formula Σ is Blake if and only if whenever two clauses of it have a resolvent, there exists a clause in Σ which implies it, and no clause of Σ is implied by another clause of Σ.
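The resolution-based criterion for IP just stated can be sketched as follows. The encoding is an assumption of ours (clauses separated by `;`, negative literals prefixed by `-`), input clauses are assumed non-tautological, only non-tautological resolvents are considered, and implication between clauses is tested as set inclusion.

```python
from itertools import combinations

def parse_cnf(text):
    """Parse e.g. "a b c; -a -b -c" into a list of clauses (frozensets of (variable, sign))."""
    return [frozenset((lit.lstrip("-"), not lit.startswith("-")) for lit in clause.split())
            for clause in text.split(";")]

def is_tautology(clause):
    return any((v, not sign) in clause for v, sign in clause)

def resolvents(c1, c2):
    """Non-tautological resolvents of two clauses."""
    found = []
    for v, sign in c1:
        if (v, not sign) in c2:
            r = (c1 - {(v, sign)}) | (c2 - {(v, not sign)})
            if not is_tautology(r):
                found.append(r)
    return found

def is_blake(cnf):
    """Every resolvent is implied by some clause, and no clause is implied by another one."""
    clauses = set(cnf)
    if any(c < d for c in clauses for d in clauses):
        return False  # d is implied by the strictly smaller clause c
    for c1, c2 in combinations(clauses, 2):
        for r in resolvents(c1, c2):
            if not any(c <= r for c in clauses):
                return False  # r is an implicate that no clause of the formula implies
    return True
```

The check runs in time polynomial in the size of the input CNF formula, in line with the tractability claim above.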

Unlike PROP_PS and some of its subsets (such as the set of all CNF formulas), q-HornCNF, DNNF and IP are known to be tractable for the satisfiability problem SAT; this means that for each of these fragments, there exists a (deterministic) polynomial time algorithm for determining whether any given formula from the fragment is satisfiable. For instance, in order to determine whether a Blake formula is satisfiable, it is enough to check that it does not reduce to false (the empty clause) (this is a direct consequence of the definition of a Blake formula). For the q-HornCNF and DNNF fragments, see respectively [6] and [7,9].

2.2. Computational complexity

We assume that the reader is familiar with some basic notions of computational complexity, especially the complexity classes P, NP, and coNP, as well as the basic decision problems SAT and UNSAT (and their restrictions to CNF formulas, noted CNF-SAT and CNF-UNSAT) and the classes Δ_k^p, Σ_k^p and Π_k^p of the polynomial hierarchy PH = ⋃_{k≥0} Δ_k^p = ⋃_{k≥0} Σ_k^p = ⋃_{k≥0} Π_k^p (see [19] for details).

1 The intersection closure C of a set S is the smallest set w.r.t. ⊆ such that S ⊆ C and ∀e1, e2 ∈ C, e1 ∩ e2 ∈ C.

Let us recall that a decision problem is said to be at the kth level of PH if and only if it belongs to Δ_{k+1}^p, and is either Σ_k^p-hard or Π_k^p-hard.

It is well-known that if there exists i > 0 such that Σ_i^p = Π_i^p then for every j > i, we have Σ_j^p = Π_j^p = Σ_i^p: PH is said to collapse to level i. It is strongly believed that PH does not collapse (to any level), i.e., it is a truly infinite hierarchy (for every integer k, PH ≠ Σ_k^p).

BH2 (also known as DP) is the class of all languages L such that L = L1 ∩ L2, for some L1 in NP and L2 in coNP. The canonical BH2-complete problem is SAT-UNSAT: a pair of formulas 〈ϕ, ψ〉 is in SAT-UNSAT if and only if ϕ is satisfiable and ψ is not. This class belongs to the Boolean hierarchy; unless NP = coNP, BH2 strictly contains both NP and coNP.

An advice-taking Turing machine is a Turing machine that has associated with it a special “advice oracle” A, which can be any function (not necessarily a recursive one). On input s, a special “advice tape” is automatically loaded with A(|s|) and from then on the computation proceeds as normal, based on the two inputs, s and A(|s|).

An advice-taking Turing machine uses polynomial advice if its advice oracle A satisfies |A(n)| ≤ p(n) for some fixed polynomial p and all non-negative integers n; finally, P/poly is the class of all languages which can be decided in polynomial time by deterministic Turing machines augmented by polynomial advice. It is believed that NP ∩ coNP is not included in P/poly.

3. Definability: Definitions, properties and characterizations

3.1. Implicit and explicit definability

Definability is a strong form of dependence: while dependent propositional symbols interact in some situations, definability imposes that some propositional symbols are fixed whenever some other propositional symbols are fixed as well.

Definition 2 ((implicit) definability). Let Σ ∈ PROP_PS, X, Y ⊆ PS and y ∈ PS.

• Σ defines y in terms of X (denoted by X ⊑_Σ y) if and only if ∀x⃗ ∈ Ω_X, x⃗ ∧ Σ |= y or x⃗ ∧ Σ |= ¬y.
• X ⊑_Σ Y if and only if X ⊑_Σ y for every y ∈ Y.

Note that requiring x⃗ ∧ Σ to be satisfiable would be useless since x⃗ ∧ Σ |= y holds whenever x⃗ ∧ Σ is unsatisfiable. When no X-world consistent with Σ can be found, Σ is unsatisfiable. In this case, definability trivializes, i.e., X ⊑_Σ y holds for every X and y.

Example 3. Let l stand for “leap year”, and d4 (resp. d25, d100, d400) for “divisible by 4” (resp. by 25, 100, 400). Let Σ = {d400 ⇒ l, (d100 ∧ ¬d400) ⇒ ¬l, (d4 ∧ ¬d100) ⇒ l, ¬d4 ⇒ ¬l, d100 ⇔ (d4 ∧ d25), d400 ⇒ d100} be a set of formulas making precise some connections between those symbols.

We have {d4, d25} ⊑_Σ d100; {d4, d100, d400} ⊑_Σ l; {d4, d25, d400} ⊑_Σ l; Σ does not define l in terms of {d25, d100, d400}, because the joint falsity of these three propositional symbols does not enable telling whether l is true or false, since we do not know whether d4 holds or not.

Other definability relations hold; in particular, {l, d100, d400} ⊑_Σ d4; {l, d100} ⊑_Σ d4; {l, d100} ⊑_Σ {d4, d100, d400}.
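The definability relations of Example 3 can be checked by exhaustive enumeration. The sketch below is only an illustration (its running time is exponential in the number of symbols, and the function names are ours): it uses the fact that the formula defines y in terms of X exactly when no two models of the formula agree on X and disagree on y.

```python
from itertools import product

VARS = ["l", "d4", "d25", "d100", "d400"]

def implies(p, q):
    return (not p) or q

def sigma(w):
    """The six formulas of Example 3, evaluated in the world w (a dict symbol -> bool)."""
    return (implies(w["d400"], w["l"])
            and implies(w["d100"] and not w["d400"], not w["l"])
            and implies(w["d4"] and not w["d100"], w["l"])
            and implies(not w["d4"], not w["l"])
            and w["d100"] == (w["d4"] and w["d25"])
            and implies(w["d400"], w["d100"]))

def models(formula, variables):
    """Enumerate the worlds over `variables` that satisfy `formula`."""
    for values in product([False, True], repeat=len(variables)):
        w = dict(zip(variables, values))
        if formula(w):
            yield w

def defines(formula, variables, xs, y):
    """True iff no two models of the formula agree on xs and disagree on y."""
    value_of_y = {}
    for w in models(formula, variables):
        key = tuple(w[x] for x in xs)
        if value_of_y.setdefault(key, w[y]) != w[y]:
            return False
    return True
```

For instance, `defines(sigma, VARS, ["d25", "d100", "d400"], "l")` fails because the two models that falsify d25, d100 and d400 disagree on l (one satisfies d4, the other does not).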

When X ⊑_Σ y holds, one can state equivalently that the functional dependency X → y holds in Σ. This notion of functional dependency is the well-known one from relational database theory, restricted to binary domains (see [20,21]).

Definability satisfies the following easy properties (which we give without proofs):

(1) ⊑_Σ is transitive.
(2) If X′ ⊆ X, then X ⊑_Σ X′. In particular, ⊑_Σ is reflexive.
(3) If X ⊑_Σ Y and X ⊑_Σ Y′, then X ⊑_Σ Y ∪ Y′.
(4) If X ⊑_Σ Y and Σ′ |= Σ, then X ⊑_Σ′ Y.
(5) If X ⊑_Σ Y and X′ ⊑_Σ Y′, then X ∪ X′ ⊑_Σ Y ∪ Y′.

(1), (2) and (3) correspond to the famous Armstrong’s rules of inference (known respectively as the transitivity rule, the inclusion rule and the augmentation rule) [20]. (4) is a monotonicity property; (5) is a derived rule of inference in Armstrong’s system (and is known as the addition rule or the composition rule).

It is also easy to show that if Σ is satisfiable and y ∉ Var(Σ) ∪ X, then X ⋢_Σ y. Similarly, if Σ is valid then X ⊑_Σ Y holds if and only if Y ⊆ X.² Other properties that can be shown when {x} ⊑_Σ Y are reported in Lemma 2.3 from [22].

Now, another notion of definability can be easily defined, relating a set of propositional symbols X to a propositional symbol y given a formula Σ; it requires the existence of an explicit definition of y in Σ using propositional symbols of X only. While the previous form of definability is typically referred to as implicit definability, the latter one is called explicit definability.

Definition 4 (explicit definability; definition of a propositional symbol). Let Σ ∈ PROP_PS, X ⊆ PS and y ∈ PS. Σ explicitly defines y in terms of X if and only if there exists a formula Φ_X ∈ PROP_X s.t. Σ |= Φ_X ⇔ y. In such a case, Φ_X is called a definition of y on X in Σ.

As a corollary of Craig’s interpolation theorem [3] (stated in the more general framework of first-order logic), the equivalence between the implicit form of definability (as given above) and the explicit form can be stated. This result is known as the projective Beth’s theorem in propositional logic. We give a proof of this basic result since it exhibits a first, simple (explicit) definition.

Theorem 5 (propositional projective Beth’s theorem). Let Σ ∈ PROP_PS, X ⊆ PS and y ∈ PS. Σ explicitly defines y in terms of X if and only if X ⊑_Σ y.

Proof. The (⇒) direction is obvious. As to the (⇐) direction, suppose Σ implicitly defines y in terms of X. For each world x⃗ satisfying Σ, let ϕ_x⃗ be the conjunction of all literals over X true in x⃗. Since the truth value of y in a world satisfying Σ depends only on the truth values of the symbols of X, we have that Σ ∧ ϕ_x⃗ |= y or Σ ∧ ϕ_x⃗ |= ¬y. It follows that the disjunction Φ of all ϕ_x⃗’s for x⃗ a world satisfying Σ ⇒ y is an explicit definition of y on X. Similarly, the negation of the disjunction Ψ of all ϕ_x⃗’s for x⃗ a world satisfying Σ ⇒ ¬y is an explicit definition of y on X (indeed, we have Σ |= Φ ⇔ ¬Ψ). □
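The construction used in the proof can be replayed on a toy instance. The sketch below rests on one reading of the proof, which is our assumption: the first definition collects the X-worlds that, together with the formula, entail y, and the second one is the complement of the set of X-worlds that, together with the formula, entail ¬y. The toy formula {a ⇒ y, b ⇒ ¬y, a ∨ b} and all function names are ours.

```python
from itertools import product

def worlds(variables):
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

def entails_literal(sigma, variables, x_world, y, value):
    """True iff every model of sigma extending x_world gives y the truth value `value`."""
    return all(w[y] == value
               for w in worlds(variables)
               if sigma(w) and all(w[v] == b for v, b in x_world.items()))

def definitions_from_proof(sigma, variables, xs, y):
    """Return two candidate definitions, each as the list of X-worlds of a DNF over xs."""
    phi = [x for x in worlds(xs) if entails_literal(sigma, variables, x, y, True)]
    psi = [x for x in worlds(xs) if entails_literal(sigma, variables, x, y, False)]
    not_psi = [x for x in worlds(xs) if x not in psi]
    return phi, not_psi

def is_definition(sigma, variables, x_worlds, xs, y):
    """Check that, in every model of sigma, the DNF given by x_worlds and y agree."""
    return all(({v: w[v] for v in xs} in x_worlds) == w[y]
               for w in worlds(variables) if sigma(w))

VARS = ["a", "b", "y"]

def toy(w):
    # a ⇒ y, b ⇒ ¬y, a ∨ b: this formula implicitly defines y in terms of {a, b}
    return ((not w["a"]) or w["y"]) and ((not w["b"]) or not w["y"]) and (w["a"] or w["b"])

phi, not_psi = definitions_from_proof(toy, VARS, ["a", "b"], "y")
```

On this instance the two definitions are not logically equivalent (the first has three X-worlds, the second only a ∧ ¬b), but they agree on every model of the formula, which illustrates that definitions are unique only up to equivalence under the formula.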

In Lemma 5.1 from [15], one can find representations, based on the prime implicates of Σ, of the two explicit definitions Φ and ¬Ψ given in the proof of Theorem 5. The first one is noted f(X,y) and the second one is noted f(X,y). Clearly enough, such representations are not always the most succinct ones from the spatial efficiency point of view (both of them can be exponential in the size of Σ), and since any formula Σ-equivalent to Φ (resp. ¬Ψ) is an explicit definition of y on X in Σ, there is no specific need to focus on prime implicates representations.

Example 3 (continued). The following explicit definitions hold:

Σ |= (d4 ∧ (¬d100 ∨ d400)) ⇔ l;
Σ |= (d4 ∧ (¬d25 ∨ d400)) ⇔ l;
Σ |= (l ∨ d100) ⇔ d4;
Σ |= (d100 ∧ l) ⇔ d400;
Σ |= ((l ∧ ¬d25) ∨ d100) ⇔ d4.

2 This shows that the system of rules above is complete when Σ is valid, since every definability relation is then an instance of axiom schema (2).


Theorem 5 shows that Σ defines y in terms of X if and only if there exists a definition Φ_X of y in Σ such that X = Var(Φ_X). Now, what about the uniqueness of the definition of y on X in Σ, when Σ defines y in terms of X? As suggested in the proof of Theorem 5, there are several possible definitions of y on X in Σ, which are generally not logically equivalent, but which are nevertheless Σ-equivalent: if Φ and Ψ are both definitions of y in Σ, we have Σ |= Φ ⇔ Ψ, and additionally, Φ ∨ Ψ and Φ ∧ Ψ are also definitions of y in Σ; thus, the set of all definitions of y on X in Σ, quotiented by logical equivalence, is a finite lattice. The least (resp. greatest) element of this lattice is called the strongest (resp. weakest) definition of y in Σ, and is denoted by Def^{X,l}_Σ(y) (resp. Def^{X,u}_Σ(y)). Note that Def^{X,l}_Σ(y) and Def^{X,u}_Σ(y) are defined only when X ⊑_Σ y holds.

Now, the previous notion of definability of a propositional symbol can be easily turned into a more general notion of formula definability. Formally:

Definition 6 (formula definability). Let Σ, Ψ ∈ PROP_PS and X ⊆ PS. Σ defines Ψ in terms of X (noted X ⊑_Σ Ψ) if and only if ∀x⃗ ∈ Ω_X, x⃗ ∧ Σ |= Ψ or x⃗ ∧ Σ |= ¬Ψ.

While formula definability extends propositional symbol definability (since every propositional symbol y can also be viewed as the formula y), it can be recovered from it easily:

Lemma 7. Let Σ, Ψ ∈ PROP_PS and X ⊆ PS. Let z be a (fresh) propositional symbol of PS \ (X ∪ Var(Σ) ∪ Var(Ψ)). X ⊑_Σ Ψ if and only if X ⊑_{Σ∧(Ψ⇔z)} z.

Proof. The proof comes straightforwardly from the following equivalence: for any x⃗, x⃗ ∧ Σ |= Ψ or x⃗ ∧ Σ |= ¬Ψ is equivalent to x⃗ ∧ Σ ∧ (Ψ ⇔ z) |= z or x⃗ ∧ Σ ∧ (Ψ ⇔ z) |= ¬z. □

Thus, there is no gap of generality between propositional symbol definability and formula definability; hence, in the rest of the paper, for the sake of simplicity we restrict to propositional symbol definability without any loss of generality.

3.2. Characterizations of definability

The proof of Theorem 5 gives a first, semantical, expression of a definition of y on X in Σ (when it makes sense, i.e., when X ⊑_Σ y holds), namely, any formula from PROP_X whose set of models is {x⃗ | x⃗ ∧ Σ |= y}. The next results aim at giving more syntactical characterizations, which will provide us with some practical ways of computing definitions.

Before presenting them, we need to recall a few basic notions and results about independence and forgetting (see [2] for more details). Let X be a subset of PS. A formula Σ ∈ PROP_PS is independent of X if and only if there exists a formula Φ s.t. Σ ≡ Φ holds and Var(Φ) ∩ X = ∅. When X = {x}, we say that Σ is independent of x. It can be easily shown [2] that Σ is independent of X if and only if Σ is independent of each propositional symbol of X. The set of propositional symbols on which a formula Σ depends is denoted by DepVar(Σ). For instance, if Σ = a ∧ (b ∨ ¬b) then DepVar(Σ) = {a}.

Let Σ ∈ PROP_PS and X ⊆ PS. The forgetting of X in Σ, denoted ∃X.Σ, is the formula from PROP_PS inductively defined as follows [10]:

• ∃∅.Σ = Σ,
• ∃{x}.Σ = Σ_{x←1} ∨ Σ_{x←0},
• ∃({x} ∪ Y).Σ = ∃Y.(∃{x}.Σ).

For instance, with Σ = (¬a ∨ b) ∧ (a ∨ c), we have ∃{a}.Σ ≡ b ∨ c.

Clearly enough, ∃X.Σ corresponds to a quantified Boolean formula, usually with free variables (∃ is second-order quantification, i.e., it bears on propositional atoms).

It can be shown [2] that ∃X.Σ is the logically strongest consequence of Σ that is independent of X (up to logical equivalence). Thus, if ϕ is independent of X, then Σ |= ϕ if and only if ∃X.Σ |= ϕ. Accordingly, Σ is independent of X if and only if Σ ≡ ∃X.Σ holds.
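The inductive definition of forgetting can be mirrored on a semantic representation of formulas. The following sketch assumes that a formula is given as a Python function from worlds (dictionaries symbol -> bool) to truth values; this representation and the helper names are ours. It also derives the set of symbols a formula depends on from the characterization of independence stated above.

```python
from itertools import product

def forget_one(formula, x):
    """Forgetting of the single symbol x: the formula with x set to true, or with x set to false."""
    return lambda w: bool(formula({**w, x: True}) or formula({**w, x: False}))

def forget(formula, xs):
    """Forgetting of a set of symbols, one symbol at a time."""
    for x in xs:
        formula = forget_one(formula, x)
    return formula

def equivalent(f, g, variables):
    """Truth-table test of logical equivalence over the given symbols."""
    return all(bool(f(w)) == bool(g(w))
               for values in product([False, True], repeat=len(variables))
               for w in [dict(zip(variables, values))])

def dep_var(formula, variables):
    """Symbols x such that the formula is not equivalent to its forgetting of x."""
    return {x for x in variables
            if not equivalent(formula, forget(formula, [x]), variables)}

def sigma(w):
    # (¬a ∨ b) ∧ (a ∨ c)
    return ((not w["a"]) or w["b"]) and (w["a"] or w["c"])
```

Here `equivalent(forget(sigma, ["a"]), lambda w: w["b"] or w["c"], ["a", "b", "c"])` reproduces the example above, at the cost of a truth-table enumeration.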


Now, the projection of a formula Σ on a set of propositional symbols X is the result of forgetting everything in Σ except X:

Proj(Σ, X) = ∃(Var(Σ) \ X).Σ.

Taking advantage of the notion of projection, the following result gives a characterization of the definitions of a propositional symbol y definable in terms of a set of propositional symbols X in a formula Σ.

Theorem 8. Let Σ ∈ PROP_PS and X ⊆ PS. Let Φ_X ∈ PROP_X and y ∈ PS. Φ_X is a definition of y on X in Σ if and only if

Proj(Σ ∧ y, X) |= Φ_X |= ¬Proj(Σ ∧ ¬y, X).

Proof. We have Σ |= Φ_X ⇔ y if and only if Σ |= Φ_X ⇐ y and Σ |= Φ_X ⇒ y if and only if Σ ∧ ¬y |= ¬Φ_X and Σ ∧ y |= Φ_X if and only if ∃(PS \ X).(Σ ∧ ¬y) |= ¬Φ_X and ∃(PS \ X).(Σ ∧ y) |= Φ_X (since Φ_X is independent of PS \ X) if and only if Proj(Σ ∧ y, X) |= Φ_X |= ¬Proj(Σ ∧ ¬y, X). □

As a direct corollary, we obtain the following characterizations of the strongest and weakest definitions of y, as well as a further characterization of definability:

Corollary 9. Let Σ ∈ PROP_PS, X ⊆ PS and y ∈ PS.

• If X ⊑_Σ y then Def^{X,l}_Σ(y) ≡ Proj(Σ ∧ y, X).
• If X ⊑_Σ y then Def^{X,u}_Σ(y) ≡ ¬Proj(Σ ∧ ¬y, X).
• X ⊑_Σ y if and only if Proj(Σ ∧ y, X) |= ¬Proj(Σ ∧ ¬y, X).

Example 3 (continued). Here are the strongest and the weakest definitions (up to logical equivalence) of d4 on {l, d25, d100} in Σ:

• Def_Σ^{{l,d25,d100},l}(d4) ≡ (d25 ∧ d100) ∨ (l ∧ ¬d25 ∧ ¬d100).
• Def_Σ^{{l,d25,d100},u}(d4) ≡ (l ∨ d100).
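The two projections of Corollary 9 can be computed by brute force on this example. The sketch below is a hedged illustration: it assumes that Σ is the set of formulas restated for Example 3 in Section 3.3, encodes formulas as Python Boolean functions (an assumption of the sketch), and represents a formula on X by its set of models over X; the names proj_models and models_over_X are hypothetical.

```python
from itertools import product

VARS = ["l", "d4", "d25", "d100", "d400"]
X = ["l", "d25", "d100"]

def sigma(m):
    """Σ of Example 3, as restated in Section 3.3."""
    return ((not m["d400"] or m["l"])
            and (not (m["d100"] and not m["d400"]) or not m["l"])
            and (not (m["d4"] and not m["d100"]) or m["l"])
            and (m["d4"] or not m["l"])
            and m["d100"] == (m["d4"] and m["d25"]))

def proj_models(formula):
    """Models of Proj(formula, X), as tuples of truth values over X."""
    return {tuple(m[x] for x in X)
            for values in product([False, True], repeat=len(VARS))
            for m in [dict(zip(VARS, values))] if formula(m)}

def models_over_X(formula):
    """Models of a formula built on X only, in the same tuple encoding."""
    return {v for v in product([False, True], repeat=len(X))
            if formula(dict(zip(X, v)))}

all_X = set(product([False, True], repeat=len(X)))
strongest = proj_models(lambda m: sigma(m) and m["d4"])            # Proj(Σ ∧ d4, X)
weakest = all_X - proj_models(lambda m: sigma(m) and not m["d4"])  # ¬Proj(Σ ∧ ¬d4, X)

print(strongest <= weakest)  # third item of Corollary 9: X defines d4
print(strongest == models_over_X(
    lambda m: (m["d25"] and m["d100"]) or (m["l"] and not m["d25"] and not m["d100"])))
print(weakest == models_over_X(lambda m: m["l"] or m["d100"]))
```

Under these assumptions all three tests print True: the strongest definition has three models over X and the weakest has six, and every formula on X lying between them is a definition of d4 by Theorem 8.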

Theorem 8 shows that definability is related to the notions of weakest sufficient condition and strongest necessary condition from [11]. Indeed, let X ⊆ PS and y ∈ PS. A formula Φ of PROP_X is a strongest necessary condition (SNC) of y on X given Σ if Σ |= y ⇒ Φ holds (i.e., Φ is a necessary condition (NC) of y on X given Σ), and for any formula Ψ of PROP_X, if Σ |= y ⇒ Ψ holds, then Σ |= Φ ⇒ Ψ holds. Φ ∈ PROP_X is a weakest sufficient condition (WSC) of y on X given Σ if Σ |= Φ ⇒ y holds (i.e., Φ is a sufficient condition (SC) of y on X given Σ), and for any formula Ψ of PROP_X, if Σ |= Ψ ⇒ y holds, then Σ |= Ψ ⇒ Φ holds. Note that both the strongest necessary and the weakest sufficient conditions of y on X are unique up to Σ-equivalence [11] (but not up to logical equivalence in the general case).

The following theorem shows how SNC and WSC can be characterized using the notion of projection. It extends Theorem 2 from [11] by relaxing the assumption that y ∈ Var(Σ) and y ∉ X, and focuses on the logically strongest (resp. weakest) SNC (resp. WSC) of y on X w.r.t. Σ, up to logical equivalence:

Theorem 10. Let Σ ∈ PROP_PS, X ⊆ PS and y ∈ PS.

• Proj(Σ ∧ y, X) is (up to logical equivalence) the logically strongest SNC of y on X given Σ.
• ¬Proj(Σ ∧ ¬y, X) is (up to logical equivalence) the logically weakest WSC of y on X given Σ.

Proof. We just prove the first point (the second one is similar by duality between SNC and WSC). Let Φ_X be an SNC of y on X given Σ. By definition, we have Σ |= y ⇒ Φ_X. This is equivalent to Σ ∧ y |= Φ_X, and equivalent again to Proj(Σ ∧ y, X) |= Φ_X since Var(Φ_X) ⊆ X. Hence every SNC of y on X given Σ is a logical consequence of Proj(Σ ∧ y, X). It remains to show that Proj(Σ ∧ y, X) is an NC of y on X given Σ, which is easy since by definition of forgetting, Σ ∧ y |= Proj(Σ ∧ y, X) for any X and Proj(Σ ∧ y, X) is independent of every symbol which does not belong to X. □

From this theorem, one can show that Theorem 8 generalizes Proposition 2 from [11] by providing not only a characterization of definability in terms of SNC and WSC, but also a characterization of all the definitions of y on X w.r.t. Σ in terms of SNC and WSC.

Finally, the following lemma shows that, when checking whether X ⊑_Σ Y, every propositional symbol can be forgotten from Σ except the definiens X and the definiendum Y:

Lemma 11. Let Σ ∈ PROP_PS and X, Y ⊆ PS. X ⊑_Σ Y if and only if X ⊑_{Proj(Σ,X∪Y)} Y.

Proof.

(⇒) Let y ∈ Y. We have X ⊑_Σ y if and only if there exists a formula Φ s.t. Var(Φ) ⊆ X and Σ |= (Φ ⇔ y). Clearly enough, (Φ ⇔ y) is independent of every propositional symbol which does not occur in X ∪ {y}. In particular, (Φ ⇔ y) is independent of Var(Σ) \ (X ∪ {y}). Since Proj(Σ, X ∪ {y}) = ∃(Var(Σ) \ (X ∪ {y})).Σ is the logically strongest consequence of Σ that is independent of Var(Σ) \ (X ∪ {y}), we have Σ |= (Φ ⇔ y) if and only if Proj(Σ, X ∪ {y}) |= (Φ ⇔ y). Hence, X ⊑_{Proj(Σ,X∪{y})} {y}. This is true for any y ∈ Y; since Proj(Σ, X ∪ Y) |= Proj(Σ, X ∪ {y}) and definability is monotonic in the formula, we have X ⊑_{Proj(Σ,X∪Y)} Y.

(⇐) As explained in Section 3.1, ⊑_Σ is monotonic in Σ in the sense that, for every X, Y, Σ, Σ′, if X ⊑_Σ Y and Σ′ |= Σ, then X ⊑_{Σ′} Y. The fact that ∃(Var(Σ) \ (X ∪ Y)).Σ is a logical consequence of Σ completes the proof. □

A practical interest of this lemma lies in the fact that Proj(Σ, X ∪ Y) may belong to a fragment which is computationally easier than Σ for the definability issues. For instance, consider Σ = (a ∨ (¬b ∧ c)) ∧ (a ∧ (¬a ∨ d)), X = {b, c} and Y = {d}. While Σ belongs to the NNF fragment for which DEFINABILITY is not tractable (unless P = NP) (see Theorem 22), Proj(Σ, X ∪ Y) = ∃{a}.Σ belongs to the DNNF fragment for which DEFINABILITY is tractable (see Lemma 27).

3.3. Minimal definability

In many AI applications (some of them will be presented in Sections 6 and 7), one is interested in pointing out a set of propositional symbols X in terms of which Σ defines every symbol of a given formula Φ. Indeed, it is enough to assign truth values to the symbols from such a set X to determine the truth value of the propositional symbols of interest. Thus, one is especially interested in the minimal sets X:

Definition 12 (base). Let Σ ∈ PROP_PS and X, Y ⊆ PS. X is a minimal defining family, or for short a base, for Y w.r.t. Σ, if and only if X ⊑_Σ Y holds and there is no proper subset X′ of X such that X′ ⊑_Σ Y. The set of all bases for Y w.r.t. Σ is denoted by BS_Σ(Y).

Example 3 (continued). Σ = {d400 ⇒ l, (d100 ∧ ¬d400) ⇒ ¬l, (d4 ∧ ¬d100) ⇒ l, ¬d4 ⇒ ¬l, d100 ⇔ (d4 ∧ d25)}. {d4, d25} is a base for d100; the two sets {d4, d100, d400} and {d4, d25, d400} are bases for l; Σ defines d4 in terms of {l, d100, d400}, but not minimally, since {l, d100} is a base for d4; the latter also is a base for {d4, d100, d400}.
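Definition 12 can be checked mechanically on single-symbol claims of this example. The following Python sketch assumes the five formulas of Σ listed above, encoded as a Boolean function, and uses the implicit form of definability (models of Σ that agree on X agree on y); the helper names defines and is_base are hypothetical. Minimality is tested by removing one symbol at a time, which is enough because a superset of a defining set is still a defining set.

```python
from itertools import product

VARS = ["l", "d4", "d25", "d100", "d400"]

def sigma(m):
    """Σ of Example 3, as listed in the text above."""
    return ((not m["d400"] or m["l"])
            and (not (m["d100"] and not m["d400"]) or not m["l"])
            and (not (m["d4"] and not m["d100"]) or m["l"])
            and (m["d4"] or not m["l"])
            and m["d100"] == (m["d4"] and m["d25"]))

def defines(X, y):
    """X defines y in Σ: no two models of Σ agree on X and differ on y."""
    seen = {}
    for values in product([False, True], repeat=len(VARS)):
        m = dict(zip(VARS, values))
        if sigma(m):
            key = tuple(m[x] for x in sorted(X))
            if seen.setdefault(key, m[y]) != m[y]:
                return False
    return True

def is_base(X, y):
    """X defines y and no set obtained by dropping one symbol of X does."""
    return defines(X, y) and not any(defines(X - {x}, y) for x in X)

print(is_base({"d4", "d25"}, "d100"))        # True
print(is_base({"d4", "d100", "d400"}, "l"))  # True
print(is_base({"d4", "d25", "d400"}, "l"))   # True
print(is_base({"l", "d100"}, "d4"))          # True
print(defines({"l", "d100", "d400"}, "d4"),
      is_base({"l", "d100", "d400"}, "d4"))  # True False
```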

The following results can be derived easily (we give them without proofs):

(1) ∃Y ∈ BS_Σ(X) such that Y ⊆ X (and, a fortiori, we have BS_Σ(X) ≠ ∅);
(2) BS_Σ is antimonotonic, i.e., ∀X, Y ⊆ PS, if X ⊆ Y then BS_Σ(X) ⪯ BS_Σ(Y), where ⪯ is the partial order defined by S1 ⪯ S2 if and only if ∀A ∈ S2 ∃B ∈ S1 such that B ⊆ A;
(3) BS_Σ(X) = {∅} if and only if for all x ∈ X we have Σ |= x or Σ |= ¬x;
(4) ∀B ∈ BS_Σ(X), we have B ⊆ Var(Σ) ∪ X.

As to defining a set of propositional symbols, not only do we know (from the definition) that X ⊑_Σ Y if and only if ∀y ∈ Y, X ⊑_Σ y, but the following theorem shows that the set of all bases for a set of propositional symbols can be computed from the set of all bases for propositional symbols taken individually by performing pointwise unions and then minimizing the obtained sets.3

Theorem 13. Let Σ ∈ PROP_PS and Y = {y1, ..., yp} ⊆ PS.

BS_Σ(Y) = min({⋃_{i=1}^{p} Bi | Bi ∈ BS_Σ({yi})}, ⊆).

Proof. Let X ⊆ PS; we prove that X ⊑_Σ Y if and only if ∃X1, ..., Xp ⊆ PS s.t. X = X1 ∪ ··· ∪ Xp and Xi ⊑_Σ yi for every i ∈ {1, ..., p}. Then the theorem follows immediately.

(⇒) X ⊑_Σ {y1, ..., yp} means that X ⊑_Σ yi for every i ∈ {1, ..., p}. Therefore, taking Xi = X for every i proves the result.

(⇐) Assume that ∃X1, ..., Xp such that X = X1 ∪ ··· ∪ Xp and Xi ⊑_Σ yi for every i ∈ {1, ..., p}. Since Xi ⊑_Σ yi and Xi ⊆ X, we have X ⊑_Σ yi for every i ∈ {1, ..., p}. Therefore X ⊑_Σ {y1, ..., yp}. □

Consequently, it will be enough to compute sets of bases for single propositional symbols only. Note however that a similar result does not hold for shortest bases (in terms of cardinality), i.e., a shortest base for {x, y} cannot always be written as the union of a shortest base for {x} and a shortest base for {y}.
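Theorem 13 can be exercised on a small instance by comparing a brute-force computation of BS_Σ(Y) with the minimized pointwise unions of the bases of the individual symbols. The sketch below is illustrative only: it assumes the formula Σ of Example 3 as restated in Section 3.3, enumerates all subsets of Var(Σ) (hence is exponential), and the names defines, bases and unionist_product are hypothetical.

```python
from itertools import combinations, product

VARS = ["l", "d4", "d25", "d100", "d400"]

def sigma(m):
    """Σ of Example 3 (assumed as restated in Section 3.3)."""
    return ((not m["d400"] or m["l"])
            and (not (m["d100"] and not m["d400"]) or not m["l"])
            and (not (m["d4"] and not m["d100"]) or m["l"])
            and (m["d4"] or not m["l"])
            and m["d100"] == (m["d4"] and m["d25"]))

MODELS = [m for values in product([False, True], repeat=len(VARS))
          for m in [dict(zip(VARS, values))] if sigma(m)]

def defines(X, y):
    """Models of Σ that agree on X agree on y."""
    seen = {}
    return all(seen.setdefault(tuple(m[x] for x in X), m[y]) == m[y]
               for m in MODELS)

def bases(Y):
    """BS_Σ(Y), by brute force over the subsets of Var(Σ)."""
    defining = [frozenset(c) for k in range(len(VARS) + 1)
                for c in combinations(VARS, k)
                if all(defines(c, y) for y in Y)]
    return {B for B in defining if not any(C < B for C in defining)}

def unionist_product(*families):
    """min({B1 ∪ ... ∪ Bp | Bi taken from the i-th family}, ⊆)."""
    unions = {frozenset().union(*choice) for choice in product(*families)}
    return {B for B in unions if not any(C < B for C in unions)}

Y = ["d4", "d100"]
print(bases(Y) == unionist_product(*(bases([y]) for y in Y)))  # True
```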

Note also that it is not the case in general that Var(Def_Σ^{Var(Σ)∪{y},l}(y)) (or Var(Def_Σ^{Var(Σ)∪{y},u}(y))) belongs to BS_Σ({y}); such sets are defining sets but they are not necessarily minimal w.r.t. ⊆ (just consider Σ = a ⇔ b and y = b as a counter-example). This still fails to hold if we consider only the variables Σ is not independent of (i.e., if we replace Var by DepVar in the previous statement) (the same counter-example works).

Note finally that there is no guarantee in the general case that the number of bases for {y} w.r.t. Σ is polynomial in |Σ|; for instance, for the following formula Σ (equivalent to a Horn CNF formula), {y} has 2^n + 1 bases:

Σ = ((⋀_{i=1}^{n} xi) ⇔ y) ∧ ⋀_{i=1}^{n} (xi ⇔ x′i).
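The count can be confirmed by brute force for small n. The following Python sketch is a hedged illustration: the function name count_bases and the encoding of formulas as Boolean functions are assumptions of the sketch. The bases it finds are {y} itself together with the 2^n sets that pick exactly one of xi, x′i for each i.

```python
from itertools import combinations, product

def count_bases(n):
    """Number of bases for y w.r.t. Σ = ((x1 ∧ ... ∧ xn) ⇔ y) ∧ ⋀ (xi ⇔ xi')."""
    xs = [f"x{i}" for i in range(1, n + 1)]
    ps = [f"x{i}'" for i in range(1, n + 1)]
    variables = xs + ps + ["y"]

    def sigma(m):
        return (all(m[x] for x in xs) == m["y"]
                and all(m[x] == m[p] for x, p in zip(xs, ps)))

    models = [m for values in product([False, True], repeat=len(variables))
              for m in [dict(zip(variables, values))] if sigma(m)]

    def defines(X):
        # Models of Σ that agree on X must agree on y.
        seen = {}
        return all(seen.setdefault(tuple(m[x] for x in X), m["y"]) == m["y"]
                   for m in models)

    defining = [frozenset(c) for k in range(len(variables) + 1)
                for c in combinations(variables, k) if defines(c)]
    return sum(1 for B in defining if not any(C < B for C in defining))

print([count_bases(n) for n in (1, 2, 3)])  # [3, 5, 9]
```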

3.4. Undefinable propositional symbols

Because X ⊑_Σ X trivially holds, such instances of the definability relation are typically of little interest. In the theory of relational databases, functional dependencies of the form X → X are said to be trivial. In the following, a propositional symbol for which every definition in Σ is trivial in this way is said to be undefinable.

Definition 14 (undefinable propositional symbols). Let Σ ∈ PROP_PS and y ∈ PS. y is undefinable in Σ if and only if Var(Σ) \ {y} ⋢_Σ y. Otherwise, y is said to be definable in Σ.

We have the following easy connection between undefinable symbols and bases:

Lemma 15. Let Σ ∈ PROP_PS and y ∈ PS. y is undefinable in Σ if and only if BS_Σ({y}) = {{y}}.

Proof.

(1 ⇒ 2) If y is undefinable in Σ, then Var(Σ) \ {y} ⋢_Σ y. As a consequence, ∅ ⋢_Σ y. Hence, {y} is a base for y w.r.t. Σ. Now, let B ∈ BS_Σ({y}). We have B ⊆ Var(Σ) ∪ {y}. If y ∉ B, then B ⊆ Var(Σ) \ {y}, but this contradicts the fact that Var(Σ) \ {y} ⋢_Σ y. Hence, y ∈ B, and therefore BS_Σ({y}) = {{y}}.

3 The operator ∗ such that BS_Σ({x, y}) = BS_Σ({x}) ∗ BS_Σ({y}) is sometimes called “unionist product” [23]; it is commutative, associative and idempotent, and as a consequence, it makes sense to write BS_Σ(X) = ∗_{x∈X} BS_Σ({x}) = min({⋃_{x∈X} Bx | Bx ∈ BS_Σ({x})}, ⊆).

(2 ⇒ 1) If BS_Σ({y}) = {{y}}, then for every X ⊆ PS, we have X ⊑_Σ y if and only if y ∈ X, which concludes the proof. □

3.5. Necessary and relevant propositional symbols

Given a formula Σ and a set Y of propositional symbols, all propositional symbols in Var(Σ) can be classified according to their usefulness for defining Y. The most (resp. the least) important ones are the propositional symbols which are necessary (resp. irrelevant) for defining Y, defined as those symbols which belong to all bases for Y (resp. to none of the bases for Y). Computing necessary propositional symbols in a preliminary step can also prove valuable for improving the computation of the set of all bases for Y w.r.t. Σ.

Definition 16 (necessary and relevant propositional symbols). Let Σ ∈ PROP_PS, Y ⊆ PS and x ∈ PS.

• x is a necessary propositional symbol for Y w.r.t. Σ if and only if x belongs to all bases for Y w.r.t. Σ.
• x is a relevant propositional symbol for Y w.r.t. Σ if and only if x belongs to at least one base for Y w.r.t. Σ (otherwise, x is an irrelevant symbol for Y w.r.t. Σ).

Since both Y and Σ are finite, the set of all bases for Y w.r.t. Σ is never empty (Y ⊑_Σ Y always holds). As a consequence, any necessary propositional symbol for Y is a relevant propositional symbol for Y. Moreover, it is obvious that any propositional symbol x is relevant to itself whenever Σ ⊭ x and Σ ⊭ ¬x. The following results are simple characterizations of necessary and relevant propositional symbols:

Lemma 17. Let Σ ∈ PROP_PS, Y ⊆ PS and x ∈ PS.

(1) x is necessary for Y w.r.t. Σ if and only if x ∈ Y and x is undefinable in Σ.
(2) x is relevant for Y w.r.t. Σ if and only if it is relevant for some y ∈ Y w.r.t. Σ.
(3) x is necessary for Y w.r.t. Σ if and only if it is necessary for some y ∈ Y w.r.t. Σ.

Proof.

(1, ⇒) Assume that x is necessary for Y w.r.t. Σ. Since Y ⊑_Σ Y, there exists a B ∈ BS_Σ(Y) such that B ⊆ Y. Therefore, since x ∈ B, we have x ∈ Y. Now, suppose that x is definable in Σ, which means that there exists Z ⊆ Var(Σ) such that x ∉ Z and Z ⊑_Σ x. Let B ∈ BS_Σ(Y) and B′ = (B \ {x}) ∪ Z. From what precedes, we have B′ ⊑_Σ Y, therefore there is a B′′ ∈ BS_Σ(Y) such that B′′ ⊆ B′, and since x does not belong to B′′, it cannot be necessary for Y w.r.t. Σ.

(1, ⇐) Assume that x ∈ Y and x is undefinable in Σ. x being undefinable in Σ is equivalent to BS_Σ({x}) = {{x}}, therefore, as a consequence of Theorem 13 and the fact that x ∈ Y, any B ∈ BS_Σ(Y) contains x, which means that x is necessary for Y w.r.t. Σ.

(2, ⇒) If x is relevant for Y w.r.t. Σ then there is a B ∈ BS_Σ(Y) containing x, and by Theorem 13, there is a y ∈ Y and a B′ ∈ BS_Σ({y}) such that x ∈ B′; hence x is relevant for y w.r.t. Σ.

(2, ⇐) Immediate consequence of Theorem 13.

(3) Comes easily from point (1): x is necessary for Y = {y1, ..., yp} w.r.t. Σ if and only if x ∈ Y and x is undefinable in Σ, if and only if ∃i ∈ {1, ..., p} (x = yi and x is undefinable in Σ), if and only if ∃i ∈ {1, ..., p}, x is necessary for yi w.r.t. Σ. □

Point (1) expresses that the propositional symbols necessary for Var(Σ) (hence the “key propositional symbols”, by analogy with databases) are all those that cannot be defined otherwise. Point (2) expresses that it is enough to consider the relation “being relevant for” between propositional symbols instead of sets of propositional symbols. Point (3) expresses the same result for the relation “being necessary for”.

As a direct corollary, we obtain the following easy connection between necessary symbols and undefinable ones:

Corollary 18. Let Σ ∈ PROP_PS and y ∈ PS. y is undefinable in Σ if and only if y is necessary for {y} w.r.t. Σ.

Example 3 (continued).

• BS_Σ(Var(Σ)) = {{d4, d25, d400}, {l, d25, d100}, {l, d4, d25}}; therefore, only d25 is necessary for Var(Σ) w.r.t. Σ; furthermore, BS_Σ({d25}) = {{d25}} and d25 is undefinable in Σ.
• BS_Σ({l, d4, d100, d400}) = {{d4, d100, d400}, {d4, d25, d400}, {l, d100, d400}, {l, d4, d25}, {l, d4, d100}}; therefore, no propositional symbol is necessary for {l, d4, d100, d400} w.r.t. Σ, and all propositional symbols of Var(Σ) are relevant for {l, d4, d100, d400} w.r.t. Σ.

Note that the relation “being relevant for” between single propositional symbols is not symmetric.4 For instance, let Σ = (c ⇔ (a ∨ b)). a is relevant for c, but c is not relevant for a because c does not belong to any base for a. All we can say is that c is useful for determining the truth value of a in some specific situations (here, when b is false). This calls for another notion than definability, namely conditional independence [1].

3.6. Unambiguous definability

In the beginning of this section we wrote that definability imposes that some propositional symbols are fixed whenever some other propositional symbols are fixed as well, or in other terms, that the value of y is a function of the values of the variables in X. Formally, this is not entirely true, as we can see on the following example: let Σ = (a ⇒ b) ∧ ((a ⇔ b) ⇔ c), X = {a, b}, and y = c. Clearly, X ⊑_Σ y. Is the value of c unambiguously defined from the values of a and b? No, because of the situation x⃗ where a is true and b false. This situation being inconsistent with Σ, it trivially holds that Σ ∧ x⃗ |= y and Σ ∧ x⃗ |= ¬y, thus in this situation the value of y is not unambiguously defined, and we cannot formally say that the value of y is a function of the values of a and b. However, in practice, this makes little difference provided that Σ is interpreted as a hard constraint (that is, any countermodel of Σ is an impossible world that does not need to be considered): in this case, we can safely neglect those x⃗-worlds that are inconsistent with Σ, and say that in every possible situation, the value of y is a function of the values of a and b. Still, in some contexts (especially reasoning about action and change, see Section 6), it is important to know whether such inconsistent X-assignments exist or not.

Definition 19. Let Σ ∈ PROP_PS and X ⊆ PS. We say that Σ is strongly X-consistent if and only if for every x⃗ ∈ Ω_X, x⃗ ∧ Σ is consistent. We say that Σ unambiguously defines Y in terms of X if and only if Σ is strongly X-consistent and X ⊑_Σ Y.
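The example that opens this subsection can be checked directly against Definition 19. The Python sketch below is illustrative: formulas are encoded as Boolean functions and the helper names defines, strongly_consistent and is_definition are hypothetical. It also exhibits two non-equivalent formulas on X that are both definitions of c in Σ; they differ only on the X-world where a is true and b is false.

```python
from itertools import product

VARS = ["a", "b", "c"]
X = ["a", "b"]

def sigma(m):
    """Σ = (a ⇒ b) ∧ ((a ⇔ b) ⇔ c), the example of this subsection."""
    return (not m["a"] or m["b"]) and ((m["a"] == m["b"]) == m["c"])

MODELS = [m for values in product([False, True], repeat=len(VARS))
          for m in [dict(zip(VARS, values))] if sigma(m)]

def defines(y):
    """X defines y in Σ: models of Σ that agree on X agree on y."""
    seen = {}
    return all(seen.setdefault(tuple(m[x] for x in X), m[y]) == m[y]
               for m in MODELS)

def strongly_consistent():
    """Every X-world is consistent with Σ."""
    covered = {tuple(m[x] for x in X) for m in MODELS}
    return covered == set(product([False, True], repeat=len(X)))

def is_definition(phi, y):
    """Σ |= phi ⇔ y, for a formula phi built on X."""
    return all(phi(m) == m[y] for m in MODELS)

print(defines("c"), strongly_consistent())                   # True False
print(is_definition(lambda m: m["a"] == m["b"], "c"),        # a ⇔ b
      is_definition(lambda m: m["a"] or not m["b"], "c"))    # a ∨ ¬b
# prints: True True
```

So Σ defines c in terms of {a, b}, but not unambiguously in the sense of Definition 19.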

Requiring Σ to be strongly X-consistent has a strong impact on the characterization of explicit definitions. Indeed, the strong X-consistency of Σ is a necessary and sufficient condition for the uniqueness (up to logical equivalence) of explicit definitions on X in Σ:

Theorem 20. Let Σ ∈ PROP_PS, X ⊆ PS and y ∈ PS such that X ⊑_Σ y. Then Σ is strongly X-consistent if and only if for any two definitions ϕ, ψ of y on X in Σ, we have ϕ ≡ ψ.

Proof.

(⇒) Assume there exist two non-equivalent formulas ϕ and ψ of PROP_X such that (a) Σ |= y ⇔ ϕ and (b) Σ |= y ⇔ ψ. (a) and (b) imply (c) Σ |= ϕ ⇔ ψ. Since ϕ and ψ are not logically equivalent, there exists an x⃗ ∈ Ω_X such that x⃗ |= ¬(ϕ ⇔ ψ), which, together with (c), implies that x⃗ ∧ Σ is inconsistent, therefore Σ is not strongly X-consistent.

(⇐) Assume Σ is not strongly X-consistent. Then let x⃗ ∈ Ω_X be such that x⃗ ∧ Σ is inconsistent. Let ϕ be a definition of y on X in Σ. If x⃗ |= ϕ (respectively, x⃗ |= ¬ϕ), then let ψ be the formula of PROP_X, unique up to logical equivalence, whose models are exactly the models of ϕ except x⃗ (respectively, the models of ϕ plus x⃗). ψ is also a definition of y on X in Σ, and ψ is not logically equivalent to ϕ. □

4 The relation “being necessary for” between single propositional symbols is of no interest since y ≠ x is never necessary for x, and x is necessary for x if and only if x is undefinable.

4. Computational aspects

4.1. Definability

The following result is the restriction to propositional logic of a property which holds in first-order logic and is due to Padoa [24]. It consists of an entailment-based characterization of (implicit) definability and is useful for identifying tractable restrictions of definability in the propositional case. We give a simple proof which holds for propositional logic: for any Σ and any X ⊆ PS, let rename(Σ, X) be the formula obtained by replacing in Σ in a uniform way every propositional symbol z from Var(Σ) \ X by a new propositional symbol z′. We have:

Theorem 21 (Padoa’s method). (See [24].) If y ∉ X, then X ⊑_Σ y if and only if (Σ ∧ rename(Σ, X)) |= y ⇒ y′.

Proof. From Theorem 8, we get that X ⊑_Σ y if and only if Proj(Σ ∧ y, X) |= ¬Proj(Σ ∧ ¬y, X). Equivalently, X ⊑_Σ y if and only if ∃(PS \ X).(Σ ∧ y) ∧ ∃(PS \ X).(Σ ∧ ¬y) is unsatisfiable. Since quantified variables are dummy ones, when y ∉ X, ∃(PS \ X).(Σ ∧ y) ∧ ∃(PS \ X).(Σ ∧ ¬y) is equivalent to ∃(PS \ X).(Σ ∧ y) ∧ ∃(PS′ \ X′).(rename(Σ, X) ∧ ¬y′) where for any subset Z of PS we have Z′ = {x′ | x ∈ Z}. This quantified Boolean formula is also equivalent to the following prenex one: ∃((PS \ X) ∪ (PS′ \ X′)).(Σ ∧ y ∧ rename(Σ, X) ∧ ¬y′), which is unsatisfiable if and only if Σ ∧ y ∧ rename(Σ, X) ∧ ¬y′ is unsatisfiable, if and only if (Σ ∧ rename(Σ, X)) |= y ⇒ y′. □

Accordingly, whenever y does not belong to X, checking definability comes down to a standard deduction check. Since X ⊑_Σ y trivially holds in the remaining case (i.e., y ∈ X), we can conclude that a set-membership test plus a deduction check are always sufficient to decide definability.
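The membership test followed by the deduction check can be sketched as follows. This is a minimal illustration under stated assumptions: formulas are Boolean functions over assignments, the entailment is decided by enumerating all assignments instead of calling a SAT solver, and the name padoa_defines is hypothetical. The primed copy of a symbol v is represented by the key v + "'".

```python
from itertools import product

def padoa_defines(sigma, variables, X, y):
    """Decide whether X defines y in Σ: membership test, then Padoa's entailment."""
    if y in X:
        return True
    renamed = [v for v in variables if v not in X]   # symbols that get a primed copy
    joint = list(variables) + [v + "'" for v in renamed]
    for values in product([False, True], repeat=len(joint)):
        m = dict(zip(joint, values))
        # Assignment seen by rename(Σ, X): shared on X, primed copies elsewhere.
        m_renamed = {v: m[v] if v in X else m[v + "'"] for v in variables}
        if sigma(m) and sigma(m_renamed) and m[y] and not m_renamed[y]:
            return False                             # countermodel of y ⇒ y′
    return True

# Σ = (c ⇔ (a ∨ b)), the formula used in Section 3.5.
sigma = lambda m: m["c"] == (m["a"] or m["b"])
print(padoa_defines(sigma, ["a", "b", "c"], {"a", "b"}, "c"))  # True
print(padoa_defines(sigma, ["a", "b", "c"], {"b", "c"}, "a"))  # False
```

In practice the loop would be replaced by a single unsatisfiability call on Σ ∧ rename(Σ, X) ∧ y ∧ ¬y′, which is what makes the characterization useful for tractable fragments.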

We now give the complexity of definability in the general case, as well as in some restricted cases:

Theorem 22. DEFINABILITY is coNP-complete, even under the restriction that Σ is a Blake formula.

Proof.

• Membership: Membership of DEFINABILITY in coNP comes directly from Theorem 21, which gives a polynomial reduction from DEFINABILITY to UNSAT; UNSAT is in coNP, and coNP is well known to be closed under such reductions.

• Hardness: As to hardness, let us exhibit a polynomial reduction from CNF-UNSAT to the restriction of DEFINABILITY to the Blake fragment: let ϕ = ⋀_{i=1}^{m} γi be a CNF formula from PROP_PS such that Var(ϕ) = {x1, ..., xn}; w.l.o.g., we assume that ϕ does not contain any clause implied by another clause (if it is not the case, we first remove every properly implied clause from it; this can be easily achieved in polynomial time). To ϕ we associate in polynomial time the formula Σ = ⋀_{i=1}^{m} ((γi ∨ new ∨ y) ∧ (γi ∨ ¬new ∨ ¬y)) where new is a fresh variable from PS \ (Var(ϕ) ∪ {y}). We take advantage of the following property, which results directly from the correctness of resolution-based prime implicates algorithms (like Tison’s one [18]): a CNF formula Σ contains all its prime implicates if and only if whenever two clauses from it have a resolvent δ, there exists a clause ε ∈ Σ s.t. ε |= δ. By construction, every binary resolvent from clauses of Σ is tautologous, hence implied by any clause of Σ. As a consequence, Σ contains all its prime implicates, and since it does not contain properly implied clauses, it is a prime implicates formula. Now, from Theorem 8, we have that X ⊑_Σ y if and only if Proj(Σ ∧ y, X) |= ¬Proj(Σ ∧ ¬y, X) if and only if ∃X̄.(Σ ∧ y) ∧ ∃X̄.(Σ ∧ ¬y) is unsatisfiable. With X = Var(ϕ) ∪ {new}, we have that X ⊑_Σ y if and only if

∃X̄.(⋀_{i=1}^{m} ((γi ∨ new ∨ y) ∧ (γi ∨ ¬new ∨ ¬y)) ∧ y) ∧ ∃X̄.(⋀_{i=1}^{m} ((γi ∨ new ∨ y) ∧ (γi ∨ ¬new ∨ ¬y)) ∧ ¬y)

is unsatisfiable. The latter formula is equivalent to ∃X̄.(⋀_{i=1}^{m} (γi ∨ ¬new) ∧ y) ∧ ∃X̄.(⋀_{i=1}^{m} (γi ∨ new) ∧ ¬y), which is itself equivalent to ⋀_{i=1}^{m} (γi ∨ ¬new) ∧ ⋀_{i=1}^{m} (γi ∨ new) since X̄ = {y} and y ∉ Var(ϕ) ∪ {new}. But this formula is also equivalent to ϕ (it is enough to compute all its resolvents over new and remove the implied clauses to get ϕ). Hence ϕ is unsatisfiable if and only if X ⊑_Σ y and this completes the proof. □
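The reduction used in the hardness part can be tried on tiny CNF formulas. The sketch below is only a sanity check of the equivalence "ϕ is unsatisfiable if and only if X defines y in Σ", not of the Blake-fragment claim. It assumes a DIMACS-like encoding (a clause is a list of non-zero integers, a negative integer being a negated symbol), which is a convention of the sketch, and the function names are hypothetical.

```python
from itertools import product

def holds(clause, m):
    """A literal v is true in m iff m[|v|] == (v > 0); a clause is a disjunction."""
    return any(m[abs(lit)] == (lit > 0) for lit in clause)

def reduce_to_definability(cnf, n):
    """Map a CNF ϕ over symbols 1..n to (Σ, X, y) as in the hardness proof."""
    new, y = n + 1, n + 2
    sigma = [c + [new, y] for c in cnf] + [c + [-new, -y] for c in cnf]
    return sigma, list(range(1, n + 1)) + [new], y

def defines(cnf, n_vars, X, y):
    """Brute-force definability test for a CNF Σ over symbols 1..n_vars."""
    seen = {}
    for values in product([False, True], repeat=n_vars):
        m = dict(zip(range(1, n_vars + 1), values))
        if all(holds(c, m) for c in cnf):
            if seen.setdefault(tuple(m[x] for x in X), m[y]) != m[y]:
                return False
    return True

def satisfiable(cnf, n):
    """Brute-force satisfiability of a CNF over symbols 1..n."""
    return any(all(holds(c, dict(zip(range(1, n + 1), values))) for c in cnf)
               for values in product([False, True], repeat=n))

for cnf, n in [([[1], [-1]], 1), ([[1, 2], [-1]], 2)]:
    sigma, X, y = reduce_to_definability(cnf, n)
    print(satisfiable(cnf, n), defines(sigma, n + 2, X, y))
# prints: False True, then True False
```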

This theorem generalizes Theorem 2.2 from [22]: we relax here the (useless) assumption that Σ is a CNF formula for proving the membership to coNP and constrain Σ to belong to the Blake fragment for the hardness part.

Interestingly, it shows that constraining Σ to belong to a propositional fragment that is tractable for SAT (as it is the case for IP) does not necessarily lead to a tractable restriction of DEFINABILITY.

An easier proof of coNP-completeness when Σ is not constrained to be a Blake formula can be obtained by observing that if ϕ, ψ ∈ PROP_PS, and y ∈ PS \ X, with X = Var(ϕ) ∪ Var(ψ), then ϕ |= ψ holds if and only if Σ = (y ⇒ ϕ) ∧ (ψ ⇒ y) defines y in terms of X (by Theorem 8, since Proj(Σ ∧ y, X) ≡ ϕ and Proj(Σ ∧ ¬y, X) ≡ ¬ψ). By the way, this also shows the existence of theories Σ for which deciding the definability of a symbol y is computationally hard, but when one knows that the symbol is definable in Σ, computing an explicit definition of it is easy (ϕ is such a definition).

We also identified the complexity of the minimal definability problem:

Theorem 23. Let Σ ∈ PROP_PS, X ⊆ PS and y ∈ PS. Checking whether X is a minimal defining family for y w.r.t. Σ (MINIMAL DEFINING FAMILY) is BH2-complete.

Proof.

• Membership: X is a minimal defining family for y w.r.t. Σ if and only if X ⊑_Σ y and ∀X′ ⊂ X, X′ ⋢_Σ y. Now, ∀X′ ⊂ X, X′ ⋢_Σ y holds if and only if ∀x ∈ X, X \ {x} ⋢_Σ y. Thus MINIMAL DEFINING FAMILY is the intersection of a language in coNP and a language in NP (since the intersection of a linear number of languages in NP is in NP), which proves membership to BH2.

• Hardness: let ϕ and ψ be two propositional formulas; we associate to them in polynomial time the tuple L(〈ϕ, ψ〉) = 〈Σ, X, y〉 where
  – Σ = ((¬ψ ∧ x) ⇒ y) ∧ ((¬ψ ∧ ¬x) ⇒ ¬y) ∧ ((¬ϕ) ⇒ y);
  – X = {x};
  – x and y are new propositional symbols, not appearing in ϕ or ψ.
It is easy to check that {x} ⊑_Σ y if and only if ψ is unsatisfiable or ϕ is unsatisfiable. Now, ∅ ⊑_Σ y if and only if ϕ is unsatisfiable. This means that {x} is a minimal defining family for y w.r.t. Σ if and only if ψ is unsatisfiable and ϕ is satisfiable, i.e., if and only if 〈ϕ, ψ〉 is an instance of SAT-UNSAT. Thus L is a polynomial (Karp) reduction from SAT-UNSAT to MINIMAL DEFINING FAMILY. □

When Σ is such that deciding whether X ⊑_Σ y holds for any X ⊆ PS and y ∈ PS is tractable, deciding whether X is a minimal defining family for y w.r.t. Σ for any X ⊆ PS and y ∈ PS is tractable as well (since X is a minimal defining family for y w.r.t. Σ if and only if X ⊑_Σ y and ∀x ∈ X, X \ {x} ⋢_Σ y).

On the other hand, as Theorem 23 suggests, when X ⊑_Σ y is known to hold, deciding whether X is a minimal defining family for y w.r.t. Σ remains computationally hard (unless P = NP):

Theorem 24. Let Σ ∈ PROP_PS, X ⊆ PS and y ∈ PS such that X ⊑_Σ y. Checking whether X is a minimal defining family for y w.r.t. Σ is NP-complete.

Proof.

• Membership: Membership consists in checking that ∀x ∈ X, X \ {x} ⋢_Σ y, which requires solving Card(X) (independent) instances of the complement of DEFINABILITY. Since DEFINABILITY is in coNP, its complement is in NP, and this is also the case of the problem under consideration.

• Hardness: By reduction from SAT. Let ϕ ∈ PROP_PS such that Var(ϕ) = {x1, ..., xn} is a non-empty set. To ϕ we associate in polynomial time Σ = (ϕ ∧ ⋀_{i=1}^{n} (xi ⇔ x′i)) ⇔ y (where x′1, ..., x′n are fresh atoms from PS \ {x1, ..., xn, y}) and X = {x1, ..., xn, x′1, ..., x′n}. By construction ϕ ∧ ⋀_{i=1}^{n} (xi ⇔ x′i) is a definition of y on X in Σ, hence X ⊑_Σ y. Now, ϕ is satisfiable if and only if X is a minimal defining family for y w.r.t. Σ. Indeed, if ϕ is satisfiable then ϕ ∧ ⋀_{i=1}^{n} (xi ⇔ x′i) depends on all the variables in X (i.e., there does not exist a formula ψ such that ψ ≡ ϕ ∧ ⋀_{i=1}^{n} (xi ⇔ x′i) and Var(ψ) ⊂ X). Therefore, all variables in X are necessary for defining y. This means that there does not exist a definition of y on a proper subset of X in Σ, hence X is a minimal defining family for y w.r.t. Σ. If ϕ is unsatisfiable, then Σ ≡ ¬y and ∅ ⊑_Σ y, hence X is not a minimal defining family for y w.r.t. Σ. □

Since the transformation from formula definability to propositional symbol definability given by Lemma 7 can be achieved in polynomial time and since propositional symbol definability is a restriction of formula definability, these complexity results apply as well to formula definability.

Now, some tractable restrictions for DEFINABILITY (hence for MINIMAL DEFINING FAMILY) can be easily derived from Theorem 21. We first need to make precise the conditions on which such restrictions are based:

Definition 25 (stability conditions). Let C be a propositional fragment, i.e., a subset of PROP_PS.

• C is stable by expansion for partial renaming if and only if for every Σ ∈ C and for every X ⊆ PS, we have Σ ∧ rename(Σ, X) ∈ C.
• C is stable by conditioning if and only if for every Σ ∈ C and every satisfiable conjunction of literals γ, the conditioning Σ_γ of Σ by γ also belongs to C.

Theorem 26. Let C be a propositional fragment satisfying the stability conditions listed in Definition 25. C is tractable for SAT if and only if the restriction of DEFINABILITY when Σ belongs to C is tractable.

Proof. Let us first show that if C is tractable for SAT then the restriction of DEFINABILITY is tractable. The key is Theorem 21; there are two cases: if y ∈ X (which can obviously be decided in polynomial time), then any Σ defines y in terms of X; otherwise, Theorem 21 shows that X ⊑_Σ y if and only if Σ ∧ rename(Σ, X) |= y ⇒ y′. This is equivalent to determining whether (Σ ∧ rename(Σ, X))_γ is inconsistent where γ is y ∧ ¬y′. By construction, such a formula (Σ ∧ rename(Σ, X))_γ belongs to C whenever Σ belongs to C, because C is stable by conditioning and expansion by partial renaming; hence its satisfiability can be decided in polynomial time.

Conversely, if the restriction of DEFINABILITY when Σ belongs to C is tractable then deciding whether ∅ ⊑_Σ new (with new ∈ PS \ Var(Σ)) can be achieved in polynomial time. But ∅ ⊑_Σ new if and only if Σ is unsatisfiable. Hence the satisfiability of Σ can be decided in polynomial time. □

Note that stability by expansion with partial renaming is strictly less demanding than stability by (bounded) conjunction; for instance, the class of renamable Horn CNF formulas is stable by expansion for partial renaming, but it is not stable by bounded conjunction.

Interestingly, some quite general propositional fragments satisfy the stability conditions given in Definition 25. This is the case for the class of q-Horn formulas (which includes Krom CNF formulas, Horn CNF formulas and renamable Horn CNF formulas as specific cases) [6] and the class of Decomposable Negation Normal Form (DNNF) formulas (which includes several other important fragments, namely the DNF formulas and the Ordered Binary Decision Diagrams, OBDD_<) [7,9].

Lemma 27. The restrictions of DEFINABILITY for which Σ is a q-Horn CNF formula or a DNNF formula are in P.

Proof. It is known that the class of q-Horn CNF formulas is tractable for SAT [6], and it is obvious that it is stable by conditioning; now, stability by expansion with partial renaming comes from the fact that if V is a q-Horn renaming for Σ, then the set of symbols V ∪ {rename(x, ∅) | x ∈ V \ X} is a q-Horn renaming for Σ ∧ rename(Σ, X). Finally, as to the DNNF class, the result comes immediately from Propositions 4.1 and 5.1 from [25]. □

Lemma 27 generalizes Theorem 3.1 and Corollary 3.2 from [22], which concern Horn CNF formulas, as well as Theorem 7.1 and Corollary 7.2 from [15], which concern q-Horn CNF formulas. It does not generalize Theorem 3.5 and Corollary 3.6 from [22] (resp. Theorem 7.3 and Corollary 7.4 from [15]), showing the tractability of the restrictions of DEFINABILITY when Σ is equivalent to a Horn CNF formula (resp. a q-Horn CNF formula) but is given by its (disjunctively interpreted) Horn (resp. q-Horn) envelope.

Note that Theorem 21 can prove helpful for deciding in polynomial time whether X ⊑_Σ y under restrictions on Σ that are outside the scope of Lemma 27. For instance, if Σ = (ϕ ⇒ y) ∧ (y ⇒ ψ) where ϕ, ψ are Horn CNF formulas such that y ∉ Var(ϕ) ∪ Var(ψ), then, taking X = Var(ϕ) ∪ Var(ψ), X ⊑_Σ y can be decided in polynomial time since it amounts to determining whether ψ |= ϕ holds (y is left free exactly in the X-worlds satisfying ψ ∧ ¬ϕ). However, Σ is neither a q-Horn CNF formula, nor a DNNF one.

It is interesting to observe that the stability conditions given in Definition 25 are not satisfied by some propositional fragments that are tractable for SAT; for instance the Blake fragment (formulas in prime implicates normal form) does not satisfy any of them.

Now, what about the complexity of unambiguous definability? Checking that Σ is strongly X-consistent being significantly harder than checking definability, this carries over to unambiguous definability:

Theorem 28. Let Σ ∈ PROP_PS, X ⊆ PS and y ∈ PS.

• Deciding whether Σ is strongly X-consistent is Π^p_2-complete.
• Deciding whether Σ unambiguously defines y in terms of X is Π^p_2-complete.

Proof. Membership is easy in both cases. For deciding whether Σ is strongly X-consistent, hardness comes from this trivial reduction from QBF_{2,∀}: ∀A∃B.Σ is a valid instance of QBF_{2,∀} if and only if Σ is strongly A-consistent. As for deciding whether Σ unambiguously defines y in terms of X, it suffices to remark that Σ unambiguously defines X in terms of X if and only if Σ is strongly X-consistent. □

Finally, knowing that Σ is strongly X-consistent does not change the complexity of definability:

Theorem 29. Let Σ ∈ PROP_PS, X ⊆ PS and y ∈ PS. Given that Σ is strongly X-consistent, deciding whether X ⊑_Σ y is coNP-complete.

Proof. Membership is obvious. Hardness comes from the following reduction from UNSAT: let ϕ be a propositional formula and z a fresh variable, not appearing in ϕ; then ϕ ∈ UNSAT if and only if ϕ ∨ z |= z, that is, if and only if ∅ ⊑_{ϕ∨z} z, and clearly, ϕ ∨ z is strongly ∅-consistent, because ϕ ∨ z is consistent. □

4.2. Undefinability, necessity and relevance

From Theorem 21, we can easily derive the following characterization of undefinable propositional symbols, which is surprisingly simple:

Lemma 30. Let Σ ∈ PROP_PS and y ∈ PS. y is undefinable in Σ if and only if Σ_{y←0} ∧ Σ_{y←1} is satisfiable.

Proof. By definition, we have that y is undefinable in Σ if and only if Var(Σ) \ {y} ⋢_Σ y. Since y ∉ Var(Σ) \ {y}, from Theorem 21, we get that y is undefinable in Σ if and only if (Σ ∧ rename(Σ, Var(Σ) \ {y})) ⊭ (y ∨ ¬y′) (i.e., y′ ⇒ y; the roles of y and y′ are symmetric in Σ ∧ rename(Σ, Var(Σ) \ {y})). This is equivalent to stating that (Σ ∧ rename(Σ, Var(Σ) \ {y})) ∧ ¬y ∧ y′ is satisfiable. This is again equivalent to stating that the conditioning of Σ ∧ rename(Σ, Var(Σ) \ {y}) by the satisfiable conjunction of literals ¬y ∧ y′ is satisfiable. Now since y′ (resp. y) does not occur in Σ (resp. rename(Σ, Var(Σ) \ {y})), this conditioning is equivalent to Σ_{¬y} ∧ rename(Σ, Var(Σ) \ {y})_{y′}. Since rename(Σ, Var(Σ) \ {y})_{y′} is equivalent to Σ_y (since y is the unique symbol that has been renamed), we obtain that y is undefinable in Σ if and only if Σ_{¬y} ∧ Σ_y is satisfiable. □
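Lemma 30 translates into a one-line satisfiability test. The sketch below decides it by enumeration, under the assumption that a formula is given as a Boolean function over assignments; the name undefinable is hypothetical, and a SAT solver would replace the enumeration in any realistic setting.

```python
from itertools import product

def undefinable(sigma, variables, y):
    """Lemma 30: y is undefinable in Σ iff Σ_{y←0} ∧ Σ_{y←1} is satisfiable."""
    others = [v for v in variables if v != y]
    return any(sigma({**m, y: False}) and sigma({**m, y: True})
               for values in product([False, True], repeat=len(others))
               for m in [dict(zip(others, values))])

# Σ = (c ⇔ (a ∨ b)), the formula used in Section 3.5.
sigma = lambda m: m["c"] == (m["a"] or m["b"])
print(undefinable(sigma, ["a", "b", "c"], "a"))  # True: with b and c true, a is free
print(undefinable(sigma, ["a", "b", "c"], "c"))  # False: c is defined by a ∨ b
```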

Definable propositional symbols can be characterized by means of prime implicants in this simple and elegant way:

Lemma 31. Let Σ ∈ PROP_PS and x ∈ PS. x is definable in Σ if and only if every prime implicant of Σ contains x or ¬x.

Proof. The prime implicants of Σ, or equivalently of (¬x ∧ Σ_{x←0}) ∨ (x ∧ Σ_{x←1}), that contain neither x nor ¬x are the prime implicants of Σ_{x←0} ∧ Σ_{x←1}, see e.g., [8]. Since the latter formula is unsatisfiable exactly when x is definable in Σ (cf. Lemma 30), every prime implicant of Σ contains x or ¬x if and only if x is definable in Σ. □

We have also derived the following complexity results:


Theorem 32. Let Σ ∈ PROP_PS, X, Y ⊆ PS, and x, y ∈ PS.

(1) Deciding whether y is undefinable in Σ (UNDEFINABILITY) is NP-complete.
(2) Deciding whether x is necessary for Y w.r.t. Σ (NECESSITY) is NP-complete. Hardness still holds if Y is a singleton.
(3) Deciding whether x is relevant to Y w.r.t. Σ (RELEVANCE) is in Σ^p_2 and both NP-hard and coNP-hard, hence not in NP ∪ coNP (unless the polynomial hierarchy collapses to the first level). Hardness still holds if Y is a singleton.⁵

Proof.

(1) Membership is a corollary of Lemma 30. Hardness comes from the following polynomial reduction from SAT: for any propositional formula ϕ, let Σ = ϕ ∨ y where y ∉ Var(ϕ); now, ϕ is satisfiable if and only if y is undefinable in Σ.

(2) Membership is a corollary of Point (1) above and Point (2) of Lemma 17. Hardness is a consequence of Point (1) above and the equivalence between (1) and (4) in Corollary 18.

(3) Membership is easy: guess B ⊆ Var(Σ) ∪ Y and check using a linear number of calls to an NP oracle that B is a minimal defining family for Y w.r.t. Σ. NP-hardness comes from the following polynomial reduction from SAT: for any propositional formula ϕ, let Σ = ϕ ∨ y where y ∉ Var(ϕ) and let Y = {y}; now, ϕ is satisfiable if and only if y is relevant to Y w.r.t. Σ (if ϕ is unsatisfiable then Σ ≡ y and ∅ is the unique base for Y; otherwise ∅ does not define y, so {y} is a base). coNP-hardness comes from the following polynomial reduction from UNSAT: for any propositional formula ϕ over X = {x1, . . . , xn}, let Σ = (z ⇔ y) ∧ (((z ∧ ϕ) ∨ x) ⇔ y); if ϕ is unsatisfiable, then x is relevant to Y = {y} w.r.t. Σ since {x} is a base for Y w.r.t. Σ in such a case; if ϕ is satisfiable, then x is not relevant to Y w.r.t. Σ; indeed, Σ does not define Y in terms of X ∪ {x}: let x⃗ be any X-model of ϕ; we have x⃗ ∧ ¬x ∧ Σ ≡ (z ⇔ y), showing that instantiating z is necessary to derive the truth value of y. Hence, every base for Y w.r.t. Σ must contain z; since, by construction, {z} is a base for Y w.r.t. Σ, we conclude that {z} is the unique base for Y w.r.t. Σ when ϕ is satisfiable, and this is enough to conclude the proof. □

From the definition of undefinable symbols and Lemma 17, it immediately follows that the restrictions of UNDEFINABILITY and of NECESSITY for which Σ satisfies the stability conditions listed in Definition 25 are also in P. Under such restrictions, RELEVANCE belongs to NP.

4.3. Computing explicit definitions

Theorem 8 and its corollary give us several ways of computing explicit definitions. In particular, they show that when X ⊑_Σ y, the strongest definition of y on X in Σ is Proj(Σ ∧ y, X), or equivalently, in the case y ∉ X, Proj(Σ_{y←1}, X).

Such a characterization proves particularly helpful when Σ is from a propositional fragment allowing polytime forgetting and conditioning [26]. As a consequence of Theorem 8, we get:

Lemma 33. Let C be any propositional fragment which is stable by conditioning and enables polytime forgetting (i.e., there exists a polytime algorithm for deriving a formula from C equivalent to Proj(Σ, X) for any formula Σ ∈ C and any set of symbols X). Then for any Σ ∈ C, X ⊆ PS and y ∈ PS such that X ⊑_Σ y, an explicit definition Φ_X of y on X in Σ can be computed in time polynomial in |Σ| + |X|.

Proof. If y ∈ X then y ⇔ y is an explicit definition of y on X in Σ. Otherwise, from Theorem 8, we have that Proj(Σ_{y←1}, X) is an explicit definition of y on X in Σ. Under the assumptions of the lemma, a propositional formula equivalent to Proj(Σ_{y←1}, X) can be computed in polynomial time. □
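As a rough illustration of the construction used in this proof, the sketch below computes Proj(Σ_{y←1}, X) semantically, as the set of X-worlds that extend to a model of Σ in which y is true. It does not rely on any tractable fragment: it simply enumerates assignments, so it is exponential. The representation of formulas as Python predicates and the names `assignments`, `strongest_definition`, `sigma` and `phi` are assumptions made for the example.

```python
from itertools import product


def assignments(variables):
    """Yield every truth assignment over `variables` as a dict."""
    variables = sorted(variables)
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))


def strongest_definition(sigma, variables, X, y):
    """Models of Proj(Sigma[y<-1], X): X-worlds extendable to a model of Sigma with y true."""
    X = sorted(X)
    return {
        tuple(w[x] for x in X)
        for w in assignments(variables)
        if w[y] and sigma(w)
    }


# Hypothetical formula: (y <=> (a and b)) and (c or a), with X = {a, b}.
variables = {"a", "b", "c", "y"}
sigma = lambda w: (w["y"] == (w["a"] and w["b"])) and (w["c"] or w["a"])
X = ["a", "b"]
phi_models = strongest_definition(sigma, variables, X, "y")


def phi(w):
    """Evaluate the computed definition (given by its X-models) in world w."""
    return tuple(w[x] for x in sorted(X)) in phi_models


# Check that Sigma |= (y <=> phi), i.e. that phi is an explicit definition of y.
is_definition = all(w["y"] == phi(w) for w in assignments(variables) if sigma(w))
print(phi_models, is_definition)
```

The final check is only meaningful because X defines y in this example; when it does not, the computed formula is still the strongest necessary condition of y on X, but not a definition.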

Among the influential propositional fragments enabling both operations in polynomial time are the DNNF one [7,9] and the prime implicates one (see [26]). For instance, Proj(Σ ∧ y, X) can be computed efficiently by selecting from the set IP(Σ) of prime implicates of Σ ∧ y those belonging to PROP_X (see e.g., Lemma 8 from [27]). Once this formula has been computed, the truth value of y for any x⃗ ∈ Ω_X can be computed in linear time as the truth value of Proj(Σ ∧ y, X)_{y←1}. It is interesting to note that, while both fragments enable the computation of definitions of polynomial size when Σ is known to define y in terms of X, the restrictions of DEFINABILITY they induce do not have the same complexity (unless P = NP). Thus determining whether X ⊑_Σ y is tractable when Σ is a DNNF formula and coNP-complete when Σ is a prime implicates formula (see Lemma 27 and Theorem 22).

⁵ We conjecture that this problem is Σ^p_2-complete.

The OBDD_< fragment [28,29], a famous subset of DNNF, can also be considered, provided that the variables to be forgotten (i.e., all the variables except those of X) are the final variables w.r.t. the total, strict ordering < on variables associated to the fragment [30,31]. Accordingly, the previous corollary completes some results reported in [22] (resp. in [15]), showing that when Σ is a Horn CNF formula (resp. a q-Horn CNF formula), then every variable y has an explicit definition in Σ that is equivalent to a positive conjunction of literals (resp. a conjunction of literals or a clause) (hence, is of polynomial size w.r.t. the input).

While the possibility to compute an explicit definition Φ_X of y on X in Σ in polynomial time in some restricted cases (and to determine in polynomial time that no such definition exists otherwise) ensures that the size of this definition is polynomially bounded, this cannot be guaranteed in the general case, unless P = NP (this is a direct consequence of Theorem 22).

Actually, the situation is even computationally worse in the general case, since we can prove that there is no way to compute definitions in polynomial space in the general case (under the usual assumptions of complexity theory).

Theorem 34. Let Σ be a formula from PROP_PS. Let X ⊆ PS and let y ∈ PS. In the general case, the size of any explicit definition Φ_X of y in Σ is not polynomially bounded in |Σ| + |X| unless NP ∩ coNP ⊆ P/poly.

Proof. We exploit a close connection between the definability problem and the interpolation one.

Let ϕ, ψ be two formulas from PROP_PS. A formula α from PROP_PS is an interpolant of 〈ϕ, ψ〉 if and only if Var(α) ⊆ Var(ϕ) ∩ Var(ψ) and ϕ |= α and α |= ψ hold.

Indeed, it is known that in the general case the size of any interpolant α of 〈ϕ, ψ〉 is not polynomially bounded in |ϕ| + |ψ| unless NP ∩ coNP ⊆ P/poly [32].

To every pair 〈ϕ, ψ〉, we can associate in polynomial time the pair 〈Σ, new〉 where Σ = (ψ ⇒ new) ∧ (new ⇒ ϕ), and new ∈ PS \ (Var(ϕ) ∪ Var(ψ)). The point is that ϕ |= ψ if and only if X ⊑_Σ new, where X = Var(ϕ) ∩ Var(ψ). Moreover, every interpolant of 〈ϕ, ψ〉 is a definition of new on Var(ϕ) ∩ Var(ψ) w.r.t. Σ and the converse also holds. Indeed, from Craig’s interpolation theorem in propositional logic, ϕ |= ψ holds if and only if there exists an interpolant of 〈ϕ, ψ〉. Now:

• If way. Let Φ_X be any explicit definition of new on X = Var(ϕ) ∩ Var(ψ) w.r.t. Σ. We have Σ |= new ⇔ Φ_X. This is equivalent to stating that (1) (ψ ⇒ new) ∧ (new ⇒ ϕ) |= new ⇒ Φ_X, and (2) (ψ ⇒ new) ∧ (new ⇒ ϕ) |= Φ_X ⇒ new. (1) is equivalent to (ψ ⇒ new) ∧ (new ⇒ ϕ) ∧ new ∧ ¬Φ_X is unsatisfiable, or equivalently to new ∧ ϕ ∧ ¬Φ_X is unsatisfiable. Since new ∉ Var(ψ) ∪ Var(ϕ), we have new ∉ X. Accordingly, (1) is equivalent to ϕ ∧ ¬Φ_X is unsatisfiable, i.e., ϕ |= Φ_X. From (2), it is easy to derive in a similar way that Φ_X |= ψ. Hence, any Φ_X is an interpolant of 〈ϕ, ψ〉.

• Only-if way. Let α_X be any interpolant of 〈ϕ, ψ〉. By definition, we have |= (ϕ ⇒ α_X) ∧ (α_X ⇒ ψ). Subsequently, Σ ≡ (ψ ⇒ new) ∧ (new ⇒ ϕ) ∧ (ϕ ⇒ α_X) ∧ (α_X ⇒ ψ). We immediately obtain that Σ |= new ⇔ α_X. Thus, X ⊑_Σ new and every interpolant of 〈ϕ, ψ〉 is an explicit definition of new on X = Var(ϕ) ∩ Var(ψ) w.r.t. Σ. □
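The correspondence between interpolants and definitions used in this proof can be checked on a tiny instance. The sketch below is only an illustration: the formulas ϕ = a ∧ b, ψ = a ∨ c and the interpolant α = a are hypothetical choices, formulas are encoded as Python predicates, and entailment is tested by exhaustive enumeration.

```python
from itertools import product


def assignments(variables):
    """Yield every truth assignment over `variables` as a dict."""
    variables = sorted(variables)
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))


def entails(premise, conclusion, variables):
    """premise |= conclusion, checked over all assignments of `variables`."""
    return all(conclusion(w) for w in assignments(variables) if premise(w))


variables = {"a", "b", "c", "new"}
phi = lambda w: w["a"] and w["b"]   # hypothetical phi
psi = lambda w: w["a"] or w["c"]    # hypothetical psi
alpha = lambda w: w["a"]            # built on Var(phi) ∩ Var(psi) = {a}

# Sigma = (psi => new) and (new => phi), as in the proof.
sigma = lambda w: ((not psi(w)) or w["new"]) and ((not w["new"]) or phi(w))

is_interpolant = entails(phi, alpha, variables) and entails(alpha, psi, variables)
is_definition = entails(sigma, lambda w: w["new"] == alpha(w), variables)
print(is_interpolant, is_definition)
```

Here α is both an interpolant of 〈ϕ, ψ〉 and an explicit definition of new in Σ, as the proof predicts.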

4.4. Computing a base

In this section, we present an algorithm for generating a base X for a propositional symbol y w.r.t. a formula Σ, if any. X will be required to be contained in a fixed set of “acceptable” propositional symbols V*. We call such a base a V*-base. The role of V* is to focus on interesting bases only; for instance, in a discriminability problem, V* will be the set of testable propositional symbols. In particular, if one wants to know whether y is undefinable or not in Σ, then V* is set to Var(Σ) \ {y}.

This algorithm (described by the function Find-A-Base below) is a greedy algorithm which considers all the propositional symbols of V* in any order (nevertheless the use of heuristics for determining this order may reduce the search time) and throws them away when they are not necessary for forming a base from the current set of acceptable propositional symbols. The inputs of Find-A-Base are V*, y and Σ and its output is a subset of V* or “failure”.

This algorithm calls a function Defines which checks whether a given subset of propositional symbols defines y w.r.t. Σ. How Σ is represented and how the function Defines is implemented will be discussed separately.

if not Defines(V*, y, Σ) then
    return “failure”
else
    X ← V*
    for x ∈ V* do
        if Defines(X \ {x}, y, Σ) then
            X ← X \ {x}
        end if
    end for
    return X
end if
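The greedy procedure above can be turned into a small runnable sketch. The `defines` function below is a naive stand-in for Defines: it enumerates all models of Σ and checks that no two of them agree on X while disagreeing on y, which is exponential and only meant for illustration. The names `assignments`, `defines`, `find_a_base`, the predicate encoding of formulas, and the example formula are assumptions; `None` plays the role of “failure”.

```python
from itertools import product


def assignments(variables):
    """Yield every truth assignment over `variables` as a dict."""
    variables = sorted(variables)
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))


def defines(X, y, sigma, variables):
    """True iff no two models of sigma agree on X but disagree on y."""
    seen = {}
    for w in assignments(variables):
        if sigma(w):
            key = tuple(w[x] for x in sorted(X))
            # setdefault records the first value of y seen for this X-world.
            if seen.setdefault(key, w[y]) != w[y]:
                return False
    return True


def find_a_base(v_star, y, sigma, variables):
    """Greedy search for a V*-base of y; returns None on failure."""
    if not defines(v_star, y, sigma, variables):
        return None
    X = set(v_star)
    for x in sorted(v_star):  # any order works; sorted for reproducibility
        if defines(X - {x}, y, sigma, variables):
            X.discard(x)
    return X


# Hypothetical formula: (y <=> (a or b)) and (c <=> a).
variables = {"a", "b", "c", "y"}
sigma = lambda w: (w["y"] == (w["a"] or w["b"])) and (w["c"] == w["a"])

print(find_a_base({"a", "b", "c"}, "y", sigma, variables))
print(find_a_base({"b"}, "y", sigma, variables))
```

With the alphabetical order used here, a is discarded first (c carries the same information), after which neither b nor c can be removed. Processing c before a would yield the other base {a, b}, which illustrates that the result depends on the chosen order.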

The following easy lemma states that the algorithm Find-A-Base is correct:

Lemma 35. Provided that Defines(X, y, Σ) returns true if and only if X ⊑_Σ y, Find-A-Base returns a V*-base for y w.r.t. Σ if there exists such a base, “failure” otherwise.

Proof. Straightforward. □

This algorithm can readily be extended to an algorithm for generating a base X for a set Y of propositional symbols w.r.t. a formula Σ. It suffices to replace y by Y within each call to Defines, and to extend the latter function to such sets Y (this is obvious given the definition of implicit definability).

It can also be extended to an algorithm for deriving all V*-bases for a set Y, through a judicious way to search the whole set 2^{V*} (see the set enumeration tree algorithm in [33]). This task is clearly more computationally expensive than computing a single base, especially due to the number of such bases (which can be exponential, as explained before); however, as Theorem 22 suggests, its computational cost is not solely due to the number of bases:

Theorem 36. Unless P = NP, there exists no polynomial time algorithm for computing a V*-base for a propositional symbol y w.r.t. a CNF formula Σ.

Proof. Let ϕ be a CNF formula and let y ∉ Var(ϕ); let V* = Var(ϕ) ∪ {y}; let Σ = ϕ ∨ y (which can be turned into a CNF formula in polynomial time by adding y to every clause of ϕ). If ϕ is unsatisfiable (resp. satisfiable), then ∅ (resp. {y}) is the unique V*-base for y w.r.t. Σ. If a polynomial time algorithm for computing a V*-base for y existed, then after running it on y and Σ, there would be two possibilities: either the computed base is ∅ and in this case ϕ is unsatisfiable, or it is {y} and in this case ϕ is satisfiable. But this would be a polynomial time algorithm for deciding whether a CNF formula ϕ is satisfiable. Hence, SAT would belong to P. □

Since Theorem 36 also holds when y has a single V*-base w.r.t. Σ, it strengthens Theorem 3.1 from [34] showing that there exists no polynomial total time algorithm (i.e., polynomial in the size of the input plus the size of the output) for computing all the minimal functional dependencies which hold in Σ when Σ is a CNF formula, unless P = NP.⁶

In practice, the task of deriving all V*-bases for a set Y w.r.t. Σ can be improved in some situations by computing first the set of all necessary variables and the set of all relevant variables for Y w.r.t. Σ; all irrelevant variables can be removed from V* before running the algorithm, and elements of 2^{V*} which do not contain all necessary variables can be skipped during the search.

⁶ In the same paper, the authors also showed that, when Σ is given by the set of its models (over Var(Σ)), this task is polynomially equivalent to the problem of dualizing a positive theory (or, equivalently, of computing the transversals of a hypergraph), for which no polynomial time algorithm is known but a pseudo-polynomial algorithm exists.


Clearly enough, the simple algorithm Find-A-Base above does not run in polynomial time in the worst case (since this is not the case for the function Defines, unless P = NP). This coheres with Theorem 36 showing that no such algorithm exists, unless P = NP.

Now, there are several possible ways to implement the function Defines depending on the propositional fragment Σ belongs to. If Σ is a CNF formula, then one can easily implement Defines by taking advantage of a SAT solver (and many such solvers with impressive performances are available nowadays). When the syntactic restrictions on Σ make definability polynomial, the search for a V*-base is itself polynomial because it consists of |V*| + 1 definability tests. Additionally, when Σ belongs to a propositional fragment satisfying the conditions listed in Definition 25, and X is a base for y w.r.t. Σ, the truth value of y can be computed in time polynomial in |Σ| + |X| for every X-world x⃗. Indeed, checking whether Σ_{x⃗,¬y} is satisfiable can be done in polynomial time and the result of this test gives the truth value of y.

5. Definability and hypothesis discriminability

In this section, we investigate a notion which is closely related to definability and which also has many practical applications ranging from fault isolation in diagnosis to decision under partial observability. Intuitively, given a set of propositional formulas H = {h1, . . . , hn}, which represent mutually exclusive and exhaustive hypotheses w.r.t. a knowledge base Σ (i.e., ∀h, h′ ∈ H, if h ≠ h′ then Σ |= ¬(h ∧ h′), and Σ |= ⋁_{i=1}^{n} hi) and a set X of available binary tests (encoded as propositional symbols), X discriminates H w.r.t. Σ if the knowledge of the truth values of the propositional symbols of X helps finding out which one of the hi is true.

Definition 37 (discrimination).

• The input of a discrimination problem is any triple 〈Σ, X, H〉 which consists of a consistent formula Σ, a set of test variables X s.t. X ⊆ Var(Σ) and a set H = {h1, . . . , hn} of formulas which are mutually exclusive and exhaustive w.r.t. Σ.

• X discriminates H w.r.t. Σ if and only if ∀x⃗ ∈ Ω_X ∃h ∈ H s.t. x⃗ ∧ Σ |= h.

• X discriminates minimally H w.r.t. Σ if and only if X discriminates H w.r.t. Σ and no proper subset of X does it.
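Definition 37 can be illustrated by a brute-force check. In the sketch below, which is only an illustration, the models of Σ are grouped by their restriction to X and each group must entail a single hypothesis; X-worlds inconsistent with Σ entail every hypothesis and need no check (H is assumed to be non-empty). The function names, the predicate encoding of formulas and the two-fault example are assumptions made for the example.

```python
from itertools import product


def assignments(variables):
    """Yield every truth assignment over `variables` as a dict."""
    variables = sorted(variables)
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))


def discriminates(X, hypotheses, sigma, variables):
    """True iff every X-world consistent with sigma entails some hypothesis."""
    groups = {}
    for w in assignments(variables):
        if sigma(w):
            key = tuple(w[x] for x in sorted(X))
            groups.setdefault(key, []).append(w)
    return all(
        any(all(h(w) for w in models) for h in hypotheses)
        for models in groups.values()
    )


# Hypothetical knowledge base: exactly one of f1, f2 holds, and test t reveals f1.
variables = {"f1", "f2", "t"}
sigma = lambda w: (w["f1"] != w["f2"]) and (w["t"] == w["f1"])
hypotheses = [lambda w: w["f1"], lambda w: w["f2"]]

print(discriminates({"t"}, hypotheses, sigma, variables))
print(discriminates(set(), hypotheses, sigma, variables))
```

The single test t discriminates the two hypotheses, whereas the empty set of tests does not; {t} is therefore a minimally discriminating set in this toy instance.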

There are many contexts (including diagnosis and decision under uncertainty) where one wishes to discriminate among a set of hypotheses hi, i = 1, . . . , n, given a set of available tests. Let us illustrate it, focusing on the consistency-based diagnosis setting [35] (things are similar in the abductive diagnosis setting with respect to the discrimination issue).

Definition 38 (minimal diagnosis). (See [35].) Let 〈SD, COMPS, OBS〉 be the input of a diagnosis problem (SD is a conjunction of propositional formulas representing the system description, COMPS is a set of symbols denoting the components of the system and OBS is a conjunction of literals representing the initial observations). A minimal consistency-based diagnosis for 〈SD, COMPS, OBS〉 is a minimal subset Δ of COMPS such that SD ∧ OBS ∧ AB(Δ) is consistent, where AB(Δ) is the formula ⋀_{c∈Δ} AB_c ∧ ⋀_{c∈COMPS\Δ} ¬AB_c (each AB_c is a propositional symbol meaning that the corresponding component c is “abnormal”, i.e., it does not work properly).

Definition 39 (fault isolation). Let 〈SD, COMPS, OBS〉 be the input of a diagnosis problem, and TB = {t1, . . . , tn} a test base over some of the propositional symbols of the system (set of available measures); we have TB ⊆ Var(SD ∪ OBS). The fault isolation problem is the discrimination problem defined by Σ = SD ∧ OBS, TB, and HYP = {AB(Δ) | Δ is a minimal consistency-based diagnosis for 〈SD, COMPS, OBS〉} ∪ {⋀_Δ ¬AB(Δ) | Δ is a minimal consistency-based diagnosis for 〈SD, COMPS, OBS〉}.

By construction, HYP is a set of mutually exclusive and exhaustive hypotheses w.r.t. Σ.

Interestingly, there is a direct link between hypothesis discriminability and definability:


Theorem 40. Let 〈Σ, X, H = {h1, . . . , hn}〉 be a discrimination problem. Let Σ′ = Σ ∧ ⋀_{i=1}^{n} (hi ⇔ h^i_new), where each h^i_new ∈ PS \ Var({Σ} ∪ H) is a new symbol. Then X discriminates H w.r.t. Σ if and only if Σ′ defines H_new = {h^i_new | i ∈ 1 . . . n} in terms of X.

Proof.

(⇒) If ∀x⃗ ∈ Ω_X ∃h_x⃗ ∈ H s.t. x⃗ ∧ Σ |= h_x⃗, then for every x⃗ ∈ Ω_X and every h ∈ H \ {h_x⃗} we also have x⃗ ∧ Σ |= ¬h; indeed, Σ |= ¬h_x⃗ ∨ ¬h for every h ∈ H \ {h_x⃗} since H contains mutually exclusive hypotheses given Σ. Thus, ∀x⃗ ∈ Ω_X ∀h ∈ H (x⃗ ∧ Σ |= h or x⃗ ∧ Σ |= ¬h) holds. Hence Σ′ defines H_new in terms of X.

(⇐) If Σ′ defines H_new in terms of X then ∀x⃗ ∈ Ω_X ∀h ∈ H (x⃗ ∧ Σ |= h or x⃗ ∧ Σ |= ¬h) holds. Let x⃗ ∈ Ω_X. If x⃗ ∧ Σ |= ¬h for every h ∈ H, then, since H is exhaustive given Σ, x⃗ ∧ Σ is unsatisfiable, so that x⃗ ∧ Σ |= h trivially holds for any h ∈ H. In the remaining case, there exists h ∈ H such that x⃗ ∧ Σ |= h. Hence ∀x⃗ ∈ Ω_X ∃h ∈ H s.t. x⃗ ∧ Σ |= h, i.e., X discriminates H w.r.t. Σ. □

Clearly enough, one can take advantage of this polynomial reduction and the results reported in the previous sections to compute discriminating sets and minimal discriminating sets. Thus, when dealing with mutually exclusive and exhaustive sets of hypotheses, bases can be used to design minimal test inputs [36,37] in order to isolate faulty components in model-based diagnosis (in this case hypotheses correspond to candidate diagnoses, and testable propositional symbols correspond most often to available measurements). Note that McIlraith’s notions of relevant or necessary tests [37] have some counterparts in our framework (for instance, a necessary test corresponds to a propositional symbol without which the hypothesis space cannot be discriminated). Lastly, the algorithm for computing bases described before can be used to design conditional test policies (where tests are performed sequentially and conditioned by the outcomes of previous tests; see [38] for the case of mutually exclusive hypotheses).

Conversely, the definability problem can also be reduced to the hypothesis discriminability problem (in presence of mutually exclusive hypotheses). Indeed, a consistent formula Σ defines y in terms of X if and only if X discriminates H = {y, ¬y} w.r.t. Σ. Since both reductions are polytime ones, this is enough to show that deciding whether X discriminates H w.r.t. Σ (HYPOTHESIS DISCRIMINABILITY) is a coNP-complete problem.

6. Propositional definability and reasoning about action and change

6.1. Determinism, executability, and successor state axioms

In this section, we show that definability is also closely related to several issues pertaining to reasoning about actionand change.

Let F be a finite set of fluents (i.e., a subset of PS). Define F_t = {f_t | f ∈ F} and F_{t+1} = {f_{t+1} | f ∈ F}, two sets of fluents indexed by time points. Let Σ_α be a propositional action theory describing action α, that is, a formula of PROP_{F_t ∪ F_{t+1}}, such that (f⃗_t, f⃗′_{t+1}) |= Σ_α holds if and only if f⃗′ is a possible successor state of f⃗ by α. The transition function for α is the binary relation R_α on Ω_F defined by R_α(f⃗, f⃗′) iff (f⃗_t, f⃗′_{t+1}) |= Σ_α. Then:

• α is deterministic if for every f⃗ ∈ Ω_F there is at most one f⃗′ ∈ Ω_F such that R_α(f⃗, f⃗′).

• α is fully executable if for every f⃗ ∈ Ω_F there is a f⃗′ ∈ Ω_F such that R_α(f⃗, f⃗′).

Now, it is easy to check that determinism and full executability are expressed simply within the notions of definability and strong consistency:

Lemma 41.

• α is deterministic if and only if F_t ⊑_{Σ_α} F_{t+1}.

• α is fully executable if and only if Σ_α is strongly F_t-consistent.

Proof. Straightforward. □


Putting these two points together, α is deterministic and fully executable if and only if F_{t+1} is unambiguously defined from F_t w.r.t. Σ_α.

Furthermore, even when an action α is not “fully” deterministic, it may be deterministic for some fluents. Let us say that α is deterministic for f if and only if for any f⃗ ∈ Ω_F and any two states f⃗′_1, f⃗′_2 ∈ Ω_F such that R_α(f⃗, f⃗′_1) and R_α(f⃗, f⃗′_2), f⃗′_1 and f⃗′_2 give the same truth value to f. Clearly, definability allows for identifying the fluents for which α is deterministic. Moreover, when α is deterministic for f, any definition of f_{t+1} on F_t in Σ_α corresponds to a successor state axiom [39]. (See also the final example of [11].)

Example 42. Let α, β and γ be the actions defined by the following theories:

Σ_α = (a_{t+1} ⇔ ¬a_t) ∧ (a_t ⇒ b_{t+1}) ∧ (b_t ⇒ b_{t+1}) ∧ ((¬a_t ∧ ¬b_t) ⇒ ¬b_{t+1}),

Σ_β = (a_{t+1} ⇔ ¬a_t) ∧ (a_t ⇒ b_{t+1}) ∧ (b_t ⇒ (a_{t+1} ∧ b_{t+1})) ∧ ((¬a_t ∧ ¬b_t) ⇒ ¬b_{t+1}),

Σ_γ = (a_{t+1} ⇔ ¬a_t) ∧ (a_t ⇒ b_{t+1}) ∧ (b_t ⇒ b_{t+1}).

{a_t, b_t} ⊑_{Σ_α} {a_{t+1}, b_{t+1}} holds, therefore α is deterministic, and Σ_α is strongly {a_t, b_t}-consistent, therefore α is fully executable. The successor state axiom for b corresponds to the definition of b (it is unique up to logical equivalence, due to Theorem 20): b_{t+1} ≡ (a_t ∨ b_t).

Σ_β is not strongly {a_t, b_t}-consistent, because a_t ∧ b_t ∧ Σ_β is inconsistent. Therefore, β is not fully executable (but it is deterministic). There are two non-equivalent definitions of fluents at time t + 1, hence two successor state axioms for b: b_{t+1} ≡ (a_t ∨ b_t) and b_{t+1} ≡ (a_t ⇔ ¬b_t).

{a_t, b_t} ⊑_{Σ_γ} {a_{t+1}, b_{t+1}} does not hold, therefore γ is not deterministic (but it is fully executable). However, it is deterministic for a since {a_t, b_t} ⊑_{Σ_γ} a_{t+1} holds.
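The claims of Example 42 can be checked mechanically by enumerating the transition relation of each action. The sketch below is a direct encoding of the three theories as Python predicates over (a_t, b_t, a_{t+1}, b_{t+1}); the helper names (`successors`, `is_deterministic`, `is_fully_executable`, `is_deterministic_for`) are hypothetical and simply mirror the definitions of determinism and full executability given earlier in this section.

```python
from itertools import product

STATES = list(product([False, True], repeat=2))  # states are pairs (a, b)


def implies(p, q):
    return (not p) or q


def sigma_alpha(a, b, a1, b1):
    return ((a1 == (not a)) and implies(a, b1) and implies(b, b1)
            and implies(not a and not b, not b1))


def sigma_beta(a, b, a1, b1):
    return ((a1 == (not a)) and implies(a, b1) and implies(b, a1 and b1)
            and implies(not a and not b, not b1))


def sigma_gamma(a, b, a1, b1):
    return (a1 == (not a)) and implies(a, b1) and implies(b, b1)


def successors(theory, state):
    """All states s1 such that (state_t, s1_{t+1}) satisfies the theory."""
    return [s1 for s1 in STATES if theory(*state, *s1)]


def is_deterministic(theory):
    return all(len(successors(theory, s)) <= 1 for s in STATES)


def is_fully_executable(theory):
    return all(len(successors(theory, s)) >= 1 for s in STATES)


def is_deterministic_for(theory, index):
    """Determinism for one fluent (index 0 is a, index 1 is b)."""
    return all(len({s1[index] for s1 in successors(theory, s)}) <= 1
               for s in STATES)


for name, theory in [("alpha", sigma_alpha), ("beta", sigma_beta),
                     ("gamma", sigma_gamma)]:
    print(name, is_deterministic(theory), is_fully_executable(theory))
```

The enumeration confirms that β has no successor from the state where a and b both hold, and that γ leaves b undetermined when neither a nor b holds while remaining deterministic for a.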

Note that usually, we are given initially a set of causal rules from which, using some completion process (e.g., [40,41]), we compute successor state axioms and then finally Σ_α. Computing successor state axioms as definitions is the reverse process of the latter completion process: from an action theory already compiled in its propositional form Σ_α, we find the successor state axioms (and then possibly a compact description of the effects of α by causal rules).

Due to the connections made precise by Lemma 41, many notions and results of the paper apply to reasoning about action.

For instance, as a direct consequence of Theorem 20 and Lemma 41, when an action is fully executable and deterministic for f, there exists only one successor state axiom for f (up to logical equivalence); it is indeed the case for α, but not for β in Example 42.

Definability also proves useful for characterizing regression. Given a propositional formula ψ ∈ PROP_PS, the (deductive) regression of ψ by α is the formula reg(ψ, α) (unique up to logical equivalence) such that Mod(reg(ψ, α)) = ⋃_{f⃗′ |= ψ} R_α^{-1}(f⃗′). The abductive regression of ψ by α is the formula Reg(ψ, α) (unique up to logical equivalence) such that Mod(Reg(ψ, α)) = {f⃗ | R_α(f⃗) ⊆ Mod(ψ)}. While we have Reg(ψ, α) |= reg(ψ, α) in the general case, reg(ψ, α) and Reg(ψ, α) are equivalent when α is deterministic [41].

For any formula ϕ from PROP_F, let us note ϕ_t the formula from PROP_{F_t} obtained by substituting in a uniform way in ϕ every symbol a by a_t; we have:

Theorem 43. Let α be a deterministic and fully executable action. For any fluent f ∈ F and any formula ψ ∈ PROP_F, reg(ψ, α)_t (or equivalently Reg(ψ, α)_t) is equivalent to any definition of z_{t+1} on F_t in Σ_α ∧ (ψ_{t+1} ⇔ z_{t+1}), where z is a fresh symbol (not in F).

Proof. First observe that from Theorem 20 and Lemma 41, it makes sense to consider any definition (since all of them are equivalent). Now, we have reg(ψ, α)_t ≡ Proj(Σ_α ∧ ψ_{t+1}, F_t) (see Proposition 5 in [41]). The latter formula is also equivalent to Proj(Σ_α ∧ (ψ_{t+1} ⇔ z_{t+1}) ∧ z_{t+1}, F_t). Finally, Theorem 8 concludes the proof. □

This result can be generalized to actions that are not fully executable. We omit it for the sake of brevity, as well as applications of definability to progression and planning.


6.2. Ramification

Another role of definability in reasoning about change is in the handling of ramification, or indirect action effects. A way to address this well-known problem consists in finding out fluents that can be derived from primitive ones (called a frame) within the knowledge base, and in applying change on reduced world descriptions (composed of primitive fluents only) [42]. Many formalisms for reasoning about change adhere to this approach, which has been implemented in various planning systems (e.g., in the early system BUILD [43]).

Let us describe more formally the role of definability for dealing with the ramification problem. Let F be a set of fluents, and Σ be a propositional formula expressing some constraints on the values that fluents may take (at any time point). Finding a partition of F between a set F_P of primary fluents and a set F_D of derived fluents comes down to finding a base for F with respect to Σ. Clearly, several choices are generally possible, since BS_Σ(F) is generally not a singleton. The goal being to come up with action descriptions that are as concise as possible, a good heuristic consists in choosing a base of F of minimum cardinality.

7. Yet another application to AI: Automated reasoning

The notion of definability proves valuable in automated reasoning for several tasks. For instance, identifying functionally dependent propositional symbols is a way for finding out variable orderings that may prevent the OBDD representation of a formula from an exponential size blowup [44].

Identifying definability relations between variables can also prove useful for the satisfiability issue. [45] have shown how definability can be exploited in local search for the satisfiability problem. The idea is to concentrate the search on undefinable variables, and to handle the remaining ones by exploiting definability relations. They reported some empirical results showing their algorithm DAGSAT valuable. [46] considered the role of definability relations (what they called gates) to reduce the search space explored by complete DPLL-like algorithms for SAT. In a nutshell, the idea is that a definable variable should not be elected by a branching rule before the variables from a base for it have been assigned. Accordingly, the undefinable variables of the input CNF formula Σ should be considered first. Since deciding definability relations is a coNP-complete problem, they considered only those relations that can be discovered through (linear time) unit propagation of literals (such relations include equivalent literals which have been considered in several other papers, see e.g., [47]); in [46], the explicit definitions of variables which are discovered take the form of a conjunction of literals or a clause (depending on the sign of the propagated literal); interestingly, once the variables occurring in an explicit definition of y have been assigned, unit propagation in Σ proves enough to get y assigned as well. The resulting set of functional dependencies induces a “relevance” graph whose set of vertices is Var(Σ) and whose set of arcs contains (x, z) whenever one of the found definitions of variable z bears on variable x. When no undefinable variables occur in Σ (or the CNF formulas obtained by conditioning and simplifying Σ at subsequent steps of the algorithm), the corresponding “relevance graph” contains no source (i.e., a node of incoming degree 0); then polynomial time heuristics for approximating a minimal cycle cutset of the graph are used, and the variables from the resulting set (also known as a strong backdoor) are assigned first. This approach exhibited interesting performances on some benchmarks used during the SAT’02 and SAT’03 competitions, and appeared as the best performer on hand-made instances at the SAT’03 competition. [48] also reported on the possible advantages and drawbacks of taking advantage of such “independent (i.e., undefinable) variable selection” heuristics.

8. Other related work

As evoked previously, propositional definability is closely related to the notions of strongest necessary and weakest sufficient conditions and to the notion of functional dependencies in propositional logic. In this section, we make precise the main differences between the contribution of the present paper and the (closest) related ones from the literature. Before concluding the paper, we also briefly present some other related work, where definability is considered in more complex logical settings than classical propositional logic.


8.1. Functional dependencies

The closest work to our own is described in three papers by Ibaraki, Kogan and Makino [15,22,34]. In those papers, Ibaraki, Kogan and Makino presented a number of results related to functional dependencies.

In [15,22], they reported many very interesting results about issues that we mainly ignored here. Among them is the condensation issue: the basic idea comes from the observation that when X ⊑_Σ y and y ∉ X, then Σ can be simplified by “removing” y (i.e., forgetting y in Σ), while keeping track of an explicit definition of y on X in Σ; at the semantical level, no loss of information results from such a process; condensing Σ consists in repeating it in an iterative way, until reaching a formula without any non-trivial functional dependency. While the result of the condensing procedure is not unique in general (it depends on the functional dependency chosen at each step), Ibaraki, Kogan and Makino have shown that it is unique when Σ is a Horn CNF formula or more generally a q-Horn CNF formula (given as such or by its corresponding envelope), and that the condensing process can be achieved in polynomial time in such a case. In [34], the authors considered the problem of computing all the minimal functional dependencies which hold in Σ. Among other things, they showed that there exists an incrementally polynomial algorithm for achieving this goal when Σ is a Horn CNF formula, or more generally, a q-Horn CNF formula, while the problem is equivalent to the problem of dualizing a positive theory when Σ is equivalent to a Horn CNF formula (resp. q-Horn CNF formula) but is given by the Horn (resp. q-Horn) envelope of its models.

A major difference with our present work is that Ibaraki, Kogan and Makino mainly focused on Horn and q-Horn formulas, while our results are mainly about (unconstrained) propositional formulas. Actually, the few results from [15,22,34] which are related to unconstrained propositional formulas have been exhaustively listed in Sections 3.1, 4.1, and 4.3. Some of our results generalize their results (e.g., our Theorem 26 gives more tractable classes for the (minimal) definability problem than just the Horn or q-Horn one), and some other results complete them (e.g., the results presented in Section 4.3, about the computation of explicit definitions, address the general case and, again, give other tractable classes for this issue than just the Horn or q-Horn one).

8.2. Strongest necessary and weakest sufficient conditions

The work by Lin [11] is concerned with strongest necessary and weakest sufficient conditions. In Section 3.2, we have shown close connections between definability and strongest necessary (SNC)/weakest sufficient conditions (WSC). While Proposition 2 from [11] characterizes definability in terms of WSC and SNC, we have shown how to characterize all the definitions of y on X in Σ in terms of SNC and WSC.

First, Theorem 10 shows how SNC and WSC can be characterized using the notion of projection. It extends Theorem 2 from [11] by relaxing the assumption that y ∈ Var(Σ) and y ∉ X, and focuses on the logically strongest necessary condition Proj(Σ ∧ y, X) (resp. the logically weakest sufficient condition ¬Proj(Σ ∧ ¬y, X)) of y on X w.r.t. Σ, up to logical equivalence. Then Theorem 8 shows that a formula Φ_X built up from X is a definition of y on X in Σ if and only if Proj(Σ ∧ y, X) ⊨ Φ_X ⊨ ¬Proj(Σ ∧ ¬y, X).
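The sandwich condition between the SNC and the WSC can be checked by brute force on small formulas. The following Python sketch is only an illustration under stated assumptions: formulas are represented as Python predicates over truth assignments, projections are computed by enumerating all assignments (exponential in the number of variables), and the names `assignments`, `proj`, `is_definition` and the example formula `sigma` are hypothetical, not taken from the paper.

```python
from itertools import product

def assignments(variables):
    """Yield every truth assignment over `variables` as a dict."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

def proj(formula, variables, xs):
    """Projection onto xs, encoded as the set of X-assignments (tuples
    ordered like xs) that can be extended to a model of `formula`."""
    return {tuple(w[x] for x in xs)
            for w in assignments(variables) if formula(w)}

def is_definition(candidate, sigma, variables, xs, y):
    """Check Proj(sigma & y, X) |= candidate |= not Proj(sigma & not y, X)."""
    snc = proj(lambda w: sigma(w) and w[y], variables, xs)
    neg_wsc = proj(lambda w: sigma(w) and not w[y], variables, xs)
    for w in assignments(xs):
        key = tuple(w[x] for x in xs)
        holds = candidate(w)
        if key in snc and not holds:   # the SNC must entail the candidate
            return False
        if holds and key in neg_wsc:   # the candidate must entail the WSC
            return False
    return True

# Hypothetical example: sigma = (y <-> (a and b)) and (a -> b)
VARS, XS = ["a", "b", "y"], ["a", "b"]

def sigma(w):
    return (w["y"] == (w["a"] and w["b"])) and ((not w["a"]) or w["b"])

print(is_definition(lambda w: w["a"] and w["b"], sigma, VARS, XS, "y"))
print(is_definition(lambda w: w["a"], sigma, VARS, XS, "y"))
print(is_definition(lambda w: w["b"], sigma, VARS, XS, "y"))
```

On this example, both `a ∧ b` and `a` are accepted as definitions of y on {a, b}, since the two are equivalent under a → b, while `b` is rejected. When Σ does not define y in terms of X, the two projections overlap and no candidate passes the check.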

8.3. Definability in other logical settings

Since Padoa and Beth, there has been a considerable amount of work on definability and interpolation in various classes of logics. A logic is said to have the projective Beth definability property if and only if implicit definability equals explicit definability. As pointed out in [5], since implicit definability is a semantic (model-theoretic) concept whereas explicit definability is a syntactic (proof-theoretic) one, the fact that both forms of definability coincide in a given logic is a good indication that syntax and semantics are well balanced in that logic. There are two main streams of work: definability in fragments of first-order logic, and definability in propositional modal logics. We briefly discuss these two streams by pointing to some of the most relevant references. A comprehensive review can be found in Chapter 2 of [5]. See also the excellent book [49] for connections with second-order quantification in many propositional logics.

Definability in predicate logic starts with Padoa’s work and, later on, Beth’s theorem. The latter [4] shows that first-order logic has the definability property. The question is then whether given fragments of first-order logic still have the property. For instance, the k-variable fragment of first-order logic fails to satisfy it [50,51], as do a large number of first-order modal logics (e.g., [52]), while it holds in intuitionistic predicate logic [53,54]. Definability in fragments of first-order logic also has an impact on the database community (e.g., [55]).

As for propositional logics, a large number thereof satisfy the definability property. For instance, Kreisel [56] proves that this is the case for any logic between classical propositional logic and intuitionistic propositional logic. A large number of works have concentrated on modal logics (e.g., [57–59]), and especially (and more AI related) on description logics [60,61] (the latter paper focuses on computational issues; in particular, it gives bounds on the size of explicit definitions).

Our work, focusing on classical propositional logic, is not of the same nature as most of the abovementioned works (apart from the works about description logics). Our focus is on the computational issues of the problems related to definability, as well as on the applications to artificial intelligence problems.

9. Conclusion

This paper is centered on definability in standard propositional logic and reports a number of results from our computation-oriented investigation of this notion. In particular, we gave several characterization results, and complexity results for definability and related notions. We also presented a number of applications of such results to several AI problems, including hypothesis discrimination, reasoning about actions and automated reasoning.

This work opens a number of perspectives. First, an alternative way of characterizing logical definability (and related notions) would consist in expressing it in epistemic logic, remarking that for any propositional formula Σ ∈ PROP_PS, X ⊆ PS, and y ∈ PS, Σ defines y in terms of X if and only if

$$\Bigl(K\Sigma \wedge \bigwedge_{x \in X} (Kx \vee K\neg x)\Bigr) \Rightarrow (Ky \vee K\neg y)$$

is a theorem of S5. From this we can derive characterizations for other notions, such as minimal defining families, undefinable variables, etc. The results stated in the paper would then be easily reformulated (in different terms) in this setting.
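Read semantically, the characterization above says that in any set of models of Σ on which every x ∈ X has a fixed truth value, y has a fixed truth value too. A minimal Python sketch of this reading follows. It assumes formulas are given as Python predicates and enumerates all assignments, so it is only usable on tiny examples. The helper names `models` and `defines` and the example formula `sigma` are hypothetical.

```python
from itertools import product

def models(formula, variables):
    """Return all models of `formula` over `variables`, as dicts."""
    result = []
    for values in product([False, True], repeat=len(variables)):
        w = dict(zip(variables, values))
        if formula(w):
            result.append(w)
    return result

def defines(formula, variables, xs, y):
    """True iff any two models of `formula` that agree on xs agree on y."""
    seen = {}  # X-assignment -> value of y observed so far
    for w in models(formula, variables):
        key = tuple(w[x] for x in xs)
        if seen.setdefault(key, w[y]) != w[y]:
            return False
    return True

# Hypothetical example: sigma = (y <-> (a xor b))
VARS = ["a", "b", "y"]

def sigma(w):
    return w["y"] == (w["a"] != w["b"])

print(defines(sigma, VARS, ["a", "b"], "y"))  # y is fixed once a and b are known
print(defines(sigma, VARS, ["a"], "y"))       # knowing a alone is not enough
```

Here {a, b} defines y while {a} does not. Iterating `defines` over subsets of the variables would give a naive way to enumerate minimal defining families on small instances.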

Second, the notion of definability studied in this paper is rather strong, and it would be worth relaxing it. Doing so is not easy if the background knowledge Σ is still expressed by a mere propositional formula; now, if instead of Σ we have a probability distribution over Ω, expressed succinctly for instance by a Bayesian network N whose induced probability distribution is p_N, then definability becomes itself a probabilistic, decision-theoretic notion. Defining

$$\delta(N, X, y) = \sum_{\vec{x} \in \Omega_X} p_N(\vec{x}) \cdot \max\bigl(p_N(y \mid \vec{x}),\; p_N(\neg y \mid \vec{x})\bigr),$$

δ(N, X, y) can be interpreted as the prior probability of guessing the right value of y after observing the values of the variables in X, and is probably the most natural generalization of definability (obviously, Σ defines y in terms of X in the usual way if and only if δ(N, X, y) = 1). Thus, a natural decision problem in this setting would be: given a Bayesian network N, a set X of variables, a variable y and α ∈ [0,1], determine whether δ(N, X, y) ≥ α. Another notion of probabilistic definability arises when the probabilistic background knowledge is expressed by a set of constraints in probabilistic logic [62]; in this case we do not have a single probability distribution but a set of probability distributions, and the latter notion must be updated accordingly. A computational investigation of these probabilistic notions of definability is left for further research.
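To make the quantity δ(N, X, y) concrete, the sketch below computes it from an explicitly tabulated joint distribution. This is an assumption made for illustration: the text has a Bayesian network N in mind, whereas the code takes the induced distribution p_N as a plain dictionary and uses the identity p_N(x)·max(p_N(y|x), p_N(¬y|x)) = max(p_N(x, y), p_N(x, ¬y)). The function name `delta` and the example numbers are hypothetical.

```python
from collections import defaultdict

def delta(joint, variables, xs, y):
    """Probability of guessing y correctly after observing the variables in xs.

    `joint` maps each full assignment (a tuple of booleans ordered like
    `variables`) to its probability.
    """
    index = {v: i for i, v in enumerate(variables)}
    # For each X-assignment, accumulate [p(x, not y), p(x, y)].
    mass = defaultdict(lambda: [0.0, 0.0])
    for world, p in joint.items():
        key = tuple(world[index[x]] for x in xs)
        mass[key][int(world[index[y]])] += p
    # p(x) * max(p(y|x), p(not y|x)) equals max(p(x, y), p(x, not y)).
    return sum(max(pair) for pair in mass.values())

# Hypothetical joint distribution over (a, y).
VARS = ["a", "y"]
joint = {
    (True, True): 0.4,
    (True, False): 0.1,
    (False, True): 0.0,
    (False, False): 0.5,
}

print(delta(joint, VARS, ["a"], "y"))  # observing a
print(delta(joint, VARS, [], "y"))     # observing nothing
```

With these numbers, observing a gives δ = 0.4 + 0.5 = 0.9, while observing nothing gives max(0.4, 0.6) = 0.6. A distribution in which y is a deterministic function of a would give δ = 1, matching the non-probabilistic notion.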

Acknowledgements

The authors would like to thank the anonymous reviewers for their helpful comments. The second author has been partly supported by the IUT de Lens, the Université d’Artois, the Région Nord/Pas-de-Calais, the IRCICA consortium and by the European Community FEDER Program.

References

[1] J. Lang, P. Liberatore, P. Marquis, Conditional independence in propositional logic, Artificial Intelligence 141 (1–2) (2002) 79–121.
[2] J. Lang, P. Liberatore, P. Marquis, Propositional independence: formula-variable independence and forgetting, Journal of Artificial Intelligence Research 18 (2003) 391–443.
[3] W. Craig, Three uses of the Herbrand–Gentzen theorem in relating model theory and proof theory, Journal of Symbolic Logic 22 (1957) 269–285.
[4] E. Beth, On Padoa’s method in the theory of definition, Indagationes Mathematicae 15 (1953) 330–339.
[5] E. Hoogland, Definability and interpolation: model-theoretic investigations, Ph.D. thesis, Institute for Logic, Language and Computation, University of Amsterdam, 2001.
[6] E. Boros, Y. Crama, P. Hammer, Polynomial-time inference of all valid implications for Horn and related formulae, Annals of Mathematics and Artificial Intelligence 1 (1990) 21–32.
[7] A. Darwiche, Compiling devices into decomposable negation normal form, in: Proc. of IJCAI’99, 1999, pp. 284–289.
[8] P. Marquis, Consequence finding algorithms, in: Handbook on Defeasible Reasoning and Uncertainty Management Systems, vol. 5, Kluwer Academic Publisher, 2000, pp. 41–145, Chapter 2.
[9] A. Darwiche, Decomposable negation normal form, Journal of the Association for Computing Machinery 48 (4) (2001) 608–647.
[10] F. Lin, R. Reiter, Forget it!, in: Proc. of the AAAI Fall Symposium on Relevance, New Orleans, 1994, pp. 154–159.
[11] F. Lin, On the strongest necessary and weakest sufficient conditions, Artificial Intelligence 128 (2001) 143–159.
[12] M. Krom, The decision problem for formulas in prenex conjunctive normal form with binary disjunctions, Journal of Symbolic Logic 35 (1970) 210–216.
[13] A. Horn, On sentences which are true of direct unions of algebras, Journal of Symbolic Logic 16 (1951) 14–21.
[14] H. Lewis, Renaming a set of clauses as a Horn set, Journal of the Association for Computing Machinery 25 (1978) 134–135.
[15] T. Ibaraki, A. Kogan, K. Makino, On functional dependencies in q-Horn theories, Artificial Intelligence 131 (1–2) (2001) 171–187.
[16] A. Blake, Canonical expressions in boolean algebra, Ph.D. thesis, University of Chicago, Chicago, IL, 1937.
[17] E. Boros, P. Hammer, X. Sun, Recognition of q-Horn formulae in linear time, Discrete Applied Mathematics 55 (1994) 1–13.
[18] P. Tison, Generalization of consensus theory and application to the minimization of boolean functions, IEEE Transactions on Electronic Computers EC-16 (1967) 446–456.
[19] C.H. Papadimitriou, Computational Complexity, Addison-Wesley, 1994.
[20] W. Armstrong, Dependency structures of database relationships, in: Proc. of IFIP’74, 1974, pp. 580–583.
[21] R. Fagin, Functional dependencies in a relational database and propositional logic, IBM Journal of Research and Development 21 (1977) 534–544.
[22] T. Ibaraki, A. Kogan, K. Makino, Functional dependencies in Horn theories, Artificial Intelligence 108 (1–2) (1999) 1–30.
[23] T. Castell, Computation of prime implicates and prime implicants by a variant of the Davis and Putnam procedure, in: Proc. of ICTAI’96, IEEE Computer Society, Washington, DC, USA, 1996, p. 428.
[24] A. Padoa, Essai d’une théorie algébrique des nombres entiers, précédé d’une introduction logique à une théorie déductive quelconque, in: Bibliothèque du Congrès International de Philosophie, Paris, 1903, pp. 309–365.
[25] A. Darwiche, P. Marquis, A knowledge compilation map, Journal of Artificial Intelligence Research 17 (2002) 229–264.
[26] A. Darwiche, P. Marquis, A perspective on knowledge compilation, in: Proc. of IJCAI’01, 2001, pp. 175–182.
[27] G. Lakemeyer, A logical account of relevance, in: Proc. of IJCAI’95, 1995, pp. 853–859.
[28] S. Akers, Binary decision diagrams, IEEE Transactions on Computers C-27 (6) (1978) 509–516.
[29] R. Bryant, Graph-based algorithms for boolean function manipulation, IEEE Transactions on Computers C-35 (8) (1986) 677–692.
[30] S. Coste-Marquis, D. Le Berre, F. Letombe, P. Marquis, Propositional fragments for knowledge compilation and quantified boolean formulae, in: Proc. of AAAI’05, 2005, pp. 288–293.
[31] H. Fargier, P. Marquis, On the use of partially ordered decision graphs in knowledge compilation and quantified boolean formulae, in: Proc. of AAAI’06, 2006, pp. 42–47.
[32] D. Mundici, Tautologies with a unique Craig interpolant, uniform vs. non-uniform complexity, Annals of Pure and Applied Logic 27 (1974) 265–273.
[33] R. Rymon, An SE-tree-based prime implicant generation algorithm, Annals of Mathematics and Artificial Intelligence 11 (1994) 329–349, special issue on model-based diagnosis.
[34] T. Ibaraki, A. Kogan, K. Makino, Inferring minimal functional dependencies in Horn and q-Horn theories, Annals of Mathematics and Artificial Intelligence 38 (2003) 233–255.
[35] R. Reiter, A theory of diagnosis from first principles, Artificial Intelligence 32 (1987) 57–95.
[36] P. Struss, Testing for the discrimination of diagnoses, in: Proc. of DX’94, 1994, pp. 312–320.
[37] S. McIlraith, Generating tests using abduction, in: Proc. of KR’94, 1994, pp. 449–460.
[38] J. Lang, Planning to discriminate diagnoses, in: Proc. of DX’97, 1997, pp. 135–139.
[39] R. Reiter, Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems, MIT Press, 2001.
[40] E. Giunchiglia, V. Lifschitz, An action language based on causal explanation: Preliminary report, in: Proc. of AAAI’98, 1998, pp. 623–630.
[41] J. Lang, F. Lin, P. Marquis, Causal theories of action: a computational core, in: Proc. of IJCAI’03, 2003, pp. 1073–1078.
[42] V. Lifschitz, Frames in the space of situations (research note), Artificial Intelligence 46 (1990) 365–376.
[43] S. Fahlman, A planning system for robot construction tasks, Artificial Intelligence 5 (1974) 1–49.
[44] A.J. Hu, D.L. Dill, Reducing BDD size by exploiting functional dependencies, in: Proc. of 30th ACM/IEEE Design Automation Conference, 1993, pp. 266–271.
[45] H. Kautz, D. McAllester, B. Selman, Exploiting variable dependency in local search (abstract), in: Proc. of IJCAI’97 (poster session), 1997, p. 57.
[46] É. Grégoire, R. Ostrowski, B. Mazure, L. Sais, Automatic extraction of functional dependencies, in: Proc. of SAT’04 (Selected Papers), in: Lecture Notes in Computer Science, vol. 3542, Springer, 2005, pp. 122–132.
[47] D. Le Berre, Exploiting the real power of unit propagation lookahead, in: Proc. of SAT’01, in: Electronic Notes in Discrete Mathematics, vol. 9, Elsevier Science Publishers, 2001.
[48] E. Giunchiglia, M. Maratea, A. Tacchella, Dependent and independent variables in propositional satisfiability, in: Proc. of JELIA’02, 2002, pp. 296–307.
[49] S. Ghilardi, M. Zawadowski, Sheaves, Games, and Model Completions: A Categorical Approach to Nonclassical Propositional Logics, Trends in Logic, Studia Logica Library, vol. 14, Kluwer Academic Publishers, Dordrecht, 2002.
[50] I. Sain, Beth’s and Craig’s properties via epimorphisms and amalgamation in algebraic logic, in: Algebraic Logic and Universal Algebra in Computer Science, in: Lecture Notes in Computer Science, vol. 24, Springer, 1990, pp. 209–225.
[51] I. Hodkinson, Finite variable logics, Bulletin of the European Association for Theoretical Computer Science 51 (1993) 111–140.
[52] K. Fine, Failure of the interpolation lemma in quantified modal logic, Journal of Symbolic Logic 44 (2) (1979) 201–206.
[53] K. Schütte, Der Interpolationssatz der intuitionistischen Prädikatenlogik, Mathematische Annalen 148 (1962) 192–200.
[54] D. Gabbay, Semantic proof of Craig’s interpolation theorem for intuitionistic logic and extensions, in: Proc. of Logic Colloquium’69, 1971, pp. 403–410.
[55] L. Segoufin, V. Vianu, Views and queries: determinacy and rewriting, in: Proc. of PODS’05, 2005, pp. 49–60.
[56] G. Kreisel, Explicit definability in intuitionistic logic, Journal of Symbolic Logic 25 (1960) 389–390.
[57] L. Maksimova, An analog of Beth’s theorem in normal extensions of the modal logic K4, Siberian Mathematical Journal 33 (6) (1993) 1052–1065.
[58] A. Urquhart, Beth’s definability theorem in relevant logics, in: E. Orlowska (Ed.), Logic at Work: Essays Dedicated to the Memory of Helena Rasiowa, Springer-Verlag, 1999, pp. 229–234.
[59] L. Maksimova, Definability and interpolation in non-classical logics, Studia Logica 82 (2) (2006) 271–291.
[60] F. Baader, W. Nutt, Basic description logics, in: Description Logic Handbook, 2003, pp. 43–95.
[61] B. ten Cate, W. Conradie, M. Marx, Y. Venema, Definitorially complete description logics, in: Proc. of KR’06, 2006, pp. 79–89.
[62] N. Nilsson, Probabilistic logic, Artificial Intelligence 28 (1987) 71–87.