06-25433 – Logic Programming
Definite Clause Grammar
Context-Free Grammar (CFG) is introduced as a way of writing rules about structured knowledge. Definite Clause Grammar (DCG) is Prolog’s in-built notation for writing CFGs.
12 - Definite Clause Grammar 1
This lecture …
Writing DCGs is introduced showing:
– the basic framework;
– embedding calls to “ordinary” Prolog;
– building structures.
DCGs suffer from problems with left-recursive rules.
DCGs are a general programming tool with applications beyond language parsing.
Type testing by scanning
Imagine we have a Prolog term and want to decide if it is:
• integer (e.g. 1, 123, …)
• atom (e.g. a, abc, aBC12)
• variable (e.g. Butter, _123)
Also, assume the characters of the term are given as a list of single-character atoms, e.g. ['1', '2', '3'] or [a, 'B', 'C', '1', '2'].
What do we know about atoms, integers and variables?
(For the time being), an atom begins with a lowercase letter and can be followed by any uppercase or lowercase letter, digit or ‘_’:
atom ::= lowercase other_symb
other_symb ::= lowercase other_symb |
uppercase other_symb |
digit other_symb |
underscore other_symb |
"" % i.e. nothing
Writing this in Prolog
term(atom) -->
lower_case, remaining_terms.
lower_case -->
[Letter],
{ Letter @>= 'a',
Letter @=< 'z' }.
… continued
Code within { … } is treated as “normal” Prolog code.
@>=, @>, @=< and @< compare two terms according to the standard order of terms (for atoms, alphabetical order).
Writing this in Prolog
remaining_terms -->
( lower_case ;
upper_case ;
under_score ;
digit ),
remaining_terms.
remaining_terms -->
[].
“;” is another way of expressing OR-choice. It is best used only when the options are deterministic.
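The rules so far can be put together into a recogniser that runs. This is a minimal sketch: the definitions of upper_case, digit and under_score are not given on the slides, so the versions here are assumptions modelled on lower_case, and the input is assumed to be a list of single-character atoms.

```prolog
% Recognise an atom spelt out as a list of single-character atoms.
term(atom) -->
    lower_case, remaining_terms.

remaining_terms -->
    ( lower_case ; upper_case ; under_score ; digit ),
    remaining_terms.
remaining_terms -->
    [].

lower_case -->
    [Letter],
    { Letter @>= 'a', Letter @=< 'z' }.

% Assumed definitions, written in the same style as lower_case.
upper_case -->
    [Letter],
    { Letter @>= 'A', Letter @=< 'Z' }.

digit -->
    [Digit],
    { Digit @>= '0', Digit @=< '9' }.

under_score -->
    ['_'].
```

A grammar is normally called through phrase/2, e.g. ?- phrase(term(Type), [a, 'B', 'C', '1', '2']). binds Type = atom, while ?- phrase(term(Type), ['1', '2', '3']). fails because the first symbol is not a lowercase letter.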
What does this mean?
lower_case -->
[Letter],
{ Letter @>= 'a',
Letter @=< 'z' }.
is Prolog short-hand for writing:
lower_case([Letter|S], S) :-
Letter @>= 'a',
Letter @=< 'z'.
What does this mean?
term(atom) -->
lower_case, remaining_terms.
is Prolog short-hand for writing:
term(atom, S0, S) :-
lower_case(S0, S1),
remaining_terms(S1, S).
What does this mean?
The two remaining_terms rules are Prolog short-hand for writing:
remaining_terms(S0, S) :-
( lower_case(S0, S1)
;
upper_case(S0, S1)
;
under_score(S0, S1)
;
digit(S0, S1) ),
remaining_terms(S1, S).
remaining_terms(S, S).
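Because the translated rules are ordinary clauses, they can be written and called by hand, with the input list and the left-over list passed explicitly. The sketch below assumes a cut-down remaining_terms that only accepts lowercase letters, so that it is self-contained.

```prolog
% Hand-written clauses with the two list arguments made explicit
% (no --> notation is used anywhere in this block).
lower_case([Letter|S], S) :-
    Letter @>= 'a',
    Letter @=< 'z'.

% Simplifying assumption: only lowercase letters may follow.
remaining_terms(S0, S) :-
    lower_case(S0, S1),
    remaining_terms(S1, S).
remaining_terms(S, S).

term(atom, S0, S) :-
    lower_case(S0, S1),
    remaining_terms(S1, S).
```

The query ?- term(Type, [a, b, c], []). succeeds with Type = atom. Leaving the last argument unbound shows what was not consumed: the first solution of ?- term(atom, [a, b, '1'], Rest). is Rest = ['1'].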
What is Prolog doing?
Meta-interpreting
This means writing code in one form and compiling it (automatically) into another, runnable, form.
Meta-interpreting is usually used to allow domain experts to write knowledge in a user-friendly way but compile it into machine-friendly code.
This is similar to how we transformed formulas in logic.
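The translation can be inspected directly. This sketch assumes a Prolog system that provides expand_term/2 and portray_clause/1 (both are available in SWI-Prolog, among others); show_translation/1 is a hypothetical helper name.

```prolog
% show_translation(+Rule): print the clause that a grammar rule
% is compiled into. Rule must be wrapped in parentheses when called.
show_translation(Rule) :-
    expand_term(Rule, Clause),
    portray_clause(Clause).
```

For example, ?- show_translation((term(atom) --> lower_case, remaining_terms)). prints a clause for term/3 whose body calls lower_case/2 and remaining_terms/2, with variable names chosen by the system.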
Prolog’s in-built grammar rule notation
Definite Clause Grammar (DCG) is an in-built notation that looks like a CFG. DCGs can be executed as Prolog programs. This means that DCGs run exactly like Prolog: top-down and depth-first. (DCGs can also be used as a rule base to be used by another Prolog program – e.g. a chart parser.)
A first anatomy of DCGs - 1
A rule is written:
left_hand --> right_side1, right_side2, dict_entry.
We can write words directly into rules as follows:
left_hand -->
right_side1,
[noddy],
right_side2.
A first anatomy of DCGs - 2
Dictionary entries are written as:
dict_entry --> [the].
dict_entry --> [river,avon].
What a DCG is compiled into - 1
Our rules become:
left_hand(S0, S) :-
right_side1(S0, S1),
right_side2(S1, S2),
dict_entry(S2, S).
left_hand(S0, S) :-
right_side1(S0, S1),
'C'(S1, noddy, S2),
right_side2(S2, S).
What a DCG is compiled into - 2
Dictionary entries become:
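dict_entry(S0, S) :-
'C'(S0, the, S).
and there is an in-built fact (in older Prolog systems; newer systems such as SWI-Prolog translate words into direct list unification instead):
'C'([Token|S], Token, S).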
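The effect of this fact can be tried out with a user-defined copy. In this sketch connects/3 is a hypothetical stand-in for the in-built predicate, so that the code loads whether or not the system defines 'C'/3 itself; the two dict_entry clauses are hand translations of the dictionary entries for [the] and [river,avon].

```prolog
% connects(S0, Token, S): Token is the first item of S0 and S is
% the rest. Hypothetical stand-in for the in-built 'C'/3.
connects([Token|S], Token, S).

% dict_entry --> [the].
dict_entry(S0, S) :-
    connects(S0, the, S).

% dict_entry --> [river, avon].
dict_entry(S0, S) :-
    connects(S0, river, S1),
    connects(S1, avon, S).
```

The query ?- dict_entry([river, avon, flows], Rest). gives Rest = [flows]: a two-word entry simply consumes two tokens.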
Using DCG in a program checker
One of the strengths of declarative languages such as Prolog and Haskell is the ease with which programs can be written to manipulate other programs – or themselves.
This program checks that there are clauses for every subgoal in a program.
The general idea
Given a clause such as:
read_text(Current_Word) :-
look_up(Current_Word),
read(Next_Word),
read_text(Next_Word).
check there is a rule or fact for each subgoal (unless the subgoal is built-in, like read/1).
Design
At the highest level:
1. Open a file, read in a program to a list and close the file;
2. Parse each clause, listing goals (heads) and subgoals (from the bodies of rules);
3. Check that each subgoal has a definition and report to the user.
Open a file, read in a program to a list and close the file
The code for this is on the WWW and described in the notes.
It is fairly straightforward for someone who knows how to open, read and close files in another language.
The important point is that the output is a list of clauses:
[skills(fred,jones,'C++'),
(happy_student(_6016):-
module_reg(_6016,prolog))]
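For readers without access to the course code, the file-reading step might look roughly like this. It is a sketch only, not the code from the notes: the predicate names read_program/2 and read_terms/2 are assumptions, and it relies on read/2 returning the atom end_of_file at the end of the stream.

```prolog
% read_program(+File, -Clauses): read every term in File into a list.
read_program(File, Clauses) :-
    open(File, read, Stream),
    read_terms(Stream, Clauses),
    close(Stream).

% read_terms(+Stream, -Clauses): collect terms until end of file.
read_terms(Stream, Clauses) :-
    read(Stream, Term),
    (   Term == end_of_file
    ->  Clauses = []
    ;   Clauses = [Term|Rest],
        read_terms(Stream, Rest)
    ).
```

Each clause in the file comes back as one Prolog term, so rules appear in the list as terms of the form (Head :- Body).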
Parse each clause, listing goals (heads) and subgoals (from the bodies of rules)
This is easy using DCG. A fact adds a goal and leaves the subgoal list unchanged:
clause(Goals0, Goals,
Sub_Goals, Sub_Goals) -->
[Fact],
{
% check this isn’t a rule
Fact \= (_ :- _),
% extract the fact as a goal
add_goal(Fact, Goals0, Goals)
}.
Parse each clause, listing goals (heads) and subgoals (from the bodies of rules) (continued)
A rule adds its head as a goal and the goals in its body as subgoals:
clause(Goals0, Goals,
Sub_Goals0, Sub_Goals) -->
[(Head :- Body)],
{
% extract the head as a goal
add_goal(Head, Goals0, Goals),
% extract the subgoals
body(Sub_Goals0, Sub_Goals, Body)
}.
Processing bodies
The body of a Prolog rule is a conjunction of goals, i.e. a term of the form (Goal, Rest):
body(Sub_Goals0, Sub_Goals,
(Body, Bodies)) :-
add_goal(Body, Sub_Goals0,
Sub_Goals1),
body(Sub_Goals1, Sub_Goals, Bodies).
body(Sub_Goals0, Sub_Goals, Body) :-
Body \= (_,_),
add_goal(Body, Sub_Goals0,
Sub_Goals).
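The slides do not show add_goal/3. One plausible definition, given here as an assumption, records each goal as Name/Arity and skips entries that are already in the list; body/3 is repeated so that the sketch runs on its own.

```prolog
% add_goal(+Term, +Goals0, -Goals): record Term as Name/Arity unless
% it is already listed. Assumed definition, not taken from the slides.
add_goal(Term, Goals0, Goals) :-
    functor(Term, Name, Arity),
    (   memberchk(Name/Arity, Goals0)
    ->  Goals = Goals0
    ;   Goals = [Name/Arity|Goals0]
    ).

% body/3 as on the slide: walk down the conjunction.
body(Sub_Goals0, Sub_Goals, (Body, Bodies)) :-
    add_goal(Body, Sub_Goals0, Sub_Goals1),
    body(Sub_Goals1, Sub_Goals, Bodies).
body(Sub_Goals0, Sub_Goals, Body) :-
    Body \= (_, _),
    add_goal(Body, Sub_Goals0, Sub_Goals).
```

With this definition, ?- body([], SGs, (look_up(W), read(N), read_text(N))). gives SGs = [read_text/1, read/1, look_up/1]; the newest entry is at the front because add_goal/3 conses onto the list.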
Parsing clauses
This follows a very common pattern in parsing with DCGs:
clauses(Goals0, Goals,
Sub_Goals0, Sub_Goals) -->
clause(Goals0, Goals1,
Sub_Goals0, Sub_Goals1),
clauses(Goals1, Goals,
Sub_Goals1, Sub_Goals).
clauses(Goals, Goals,
Sub_Goals, Sub_Goals) --> [].
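Putting the pieces together gives a parser that can be run with phrase/2 over a list of clauses. This is a sketch under one stated assumption: add_goal/3 is not shown on the slides, so the version here (storing Name/Arity) is a guess at its behaviour.

```prolog
clauses(Goals0, Goals, Sub_Goals0, Sub_Goals) -->
    clause(Goals0, Goals1, Sub_Goals0, Sub_Goals1),
    clauses(Goals1, Goals, Sub_Goals1, Sub_Goals).
clauses(Goals, Goals, Sub_Goals, Sub_Goals) --> [].

% A fact: one new goal, subgoals unchanged.
clause(Goals0, Goals, Sub_Goals, Sub_Goals) -->
    [Fact],
    { Fact \= (_ :- _),
      add_goal(Fact, Goals0, Goals) }.
% A rule: the head is a goal, the body supplies subgoals.
clause(Goals0, Goals, Sub_Goals0, Sub_Goals) -->
    [(Head :- Body)],
    { add_goal(Head, Goals0, Goals),
      body(Sub_Goals0, Sub_Goals, Body) }.

body(Sub_Goals0, Sub_Goals, (Body, Bodies)) :-
    add_goal(Body, Sub_Goals0, Sub_Goals1),
    body(Sub_Goals1, Sub_Goals, Bodies).
body(Sub_Goals0, Sub_Goals, Body) :-
    Body \= (_, _),
    add_goal(Body, Sub_Goals0, Sub_Goals).

% Assumed helper: record Name/Arity once.
add_goal(Term, Goals0, Goals) :-
    functor(Term, Name, Arity),
    (   memberchk(Name/Arity, Goals0)
    ->  Goals = Goals0
    ;   Goals = [Name/Arity|Goals0]
    ).
```

For the two-clause program shown earlier, ?- phrase(clauses([], Goals, [], Sub_Goals), [skills(fred,jones,'C++'), (happy_student(X) :- module_reg(X,prolog))]). gives Goals = [happy_student/1, skills/3] and Sub_Goals = [module_reg/2]. Note that the "tokens" being parsed here are whole clauses, not words.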
Check that each subgoal has a definition and report to the user
For each subgoal, check that there is a corresponding goal in the Goal list.
This is very similar to checking history lists. The main checking code is:
% subgoal is not in the goal list
not_member(Goals, Sub_Goal/Arity),
% check subgoal is not a built-in
functor(Predicate, Sub_Goal, Arity),
\+ predicate_property(Predicate, built_in)
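The fragment above can be wrapped in a predicate that enumerates the missing definitions. In this sketch the wrapper undefined_subgoal/3 and the definition of not_member/2 are assumptions, and predicate_property/2 is taken to behave as it does in SWI-Prolog.

```prolog
% not_member(+List, +Item): Item does not occur in List (assumed helper).
not_member(List, Item) :-
    \+ memberchk(Item, List).

% undefined_subgoal(+Goals, +Sub_Goals, -Missing): Missing is a
% Name/Arity from Sub_Goals with no definition in Goals and which
% is not a built-in predicate. Hypothetical wrapper.
undefined_subgoal(Goals, Sub_Goals, Sub_Goal/Arity) :-
    member(Sub_Goal/Arity, Sub_Goals),
    % subgoal is not in the goal list
    not_member(Goals, Sub_Goal/Arity),
    % check subgoal is not a built-in
    functor(Predicate, Sub_Goal, Arity),
    \+ predicate_property(Predicate, built_in).
```

For the read_text example, ?- findall(M, undefined_subgoal([read_text/1], [look_up/1, read/1, read_text/1], M), Ms). reports only look_up/1: read_text/1 is defined and read/1 is built in.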
The basic idea of Context-Free Grammar - 1
A CFG has several parts:
[Diagram: a grammar is made up of grammar rules; each rule has a left-hand side, containing the left-hand symbol, and a right-hand side.]
The basic idea of Context-Free Grammar - 2
[Diagram: the right-hand side of a rule is made up of right-hand symbols; the dictionary is made up of dictionary entries.]
Context-Free Grammar (CFG) - 1
Context-free grammar is a formalism for writing rules that describe things that are structured.
This is a grammar for a sentence:
S → NP VP
NP → determiner noun
VP → verb PP
PP → preposition NP
We will use abbreviations in our grammar: prep and det.
Context-Free Grammar (CFG) - 2
and this is the lexicon:
determiner → the
noun → cat
noun → mat
preposition → on
verb → sat
Context-Free Grammar (CFG) - 3
Applying these rules we get:
S
  NP
    det: the
    noun: cat
  VP
    verb: sat
    PP
      prep: on
      NP
        det: the
        noun: mat
A Definite Clause Grammar (DCG) - 1
DCG allows us to write CFGs in Prolog that look almost exactly like CFGs:
s --> np, vp.
np --> det, noun.
vp --> verb, pp.
pp --> prep, np.
det --> [the].
noun --> [cat].
noun --> [mat].
prep --> [on].
verb --> [sat].
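Loaded as it stands, this grammar can both recognise and generate sentences. The sketch below repeats the rules so that it is self-contained; sentences/1 is a hypothetical helper added for illustration.

```prolog
s  --> np, vp.
np --> det, noun.
vp --> verb, pp.
pp --> prep, np.

det  --> [the].
noun --> [cat].
noun --> [mat].
prep --> [on].
verb --> [sat].

% sentences(-Sentences): every word list the grammar accepts.
% Hypothetical helper; it terminates because no rule is recursive.
sentences(Sentences) :-
    findall(S, phrase(s, S), Sentences).
```

?- phrase(s, [the, cat, sat, on, the, mat]). succeeds, and ?- sentences(Ss). returns four sentences, since each of the two noun phrases can be "the cat" or "the mat". The grammar has no notion of meaning, so "the mat sat on the cat" is accepted too.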
Definite Clause Grammar (DCG) - 2
We can add extra arguments to DCGs - e.g. to make a [syntax] phrase structure tree:
s(s(NP, VP)) --> np(NP), vp(VP).
np(np(Det, Noun)) --> det(Det),
noun(Noun).
etc.
det(det(the)) --> [the].
noun(noun(cat)) --> [cat].
etc.
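Filling in the rules elided by "etc." gives a complete tree-building grammar. The vp and pp rules and the remaining lexical entries below are assumptions that follow the same pattern as the rules shown.

```prolog
s(s(NP, VP))        --> np(NP), vp(VP).
np(np(Det, Noun))   --> det(Det), noun(Noun).
vp(vp(Verb, PP))    --> verb(Verb), pp(PP).     % assumed
pp(pp(Prep, NP))    --> prep(Prep), np(NP).     % assumed

det(det(the))       --> [the].
noun(noun(cat))     --> [cat].
noun(noun(mat))     --> [mat].                  % assumed
prep(prep(on))      --> [on].                   % assumed
verb(verb(sat))     --> [sat].                  % assumed
```

The extra argument is built up as the parse proceeds: ?- phrase(s(Tree), [the, cat, sat, on, the, mat]). gives Tree = s(np(det(the), noun(cat)), vp(verb(sat), pp(prep(on), np(det(the), noun(mat))))).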
Problems with Prolog’s depth-first search
As with all Prolog programs, left-recursive rules will give problems: the first rule below calls np again before consuming any input, so the search never terminates.
% left recursive
np(np(NP1, NP2)) -->
np(NP1),
noun(NP2).
np(np(Det)) -->
det(Det).
det(det(the)) --> [the].
noun(noun(car)) --> [car].
Working around left-recursive rules - 1
Method 1 - remove the left-recursive rule by renaming the inner non-terminal:
np(np(NP1, NP2)) -->
np1(NP1),
noun(NP2).
np1(np(Det)) --> det(Det).
det(det(the)) --> [the].
noun(noun(car)) --> [car].
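Renaming stops the loop but also changes what is accepted: only a determiner followed by exactly one noun now parses. A different, standard rewrite, shown here as a sketch and not taken from the slides, consumes the determiner first and then collects nouns with an accumulator. The non-terminal nouns//2 is a hypothetical name.

```prolog
% np: a determiner followed by zero or more nouns, building the
% same left-branching trees as the left-recursive grammar would.
np(Tree) -->
    det(Det),
    nouns(np(Det), Tree).

% nouns(+Acc, -Tree): wrap each further noun around the tree so far.
nouns(Acc, Tree) -->
    noun(Noun),
    nouns(np(Acc, Noun), Tree).
nouns(Tree, Tree) -->
    [].

det(det(the))   --> [the].
noun(noun(car)) --> [car].
```

Every rule consumes a word before recursing, so the search terminates. ?- phrase(np(T), [the, car]). gives T = np(np(det(the)), noun(car)), and ?- phrase(np(T), [the]). gives T = np(det(the)).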
Working around left-recursive rules - 2
Method 2
– Keep a history list of the points reached in the parse (the non-terminal and the input position).
– Examine the list to ensure that you’re not repeating a point.
np(np(NP1,NP2),History0,History,S0,S) :-
\+ memb(entry(np, S0), History0),
np(NP1, [entry(np, S0)|History0],
History1, S0, S1),
noun(NP2, [entry(noun,S1)|History1],
History, S1, S).
Working around left-recursive rules - 3
Method 2 (continued)
np(np(Det), History0, History) -->
det(Det, History0, History).
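To run Method 2, the lexical rules need the two history arguments as well, and memb/2 must be defined. Those parts are not on the slides, so the versions in this sketch are assumptions.

```prolog
np(np(NP1, NP2), History0, History, S0, S) :-
    % refuse to re-enter np at an input position already tried
    \+ memb(entry(np, S0), History0),
    np(NP1, [entry(np, S0)|History0], History1, S0, S1),
    noun(NP2, [entry(noun, S1)|History1], History, S1, S).
np(np(Det), History0, History) -->
    det(Det, History0, History).

% Assumed lexical rules: they pass the history through unchanged.
det(det(the), History, History)   --> [the].
noun(noun(car), History, History) --> [car].

% Assumed definition of list membership.
memb(X, [X|_]).
memb(X, [_|Tail]) :-
    memb(X, Tail).
```

The query ?- np(T, [], _, [the, car], []). now terminates with T = np(np(det(the)), noun(car)). The recursive call finds entry(np, [the, car]) in the history, so the left-recursive clause is skipped and the det rule is used instead.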
Summary
DCGs are a powerful and convenient implementation of CFG in Prolog.
They allow rules which specify
– structure building
– arbitrary embedded Prolog code
Because DCGs use Prolog’s depth-first search, they have problems with left-recursive rules – but these problems can be worked around.
DCGs are more than a language parsing tool – they have uses in a wide variety of programs.