Natural Language Understanding Understanding NL (infinite language) means determining the meaning of...


Natural Language Understanding

• Understanding NL (an infinite language) means determining the meaning of a sentence with respect to the context in which it is used.
• It requires an analysis of the sentence on several different levels:

• Syntactic
• Semantic
• Pragmatic
• Discourse

Syntactic: the syntax (grammar) of the sentence is checked. Syntax is a tool for describing the structure of sentences in the language.
Semantics: denotes the 'literal' meaning we ascribe to a sentence.


Pragmatics: refers to the intended meaning of a sentence, that is, how sentences are used in different contexts and how context affects the interpretation of the sentence.
Discourse: refers to conversation between two or more individuals.

Basic Parsing Techniques

Context-free grammars

S → NP, VP
VP → verb, NP
NP → det, NP
NP → det, noun, NP
NP → det, adj*, NP

• One can have top-down or bottom-up parsers.


Simple Transition Networks

• More convenient for visualizing the grammar (CFG).
• It consists of nodes and labeled arcs.
• Network for NP:

NP --det--> NP1 --noun--> NP2 --pop-->
NP1 --adj--> NP1 (loop)

• Arcs are labeled by word category.
• Starting at a given node, we can traverse an arc if the current word in the sentence is in the category on the arc.


• This network recognizes the same set of sentences as the following CFG.

NP → det, NP1
NP1 → adj, NP1
NP1 → noun
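The NP network can be written down directly as a table of arcs. The following is a minimal sketch in standard Prolog, assuming the sentence is given as a list of word atoms. The predicate names arc/3, final/1, category/2 and accepts/2, and the small vocabulary, are illustrative and do not come from the slides.

```prolog
% Arcs of the NP network: arc(FromNode, WordCategory, ToNode).
arc(np,  det,  np1).
arc(np1, adj,  np1).   % adjective loop on NP1
arc(np1, noun, np2).

% np2 is the node that carries the pop arc.
final(np2).

% Illustrative vocabulary (an assumption, not taken from the slides).
category(the,   det).
category(a,     det).
category(big,   adj).
category(red,   adj).
category(apple, noun).
category(man,   noun).

% accepts(+Node, +Words): Words can be consumed from Node up to a final node.
accepts(Node, []) :-
    final(Node).
accepts(Node, [Word|Rest]) :-
    category(Word, Cat),      % category of the current word
    arc(Node, Cat, Next),     % an arc with that category leaves Node
    accepts(Next, Rest).
```

With these facts, the query `?- accepts(np, [the, big, red, apple]).` succeeds, while `?- accepts(np, [the, big]).` fails because NP1 has no pop arc.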

• The simple transition network formalism is not powerful enough to describe all languages that can be described by a CFG.
• A recursive grammar cannot be defined by a simple transition network.

Recursive Transition Networks (RTN)

• To get the descriptive power of a CFG, the network grammar needs a notion of recursion.
• An RTN is like a simple transition network, except that it allows arc labels that refer to other networks rather than only to word categories.


• The network for simple English sentences can be expressed as

S --NP--> S1 --verb--> S2 --NP--> S3 --pop-->

• Uppercase labels refer to networks.
• The arc from S to S1 can be followed only if the NP network can be successfully traversed to a pop arc.
• A network in an RTN may have an arc labeled with its own name.
• Any language generated by a CFG can be generated by an RTN and vice versa. Thus they are equivalent in their generative capacity.
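The push-and-pop behaviour described above can be sketched as a small interpreter. This is a minimal sketch in standard Prolog under stated assumptions: the sentence is a list of word atoms, arcs are stored as arc/4 facts, and the names arc/4, pop/2, traverse/4 and recognise/1 as well as the vocabulary are illustrative rather than taken from the slides.

```prolog
% arc(Network, FromNode, Label, ToNode).
% Label is cat(C) for a word category or net(N) for a push to network N.
arc(s,  s,   net(np),   s1).
arc(s,  s1,  cat(verb), s2).
arc(s,  s2,  net(np),   s3).
arc(np, np,  cat(det),  np1).
arc(np, np1, cat(adj),  np1).
arc(np, np1, cat(noun), np2).

% Nodes that carry a pop arc.
pop(s,  s3).
pop(np, np2).

% Small illustrative vocabulary (an assumption).
word_type(an,    det).
word_type(the,   det).
word_type(man,   noun).
word_type(apple, noun).
word_type(eats,  verb).
word_type(red,   adj).

% traverse(+Net, +Node, +Words0, -Words): walk Net from Node to a pop arc,
% consuming a prefix of Words0 and leaving Words.
traverse(Net, Node, Words, Words) :-
    pop(Net, Node).
traverse(Net, Node, [Word|Words1], Words) :-
    arc(Net, Node, cat(Cat), Next),
    word_type(Word, Cat),
    traverse(Net, Next, Words1, Words).
traverse(Net, Node, Words0, Words) :-
    arc(Net, Node, net(Sub), Next),
    traverse(Sub, Sub, Words0, Words1),   % push: start node is named after its network
    traverse(Net, Next, Words1, Words).   % continue after the sub-network pops

% recognise(+Words): the whole word list is accepted by the S network.
recognise(Words) :-
    traverse(s, s, Words, []).
```

The query `?- recognise([the, man, eats, an, apple]).` succeeds. Because the search is depth-first, a network whose first arc pushes to itself would make this sketch loop, so it is only suitable for networks without that kind of left recursion.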


Implementation of RTN in Prolog

The vocabulary for the RTN can be stored as a set of facts:

word_type("an", det).
word_type("the", det).
word_type("man", noun).
word_type("apple", noun).
word_type("eats", verb).

The top level clause for RTN is called run and is as follows:

run :- set_state(s0),
       writeln("Enter your sentence"),
       readln(Sent),
       analyze(Sent),
       writeln("Your sentence is syntactically correct"),
       clear_state.

run :- writeln("Your sentence is wrong syntactically"),
       clear_state.


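/* Record S as the current state (run starts with s0). The old state is removed first, so that current_state/1 always holds exactly one fact. */
set_state(S) :- retractall(current_state(_)), assert(current_state(S)).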

analyze(S) :- S = ".", final_state(_).
analyze(S) :- get_state(NS), !,
              transition(NS, S, S1),
              analyze(S1).

get_state(S) :- current_state(S).

/* transition states are listed as */
transition(s0, A, B) :- check_np(np, A, B),
                        set_state(s1).
transition(s1, A, B) :- get_token(A, W, B),
                        word_type(W, verb),
                        set_state(s2).


transition(s2, A, B) :- check_np(np, A, B),
                        set_state(s3).

transition(np, A, B) :- get_token(A, W, B),
                        word_type(W, det),
                        set_state(np1).

transition(np1, A, B) :- get_token(A, W, B),
                         word_type(W, noun),
                         set_state(np2).
transition(np1, A, B) :- get_token(A, W, B),
                         word_type(W, adj),
                         set_state(np1).

check_np(np2, A, B) :- A = B, !.
check_np(St, A, B) :- transition(St, A, C),
                      get_state(Ns),
                      check_np(Ns, C, B), !.


final_state(s3) :- current_state(s3).

/* get a token from the sentence */ get_token(A, W, B) :- ?

clear_state :- retract(current_state(_)), fail.
clear_state.

Query:
?- run
the man eats an apple.
Yes.
?- run
man eat apple.
Yes. (This is actually wrong, but number agreement has not been taken into account.)
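The slides leave the body of get_token/3 open. One possible definition, given here only as a sketch, assumes that the sentence has already been split into a list of words such as [the, man, eats, an, apple]; the list representation is an assumption and is not stated in the slides.

```prolog
% Assumption: the sentence is a list of words, e.g. [the, man, eats, an, apple].
% get_token(+Sentence, -Word, -Rest): Word is the first word of Sentence
% and Rest is what remains after it.
get_token([Word|Rest], Word, Rest).
```

For example, `?- get_token([the, man, eats], W, Rest).` binds W to the and Rest to [man, eats]. Under this representation the word_type facts would need atoms rather than strings, and the end-of-sentence test in analyze would compare against the empty list instead of ".".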


• These formalisms are limited in the following way.
• They can only accept or reject a sentence, rather than producing an analysis of the structure of the sentence.

• Augmenting the RTN formalism, by generalizing the network notation, introducing more information about words, and collecting information and testing features while parsing, gives the augmented transition network (ATN).
• A similar kind of augmentation and extension can be made to CFGs; the result is called a Definite Clause Grammar (DCG).
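To illustrate how a Definite Clause Grammar goes beyond accept/reject, the sketch below adds one extra argument to each nonterminal so that a parse tree is built while parsing. It is a minimal example in standard DCG notation; the tree shapes s/2, vp/2, np/2 and the lexicon are illustrative choices, not taken from the slides.

```prolog
% Each nonterminal carries an argument holding the structure it has recognised.
sentence(s(NP, VP))    --> noun_phrase(NP), verb_phrase(VP).
verb_phrase(vp(V, NP)) --> verb(V), noun_phrase(NP).
noun_phrase(np(D, N))  --> det(D), noun(N).

% Illustrative lexicon (an assumption).
det(det(the))     --> [the].
det(det(an))      --> [an].
noun(noun(man))   --> [man].
noun(noun(apple)) --> [apple].
verb(verb(eats))  --> [eats].
```

The query `?- phrase(sentence(T), [the, man, eats, an, apple]).` binds T to `s(np(det(the), noun(man)), vp(verb(eats), np(det(an), noun(apple))))`, so the result is a structure rather than a plain yes or no.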

Recording Sentence Structure while Parsing in ATN

• Collect the structure of legal sentences in order to analyze them further.


• For instance, we can identify one particular noun phrase as the syntactic subject (SUBJ) of a sentence and another as the syntactic object of the verb (OBJ).
• Within a noun phrase we might identify the determiner, the adjectives, the head noun and so on.
• Thus the sentence "Jack found a bag" might be represented by the structure

(S SUBJ (NP NAME jack)
   MAIN-V found
   TENSE PAST
   OBJ (NP DET a
           HEAD bag))

• Such a structure is created with an RTN parser by allowing each network to have a set of registers.


• Registers are local to each network. Thus each time a new network is pushed, a new set of empty registers is created.
• When the network is popped, the registers disappear.
• Registers can be set to values, and these values can be retrieved from the registers.
• The NP network has registers with names such as DET, ADJS, HEAD and NUM.
• Registers are set by actions that can be specified on the arcs.
• When an arc is followed, the actions associated with it are executed.
• The most common action is setting a register to a certain value.
• When a pop arc is followed, all the registers set in the current network are automatically collected to form a structure consisting of the network name followed by a list of the registers with their values.
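The register behaviour described above can be sketched with a list of Name-Value pairs that is passed along while a network is traversed. This is a minimal illustration in standard Prolog; the predicate names set_reg/4, get_reg/3, drop_reg/3 and pop_structure/3 are hypothetical and the pair-list representation is an assumption, not the representation used in the slides.

```prolog
% A register set is a list of Name-Value pairs; a freshly pushed network starts with [].

% set_reg(+Name, +Value, +Regs0, -Regs): Regs is Regs0 with Name set to Value,
% replacing any earlier value of that register.
set_reg(Name, Value, Regs0, [Name-Value|Regs1]) :-
    drop_reg(Name, Regs0, Regs1).

% get_reg(+Name, +Regs, -Value): retrieve the value stored in register Name.
get_reg(Name, [Name-Value|_], Value) :- !.
get_reg(Name, [_|Regs], Value) :-
    get_reg(Name, Regs, Value).

% drop_reg(+Name, +Regs0, -Regs): remove the pair for Name, if there is one.
drop_reg(_, [], []).
drop_reg(Name, [Name-_|Regs], Regs) :- !.
drop_reg(Name, [Pair|Regs0], [Pair|Regs]) :-
    drop_reg(Name, Regs0, Regs).

% pop_structure(+Network, +Regs, -Structure): on a pop arc the registers are
% collected into a structure headed by the network name.
pop_structure(Network, Regs, Structure) :-
    Structure =.. [Network, Regs].
```

For example, `?- set_reg(det, the, [], R1), set_reg(head, dogs, R1, R2), pop_structure(np, R2, S).` binds S to `np([head-dogs, det-the])`. Because the pair list is an ordinary argument, it vanishes when the call that created it returns, which mirrors registers disappearing when a network is popped.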


• When a category arc, such as name or verb, is followed, the word in the input is put into a special variable named *.

S1 --name--> S2

• Thus a plausible action on the arc from S1 to S2 would be to set the NAME register to the current word.
• It is written as

NAME ← *


• Push arcs, such as NP, must be treated differently.
• Typically many words will be consumed by the network called through a push arc.
• The network used in the push has a set of registers that capture the structure of the constituent that was parsed.
• The structure built by the pushed network is returned in the variable *.

S --NP--> S1

• Thus the action on the arc from S to S1 might be SUBJ ← *


• Therefore, an RTN with registers, together with tests and actions on those registers, is an Augmented Transition Network (ATN).

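S:   S --NP--> S1 --verb--> S2 --NP--> S3 --pop-->
     (a jump arc is also drawn in the S network)

NP:  NP --det--> NP1 --noun--> NP2 --pop-->
     NP1 --adj--> NP1 (loop)
     NP --name--> NP2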


Arc     Test               Actions
NP/1    none               DET ← *
                           NUM ← NUM*
NP/2    none               NAME ← *
                           NUM ← NUM*
NP1/1   NUM ∩ NUM*         HEAD ← *
                           NUM ← NUM ∩ NUM*
NP1/2   none               ADJS ← Append(ADJS, *)
S/1     none               SUBJ ← *
S1/1    NUMSUBJ ∩ NUM*     MAIN_V ← *
                           NUM ← NUMSUBJ ∩ NUM*
S2/1    none               OBJ ← *

If the intersection in a test is non-empty, the actions are taken; otherwise the arc fails.


Notation:
• NUM* is the NUM register of the structure in *.
• NUMSUBJ is the NUM register of the structure in SUBJ.

• The values of the registers are often viewed as sets, and the intersection (∩) and union (∪) of sets are allowed to combine the values of different registers.
• For registers that may take a list of values, an append function is permitted.
• Append(ADJS, *) returns the list in the register ADJS with the value of * appended at the end.
• The sentence, with the word positions indicated, is as follows:

1 The 2 dogs 3 love 4 john 5.


A simple Lexicon:

Word    Representation
dogs    (NOUN ROOT dog NUM {3p})
dog     (NOUN ROOT dog NUM {3s})
the     (DET ROOT the NUM {3s, 3p})
love    (VERB ROOT love NUM {3p})
john    (NAME ROOT john)
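The number-agreement tests on arcs NP1/1 and S1/1 amount to intersecting two NUM sets and failing when the result is empty. The sketch below shows this in standard Prolog using the lexicon entries that carry a NUM value. The predicates lex/4, in_set/2, common/3 and agree/3 and the list encoding of the sets are illustrative assumptions.

```prolog
% Lexicon entries from the table above: lex(Word, Category, Root, NumberSet).
lex(dogs, noun, dog,  ['3p']).
lex(dog,  noun, dog,  ['3s']).
lex(the,  det,  the,  ['3s', '3p']).
lex(love, verb, love, ['3p']).

% in_set(+X, +Set): X occurs in the list Set.
in_set(X, [X|_]) :- !.
in_set(X, [_|Ys]) :-
    in_set(X, Ys).

% common(+Set1, +Set2, -Common): the elements of Set1 that also occur in Set2.
common([], _, []).
common([X|Xs], Ys, [X|Zs]) :-
    in_set(X, Ys), !,
    common(Xs, Ys, Zs).
common([_|Xs], Ys, Zs) :-
    common(Xs, Ys, Zs).

% agree(+Num0, +Word, -Num): the test NUM ∩ NUM*. It succeeds only when the
% intersection is non-empty and returns it as the new value of NUM.
agree(Num0, Word, Num) :-
    lex(Word, _, _, WordNum),
    common(Num0, WordNum, Num),
    Num \== [].
```

For example, `?- lex(the, det, _, N0), agree(N0, dogs, N).` binds N to `['3p']`, whereas `?- agree(['3s'], love, N).` fails, which is how a sentence such as "the dog love john" would be rejected.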


Trace of S Network

Step  Node  Position  Arc followed                    Registers
1     S     1         S/1                             -
2     NP    1         NP/1                            [DET the, NUM {3s, 3p}]
3     NP1   2         NP1/1, check {3s, 3p} ∩ {3p}    [HEAD dogs, NUM {3p}]
4     NP2   3         NP2/1                           return structure {NP {DET the HEAD dogs NUM {3p}}}
5     -     3         S/1 succeeds                    SUBJ {NP {DET the HEAD dogs NUM {3p}}}
      S1    3         S1/1, check {3p} ∩ {3p}         [MAIN_V love, NUM {3p}]
6     S2    4         S2/1                            OBJ ← *
7     NP    4         NP/2                            {NAME john NUM {3p}}
8     NP2   5         NP2/1                           return structure; OBJ {NP {NAME john NUM {3p}}}
9     S3    5         S3/1                            return succeeds:
                                                      {S {SUBJ {NP {DET the HEAD dogs NUM {3p}}}
                                                          MAIN_V love
                                                          NUM {3p}
                                                          OBJ {NP {NAME john NUM {3p}}}}}


Implementation of ATN in Prolog

• Database clauses will be used to store and read the registers.

run :- set_state(s0),                        /* initialize state */
       write("ATN analyses your sent"), nl,
       write("Please type in your sentence"),
       readln(Sent),
       analyse(Sent),
       write("Your sent is syntactically correct"), nl,
       clear_dbase.

run :- write("Your sent is syntactically wrong"),
       clear_dbase.

clear_dbase :- retract(_), fail.
clear_dbase.


analyse(S) :- S = ".", final_state(_).
analyse(S) :- current_state(N_state), !,
              transition(N_state, S, S1), !,
              analyse(S1).

/* main transitions */
transition(s0, A, B) :- get_token(A, W, B),
                        word_type(W, verb),
                        asserta(type_reg("QUEST")),
                        asserta(verb_reg(W)),
                        set_state(s2).

transition(s0, A, B) :- check_np(np, A, B),
                        asserta(type_reg("DECL")),
                        build_phrase(np, STR),
                        assert(subj_reg(STR)),
                        set_state(s1).


/* NP transition */
check_np(np2, C, B) :- B = C, !.
check_np(np3, C, B) :- B = C, !.
check_np(St, A, B) :- transition(St, A, C),
                      get_state(N),
                      check_np(N, C, B).

/* Build Phrases */

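/* Read the NP registers and fill them into the np template one by one. */
build_phrase(np, STR) :- det_reg(DET), adj_reg(ADJ), noun_reg(NOUN),
                         get_template(np, T),
                         fill_template(T, DET, T1),
                         fill_template(T1, ADJ, T2),
                         fill_template(T2, NOUN, STR).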