Natural Language Processing: Syntax


Syntactic structure

John likes Mary

[S [NP [PN John]] [VP [Vt likes] [NP [PN Mary]]]]

Every man likes Mary

[S [NP [Det Every] [Noun man]] [VP [Vt likes] [NP [PN Mary]]]]

Syntactic structure

PN | Det Noun    Vt       Det Noun | PN
John             hates    some films
each woman       has      many friends
Every man        likes    Mary
The farmer       owns     a car

Syntactic structure

Every man likes Mary

[S [NP [Det Every] [Noun man]] [VP [Vt likes] [NP [PN Mary]]]]

Parsing

Parsing is the process of recovering the syntactic structure of a sentence.

Two main strategies:
– Top-down parsing
– Bottom-up parsing

Bottom-up parsing

Following this strategy, the analysis starts at the level of the words and proceeds upwards to the higher levels.
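The bottom-up strategy can be sketched as a small shift-reduce parser. This is only an illustration under assumptions: the rule set (S -> NP VP, NP -> PN | Det Noun, VP -> Vt NP) and the lexicon are read off the example sentences in these slides, the names LEXICON, RULES, reduce_once, parse and bracket are hypothetical, and the greedy reduction used here works for this toy grammar but not for grammars in general.

```python
# Toy lexicon and rules inferred from the example sentences (an assumption).
LEXICON = {
    "john": "PN", "mary": "PN",
    "every": "Det", "each": "Det", "some": "Det",
    "many": "Det", "the": "Det", "a": "Det",
    "man": "Noun", "woman": "Noun", "films": "Noun",
    "friends": "Noun", "farmer": "Noun", "car": "Noun",
    "likes": "Vt", "hates": "Vt", "has": "Vt", "owns": "Vt",
}

RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("PN",)),
    ("NP", ("Det", "Noun")),
    ("VP", ("Vt", "NP")),
]


def reduce_once(stack):
    """Replace the top of the stack by a rule's left-hand side if it matches."""
    for lhs, rhs in RULES:
        n = len(rhs)
        if len(stack) >= n and tuple(node[0] for node in stack[-n:]) == rhs:
            children = stack[-n:]
            del stack[-n:]
            stack.append((lhs, *children))
            return True
    return False


def parse(sentence):
    """Return a tree (label, child, ...) rooted in S, or None if parsing fails."""
    stack = []
    for word in sentence.split():
        category = LEXICON.get(word.lower())
        if category is None:
            return None                      # unknown word
        stack.append((category, word))       # shift: start at the word level
        while reduce_once(stack):            # reduce: build the higher levels
            pass
    if len(stack) == 1 and stack[0][0] == "S":
        return stack[0]
    return None


def bracket(tree):
    """Render a tree in labelled-bracket notation."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return f"[{label} {children[0]}]"
    return f"[{label} " + " ".join(bracket(c) for c in children) + "]"


print(bracket(parse("John likes Mary")))
# [S [NP [PN John]] [VP [Vt likes] [NP [PN Mary]]]]
```

The parser never looks at S until the words have been grouped into NP and VP, which is exactly the upward direction described above. A word order the rules do not cover, such as "likes John Mary", leaves more than one item on the stack and parse returns None.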

The transformational model

One of the major achievements in the field of theoretical linguistics has been the development of the Transformational Model (TM) by Chomsky [Cho65].

The TM is an attempt to represent, through the use of mathematical abstraction, the linguistic competence of an individual.

Linguistic competence is the implicit knowledge that every adult has about his or her own language.

Chomsky [Cho57] defines a language as a finite or infinite set of well-formed strings of symbols taken from a finite vocabulary, called the lexicon. The well-formedness condition is defined by the grammar of the language.

The transformational model

A grammar is a finite specification of the sentences of a language: it may consist of an explicit account of every sentence in the language (if the language is finite) or a set of generative rules capable of producing all and only the grammatical sentences of the language they define.
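The idea of a finite rule set that produces all and only the grammatical sentences can be illustrated with a very small generative grammar. The sketch below is an assumption-laden toy: the rules and the four-word vocabulary are chosen for illustration, the names GRAMMAR, expand, LANGUAGE and is_grammatical are hypothetical, and the grammar is deliberately non-recursive so that its language is finite and can be enumerated.

```python
from itertools import product

# Toy generative rules (an assumption); symbols without an entry are words.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["PN"], ["Det", "Noun"]],
    "VP": [["Vt", "NP"]],
    "PN": [["John"], ["Mary"]],
    "Det": [["every"]],
    "Noun": [["man"]],
    "Vt": [["likes"]],
}


def expand(symbol):
    """Yield every word sequence derivable from symbol.

    Terminates only because the toy grammar has no recursive rules.
    """
    if symbol not in GRAMMAR:
        yield [symbol]
        return
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each right-hand-side symbol.
        for parts in product(*(list(expand(s)) for s in rhs)):
            yield [word for part in parts for word in part]


# The language defined by the grammar: all and only these strings.
LANGUAGE = {" ".join(words) for words in expand("S")}


def is_grammatical(sentence):
    return sentence in LANGUAGE


print(len(LANGUAGE))                       # 9
print(is_grammatical("John likes Mary"))   # True
print(is_grammatical("likes John Mary"))   # False
```

With a recursive rule the same finite grammar would define an infinite language, and membership would have to be decided by parsing rather than by enumeration.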

Chomsky showed that a natural language, like English, cannot be properly represented by a finite-state grammar. He also realized that a context-free grammar did not have the power to define a natural language.

In [Cho65], Chomsky proposed the transformational model as the first representation of linguistic competence.

The transformational model

[Figure: block diagram of the transformational model. Labels: Base Component, Transformational Component, Syntactic Component, Surface Structure, Semantic Component, Phonological Component, Meanings, Sounds.]

The transformational model

At the core of the syntactic component there are two structures: the deep structure and the surface structure.

The deep structure contains all the information pertinent to the semantic interpretation of the sentence.

The surface structure captures all the information relevant to the phonological interpretation of the sentence.

The base component comprises a context-free grammar and a lexicon. This component generates the deep structure of the sentence.

The transformational model

The transformational component consists of a set of rewrite rules, or transformations, that are applied to the deep structure, rearranging its constituents and adding, deleting or replacing elements until the sentence obtains its final form, the surface structure.

This process of transformation is based on the assumption that transformations do not modify meaning.

The transformational model

Since Chomsky published his first paper [Cho57], many linguists have worked on developing transformation rules that give a correct account of English. This work has been a source of disagreement and controversy among linguists.

A set of accepted transformations is listed below:

NUMBER-AGREEMENT

SD:  NP        V         REST
     1         2         3
SC:  1,[num]   2,[num]   3

The transformational model

IMPERATIVE-FORMATION

SD:  you  V  REST
     1    2  3
SC:  0    2  3

YES-NO QUESTION-FORMATION

SD:  NP  AUX  V  REST
     1   2    3  4
SC:  2   1    3  4
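Transformations of this kind, whose structural change (SC) only reorders or deletes the numbered factors of the structural description (SD), can be sketched in a few lines. The function name apply_sc and the example sentences are hypothetical, and the sketch assumes the sentence has already been split into the factors named in the SD.

```python
def apply_sc(factors, structural_change):
    """Rearrange the factors of a sentence according to a structural change.

    factors: list of strings; factor i of the SD is factors[i - 1].
    structural_change: sequence of factor numbers; 0 marks a deleted factor.
    """
    output = []
    for index in structural_change:
        if index == 0:
            continue                      # 0 in the SC: the factor is deleted
        output.append(factors[index - 1])
    return " ".join(part for part in output if part)


# IMPERATIVE-FORMATION: SD = you V REST, SC = 0 2 3
print(apply_sc(["you", "close", "the door"], (0, 2, 3)))
# close the door

# YES-NO QUESTION-FORMATION: SD = NP AUX V REST, SC = 2 1 3 4
print(apply_sc(["John", "will", "read", "the book"], (2, 1, 3, 4)))
# will John read the book
```

Matching a sentence against the SD in the first place would require a parse of the deep structure; that step is left out here.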

The transformational model

THERE-INSERTION

SD:  NP         V  REST
     1          2  3
SC:  [There+2]  1  3

PASSIVE-FORMATION

SD:  NP  V       NP      REST
     1   2       3       4
SC:  3   [BE+2]  [BY+1]  4
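PASSIVE-FORMATION also inserts material: [BE+2] combines a form of "be" with factor 2, and [BY+1] prefixes factor 1 with "by". A rough sketch follows, with several assumptions stated up front: the function name passive and the PAST_PARTICIPLE table are hypothetical, "be" is always realised as "is" (singular, present tense), and the factors are passed in already separated.

```python
# Hypothetical participle table covering only the verbs used in the examples.
PAST_PARTICIPLE = {"likes": "liked", "hates": "hated", "owns": "owned"}


def passive(np1, verb, np2, rest=""):
    """Apply SC: 3 [BE+2] [BY+1] 4 to the factors NP V NP REST."""
    be_verb = f"is {PAST_PARTICIPLE[verb]}"    # [BE+2], assuming singular present
    by_phrase = f"by {np1}"                    # [BY+1]
    parts = [np2, be_verb, by_phrase, rest]    # factor order given by the SC
    return " ".join(part for part in parts if part)


print(passive("John", "likes", "Mary"))
# Mary is liked by John
print(passive("the farmer", "owns", "a car"))
# a car is owned by the farmer
```

A fuller treatment would choose the form of "be" from the number and tense features of the sentence instead of hard-coding "is".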

The transformational model

DO-INSERTION

SD:  Q  NP      V  REST
     1  2       3  4
SC:  0  [DO+2]  3  4

DATIVE-MOVEMENT

SD:  NP  V  NP1  to  NP2  REST
     1   2  3    4   5    6
SC:  1   2  5    0   3    6