Date posted: 15-Jan-2016

P.1

Center for PersonKommunikation

Do you need speech in your project?

• Speech recognition?

• Speech synthesis?

• English or Danish or … (who are going to be the test persons in your user trials?)

• Multilingual system?

P.2

Available speech software

• javax.speech, JSAPI, JSAPI-compliant speech engines, e.g. IBM's ViaVoice

• MS SAPI & SAPI-compliant speech engines, e.g. MS Whisper

---------------------------------------

• HTK: graphVite (recogniser)

• CPK SLANG (recogniser, open source, network service)

• Danish Speech Synthesis (network service)

P.3

Project proposals 1

• User Interaction Paradigms on Portable Devices – Lars Bo Larsen – Speech recognition required

• A Search Engine for Images on the Web – Lars Bo Larsen – Speech recognition can be considered (“Voice-enabled HTML”)

• 3D Scanner and Face Animator – Henning Nielsen – Speech rec./synth. not relevant (?)

• Outdoor Navigation System for Blind Pedestrians – Ove Andersen – Speech synthesis required, recognition to be considered

• Electronic Reception Desk – Ove Andersen – Speech recognition/synthesis obvious

P.4

Project proposals 2

• A decision support system for assessing critically ill patients – Steve Rees, Steen Andreassen

• Decision support system for advice on antibiotic therapy – Steen Andreassen – Speech not obvious; however, the user interface may include speech

• Multi Modal Mediator – Gael Rosset – Speech recognition/synthesis required; VoiceXML?

• Beyond WAP – Gael Rosset – Speech not obvious

• An Internet-based decision support tool for diabetes patients – Speech not obvious; however, the user interface may include speech

P.5

Natural Language Processing

Tom Brøndsted, CPK

• Symbols on the slides:

– this point may be brought up at the examination!

– this will NOT be brought up at the examination!

• Linguistic terms (noun, verb, noun phrase, verb phrase etc.) are explained in

– http://www.sil.org/LINGUISTICS/glossary

!

P.6

Dialogue System (text)

James Allen: Natural Language Understanding, 1995

P.7

Recognition and parsing

[Diagram: speech → decoding (language model, vocabulary) → text; text → parsing (grammar, lexicon) → semantic representation]

P.8

MM1

• Chomsky: Types of grammars used in NLP

• Young: Types of grammars used in speech recognition

• Winograd: lexical ambiguity, structural ambiguity: What the simple grammar types can be used for: postponed to MM2

P.9

Background for NLP

• Questions brought up by N. Chomsky in the 1950s:

– Can a natural language like English be described (“parsed”, “compiled”) with the same methods as used for formal/artificial (programming) languages in computer science?

– Can we use simple finite state grammars or context-free grammars for the description of English?

– Or does linguistics need to invent its own, more powerful grammar type for the description of natural languages?

• Offshoots: “The Chomsky Hierarchy of Grammars”, “Natural Language Processing”, “Generative Transformational Grammar”

P.10

Chomsky: Grammar Theory 0

• Some key extracts/quotations from “Syntactic Structures”

– A language is an (infinite) set of sentences, each finite in length and constructed out of a finite set of elements.

– A grammar is a device that separates the grammatical sequences from the ungrammatical ones and generates the structures of the grammatical ones.

– A grammar is a reconstruction of the native speaker’s competence, his ability to generate (produce and understand) an infinite number of sentences.

– A grammar is a theory of a language. It must comply with the empiricist axioms: the theory must be adequate and simple.

P.11

Chomsky: Grammar Theory 1

English Native Speaker: Have you a book on modern music? The book seems interesting. …

English Grammar/Language Theory: Sentence parsable! Sentence parsable! …

The grammar must generate (“parse”) ALL sentences acceptable to the native speaker and …

!

P.12

Chomsky: Grammar Theory 2

Random sentence generation:

English Grammar/Language Theory: 1) Colorless green ideas sleep furiously. 2) Have you a book on modern music? …

English Native Speaker: According to my intuition this sentence is OK! OK! …

… the grammar must generate NOTHING BUT sentences acceptable to the native speaker and …

!

P.13

Chomsky: Grammar Theory 3

[Diagram: Grammar A and Grammar B are equivalent grammars; both generate the same set of sentences (Language l). One of the two is marked as the preferable grammar.]

… the grammar must be as SIMPLE (e.g. “small”) as possible

!

P.14

Chomsky: Grammar Theory 4

What’s in the “Black Box”? What type of grammar can “generate” a natural language like English?

– A Finite State Grammar without/with loops?

• (No! “Syntactic Structures” pp. 18 ff.)

– A Phrase Structure Grammar?

• (No! “Syntactic Structures” pp. 26 ff.)

– A Transformational Grammar?

• (Yes/Maybe! According to “Syntactic Structures” pp. 34 ff. BUT “Generative Transformational Grammar” has turned out to be a “blind alley” in computational linguistics)

?

!

!

!

P.15

Chomsky: Hierarchy of Grammars

• Type 3: Regular Grammars

– Equivalent to finite state automata, finite state transition networks, Markov models (probabilistic type).

• Type 2: Context-free Grammars

– E.g. recursive transition networks (RTNs), phrase structure grammars (PSGs), and unification grammars where attributes take values drawn from a finite table.

• Type 1: Context-sensitive Grammars

– Augmented transition networks (ATNs), transformational grammars, some unification grammars

• Type 0: Unrestricted Grammars

!

!

P.16

Finite State Grammar

• Structure:

– Directed Graph/Transition Network structure

– All transitions are terminals

– The terminal symbols are either words or POS (word class) names like Noun, Verb, Pronoun.

– The network structure may involve loops (iterations) and “empty” transitions (jumps, skips)

[Diagram: a transition network showing nodes, a transition, a loop, and a jump]
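The structure described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network `NETWORK`, its node numbers, and the word-class labels are hypothetical and not taken from the slide. Transitions are terminals (word-class names), one node loops on itself, and a transition labelled `None` plays the role of an "empty" transition (jump).

```python
# Hypothetical toy network (not from the slide): node -> [(label, next_node)].
# A label of None is an "empty" transition (jump).
NETWORK = {
    0: [("Pronoun", 1), ("Determiner", 2)],
    2: [("Adjective", 2), ("Noun", 1)],   # loop: any number of adjectives
    1: [("Verb", 3)],
    3: [(None, 4), ("Adverb", 4)],        # jump: the adverb may be skipped
    4: [],
}
START, FINAL = 0, 4


def closure(nodes):
    """Return the given nodes plus every node reachable through jumps only."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for label, nxt in NETWORK[node]:
            if label is None and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def accepts(symbols):
    """True if the sequence of terminal symbols leads from START to FINAL."""
    current = closure({START})
    for sym in symbols:
        step = {nxt for node in current
                for label, nxt in NETWORK[node] if label == sym}
        current = closure(step)
        if not current:
            return False
    return FINAL in current
```

With this toy network, `accepts(["Determiner", "Adjective", "Adjective", "Noun", "Verb"])` follows the loop twice and then takes the jump past the optional adverb.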

!

P.17

Recursive Transition Network Grammar

• Structure:

– A SET of named Directed Graph/Transition Network structureS

– Transitions are terminals or NON-TERMINALS

– Terminal symbols/loops/jumps -> see the FSN slide

– A non-terminal symbol is the name of a network in the set included in the RTN

[Diagram: network X with transitions a, X, b and a jump]

Equivalent BNF/PSG:

X -> a b

X -> a X b

A^nB^n problem, “Syntactic Structures”, p. 30
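The two rules X -> a b and X -> a X b can be turned into a small recursive recogniser. The sketch below is one possible way to do it, not the method used on the slides: the dictionary `RTN` and the functions `match` and `accepts` are names chosen here for illustration. Each named network is stored as a list of alternative paths, and a symbol that names a network is followed by calling the matcher recursively.

```python
# The set of named networks; here only X, with its two alternative paths.
# Any symbol that is not a key of RTN is treated as a terminal.
RTN = {"X": [["a", "b"], ["a", "X", "b"]]}


def match(symbol, tokens, pos):
    """Return all positions reachable after matching `symbol` from `pos`."""
    if symbol not in RTN:  # terminal: consume exactly one matching token
        if pos < len(tokens) and tokens[pos] == symbol:
            return {pos + 1}
        return set()
    ends = set()
    for path in RTN[symbol]:  # non-terminal: try every path of its network
        positions = {pos}
        for sym in path:
            positions = {end for p in positions
                         for end in match(sym, tokens, p)}
            if not positions:
                break
        ends |= positions
    return ends


def accepts(tokens, start="X"):
    """True if the whole token sequence is generated by network `start`."""
    return len(tokens) in match(start, tokens, 0)
```

Under these assumptions `accepts(list("aaabbb"))` holds while `accepts(list("aab"))` does not. The recursion terminates because every path of X consumes a terminal before it refers to X again.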

!

P.18

What’s wrong with FSNs & RTNs according to Chomsky?

• FSNs without loops can only generate a finite set of sentences. English is an infinite set.

• FSNs with loops generate infinite sets of sentences but cannot describe A^nB^n sequences found in constructions with “respectively”.

• RTNs (PSGs/BNFs) generate infinite sets of sentences and can describe A^nB^n sequences, but applied to English they require a huge number of symbols (conflict with simplicity).

!

P.19

Generative Transformational Grammar

P.20

Young et al.: Grammar types in speech recognition

• Level Building [obsolete] (Young p. 8 f.):

– finite state grammar without loops

• Viterbi/Token passing (Young p. 8 f., p. 11 ff.):

– finite state grammar with loops

– context-free grammar, provided that every non-terminal refers to a unique instance of a sub-network (no recursion!)

Conclusion: Decoding algorithms used in modern speech recognition technology can only be applied to the weakest grammar type within the Chomsky Hierarchy.
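The "unique instance, no recursion" condition can be illustrated by substituting a fresh copy of each sub-network wherever it is referenced. The following is a rough sketch, not a description of any actual decoder: the toy grammar `GRAMMAR` and the function `expand` are hypothetical, and the grammar is assumed to contain no loops, so that the expansion is a finite list of terminal sequences. A recursive reference makes the substitution impossible, which the sketch reports as an error.

```python
# Hypothetical command grammar (not from the slides): name -> alternatives.
GRAMMAR = {
    "SENT": [["call", "NAME"], ["dial", "DIGIT", "DIGIT"]],
    "NAME": [["john"], ["mary"]],
    "DIGIT": [["one"], ["two"]],
}


def expand(symbol, active=()):
    """Return every terminal sequence generated by `symbol`.

    Each reference to a sub-network is replaced by its own copy of that
    sub-network. `active` holds the networks currently being expanded, so
    a network that refers back to itself is detected and rejected.
    """
    if symbol not in GRAMMAR:  # terminal symbol
        return [[symbol]]
    if symbol in active:
        raise ValueError(f"recursive reference to {symbol!r}")
    results = []
    for alternative in GRAMMAR[symbol]:
        partial = [[]]
        for sym in alternative:
            partial = [done + rest
                       for done in partial
                       for rest in expand(sym, active + (symbol,))]
        results.extend(partial)
    return results
```

For the toy grammar, `expand("SENT")` yields six flat word sequences, while a rule such as X -> a X b raises the error because the copy of X would have to contain another copy of X without end.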

P.21

Exercise 1

• The following extremely simple grammar generates all (typed) English sentences. What is wrong with it according to the Chomsky theory?

• Describe the concepts “native speaker”, “intuition”, “generate” as used in the Chomsky theory.

[Diagram: a network with a looping transition labelled char]

Lexicon: char = a, b, c, … z, ';', '.', '?', … etc.
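One way to explore the exercise is to try the grammar on a few inputs. The sketch below assumes that the network is a single loop over `char`. Since the slide's lexicon ends in "… etc", the exact character set `LEXICON` used here (lower-case letters, the space, and a handful of punctuation marks) is an assumption made only for illustration, as is the function name `accepts`.

```python
import string

# Assumed lexicon: the slide lists a..z, ';', '.', '?' and then "etc.";
# the space and the remaining punctuation marks are guesses.
LEXICON = set(string.ascii_lowercase) | set(" ;.?,!'")


def accepts(text):
    """True if every character of `text` can be taken by the char loop."""
    return all(ch in LEXICON for ch in text.lower())
```

Comparing the result for a sentence from the earlier slides, such as `accepts("Have you a book on modern music?")`, with the result for an arbitrary string such as `accepts("xq zzk ;;?b")` is a useful starting point for the discussion.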

P.22

Exercise 2

Consider the "correct" (or "grammatical") ANSI C printf sequences in I and compare them with the "false" (or "ungrammatical") ones in II:

I printf("%d %s",integer,string)

printf("%s %d %d",string,integer,integer)

printf("%d",integer)

etc.

II printf("%d %s",integer,string,string)

printf("%d %s",integer)

printf("%d %s",string,integer)

etc.

Is it possible to design a regular (finite state) grammar that generates the correct sequences in I without generating those in II? If not, can a context-free grammar be designed that meets these conditions?
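The condition that separates I from II can be stated as a small program: the n-th conversion in the format string must agree with the n-th argument. The sketch below only makes that condition explicit and deliberately leaves the grammar question open. The names `CONVERSIONS` and `is_correct` are hypothetical, only `%d` and `%s` are handled, and arguments are given as the type names used in the exercise.

```python
import re

# Which argument type each conversion specification expects.
CONVERSIONS = {"%d": "integer", "%s": "string"}


def is_correct(format_string, arguments):
    """True if the arguments match the conversions in number, type, and order."""
    expected = [CONVERSIONS[spec]
                for spec in re.findall(r"%[ds]", format_string)]
    return expected == list(arguments)
```

For example, `is_correct("%d %s", ["integer", "string"])` holds, whereas `is_correct("%d %s", ["string", "integer"])` does not. Note that the check compares two lists directly, so it says nothing about which grammar type could express the same dependency.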