A very short introduction to Natural Language Generation
Kees van Deemter
Computing Science
University of Aberdeen
[Figure: Language Technology. Components shown: Natural Language Understanding, Natural Language Generation, Speech Recognition, Speech Synthesis, connecting Text, Meaning and Speech.]
First: NLG from a practical perspective
Goal: Use computers to express information in human-accessible form
Input: some non-linguistic representation of information (e.g., tables in a database, logical formulas, Java code, ...)
Output: documents, reports, explanations, help messages, ... in some human language (Chinese, English, Dutch)
Knowledge sources required: knowledge of language and of the domain; maybe of the intended audience as well
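To make the input/output picture above concrete, here is a minimal sketch of a template-based generator. Everything in it is a hypothetical illustration rather than part of any system discussed in these slides: the `WeatherRecord` type, the rainfall thresholds and the sentence template are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class WeatherRecord:
    """Hypothetical non-linguistic input: one row of a weather table."""
    city: str
    max_temp_c: int
    rain_mm: float


def describe(record: WeatherRecord) -> str:
    """Map one record to an English sentence."""
    # Domain knowledge (assumed thresholds): how rainfall amounts are described.
    if record.rain_mm == 0:
        rain = "no rain"
    elif record.rain_mm < 5:
        rain = "light rain"
    else:
        rain = "heavy rain"
    # Knowledge of language: a fixed English sentence template.
    return (
        f"{record.city} will see {rain}, "
        f"with a high of {record.max_temp_c} degrees Celsius."
    )


print(describe(WeatherRecord("Aberdeen", 12, 3.5)))
# prints: Aberdeen will see light rain, with a high of 12 degrees Celsius.
```

A single template like this covers only one sentence type; the slides that follow concern systems that must make many more choices than this sketch does.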
Example System: FoG
Function: Produces textual weather reports in English and French
Input: Graphical/numerical weather depiction
User: Environment Canada (Canadian Weather Service)
Developer: CoGenTex [Kittredge, Goldberg and Driedger 1994]
Status: Fielded, in operational use since 1992
FoG: Input
FoG: Output
Example System: Dial Your Disc (DYD)
Function: Context-sensitive descriptions of Mozart’s instrumental music
Input: Music database + history of interaction
Target user: Music industry, customers for music-on-demand
Developer: Philips Electronics (Nat Lab – IPO, Eindhoven; 1993-6) [Van Deemter & Odijk 1995]
Status: Methods reused in GOALGETTER and other systems
Example System: Dial Your Disc (DYD)
User composes a home-made CD.
Speech interface tells system what type of music the user would like to add to the CD. E.g., “I’d like some piano music”. “I’m interested in solo performances”.
“piano”, “solo”
System chooses one composition with solo piano.
The music starts. After a while, a text is spoken.
The second time a piano sonata is selected, the following text might be generated:
Example System: Dial Your Disc (DYD)
Example of approximate output, in its most elaborate form:
“The following+ composition+, from which you are going to hear a fragment+ of part three+, was written+ by Mozart in the beginning+ of seventeen+ seventy+ five+, in Munich+. The work is also+ a sonata+ in f+, like the preceding+ composition, but now+ for piano+. The KV+ number of this work is K. two+ eight+ zero+. This sonata+ consists of three+ parts+: allegro assai+, adagio+, and presto+. The presto lasts two+ minutes+ forty+ five+ seconds+. This presto is located on track six+ of first+ CD+ of volume seventeen+. The piano+ is played by Mitsuko Uchida+. The recording+ of the sonata+ was made+ in the Henry Wood+ Hall in London+, England, in the eighties+. The quality+ of its recording is DDD+. The following+ is a fragment+ of the third+ part+.”
[A fragment follows]
Each “+” marks a pitch accent on the preceding word.
When to use NLG?
When there are many potential documents to be written, differing according to the context (user, situation, language)
When there are some general principles behind document design
Why is NLG hard?
NLG involves many choices, e.g. which content to include, what order to say it in, what words to use.
Linguistics does not yet provide us with a ready-made, precise theory about how to make such choices to produce coherent text.
Why does choice matter?
The Serbian Prime Minister, Zoran Djindjic, has been assassinated in the capital, Belgrade.
The pro-reform, pro-Western leader was shot in the stomach and in the back outside government offices at around 1300 (1200 gmt), and died of his wounds in hospital.
(BBC news, UK edition, 12/3/03)
Tasks and Architecture in NLG (Reiter 1994)
Document Planning: Content Determination, Document Structuring
Micro-planning: Aggregation, Lexicalisation, Generation of Referring Expressions
Surface Realisation: Linguistic Realisation, Physical Realisation
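The three stages named above (document planning, micro-planning, surface realisation) are often chained as a pipeline. The following toy sketch shows one way that chaining might look. It is not Reiter's architecture or any fielded system: the function names, the dictionary-based fact format and the decisions taken in each stage are assumptions made purely for illustration.

```python
from typing import Dict, List

Fact = Dict[str, str]  # hypothetical input format: one database row per fact


def document_planning(facts: List[Fact]) -> List[Fact]:
    """Content determination and structuring (toy version)."""
    # Keep only facts that name a composer, then order them by title.
    selected = [f for f in facts if "composer" in f]
    return sorted(selected, key=lambda f: f["title"])


def micro_planning(plan: List[Fact]) -> List[Dict[str, str]]:
    """Lexicalisation and referring expressions (toy version)."""
    specs = []
    for i, fact in enumerate(plan):
        # First mention uses the title; later mentions use a shorter description.
        subject = f"The {fact['title']}" if i == 0 else "The next work"
        specs.append(
            {"subject": subject, "verb": "was written by", "object": fact["composer"]}
        )
    return specs


def surface_realisation(specs: List[Dict[str, str]]) -> str:
    """Turn each sentence specification into a string of words."""
    return " ".join(f"{s['subject']} {s['verb']} {s['object']}." for s in specs)


def generate(facts: List[Fact]) -> str:
    """Run the three stages in sequence."""
    return surface_realisation(micro_planning(document_planning(facts)))


facts = [
    {"title": "sonata in F", "composer": "Mozart"},
    {"title": "untitled fragment"},  # dropped: no composer known
    {"title": "concerto in A", "composer": "Mozart"},
]
print(generate(facts))
# prints: The concerto in A was written by Mozart. The next work was written by Mozart.
```

Each stage here makes exactly one kind of choice (what to include, how to refer, how to word), which is the point of the modular split; real systems make far richer decisions at every stage.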
Second perspective: NLG as a branch of linguistics
NLG systems map ideas to words. Surely, this is linguistic territory!
If linguists cannot say how the different stories about James Sportler differ, then who can?
An NLG program might be seen as a model of language production (in terms of its output; the human production process may be very different)
NLG is the smaller twin brother of NL Understanding
NLG poses deep theoretical problems about language and communication
NLG has great potential for applications
This course: Generation of Referring Expressions