Determining query types
1
Determining query types
by analysing intonation
2
Overview
Using prosodic features of utterances
Generating set of prosodic labels with which test utterances are annotated
Trying to determine which class the utterances belong to
– action, problem, connect, who, info, other
3
Contents
Motivation
Corpus
Prosody
System architecture
– pitch extraction
– segmentation
– prosodic labelling
– label sequences (n-grams)
Results
Conclusions
4
Motivation
Linguists (Crystal, Searle) found a relationship between
– prosody and utterance type (question, command…)
– prosody and attitude
Edinburgh Map Task group (Taylor, Wright) found prosody helps distinguish utterance types
5
British Telecom corpus
Callers dial 100, requesting
– alarm calls, collect calls
– codes, numbers
– connection problems
– …
8000 calls: first utterance only
Annotation
– call types: by BT
– prosody: by me
6
Call types in BT corpus
Primary move type: Question (ask from top to bottom; stop when you can answer 'yes')
Prob: Is there only a description of a problem or situation?
Who: Is it a request about who to contact (e.g. which BT contact point or number to call)?
Info: Is it a request for information or advice (e.g. about BT services, number or account information, the state of the network, general knowledge, or time)?
Connect: Is it a request to be connected to another agent, service, person or organisation?
Action: Is it a request for operator action (e.g. named service; change to BT records or customer service options; initiation of a BT process such as line test; report a fault)?
Other: Everything else
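The top-to-bottom procedure in the call-type table amounts to a first-match decision list. The sketch below is only an illustration of that ordering: the `answers` dictionary (an annotator's yes/no judgement per call type) and the function name are hypothetical, not part of the BT annotation scheme.

```python
# Call types in the order the questions are asked; order matters.
CALL_TYPE_ORDER = ["Prob", "Who", "Info", "Connect", "Action"]


def primary_move_type(answers):
    """Return the first call type whose question was answered 'yes'.

    `answers` is a hypothetical dict mapping a call type to True/False;
    a missing key counts as 'no'.
    """
    for call_type in CALL_TYPE_ORDER:
        if answers.get(call_type, False):
            return call_type
    return "Other"  # everything else
```

Because the first 'yes' wins, an utterance that could count as both Info and Action is labelled Info.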
7
Prosody
má mà (lexical tone)
yés yès (word-level intonation)
Now is the time for | all good men to |
come to the | aid of the | party
8
Simplified architecture
pitch extractor / octave error correction
segmenter
clustering
centroid LM
utterance classifier
9
Pitch extraction
“Yes, Manchester please”
10
Octave error correction
11
Simplified architecture
pitch extractor / octave error correction
segmenter
clustering
centroid LM
utterance classifier
12
Data points for one segment showing line of best fit
[Figure: pitch (Hz) against sample number]
duration penalty prevents very short segments
also minimum and maximum segment lengths
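The slides do not say how segment boundaries are chosen, so the following is only a sketch under stated assumptions: each segment is scored by the squared error of its least-squares line, the duration penalty is read as a fixed cost added per segment (so many short segments become expensive), and the best split is found by dynamic programming. The function names and this reading of the penalty are assumptions, not the author's implementation.

```python
def line_fit_error(ys):
    """Sum of squared residuals of the least-squares line through ys (x = 0, 1, 2, ...)."""
    n = len(ys)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in enumerate(ys))


def segment(pitch, penalty, min_len, max_len):
    """Split pitch into contiguous segments; return (start, end) pairs, end exclusive."""
    n = len(pitch)
    if n == 0:
        return []
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: lowest cost of segmenting pitch[:i]
    back = [None] * (n + 1)  # back[i]: start of the last segment in that solution
    best[0] = 0.0
    for end in range(1, n + 1):
        # Only starts that respect the minimum and maximum segment lengths.
        for start in range(max(0, end - max_len), end - min_len + 1):
            if best[start] == inf:
                continue
            cost = best[start] + line_fit_error(pitch[start:end]) + penalty
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    if back[n] is None:
        raise ValueError("no segmentation satisfies the length limits")
    bounds = []
    end = n
    while end > 0:
        bounds.append((back[end], end))
        end = back[end]
    return bounds[::-1]
```

With a small penalty a rise followed by a separate fall is split into two segments; with a very large penalty the same contour is kept as one segment.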
13
Varying the duration penalty
14
Minimum segment length
Yeah, could I book a wake-up call please
15
Simplified architecture
pitch extractor / octave error correction
segmenter
clustering
centroid LM
utterance classifier
16
Assigning labels
Each segment in training corpus has features
– duration
– gradient
– mid-point frequency
Clustering algorithm (K-means) places segments in feature space
Prosodic labels assigned to segments, based on cluster membership
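As a rough illustration of this labelling step, here is a minimal pure-Python K-means over (duration, gradient, mid-point frequency) tuples, where the index of the nearest centroid serves as the prosodic label. The feature values, the unscaled Euclidean distance, the random initialisation and the function names are all assumptions made for the sketch; the slides do not give these details.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the closest centroid; this index is used as the prosodic label."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=50, seed=0):
    """Return k centroids for a list of equal-length feature tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen segments
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[i])  # leave an empty cluster where it was
        if updated == centroids:
            break  # assignments have stopped changing
        centroids = updated
    return centroids


# Hypothetical segments: (duration in samples, gradient, mid-point frequency in Hz)
segments = [(10, 2.0, 120.0), (12, 2.2, 125.0), (11, 1.8, 118.0),
            (40, -1.0, 200.0), (42, -1.2, 205.0), (41, -0.8, 198.0)]
centroids = kmeans(segments, k=2)
labels = [nearest(s, centroids) for s in segments]
```

A segment from a test utterance would be labelled the same way, by calling `nearest` with the centroids learned from the training corpus.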
17
18
2-D data points arranged in 15 clusters
19
Label trajectories
[Figure: pitch (Hz) against time (samples)]
Settings: pitch correction off; duration penalty 0; maximum segment length 50; minimum segment length 20; number of centroids 20; normalization off
on to clustering now: discretization
20
More trajectory schemes
[Figure panels, pitch (Hz) against time (samples): no maximum; with normalization; 10 clusters; 40 clusters]
21
70 clusters
[Figure: pitch (Hz) against time (samples)]
22
Simplified architecture
pitch extractor / octave error correction
segmenter
clustering
centroid LM
utterance classifier
23
Label sequences
N-gram collocation model used
– 台中 (Taichung) vs 台 and 中
– label sequence e.g. [4;11;13;1] statistically more useful than individual labels
Association of label sequences with each class in training data computed
Then estimate test data classes using maximum likelihood model
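One way to picture this step is the sketch below: count label n-grams per class in the training data, then pick the class under which the n-grams of a test utterance are most probable. The additive smoothing, the `min_count` cut-off for rare sequences, the tie-breaking and all names here are assumptions added for illustration; the slides only state that a maximum likelihood model over label sequences was used.

```python
import math
from collections import Counter, defaultdict


def ngrams(labels, n):
    """All contiguous label sequences of length n."""
    return [tuple(labels[i:i + n]) for i in range(len(labels) - n + 1)]


def train(utterances, n, min_count=1):
    """utterances: list of (label_sequence, call_type) pairs.

    Returns (per-class n-gram counts, set of n-grams seen at least min_count times).
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for labels, call_type in utterances:
        for gram in ngrams(labels, n):
            counts[call_type][gram] += 1
            totals[gram] += 1
    kept = {g for g, c in totals.items() if c >= min_count}
    per_class = {cls: Counter({g: c for g, c in grams.items() if g in kept})
                 for cls, grams in counts.items()}
    return per_class, kept


def classify(model, labels, n, smoothing=1.0):
    """Return the class with the highest summed log-probability of the kept n-grams."""
    counts, vocab = model
    best_class, best_score = None, -math.inf
    for call_type in sorted(counts):  # sorted so that ties break deterministically
        total = sum(counts[call_type].values())
        score = 0.0
        for gram in ngrams(labels, n):
            if gram in vocab:  # unseen or rare sequences are ignored
                p = (counts[call_type][gram] + smoothing) / (total + smoothing * len(vocab))
                score += math.log(p)
        if score > best_score:
            best_class, best_score = call_type, score
    return best_class


# Hypothetical training data: prosodic label sequences with their call types.
training = [([4, 11, 13, 1], "connect"), ([4, 11, 13, 2], "connect"),
            ([7, 7, 3, 9], "info"), ([7, 3, 9, 9], "info")]
model = train(training, n=2)
```

Here `classify(model, [4, 11, 13], n=2)` favours "connect", because both of its bigrams were seen only in that class.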
24
Results
Correct classification around 1/3
– correct classification by chance around 1/4
But changing parameters does affect results
Some optimum parameters
– 20 clusters (prosodic labels)
– only label sequences seen 4 times used
– sequences of 4 labels best, performance degrades with 5-grams
25
Conclusions
Psycholinguistic experiment showed humans find same task difficult
Prosody cannot be used by itself to classify utterances
But, in combination with a lexical model, could be of use
26
Introducing Linguistics
What do linguists do?
Grammar, and other aspects of language
Relationships between languages
How is linguistics used in the real world?
27
What do linguists do?
They don’t necessarily “learn languages”
– Linguist and 語言學 (linguistics) are confusing terms
They are often interested in the structure of languages. They might
– specialize in one language, or a group of languages
– compare different languages
– study features shared by all languages
28
Many linguists study grammar
Syntax
– the way words are arranged to make sentences
– John had lunch / *John lunch had
Morphology
– the way words are modified to fit the circumstances
– John had lunch / *John have lunch
Linguists study
– what people actually say
– not what they “should” say!
29
The sort of things linguists look at in syntax
Syntax (the way words are arranged to make sentences)
– John saw the girl with the telescope
– 爸爸給小明買鹹蛋超人 (Dad buys Xiao Ming an Ultraman)
– Me and Dad went to the toyshop
– Dad bought an Ultraman for John and I
30
And in morphology…
Affixation: hardly used in Chinese
– My son has 73 Ultramen
– 我(?的)兒子有 73 只鹹蛋超人(*們)
Compounding
– rare in English: greenhouse, blackbird
– productive in Chinese
» Verb-object compounds: 開車 (drive a car), 幫忙 (help)
» Resultative compounds: 來得及 (be in time), 跑不掉 (cannot get away)
» Stump compounds: 交大 (short for 交通大學, Chiao Tung University)
31
Phonology: the sounds of a language
How good is ㄅㄆㄇㄈ (Bopomofo) at representing the sounds of Chinese?
– 雄 is xiong in 漢語拼音 (Hanyu Pinyin), vs ㄒㄩㄥ.
– 嗯 and 恩 are the same in ㄅㄆㄇㄈ, n vs en in Pinyin
Has 台灣國語 (Taiwan Mandarin) lost the sounds ㄓㄔㄕ?
Why do we sometimes hear 禮拜ㄕ?
32
Historical linguistics
How languages are related
– Language families
» Indo-European, Sino-Tibetan…
– Areal linguistics
» Greek, Bulgarian
– Mostly borrowed words; also shared grammatical features
» Chinese, Korean, Japanese
How language changes over time
– sounds: poor vs paw, suit.
– vocab: 咖啡 (coffee), 颱風 (typhoon). Calque: 摩天大樓, skyscraper, gratte-ciel
– grammar: Did you eat yet? Adversative passive 被
33
Sociolinguistics
Diglossia: “high” and “low” prestige languages
– The role of Mandarin and Taiwanese in a bilingual society
– The changing role of English in Taiwan society: borrowing, or showing off?
– case and size: code-switching, or lexicalized Chinese words?
Ta-hsüeh-shih-ching Ta-hsüeh-shih-ching Ta-hsüeh-shih-ching
34
Applications for linguistics
Speech disorders
Forensic linguistics
– Accent detection
– Style verification (e.g. police style)
Language teaching
Computational applications
– Machine translation
– Speech recognition and synthesis
– Language identification