Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa
technology from seed
L2F - Spoken Language Systems Laboratory
Children Voices Against Bullying in Schools
Luís Caldas de Oliveira
Outline
• Spoken Language Systems Laboratory of INESC-ID
• Bullying and FearNot!
• Acoustics of children's voices
• Voice building and TTS System
• Experimental evaluation
• Results
• Conclusion and Future Work
About L2F
• History: Work on speech processing for Portuguese since the 90s. L2F was created in 2001.
• Goal: Bring together several groups in the area of spoken language processing for European Portuguese, united by the problem we want to solve, not by the technology we share.
• Mission: Creating technology to bridge the gap between natural spoken language and the underlying semantic information.
• Interdisciplinary background: Signal processing, natural language processing, linguistics, etc.
Priority Lines of Activity
• Semantic processing of multimedia content
– Follow-up of the ALERT project: continued research on segmentation, recognition, topic indexing, summarization
– Automatic closed captioning
• Spoken dialogue systems and intelligent multimodal interfaces
– Domotics: "intelligent" rooms controllable by voice
– Telephone-based information systems
– Voices for synthetic characters
Companies working with L2F
• Vodafone Portugal
• RTP (public broadcasting)
• Promosoft (banking solutions)
• Priberam (law databases)
• Edisoft (security and defense industry)
• Tecmic (fleet management solutions)
• Ano (local government solutions)
• CPC HS (health systems)
• Microsoft Language Development Center
Bullying
• Repeated oppression, psychological or physical, of a less powerful person by a more powerful person. David Farrington (1993)
FearNot!
Interaction
F0 vs Age
• Lee, S., Potamianos, A. and Narayanan, S., Acoustics of children’s speech: Developmental changes of temporal and spectral parameters. J. Acoust. Soc. Am., 105:1455–1468, Mar. 1999.
[Figure: F0 versus age, with curves labelled girl, boy and woman, and a marked target. Data from 436 subjects (ages 5 to 18) and 56 adults.]
Formant Scaling
• Lee, S., Potamianos, A. and Narayanan, S., Acoustics of children’s speech: Developmental changes of temporal and spectral parameters. J. Acoust. Soc. Am., 105:1455–1468, Mar. 1999.
[Figure: formant scaling, with curves labelled girl, boy and woman, and a marked target.]
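The slides do not say how the scaling was applied to the recordings, so the following is only a minimal sketch of the idea behind a scaling factor. It computes the ratio needed to move a source frequency to a target, and linearly warps a sampled spectral envelope so that its peaks shift by that ratio. The function names, the 200 Hz and 250 Hz values, and the toy envelope are all assumptions made for illustration, not values from the cited study.

```python
def scale_factor(source_hz, target_hz):
    """Ratio by which a source frequency is multiplied to reach the target."""
    if source_hz <= 0 or target_hz <= 0:
        raise ValueError("frequencies must be positive")
    return target_hz / source_hz


def warp_envelope(envelope, alpha):
    """Linearly warp a uniformly sampled envelope so a peak at bin k moves to alpha*k.

    Output bin k reads the source at position k/alpha using linear
    interpolation; positions past the last bin hold the last value.
    """
    n = len(envelope)
    warped = []
    for k in range(n):
        pos = k / alpha
        if pos >= n - 1:
            warped.append(envelope[-1])
            continue
        i = int(pos)
        frac = pos - i
        warped.append((1 - frac) * envelope[i] + frac * envelope[i + 1])
    return warped


# Made-up example: raise a 200 Hz value to a 250 Hz target.
f0_ratio = scale_factor(200.0, 250.0)

# Toy envelope with a single peak at bin 10, warped by a factor of 1.5.
toy = [0.0] * 40
toy[10] = 1.0
shifted = warp_envelope(toy, 1.5)
peak_bin = max(range(len(shifted)), key=lambda k: shifted[k])
```

A single linear factor is the simplest choice; a real voice-modification system would typically work on short-time spectra and may use a different warping function.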
Voice Building
• English and German voices.
• 12 voices from 4 speakers.
• Characters speak with TTS-Voice.
• Matching voice characteristics.
Recordings
• Language models generate around 5000 different utterances for each language.
• A greedy algorithm automatically selected a representative subset (550 utterances).
• 2 native German speakers and 2 native British speakers recorded the sentences.
• The recordings were modified to generate multiple child-like voices.
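The slides do not state which criterion the greedy algorithm optimised. A common choice for recording-script selection is coverage, so the sketch below assumes that: at each step it picks the utterance that adds the most not-yet-covered units. The unit definition (words and adjacent word pairs, standing in for phonetic units), the function names, and the toy corpus are hypothetical.

```python
def units(utterance):
    """Hypothetical coverage units: words plus adjacent word pairs."""
    words = utterance.lower().split()
    return set(words) | set(zip(words, words[1:]))


def greedy_select(utterances, budget):
    """Pick up to `budget` utterances, each adding the most new units."""
    covered = set()
    chosen = []
    remaining = list(utterances)
    while remaining and len(chosen) < budget:
        # max() returns the first best candidate, so ties resolve by corpus order.
        best = max(remaining, key=lambda u: len(units(u) - covered))
        gain = units(best) - covered
        if not gain:
            break  # nothing left adds coverage
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen


corpus = ["a b c", "a b", "c d", "b c"]
selected = greedy_select(corpus, 3)
```

With this toy corpus the selection stops after two utterances, because the remaining ones add no new units. In the setting described above, the corpus would be the roughly 5000 generated utterances and the budget 550.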
Voice Building Architecture
[Diagram: blocks labelled Normalisation, Audio, Segmentation, Utterances, Labels, Audio.]
Segmentation
• Recordings were segmented with our own segmentation tool, adapted to British English
• Gender-dependent models were trained using the British English WSJ corpus
• Accuracy of 85% / 84% for female and male speakers, respectively
• Speaker adaptation was performed twice, using the canonical word pronunciations for segmentation
• In the 3rd iteration, a pronunciation graph was provided that combined the canonical pronunciations with alternative pronunciations generated by post-lexical rules
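As a rough illustration of how post-lexical rules can add alternative pronunciations to a canonical one, the sketch below enumerates the paths such a pronunciation graph would contain. The rule table, the phone symbols, and the function name are invented for this example; the actual rules and graph format used by the segmentation tool are not given in the slides.

```python
from itertools import product

# Hypothetical post-lexical rules: each phone maps to its allowed
# realisations, where "" means the phone may be deleted.
RULES = {
    "t": ["t", ""],
    "@": ["@", ""],
}


def pronunciation_variants(canonical):
    """List every phone sequence the rules allow, canonical form first."""
    options = [RULES.get(phone, [phone]) for phone in canonical]
    variants = []
    for choice in product(*options):
        seq = tuple(p for p in choice if p)  # drop deleted phones
        if seq not in variants:
            variants.append(seq)
    return variants


paths = pronunciation_variants(["l", "@", "t"])
```

An aligner given these paths can choose the variant that best matches the audio instead of forcing the canonical form. A real system would store the alternatives as a compact graph rather than an enumerated list, since the number of paths grows quickly with word length.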
XML Representation
Speech-Engine Architecture
Speech-Engine
• No explicit prosody model for duration and F0
• Duration and F0 are predicted during runtime
• They are used as features for the context-matching segment pre-selection
• Segment Selection:
Character with original voice
Character with modified voice
Character with modified voice
Experimental Evaluation
• Subjects divided into 2 categories:
– Audio only
– Audio and video
• Rating of 6 items (Likert scale):
– Overall sound quality
– Naturalness
– Sounds like boy/girl?
– Sounds like bully/victim?
• 8 different versions of each stimulus:
– 2 original voices
– 2 modified voices
– 2 synthesised voices
– 2 modified synthesised voices
[1] Johnson et al., Limited domain synthesis of expressive military speech for animated characters, IEEE Workshop on Speech Synthesis, 2002.
Results
• The presence of video resulted in a better rating of the overall perceived quality: 3.42 (p<0.005) for audio only vs 3.70 (p<0.00001) with video.
• The presence of the animated character made the voices more believable, especially for the victim (3.68, p<0.00001).
• The modified voices had the same overall quality rating as the unmodified voices in the audio-only test (3.42, p<0.04) but were rated better when played in video clips (3.82, p<0.00001 vs 3.59, p<0.009).
• The overall quality ratings of both the modified and unmodified recordings were above 4 (4.45, p<0.00001).
Conclusions and Future Work
• Limited domain synthesis allowed us to produce voices for 3D animated characters with almost natural speech quality
• Although there was no story context in our evaluation, the video of the animated characters positively influenced the perceived overall quality and intonation
• Additional voices need to be generated
• Some segmentation and concatenation problems need to be corrected
Thank you / Obrigado
L2F - Spoken Language Systems Laboratory