Carnegie Mellon University THE ROBOTICS INSTITUTE

Thesis Defense

Kevin A. Lenzo

Friday, December 9, 2016

GHC 6501

9:30 a.m.

Thesis Committee

Alan W. Black, Chair

Jack Mostow

Alex Rudnicky

Julia Hirschberg, Columbia University

Improving Prosody through Analysis by Synthesis

Abstract

An iterative model-based method is proposed for improving linguistic structure, segmentation, and prosodic annotations that correspond to the delivery of each utterance as regularized across the data. For each iteration, the training utterances are resynthesized according to the existing symbolic annotation. Values of various features and subgraph structures are "twiddled": each is perturbed based on the features and constraints of the model. Twiddled utterances are evaluated using an objective function appropriate to the type of perturbation and compared with the unmodified, resynthesized utterance. The instance with least error is assigned as the current annotation, and the entire process is repeated. At each iteration, the model is re-estimated, and the distributions and annotations regularize across the corpus. As a result, the annotations have more accurate and effective distributions, which leads to improved control and expressiveness given the features of the model.
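The loop described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the method of the thesis: the annotation is reduced to one binary accent label per syllable, "resynthesis" is a lookup of a per-label mean F0 value, the objective function is summed squared error, and the names `synthesize`, `error`, `twiddle`, `reestimate`, and `refine` are hypothetical.

```python
from statistics import mean

LABELS = (0, 1)  # hypothetical label set: 0 = unaccented, 1 = accented


def synthesize(labels, model):
    """Toy 'resynthesis': predict one F0 value per syllable from its label."""
    return [model[label] for label in labels]


def error(predicted, observed):
    """Toy objective function: summed squared error against the recording."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def twiddle(labels):
    """Yield every annotation that differs from `labels` in exactly one position."""
    for i, current in enumerate(labels):
        for alt in LABELS:
            if alt != current:
                yield labels[:i] + [alt] + labels[i + 1:]


def reestimate(corpus, annotations, model):
    """Re-estimate the per-label mean F0 from the current annotations."""
    new_model = dict(model)
    for label in LABELS:
        values = [obs
                  for labels, observed in zip(annotations, corpus)
                  for lab, obs in zip(labels, observed)
                  if lab == label]
        if values:  # keep the old estimate if a label is currently unused
            new_model[label] = mean(values)
    return new_model


def refine(corpus, annotations, model, iterations=10):
    """Iterate: twiddle, keep the least-error instance, re-estimate the model."""
    for _ in range(iterations):
        updated = []
        for labels, observed in zip(annotations, corpus):
            # The unmodified annotation comes first, so it wins ties.
            candidates = [labels] + list(twiddle(labels))
            best = min(candidates,
                       key=lambda c: error(synthesize(c, model), observed))
            updated.append(best)
        model = reestimate(corpus, updated, model)
        if updated == annotations:  # no annotation changed: stop early
            break
        annotations = updated
    return annotations, model


# Made-up data: per-syllable F0 values (Hz) for two utterances.
corpus = [[100.0, 180.0, 105.0], [175.0, 95.0, 185.0]]
initial = [[0, 0, 0], [1, 1, 1]]
final_annotations, final_model = refine(corpus, initial, {0: 110.0, 1: 170.0})
```

In this sketch each pass changes at most one label per utterance, and the model is re-estimated from the whole corpus after every pass, which is the sense in which the annotations and distributions regularize together. A real system would perturb richer structures and score synthesized speech, not a lookup table.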