MixPoet: Diverse Poetry Generation via Learning ...
Transcript of MixPoet: Diverse Poetry Generation via Learning ...
MixPoet: Diverse Poetry Generation via Learning
Controllable Mixed Latent Space
Xiaoyuan Yi1, Ruoyu Li2, Cheng Yang3, Wenhao Li1, Maosong Sun1*
1Tsinghua University2 6ESTATES PTE LTD
3Beijing University of Posts and Telecommunications
AAAI 2020
CONTENTS
Background & Motivation
Methodology
Experiment & Conclusion
Jiuge (九歌) System
01
• 04
• 03
• 02
Elegant Expressions
Colorful Contents
Diverse Styles
01 Background & Motivation
Poetry
• Human Writing Mechanism
• Computational Creativity
• Humanizing AI
Exploration
Application• Entertainments
• Advertising
• Poetry Education
• Literary Research
…
01 Background & Motivation
(Yan, 2016 ;Wang et al., 2016; Yang et al., 2018)
?
Primary Criteria
of
Poetry Quality
Diversity
(Zhang and Lapata, 2014; Hopkins and Kiela, 2017)1Fluency
Rhyme Rhythm
2Topic
Relevance
(Ghazvininejad et al., 2016; Yi et al. 2018; Li et al. 2018)3Context
Coherence
Poems generated with different topic words
should be distinguishable from each other.
With the same topic word, the model
should be able to generate distinct poems.
Inter-Topic
Diversity
Intra-Topic
Diversity
01 Background & Motivation
Most Frequent K Words
Acc
um
ula
tive
Fre
quen
cyExisting models tend to
• remember common patterns in the corpus
• produce repetitive and generic contents
The most frequent 20 words account for
20% of all generated contents!
Input keyword:desolation Input keyword:autumn lake
Distinct input keywords
• Repetitive words
• Identical lines
01 Background & Motivation
Human-authored poems are highly diverse!
Each human poet has his/her own styles.
One possible solution: modeling each individual poet.
(Wei et al., 2018; Tikhonov and Yamshchikov, 2018)
• Sparsity of data
• Inconsistency of a poet’s style
01 Background & Motivation
Diverse styles
Living experience
Historical background
Love experienceGender
School of literary
Factors
Composition manners of human poets
Thoughts Feelings Expressions…
塞上长城空自许,镜中衰鬓已先斑。陆游南宋《书愤五首·其一》
黄沙百战穿金甲,不破楼兰终不还王昌龄盛唐《从军行七首·其四》
人闲桂花落,夜静春山空。王维 《鸟鸣涧》
孤城向水闭,独鸟背人飞。刘长卿《馀干旅舍》
01 Background & Motivation
Model each influence factor!
• Gather poems that share the same factor value
• Focus on the characteristics of each poem
instead of each poet
Sparse data
Changeable styles of one poet
Model each individual poet.
factor 1
factor 2
factor 3
factor M style 2
style 4
style 1
style 3
style 5
style N
CONTENTS
Background & Motivation
Methodology
Experiment & Conclusion
Jiuge (九歌) System
01
• 04
• 03
• 02
02 Methodology
𝑝 𝑥, 𝑦, 𝑧 𝑤 = 𝑝 𝑦 𝑤 ∗ 𝑝 𝑧 𝑤, 𝑦 ∗ 𝑝(𝑥|𝑤, 𝑦, 𝑧)
Overview
poem𝑥
user-specified keyword𝑤
𝑀 𝑦1, … , 𝑦𝑖 , … , 𝑦𝑀factors
• Poetry style is tightly coupled with
semantics! (Embler,1967)
Each factor 𝑦𝑖 is discretized into 𝐾𝑖 classes
ෑ
𝑖=1
𝑚
𝐾𝑖 factor mixtures
New distinctive styles!
to capture factor-related properties
𝑧 latent variable
𝑝 𝑥, 𝑦 𝑤 = න𝑧
𝑝 𝑥, 𝑦, 𝑧 𝑤 𝑑𝑧
latent space!
Previous work:
[𝑧; 𝑦] (Hu et al., 2017)• Interpretability & controllability
user
02 Methodology
𝑝 𝑥, 𝑦, 𝑧 𝑤 = 𝑝 𝑦 𝑤 ∗ 𝑝 𝑧 𝑤, 𝑦 ∗ 𝑝(𝑥|𝑤, 𝑦, 𝑧)
Figure1: A graphical illustration of MixPoet
specify𝑤
Overview
infer
𝑝 𝑦 𝑤𝑧 sample 𝑥
𝑤
determine
determine
𝑦
specify
different mixtures of 𝑦𝑖 intra-topic diversity √
different 𝑤 inferred 𝑦𝑖 inter-topic diversity √
+
+
02 Methodology
Semi-Supervised Conditional VAE
For labelled data
log 𝑝 𝑥, 𝑦 𝑤 ≥ −𝐿(𝑥, 𝑦, 𝑤)
= 𝐸𝑞 𝑧 𝑥, 𝑤, 𝑦 log 𝑝 𝑥 𝑧, 𝑤, 𝑦 − 𝐾𝐿[𝑞(𝑧|𝑥, 𝑤, 𝑦)| 𝑝 𝑧 𝑤, 𝑦 + log 𝑝(𝑦|𝑤)
log 𝑝 𝑥 𝑤 ≥ −𝑈(𝑥,𝑤)
= 𝐸𝑞 𝑦 𝑥,𝑤 −𝐿 𝑥, 𝑦, 𝑤 + 𝐻(𝑞(𝑦|𝑥, 𝑤))
Decoder Recognition Network
Prior Network
Classifier
For unlabelled data (following (Kingma et al. 2014))
Classifier
𝐿 = 𝐸𝑝𝑙(𝑥,𝑤,𝑦) 𝐿 𝑥,𝑤, 𝑦 − 𝛼 ∗ log 𝑞 𝑦 𝑥,𝑤 + 𝛽 ∗ 𝐸𝑝𝑢 𝑥,𝑤 [𝑈(𝑥, 𝑤)]
Total loss
02 Methodology
Latent Space Mixture
To incorporate multiple factors 𝑧 = [𝑧1, 𝑧2, … , 𝑧𝑀]
(1) independence of influence factors 𝑦1, 𝑦2
(2) conditional independence of subspaces 𝑧1, 𝑧2
e.g., M = 2
𝑝 𝑧 𝑤, 𝑦 = 𝑝 𝑧1 𝑤, 𝑦1 𝑝(𝑧2|𝑤, 𝑦2)
Assume:
Independently specify the values of different factor
A latent space which mixes the properties of multiple factors!
𝑧1 𝑧2
𝑧
𝑧2𝑧1
02 Methodology
(1) Mixture for Isotropic Gaussian Space (MixPoet-IG)
Assume 𝑧~𝑁(𝜇, 𝜎2𝐼) 𝐾𝐿[𝑞(𝑧|𝑥, 𝑤, 𝑦)| 𝑝 𝑧 𝑤, 𝑦
= 𝐾𝐿[𝑞(𝑧1|𝑥, 𝑤, 𝑦1)| 𝑝 𝑧1 𝑤, 𝑦1+𝐾𝐿[𝑞(𝑧2|𝑥, 𝑤, 𝑦2)| 𝑝 𝑧2 𝑤, 𝑦2
Analytically minimize
these two KL terms!
Independent dimensions of the latent variable Not expressive enough(Dilokthanakul et al. 2017)
A complex latent space with
(1) independent subspaces (𝑧1, 𝑧2)
(2) entangled internal dimensions of each subspace
Independent control of each factor
Expressive and distinguishable latent representations
02 Methodology
(2) Adversarial Mixture for Universal Space (MixPoet-AUS)
Universal Approximator (Makhzani et al. 2015)
𝑞 𝑧 𝑐, 𝜂 = 𝛿(𝑧 − 𝑓(𝑐, 𝜂))
𝜂~𝑁(0,1)sample 𝑓transformation
e.g., MLP𝑤, 𝑦1condition
𝑧1~𝑞(𝑧1|𝑤, 𝑦1)
Independent samples of 𝑧1, 𝑧2 from learned arbitrary complex distributions!get√
𝜂~𝑁(0,1)
simple distributioncondition
A learned arbitrary complex distribution
= 𝐸𝑞 𝑧 𝑥, 𝑤, 𝑦1, 𝑦2 [log𝑞 𝑧 𝑥, 𝑤, 𝑦1, 𝑦2
𝑝 𝑧1 𝑤, 𝑦1 𝑝(𝑧2|𝑤, 𝑦2)]
≈ 𝐸𝑞 𝑧 𝑥, 𝑤, 𝑦1, 𝑦2 [log𝐶(𝑧, 𝑦1, 𝑦2)
1 − 𝐶( 𝑧1; 𝑧2 , 𝑦1, 𝑦2)]
𝐾𝐿[𝑞(𝑧|𝑥, 𝑤, 𝑦1, 𝑦2)| 𝑝 𝑧1 𝑤, 𝑦1 𝑝(𝑧2|𝑤, 𝑦2)
02 Methodology
(2) Adversarial Mixture for Universal Space (MixPoet-AUS)
𝐾𝐿[𝑞(𝑧|𝑥, 𝑤, 𝑦)| 𝑝 𝑧 𝑤, 𝑦 No analytical form!
Density Ratio Loss (Rosca et al. 2017)
Conditional latent discriminator
02 Methodology
(2) Adversarial Mixture for Universal Space (MixPoet-AUS)
𝑧1~𝑞(𝑧1|𝑤, 𝑦1)
𝑧2~𝑞(𝑧2|𝑤, 𝑦2)𝑧~𝑞(𝑧|𝑥, 𝑤, 𝑦1, 𝑦2) 𝑧
𝑧1
𝑧2
𝑧2𝑧1
Use adversarial training to
minimize the ratio loss
𝐶
𝑧 𝑧2𝑧1
𝑦1, 𝑦2
1/0
When the discriminator is cheated
𝐶(∙) ≈ 0.5
𝐾𝐿[𝑞(𝑧|𝑥, 𝑤, 𝑦1, 𝑦2)| 𝑝 𝑧1 𝑤, 𝑦1 𝑝 𝑧2 𝑤, 𝑦2 ≈ 0
CONTENTS
Background & Motivation
Methodology
Experiment & Conclusion
Jiuge (九歌) System
01
• 04
• 03
• 02
03 Experiment & Analysis
Dataset
Two typical factors
Living Experience
(Dilthey 1985)
Historical Background
(Owen 1990)
Military Career (MC)
Countryside Life (CL)
Others
Prosperous Times (PT)
Troubled Times (TT)
Table 1: Statistics of labelled poems.
• 49,451 labelled poems
• 117k unlabelled poems
3 ∗ 2 = 6 styles!
03 Experiment & Analysis
Experiment Results
Figure 2: Factor control accuracy.
Table 3: Human evaluation results of quality.
Table 2: Automatic evaluation results of diversity.
controllable
to some
extent!
Acc
um
ula
tive
Fre
quen
cy
Most Frequent K Words
Figure 4 (b): Accumulative frequency of the most
frequent k words in generated poems.
03 Experiment & Analysis
Analysis
Figure 4 (a): Visualization of samples of z conditioned
on the keyword ‘spring wind’ and different mixtures.
03 Experiment & Analysis
Case Study
Figure 5: Poems generated by baseline Basic (a), USPG (b) and MixPoet with different mixtures (c, d). Repetitive
phrases are marked in red. Phrases meeting different factor classes are marked in corresponding colors.
03 Experiment & Analysis
General Comparison
Models Inter-topic
diversity
Intra-topic
diversity
Main contributor to
diversity
Interpretability of
styles
CVAE (Yang et al., 2018a) √ √ Randomness resulted by
different samples from the
latent space.
×
MRL (Yi et al., 2018) √ √ √ √ × Encourage the generation of
high-TF-IDF words.×
USPG (Yang et al., 2018b) √ √ √ √ Disentangle poetry space into
different clusters×
MixPoet (Ours) √ √ √ √ √ √ Mix different influence
factors conditioned latent
subspaces.
Different factor
mixture
MixPoet
• Best intra-topic diversity
• Satisfactory inter-topic diversity
• Interpretable and distinguishable styles
03 Experiment & Analysis
Conclusion
A novel poetry generation model – MixPoet
Semi-supervised structure
Absorb and mix multiple factors which influence human poetry composition.
Controllability of factor-related properties
More distinguishable latent space
Improve both intra-topic & inter-topic diversity
Future Work
More factors, e.g., love experience, school of literary, gender, and age
Finer-granularity discretization, e.g., more classes for each factor (even continuous values?)
Dependence of influence factors, e.g., living experience is related to gender
CONTENTS
Background & Motivation
Methodology
Experiment & Conclusion
Jiuge (九歌) System
01
• 04
• 03
• 02
Jiuge
A Chinese poetry generation system developed by THUNLP lab.
• Support most popular genres of Chinese poetry
• Multi-style choices
• Human-machine collaborative creation
• Automatic Recommendation of related human-
authored poems.
• Page View > 7 million
(九歌)
Jiuge (九歌)
https://jiuge.thunlp.cn/Online system:
Github: https://github.com/thunlp-aipoet
Source codes and data and of Jiuge systems will be
gradually released via our github page!
Thanks!
Xiaoyuan Yi PhD Student
THUNLP Lab, Tsinghua University
Mail: [email protected]
Any questions or suggestions, please feel free to email us!