Generating Haiku with Word Association Norms
Gaiku
Yael Kinor
Generating Haiku with Word Association Norms
Article by Yael Netzer, David Gabay, Yoav Goldberg, Michael Elhadad
Contents
- Computational Creativity
- What is Haiku?
- Where to start?
- Algorithm
- Experiment
Computational Creativity
Model, simulate or replicate creativity using a computer
- Program capable of human-level creativity
- Better understand human creativity
- Enhance human creativity
Computational Creativity Challenge -
Understanding and modeling vague knowledge (beautiful, touching, funny)
Oulipo
Ouvroir de littérature potentielle - “Workshop of Potential Literature”
- French writers and mathematicians
- Founded in 1960
- Create works using constraints
“seeking of new structures and patterns which may be used by writers in any way they enjoy.”
Oulipo Methods
- N+7
- Lipogram
- Combinatorics
- Snowball
Poetry Generation
Three properties -
- Meaningfulness
- Grammaticality
- Poeticness
How can we apply to Haiku?
Haiku 俳句
Short form of Japanese poetry from the 16th century
- Present tense, no judgment words
- Beauty lies in the things left unsaid
- Read between the lines
- Minimalism
- Express anything and everything
Haiku 俳句
Follows three technical rules -
- Kireji = cutting word
- 17 on = syllables
- Kigo = seasonal reference
Kireji
Special category of words
- Sense of closure or pause
- Emotional flavour and emphasis
- Ka, kana, -keri, -ramu, -shi, -tsu, ya
- No exact equivalent in English (dash, ellipsis)
On
Phonological unit that determines syllable weight (μ)
- A long vowel is one syllable but two moras
(Aa, Ou, Ei)
- 17 moras - 3 lines of 5 / 7 / 5
We will talk about syllables
Kigo
A word or phrase associated with a particular season
- Categories - seasons, plants, animals, earth, humanity, heavens
- Early/Mid/Late of Spring/Summer/Autumn/Winter
- Saijiki
古池や蛙飛びこむ水の音
furu ike ya / kawazu tobi komu / mizu no oto
an old pond / a frog jumps into / the sound of water
Basho - 17th century
Haiku 俳句
- Kireji = cutting word
- 17 on = syllables
- Kigo = seasonal reference
Haiku 俳句
To convey one's mood / In seventeen syllables / Is very diffic
[Haiku, defined by John Cooper Clarke]
We will use a less constraining definition..
How to write a computer program to generate Haiku?
Let’s ask Stack Overflow
We Need Words
Lexical databases
- WordNet
- Thesaurus
Predictable and limited..
We Need Words
Human mind uses varied associations
- Paradigmatic - (doctor-nurse)
- Syntagmatic - (mash-potato)
- Emotional - (math-yuck)
We Need Words
Creativity requires -
- Knowledge
- Imagination
- Filtering
Bad Examples -
Haiku generator 1
Haiku Generator 2
WANs
Word Association Norms
- 40 years of psychological research
- Typical associations evoked by trigger
- Stable across people, time, language
Excellent for Haiku!
University of South Florida Free Association Norms
- 5K cue words + 10K additional words
Are Haiku more associative than news or prose?
WANs
Modelling
WordNet
WANs
Modelling
WAN graph
WANs
Basic definitions -
Shortest path distances -
asso_dist(w1, w2), wordNet_dist(w1, w2)
If [ asso_dist(w1, w2) < 3 ] => asso_related(w1, w2) = 1
If [ wordNet_dist(w1, w2) < 4 ] => wordNet_related(w1, w2) = 1
If [ w1, w2 in WAN ] => in_wan(w1, w2) = 1
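The definitions above can be sketched in Python. This is a minimal illustration, not the authors' code: it assumes the association graph is stored as an undirected adjacency dict, and the toy graph `TOY_WAN` and the helper names `shortest_path_distance` and `related` are invented for the example. The thresholds 3 and 4 come from the slide.

```python
from collections import deque

def shortest_path_distance(graph, source, target):
    """Breadth-first search: number of edges between two words, or None."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        word, dist = queue.popleft()
        for neighbour in graph.get(word, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # no path between the two words

def related(graph, w1, w2, threshold):
    """1 if the shortest-path distance is below the threshold, else 0."""
    dist = shortest_path_distance(graph, w1, w2)
    return 1 if dist is not None and dist < threshold else 0

def asso_related(wan, w1, w2):
    return related(wan, w1, w2, threshold=3)

def wordnet_related(wordnet_graph, w1, w2):
    return related(wordnet_graph, w1, w2, threshold=4)

def in_wan(wan, w1, w2):
    return 1 if w1 in wan and w2 in wan else 0

# Toy association graph (invented data, assumed undirected).
TOY_WAN = {
    "pond": {"frog", "water"},
    "frog": {"pond", "jump"},
    "water": {"pond", "rain"},
    "jump": {"frog"},
    "rain": {"water", "cloud"},
    "cloud": {"rain"},
}
```

With this toy graph, "frog" and "water" are two steps apart and count as associatively related, while "jump" and "water" are three steps apart and do not.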
WANs
WANs
Associativity Test
- 200 Haiku
- 12 word sequences Project Gutenberg
- 12 word sequences NANC newswire corpus ($$)
Generating Haiku using WANs
5 stages of creative process -
- Theme selection
- Syntactic planning
- Content selection / semantic planning
- Filtered over-generation
- Re-ranking
Algorithm
Dataset
- USF WAN - 15K words
- Haiku corpus - 3500 English Haikus
- Google N-grams - 1TB of data
- Project Gutenberg - Easier to POS tag
Algorithm
Theme Selection
1. User supplied seed word
2. Consult WAN database -
● 8 x Random walks from seed, 3 steps
● Keep all resulting words
Theme selection
Random walks -
association graph vs. WordNet graph
Algorithm
Syntactic Planning
Training stage -
1. POS-tag each Haiku
2. Extract line pattern - “DT_the JJ NN”
3. Top 40 frequent line patterns
=> Total 120 patterns
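A possible sketch of the training stage, assuming the haiku are already POS-tagged as (word, tag) pairs and that "120 patterns" means the 40 most frequent patterns for each of the three line positions. The set `FUNCTION_TAGS` and the names `line_pattern` and `top_patterns` are assumptions made for the example, as is the toy corpus.

```python
from collections import Counter

# Assumption: which tags count as function words and stay lexicalised.
FUNCTION_TAGS = {"DT", "IN", "CC", "TO"}

def line_pattern(tagged_line):
    """Turn [("the", "DT"), ("old", "JJ"), ...] into "DT_the JJ ..."."""
    parts = []
    for word, tag in tagged_line:
        if tag in FUNCTION_TAGS:
            parts.append(f"{tag}_{word.lower()}")  # keep the function word
        else:
            parts.append(tag)                      # open slot: tag only
    return " ".join(parts)

def top_patterns(tagged_haiku, per_line=40):
    """Most frequent patterns for line 1, line 2 and line 3 separately."""
    counters = [Counter(), Counter(), Counter()]
    for haiku in tagged_haiku:
        if len(haiku) != 3:
            continue  # skip poems that are not three lines long
        for position, line in enumerate(haiku):
            counters[position][line_pattern(line)] += 1
    return [[p for p, _ in c.most_common(per_line)] for c in counters]

# Toy tagged corpus (invented data).
TOY_CORPUS = [
    [[("the", "DT"), ("old", "JJ"), ("pond", "NN")],
     [("a", "DT"), ("frog", "NN"), ("jumps", "VBZ")],
     [("in", "IN"), ("the", "DT"), ("water", "NN")]],
    [[("the", "DT"), ("cold", "JJ"), ("rain", "NN")],
     [("a", "DT"), ("crow", "NN"), ("calls", "VBZ")],
     [("winter", "NN"), ("dusk", "NN")]],
]

patterns_by_line = top_patterns(TOY_CORPUS)
```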
Algorithm
Syntactic Planning
Generate Haiku -
- Choose random pattern for 1st line
- Choose 2nd line
- Choose 3rd line
Algorithm
Syntactic Planning
Result - 3 line Haiku skeleton -
- Num of words in each line
- POS tags of words
- Placement of function words
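One way to represent such a skeleton, as a sketch only. The `Slot` type and the functions `parse_pattern` and `choose_skeleton` are hypothetical, and choosing each line's pattern independently and uniformly is an assumption; the slides do not say how the 2nd and 3rd lines are chosen.

```python
import random
from typing import NamedTuple, Optional

class Slot(NamedTuple):
    tag: Optional[str]   # POS tag, or None for a bare literal word
    word: Optional[str]  # fixed function word, or None for an open slot

def parse_pattern(pattern):
    """Parse a pattern such as "DET_a NN of NNS" into a list of slots."""
    slots = []
    for token in pattern.split():
        tag, sep, word = token.partition("_")
        if sep:                    # "DET_a": tag plus fixed function word
            slots.append(Slot(tag, word))
        elif token.islower():      # bare literal such as "of"
            slots.append(Slot(None, token))
        else:                      # open slot such as "NN"
            slots.append(Slot(token, None))
    return slots

def choose_skeleton(patterns_by_line, rng=None):
    """Pick one pattern per line and parse it into slots."""
    rng = rng or random.Random()
    return [parse_pattern(rng.choice(patterns)) for patterns in patterns_by_line]

skeleton = choose_skeleton(
    [["NN NN"], ["DET_a NN of NNS"], ["PP_in DET_the NN"]],
    rng=random.Random(1),
)
```

The number of words per line is the length of each slot list, and the fixed function words sit at the positions where `word` is not `None`.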
Algorithm
Content Selection
For each Haiku line -
- Search match in datasets (Google, Project Gutenberg)
- Syntactic patterns
- 1st line to contain seed word
- 2nd and 3rd contain any theme word
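The matching step might look like the sketch below. It assumes the n-gram data has already been POS-tagged into (word, tag) pairs. The names `matches_pattern` and `select_lines` and the toy n-grams are invented for illustration; a real run would search the Google N-grams and Project Gutenberg data instead.

```python
def matches_pattern(tagged_ngram, pattern):
    """Check a tagged n-gram against a pattern such as "DET_a NN of NNS"."""
    tokens = pattern.split()
    if len(tokens) != len(tagged_ngram):
        return False
    for token, (word, tag) in zip(tokens, tagged_ngram):
        want_tag, sep, want_word = token.partition("_")
        if sep:                      # tag and function word must both match
            if tag != want_tag or word.lower() != want_word:
                return False
        elif token.islower():        # bare literal word such as "of"
            if word.lower() != token:
                return False
        elif tag != token:           # open slot: only the tag must match
            return False
    return True

def select_lines(tagged_ngrams, pattern, required_words):
    """Keep n-grams that fit the pattern and contain a required word."""
    selected = []
    for ngram in tagged_ngrams:
        words = [w.lower() for w, _ in ngram]
        if matches_pattern(ngram, pattern) and required_words.intersection(words):
            selected.append(" ".join(words))
    return selected

# Toy tagged n-grams (invented data).
TOY_NGRAMS = [
    [("pear", "NN"), ("tree", "NN")],
    [("stone", "NN"), ("wall", "NN")],
    [("a", "DET"), ("season", "NN"), ("of", "IN"), ("tears", "NNS")],
    [("in", "PP"), ("the", "DET"), ("spring", "NN")],
]

seed = "pear"
theme = {"pear", "season", "spring"}
first = select_lines(TOY_NGRAMS, "NN NN", {seed})           # must hold the seed
second = select_lines(TOY_NGRAMS, "DET_a NN of NNS", theme)  # any theme word
third = select_lines(TOY_NGRAMS, "PP_in DET_the NN", theme)
```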
Algorithm
Syntactic Planning
NN NN
DET_a NN of NNS
PP_in DET_the NN
alligator pear / a handful of whites / in the spring
avocado pear / a kind of boots / in the fall
pear salad / a season of tears / in the summer
pear tree / a seasoning of spices / in the fall
Algorithm
Over Generation
- Create all possible Haiku candidates!
- Filter away Haikus with undesired properties
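Over-generation followed by filtering can be sketched with a Cartesian product over the candidate lines. The slides do not list which properties are undesired, so the filter `no_repeated_word` below is a made-up example, and `over_generate` is a hypothetical name.

```python
from itertools import product

def over_generate(first_lines, second_lines, third_lines, filters=()):
    """Build every line combination, then drop those failing any filter."""
    kept = []
    for haiku in product(first_lines, second_lines, third_lines):
        if all(keep(haiku) for keep in filters):
            kept.append(haiku)
    return kept

def no_repeated_word(haiku, ignore=frozenset({"a", "the", "of", "in"})):
    """Hypothetical filter: reject haiku that repeat a content word."""
    words = [w for line in haiku for w in line.split() if w not in ignore]
    return len(words) == len(set(words))

candidates = over_generate(
    ["pear tree", "pear salad"],
    ["a season of tears", "a tree of pears"],
    ["in the spring", "in the fall"],
    filters=[no_repeated_word],
)
```

Here the unfiltered product has 2 x 2 x 2 = 8 candidates, and the filter drops the two that pair "pear tree" with "a tree of pears".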
Algorithm
Re-ranking
Heuristics -
Maximize number of 2nd degree associations
*This method chooses quantity over quality..
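A sketch of the re-ranking heuristic, under one possible reading of "2nd degree association": two words of the haiku that share at least one neighbour in the association graph. That reading, the names `second_degree_count` and `rerank`, and the toy graph are all assumptions.

```python
from itertools import combinations

def second_degree_count(haiku, wan):
    """Count word pairs in the haiku that share a neighbour in the graph."""
    words = sorted({w for line in haiku for w in line.split() if w in wan})
    return sum(1 for a, b in combinations(words, 2) if wan[a] & wan[b])

def rerank(candidates, wan):
    """Order candidates from most to fewest 2nd degree associations."""
    return sorted(candidates, key=lambda h: second_degree_count(h, wan), reverse=True)

# Toy association graph (invented data).
TOY_WAN = {
    "pear": {"fruit", "tree"},
    "tree": {"leaf", "pear"},
    "fall": {"season", "leaf"},
    "spring": {"season", "flower"},
    "season": {"spring", "fall"},
    "tears": {"cry", "sad"},
}

haiku_a = ("pear tree", "a season of tears", "in the fall")
haiku_b = ("pear salad", "a season of tears", "in the spring")
ranked = rerank([haiku_b, haiku_a], TOY_WAN)
```

In the toy graph, "tree" and "fall" both link to "leaf", so `haiku_a` scores 1 and is ranked ahead of `haiku_b`, which scores 0. The score only counts links, which matches the slide's remark that the method favours quantity over quality.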
Is a computer-generated poem good enough?
fishing guides / boat in the background / a new trip
blossomless / but not unloved / the old magnolia
first date — / the little pile / of anchovies
a holy cow / a carton of milk / seeking a church
blind snakes / on the wet grass / tombstoned terror
iced over pond / I skip a rock / the entire width
Experiment
Turing Test
Test of a machine's ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human.
Experiment
First set (AUTO)
- 10 random Haikus from corpus
- 15 computer generated Haikus -
● A human identifies the main words in the first line
● Use main word as seed
● 2 main words => 2 haikus
Second set (SEL)
- 9 award-winning Haikus
- 17 human-selected computer generated Haikus
Experiment
- 50 adult subjects
- “Human” / “Not Human”
- 1-5 grade
Experiment
Results
(SEL) 66.7% correct
(AUTO) 61.4% correct
Experiment
AUTO Results
Cherry tree / poisonous flowers lie / blooming (72.2% human)
Spring bloom / showing / the sun’s pyre (63.8% human)
Experiment
SEL Results
Early dew / the water contains / teaspoons of honey (77.2% human)
Space journey / musical instruments mythology / of similar drugs (9% human)
Can a computer-generated creation capture the beauty of a flower?
Future work
- Enrich information in WordNet
- New creative structures
- Assisting with SLI