Toward Sequencing “Narrative DNA”: Tale Types, Motif Strings and Memetic Pathways
Sándor Darányi, Peter Wittek & László Forró†
Swedish School of Library and Information Science, University of Borås
50190 Borås, Allégatan 1, Sweden
†8220 Balatonalmádi, Remetevölgyi út 27, Hungary
Acknowledgements
• AMICUS project 2009-2012 (Automated Motif Discovery in Cultural Heritage and Scientific Communication Texts, Netherlands Organization for Scientific Research, NWO Humanities)
CMN-12 Istanbul -- May 26, 2012
Structure of presentation
I. Frame of thought, concepts used
II. Experiment design
III. Results
IV. Future research directions
I. Frame of thought, concepts used
• Examples of formulaicity
• Standard tools in tale research
• What is a motif?
• The genetic-memetic parallel
Examples of formulaicity (= structure) in narrative research
• Propp (1929): Russian fairy tales have 7 actors (dramatis personae), 31 functions (types of actions) and 150 narrative elements
• Lévi-Strauss (1954): both narrative segments within myths and myth variants manifest canonical content transformations
• Harris (1998): disciplinary scientific content can be expressed by sentences of abstract concepts
Standard reference works for tale research
ATU: Uther, H. J. 2004. The Types of International Folktales. A Classification and Bibliography. Based on the System of Antti Aarne and Stith Thompson 1–3 (FFC 284–286). Academia Scientiarum Fennica, Helsinki.
• TALES OF MAGIC, SUPERNATURAL ADVERSARIES 300-399
• Tale type 300: The Dragon-Slayer. “A youth acquires (e.g. by exchange) three wonderful dogs [B421, B312.2]. He comes to a town where people are mourning and learns that once a year a (seven-headed) dragon [B11.2.3.1] demands a virgin as a sacrifice [B11.10, S262]. In the current year, the king's daughter has been chosen to be sacrificed, and the king offers her as a prize to her rescuer [T68.1]. The youth goes to the appointed place. While waiting to fight with the dragon, he falls into a magic sleep [D1975], during which the princess twists a ring (ribbons) into his hair; only one of her falling tears can awaken him [D1978.2]. (…)”
AaTh: Thompson, S. 1955-1958. Motif-Index of Folk-Literature 1–6. Indiana University Press, Bloomington.
• B312. /Helpful animals obtained by purchase or gift./
• B312.1. /Helpful animals a gift./ German Grimm No. 60, 126; Irish myth: Cross; Spanish: Boggs FFC XC 40 No. 300; Icel.: Boberg, Þiðriks saga I 314--18; India: Thompson-Balys; Japanese: Ikeda.
• B312.2. /Helpful animals obtained by exchange./ *Type 300; *Hartland Perseus III 195; De Gubernatis Zool. Myth. III 36 n.--N. A. Indian: Thompson CColl II 329ff.
• B312.3. /Helpful animal(s) bequeathed to hero./ Italian Novella: Rotunda; India: Thompson-Balys; Africa (Hausa): Best Black Folk Tales 71ff., Tremearne Hausa Superstitions and Customs 374ff. No. 79; Madagascar: (Marofotsy) Renel Contes de Madagascar I 65ff. No. 9. (…)
What is a motif?
Argumentation
• From “narrative DNA” (Bruce 1996, cf. postmoderns) to “narrative genomics” (Malec)
• Perceived formal similarities between genetic code and “memetic code” in tale types
– Memes (Dawkins 1976):
• An idea, behavior or style that spreads from person to person within a culture
• Self-replicating unit of cultural transmission with potential significance in explaining human behavior and cultural evolution
– “Memetic pathway”: memory engraving by frequency-based (repetition-based) content
• If pertinent, the above make DNA sequencing techniques applicable to motif sequences
• Benefits:
– Memetics is short of measurable evidence
– Narrative analysis is short of evolutionary modelling tools
– Tale types as metadata may bridge the gap
Ingredients: tale types as motif sequences
300 The Dragon-Slayer. B421 B312.2 B11.2.3.1 B11.10 S262 T68.1 D1975 D1978.2 B11.11
300A The Fight on the Bridge. T511.5.1 F601 B631 B11.2.3.2 B11.2.3.3 B11.2.3.5 B11.2.3 B11.11 B401
301 The Three Stolen Princesses. H1385.1 F102.1 N773 B631 T615 F601 F451.5.2 F92 F96
301D* The Princess's Ring. T68.1 B11.11 K1933 C611 H94 L161
302 The Ogre's (Devil's) Heart in the Egg. B393 B500 D1834 R11.1 D152.2 D182.2 E710 K975.2 E711.1
302B Life Dependent on a Sword. T510 F601 T11.2 E711.10
302C* The Magic Horse. C611
303 The Twins or Blood-Brothers. T511.5.1 T511.1.1 T512 T589.7.1 E761 R111.1.3 K1932 H83 L161
303A Brothers Seek Sisters as Wives. T69.1 D231 R11.1 E715.1 R155.1
304 The Dangerous Night-Watch. F666.1 K912 H83 N711.2 H81.1 H81.1.1 T475.2 Q481 H11.1.1
305 The Dragon's Heart-Blood as Remedy. D1500.1.7.3.3 K1935
306 The Danced-out Shoes. H508.2 D1980 K625.1 D2131 F1015.1.1 H80 L161 F87
307 The Princess in the Coffin. C758.1 S223 E251 N825.2 D791.1.7 L162
310 The Maiden in the Tower (Petrosinella, Rapunzel). G279.2 S222.1 G204 R41.2 F848.1 F555 N455 D642.7 L162
311 Rescue by the Sister. R11.1 T721.5 C611 C227 C913 C920 R157.1 G561 K525
311B* The Singing Bag. K526
312 Maiden-Killer (Bluebeard). S62.1 C611 C920 K551 G551.1 G652
312D Rescue by the Brother. T511.3 F611.1
313 The Magic Flight. B261 S222 S222 S240 G465 H1104 H1113 H1154.8 H335.0.1
314 Goldener. S211 G462 B316 C611 C912 D672 B184.1.6 G461 C912
314A The Shepherd and the Three Giants. D817 L113.1.4 G500 B184.1 R222 L161
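A minimal sketch of how listing lines like those above could be turned into motif sequences for processing. The regular expression for motif codes (one capital letter, digits, optional dotted sub-numbers) and the function name `parse_tale_type` are assumptions made for illustration; they are not taken from the presentation.

```python
import re

# Assumed shape of a motif code: capital letter, digits, optional ".digits" parts.
MOTIF = re.compile(r"[A-Z]\d+(?:\.\d+)*")

def parse_tale_type(line):
    """Split one listing line into (type id, title, list of motif codes)."""
    type_id, *tokens = line.split()
    motifs = []
    # Walk backwards from the end of the line while tokens consist of motif codes only.
    while tokens:
        parts = MOTIF.findall(tokens[-1])
        if "".join(parts) != tokens[-1]:
            break  # reached the title
        motifs = parts + motifs  # also splits codes fused without a space
        tokens.pop()
    title = " ".join(tokens).rstrip(".")
    return type_id, title, motifs

type_id, title, motifs = parse_tale_type(
    "300 The Dragon-Slayer. B421 B312.2 B11.2.3.1 B11.10 S262 T68.1 D1975 D1978.2 B11.11"
)
print(type_id, title, len(motifs))
```

Reading the motif codes from the end of the line keeps the parser independent of whether the title ends with a period.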
Mutant screening in genetics:
Kinds of mutation 1 (DNA)
Kinds of mutation 2 (chromosomes)
Kinds of mutation 3 (chromosomes)
With tale types as motif sequences, we can answer questions like:
– Are there repeated motif substrings?
– Are there inverted motif substrings?
– Etc.
These are important for plot formation.
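The two questions above can be sketched with contiguous motif n-grams. This is an illustrative approach under stated assumptions, not the screening procedure used in the study (which was manual); the function names and the toy sequences `X`, `Y`, `Z` are hypothetical.

```python
from collections import defaultdict

def ngram_index(types, n=2):
    """Map every contiguous motif n-gram to the tale types in which it occurs."""
    index = defaultdict(list)
    for type_id, motifs in types.items():
        for i in range(len(motifs) - n + 1):
            index[tuple(motifs[i:i + n])].append(type_id)
    return index

def repeated_substrings(types, n=2):
    """N-grams occurring more than once, across types or within one type."""
    return {g: ids for g, ids in ngram_index(types, n).items() if len(ids) > 1}

def inverted_substrings(types, n=2):
    """N-grams that also occur in reversed order somewhere in the data."""
    index = ngram_index(types, n)
    found = {}
    for gram, ids in index.items():
        rev = gram[::-1]
        # gram < rev reports each forward/reversed pair only once
        if gram < rev and rev in index:
            found[gram] = (ids, index[rev])
    return found

# Hypothetical toy motif strings, not taken from ATU.
toy = {"X": ["a", "b", "c", "d"], "Y": ["b", "c", "e"], "Z": ["c", "b"]}
print(repeated_substrings(toy))
print(inverted_substrings(toy))
```

Restricting the search to contiguous n-grams is a design choice; gapped matches would need a subsequence method instead.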
The genetic-memetic parallel
Memetic:
Alphabet      Buildup                                 Result
Character     sequence                                Lexeme (?)
Lexeme (?)    sequence                                Meme/motif
Meme/motif    sequence                                Story/tale
Story/tale    set (inherent but not sufficient?)      Corpus

Genetic:
Alphabet      Buildup                                 Result
Nucleotide    sequence                                Amino acid
Amino acid    sequence                                Gene
Gene          sequence                                Chromosome
Chromosome    set (inherent but not sufficient)       Cell
II. Experiment design
• Instead of natural language (original) texts, metadata used: tale types from ATU and motifs from AaTh
– Zooming in = decreased content granularity
• Tale types as motif strings, processed as a corpus
• Binary vs. frequency-based matrix of 219 types x 1202 motifs (“Tales of magic”, types 300-749)
• Block (2-mode) clustering for motif co-occurrence analysis (HCE-3)
• Manual screening of motif strings for mutation types
• Network analysis and visualization
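As a rough illustration of the binary vs. frequency-based matrices mentioned above, the following sketch builds both from motif strings using plain Python lists. The function name `build_matrices` and the row/column ordering are assumptions; the two sample strings are types 313 and 312D from the listing earlier in the deck.

```python
from collections import Counter

def build_matrices(types):
    """Return (type ids, motif ids, frequency matrix, binary matrix)."""
    type_index = list(types)
    # One column per distinct motif, in sorted order (an arbitrary but stable choice).
    motif_index = sorted({m for motifs in types.values() for m in motifs})
    freq = []
    for type_id in type_index:
        counts = Counter(types[type_id])
        freq.append([counts[m] for m in motif_index])
    # The binary matrix records only presence or absence of a motif in a type.
    binary = [[1 if c else 0 for c in row] for row in freq]
    return type_index, motif_index, freq, binary

types = {
    "313": ["B261", "S222", "S222", "S240", "G465", "H1104", "H1113", "H1154.8", "H335.0.1"],
    "312D": ["T511.3", "F611.1"],
}
type_index, motif_index, freq, binary = build_matrices(types)
col = motif_index.index("S222")
print(freq[0][col], binary[0][col])
```

The two matrices differ only where a motif is repeated within one type, as S222 is in type 313.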
III. Results
• Findings are only indicative; however:
– The top observation unit is not the single motif but groups of co-occurring motifs (multiplets)
– Motif sequences show signs of recombination in the storytelling process; most chromosome mutation types appear even in a limited sample
– Motif strings form highly complex networks
Motif multiplets
• Tale types are a shorthand for the originals
• Motifs sometimes co-occur in tale types, i.e. motifs are not the ultimate observation units
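Motif co-occurrence can be illustrated by counting motif pairs that appear together in more than one tale type. This is a simple stand-in for the block clustering used in the study, not a reimplementation of it; the function name `cooccurring_pairs` and the `min_types` threshold are assumptions. The sample strings are types 311, 312 and 314 from the listing earlier in the deck.

```python
from collections import Counter
from itertools import combinations

def cooccurring_pairs(types, min_types=2):
    """Count unordered motif pairs by the number of tale types containing both."""
    counts = Counter()
    for motifs in types.values():
        # set() so that a motif repeated within one type is counted once per type
        for pair in combinations(sorted(set(motifs)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_types}

types = {
    "311": ["R11.1", "T721.5", "C611", "C227", "C913", "C920", "R157.1", "G561", "K525"],
    "312": ["S62.1", "C611", "C920", "K551", "G551.1", "G652"],
    "314": ["S211", "G462", "B316", "C611", "C912", "D672", "B184.1.6", "G461", "C912"],
}
print(cooccurring_pairs(types))
```

On these three strings the only pair shared by two types is C611 with C920 (types 311 and 312).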
Tale type and chromosome mutations are similar
• On a sample of 219 types x 1202 motifs, hints were found of insertion/deletion, duplication and, possibly, transposition; the sample was not sufficient to find inversion as well
• Screening on a larger sample is necessary
Tale mutant screening: The needle in the haystack
IV. Future research directions
• Sequence mining is feasible: find the longest common motif subsequences in motif strings as if they were DNA
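A minimal sketch of the longest common subsequence idea, using the standard dynamic-programming algorithm on two motif strings. The function name is an assumption, and nothing here is claimed to be the authors' planned implementation; the sample strings are types 300 and 301D* from the listing earlier in the deck.

```python
def longest_common_subsequence(a, b):
    """Return one longest common (not necessarily contiguous) subsequence of a and b."""
    # table[i][j] holds the LCS length of a[:i] and b[:j]
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Backtrack from the bottom-right corner to recover the motifs themselves.
    result = []
    i, j = len(a), len(b)
    while i and j:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]

t300 = ["B421", "B312.2", "B11.2.3.1", "B11.10", "S262", "T68.1", "D1975", "D1978.2", "B11.11"]
t301d = ["T68.1", "B11.11", "K1933", "C611", "H94", "L161"]
print(longest_common_subsequence(t300, t301d))
```

Because a subsequence allows gaps, T68.1 and B11.11 are matched across the two types even though other motifs stand between them in type 300.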
Motif network processing and visualization
• The network points to fitness landscapes and to evolutionary algorithms: genetic algorithms (GA) and genetic programming (GP)
• The network of motif connections by directed graphs suggests that tale plots as consolidated pathways of content help one memorize culturally engraved messages
• Identifying plot direction with graph direction, we anticipate a connection between such networks and fitness landscapes (e.g. Waddington’s epigenetic landscape)
Toward storytelling as a landscape
Translate motif chains to a directed graph
Convert the directed graph into a fitness landscape
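The first of the two steps above, translating motif chains into a directed graph, might look like the sketch below. Edges link each motif to its successor and are weighted by how often the transition occurs; the weighting and the function name `motif_graph` are assumptions. The second step (the fitness landscape) is not attempted here, since the slides do not specify the conversion. The sample strings are types 300 and 301D* from the listing earlier in the deck.

```python
from collections import defaultdict

def motif_graph(types):
    """Build a weighted directed graph: (source motif, target motif) -> count."""
    edges = defaultdict(int)
    for motifs in types.values():
        # Each adjacent pair in a motif string becomes one directed edge.
        for src, dst in zip(motifs, motifs[1:]):
            edges[(src, dst)] += 1
    return dict(edges)

types = {
    "300": ["B421", "B312.2", "B11.2.3.1", "B11.10", "S262", "T68.1", "D1975", "D1978.2", "B11.11"],
    "301D*": ["T68.1", "B11.11", "K1933", "C611", "H94", "L161"],
}
graph = motif_graph(types)
print(len(graph), graph[("T68.1", "B11.11")])
```

Edge direction follows the order of motifs in the string, which matches the idea of identifying plot direction with graph direction.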
Thank you for your attention!
Borås, 11.05.2012