Counting Tales: Towards a Computational Narratology
John Laudun
University of Louisiana
[email protected]
johnlaudun.org
A Pirate in a Tree
Immigration Layers
General historical pattern: African, Colonial French, German, Acadian, English
Rayne immigration: African, German, Acadian, English
Rayne’s Location within Louisiana
Rayne, Louisiana (雷恩)
• Population: 8,500.
• 34% of the population in 2000 was African American.
• African Americans tend to live in discrete neighborhoods.
• Blacks and whites interact in commercial sectors but not in residential areas.
• African American residential areas have a number of features that allow for the creation and maintenance of a distinct folk culture.
Oscar Babineaux
• Lived in Rayne his entire life.
• Well known “shit talker.”
• Talking shit can include: insults, jokes, and a rhymed form of speaking known by folklorists as “toasts” ... and now we know it also includes legends.
• Shit talking has mostly been investigated as an urban phenomenon, but it would appear to have been active in rural areas since at least the days of Zora Neale Hurston’s Mules and Men.
Summary of Legend
• Babineaux stops by his family home.
• People are digging in the backyard for treasure.
• Joins family in prayer.
• He and his nephew take water to diggers, encounter a pirate in a tree.
• Pirate asks for something to drink.
• They give him some.
• Pirate asks again.
• They get scared and don’t give him anything.
• Pirate threatens them.
• They run. A shovel flies through the air and lands in a tree.
Babineaux Text
Like I said my family was weird. They liked to dig for money and stuff. Said my grandfather had left us some money. And they was digging for it. So one day we went, and I was at work, so I can see, we at a country spot, like our property.
So I can see a lot of people dressed in white. So I’m curious me. I said, “Well, shit, what the hell is everybody doing out there dressed in white? I wanna see.” So I goes out there. So they tell me, “You’re working right now, just go home come back. You know, come back after work.”
So I goes back, man, after work. So, they all in the house. We all praying man, everyone’s on their knees praying. They got an excavator in the back yard, digging. [Laughs.] You understand? Find this money, I guess. We’re on our knees, man, we’re praying. It’s like in the pit of the summer, like here. No wind nothing.
They had a wind come through the house. That wind was so strong my aunt was holding onto the door like that, and both her legs was in the air. That’s how strong the wind was. In the house.
So they said … they picked me, my nephew — the one I was telling you that talk all that shit, and my little niece to go bring some water to the workers in back, the one that was doing the work. So we got to walking. We passed on the side of the house to bring them.
So my nephew said, “Say man you see that guy in the tree?”
I said, “Man fuck I don’t see nobody in no tree.”
He said, “Yeah man he be right there sitting on that limb.”
I said, “I don’t see nobody man.”
I’m getting scared now.
Man I don’t see nobody.
But he’s seeing this, you know.
So he said— I said, “How he look?”
“It’s a guy,” he said, “it’s a guy dressed in a pirate suit, man.”
He said, “He got a pirate hat on. He got a pirate jacket.” And he started talking to him.
The guy in the tree started talking to him, while he’s telling me this. But the guy in the tree is telling him: shut up don’t tell me that.
So he telling me, “Man, look he right there. You can’t see him? Look he right there on that branch.”
He say, “He want something more to drink.”
You know, because what they had did: they’d put a bowl in the back yard, under this tree, with some alcohol in it. You understand?
And I don’t know if it was the sun that would dissolve it, but it would be gone.
Okay, so he say he say, “Man, he want another drink.”
So I said, “Fuck man don’t tell me that.”
I wanna get back in the house.
I said, “I don’t see nobody up there.”
So we kept on walking. We went out there. We brung them some water. So on our way back.
Look at him.
He say, “See you, you son of a bitch.”
He say, “You don’t wanna give me another drink, huh?”
He say, “You gonna be just like me.”
He say, “You see this here peg leg?”
He say, “You going to be just like me.”
He say, “For this out here y’all are going to have to lose something.”
So, man, it got kind of scared. We started walking fast. By the time we got to the house, I broke out a run. A shovel, man, come from the back of the house. I mean full force. That shovel stuck in that tree so deep we had to dig it out with an axe. It stuck … you know with a shovel, it’s hard to stick a shovel into anything. That shovel went inside the tree halfway.
Method & Madness
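[Figure: bar chart of total and unique word counts for each legend text, vertical axis 0 to 1100. Texts in chart order: LOH 164, LOH 165, LOH 160, LAU 14, LAU 13, LOH 157, LOH 162, ANC 88, LOH 161, LOH 159, LOH 163, LOH 162b, LOH 158, ANC 91, ANC 90, ANC 89. Series: Total, Unique.]
Figure 1: Graph of Length and Lexical Diversity of Oral Legend Texts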
Details of Lexical Diversity of Oral Texts
TEXT       TOTAL WORDS   UNIQUE WORDS   LEXICAL DIVERSITY (UNIQUE/TOTAL)
ANC 88             331            142   0.43
ANC 89             153             83   0.54
ANC 90             175             86   0.49
ANC 91             176            105   0.60
BRO-01             117             74   0.63
BRO-02              67             50   0.75
BRO-03             136             90   0.66
BRO-04             122             79   0.65
LAU 13             375            175   0.47
LAU 14             655            207   0.32
LOH 157            364            166   0.46
LOH 158            193            106   0.55
LOH 159            282            144   0.51
LOH 160            761            287   0.38
LOH 161            295            129   0.44
LOH 162            332            144   0.43
LOH 162b           194            108   0.56
LOH 163            209            114   0.55
LOH 164           1025            318   0.31
LOH 165            905            277   0.31
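The last column of the table is the number of unique words divided by the total number of words. A minimal sketch of that calculation follows. The function name lexical_diversity is hypothetical, and the tokenization (lowercasing, then keeping only letters, apostrophes, and hyphens) is an assumption; the counts in the table may have been produced with different rules.

```python
import re

def lexical_diversity(text):
    """Return (total words, unique words, unique/total) for a text."""
    # Assumption: keep letters, apostrophes, and hyphens; all else becomes a space.
    words = re.sub(r"[^a-zA-Z'-]", " ", text).lower().split()
    total = len(words)
    unique = len(set(words))
    ratio = unique / total if total else 0.0
    return total, unique, ratio

# Check against one row of the table: ANC 88 has 331 total and 142 unique words.
print(round(142 / 331, 2))  # 0.43
```

Lowercasing matters here: without it, "The" and "the" would count as two unique words and the ratio would come out slightly higher.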
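[Figure: bar chart of total and unique word counts for each oral history text, vertical axis 0 to 200. Texts in chart order: morris-02, goble-03, goble-04, davis-02, morris-01, goble-05, davis-03, goble-01, goble-02, davis-01, bridgwaters-01. Series: Total, Unique.]
Figure 1: Graph of Length and Lexical Diversity of Oral History Texts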
[Figure: bar chart comparing legend and oral history texts, vertical axis 0 to 1500. Texts labeled on the axis: LOH 164, LOH 160, LAU 13, LOH 162, LOH 161, LOH 163, LOH 162b, ANC 91, goble-03, goble-04, BRO-04, davis-02, goble-05, goble-01, davis-01, bridgwaters-01. Series: Total Words for Legend, Unique Words for Legend, Total Words for Oral History, Unique Words for Oral History.]
Zipf’s law states that given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table.
The most frequent word will occur approximately twice as often as the second most frequent word and three times as often as the third.
[Figure: power law curve]
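Zipf's law as stated above predicts that the word at rank r occurs about 1/r as often as the most frequent word. The sketch below compares that prediction with the ten highest counts from the "Most Frequent Words in Legends" list. The helper name zipf_expected is hypothetical, and the simple 1/rank form (exponent of exactly 1) is an assumption made for illustration.

```python
# Top ten words and counts, copied from the "Most Frequent Words in Legends" list.
observed = [("the", 251), ("and", 202), ("was", 201), ("he", 191), ("it", 175),
            ("to", 154), ("a", 143), ("they", 135), ("that", 121), ("i", 121)]

def zipf_expected(top_count, rank):
    """Count predicted at a given rank if frequency is proportional to 1/rank."""
    return top_count / rank

top_count = observed[0][1]
for rank, (word, count) in enumerate(observed, start=1):
    expected = zipf_expected(top_count, rank)
    print(f"{rank:>2} {word:<5} observed={count:>3} expected={expected:6.1f}")
```

In this small corpus the counts fall off more slowly than the simple prediction: at rank 2 the list shows 202 occurrences against an expected 125.5.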
Most Frequent Words in Legends
the 251
and 202
was 201
he 191
it 175
to 154
a 143
they 135
that 121
i 121
there 105
of 102
in 100
said 77
you 67
had 66
this 63
so 63
on 48
but 47
me 46
went 45
out 44
him 42
well 40
we 40
up 40
money 40
back 39
one 38
man 38
got 36
his 34
all 34
with 33
where 33
just 33
know 32
see 29
when 28
them 28
like 28
go 27
told 26
she 26
little 26
what 25
old 25
now 25
my 24
here 23
some 22
about 22
time 21
over 21
come 21
at 21
something 20
house 19
Comparison of Lengths and Lexical Diversity
                  LEXICAL DIVERSITY                  LENGTH (WORDS)
COLLECTION        AVERAGE  MINIMUM  MAXIMUM          AVERAGE  MINIMUM   MAXIMUM
Legends           0.46     0.31     0.60 (0.75)      343      (67) 153  1025
Oral Histories    0.65     0.55     0.79             111      43        196
Why count words?
• Because we haven’t, and we should have.
• In oral discourse, especially in traditional oral discourse, every word matters, and we should have a better understanding of what each word does.
• Word counts begin to give us a baseline for understanding a greater variety of human verbal activity. (That is, let’s join up with the linguists. Or rather, they are already getting there with stylistics; wouldn’t it be nice not to get pushed out?)
Potential Problem
• Word counts depend on accurate transcripts. [Computer scientists have a saying: “Garbage in, garbage out.”]
• The good news is that we have four decades of talking about this and doing it.
• What this means is that we can take our accurate transcriptions and begin to really understand how many and which words humans need to create their reality, or alternate realities.
Why Count Words? (Redux)
• We can assume that all genres will have a wide variety of forms, but, looked at from a statistical perspective, are there some patterns there that might lead us to more interesting investigations?
• Will folktales emerge as typically longer, and will that length be a function of the necessity of setting up a more complex storyworld?
• E.g., Ray Hicks’ telling of “Jack and the Fire Dragon” is 1977 words. (See handout.)
• What about lexical diversity and frequencies?
• And how do these simple matters of counting relate to morphologies and semantic networks?
After We Count Words...
• I said earlier that “in oral discourse, every word matters.” That’s not entirely true.
• As we saw in the previous sets of statistics, there are a lot of words that don’t matter very much when we think about the meaning of a story:
the 251
and 202
was 201
he 191
it 175
to 154
a 143
they 135
that 121
i 121
there 105
of 102
in 100
said 77
you 67
had 66
this 63
so 63
on 48
but 47
me 46
went 45
out 44
him 42
well 40
we 40
up 40
money 40
back 39
one 38
man 38
got 36
his 34
all 34
with 33
where 33
just 33
know 32
see 29
• Linguists call words that contribute to the structure of an utterance, but not to its meaning, function words.
• They often eliminate these words from consideration by using stop lists when running a computer program to assess various dimensions of a text, like calculating the lexical diversity of texts.
• A word like said, for example, is typically dropped from consideration by linguists. In my own work, said is incredibly important both for its syndetic and traditionalizing functions (Laudun 2012).
• What remains are content words.
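A stop list can be sketched as a set of words to skip before counting. The example below is illustrative only: the function name content_word_counts, the short STOP_WORDS set, and the keep parameter are all hypothetical. The keep parameter shows one way to retain a word such as said even when a standard stop list would drop it.

```python
from collections import Counter

# A tiny illustrative stop list; real stop lists are much longer.
STOP_WORDS = {"the", "and", "was", "he", "it", "to", "a", "they", "that", "i",
              "there", "of", "in", "you", "had", "this", "so", "on", "but"}

def content_word_counts(words, stop_words=STOP_WORDS, keep=("said",)):
    """Count words not on the stop list; words in `keep` are never dropped."""
    kept = [w for w in words if w not in stop_words or w in keep]
    return Counter(kept)

sample = "he said they had buried the money in the yard".split()
print(content_word_counts(sample).most_common())
```

Making the exceptions an explicit parameter keeps the decision about which words count as meaningful visible in the analysis rather than buried in a borrowed word list.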
Content Words in Context
g to see . BA : There was n't money buried there ? SF : Supposedly. That 's wh
yed around his grave a lot . He was buried , still buried , where we lived . H
grave a lot . He was buried , still buried , where we lived . He was buried in
ll buried , where we lived . He was buried in the yard where I lived . They ha
m . Killed him . His wife and Billy buried him right there . That night as it
ppi . That night , supposedly , she buried her money on the other side of Jean
nd early American coins . They were buried there . They said it was Lafitte .
West was the outlaw that did that ( buried the money ) , according to the Mexi
s told the story that this money is buried in there. By the time they made the
n't remember where it was. They had buried all this money he had. What it was
tly , this money was supposed to be buried there. That money , as far as anybo
re 's always been a claim that they buried whatever they had somewhere in that
ime ago ( pause ) they claimed they buried their money. That was back when the
was seven-foot deep. You could 've buried a car in it. There was some of them
hat , well usually treasure , money buried around. These three fellows came up
ese treasure things, like you hunt buried treasure with. He had one of those
Key Words in Context allows us to see, at a glance, exactly where words occur in a sentence, but what if we want to see larger patterns? For example, what if we wanted to see which words are regularly found together, that is, collocated?
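A Key Words in Context listing can be produced with a few lines of code. The sketch below is a simplified stand-in, not the tool used for the listing in these slides: the function name kwic is hypothetical, and it measures context in tokens, whereas the listing above appears to be cut at a fixed number of characters.

```python
def kwic(tokens, keyword, width=5):
    """Return one line per occurrence of keyword with `width` tokens of context."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left context so the keyword lines up in a column.
            lines.append(f"{left:>30} [{token}] {right}")
    return lines

text = "He was buried , still buried , where we lived . He was buried in the yard"
for line in kwic(text.split(), "buried", width=4):
    print(line)
```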
Key words can also be said to co-occur across several texts, giving us a better sense of how texts are related to each other. Currently, this kind of analysis, often used in topic modeling, treats texts as bags of words.
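To make the "bag of words" idea concrete, the sketch below reduces two short texts to word counts and measures how much vocabulary they share. This is not topic modeling itself (LDA and LSA are considerably more involved); it only illustrates the representation those methods start from. The function names and the two sample sentences are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Reduce a text to word counts, discarding word order."""
    return Counter(text.lower().split())

def cosine_similarity(bag_a, bag_b):
    """Cosine of the angle between two count vectors (0 means no shared words)."""
    shared = set(bag_a) & set(bag_b)
    dot = sum(bag_a[w] * bag_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in bag_a.values()))
    norm_b = math.sqrt(sum(c * c for c in bag_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

a = bag_of_words("they buried the money in the yard")
b = bag_of_words("the money was buried there")
print(sorted(set(a) & set(b)))           # words the two texts share
print(round(cosine_similarity(a, b), 2))  # overall similarity of the two bags
```

Because word order is thrown away, "they buried the money" and "the money buried they" would produce identical bags, which is exactly the limitation the phrase "bags of words" points to.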
Software
Topic modeling
• MALLET for LDA (Latent Dirichlet Allocation)
• LSA (Latent Semantic Analysis) is built into Mac OS X, available elsewhere.
Word Statistics / KWIC / Word Placement
• Python, Python NLTK
• For everything else, R.
All of this information will be in the bibliography.
What Python looks like
#!/usr/bin/env python3
import glob
import re

# Map each file name to (total words, unique words).
files = {}
for fpath in glob.glob("*.txt"):
    with open(fpath) as f:
        # Replace everything except letters, apostrophes, and hyphens with spaces.
        fixed_text = re.sub("[^a-zA-Z'-]", " ", f.read())
    words = fixed_text.split()
    files[fpath] = (len(words), len(set(words)))
    print("Total Words:", len(words))
    print("Total Unique:", len(set(words)))

# Write one row per file: name, total words, unique words.
with open("wordstats.csv", "w") as f:
    for fname in files:
        print("%s,%s,%s" % (fname, files[fname][0], files[fname][1]), file=f)
That doesn’t look like folklore studies...
Neither did Propp or Lévi-Strauss...
So what?
None of this looks like folklore studies!
• Not as we currently practice it.
• We have gotten very good at the ethnographic description of texts in context. We understand motivated behavior very well.
• We left behind the study of the human mind (e.g., structuralism).
• Cognitive studies, and the adjacent field of creativity studies, have exploded in recent years and we need to make sure we are part of that conversation.
A Future for Folklore Studies?
• Performance theory / ethnomethodologies / ethnography of speaking give us some of the most accurate accounts of human verbal behavior in the world.
• Lord’s work on oral formulas is now foundational work in cognitive science. See: David Rubin’s Memory in Oral Tradition. (Sample chapter is in packet of papers.)
• Also in packet: computer scientists working on morphology, physicists’ mapping of myth.
Sorry for letter-sized pages: PDFs will be in Dropbox folder.
Laudun, John. 2012. “Talking Shit” in Rayne: How Aesthetic Features Reveal Ethical Structures. Journal of American Folklore 125 (497): 304–326.