TrollFinder: Geo-Semantic Exploration of a Very Large Corpus of Danish Folklore

Peter M. Broadwell, Postdoctoral Scholar, Digital Initiatives & Information Technology, UCLA Library
Timothy R. Tangherlini, Professor, Scandinavian Section and Department of Asian Languages, UCLA


Target Domain: The Folklore Collections of Evald Tang Kristensen

It was the old counselor from Skårupgård who came riding with four headless horses to Todbjærg church. He always drove out of the northern gate, and there by the gate was a stall. They could never keep that stall door closed. They had a farmhand who closed it once after it had sprung open. But one night, after he’d gone to bed, something came after the farmhand and it lifted his bed straight up to the rafters and crushed him quite hard. The farmhand shouted and asked it to stop lifting him up there. “No, you've tormented us, and now you’ll die…” I heard that’s how two farmhands were crushed to death. He wanted to close the door and then they never tried to close it again.

Evald Tang Kristensen, Danske Sagn vol. 4, no. 650
Told by Ane Margrete Jensdatter in October 1889

The Evald Tang Kristensen Collection of Danish folklore 1.0

ETK 1.0: story view

Topic and index browsing in the Danish folklore archive

The ETK Danish folklore collection version 2.0

The 2011 Shoah Foundation Institute RIPS Team
Rodrigo Mendoza Smith (ITAM), Margo Smith (Kenyon), Anna Kuznetsova (Duke), Peter Sugihara (Bard/Columbia)

Interface to the USC Shoah Foundation Visual History Archive

• Insert picture from iWitness – how it sucks

Shoah Foundation VHA map interface

Desired feature: place-specific topic keyword suggestion

Rank   Topic
1      sabotage
2      ghetto escapes
3      organizations
...    …

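Core approach: build a place-to-topic co-occurrence matrix

[Matrix illustration: rows are places (Grinderslev, Nykøbing Mors, ...); columns are topics (witch, cooking, ...); cells hold place/topic co-occurrence counts. Legible values on the slide: 101 and 35 in the first row, then 621...]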
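A minimal sketch of how such a place-to-topic co-occurrence matrix could be built. The slides do not show the actual implementation, so the function name `build_cooccurrence`, the input shape (one pair of place list and topic list per story), the choice to count each place/topic pair once per story, and the toy stories are all assumptions made for illustration.

```python
from collections import defaultdict


def build_cooccurrence(stories):
    """Count how often each place co-occurs with each topic.

    `stories` is an iterable of (places, topics) pairs, one per story.
    Assumption: a place/topic pair is counted once per story, no matter
    how many times either is mentioned within that story.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for places, topics in stories:
        for place in set(places):
            for topic in set(topics):
                matrix[place][topic] += 1
    return matrix


# Hypothetical toy corpus, not data from the collection.
stories = [
    (["Grinderslev"], ["witch", "cooking"]),
    (["Grinderslev", "Nykøbing Mors"], ["witch"]),
    (["Nykøbing Mors"], ["cooking"]),
]

matrix = build_cooccurrence(stories)
for place in sorted(matrix):
    print(place, dict(matrix[place]))
```

Each row of the resulting structure corresponds to a place and each inner key to a topic, which matches the row and column layout of the matrix described on the slide.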

Topic heat map with bar graph overlays for “witch”

[Map; labeled place: Grinderslev]

Normalizing by population density

[Map panels: population density based on 1901 census; population density circa 2000]

Story density: story-to-place mentions per km²

Normalized story density: places mentioned/person/km²
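The two density measures above can be sketched as follows. The slide gives only the units, so this is one possible reading: story density is place mentions divided by area, and the normalized value divides the mention count by population density (persons per km²). The function names and the numbers are hypothetical.

```python
def story_density(mentions, area_km2):
    """Story-to-place mentions per square kilometre."""
    return mentions / area_km2


def normalized_story_density(mentions, population, area_km2):
    """Mentions divided by population density (persons per km2).

    Assumption: this is one interpretation of the slide's
    "places mentioned/person/km2"; the original formula is not shown.
    """
    population_density = population / area_km2
    return mentions / population_density


# Hypothetical parish: 120 mentions, 2000 inhabitants, 40 km2.
raw = story_density(120, 40)
normalized = normalized_story_density(120, 2000, 40)
print(raw, normalized)
```

Under this reading, a sparsely populated area with many story mentions scores higher than a densely populated area with the same number of mentions.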

Effects of population density-based normalization for keyword “witch”

[Map panels: not normalized; normalized by population density. Labeled place in both panels: Grinderslev]

Grinderslev? Really?

From the Danish Folklore nexus…

Why Grinderslev?

• Site of a well-known Augustinian monastery, Grinderslev kloster, founded in the twelfth century.
• The monastery was built near a holy spring, Breum kilde, but was abandoned in the aftermath of the Reformation.
• The spring at Breum was subsequently associated with witchcraft.
• In 1686, Anne Madsdatter and her sister were burned at Breum, the last witch burning in Denmark (Bruun, 1920).
• Although this episode is well known in the study of Danish witchcraft, the persistent relationship between the area surrounding Grinderslev and stories about witchcraft has not been recognized previously, suggesting a topic for further, in-depth inquiry.
• Only a few of the stories mention Grinderslev itself; most mention places near Grinderslev, such as Breum.

Who you gonna call?

[Map legend: Ministers (blue), Cunning folk (red)]

[Map legend: Ghosts (blue), Revenants (purple)]

DigDag: The Digital Atlas of Danish Historical-Administrative Geography

A spatial query tool
etkspace.scandinavian.ucla.edu/maps/spatialquery.html

Flipping it around

Term Frequency – Inverse Document Frequency (TF-IDF)

Term Frequency

Document Frequency

Locations as Documents

RF-IPF: Ranking topics in a geographic region

RF-IPF = RF * log( |P| / |p ∈ P : t ∈ p| )

RF = region frequency: the number of times topic t co-occurs with places in the region, normalized by the total number of place/topic co-occurrences in the region

|P| = total number of places mentioned in the corpus

|p ∈ P : t ∈ p| = total number of places in the corpus that co-occur in stories with the topic t
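The definitions above can be turned into a short ranking routine. This is a sketch under stated assumptions, not the authors' code: the function name `rf_ipf`, the nested-dictionary input, the use of the natural logarithm (the slide does not give a base), the assumption that every place in the corpus appears as a key of the input, and the toy counts are all hypothetical.

```python
import math


def rf_ipf(matrix, region_places):
    """Rank topics for a region by RF * log(|P| / |{p in P : t in p}|).

    `matrix` maps place -> {topic: co-occurrence count}.
    Assumption: the keys of `matrix` are all places in the corpus (P).
    """
    # Number of places in the corpus that co-occur with each topic.
    place_freq = {}
    for topics in matrix.values():
        for topic, count in topics.items():
            if count > 0:
                place_freq[topic] = place_freq.get(topic, 0) + 1

    # Topic counts summed over the places inside the region.
    region_counts = {}
    for place in region_places:
        for topic, count in matrix.get(place, {}).items():
            region_counts[topic] = region_counts.get(topic, 0) + count

    total = sum(region_counts.values())
    if total == 0:
        return []

    scores = {}
    for topic, count in region_counts.items():
        rf = count / total  # region frequency
        ipf = math.log(len(matrix) / place_freq[topic])  # inverse place frequency
        scores[topic] = rf * ipf
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical counts, not values from the collection.
toy_matrix = {
    "Grinderslev": {"witch": 3, "cooking": 1},
    "Breum": {"witch": 2},
    "Nykøbing Mors": {"cooking": 4},
    "Skive": {"cooking": 2, "silver": 1},
}

ranked = rf_ipf(toy_matrix, ["Grinderslev", "Breum"])
for topic, score in ranked:
    print(f"{topic}: {score:.4f}")
```

In the toy data, "cooking" co-occurs with three of the four places, so its inverse place frequency is low and it ranks below "witch" for the Grinderslev and Breum region. This is the same down-weighting of widespread terms that TF-IDF applies to documents.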

Place/topic co-occurrences in the vicinity of Grinderslev

Raw
1. bande (to curse)
2. hale (a tail, prob of a snake)
3. sølv (silver)
4. tigge (to beg)
5. vælte (to tip)
6. læse (to read)
7. flyde (to flow)
8. paste (to take care of animals)
9. øre (ear)
10. herre (lord)
11. lindorm (supernatural snake)
12. stille (to place)
13. østen (to the east)

Normalized
1. kusk (carriage driver)
2. reste (remainder)
3. grønning (village green)
4. rådelig (recommended)
5. boel (a large farm)
6. kristenblod (Christian blood)
7. om kap (race or competition)
8. indhylle (enshroud)
9. søkke (to sink down)
10. mæt (sated)
11. konfirmation (confirmation)
12. tjørn (hawthorn)
13. mane (to conjure)

RF-IPF
1. paste (to take care of animals)
2. flyde (to flow)
3. hale (tail)
4. grønning (village green)
5. borggård (fort)
6. sølv (silver)
7. bande (to curse)
8. søkke (to sink down)
9. herre (lord)
10. læse (to read)
11. mane (to conjure down)
12. lindorm (supernatural snake)
13. vælte (to tip over)
