Controlled Vocabularies and Text Mining - Use Cases at the Goettingen
![Page 1: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/1.jpg)
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library
Ralf Stockmann (UGOE)
![Page 2: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/2.jpg)
[Diagram: Data Resources, Computing, Research Environments]
Textmining • Enhanced Context-Search • Multilingual Access • DBPedia, ... • Visualisation • Metadata • OCR/Fulltext • Named Entity Recognition • Catalog Data • Crowdsourcing • Annotation Tools • Relationship Graphs • Linked Open Data • Ontologies
Scholars • Libraries • Repositories
![Page 3: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/3.jpg)
Use case #1:
eAqua
![Page 4: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/4.jpg)
Project: eAqua
• Partners:
  – Institute of Computer Science, Computational Linguistics, Leipzig (Büchler, Eckart, Heyer, Baumgardt)
  – SUB Göttingen (Stockmann, Kothe, Mahnke)
• Comparing semantic graphs between
  – Headings of journal articles and
  – Fulltext of the same articles
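The slides do not describe how eAqua builds or compares its semantic graphs. As a rough illustration of the idea only, the sketch below assumes a very simple stand-in: a word co-occurrence graph per corpus and the Jaccard overlap of the two edge sets. The function names and the toy texts are hypothetical and not taken from the project.

```python
from itertools import combinations


def cooccurrence_edges(texts, min_len=4):
    """Return the set of word pairs that co-occur in at least one text."""
    edges = set()
    for text in texts:
        # Crude tokenisation: lowercase, strip punctuation, drop short words.
        words = {w.strip(".,;:!?\"'()").lower() for w in text.split()}
        words = sorted(w for w in words if len(w) >= min_len)
        # Sorting first means each pair is stored in one canonical order.
        edges.update(combinations(words, 2))
    return edges


def jaccard(a, b):
    """Jaccard similarity of two edge sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data standing in for article headings and fulltexts.
headings = ["Socialism and the labour movement", "Labour movement history"]
fulltexts = [
    "The labour movement shaped socialism in the nineteenth century.",
    "A history of the movement and its unions.",
]

heading_graph = cooccurrence_edges(headings)
fulltext_graph = cooccurrence_edges(fulltexts)
shared = heading_graph & fulltext_graph
overlap = jaccard(heading_graph, fulltext_graph)
```

Edges that appear only in the fulltext graph point to associations the headings alone do not reveal, which is the kind of difference the comparison is meant to surface.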
![Page 5: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/5.jpg)
Search Term „socialism“ on title elements
![Page 6: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/6.jpg)
„Mephisto“ on fulltext
![Page 7: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/7.jpg)
Use case #2:
Europeana 4D visualisation
- Prof. Dr. Gerik Scheuermann
- Stefan Jänicke
- Christian Mahnke
- Ralf Stockmann
![Page 8: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/8.jpg)
Concept
MAP
![Page 9: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/9.jpg)
Concept
MAP TIMELINE
![Page 10: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/10.jpg)
![Page 11: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/11.jpg)
Refinement
• Multiple data layers
• Interaction
• Animation
• Aggregation of data
• Connections
• Drilldown
• Historical/custom maps
• Result table
• Splitting Datasets
• ...
![Page 12: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/12.jpg)
![Page 13: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/13.jpg)
Technological Framework
• OpenLayers
• Simile Timeline/Timeplot
• GeoNames (Geoparser...)
• Explorer Canvas (Google)
• GeoServer (OpenStreetMap, Google Maps)
• Google Web Toolkit (GWT)
• KML (XML)
![Page 14: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/14.jpg)
KML Data Model
• WHAT? NAME, description, url
• WHERE? COORDINATES, address
• WHEN? TIMESTAMP, range
(UPPERCASE = mandatory, lowercase = optional)
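A minimal sketch of how one record of this data model could be written as a KML Placemark, using only Python's standard library. The mapping to the standard KML 2.2 elements (`name`, `description`, `address`, `Point/coordinates`, `TimeStamp/when`, `TimeSpan`) is an assumption; the e4D KML specification is the authority on what the prototype expects. Putting the url into `description` is likewise an assumption, and the sample records are hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)  # serialise KML as the default namespace


def q(tag):
    """Qualify a tag name with the KML namespace."""
    return f"{{{KML_NS}}}{tag}"


def placemark(name, lon, lat, when, description=None, url=None, address=None):
    """Build one Placemark: WHAT (name), WHERE (lon/lat), WHEN (when).

    `when` is a date string, or a (begin, end) tuple for a range.
    """
    pm = ET.Element(q("Placemark"))
    ET.SubElement(pm, q("name")).text = name
    if address:
        ET.SubElement(pm, q("address")).text = address
    if description or url:
        # Assumption: the optional url is carried inside the description.
        text = " ".join(part for part in (description, url) if part)
        ET.SubElement(pm, q("description")).text = text
    if isinstance(when, tuple):
        span = ET.SubElement(pm, q("TimeSpan"))
        ET.SubElement(span, q("begin")).text = when[0]
        ET.SubElement(span, q("end")).text = when[1]
    else:
        stamp = ET.SubElement(pm, q("TimeStamp"))
        ET.SubElement(stamp, q("when")).text = when
    point = ET.SubElement(pm, q("Point"))
    # KML orders coordinates as longitude,latitude.
    ET.SubElement(point, q("coordinates")).text = f"{lon},{lat}"
    return pm


def kml_document(placemarks):
    """Wrap Placemark elements in a kml/Document and return the XML text."""
    kml = ET.Element(q("kml"))
    doc = ET.SubElement(kml, q("Document"))
    doc.extend(placemarks)
    return ET.tostring(kml, encoding="unicode")


# Hypothetical sample records, for illustration only.
xml = kml_document([
    placemark("Example record", 9.93, 51.53, "1900"),
    placemark("Example range", 12.37, 51.34, ("1900", "1910"),
              description="Optional text"),
])
```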
![Page 15: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/15.jpg)
Exchange Format: KML (XML)
![Page 16: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/16.jpg)
Questionnaire
Who was the first football player born on the continent to play for an English team?
![Page 17: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/17.jpg)
Questionnaire
In how many years were more museums established
in the western states of the USA than
in the eastern states?
![Page 18: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/18.jpg)
Questionnaire
In how many years were more museums
founded in the western states
of the USA than in the eastern states?
![Page 19: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/19.jpg)
![Page 20: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/20.jpg)
Questionnaire
Iraqi police claim „only 265 civilian casualties since March 2007 in
Baghdad“. Are they right?
![Page 21: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/21.jpg)
Datasets
• Library catalog
• Flickr
• IMDB
• DBpedia
• WikiLeaks
Flickr: „tsunami“
![Page 22: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/22.jpg)
Use your own data in 5 easy steps!
1. Take a look at the .kml specification: http://tinyurl.com/e4d-kml
2. Build your own KML dataset
3. Upload it to a webserver
4. Put the URL into the prototype at http://tinyurl.com/e4d-demo
5. Share your set via the magnetic link!
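Before uploading in step 3, it can help to check that every Placemark carries the mandatory WHAT, WHERE and WHEN information. The sketch below is one way to do that with Python's standard library. It assumes these fields map to the standard KML 2.2 elements `name`, `coordinates` and `TimeStamp/when` (or `TimeSpan/begin`); the e4D KML specification linked in step 1 is authoritative. The function name and the sample dataset are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"k": "http://www.opengis.net/kml/2.2"}


def missing_fields(kml_text):
    """Return (placemark_index, [missing field names]) for incomplete entries."""
    root = ET.fromstring(kml_text)
    problems = []
    for i, pm in enumerate(root.iterfind(".//k:Placemark", NS)):
        # findtext returns the default ("") when the element is absent.
        name = pm.findtext("k:name", default="", namespaces=NS)
        coords = pm.findtext(".//k:coordinates", default="", namespaces=NS)
        when = (pm.findtext("k:TimeStamp/k:when", default="", namespaces=NS)
                or pm.findtext("k:TimeSpan/k:begin", default="", namespaces=NS))
        missing = [label for label, value in
                   (("name", name), ("coordinates", coords), ("time", when))
                   if not value.strip()]
        if missing:
            problems.append((i, missing))
    return problems


# Hypothetical dataset: the second Placemark has no time information.
sample = """<kml xmlns="http://www.opengis.net/kml/2.2"><Document>
  <Placemark>
    <name>Complete entry</name>
    <TimeStamp><when>1900</when></TimeStamp>
    <Point><coordinates>9.93,51.53</coordinates></Point>
  </Placemark>
  <Placemark>
    <name>Entry without a date</name>
    <Point><coordinates>12.37,51.34</coordinates></Point>
  </Placemark>
</Document></kml>"""

report = missing_fields(sample)
```

An empty report means every Placemark has the three mandatory fields under the stated assumptions; it does not guarantee that the prototype accepts the file.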
![Page 23: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen](https://reader030.fdocuments.us/reader030/viewer/2022020306/5550690eb4c905cc0f8b467b/html5/thumbnails/23.jpg)
Resources
• e4D info website:
• Europeana thoughtLab: http://www.europeana.eu/portal/thoughtlab.html