Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Controlled Vocabularies and Text Mining – Use Cases at the Goettingen State and University Library Ralf Stockmann TextGrid Workshops – July 13th 2011


Page 1: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Controlled Vocabularies and Text Mining – Use Cases at the Goettingen State and University Library

Ralf Stockmann

TextGrid Workshops – July 13th 2011

Page 2: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Text Mining

Enhanced Context-Search

Multilingual Access

DBpedia, ...

Visualisation

Metadata

OCR/Fulltext

Named Entity Recognition

Catalog Data

Crowdsourcing

Annotation Tools

Relationship Graphs

Linked Open Data

Ontologies

Scholars

Libraries

Repositories

Page 3: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library


Use case #1:

eAqua

Page 4: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Project: eAqua

• Partners:
  – Institute of Computer Science, Computational Linguistics, Leipzig (Büchler, Eckart, Heyer, Baumgardt)
  – SUB Göttingen (Stockmann, Kothe, Mahnke)

• Comparing semantic graphs between
  – headings of journal articles and
  – full text of the same articles
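The slides do not say how eAqua builds or compares its semantic graphs. As a rough illustration of the idea only, the following sketch assumes a simple co-occurrence graph (terms linked when they appear in the same text unit) and compares the neighbours of one term in a title graph and a full-text graph with a Jaccard overlap. The function names, the tokenisation, and the toy texts are all hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(texts, min_len=4):
    """Map each term to the set of terms that share a text unit with it."""
    graph = defaultdict(set)
    for text in texts:
        # Crude tokenisation (an assumption): lowercase words of >= min_len letters.
        terms = {t for t in re.findall(r"[^\W\d_]+", text.lower()) if len(t) >= min_len}
        for a, b in combinations(sorted(terms), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def neighbour_overlap(graph_a, graph_b, term):
    """Jaccard overlap of a term's neighbours in two graphs (0.0 if it has none)."""
    a = graph_a.get(term, set())
    b = graph_b.get(term, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data standing in for article headings and article full texts.
titles = [
    "Socialism and the labour movement",
    "Socialism in German literature",
]
fulltexts = [
    "The labour movement shaped socialism and party politics.",
    "German literature discussed socialism, Goethe and Mephisto.",
]

title_graph = cooccurrence_graph(titles)
fulltext_graph = cooccurrence_graph(fulltexts)
overlap = neighbour_overlap(title_graph, fulltext_graph, "socialism")
print(f"Neighbour overlap for 'socialism': {overlap:.2f}")
```

A low overlap for a term would suggest that headings and full text place it in different contexts, which is the kind of difference such a comparison is meant to surface.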

Page 5: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Search term "socialism" on title elements

Page 6: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Search term "Mephisto" on full text

Page 7: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library


Use case #2:

Europeana 4D visualisation

Partner:

Page 8: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Concept

MAP

Page 9: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Concept

MAP TIMELINE

Page 10: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Concept

MAP TIMELINE

Page 11: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Refinement

• Multiple data layers
• Interaction
• Animation
• Aggregation of data
• Connections
• Drilldown
• Historical/custom maps
• Result table
• Splitting datasets
• ...

Page 12: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library
Page 13: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Technological Framework

• OpenLayers
• Simile Timeline/Timeplot
• GeoNames (Geoparser...)
• Explorer Canvas (Google)
• GeoServer (OpenStreetMap, Google Maps)
• Google Web Toolkit (GWT)
• KML (XML)

Page 14: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Data Model

WHAT?   NAME, description, url

Legend: UPPERCASE = MANDATORY, lowercase = optional

Page 15: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Data Model

WHAT?   NAME, description, url
WHERE?  COORDINATES, address

Legend: UPPERCASE = MANDATORY, lowercase = optional

Page 16: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

KML

Data Model

WHAT?   NAME, description, url
WHERE?  COORDINATES, address
WHEN?   TIMESTAMP, range

Legend: UPPERCASE = MANDATORY, lowercase = optional

Page 17: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Exchange Format: KML (XML)
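As an illustration of how the WHAT / WHERE / WHEN model could be expressed in KML, here is a minimal sketch that writes one record with Python's standard library. It uses only standard KML 2.2 elements (Placemark, name, address, description, TimeStamp, TimeSpan, Point). The mapping of the slide's fields onto these elements is an assumption, not taken from the e4D specification, and the optional url field is left out because the slides do not say where it belongs. The function names and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # standard KML 2.2 namespace


def make_placemark(name, lon, lat, begin, end=None, description=None, address=None):
    """Build one <Placemark>; the field mapping is an assumption, not the e4D spec."""
    placemark = ET.Element("Placemark")
    ET.SubElement(placemark, "name").text = name  # WHAT? (mandatory)
    if address is not None:
        ET.SubElement(placemark, "address").text = address  # WHERE? (optional)
    if description is not None:
        ET.SubElement(placemark, "description").text = description  # WHAT? (optional)
    if end is None:
        # WHEN? as a single point in time.
        stamp = ET.SubElement(placemark, "TimeStamp")
        ET.SubElement(stamp, "when").text = begin
    else:
        # WHEN? as a range.
        span = ET.SubElement(placemark, "TimeSpan")
        ET.SubElement(span, "begin").text = begin
        ET.SubElement(span, "end").text = end
    point = ET.SubElement(placemark, "Point")
    # KML expects longitude first, then latitude.
    ET.SubElement(point, "coordinates").text = f"{lon},{lat}"  # WHERE? (mandatory)
    return placemark


def make_kml(placemarks):
    """Wrap placemarks in <kml><Document> and return the XML as a string."""
    kml = ET.Element("kml", {"xmlns": KML_NS})
    document = ET.SubElement(kml, "Document")
    document.extend(placemarks)
    return ET.tostring(kml, encoding="unicode")


# Hypothetical sample record; the coordinates are approximate and illustrative.
kml_text = make_kml([
    make_placemark("Workshop talk", 9.93, 51.54, "2011-07-13",
                   description="Example entry", address="Göttingen"),
    make_placemark("Example period", 9.93, 51.54, "2011-07-01", end="2011-07-31"),
])
print(kml_text)
```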

Page 18: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Questionnaire

Page 19: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Questionnaire

Page 20: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Questionnaire

Page 21: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Questionnaire

Page 22: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Questionnaire

Page 23: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Datasets

• Library catalog
• Flickr
• IMDB
• DBpedia
• WikiLeaks

Flickr: "tsunami"

Page 24: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Use your own data in 5 easy steps!

1. Take a look at the .kml specification: http://tinyurl.com/e4d-kml
2. Build your own KML dataset
3. Upload it to a web server
4. Put the URL into the prototype at http://tinyurl.com/e4d-demo2
5. Share your set via the magnetic link!
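Before uploading a dataset, it may help to check that every entry carries the fields the data model marks as mandatory (name, coordinates, timestamp). The sketch below does this with Python's standard library. It assumes these fields map onto the standard KML elements name, coordinates, and TimeStamp/when or TimeSpan/begin; the actual e4D requirements are defined in the linked specification, and the function name and sample data here are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"k": "http://www.opengis.net/kml/2.2"}  # standard KML 2.2 namespace


def has_text(element):
    """True if the element exists and contains non-blank text."""
    return element is not None and (element.text or "").strip() != ""


def missing_mandatory_fields(kml_text):
    """Return (placemark index, field) pairs for every missing mandatory field."""
    root = ET.fromstring(kml_text)
    problems = []
    for index, placemark in enumerate(root.iterfind(".//k:Placemark", NS)):
        # A timestamp may be a single point in time or the start of a range.
        when = placemark.find("k:TimeStamp/k:when", NS)
        if when is None:
            when = placemark.find("k:TimeSpan/k:begin", NS)
        checks = {
            "name": placemark.find("k:name", NS),
            "coordinates": placemark.find(".//k:coordinates", NS),
            "time": when,
        }
        for field, element in checks.items():
            if not has_text(element):
                problems.append((index, field))
    return problems


# Hypothetical dataset: the second entry has no timestamp.
sample = """
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Complete entry</name>
      <TimeStamp><when>2011-07-13</when></TimeStamp>
      <Point><coordinates>9.93,51.54</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Entry without time</name>
      <Point><coordinates>9.93,51.54</coordinates></Point>
    </Placemark>
  </Document>
</kml>
""".strip()

problems = missing_mandatory_fields(sample)
print(problems)
```

An empty result means every placemark has all three fields under these assumptions; it does not guarantee that the prototype will accept the file.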

Page 25: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Resources

• e4D info website: http://tinyurl.com/e4d-project

• Europeana thoughtLab: http://www.europeana.eu/portal/thoughtlab.html