Transcript
Page 1: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Controlled Vocabularies and Text Mining - Use Cases at the Goettingen State and University Library

Ralf Stockmann (UGOE)

Page 2: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

[Overview diagram] Data Resources, Computing, Research Environments

Diagram labels: Text Mining; Enhanced Context-Search; Multilingual Access; DBPedia, ...; Visualisation; Metadata; OCR/Fulltext; Named Entity Recognition; Catalog Data; Crowdsourcing; Annotation Tools; Relationship Graphs; Linked Open Data; Ontologies; Scholars; Libraries; Repositories

Page 3: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

[Overview diagram repeated from page 2]

Use case #1:

eAqua

Page 4: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Project: eAqua

• Partners:
– Institute of Computer Science, Computational Linguistics, Leipzig (Büchler, Eckart, Heyer, Baumgardt)
– SUB Göttingen (Stockmann, Kothe, Mahnke)

• Comparing semantic graphs between
– Headings of journal articles and
– Fulltext of the same articles
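The slides do not say how the semantic graphs were built. As a minimal sketch, assuming a simple word co-occurrence graph (an assumption, not necessarily the eAqua method), the comparison between headings and fulltext could look like this. The function names and the sample texts are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(texts):
    """Map each word to the set of words it shares at least one text with."""
    graph = defaultdict(set)
    for text in texts:
        words = set(text.lower().split())
        for a, b in combinations(sorted(words), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def neighbour_overlap(term, graph_a, graph_b):
    """Jaccard overlap of a term's neighbours in two graphs (0.0 to 1.0)."""
    a = graph_a.get(term, set())
    b = graph_b.get(term, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical sample data: one heading and one fulltext snippet.
titles = ["socialism and labour"]
fulltexts = ["socialism labour party movement"]

title_graph = cooccurrence_graph(titles)
fulltext_graph = cooccurrence_graph(fulltexts)
overlap = neighbour_overlap("socialism", title_graph, fulltext_graph)
```

A low overlap for a term would suggest that headings and fulltext place it in different contexts, which is the kind of difference the two graph views on the following pages make visible.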

Page 5: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Search Term „socialism“ on title elements

Page 6: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

„Mephisto“ on fulltext

Page 7: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

[Overview diagram repeated from page 2]

Use case #2:

Europeana 4D visualisation

- Prof. Dr. Gerik Scheuermann
- Stefan Jänicke
- Christian Mahnke
- Ralf Stockmann

Page 8: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Concept

MAP

Page 9: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Concept

MAP TIMELINE

Page 10: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Concept

MAP TIMELINE

Page 11: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

• Multiple data layers
• Interaction
• Animation
• Aggregation of data
• Connections
• Drilldown
• Historical/custom maps
• Result table
• Splitting datasets
• ...

Refinement

Page 12: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen
Page 13: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Technological Framework

• OpenLayers
• Simile Timeline/Timeplot
• GeoNames (Geoparser...)
• Explorer Canvas (Google)
• GeoServer (OpenStreetMap, Google Maps)
• Google Web Toolkit (GWT)
• KML (XML)

Page 14: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

KML

Data Model

WHAT? NAME (mandatory); description, url (optional)
WHERE? COORDINATES (mandatory); address (optional)
WHEN? TIMESTAMP (mandatory); range (optional)
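To make the split between mandatory and optional fields concrete, here is a minimal sketch of the WHAT / WHERE / WHEN model as a Python data class. The class name `E4DRecord`, the field names, and the (longitude, latitude) ordering are assumptions for illustration; the slide only names the fields and marks which ones are mandatory.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class E4DRecord:
    """One item in the WHAT / WHERE / WHEN model (hypothetical class name)."""

    # Mandatory fields
    name: str                         # WHAT
    coordinates: Tuple[float, float]  # WHERE, assumed (longitude, latitude)
    timestamp: str                    # WHEN, e.g. "1905" or "1905-06-30"
    # Optional fields
    description: Optional[str] = None
    url: Optional[str] = None
    address: Optional[str] = None
    time_range: Optional[Tuple[str, str]] = None  # assumed (begin, end)

    def __post_init__(self) -> None:
        # Reject records that lack one of the three mandatory parts.
        if not self.name.strip():
            raise ValueError("name is mandatory")
        lon, lat = self.coordinates
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError("coordinates out of range")
        if not self.timestamp.strip():
            raise ValueError("timestamp is mandatory")


# Hypothetical example record.
record = E4DRecord(name="Example item", coordinates=(9.93, 51.54), timestamp="1905")
```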

Page 15: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Exchange Format: KML (XML)
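As an illustration of KML as an exchange format, the sketch below writes one record as a `Placemark` using only elements from the standard KML 2.2 schema (`name`, `description`, `TimeStamp`/`when`, `Point`/`coordinates`). The e4D specification may restrict or extend this, so treat the exact element choice as an assumption. The helper names `placemark` and `build_kml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard KML 2.2 namespace.
KML_NS = "http://www.opengis.net/kml/2.2"


def placemark(name, lon, lat, when, description=None):
    """Build one <Placemark> element from standard KML 2.2 parts."""
    pm = ET.Element(f"{{{KML_NS}}}Placemark")
    ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
    if description is not None:
        ET.SubElement(pm, f"{{{KML_NS}}}description").text = description
    ts = ET.SubElement(pm, f"{{{KML_NS}}}TimeStamp")
    ET.SubElement(ts, f"{{{KML_NS}}}when").text = when
    point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
    # KML stores coordinates as "longitude,latitude".
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return pm


def build_kml(placemarks):
    """Wrap placemarks in <kml><Document> and return the XML as a string."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    doc.extend(placemarks)
    return ET.tostring(kml, encoding="unicode")


# Hypothetical example item.
xml_text = build_kml([placemark("Example item", 9.93, 51.54, "1905")])
```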

Page 16: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Questionnaire

Who was the first football player born on the continent and playing for an English team?

Page 17: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Questionnaire

In how many years were more museums established in the western states of the USA than in the eastern states?

Page 18: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Questionnaire

In how many years were more museums founded in the western states of the USA than in the eastern states?

Page 19: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Questionnaire

In how many years were more museums founded in the western states of the USA than in the eastern states?
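The prototype answers this question visually. The same comparison can be sketched as a per-year count, assuming each museum record has already been reduced to a founding year and a region label. The function name, the "west"/"east" labels, and the sample records are hypothetical and not taken from any real dataset.

```python
from collections import Counter


def years_west_exceeds_east(records):
    """Return the sorted years in which more 'west' than 'east' records occur.

    records: iterable of (founding_year, region) pairs, region "west" or "east".
    """
    west, east = Counter(), Counter()
    for year, region in records:
        if region == "west":
            west[year] += 1
        else:
            east[year] += 1
    # Counter returns 0 for years that are missing on one side.
    return sorted(y for y in set(west) | set(east) if west[y] > east[y])


# Hypothetical sample records.
sample = [
    (1900, "west"), (1900, "east"),
    (1901, "west"),
    (1902, "east"), (1902, "west"), (1902, "west"),
]
years = years_west_exceeds_east(sample)
answer = len(years)
```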

Page 20: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Questionnaire

Iraqi police claim „only 265 civilian casualties since March 2007 in Baghdad“. Are they right?

Page 21: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Datasets

• Library catalog
• Flickr
• IMDB
• DBpedia
• WikiLeaks

Flickr: „tsunami“

Page 22: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Use your own data in 5 easy steps!

1. Take a look at the .kml specification: http://tinyurl.com/e4d-kml
2. Build your own KML dataset
3. Upload it to a webserver
4. Put the URL into the prototype at http://tinyurl.com/e4d-demo
5. Share your set via the magnetic link!
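Before uploading a dataset, it can help to check that every entry carries the three mandatory parts of the data model. The sketch below assumes standard KML 2.2 element names (`name`, `Point`/`coordinates`, `TimeStamp`/`when`); the e4D specification linked in step 1 is the authority on what the prototype actually requires. The function name `missing_fields` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard KML 2.2 namespace; the e4D specification may use a different profile.
NS = {"k": "http://www.opengis.net/kml/2.2"}


def missing_fields(kml_text):
    """Return {placemark index: [missing mandatory fields]} for a KML string."""
    root = ET.fromstring(kml_text)
    problems = {}
    for index, pm in enumerate(root.iterfind(".//k:Placemark", NS)):
        checks = {
            "name": pm.findtext("k:name", "", NS),
            "coordinates": pm.findtext("k:Point/k:coordinates", "", NS),
            "timestamp": pm.findtext("k:TimeStamp/k:when", "", NS),
        }
        missing = [field for field, value in checks.items() if not value.strip()]
        if missing:
            problems[index] = missing
    return problems


# Hypothetical two-item dataset; the second item has no TimeStamp.
sample_kml = """<kml xmlns="http://www.opengis.net/kml/2.2"><Document>
<Placemark><name>A</name><TimeStamp><when>1905</when></TimeStamp>
<Point><coordinates>9.93,51.54</coordinates></Point></Placemark>
<Placemark><name>B</name>
<Point><coordinates>13.4,52.5</coordinates></Point></Placemark>
</Document></kml>"""

report = missing_fields(sample_kml)
```

An empty result means every `Placemark` passed the check.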

Page 23: Controlled Vocabularies and Text Mining - Use Cases at the Goettingen

Resources

• e4D info website:

• Europeana thoughtLab: http://www.europeana.eu/portal/thoughtlab.html