Crowdsourcing Knowledge-Intensive Tasks In Cultural Heritage: SealincMedia Accurator demonstrator...

Crowdsourcing Knowledge-Intensive Tasks In Cultural Heritage. J. Oosterman, A. Bozzon, G.J. Houben, A. Nottamkandath, C. Dijkshoorn, L. Aroyo

description

Slides prepared by Jasper Oosterman ([email protected]). Find more information at our website: http://sealincmedia.wordpress.com

Large datasets such as Cultural Heritage collections require detailed annotations when digitised and made available online. Annotating different aspects of such collections requires a variety of knowledge and expertise that the collection curators do not always possess. Artwork annotation is an example of a knowledge-intensive image annotation task, i.e. a task that requires annotators to have domain-specific knowledge in order to complete it successfully. Today, Lora Aroyo will present at the WebSci2014 conference the results of a study investigating the applicability of crowdsourcing techniques to knowledge-intensive image annotation tasks. We observed a clear relationship between the annotation difficulty of an image, in terms of the number of items to identify and annotate, and the performance of the recruited workers. Here you can see the poster and the slides of the presentation.

Transcript of Crowdsourcing Knowledge-Intensive Tasks In Cultural Heritage: SealincMedia Accurator demonstrator...

Page 1: Crowdsourcing Knowledge-Intensive Tasks In Cultural Heritage: SealincMedia Accurator demonstrator @WebSci2014

Crowdsourcing Knowledge-Intensive Tasks In Cultural Heritage

J. Oosterman, A. Bozzon, G.J. Houben

A. Nottamkandath, C. Dijkshoorn, L. Aroyo

Page 2:

Annotation of digitized Cultural Heritage with specific elements. For example: Flowers, Birds, Castles

Page 3:

Botanical name of the depicted element: Rosa banksiae

What is the relation between entity identification difficulty and crowd annotation behavior?

Page 4:

Complications

Varying number of elements, with different sizes and prominence. Possibly overlapping or lacking detail.

Page 5:

Examples of prints in the collection

Page 6:

Annotation task template

Number of flowers

Number of flower types

Up to three flower labels

Certainty of the answers

References
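The fields listed on this slide could be captured in a small record type. The following is only a sketch under assumptions: the class name, field names, the free-text certainty value, and the example answers are all hypothetical, since the slide names the template fields but not their format.

```python
from dataclasses import dataclass, field
from typing import List

# The template allows up to three flower labels per print.
MAX_FLOWER_LABELS = 3


@dataclass
class FlowerAnnotation:
    """One worker's answers for one print (hypothetical field names)."""

    num_flowers: int
    num_flower_types: int
    flower_labels: List[str] = field(default_factory=list)
    # The certainty scale is not given on the slide; free text is an assumption.
    certainty: str = ""
    # Sources the worker cites for the answer; the format is an assumption.
    references: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.num_flowers < 0 or self.num_flower_types < 0:
            raise ValueError("counts must be non-negative")
        if len(self.flower_labels) > MAX_FLOWER_LABELS:
            raise ValueError("at most three flower labels are allowed")


# Illustrative values only, not data from the study.
example = FlowerAnnotation(
    num_flowers=5,
    num_flower_types=2,
    flower_labels=["Rose", "Tulip"],
    certainty="fairly sure",
)
```

Validating the label limit at construction time keeps malformed answers out of any later analysis, which is one possible design choice rather than a description of the actual task implementation.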

Page 7:

[Figure: legend # Unable, # Fantasy, # Flower Name; y-axis: # Prints (1 to 40); x-axis: Worker ID (0 to 40)]

# Workers opened task: 732

# Workers passed test questions: 84

# Selected workers: 44

# Annotation tasks performed: 488

# “Fantasy” task: 58

# “Unable” task: 70

# “Flowers” task: 360

# Flower labels: 465

Tasks performed by workers

Little agreement on “unable” or “fantasy”

Tasks requiring domain knowledge not popular
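The counts on this slide can be turned into rates to make the funnel and the task mix easier to read. The sketch below uses only the numbers shown above; the dictionary keys and the derived metric names are hypothetical, and reading the “Fantasy”, “Unable” and “Flowers” counts as a breakdown of the 488 annotation tasks is an assumption (the three counts do sum to 488).

```python
# Counts copied from the slide; the keys are hypothetical names.
counts = {
    "opened_task": 732,
    "passed_test": 84,
    "selected_workers": 44,
    "annotation_tasks": 488,
    "fantasy": 58,
    "unable": 70,
    "flowers": 360,
    "flower_labels": 465,
}

# Assumed breakdown: the three task outcomes partition the annotation tasks.
outcomes = ("fantasy", "unable", "flowers")
assert sum(counts[name] for name in outcomes) == counts["annotation_tasks"]


def ratio(part: int, whole: int) -> float:
    """Return part / whole as a float."""
    return part / whole


# Share of workers who opened the task and passed the test questions.
pass_rate = ratio(counts["passed_test"], counts["opened_task"])

# Share of each outcome among all annotation tasks.
outcome_shares = {
    name: ratio(counts[name], counts["annotation_tasks"]) for name in outcomes
}

# Average number of labels per "Flowers" task and tasks per selected worker.
labels_per_flowers_task = ratio(counts["flower_labels"], counts["flowers"])
tasks_per_selected_worker = ratio(
    counts["annotation_tasks"], counts["selected_workers"]
)

print(f"pass rate: {pass_rate:.1%}")
for name, share in outcome_shares.items():
    print(f"{name}: {share:.1%}")
print(f"labels per flowers task: {labels_per_flowers_task:.2f}")
print(f"tasks per selected worker: {tasks_per_selected_worker:.2f}")
```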

Page 8:

[Figure: # Instances and # Prints per flower label (axis ticks: 10, 100). Labels: Rose, Lilium, Tulip, Sunflower, Dianthus, Bellis, Iris, Viola, Orchidaceae, Paeonia, Papaver, Lotus, Bellis Perennis, Narcissus, Hyacinth]

[Figure: % Wrong Answer (0 to 1.0) for # of Flowers (NP, P) and Flower Types (NP, P)]

Flower identification not dependent on prominence

Most workers provide simple common names, but some workers provide exceptional detail.

Page 9:

Outlook

Local identification

Local labeling

Page 10:

http://sealincmedia.wordpress.com

sealincmedia-[email protected]

J. Oosterman, A. Bozzon, G.J. Houben

A. Nottamkandath, C. Dijkshoorn, L. Aroyo