Transcript of Rijksmuseum presentation
Trusting user-contributed data in Cultural Heritage Domain
Archana Nottamkandath (work done with Davide Ceolin & Wan Fokkink)
VU University Amsterdam
COMMIT/SEALINC
1
Context
• COMMIT/SEALINC project
• Museums have collections which can be annotated with user-contributed information
Evaluation costs Resources
• Is expensive manual labor
• Costs a lot of time
• Requires adherence to museum policies
  – Museum X [Accept, not sure, reject]
  – Museum Y [Foreign, Judgmental, Strong reject, Strong accept] ..
Need for automated trust analysis
• Algorithms automatically or semi-automatically evaluate annotations
(a) Flower (b) 19th century (c) Sunshine (d) Vermeer (e) Bronze
Automated Trust analysis algorithms
• Requirements
  – High accuracy (accurately predict evaluations most of the time)
  – Minimum input from cultural heritage professionals
  – Scalable and efficient (w.r.t. resources and time)
  – Works with different cultural heritage data
Definition
• Trustworthy annotation
  – Relevant to the image
  – Enhances/reinstates existing knowledge
  – Is acceptable under the museum's policies for publication on its website
Existing workflow
[Diagram: user Jones used the Accurator interface and contributed the tags Tulips, Roses, Night Sky, Van Gogh, Buddhist, Portrait, Monument, Asian, War memorial]
Integrate Trust into Existing Workflow (Research Question 1)
[Diagram: existing workflow (user Jones, Accurator interface, contributed tags)]
RQ1: How to determine trust from the users contributing annotations to the system?
Integrate Trust into Existing Workflow (Research Question 2)
[Diagram: existing workflow (user Jones, Accurator interface, contributed tags)]
RQ2: How to determine trust from the annotation process?
Integrate Trust into Existing Workflow (Research Question 3)
[Diagram: existing workflow (user Jones, Accurator interface, contributed tags)]
RQ3: How to determine trust from the contributed data?
RQ1: Determine trust from users [1]
• Evaluate subset of user tags
[Diagram: user Jones contributed tags, split into a train set (Tulips, Van Gogh, Buddhist, Monument) and a test set (Roses, Night sky, Van Gogh, Asian, War Memorial); labels: Museum, Evaluates]
RQ1: Determine trust from users [1]
• A user who is an expert on one topic might, with a certain probability, be an expert on similar topics
[Diagram: user Jones is an expert on Tulips and possibly an expert on Roses and Lilies; train set: Tulips, Van Gogh, Buddhist, Monument; test set: Roses, Night sky, Van Gogh, Asian, War Memorial]
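As a rough illustration of the idea on these two slides, the sketch below estimates a user's reputation from the museum-evaluated train set and carries it over to similar topics. The `Evaluation` record, the topic labels, the accept/reject verdicts, the smoothing, and the `toy_similarity` function are all assumptions made for this example; the slides do not say how the probability is actually computed.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    tag: str        # tag text contributed by the user
    topic: str      # hypothetical topic label for the tag
    accepted: bool  # museum verdict (invented here for illustration)


def topic_reputation(evaluations, topic):
    """Smoothed fraction of accepted tags on one topic (add-one smoothing)."""
    relevant = [e for e in evaluations if e.topic == topic]
    accepted = sum(1 for e in relevant if e.accepted)
    return (accepted + 1) / (len(relevant) + 2)


def predicted_trust(evaluations, new_topic, similarity):
    """Similarity-weighted average of the user's per-topic reputations."""
    topics = sorted({e.topic for e in evaluations})
    weights = {t: similarity(t, new_topic) for t in topics}
    total = sum(weights.values())
    if total == 0:
        return 0.5  # no related evidence: fall back to an uninformed value
    weighted = sum(w * topic_reputation(evaluations, t) for t, w in weights.items())
    return weighted / total


def toy_similarity(topic_a, topic_b):
    """Placeholder similarity: identical topics count fully, others a little."""
    return 1.0 if topic_a == topic_b else 0.1


# Train set from the slide; topics and verdicts are made up.
train_set = [
    Evaluation("Tulips", "flowers", True),
    Evaluation("Van Gogh", "painters", True),
    Evaluation("Buddhist", "religion", False),
    Evaluation("Monument", "architecture", True),
]

# Estimated trust in a new flower tag such as "Roses" from the test set.
roses_trust = predicted_trust(train_set, "flowers", toy_similarity)
print(round(roses_trust, 3))
```

A real system would replace `toy_similarity` with a proper semantic similarity measure between topics.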
RQ1: Determine trust from users [2]
• User profile: [Experience, education, country, gender, income, museum visits…]
Steve.museum dataset
RQ1: Determine trust from users [2]
• Predict user reputation using Support Vector Machines (SVM)
• [Feature1, Feature2, ..] -> Category of user
  – [21 yrs, Female, Bachelors, Australia] -> Excellent
  – [60 yrs, Male, PhD, America] -> Good
  – [56 yrs, Female, Masters, Croatia] -> Bad
  – [30 yrs, Male, High School, Mexico] -> ?
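An SVM works on numeric vectors, so profiles like the ones above first have to be encoded. The sketch below shows one possible encoding (scaled age plus one-hot categories) for the four example profiles. The category lists, the age scaling, and the function names are assumptions for illustration; the training step itself is left out and could be done with any SVM library, for example scikit-learn's `SVC`.

```python
# Category vocabularies are assumptions built from the slide's examples.
GENDERS = ["Female", "Male"]
EDUCATION_LEVELS = ["High School", "Bachelors", "Masters", "PhD"]
COUNTRIES = ["America", "Australia", "Croatia", "Mexico"]


def one_hot(value, vocabulary):
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def encode_profile(age, gender, education, country):
    """Turn one user profile into a flat numeric feature vector."""
    return (
        [age / 100.0]  # rough scaling so age is comparable to the 0/1 features
        + one_hot(gender, GENDERS)
        + one_hot(education, EDUCATION_LEVELS)
        + one_hot(country, COUNTRIES)
    )


# Labeled profiles from the slide: these would form the SVM training data.
labeled = [
    (encode_profile(21, "Female", "Bachelors", "Australia"), "Excellent"),
    (encode_profile(60, "Male", "PhD", "America"), "Good"),
    (encode_profile(56, "Female", "Masters", "Croatia"), "Bad"),
]

# The unlabeled profile whose category the trained model would predict.
query = encode_profile(30, "Male", "High School", "Mexico")
print(query)
```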
RQ2: Determine trust from Annotation process
• Time of day, Day of week, Day of month etc. affect user quality
• Typing speed affects user quality
  – Typing fast might indicate higher confidence
[Diagram: example tags: Tulips, Van Gogh, Buddhist, Monument; Rich Lady, Plant, Leonardo, Bronze plate]
RQ2: Determine trust from Annotation process
• Predict tag quality using Support Vector Machines (SVM)
• [Feature1, Feature2, ....] -> Category of Tag
  – [10:00, Monday, June, 3s] -> Excellent
  – [12:00, Wednesday, 15s] -> Good
  – [23:56, Friday, April, 80s] -> Bad
  – [06:00, Thursday, March, 70s] -> ?
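The process features listed above can be derived from a timestamp and a typing duration. The sketch below is one way this might look; the function name, the feature order, and the year in the example are assumptions, since the slides only give time, weekday, month, and seconds.

```python
from datetime import datetime


def process_features(started_at, typing_seconds):
    """Numeric features describing when and how fast a tag was entered."""
    return [
        started_at.hour + started_at.minute / 60.0,  # time of day, in hours
        float(started_at.weekday()),                 # day of week, 0 = Monday
        float(started_at.day),                       # day of month
        float(started_at.month),                     # month, 1 = January
        float(typing_seconds),                       # time spent typing the tag
    ]


# First example row from the slide: 10:00 on a Monday in June, typed in 3 s.
# The year 2013 is an arbitrary choice in which 3 June falls on a Monday.
example = process_features(datetime(2013, 6, 3, 10, 0), 3)
print(example)
```

Vectors of this kind, paired with museum evaluations of the corresponding tags, would serve as training data for the classifier.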
RQ2: Determine trust from Annotation process
• Why is this important?
  – Useful for anonymous users who did not fill in profile information
RQ3: Determine trust from data
• Contributed data itself has features; train SVM on features to predict quality of tag
  – Length
  – Specificity
  – Presence in vocabularies
  – Times already contributed
  – Noun
Tulips, Van Gogh, Buddhist, Monument
[6, specific, yes, English, 10, no…] -> Good
[7, specific, yes, Dutch, 1, yes…] -> Bad
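Some of these data features can be computed directly from the tag text and the tagging history. The sketch below covers length, presence in a vocabulary, and times already contributed; specificity and noun detection would need external language resources and are left out. The vocabulary, the contribution counts, and the function name are made up for the example.

```python
from collections import Counter


def tag_features(tag, vocabulary, contributed_counts):
    """Features computed from the tag itself and from earlier contributions."""
    normalized = tag.strip().lower()
    return {
        "length": len(normalized),
        "in_vocabulary": normalized in vocabulary,
        "times_contributed": contributed_counts[normalized],
    }


# Hypothetical controlled vocabulary and history of earlier contributions.
vocabulary = {"tulips", "monument"}
history = Counter(["tulips"] * 10 + ["buddhist"])

features = tag_features("Tulips", vocabulary, history)
print(features)
```

`Counter` returns 0 for tags that were never contributed, so unseen tags need no special handling.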
Goals achieved
• Requirements
  – High accuracy (accurately predict evaluations most of the time)
  – Minimum input from cultural heritage professionals
  – Scalable and efficient
  – Works with different cultural heritage data
Goal 1: High Accuracy
• High accuracy (accurately predict evaluations most of the time)
  – Predicted quality of a tag based on user profile with accuracy from 68% to 72% (Steve dataset results)
Goal 2: Minimum input from Cultural Heritage Institutions
• Algorithms require a minimum of 5 evaluated tags per user for predictions
• Working to minimize or eliminate this requirement
Goal 3: Scalable and efficient
• Reduced computation time while maintaining accuracy in Steve dataset
Goal 4: Works with different cultural heritage data
• Steve Museum dataset
• Waisda? dataset
  – Video tagging game
• SEALINC Media experiments at CWI
Future Work
• Employ our experiences and algorithms to analyze the data from Accurator
• Employ trust scores for ranking in search
• Identify techniques to visualize trust