Gic2011 aula10-ingles

Information & Knowledge Management Discussion of “Collective Knowledge Systems: When the Social Web Meets the Semantic Web” from Tom Gruber (TomGruber.org) Marielba Zacarias Prof. Auxiliar DEEI FCT I, Gab 2.69, Ext. 7749 [email protected]


Page 1: Gic2011 aula10-ingles

Information & Knowledge Management

Discussion of “Collective Knowledge Systems: When the Social Web Meets the Semantic Web”

from Tom Gruber (TomGruber.org)

Marielba Zacarias

Prof. Auxiliar, DEEI

FCT I, Gab 2.69, Ext. 7749

[email protected]

Page 2: Gic2011 aula10-ingles

Summary

The Vision of Collective Intelligence

Collective Knowledge Systems

The Role of the Semantic Web

Augmenting User-Contributed Data with Structured Data

Enabling Data Sharing and Computation Across Applications

Example

Collective Knowledge System for Travel

Page 3: Gic2011 aula10-ingles

The vision of Collective Intelligence

Web 2.0 (Social Web)

Class of web sites and applications in which user participation is the main driver of value

Wikipedia, MySpace, YouTube, Flickr, Del.icio.us, Facebook, Technorati, Blogger, WordPress, etc.

Page 4: Gic2011 aula10-ingles

The vision of Collective Intelligence

[Figure: web 2.0 compared with web 1.0]

Page 5: Gic2011 aula10-ingles

The vision of Collective Intelligence

Harnessing Collective Intelligence

Hyperlinking works like the synapses of a brain

Yahoo!'s role as a portal to net users' collective work

Google's PageRank search exploits web structure rather than just document characteristics

eBay's product is the collective activity of its users

Amazon has made a science of user engagement

Flickr & Del.icio.us pioneered folksonomies

Wikipedia is based on the idea that any user may edit any entry

Collaborative spam filtering, like Cloudmark

Greatest internet successes driven by viral marketing

Internet infrastructure (PHP, Apache, MySQL, Python) mostly based on peer production of open source software

Page 6: Gic2011 aula10-ingles

The vision of Collective Intelligence

Collective Intelligence or Wisdom of the Crowds

Value is created collectively: writing articles in Wikipedia, sharing tagged photos in Flickr, sharing bookmarks in Del.icio.us, or streaming personal blogs into the open space called the blogosphere

Unmatched potential for knowledge sharing

Collected intelligence

But not collective intelligence

No emergence of new levels of understanding of knowledge

Page 7: Gic2011 aula10-ingles

The vision of Collective Intelligence

Collective intelligence has been the goal of several visionaries

Grand challenge is to boost the collective IQ of organizations and society

human-machine system for

collecting knowledge for learning

evolving technology for collective learning

humans and machines actively contribute doing what they do best

Page 8: Gic2011 aula10-ingles

The vision of Collective Intelligence

Tim Berners-Lee, inventor of the web and originator of the semantic web vision

The semantic web is an extension of the current web in which information is given well-defined meaning

better enabling people and computers to cooperate

Page 9: Gic2011 aula10-ingles

The vision of Collective Intelligence

The key is the synergy between humans and machines

What kind of synergy?

People are producers and customers

knowledge sources

have real world problems and interests

learn/create knowledge communicating with each other

Machines are enablers

store & remember data

search & combine data

draw mathematical & logical inferences

Page 10: Gic2011 aula10-ingles

The vision of Collective Intelligence

With the rise of the social web, we now have millions of humans offering their knowledge online, i.e. the information is stored, searchable and easily shared

Challenge: find the right match between what is put online and methods for doing useful reasoning with the data

True collective knowledge emerges if the knowledge collected from all those people is aggregated or recombined to create new knowledge or new ways of learning

Page 11: Gic2011 aula10-ingles

Collective Knowledge Systems

Human-machine systems in which machines enable the collection and harvesting of large amounts of human-generated knowledge

Page 12: Gic2011 aula10-ingles

Collective Knowledge Systems: the faq-o-sphere

social system supported by ICT which generates self-service problem-solving discussions on the internet

product support forums

special interest mailing lists

structured question-answer catalogs

in which some people pose problems and others reply with answers

Page 13: Gic2011 aula10-ingles

Collective Knowledge Systems: the faq-o-sphere

A search engine capable of finding questions and answers in this body of content

Google is very good at finding a message in public forums in which someone has asked a question similar to one's query

intelligent users, who know how to formulate their queries and provide feedback about which query/document pairs were effective

though not designed as a system, the faq-o-sphere behaves as a competent expert system

Page 14: Gic2011 aula10-ingles

Collective Knowledge Systems: the faq-o-sphere

Page 15: Gic2011 aula10-ingles

Collective Knowledge Systems

Citizen Journalism

blog-o-sphere

Product Reviews

computer products, gadgets, digital cameras

Collaborative filtering

Amazon recommendations

Page 16: Gic2011 aula10-ingles

Collective Knowledge Systems

User-generated content (by a lot of users!)

Human-machine synergy

Increasing returns with scale

Emergent Knowledge

new ideas, products, concepts, theories, ways of doing things, etc.

how? with the semantic web

Page 17: Gic2011 aula10-ingles

Semantic Web

The problem of semantics

what we say

how we say it

different symbols/terms with same meaning

same symbols/terms with different meaning

Page 18: Gic2011 aula10-ingles

Traditional web

HTML

keyword-based search

Problem: computers don't understand meaning

Solution?

“My mouse is broken. I need a new one…”

Page 19: Gic2011 aula10-ingles

Semantic Web

Page 20: Gic2011 aula10-ingles

Semantic Web Benefits

Page 21: Gic2011 aula10-ingles

Semantic Web Layer Cake

Page 22: Gic2011 aula10-ingles

The Web of Things

Page 23: Gic2011 aula10-ingles

RDF

URIs

Page 24: Gic2011 aula10-ingles

RDF -> XML

Page 25: Gic2011 aula10-ingles

Ontologies

Concept: conceptual entity of the domain

Attribute: property of a concept

Relation: relationship between concepts or properties

Axiom: coherent description between concepts / properties / relations via logical expressions

[Figure: example ontology. Concepts: Person, Student, Professor, Lecture. isA hierarchy (taxonomy). Attributes: name, email, student nr., research field, topic, lecture nr. Relations: attends, holds.]
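The four ontology elements above can be sketched in Python. This is only an illustrative sketch: the slide lists the names, but which attribute belongs to which concept (for example, student nr. on Student) and the axiom itself are assumptions made here.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Concept at the root of the isA hierarchy (taxonomy)."""
    name: str   # attribute
    email: str  # attribute


@dataclass
class Lecture:
    """Concept with its own attributes."""
    topic: str
    lecture_nr: int


@dataclass
class Student(Person):
    """Student isA Person; 'attends' is a relation to Lecture (assumed placement)."""
    student_nr: str = ""
    attends: list = field(default_factory=list)


@dataclass
class Professor(Person):
    """Professor isA Person; 'holds' is a relation to Lecture (assumed placement)."""
    research_field: str = ""
    holds: list = field(default_factory=list)


def every_attended_lecture_is_held(students, professors):
    """Hypothetical axiom: each attended lecture is held by some professor."""
    held = [lecture for p in professors for lecture in p.holds]
    return all(lecture in held for s in students for lecture in s.attends)


# Example instances (invented for illustration).
lecture = Lecture(topic="Semantic Web", lecture_nr=10)
professor = Professor(name="P", email="p@example.org",
                      research_field="Knowledge Management", holds=[lecture])
student = Student(name="S", email="s@example.org",
                  student_nr="0001", attends=[lecture])
```

Class inheritance stands in for the isA hierarchy, fields for attributes, list-valued fields for relations, and a boolean check for an axiom. A real ontology language such as OWL expresses these declaratively rather than in program code.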

Page 26: Gic2011 aula10-ingles

The role of the semantic web

Technology has enabled the generation of collected knowledge by making it easy and cheap to:

Capture

Store

Distribute

Communicate

Create new value from the collected data

Page 27: Gic2011 aula10-ingles

The role of the semantic web

Creating value from data is the main role of the semantic web in collective knowledge systems

semantic web adds structure to data related to user contributions

enabling sharing and computation among independent, heterogeneous social web applications

Page 28: Gic2011 aula10-ingles

The role of the semantic web

Augmenting user-contributed data with structured data

structured data exposed in a structured way

distinguish Paris Hilton from Paris, France

expose data in the databases used to build HTML documents

extract data retrospectively from user contributions

capture data as people share information

Page 29: Gic2011 aula10-ingles

The role of the semantic web

Enabling data sharing and computation among applications

RDF enables structured data that references well-maintained namespaces, with unambiguous entity references via URIs

Ontologies for common conceptualizations independent of data models

in social web applications, this enables integrating tagging data

TagCommons project (mapping rather than homogenizing)
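A minimal sketch of unambiguous entity reference with URIs, using plain Python tuples as (subject, predicate, object) triples rather than an RDF library. The `example.org` namespace, the blog URI and the `about` property are hypothetical; the URIs are illustrative identifiers, not data from the slides.

```python
# Triples as plain tuples: (subject, predicate, object).
PARIS_CITY = "http://dbpedia.org/resource/Paris"
PARIS_HILTON = "http://dbpedia.org/resource/Paris_Hilton"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/vocab#"  # assumed namespace for this sketch

triples = [
    (PARIS_CITY, RDF_TYPE, EX + "City"),
    (PARIS_CITY, RDFS_LABEL, "Paris"),
    (PARIS_HILTON, RDF_TYPE, EX + "Person"),
    (PARIS_HILTON, RDFS_LABEL, "Paris Hilton"),
    # A user contribution that refers to the city by URI, not by keyword.
    ("http://example.org/blog/42", EX + "about", PARIS_CITY),
]


def entities_labelled(text):
    """Keyword-style lookup: matches every entity whose label contains the text."""
    needle = text.lower()
    return sorted(s for s, p, o in triples
                  if p == RDFS_LABEL and needle in o.lower())


def contributions_about(entity_uri):
    """URI-based lookup: matches only contributions about that exact entity."""
    return [s for s, p, o in triples
            if p == EX + "about" and o == entity_uri]
```

Here `entities_labelled("paris")` returns both entities, which is the keyword ambiguity, while `contributions_about(PARIS_CITY)` returns only the blog entry about the city.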

Page 30: Gic2011 aula10-ingles

Example: Real Travel

RealTravel attracts people to write about their travels, sharing stories, photos, etc.

Travel researchers get the value of all experiences relevant to their target destinations.

Page 31: Gic2011 aula10-ingles

Real Travel

Page 32: Gic2011 aula10-ingles

Real Travel

Page 33: Gic2011 aula10-ingles

Real Travel

Page 34: Gic2011 aula10-ingles

Real Travel

Group Stories together by destination

Aggregate cities to states to countries

Inherit locations down to photos

Infer geo-coordinates, which drive dynamic route management

Destinations map

to external content (travel guides)

to targeted advertising
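The first three operations above (grouping by destination, aggregating up the hierarchy, inheriting locations down to photos) can be sketched as follows. The hierarchy, stories and photo names are invented for illustration and say nothing about how RealTravel stores its data; geo-coordinate inference is not covered.

```python
from collections import defaultdict

# Hypothetical destination hierarchy: child -> parent (city -> state -> country).
PARENT = {
    "San Francisco": "California",
    "Los Angeles": "California",
    "California": "United States",
    "Austin": "Texas",
    "Texas": "United States",
}

# Hypothetical user contributions.
stories = [
    {"title": "Fog and bridges", "destination": "San Francisco",
     "photos": ["bridge.jpg"]},
    {"title": "Beach boardwalk", "destination": "Los Angeles",
     "photos": ["pier.jpg", "sunset.jpg"]},
    {"title": "Live music week", "destination": "Austin", "photos": []},
]


def ancestors(place):
    """Return the place followed by its parents, up to the root."""
    chain = [place]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def group_by_destination(items):
    """Group story titles by destination and by every enclosing destination."""
    groups = defaultdict(list)
    for story in items:
        for place in ancestors(story["destination"]):
            groups[place].append(story["title"])
    return dict(groups)


def photo_locations(items):
    """Photos inherit the destination of the story they belong to."""
    return {photo: story["destination"]
            for story in items for photo in story["photos"]}
```

Because each story is filed under its city and every ancestor of that city, a reader browsing "California" or "United States" sees the same stories without any extra tagging by the author.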

Page 35: Gic2011 aula10-ingles

Real Travel as Collective Knowledge System

User generated content

Most of the content is from real traveler experiences

Human-machine synergy

travel planners can do the equivalent of asking thousands of other travelers for advice

Increasing returns with scale

as more people report their experiences, coverage (more exotic locations) and depth (what to do or avoid) improve

Emergent knowledge

recommendations from unsupervised learning from travel blog texts and multi-dimensional match with structured data (e.g. traveler demographics, declared interest)

Page 36: Gic2011 aula10-ingles

Real Travel as Collective Knowledge System

Snap to grid Travel Destinations

auto-completion of candidate locations

allow introducing new locations

Contextual browsing

combining tags, location and rating data (feedback from users and editors on content quality)

Snap to grid Tags

associate tags to useful domain concepts (e.g. arts)
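A rough sketch of the "snap to grid" idea for destinations and tags: free-text input is matched to known entries where possible, and new locations can still be introduced. The destination list, the tag-to-concept table and the function names are all hypothetical.

```python
# Hypothetical list of known destinations (the "grid").
destinations = ["Paris, France", "Paris, Texas", "Parma, Italy", "Porto, Portugal"]

# Hypothetical mapping from free-form tags to domain concepts.
TAG_CONCEPTS = {"museum": "arts", "gallery": "arts", "hiking": "outdoors"}


def complete(prefix, known):
    """Auto-completion: known destinations that start with the typed prefix."""
    typed = prefix.strip().lower()
    return [d for d in known if d.lower().startswith(typed)]


def snap_destination(text, known):
    """Return the matching known destination, or register the text as a new one."""
    cleaned = text.strip()
    for d in known:
        if d.lower() == cleaned.lower():
            return d
    known.append(cleaned)  # allow introducing new locations
    return cleaned


def concept_for_tag(tag):
    """Snap a tag to a domain concept; unknown tags are kept without a concept."""
    return TAG_CONCEPTS.get(tag.strip().lower())
```

Typing "par" offers the three candidates that begin with it, so the contributor picks a structured destination instead of leaving an ambiguous string.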

Page 37: Gic2011 aula10-ingles

Real Travel Collective Knowledge System

Page 38: Gic2011 aula10-ingles

Real Travel: Pivot searching

Structured data provides dimensions of a hypercube

location, author, type, date, quality rating

Travel researchers browse along any dimension.

The key structured data is the destination hierarchy

Contributors place their content into the destination hierarchy, and the other dimensions are automatic.
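Pivot searching over these dimensions can be sketched as filtering on some dimensions and counting along another. The items below are invented, and the flat `location` field is a simplification of the destination hierarchy.

```python
from collections import Counter

# Hypothetical content items; each key is one dimension of the hypercube.
items = [
    {"location": "Lisbon", "author": "ana", "type": "blog",
     "date": "2007-05", "rating": 4},
    {"location": "Lisbon", "author": "ben", "type": "photo",
     "date": "2007-06", "rating": 5},
    {"location": "Porto", "author": "ana", "type": "blog",
     "date": "2007-06", "rating": 3},
]


def pivot(content, dimension, **filters):
    """Count items along `dimension` after equality filters on other dimensions."""
    selected = [item for item in content
                if all(item[key] == value for key, value in filters.items())]
    return Counter(item[dimension] for item in selected)
```

For example, `pivot(items, "type", location="Lisbon")` shows what kinds of content exist for Lisbon, and `pivot(items, "location", author="ana")` pivots to the places one author has written about.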

Page 39: Gic2011 aula10-ingles

Real Travel as Collective Knowledge System

Learning from semi-structured data

System processes every contribution looking at text, tags, user profiles and other structured data

Clustering of the content to find synthetic dimensions

Stable classification of blogs and users in buckets

when users ask for recommendations they introduce desired location, trip length and demographic data

this data is used to filter some dimensions and they are asked to rate the remaining dimensions

the system matches this information with classified users and docs and ranks places to go and traveler blogs for those places
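The final matching step can be illustrated with a small ranking sketch. It assumes the clustering and classification steps have already produced a score per synthetic dimension for each blog; the dimension names, the scores and the weighted-sum match are assumptions for illustration, not RealTravel's actual method.

```python
# Assumed output of the clustering/classification steps: each blog carries a
# score per synthetic dimension. All values here are invented.
blogs = [
    {"title": "Inca Trail trek", "location": "Peru",
     "scores": {"adventure": 0.9, "food": 0.2}},
    {"title": "Lima food tour", "location": "Peru",
     "scores": {"adventure": 0.1, "food": 0.9}},
    {"title": "Kyoto temples", "location": "Japan",
     "scores": {"adventure": 0.3, "food": 0.4}},
]


def recommend(candidates, location, ratings, top_n=3):
    """Filter by desired location, then rank by a weighted sum of user ratings."""
    in_location = [b for b in candidates if b["location"] == location]

    def match(blog):
        # Dimensions the user did not rate contribute nothing to the match.
        return sum(ratings.get(dim, 0) * weight
                   for dim, weight in blog["scores"].items())

    ranked = sorted(in_location, key=match, reverse=True)
    return [b["title"] for b in ranked[:top_n]]
```

A user who rates adventure highly and food low gets the trek ranked above the food tour, and blogs outside the requested location are filtered out before ranking.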

Page 40: Gic2011 aula10-ingles

Resources used

Open source software or free services

powerful databases

fancy UI libraries

search engines

usage analytics

Open APIs from Google Maps and Flickr (photos)

Commercially available geo-coordinate data and services

Page 41: Gic2011 aula10-ingles

How could the semantic web help?

No standard source of structured destination data for the world

or way to map among alternative hierarchies

Integrating with other destination-based sites is expensive

e.g. travel guides

No standard collection of travel tags

or way to share RealTravel’s folksonomy

Integration with other tagging sites is ad-hoc