Nuxeo World Session: Semantic Technologies - Update on Recent Research
Nov. 17 2010 - S. Fermigier & O. Grisel, Nuxeo
Towards semantic ECM: report on the IKS and Scribo projects
Monday, November 22, 2010
Outline
• Introduction to semantic technologies
• Collaborative R&D within the Scribo and IKS projects
• Fise & Apache Stanbol / Nuxeo Integration
1. Introduction to semantic technologies
Illustration source: Mills Davis, “Semantic Social Computing”, Sept. 2007
Invented the web in 1989 (yeah!)
Invented the semantic web in 1999 (duh?)
Photo source: http://www.flickr.com/photos/pixelydixel/
Historical perspective
• From web 1.0: web of pages, aka the World Wide Web
• To web 2.0: web of people and of participation, aka the Social Web
• To web 3.0: web of data, of meaning and of connected knowledge, aka the Semantic Web
Picture source: http://www.flickr.com/photos/pixelydixel/
A “layer cake” of technologies
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
Linked Online Data in 2007
2008
2009
2010
Good for Enterprise apps too!
Diagram source: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/
Key Enablers
• Open Data and Linked Online Data
• Advances in automatic content analysis (linguistics, image processing)
• Computing power (Moore’s law + MapReduce)
• Classical logic and classical AI
The technologies and data are available,
let’s put them to use!
Semantic ECM
Content: text, image, sound, video
Meaning: metadata, relations, entities, tags, reasoning
Goals for Semantic ECM (& Nuxeo)
• Repurpose existing content
• Improve search and collaboration
• Make information contextual
• Extract and use information from your content
• Make your content smarter!
Challenges
• Extract meaning from content
• Enrich content with knowledge
• Enhance interaction with content thanks to added meaning
Business value from semantic ECM
• Efficiency gains: 20% to 90% (e.g. in search and collaboration)
• Effectiveness gains: better returns from your assets (e.g. news and images from AFP)
• Strategic edge: growth, value capture, new services, an unfair strategic advantage (e.g. vertical ontologies for CEVAs / CCAs)
2. SCRIBO and IKS
Scribo
• Project under the French FUI program, with 9 partners and a budget of 4.7 M€
• Goal: to develop algorithms and collaborative tools for extracting knowledge from unstructured documents and images
• Started in 2008, finishing in Dec. 2010, with results already integrated as a Nuxeo plugin
IKS
• European project under FP7, with 13 partners (6 SMEs) and an 8.5 M€ budget
• Goal: create a semantic software “stack” that will be used by CMS vendors to add semantic features to their products
• Started in Jan. 2009, will last until Dec. 2012
• First tangible result: FISE, already integrated in a Nuxeo plugin
3. Linking Semantic Entities
Apache Stanbol - Nuxeo integration
What are entities?
What is wrong with tags?
• Many terms for the same meaning
  • NYC, New York, New York City
• Many meanings for the same term
  • Need context to remove any ambiguity
Washington is...
Tagging with Entities
• Global namespace / universal meaning context
• Interoperability across domains
• Interoperability across applications
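The difference between free-text tags and entity tags can be sketched in a few lines of Python. This is an illustration only, not code from the talk: the `ENTITY_ALIASES` and `AMBIGUOUS` tables and the `resolve_tag` helper are hypothetical names, and the DBpedia-style URIs were chosen here as plausible examples of a global namespace.

```python
# Hypothetical alias table: several free-text tags, one entity URI.
ENTITY_ALIASES = {
    "nyc": "http://dbpedia.org/resource/New_York_City",
    "new york": "http://dbpedia.org/resource/New_York_City",
    "new york city": "http://dbpedia.org/resource/New_York_City",
}

# Hypothetical candidates for an ambiguous tag: one term, many meanings.
AMBIGUOUS = {
    "washington": [
        "http://dbpedia.org/resource/Washington,_D.C.",
        "http://dbpedia.org/resource/Washington_(state)",
        "http://dbpedia.org/resource/George_Washington",
    ],
}


def resolve_tag(tag):
    """Return the candidate entity URIs for a free-text tag."""
    # Normalize case and whitespace before the lookup.
    key = " ".join(tag.lower().split())
    if key in ENTITY_ALIASES:
        return [ENTITY_ALIASES[key]]
    # More than one candidate means context is needed to choose.
    return list(AMBIGUOUS.get(key, []))


if __name__ == "__main__":
    for tag in ("NYC", "New York City", "Washington"):
        print(tag, "->", resolve_tag(tag))
```

Two applications that store the URI rather than the string agree on what a tag means, whatever spelling the user typed.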
Demo time!
Screencast online at http://blogs.nuxeo.com/dev
How does this work?
• Open Source Semantic Engine
• HTTP Services
• For content driven applications
• OSGi: loosely coupled components
• Analysis Engines
• Knowledge RDF vocabularies
What is a semantic engine?
• Unstructured content => Knowledge
• Language guessing
• Topic classification (Business, Sports, Media, ...)
• Named Entities extraction and linking
• Relationships and properties extraction
RESTful is Beautiful
curl -X POST \
  -H "Accept: application/json" \
  -H "Content-type: text/plain" \
  --data "John Smith works at Smith Consulting in Paris." \
  http://fise.demo.nuxeo.com/engines

{
  "urn:enhancement-1564680b-861c-df6f-fdf9-d34a75d68dfe": {
    "http://fise.iks-project.eu/ontology/selected-text": [
      {
        "datatype": "http://www.w3.org/2001/XMLSchema#string",
        "type": "literal",
        "value": "Paris"
      }
    ],
    "http://fise.iks-project.eu/ontology/selection-context": [
      {
        "datatype": "http://www.w3.org/2001/XMLSchema#string",
        "type": "literal",
        "value": "John Smith works at Smith Consulting Paris."
      }
    ],
    "http://purl.org/dc/terms/type": [
      {
        "type": "uri",
        "value": "http://dbpedia.org/ontology/Place"
      }
    ]
  },
  …
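A client could read such a response along the following lines. This is a minimal sketch that assumes the whole response follows the shape of the one enhancement visible in the excerpt (subject, then property URI, then a list of values); `SAMPLE_RESPONSE` is an abridged copy of that excerpt with the elided part left out, and `list_mentions` is a hypothetical helper name, not part of FISE.

```python
import json

# Abridged copy of the excerpt above: only the first enhancement, closed
# so that it parses. The real response contains more enhancements.
SAMPLE_RESPONSE = """
{
  "urn:enhancement-1564680b-861c-df6f-fdf9-d34a75d68dfe": {
    "http://fise.iks-project.eu/ontology/selected-text": [
      {"datatype": "http://www.w3.org/2001/XMLSchema#string",
       "type": "literal", "value": "Paris"}
    ],
    "http://fise.iks-project.eu/ontology/selection-context": [
      {"datatype": "http://www.w3.org/2001/XMLSchema#string",
       "type": "literal",
       "value": "John Smith works at Smith Consulting Paris."}
    ],
    "http://purl.org/dc/terms/type": [
      {"type": "uri", "value": "http://dbpedia.org/ontology/Place"}
    ]
  }
}
"""

# Property URIs as they appear in the excerpt.
SELECTED_TEXT = "http://fise.iks-project.eu/ontology/selected-text"
DC_TYPE = "http://purl.org/dc/terms/type"


def list_mentions(enhancements):
    """Return (selected text, list of type URIs) for each enhancement."""
    mentions = []
    for properties in enhancements.values():
        texts = [v["value"] for v in properties.get(SELECTED_TEXT, [])]
        types = [v["value"] for v in properties.get(DC_TYPE, [])]
        for text in texts:
            mentions.append((text, types))
    return mentions


mentions = list_mentions(json.loads(SAMPLE_RESPONSE))
print(mentions)
```

Enhancements without a selected-text property are skipped, since the excerpt does not show what other kinds of enhancement look like.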
Apache Stanbol = FISE + fast Linked Data local index + semantic rule engine + more?
Apache Stanbol / Nuxeo integration
Architecture diagram (labels only): Local IT infrastructure (LAN); Nuxeo DM addon; Apache Stanbol; Engine 1, Engine 2, Engine 3; DBpedia, Freebase, Geonames, LDAP; steps numbered 1 to 3
• Implemented as an Operation for Studio
• Entities & Relationships stored in Nuxeo Core
• CMIS interoperability
Soon available on marketplace.nuxeo.com
• http://iks-project.eu
• http://fise.demo.nuxeo.com
• http://scribo.ws
• http://incubator.apache.org/stanbol
• http://blogs.nuxeo.com/dev
Questions?