From Text and Data to Knowledge
The Social Semantic Web in Action
Dr. Mark Greaves (markg@vulcan.com)
Jesse Wang (jessew@vulcan.com)
© 2012 Vulcan Inc.
What is Vulcan?
Vulcan is the asset management firm for Paul G. Allen
- Vulcan (www.vulcan.com) creates and advances a variety of world-class endeavors and high-impact initiatives that change and improve the way we live, learn, do business, and experience the world
Major Operating Groups
- Vulcan Capital
- Vulcan Sports and Entertainment
  • Seattle Seahawks, Portland Trail Blazers
- Vulcan Productions
- Vulcan Real Estate
- Paul G. Allen Family Foundation
- Allen Institute for Brain Sciences
- Vulcan Technology (Stratolaunch, Halo, Marine)
Paul Allen and Bill Gates in 1968
The Founding of Microsoft
Paul Allen's Concept of the Digital Aristotle
"Digital Aristotle"
- Inspired by the broad science fiction vision of Big AI
- The volume of scientific knowledge has outpaced our ability to manage it
- Need systems to reason and answer questions, rather than simply retrieve 1000s of relevant documents
  • Example: "What are the reaction products if metallic copper is heated strongly with concentrated hydrochloric acid?"
Project Formation
- Initial workshops conducted in 2001
- Digital Aristotle vision formulated
- Project Halo created in 2002
The Digital Aristotle and Project Halo
Project Halo is a staged, long-range research effort by Vulcan Inc. towards the development of a Digital Aristotle
Project Halo envisions two primary classes of application:
1. A digital textbook capable of answering student questions and helping students with their work
2. A research assistant with broad interdisciplinary skills to help scientists make connections
A Digital Aristotle is a reasoning system capable of answering novel questions and solving advanced problems in a broad range of scientific disciplines
Where to Begin?
AI Grand Challenge
- Read a chapter and answer questions in the back of the book (Reddy 2003)
Focus on Science
- Focus on science, where knowledge is explicitly stated
US Advanced Placement Exam
- Use this US-based test of human competence as a metric
Systems AI
- Include the knowledge, knowledge acquisition, reasoning, and question answering pieces
Project Halo Development Strategy
Pilot (2002-2003), Phase II (2004-2009), Inquire (2010-2015), DA (2016-)
Authors: KR Expert; Single Domain Expert; Small Team of Domain and KR Experts; Community of Scientists, Teachers and KR Experts
Uses: Logic Queries; AP Question Answering; AP QA, General QA, Education; AP QA, General QA, Education, Research
3 Partial AP Syllabi; Single Textbook; Partial AP Syllabus; Complete Domain
What is Inquire?
Inquire is a concept for a new kind of electronic textbook
- Based on Campbell and Reece, Biology
Inquire is a standard electronic textbook, but also
- Contains a full underlying knowledge base of the book's contents
- Answers the reader's questions and provides tailored instruction
- Core technology applies to other areas of biology and other exact sciences (chemistry, physics, geology, etc.)
  • Is less applicable in history, art, languages, philosophy, etc.
- Should support other languages
Inquire's Design Goals
An Inquire user can
√ READ the textbook contents dynamically, interactively
√ ASK questions and get explanations on any subject in the book
- LEARN and master the subject through individualized tutoring
- CREATE and explore their own conceptualizations of the material
A Calculator for Biology
- Simple questions are answered from specific knowledge in the text
- Fully embedded in the textbook
Building a Question-Answering Textbook
Knowledge Authoring: Domain experts (biologists) enter knowledge from Campbell's Biology textbook into a structured logical knowledge base
Inquire Application: The knowledge base is connected to an electronic version of the textbook running on an iPad that enables students to ask questions about the material as they read
Question Answering: Inquire answers a wide range of questions by performing logical inference on the knowledge entered into the knowledge base
Educational Impact: Measure retention and understanding for students using Inquire against students using either a paper or electronic version of the textbook
Factual, Relationship, Biology Core Themes, All Questions
Inquire Demo
Digital Textbook iPad Application
- eBook reader (with standard features)
Knowledge-based features
- Glossary pages (from KB)
- Suggested questions and answers
- Free-form QA dialog
AURA Knowledge Server
- Concept taxonomy from Campbell Biology
- AURA Knowledge Base (21 chapters)
- Suggested question generation
- Reasoning and question answering system
- AAAI Best Video of 2012
http://www.youtube.com/watch?v=fTiW31MBtfA
Inquire vs. Siri vs. Watson vs. Search Engines
Project Halo's Inquire
- Subject Matter: biology
- Questions: question suggestion and simple English queries
- Answers: formatted data and textbook content, exact answers
Apple's Siri
- Subject Matter: actions on iPhone plus Wolfram Alpha searches
- Questions: voice commands, short dialogs
- Answers: performs tasks, Alpha-formatted pages
IBM's Watson
- Subject Matter: general knowledge, trivia, attempting medical
- Questions: Jeopardy-style "reversed" questions
- Answers: single word or short phrase answers
Google, Yahoo, Bing, Baidu
- Subject Matter: open domain, web knowledge
- Questions: keywords and search queries
- Answers: web page listings
Three Main Components of Inquire
Knowledge Entry
Task: Encoding knowledge sentence-by-sentence
- Define a computer language for capturing knowledge
- Represent background knowledge and explicit knowledge
Systems: AURA, Semantic MediaWiki (SMW)
Metrics
- Coverage of book at various depths
- Time per sentence
R&D thrusts
- SMW: Lower authoring costs by crowdsourcing
- SILK: Capturing more knowledge
Question Processing
Task: Map a student question to an answerable query
- English language interpretation
- Suggest good questions to student
Systems: AURA Question Processor
Metrics
- Usefulness of suggested questions
- Accuracy of free-form question processing
R&D thrusts
- Rephrasing
- Textual Entailment
- Learning suggested question mappings
- Question generation from the knowledge base
Question Answering
Task: Generate a useful answer to a student question
- Derive an answer from the knowledge base
- English generation and media selection
- Answer presentation and relevance pruning
Systems: AURA, Cyc, Text Analysis
Metrics
- Answer accuracy
- Answer usefulness to student users
- Time to answer
R&D thrusts
- Hybrid system: Answer more questions
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US, both high school and college
Total Core Sentences: 28,878
56 Chapters, 262 Sections
- Avg. 4.7 sections/chapter, 515 sentences/chapter, 109 sentences/section
- Relevant sentences in book (extrapolation from Chaps. 6, 9, 10): 49%, 14,150 sentences
750 Core library concepts
~6,000 total concepts in taxonomy
- ~3,000 concepts in Chaps. 2-22
~65,000 knowledge base assertions
~50,000 suggested questions
4 hours/sentence authoring time
- Design, planning, encoding, testing, refining, new relation development
60-70 new encoding relations/year
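As a rough cross-check of the figures on this slide, the Python sketch below recomputes the per-chapter average and estimates the total authoring effort. It reads the transcript's numbers as 28,878 core sentences, 49% relevant, and 4 hours per sentence. Applying the 4-hour figure uniformly to every relevant sentence is an assumption made here for illustration, so the total is an order-of-magnitude estimate, not a number from the talk.

```python
# Figures as read from the slide (decimal points and separators restored).
TOTAL_CORE_SENTENCES = 28_878
CHAPTERS = 56
SECTIONS = 262
RELEVANT_FRACTION = 0.49   # share of sentences judged relevant
HOURS_PER_SENTENCE = 4     # authoring time per sentence

sections_per_chapter = SECTIONS / CHAPTERS
sentences_per_chapter = TOTAL_CORE_SENTENCES / CHAPTERS
relevant_sentences = round(TOTAL_CORE_SENTENCES * RELEVANT_FRACTION)

# Assumption: the 4-hour figure applies uniformly to every relevant sentence.
estimated_person_hours = relevant_sentences * HOURS_PER_SENTENCE

if __name__ == "__main__":
    print(f"sections/chapter:       {sections_per_chapter:.1f}")
    print(f"sentences/chapter:      {sentences_per_chapter:.1f}")
    print(f"relevant sentences:     {relevant_sentences}")
    print(f"estimated person-hours: {estimated_person_hours}")
```

Under these assumptions, encoding the relevant part of the book by the existing process would take tens of thousands of person-hours, which is the cost problem the later slides address.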
Inquire's Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly
- ~4 person-hours per sentence
- Intensive testing and knowledge refinement
Biology teams and AI experts represent each sentence
Senior SME, Senior KR Expert
Concept Map Planning; Concept Map Entry; Concept Map Testing; Content Review; KR Review; Assess & Correct
Knowledge Entry Group, Delhi, India
Knowledge Entry Process
1) Determining Relevance -- 2% of time
Highlighting; Diagram Analysis; QA Check
Status Labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus -- 14% of time
Universal Truth (UT) Authoring; Concept Chosen; QA Check
3) Encoding Planning -- 35% of time
Group Common UTs; ID KR/KE Issues
ID Already Encoded; Write How to Encode
Pre-Planning QA Check
Status Labeling: Encoding Complete, KR Issue (Closed)
4) Encoding -- 10% of time
Encode File; JIRA Issues; QA Check
Status Labeling: Encoding Complete, KE Issue
5) Key Term Review -- 25% of time
KR Evaluated by Modeling Expert and Biologist; Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
6) Question-Based Testing -- 14% of time
Use Minimal Test Suite; Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of "Passing" Comparison and Relationship Questions
Peach = EVS, Light Blue = SRI
Planning (51%)
Testing (39%)
Encoding (10%)
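The three summary figures can be reproduced from the six per-step shares, reading each "N time" entry as a percentage of total authoring time. The sketch below does that in Python. The grouping of steps into Planning, Testing, and Encoding is not stated on the slides; it is inferred here because it is the grouping that reproduces the 51/39/10 totals.

```python
# Per-step share of authoring time, in percent, as read from the slide.
STEP_SHARE = {
    "Determining Relevance": 2,
    "Reaching Consensus": 14,
    "Encoding Planning": 35,
    "Encoding": 10,
    "Key Term Review": 25,
    "Question-Based Testing": 14,
}

# Assumed grouping (inferred, not stated in the source).
PHASES = {
    "Planning": ["Determining Relevance", "Reaching Consensus", "Encoding Planning"],
    "Testing": ["Key Term Review", "Question-Based Testing"],
    "Encoding": ["Encoding"],
}


def phase_totals():
    """Sum the step shares inside each assumed phase."""
    return {
        phase: sum(STEP_SHARE[step] for step in steps)
        for phase, steps in PHASES.items()
    }


if __name__ == "__main__":
    for phase, share in phase_totals().items():
        print(f"{phase}: {share}%")
```

On this reading, only a tenth of the effort is the encoding itself; the rest is coordination and review around it.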
How Can We Improve Knowledge Entry?
This is the well-known Knowledge Acquisition Problem
- AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
- Automatic Machine Parsing/Editing
  • Not good enough for biology textbook sentences
  • Error rates are too high
  • Need humans in the loop
- Mechanical Turk Authoring
  • Biology expertise is difficult to get
  • Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
- Social Authoring and Crowdsourcing
  • Wikipedia showed this could work for text
  • In 2008, Vulcan started a pilot project to explore social knowledge authoring
Crowdsourcing for Better Knowledge Acquisition
Edit: wow, I can change the web
Let's share and publish knowledge to make an [[encyclopedia]]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
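For readers unfamiliar with the Gompertz model named in this caption, it is an S-shaped growth curve of the form N(t) = K · exp(−b · exp(−c · t)), where K is the ceiling the curve approaches. The Python sketch below shows its shape. The ceiling is read here as 4.4 million, on the assumption that the transcript dropped the decimal point; the values of `B` and `C` are hypothetical, chosen only for illustration, and are not the parameters of the model shown on the slide.

```python
import math


def gompertz(t: float, ceiling: float, b: float, c: float) -> float:
    """Gompertz growth curve: ceiling * exp(-b * exp(-c * t))."""
    return ceiling * math.exp(-b * math.exp(-c * t))


CEILING = 4.4e6  # assumed reading of the slide's maximum article count
B = 15.0         # hypothetical displacement parameter
C = 0.4          # hypothetical growth rate per year

if __name__ == "__main__":
    # t is years since an arbitrary, hypothetical start date.
    for year in (0, 5, 10, 20, 40):
        print(year, round(gompertz(year, CEILING, B, C)))
```

The curve starts near zero, grows fastest in the middle, and flattens as it nears the ceiling, which is the saturation behavior the caption describes.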
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
- ~17 edits/page on average in Wikipedia (with high variance)
- Wikipedia's Neutral Point of View
Convention
- Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
- Trivial to edit by anyone
- Tracking of all changes, one-step rollback
- Every article has a "Talk" page for discussion
- Notification facility allows anyone to "watch" an article
- Sufficient security on pages, logins can be required
- A hierarchy of administrators, gardeners, and editors
- Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross…
• … all German cars, with engine size, accelerating data…
Can you find
- Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000?
- Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)?
- Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds?
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
- All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
- Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
- A map showing where all Mercedes-Benz vehicles are manufactured
- All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more, and built in year 2000 (2001, 2002) and after, sorted by built year, floors, …, grouped by cities, regions, …
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
- Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
- Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
- "capital_of", "contains", "born_in", …
Querying Interface Support
- E.g. "[[Category:Member]] [[Age::<30]]" (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
- AceWiki
- ArtificialMemory
- Wagn - Ruby on Rails-based
- KiWi - Knowledge in a Wiki
- Knoodl - Semantic Collaboration tool and application platform
- Metaweb - the software that powers Freebase
- OntoWiki
- OpenRecord
- PhpWiki
- Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
- Swirrl - a spreadsheet-based semantic wiki application
- TaOPis - has a semantic wiki subsystem based on Frame logic
- TikiWiki CMS/Groupware - integrates Semantic links as a core feature
- zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
- Open source (GPL), well documented
- Vulcan kicks off SMW+ in 2007
Active development
- Commercial support available
- World-wide community
International Conferences
- Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
- Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
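To make the markup concrete: an SMW annotation has the form `[[Property::value]]`, optionally followed by `|display text`. The Python sketch below pulls such annotations out of a sentence and renders the text a reader would see. It is an illustration only, not SMW's own parser; the regular expression handles just this simple form, and the names `Annotation`, `extract_annotations`, and `render` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class Annotation(NamedTuple):
    prop: str
    value: str
    label: Optional[str]  # display text after "|", if any


# Matches [[Property::value]] and [[Property::value|label]] only.
ANNOTATION = re.compile(r"\[\[([^\[\]|:]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(wikitext: str) -> List[Annotation]:
    """Return the property/value pairs annotated in the text."""
    return [
        Annotation(m.group(1).strip(), m.group(2).strip(), m.group(3))
        for m in ANNOTATION.finditer(wikitext)
    ]


def render(wikitext: str) -> str:
    """Replace each annotation by its label, or by its value if no label."""
    def shown(m):
        return m.group(3) if m.group(3) is not None else m.group(2)
    return ANNOTATION.sub(shown, wikitext)


TEXT = (
    "Beijing University is located in [[Has location::Beijing]] with "
    "[[Has population::30000|about 30 thousand]] students"
)

if __name__ == "__main__":
    for annotation in extract_annotations(TEXT):
        print(annotation)
    print(render(TEXT))
```

The same sentence therefore serves as readable text and as two machine-readable facts about the page it appears on.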
Special Properties
"Has type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allowed values" is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
In Halo Extensions there is domain and range support
- RDFS expressivity
- Semantic Gardening extension also supports "Cardinality"
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China; Category:Skyscrapers by country
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex. Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
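The query above amounts to a conjunction of conditions over typed page properties. The Python sketch below emulates it over a small in-memory list, following the slide's English reading (more than 50 floors, built before 2000). Only the Jin Mao Tower figures (88 stories, 1998) come from the slides; the other pages are hypothetical, and placing every page directly in a "Skyscrapers" category is a simplifying assumption that skips category inheritance. Whether `>` and `<` are strict in a real SMW installation depends on its configuration, so strictness here is also an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Page:
    title: str
    categories: FrozenSet[str]
    located_in: str
    floor_count: int
    year_built: int


SKYSCRAPER = frozenset({"Skyscrapers"})

PAGES = [
    # 88 stories and 1998 are taken from the slides.
    Page("Jin Mao Tower", SKYSCRAPER, "China", 88, 1998),
    # The remaining pages are hypothetical test data.
    Page("Hypothetical Tower A", SKYSCRAPER, "China", 45, 1995),
    Page("Hypothetical Tower B", SKYSCRAPER, "China", 60, 2005),
    Page("Hypothetical Tower C", SKYSCRAPER, "Japan", 70, 1993),
]


def ask(pages: List[Page]) -> List[str]:
    """Titles of skyscrapers in China with > 50 floors, built before 2000."""
    return sorted(
        page.title
        for page in pages
        if "Skyscrapers" in page.categories
        and page.located_in == "China"
        and page.floor_count > 50
        and page.year_built < 2000
    )


if __name__ == "__main__":
    print(ask(PAGES))
```

Each condition in the wiki query corresponds to one clause of the filter, which is why annotated wiki pages can be queried like rows in a database.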
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
- Usability extension to Semantic MediaWiki
- Increases user consensus
- Increases use of semantic data
Semantic MediaWiki
- Core Semantic Wiki engine
- Authoring of explicit knowledge in content
- Basic reasoning capabilities
SMW+
- Shrink-wrap suite of open source software products
- Comes with ready-to-use ontology
- Easy to procure and install
MediaWiki
- Powerful Wiki engine
- Basic CMS feature set
SMW Extensions - Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
- Fill in data (content)
- Develop schema (ontology)
- Design UI: skin, form & templates
- Choose extensions
Video: Semantic Wikis for A New Problem (2008)
Social tag-based characterization
- Keyword search over tag data
- Inconsistent semantics
- Easy to engineer
Algorithm-based object characterization
- Database-style search
- Consistent semantics
- Extremely difficult to engineer
Social database-style characterization
- Database search + wiki text search
- Semantic consistency via wiki mechanisms
- Easy to engineer
Increasing technical complexity →
← Increasing User Participation
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns ("touchdowns with big hits")
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT: With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
✓ Very quick to obtain
✗ Hard to customize
✗ Expensive
Microsoft Project, Version One, Microsoft SharePoint
Custom Development
✗ Slow to develop
✓ Extremely flexible
✗ High cost to develop and maintain
.NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
✓ Still quick to program
✓ Easy to customize
✓ Low-moderate cost
Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven, collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate"
    - UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
    - UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    - UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
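The glycolysis example shows one textbook sentence being split into several UTs, each filed under a concept. A minimal Python sketch of that structure follows. The `UniversalTruth` class and `group_by_concept` function are hypothetical names used for illustration, not part of AURA or SMW+; the three statements are the ones given on the slide.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class UniversalTruth:
    concept: str    # the concept the statement is filed under
    statement: str  # the universally-quantified sentence


# The three UTs derived from the single glycolysis sentence on the slide.
UTS = [
    UniversalTruth("Glycolysis", "Glycolysis occurs in Cytosol"),
    UniversalTruth(
        "Cellular Respiration",
        "During cellular respiration, glycolysis is the first step",
    ),
    UniversalTruth(
        "Glycolysis",
        "During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules",
    ),
]


def group_by_concept(uts: List[UniversalTruth]) -> Dict[str, List[str]]:
    """Collect UT statements under the concept each one describes."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.statement)
    return dict(grouped)


if __name__ == "__main__":
    for concept, statements in group_by_concept(UTS).items():
        print(concept)
        for statement in statements:
            print("  -", statement)
```

Grouping by concept mirrors how a wiki would present the statements: one page per concept, with the UTs contributed for it listed together.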
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
- Build a "Knowledge Integration Center"
Current Page Designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise, embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
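[Bar chart: HW score and quiz score for the Inquire, Ablated, and Textbook groups (n=24, n=25, n=23); values shown: 81, 88, 74, 75, 71, 81]
[Chart: Quiz Score Distribution, number of students per grade (A, B, C, D, F) for the Inquire, Ablated, and Textbook groups]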
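One way to read the reported p-values is to compare each against a significance threshold. The Python sketch below does so, reading the transcript's values as 0.12, 0.02, 0.52 (homework) and 0.002, 0.05, 0.18 (quiz) on the assumption that decimal points were dropped. The 0.05 threshold is a common convention assumed here, not something stated on the slide, and no raw scores are available, so nothing is recomputed.

```python
ALPHA = 0.05  # assumed significance threshold (not given in the source)

# Two-tailed t-test p-values as read from the slide.
P_VALUES = {
    ("HW", "Full vs. Ablated"): 0.12,
    ("HW", "Full vs. Textbook"): 0.02,
    ("HW", "Ablated vs. Textbook"): 0.52,
    ("Quiz", "Full vs. Ablated"): 0.002,
    ("Quiz", "Full vs. Textbook"): 0.05,
    ("Quiz", "Ablated vs. Textbook"): 0.18,
}


def significant(p_values, alpha=ALPHA):
    """Return the comparisons whose p-value is strictly below alpha."""
    return sorted(key for key, p in p_values.items() if p < alpha)


if __name__ == "__main__":
    for measure, comparison in significant(P_VALUES):
        print(f"{measure}: {comparison} (p = {P_VALUES[(measure, comparison)]})")
```

With a strict 0.05 cutoff, the quiz comparison of Full vs. Textbook sits exactly on the boundary and is not counted, which shows how sensitive such a reading is to the chosen threshold.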
What is Vulcan
Vulcan is the asset management firm for Paul G Allen
ndash Vulcan (wwwvulcancom) creates and advances
a variety of world-class endeavors and high
impact initiatives that change and improve the
way we live learn do business and experience
the world
Major Operating Groups
ndash Vulcan Capital
ndash Vulcan Sports and Entertainment
bull Seattle Seahawks Portland Trail Blazers
ndash Vulcan Productions
ndash Vulcan Real Estate
ndash Paul G Allen Family Foundation
ndash Allen Institute for Brain Sciences
ndash Vulcan Technology (Stratolaunch Halo Marine)
2
Paul Allen and Bill Gates in 1968
The Founding of Microsoft
Paul Allenrsquos Concept of the Digital Aristotle
ldquoDigital Aristotlerdquo ndash Inspired by the broad science fiction vision
of Big AI
ndash The volume of scientific knowledge has outpaced our ability to manage it
ndash Need systems to reason and answer questions rather than simply retrieve 1000s of relevant documents
bull Example ldquoWhat are the reaction products if metallic copper is heated strongly with concentrated hydrochloric acidrdquo
Project Formation ndash Initial Workshops conducted in 2001
ndash Digital Aristotle vision formulated
ndash Project Halo created in 2002
The Digital Aristotle and Project Halo
Project Halo is a staged long-range research effort by Vulcan
Inc towards the development of a Digital Aristotle
Project Halo envisions two primary classes of application
1 A digital textbook capable of answering student questions and
helping students with their work
2 A research assistant with broad interdisciplinary skills to help
scientists make connections
A Digital Aristotle is a reasoning
system capable of answering novel
questions and solving advanced
problems in a broad range of
scientific disciplines
Where to Begin
AI Grand Challenge ndash Read a chapter and answer
questions in the back of the book (Reddy 2003)
Focus on Science ndash Focus on science where
knowledge is explicitly stated
US Advanced Placement Exam ndash Use this US-based test of human
competence as a metric
Systems AI ndash Include the knowledge knowledge
acquisition reasoning and question answering pieces
Project Halo Development Strategy
KR
Expert
Single Domain
Expert
Small Team of
Domain and KR
Experts
Community of
Scientists
Teachers and KR
Experts
Logic
Queries
AP Question
Answering
AP QA
General QA
Education
AP QA
General QA
Education
Research
Authors
Uses
Pilot Phase II Inquire DA
3 Partial
AP Syllabi
Single
Textbook Partial
AP Syllabus Complete
Domain
2002-2003 2004-2009 2010-2015 2016-
What is Inquire
Inquire is a concept for a new kind of electronic textbook ndash Based on Campbell and Reese
Biology
Inquire is a standard electronic textbook but also ndash Contains a full underlying knowledge
base of the bookrsquos contents
ndash Answers the readerrsquos questions and provides tailored instruction
ndash Core technology applies to other areas of biology and other exact sciences (chemistry physics geology etc)
bull Is less applicable in history art languages philosophy etc
ndash Should support other languages
Inquirersquos Design Goals
An Inquire user can radic READ the textbook contents
dynamically interactively
radic ASK questions and get explanations on any subject in the book
ndash LEARN and master the subject through individualized tutoring
ndash CREATE and explore their own conceptualizations of the material
A Calculator for Biology ndash Simple questions are answered
from specific knowledge in the text
ndash Fully embedded in the textbook
Building a Question-Answering Textbook
Inquire Application
Knowledge Authoring
Question Answering
Educational Impact
Factual
Relationship
Biology Core Themes
All Questions
Domain experts
(biologists) enter
knowledge from
Campbellrsquos Biology
textbook into a
structured logical
knowledge base
The knowledge
base is connected
to a electronic
version of the
textbook running
on an iPad that
enables students
to ask questions
about the material
as they read
Measure retention
and understanding
for students using
Inquire against
students using
either a paper or
electronic version
of the textbook
Inquire answers a
wide range of
questions by
performing logical
inference on the
knowledge entered
into the knowledge
base
Inquire Demo
Digital Textbook iPad Application ndash eBook reader (with standard features)
Knowledge-based features ndash Glossary pages (from KB)
ndash Suggested questions and answers
ndash Free-form QA dialog
AURA Knowledge Server ndash Concept taxonomy from Campbell Biology
ndash AURA Knowledge Base (21 chapters)
ndash Suggested question generation
ndash Reasoning and question answering system
ndash AAAI Best Video of 2012
httpwwwyoutubecomwatchv=fTiW31MBtfA
Inquire vs Siri vs Watson vs Search Engines
Project Halorsquos Inquire ndash Subject Matter biology
ndash Questions Question suggestion and simple English queries
ndash Answers Formatted data and textbook content exact answers
Applersquos Siri ndash Subject Matter Actions on iPhone plus Wolfram Alpha searches
ndash Questions Voice commands short dialogs
ndash Answers Performs tasks Alpha-formatted pages
IBMrsquos Watson ndash Subject Matter General knowledge trivia attempting medical
ndash Questions Jepoardy-style ldquoreversedrdquo questions
ndash Answers Single word or short phrase answers
Google Yahoo Bing Baidu ndash Subject Matter Open domain web knowledge
ndash Questions keywords and search queries
ndash Answers web page listings
13
Three Main Components of Inquire
Task Encoding knowledge sentence-by-sentence
ndash Define a computer language for capturing knowledge
ndash Represent background knowledge and explicit knowledge
Systems AURA Semantic Media Wiki (SMW)
Metrics ndash Coverage of book at
various depths
ndash Time per sentence
RampD thrusts ndash SMW Lower authoring
costs by crowdsourcing
ndash SILK Capturing more knowledge
Task Map a student question to an answerable query
ndash English language interpretation
ndash Suggest good questions to student
Systems AURA Question Processor
Metrics ndash Usefulness of suggested
questions
ndash Accuracy of free-form question processing
RampD thrusts ndash Rephrasing
ndash Textual Entailment
ndash Learning suggested question mappings
ndash Question generation from the knowledge base
Task Generate a useful answer to a student question
ndash Derive an answer from the knowledge base
ndash English generation and media selection
ndash Answer presentation and relevance pruning
Systems AURA Cyc Text Analysis
Metrics ndash Answer accuracy
ndash Answer usefulness to student users
ndash Time to answer
RampD thrusts ndash Hybrid system Answer
more questions
Knowledge Entry Question Processing Question Answering
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US both high school and college
Total Core Sentences 28878
56 Chapters 262 Sections ndash Avg 47 sectionschapter 515
sentenceschapter 109 sentencessection
ndash Relevant sentences in book (extrapolation from Chaps 6910) 49 14150 sentences
750 Core library concepts
~6000 total concepts in taxonomy ndash ~3000 concepts in Chaps 2-22
~65000 knowledge base assertions
~50000 suggested questions
4 hourssentence authoring time ndash Design planning encoding testing
refining new relation development
60-70 new encoding relationsyear
Inquirersquos Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly ndash ~4 person-hours per
sentence
ndash Intensive testing and knowledge refinement
Biology Teams and AI experts
represent each sentence
Senior
SME
Senior KR
Expert
Concept
Map
Planning
Concept
Map Entry
Concept
Map
Testing
Content
Review
KR
Review
Assess amp
Correct
Knowledge Entry Group
Delhi India
Knowledge Entry Process
3) Encoding Planning -- 35 time
Group Common UTs ID KRKE Issues
ID Already Encoded Write How to Encode
Pre-Planning QA Check
Status Labeling Encoding Complete KR Issue (Closed)
2) Reaching Consensus -- 14 time
Universal Truth Authoring Concept Chosen QA Check
1) Determining Relevance -- 2 time
Highlighting Diagram Analysis QA Check
Status Labeling Relevant Irrelevant (Closed)
6) Question-Based Testing -- 14 time
Use Minimal Test Suite Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of ldquoPassing Comparison and Relationship Questions
5) Key Term Review -- 25 time
KR Evaluated by Modeling Expert and Biologist Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
4) Encoding -- 10 time
Encode File JIRA Issues QA Check
Status Labeling Encoding Complete KE Issue
Peach = EVS Light Blue = SRI
Planning (51%)
Testing (39%)
Encoding (10%)
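The three phase totals follow from the six per-step shares listed for the Knowledge Entry Process. The sketch below shows one grouping that reproduces them; the assignment of steps to phases is an assumption inferred from the totals, since the slides do not spell it out.

```python
# Share of authoring time per step, as listed on the slide (percent).
STEP_SHARE = {
    "Determining Relevance": 2,
    "Reaching Consensus": 14,
    "Encoding Planning": 35,
    "Encoding": 10,
    "Key Term Review": 25,
    "Question-Based Testing": 14,
}

# Assumed grouping of steps into phases (inferred from the totals, not stated).
PHASES = {
    "Planning": ["Determining Relevance", "Reaching Consensus", "Encoding Planning"],
    "Encoding": ["Encoding"],
    "Testing": ["Key Term Review", "Question-Based Testing"],
}

phase_share = {
    phase: sum(STEP_SHARE[step] for step in steps)
    for phase, steps in PHASES.items()
}
print(phase_share)
```

Read this way, only about a tenth of the time goes into encoding itself; the rest is planning and review, which is the part social authoring is meant to reduce.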
How Can We Improve Knowledge Entry?
This is the well-known Knowledge Acquisition Problem
- AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
- Automatic Machine Parsing/Editing
• Not good enough for biology textbook sentences
• Error rates are too high
• Need humans in the loop
- Mechanical Turk Authoring
• Biology expertise is difficult to get
• Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
- Social Authoring and Crowdsourcing
• Wikipedia showed this could work for text
• In 2008, Vulcan started a pilot project to explore social knowledge authoring
Crowdsourcing for Better Knowledge Acquisition
[edit] Wow, I can change the web
Let's share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick
blue line) compared with a Gompertz model that leads
eventually to a maximum of about 4.4 million articles
(thin green line)
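For readers unfamiliar with the curve named in the caption, a Gompertz model is a growth curve that rises quickly and then flattens toward a fixed ceiling. The sketch below is illustrative only: the ceiling is read as 4.4 million articles, and the shape parameters `B` and `C` are hypothetical values chosen to produce a plausible curve, not the parameters fitted to Wikipedia.

```python
import math

def gompertz(t: float, ceiling: float, b: float, c: float) -> float:
    """Gompertz growth curve: ceiling * exp(-b * exp(-c * t))."""
    return ceiling * math.exp(-b * math.exp(-c * t))

CEILING = 4.4e6     # asymptotic article count, read as 4.4 million
B, C = 15.0, 0.25   # hypothetical shape parameters, for illustration only

# Print the modeled article count at a few points in time (arbitrary units).
for t in (0, 5, 10, 20, 40):
    print(t, round(gompertz(t, CEILING, B, C)))
```

The values increase with `t` and level off just below the ceiling, which is the behavior the thin green line in the figure represents.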
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
- ~17 edits/page on average in Wikipedia (with high variance)
- Wikipedia's Neutral Point of View
Convention
- Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
- Trivial to edit by anyone
- Tracking of all changes, one-step rollback
- Every article has a "Talk" page for discussion
- Notification facility allows anyone to "watch" an article
- Sufficient security on pages, logins can be required
- A hierarchy of administrators, gardeners, and editors
- Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is ...
Wikipedia has articles about ...
• ... all movies with cast, director, budget, running time, gross, ...
• ... all German cars with engine size, acceleration data, ...
Can you find
Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000?
Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)?
Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds?
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
A map showing where all Mercedes-Benz vehicles are manufactured
All skyscrapers in China (Japan, Thailand, ...) of 50 (40, 60, 70) floors or more and built in year 2000 (2001, 2002) and after, sorted by built year, floors, ..., grouped by cities, regions, ...
What is a Semantic Wiki?
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
- Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
- Page/Card, Date, Number, URL/Email, String, ...
Typed Links (e.g. properties)
- "capital_of", "contains", "born_in", ...
Querying Interface Support
- E.g. "[[Category:Member]] [[Age::<30]]" (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
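The "schema-last" idea can be illustrated with a toy store in which a property comes into existence the first time someone uses it, and the schema is simply a record of what has been entered. This is a minimal sketch of the concept, not how SMW is implemented; the class name `SchemaLastStore` and its methods are hypothetical, and the sample facts are only illustrative.

```python
from collections import defaultdict

class SchemaLastStore:
    """Toy fact store where the schema is derived from the data entered so far."""

    def __init__(self) -> None:
        self.facts: dict[str, dict[str, object]] = defaultdict(dict)
        self.schema: dict[str, str] = {}   # property name -> observed value type

    def annotate(self, page: str, prop: str, value: object) -> None:
        # No up-front table design: a property exists once someone uses it.
        self.schema.setdefault(prop, type(value).__name__)
        self.facts[page][prop] = value

store = SchemaLastStore()
store.annotate("Beijing", "Has country", "China")
store.annotate("Beijing", "Has population", 2200000)
print(store.schema)
```

In a conventional database both columns would have to be designed before any row could be inserted; here the schema is whatever the contributors have written so far, and it can be revised in the same place as the data.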
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi - Knowledge in a Wiki
Knoodl - Semantic Collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware - integrates Semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
Semantic MediaWiki and SEMANTIC EXTENSIONS
Semantic MediaWiki Software
Born at AIFB in 2005
- Open source (GPL), well documented
- Vulcan kicks off SMW+ in 2007
Active development
- Commercial support available
- World-wide community
International Conferences
- Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
- Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has location::Beijing]] with
[[Has population::30000|about 30 thousand]]
students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
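To make the markup concrete, the sketch below pulls property/value pairs out of a sentence like the one above and shows what a reader would see once the annotations are rendered. It assumes the standard SMW inline form `[[Property::value|label]]`; the regular expression and the helper names `extract_annotations` and `render` are illustrative and are not part of SMW.

```python
import re

# Matches [[Property::value]] and [[Property::value|display label]].
ANNOTATION = re.compile(r"\[\[([^:\[\]|]+)::([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")

def extract_annotations(wikitext: str) -> list[tuple[str, str]]:
    """Return (property, value) pairs found in the wikitext."""
    return [(m.group(1).strip(), m.group(2).strip())
            for m in ANNOTATION.finditer(wikitext)]

def render(wikitext: str) -> str:
    """Replace each annotation with its label, or its value if no label is given."""
    return ANNOTATION.sub(lambda m: m.group(3) or m.group(2), wikitext)

text = ("Beijing University is located in [[Has location::Beijing]] with "
        "[[Has population::30000|about 30 thousand]] students")
print(extract_annotations(text))
print(render(text))
```

The same sentence therefore serves two audiences: the reader sees ordinary prose, while the wiki stores two typed facts about the page.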
Special Properties
"Has Type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allowed Values" is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
The Halo Extensions add domain and range support
- RDFS expressivity
- Semantic Gardening extension also supports "Cardinality"
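The effect of an allowed-values restriction can be pictured as a simple membership check at entry time. The sketch below is a conceptual illustration only: the property name "Priority" is hypothetical, and the function does not reflect how SMW enforces the restriction internally.

```python
# Hypothetical property "Priority" restricted to the three values listed above.
ALLOWED_VALUES = {"Priority": {"Low", "Medium", "High"}}

def check_value(prop: str, value: str) -> bool:
    """Accept any value for unrestricted properties, else only listed values."""
    allowed = ALLOWED_VALUES.get(prop)
    return allowed is None or value in allowed

print(check_value("Priority", "Medium"))       # listed value
print(check_value("Priority", "Urgent"))       # not listed
print(check_value("Has location", "Beijing"))  # no restriction declared
```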
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in ...
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
...
}}
Ex.: Skyscrapers in China higher than
50 stories, built before 2000
ASK/SPARQL query target
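The query reads as a conjunction of conditions over page properties. The sketch below mimics that logic over a handful of in-memory records. The sample pages are hypothetical apart from the Jin Mao Tower figures, which follow the slide text; the comparisons here are strict, whereas SMW's own comparator semantics depend on its configuration.

```python
import operator

# Hypothetical sample pages; only the Jin Mao Tower figures follow the slide text.
PAGES = [
    {"name": "Jin Mao Tower", "Category": "Skyscrapers", "Located in": "China",
     "Floor count": 88, "Year built": 1998},
    {"name": "Example Tower", "Category": "Skyscrapers", "Located in": "China",
     "Floor count": 60, "Year built": 2005},
    {"name": "Example Hall", "Category": "Skyscrapers", "Located in": "China",
     "Floor count": 30, "Year built": 1990},
]

# Each condition is (property, comparison, value), mirroring the query above.
QUERY = [
    ("Category", operator.eq, "Skyscrapers"),
    ("Located in", operator.eq, "China"),
    ("Floor count", operator.gt, 50),
    ("Year built", operator.lt, 2000),
]

def ask(pages, conditions):
    """Return names of pages that satisfy every condition (logical AND)."""
    return [p["name"] for p in pages
            if all(p.get(prop) is not None and compare(p[prop], value)
                   for prop, compare, value in conditions)]

print(ask(PAGES, QUERY))
```

This is the kind of question that keyword search over article text cannot answer, because the floor count and build year are buried in prose rather than stored as typed values.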
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
- Usability extension to Semantic MediaWiki
- Increases user consensus
- Increases use of semantic data
Semantic MediaWiki
- Core Semantic Wiki engine
- Authoring of explicit knowledge in content
- Basic reasoning capabilities
SMW+
- Shrink-wrapped suite of open source software products
- Comes with ready-to-use ontology
- Easy to procure and install
MediaWiki
- Powerful Wiki engine
- Basic CMS feature set
SMW Extensions - Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, ...
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, ...
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, ...
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, ...
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, forms & templates
Choose extensions
Video Semantic Wikis for a New Problem (2008)
Social tag-based characterization
- Keyword search over tag data
- Inconsistent semantics
- Easy to engineer
Increasing technical complexity →
← Increasing user participation
Algorithm-based object characterization
- Database-style search
- Consistent semantics
- Extremely difficult to engineer
Social database-style characterization
- Database search + wiki text search
- Semantic consistency via wiki mechanisms
- Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (ldquotouchdowns with big hitsrdquo)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
• Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
- Pro: Very quick to obtain
- Con: Hard to customize
- Con: Expensive
- Examples: Microsoft Project, VersionOne, Microsoft SharePoint
Custom Development
- Con: Slow to develop
- Pro: Extremely flexible
- Con: High cost to develop and maintain
- Examples: .NET Framework, J2EE, ..., Ruby on Rails
SMW+ Suite
- Pro: Still quick to program
- Pro: Easy to customize
- Pro: Low-moderate cost
- Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
• Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate."
- UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
- UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
- UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAWiki
- Build a "Knowledge Integration Center"
Current Page Designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from 2-tailed t-tests
HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
[Bar chart: HW score and quiz score for the Inquire, Ablated, and Textbook groups; bar labels 81, 88, 74, 75, 71, 81; grade bands A-D; group sizes n=24, n=25, n=23]
[Chart: Quiz Score Distribution, % of students by grade (A, B, C, D, F) for Inquire, Ablated, and Textbook]
ndash Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process ndash Potential savings of over 50 of AURA authoring time
Use the social semantic web for real AI authoring 85
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring ndash Top-down Encoding Flow
bull Suggestions and commentary from KE (Knowledge Engineers) and SMEs previous contributions
bull Supported by ontological terms and relationships questions guidelines testing
ndash Bottom-up Encoding Flow bull UT suggestions based upon algorithms (ReVerb by University of Washington)
bull All suggestions validated via Mechanical Turk before use in AURAWiki
ndash Build a ldquoKnowledge Integration Centerrdquo
Current Page Designs ndash Main Page
ndash User Page
ndash Authoring Page
Will perform initial testing by September 15
86
Conclusion
Inquire is a new application of AI technology ndash Question-answering in textbooks
ndash Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem ndash Crowdsourcing addresses the cost and update problems
ndash Validate using Inquire authoring process
87
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
89
Thank You
wwwvulcancom
wwwprojecthalocom
wwwinquireprojectcom
wwwsmwpluscom
Quiz and Homework Scores
P-value from 2 tailed t-tests HW scores Full vs Ablated 012 Full vs textbook 002 Ablated vs Text 052
Quiz scores Full vs Ablated 0002 Full vs textbook 005 Ablated vs Text 018
81
88
74 75 71
81
HW SCORE QUIZ SCORE
A
B
C
D
n=24
n=25
n=23
0
10
20
30
40
50
A B C D F
o
f st
ud
ents
Inquire Ablated Textbook
textbook
Quiz Score Distribution
The Founding of Microsoft
Paul Allen’s Concept of the Digital Aristotle
“Digital Aristotle”
– Inspired by the broad science fiction vision of Big AI
– The volume of scientific knowledge has outpaced our ability to manage it
– Need systems to reason and answer questions, rather than simply retrieve 1000s of relevant documents
• Example: “What are the reaction products if metallic copper is heated strongly with concentrated hydrochloric acid?”
Project Formation
– Initial workshops conducted in 2001
– Digital Aristotle vision formulated
– Project Halo created in 2002
The Digital Aristotle and Project Halo
Project Halo is a staged, long-range research effort by Vulcan Inc. towards the development of a Digital Aristotle
Project Halo envisions two primary classes of application:
1. A digital textbook capable of answering student questions and helping students with their work
2. A research assistant with broad interdisciplinary skills to help scientists make connections
A Digital Aristotle is a reasoning system capable of answering novel questions and solving advanced problems in a broad range of scientific disciplines
Where to Begin
AI Grand Challenge
– Read a chapter and answer questions in the back of the book (Reddy 2003)
Focus on Science
– Focus on science, where knowledge is explicitly stated
US Advanced Placement Exam
– Use this US-based test of human competence as a metric
Systems AI
– Include the knowledge, knowledge acquisition, reasoning, and question answering pieces
Project Halo Development Strategy
Pilot (2002-2003)
– Authors: KR expert
– Uses: logic queries
Phase II (2004-2009)
– Authors: single domain expert
– Uses: AP question answering
Inquire (2010-2015)
– Authors: small team of domain and KR experts
– Uses: AP QA, general QA, education
DA (2016-)
– Authors: community of scientists, teachers, and KR experts
– Uses: AP QA, general QA, education, research
Scope labels on the slide: 3 partial AP syllabi; single textbook, partial AP syllabus; complete domain
What is Inquire?
Inquire is a concept for a new kind of electronic textbook
– Based on Campbell and Reece, Biology
Inquire is a standard electronic textbook, but also
– Contains a full underlying knowledge base of the book’s contents
– Answers the reader’s questions and provides tailored instruction
– Core technology applies to other areas of biology and other exact sciences (chemistry, physics, geology, etc.)
• Is less applicable in history, art, languages, philosophy, etc.
– Should support other languages
Inquire’s Design Goals
An Inquire user can
√ READ the textbook contents dynamically, interactively
√ ASK questions and get explanations on any subject in the book
– LEARN and master the subject through individualized tutoring
– CREATE and explore their own conceptualizations of the material
A Calculator for Biology
– Simple questions are answered from specific knowledge in the text
– Fully embedded in the textbook
Building a Question-Answering Textbook
Knowledge Authoring: Domain experts (biologists) enter knowledge from Campbell’s Biology textbook into a structured logical knowledge base
Inquire Application: The knowledge base is connected to an electronic version of the textbook running on an iPad that enables students to ask questions about the material as they read
Question Answering: Inquire answers a wide range of questions by performing logical inference on the knowledge entered into the knowledge base
Educational Impact: Measure retention and understanding for students using Inquire against students using either a paper or electronic version of the textbook
Chart labels: Factual, Relationship, Biology Core Themes, All Questions
Inquire Demo
Digital Textbook iPad Application
– eBook reader (with standard features)
Knowledge-based features
– Glossary pages (from KB)
– Suggested questions and answers
– Free-form QA dialog
AURA Knowledge Server
– Concept taxonomy from Campbell Biology
– AURA Knowledge Base (21 chapters)
– Suggested question generation
– Reasoning and question answering system
– AAAI Best Video of 2012: http://www.youtube.com/watch?v=fTiW31MBtfA
Inquire vs. Siri vs. Watson vs. Search Engines
Project Halo’s Inquire
– Subject matter: biology
– Questions: question suggestion and simple English queries
– Answers: formatted data and textbook content, exact answers
Apple’s Siri
– Subject matter: actions on iPhone, plus Wolfram Alpha searches
– Questions: voice commands, short dialogs
– Answers: performs tasks, Alpha-formatted pages
IBM’s Watson
– Subject matter: general knowledge, trivia, attempting medical
– Questions: Jeopardy-style “reversed” questions
– Answers: single-word or short-phrase answers
Google, Yahoo, Bing, Baidu
– Subject matter: open-domain web knowledge
– Questions: keywords and search queries
– Answers: web page listings
Three Main Components of Inquire
Knowledge Entry
Task: Encoding knowledge sentence-by-sentence
– Define a computer language for capturing knowledge
– Represent background knowledge and explicit knowledge
Systems: AURA, Semantic MediaWiki (SMW)
Metrics
– Coverage of book at various depths
– Time per sentence
R&D thrusts
– SMW: Lower authoring costs by crowdsourcing
– SILK: Capturing more knowledge
Question Processing
Task: Map a student question to an answerable query
– English language interpretation
– Suggest good questions to student
Systems: AURA Question Processor
Metrics
– Usefulness of suggested questions
– Accuracy of free-form question processing
R&D thrusts
– Rephrasing
– Textual entailment
– Learning suggested question mappings
– Question generation from the knowledge base
Question Answering
Task: Generate a useful answer to a student question
– Derive an answer from the knowledge base
– English generation and media selection
– Answer presentation and relevance pruning
Systems: AURA, Cyc, Text Analysis
Metrics
– Answer accuracy
– Answer usefulness to student users
– Time to answer
R&D thrusts
– Hybrid system: Answer more questions
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US, both high school and college
Total core sentences: 28,878
56 chapters, 262 sections
– Avg. 4.7 sections/chapter, 515 sentences/chapter, 109 sentences/section
– Relevant sentences in book (extrapolation from Chaps. 6, 9, 10): 49%, 14,150 sentences
750 core library concepts
~6,000 total concepts in taxonomy
– ~3,000 concepts in Chaps. 2-22
~65,000 knowledge base assertions
~50,000 suggested questions
4 hours/sentence authoring time
– Design, planning, encoding, testing, refining, new relation development
60-70 new encoding relations/year
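To make the scale of these figures concrete, here is a back-of-the-envelope sketch in Python. It assumes every relevant sentence costs the quoted 4 hours to author; the total person-hours figure is derived here and does not appear on the slide, and the function name is only illustrative.

```python
# Figures quoted on the slide (Campbell Biology as encoded for Inquire).
TOTAL_CORE_SENTENCES = 28_878
CHAPTERS = 56
SECTIONS = 262
RELEVANT_FRACTION = 0.49   # extrapolated from chapters 6, 9 and 10
HOURS_PER_SENTENCE = 4     # authoring time per sentence (assumed uniform)


def authoring_estimate(total=TOTAL_CORE_SENTENCES,
                       fraction=RELEVANT_FRACTION,
                       hours=HOURS_PER_SENTENCE):
    """Return (relevant_sentences, person_hours) for encoding the whole book."""
    relevant = round(total * fraction)
    return relevant, relevant * hours


relevant, person_hours = authoring_estimate()
print(f"sections per chapter:   {SECTIONS / CHAPTERS:.1f}")
print(f"relevant sentences:     {relevant}")
print(f"person-hours to encode: {person_hours}")
```

Under these assumptions, the estimate comes to tens of thousands of person-hours for one textbook, which is the cost the later crowdsourcing slides try to reduce.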
Inquire’s Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly
– ~4 person-hours per sentence
– Intensive testing and knowledge refinement
Biology teams and AI experts represent each sentence
[Diagram: Senior SME and Senior KR Expert; Concept Map Planning, Concept Map Entry, Concept Map Testing, Content Review, KR Review, Assess & Correct; Knowledge Entry Group, Delhi, India]
Knowledge Entry Process
1) Determining Relevance -- 2% of time
– Highlighting; Diagram Analysis; QA Check
– Status labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus -- 14% of time
– Universal Truth (UT) Authoring; Concept (cMap) Chosen; QA Check
3) Encoding Planning -- 35% of time
– Group Common UTs; ID KR/KE Issues; ID Already Encoded; Write How to Encode; Pre-Planning QA Check
– Status labeling: Encoding Complete, KR Issue (Closed)
4) Encoding -- 10% of time
– Encode; File JIRA Issues; QA Check
– Status labeling: Encoding Complete, KE Issue
5) Key Term Review -- 25% of time
– KR Evaluated by Modeling Expert and Biologist; Encoder Makes Changes; KR Evaluated by Modeling Expert and Biologist; QA Check
6) Question-Based Testing -- 14% of time
– Use Minimal Test Suite; Reasoning JIRA Issues Filed; Encoder Fills KB Gaps
– QA Check with screenshots of “passing” comparison and relationship questions
Color key: Peach = EVS, Light Blue = SRI
Summary: Planning (51%), Testing (39%), Encoding (10%)
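The three summary figures can be checked against the six per-step shares. The sketch below uses the percentages quoted on the slide; the grouping of steps into phases is an assumption, chosen because it reproduces the 51/39/10 split.

```python
# Share of authoring time per step, in percent, as quoted on the slide.
STEP_SHARE = {
    "Determining Relevance": 2,
    "Reaching Consensus": 14,
    "Encoding Planning": 35,
    "Encoding": 10,
    "Key Term Review": 25,
    "Question-Based Testing": 14,
}

# Assumed grouping of steps into phases (not stated explicitly in the deck).
PHASES = {
    "Planning": ["Determining Relevance", "Reaching Consensus", "Encoding Planning"],
    "Testing": ["Key Term Review", "Question-Based Testing"],
    "Encoding": ["Encoding"],
}


def phase_shares(step_share, phases):
    """Sum the per-step percentages within each phase."""
    return {phase: sum(step_share[step] for step in steps)
            for phase, steps in phases.items()}


shares = phase_shares(STEP_SHARE, PHASES)
for phase, share in shares.items():
    print(f"{phase}: {share}%")
```

Read this way, only about a tenth of the effort is the encoding itself; planning and testing dominate.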
How Can We Improve Knowledge Entry?
This is the well-known Knowledge Acquisition Problem
– AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
– Automatic Machine Parsing/Editing
• Not good enough for biology textbook sentences
• Error rates are too high
• Need humans in the loop
– Mechanical Turk Authoring
• Biology expertise is difficult to get
• Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
– Social Authoring and Crowdsourcing
• Wikipedia showed this could work for text
• In 2008, Vulcan started a pilot project to explore social knowledge authoring
Crowdsourcing for Better Knowledge Acquisition
“Edit”: wow, I can change the web
Let’s share and publish knowledge to make an [[encyclopedia]]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
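For readers unfamiliar with the Gompertz model named in the caption, here is a minimal sketch of the curve. The ceiling is taken as 4.4 million articles, on the assumption that the decimal point was lost in the transcript's "44 million". The shape parameters `B` and `C` are hypothetical values chosen for illustration, not the fitted ones.

```python
import math


def gompertz(t, a, b, c):
    """Gompertz growth curve: a * exp(-b * exp(-c * t)).

    a is the asymptote (the eventual maximum), b shifts the curve
    along the time axis, and c sets the growth rate.
    """
    return a * math.exp(-b * math.exp(-c * t))


ASYMPTOTE = 4.4e6   # article ceiling, as read from the caption
B, C = 5.0, 0.3     # hypothetical shape parameters, for illustration only

for year in (0, 5, 10, 20, 40):
    print(year, round(gompertz(year, ASYMPTOTE, B, C)))
```

The curve rises slowly, accelerates, then flattens toward the asymptote, which is why the model predicts a finite maximum article count.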
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
– ~17 edits/page on average in Wikipedia (with high variance)
– Wikipedia’s Neutral Point of View
Convention
– Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
Trivial to edit by anyone
Tracking of all changes, one-step rollback
Every article has a “Talk” page for discussion
Notification facility allows anyone to “watch” an article
Sufficient security on pages, logins can be required
A hierarchy of administrators, gardeners, and editors
Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross…
• … all German cars, with engine size, acceleration data…
Can you find
• Sci-Fi movies made between 2000-2008 that cost less than $10000 and made $30000
• Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)
• Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
A map showing where all Mercedes-Benz vehicles are manufactured
All skyscrapers in China (Japan, Thailand…) of 50 (40, 60, 70) floors or more and built in year 2000 (2001, 2002) and after, sorted by built year, floors…, grouped by cities, regions…
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web compatible
Semantic Wiki: Two Perspectives
– Wikis for Metadata
– Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
– Category/Tags, Namespaces, Title, Versioning
Typed content (built-ins + user created, e.g. categories)
– Page/Card, Date, Number, URL/Email, String, …
Typed links (e.g. properties)
– “capital_of”, “contains”, “born_in”…
Querying interface support
– E.g. “[[Category:Member]] [[Age::<30]]” (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic wikis facilitate consensus over data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic wikis are “Schema-Last”
– Databases require DBAs and schema design
– Semantic wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi - Knowledge in a Wiki
Knoodl - semantic collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware - integrates semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
– Open source (GPL), well documented
– Vulcan kicks off SMW+ in 2007
Active development
– Commercial support available
– World-wide community
International conferences
– Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
– Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page “Property:Has location”: [[Has type::Page]]
In page “Property:Has population”: [[Has type::Number]]
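To show what this markup encodes, here is a small Python sketch that pulls property/value pairs out of wikitext written in SMW's `[[Property::Value|label]]` form and renders the sentence a reader would see. It is not SMW's own parser (SMW is a PHP extension). The regular expression and function names are illustrative assumptions and cover only the simple annotation form on this slide.

```python
import re

# Matches [[Property::Value]] and [[Property::Value|display text]].
ANNOTATION = re.compile(r"\[\[([^\[\]|:]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(wikitext):
    """Return the (property, value) pairs annotated in the wikitext."""
    return [(m.group(1).strip(), m.group(2).strip())
            for m in ANNOTATION.finditer(wikitext)]


def render(wikitext):
    """Replace each annotation with what a reader sees: the label, else the value."""
    def visible(match):
        label = match.group(3)
        return label if label is not None else match.group(2)
    return ANNOTATION.sub(visible, wikitext)


page = ("Beijing University is located in [[Has location::Beijing]] with "
        "[[Has population::30000|about 30 thousand]] students")

print(extract_annotations(page))
print(render(page))
```

The same text therefore serves two audiences: readers get ordinary prose, while the wiki stores two machine-readable facts about the page.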
Special Properties
“Has type” is a pre-defined “special” property for metadata
– Example: [[Has type::String]]
“Allowed Values” is another special property
– [[Allows value::Low]]
– [[Allows value::Medium]]
– [[Allows value::High]]
In the Halo Extensions there is domain and range support
– RDFS expressivity
– Semantic Gardening extension also supports “Cardinality”
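The effect of a type declaration and a list of allowed values can be sketched as a simple validation step. The code below is a conceptual Python illustration, not SMW code. The property name "Has priority" and the `validate` helper are hypothetical, and only two datatypes are modelled.

```python
# Hypothetical property declarations, mirroring "Has type" and "Allows value".
PROPERTIES = {
    "Has population": {"type": "Number"},
    "Has priority": {"type": "String", "allowed": {"Low", "Medium", "High"}},
}


def validate(prop, value, properties=PROPERTIES):
    """Return a list of problems with assigning value to prop (empty if fine)."""
    declaration = properties.get(prop)
    if declaration is None:
        return [f"unknown property: {prop}"]
    problems = []
    if declaration["type"] == "Number":
        try:
            float(value)
        except ValueError:
            problems.append(f"{value!r} is not a number")
    allowed = declaration.get("allowed")
    if allowed is not None and value not in allowed:
        problems.append(f"{value!r} is not one of {sorted(allowed)}")
    return problems


print(validate("Has priority", "High"))
print(validate("Has priority", "Urgent"))
print(validate("Has population", "abc"))
```

Because the declarations live on ordinary wiki pages, the community can tighten or relax them over time without a database administrator.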
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China; Category:Skyscrapers by country
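The point about class inheritance can be illustrated with a short Python sketch: a page tagged with a narrow category also counts as a member of every broader category above it. The subcategory links below are assumptions for illustration (the slide names the categories but does not spell out the hierarchy), and the helper names are hypothetical.

```python
# Assumed subcategory links (child -> parents), for illustration only.
SUBCATEGORY_OF = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}


def ancestors(category, links=SUBCATEGORY_OF):
    """All broader categories reachable from category, nearest first."""
    seen = []
    queue = list(links.get(category, []))
    while queue:
        parent = queue.pop(0)
        if parent not in seen:
            seen.append(parent)
            queue.extend(links.get(parent, []))
    return seen


def is_member(page_categories, target, links=SUBCATEGORY_OF):
    """True if the page is in target directly or through a subcategory."""
    return any(c == target or target in ancestors(c, links)
               for c in page_categories)


jin_mao = ["1998 architecture", "Skyscrapers in Shanghai", "Hotels in Shanghai"]
print(ancestors("Skyscrapers in Shanghai"))
print(is_member(jin_mao, "Skyscrapers in China"))
```

A query for skyscrapers in China can then find the Jin Mao Tower even though its page only carries the Shanghai category.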
Possible Database-style Query over Data
Ex.: Skyscrapers in China higher than 50 stories, built before 2000
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
ASK/SPARQL query target
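To show what such a query does conceptually, here is a Python sketch that filters in-memory page records by category and property conditions. It is not how SMW evaluates `#ask`. The `ask` helper and the "Example Tower" record are hypothetical, the Jin Mao Tower values follow the previous slide (88 stories, "1998 architecture"), and the comparisons here are strict, which may differ from SMW's own comparator semantics.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def ask(pages, category, conditions):
    """Titles of pages in category whose properties meet every (prop, op, value)."""
    hits = []
    for title, page in pages.items():
        if category not in page["categories"]:
            continue
        props = page["properties"]
        if all(prop in props and OPS[op](props[prop], value)
               for prop, op, value in conditions):
            hits.append(title)
    return sorted(hits)


PAGES = {
    "Jin Mao Tower": {
        "categories": {"Skyscrapers"},
        "properties": {"Located in": "China", "Floor count": 88, "Year built": 1998},
    },
    "Example Tower": {  # hypothetical record, for contrast
        "categories": {"Skyscrapers"},
        "properties": {"Located in": "China", "Floor count": 60, "Year built": 2005},
    },
}

result = ask(PAGES, "Skyscrapers", [
    ("Located in", "=", "China"),
    ("Floor count", ">", 50),
    ("Year built", "<", 2000),
])
print(result)
```

Each bracketed condition in the wiki query narrows the result set in the same way a WHERE clause would in a database.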
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More extensions and applications
MediaWiki SMW+
Halo Extension
– Usability extension to Semantic MediaWiki
– Increases user consensus
– Increases use of semantic data
Semantic MediaWiki
– Core semantic wiki engine
– Authoring of explicit knowledge in content
– Basic reasoning capabilities
SMW+
– Shrink-wrap suite of open source software products
– Comes with ready-to-use ontology
– Easy to procure and install
MediaWiki
– Powerful wiki engine
– Basic CMS feature set
SMW Extensions – Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search…
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts…
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules…
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German cars (in Wikipedia)
Wikipedia merged with the power of a database
Help readers and writers be more productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
– Fill in data (content)
– Develop schema (ontology)
– Design UI: skin, forms & templates
– Choose extensions
Video Semantic Wikis for a New Problem (2008)
Social tag-based characterization
– Keyword search over tag data
– Inconsistent semantics
– Easy to engineer
Algorithm-based object characterization
– Database-style search
– Consistent semantics
– Extremely difficult to engineer
Social database-style characterization (Semantic Entertainment Wiki)
– Database search + wiki text search
– Semantic consistency via wiki mechanisms
– Easy to engineer
Axes: increasing technical complexity →, ← increasing user participation
Semantic Seahawks Football Wiki
Semantic Entertainment: Query Result, Highlight Reel
– Commercial look and feel
– Play-by-play video search
– Highlight reel generation
– Search on crowd-defined patterns (“touchdowns with big hits”)
– Tree-based navigation widget
– Very favorable economics
AGILE APPLICATION DEVELOPMENT: With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS: From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki “Developers”
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead, one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
• Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software (Microsoft Project, Version One, Microsoft SharePoint)
– Pro: very quick to obtain
– Con: hard to customize
– Con: expensive
Custom Development (.NET Framework, J2EE, Ruby on Rails, …)
– Con: slow to develop
– Pro: extremely flexible
– Con: high cost to develop and maintain
SMW+ Suite (Mini-Olympics, Scrum Wiki, RPI map)
– Pro: still quick to program
– Pro: easy to customize
– Pro: low-moderate cost
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: “Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate”
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
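The decomposition of one textbook sentence into several UTs, each filed under a concept, can be pictured as a small data structure. The sketch below uses the slide's own example. The `UniversalTruth` class and `by_concept` helper are hypothetical names and do not describe the AURA wiki's actual storage format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UniversalTruth:
    text: str      # the universally-quantified sentence
    concept: str   # the concept the UT is filed under


SOURCE_SENTENCE = (
    "Glycolysis, which occurs in the cytosol, begins the degradation process "
    "by breaking glucose into two molecules of a compound called pyruvate"
)

UTS = [
    UniversalTruth("Glycolysis occurs in Cytosol", "Glycolysis"),
    UniversalTruth("During cellular respiration, glycolysis is the first step",
                   "Cellular Respiration"),
    UniversalTruth("During glycolysis, breakdown of 1 glucose results in "
                   "2 pyruvate molecules", "Glycolysis"),
]


def by_concept(uts):
    """Group UT texts under the concept each one is filed under."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


for concept, texts in by_concept(UTS).items():
    print(concept, "->", texts)
```

Grouping by concept gives encoders one work list per concept, which is the form in which the UTs would feed the authoring process.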
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down encoding flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up encoding flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAWiki
– Build a “Knowledge Integration Center”
Current page designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences) to data (knowledge encodings) to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from 2-tailed t-tests
– HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
– Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
[Figure: bar chart of HW score and quiz score for the Inquire, Ablated, and Textbook groups (n=24, n=25, n=23); bar labels 81, 88, 74, 75, 71, 81; letter-grade bands A-D]
[Figure: Quiz Score Distribution, % of students per grade (A, B, C, D, F) for Inquire, Ablated, and Textbook; axis 0-50]
Paul Allenrsquos Concept of the Digital Aristotle
ldquoDigital Aristotlerdquo ndash Inspired by the broad science fiction vision
of Big AI
ndash The volume of scientific knowledge has outpaced our ability to manage it
ndash Need systems to reason and answer questions rather than simply retrieve 1000s of relevant documents
bull Example ldquoWhat are the reaction products if metallic copper is heated strongly with concentrated hydrochloric acidrdquo
Project Formation ndash Initial Workshops conducted in 2001
ndash Digital Aristotle vision formulated
ndash Project Halo created in 2002
The Digital Aristotle and Project Halo
Project Halo is a staged long-range research effort by Vulcan
Inc towards the development of a Digital Aristotle
Project Halo envisions two primary classes of application
1 A digital textbook capable of answering student questions and
helping students with their work
2 A research assistant with broad interdisciplinary skills to help
scientists make connections
A Digital Aristotle is a reasoning
system capable of answering novel
questions and solving advanced
problems in a broad range of
scientific disciplines
Where to Begin
AI Grand Challenge ndash Read a chapter and answer
questions in the back of the book (Reddy 2003)
Focus on Science ndash Focus on science where
knowledge is explicitly stated
US Advanced Placement Exam ndash Use this US-based test of human
competence as a metric
Systems AI ndash Include the knowledge knowledge
acquisition reasoning and question answering pieces
Project Halo Development Strategy
KR
Expert
Single Domain
Expert
Small Team of
Domain and KR
Experts
Community of
Scientists
Teachers and KR
Experts
Logic
Queries
AP Question
Answering
AP QA
General QA
Education
AP QA
General QA
Education
Research
Authors
Uses
Pilot Phase II Inquire DA
3 Partial
AP Syllabi
Single
Textbook Partial
AP Syllabus Complete
Domain
2002-2003 2004-2009 2010-2015 2016-
What is Inquire
Inquire is a concept for a new kind of electronic textbook ndash Based on Campbell and Reese
Biology
Inquire is a standard electronic textbook but also ndash Contains a full underlying knowledge
base of the bookrsquos contents
ndash Answers the readerrsquos questions and provides tailored instruction
ndash Core technology applies to other areas of biology and other exact sciences (chemistry physics geology etc)
bull Is less applicable in history art languages philosophy etc
ndash Should support other languages
Inquirersquos Design Goals
An Inquire user can radic READ the textbook contents
dynamically interactively
radic ASK questions and get explanations on any subject in the book
ndash LEARN and master the subject through individualized tutoring
ndash CREATE and explore their own conceptualizations of the material
A Calculator for Biology ndash Simple questions are answered
from specific knowledge in the text
ndash Fully embedded in the textbook
Building a Question-Answering Textbook
Inquire Application
Knowledge Authoring
Question Answering
Educational Impact
Factual
Relationship
Biology Core Themes
All Questions
Domain experts
(biologists) enter
knowledge from
Campbellrsquos Biology
textbook into a
structured logical
knowledge base
The knowledge
base is connected
to a electronic
version of the
textbook running
on an iPad that
enables students
to ask questions
about the material
as they read
Measure retention
and understanding
for students using
Inquire against
students using
either a paper or
electronic version
of the textbook
Inquire answers a
wide range of
questions by
performing logical
inference on the
knowledge entered
into the knowledge
base
Inquire Demo
Digital Textbook iPad Application ndash eBook reader (with standard features)
Knowledge-based features ndash Glossary pages (from KB)
ndash Suggested questions and answers
ndash Free-form QA dialog
AURA Knowledge Server ndash Concept taxonomy from Campbell Biology
ndash AURA Knowledge Base (21 chapters)
ndash Suggested question generation
ndash Reasoning and question answering system
ndash AAAI Best Video of 2012
httpwwwyoutubecomwatchv=fTiW31MBtfA
Inquire vs Siri vs Watson vs Search Engines
Project Halorsquos Inquire ndash Subject Matter biology
ndash Questions Question suggestion and simple English queries
ndash Answers Formatted data and textbook content exact answers
Applersquos Siri ndash Subject Matter Actions on iPhone plus Wolfram Alpha searches
ndash Questions Voice commands short dialogs
ndash Answers Performs tasks Alpha-formatted pages
IBMrsquos Watson ndash Subject Matter General knowledge trivia attempting medical
ndash Questions Jepoardy-style ldquoreversedrdquo questions
ndash Answers Single word or short phrase answers
Google Yahoo Bing Baidu ndash Subject Matter Open domain web knowledge
ndash Questions keywords and search queries
ndash Answers web page listings
13
Three Main Components of Inquire
Knowledge Entry
– Task: Encoding knowledge sentence by sentence
  • Define a computer language for capturing knowledge
  • Represent background knowledge and explicit knowledge
– Systems: AURA, Semantic MediaWiki (SMW)
– Metrics
  • Coverage of book at various depths
  • Time per sentence
– R&D thrusts
  • SMW: Lower authoring costs by crowdsourcing
  • SILK: Capturing more knowledge
Question Processing
– Task: Map a student question to an answerable query
  • English language interpretation
  • Suggest good questions to student
– Systems: AURA Question Processor
– Metrics
  • Usefulness of suggested questions
  • Accuracy of free-form question processing
– R&D thrusts
  • Rephrasing
  • Textual Entailment
  • Learning suggested question mappings
  • Question generation from the knowledge base
Question Answering
– Task: Generate a useful answer to a student question
  • Derive an answer from the knowledge base
  • English generation and media selection
  • Answer presentation and relevance pruning
– Systems: AURA, Cyc, Text Analysis
– Metrics
  • Answer accuracy
  • Answer usefulness to student users
  • Time to answer
– R&D thrusts
  • Hybrid system: Answer more questions
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US, both high school and college
Total Core Sentences: 28,878
56 Chapters, 262 Sections
– Avg. 4.7 sections/chapter, 515 sentences/chapter, 109 sentences/section
– Relevant sentences in book (extrapolation from Chaps. 6, 9, 10): 49%, 14,150 sentences
750 Core library concepts
~6,000 total concepts in taxonomy
– ~3,000 concepts in Chaps. 2-22
~65,000 knowledge base assertions
~50,000 suggested questions
4 hours/sentence authoring time
– Design, planning, encoding, testing, refining, new relation development
60-70 new encoding relations/year
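The figures above imply a rough total authoring effort. The following is only a back-of-the-envelope sketch: it assumes the 4 hours/sentence rate applies uniformly to every relevant sentence, which the slide does not state, and the function names are illustrative.

```python
# Figures taken from the slide; the uniform-rate assumption is ours.
TOTAL_CORE_SENTENCES = 28_878
RELEVANT_FRACTION = 0.49      # extrapolated from chapters 6, 9, 10
HOURS_PER_SENTENCE = 4        # design, planning, encoding, testing, refining


def relevant_sentences(total: int, fraction: float) -> int:
    """Number of sentences that would need encoding."""
    return round(total * fraction)


def authoring_hours(sentences: int, hours_each: int) -> int:
    """Total person-hours if every sentence costs the same."""
    return sentences * hours_each


n = relevant_sentences(TOTAL_CORE_SENTENCES, RELEVANT_FRACTION)
print(n, authoring_hours(n, HOURS_PER_SENTENCE))
```

Under that assumption, the relevant half of the book alone would cost tens of thousands of person-hours, which is the motivation for the cheaper authoring approaches discussed later in the deck.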
Inquire's Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly
– ~4 person-hours per sentence
– Intensive testing and knowledge refinement
Biology teams and AI experts represent each sentence
[Diagram: Senior SME and Senior KR Expert; Concept Map Planning, Concept Map Entry, Concept Map Testing; Content Review, KR Review; Assess & Correct; Knowledge Entry Group, Delhi, India]
Knowledge Entry Process
1) Determining Relevance -- 2% of time
– Highlighting, Diagram Analysis, QA Check
– Status Labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus -- 14% of time
– Universal Truth (UT) Authoring, Concept Chosen, QA Check
3) Encoding Planning -- 35% of time
– Group Common UTs, ID KR/KE Issues, ID Already Encoded, Write How to Encode
– Pre-Planning QA Check
– Status Labeling: Encoding Complete, KR Issue (Closed)
4) Encoding -- 10% of time
– Encode File, JIRA Issues, QA Check
– Status Labeling: Encoding Complete, KE Issue
5) Key Term Review -- 25% of time
– KR Evaluated by Modeling Expert and Biologist, Encoder Makes Changes
– KR Evaluated by Modeling Expert and Biologist
– QA Check
6) Question-Based Testing -- 14% of time
– Use Minimal Test Suite, Reasoning JIRA Issues Filed
– Encoder Fills KB Gaps
– QA Check with Screenshots of "Passing" Comparison and Relationship Questions
Legend: Peach = EVS, Light Blue = SRI
Planning (51%)
Testing (39%)
Encoding (10%)
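The three-way split can be reproduced from the six per-step shares on the Knowledge Entry Process slide. The sketch below assumes a particular assignment of steps to phases (steps 1-3 as planning, step 4 as encoding, steps 5-6 as testing); the slides do not spell this mapping out, but it matches the 51/39/10 totals.

```python
# Share of authoring time per step, as listed on the Knowledge Entry Process slide.
STEP_SHARE = {
    "Determining Relevance": 2,
    "Reaching Consensus": 14,
    "Encoding Planning": 35,
    "Encoding": 10,
    "Key Term Review": 25,
    "Question-Based Testing": 14,
}

# Assumed step-to-phase mapping (not stated on the slides).
PHASE_OF = {
    "Determining Relevance": "Planning",
    "Reaching Consensus": "Planning",
    "Encoding Planning": "Planning",
    "Encoding": "Encoding",
    "Key Term Review": "Testing",
    "Question-Based Testing": "Testing",
}


def phase_totals(step_share, phase_of):
    """Sum the per-step percentages into per-phase percentages."""
    totals = {}
    for step, pct in step_share.items():
        phase = phase_of[step]
        totals[phase] = totals.get(phase, 0) + pct
    return totals


totals = phase_totals(STEP_SHARE, PHASE_OF)
print(totals)
```

Read this way, only a tenth of the effort is the encoding itself; the rest is agreement on what to encode and checking that it was encoded correctly.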
How Can We Improve Knowledge Entry?
This is the well-known Knowledge Acquisition Problem
– AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
– Automatic Machine Parsing/Editing
  • Not good enough for biology textbook sentences
  • Error rates are too high
  • Need humans in the loop
– Mechanical Turk Authoring
  • Biology expertise is difficult to get
  • Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
– Social Authoring and Crowdsourcing
  • Wikipedia showed this could work for text
  • In 2008, Vulcan started a pilot project to explore social knowledge authoring
Crowdsourcing for Better Knowledge Acquisition
Success of Wikis
"Edit": wow, I can change the web
Let's share and publish knowledge to make an [[encyclopedia]]
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
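For readers unfamiliar with the curve named in the caption, a Gompertz model grows quickly at first and then levels off at an asymptote. The sketch below takes the asymptote as about 4.4 million articles; the other two parameters are made-up values for illustration and are not fitted to Wikipedia data.

```python
import math


def gompertz(t: float, a: float, b: float, c: float) -> float:
    """Gompertz curve a * exp(-b * exp(-c * t)); approaches a as t grows."""
    return a * math.exp(-b * math.exp(-c * t))


A = 4.4e6   # asymptote: about 4.4 million articles
B = 15.0    # hypothetical displacement parameter
C = 0.4     # hypothetical growth rate per year

for year in (0, 5, 10, 20, 40):
    print(year, round(gompertz(year, A, B, C)))
```

The point of such a model is the saturation: however the rate parameters are chosen, the predicted article count never exceeds the asymptote.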
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
– ~17 edits/page on average in Wikipedia (with high variance)
– Wikipedia's Neutral Point of View
Convention
– Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
– Trivial to edit by anyone
– Tracking of all changes, one-step rollback
– Every article has a "Talk" page for discussion
– Notification facility allows anyone to "watch" an article
– Sufficient security on pages; logins can be required
– A hierarchy of administrators, gardeners, and editors
– Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross…
• … all German cars, with engine size, acceleration data…
Can you find
– Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000?
– Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)?
– Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds?
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
– All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
– Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
– A map showing where all Mercedes-Benz vehicles are manufactured
– All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more and built in year 2000 (2001, 2002) and after, sorted by built year, floors, …, grouped by cities, regions, …
What is a Semantic Wiki?
– A wiki that has an underlying model of the knowledge described in its pages
– To allow users to make their knowledge explicit and formal
– Semantic Web compatible
Semantic Wiki: Two Perspectives
– Wikis for Metadata
– Metadata for Wikis
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
– Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
– Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
– "capital_of", "contains", "born_in", …
Querying Interface Support
– E.g. "[[Category:Member]] [[Age::<30]]" (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
– Semantic Wikis facilitate Consensus over Data
– Combine low-expressivity data authorship with the best features of traditional wikis
– User-governed, user-maintained, user-defined
– Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
– Databases require DBAs and schema design
– Semantic Wikis develop and maintain the schema in the wiki
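One way to picture "schema-last" is that the schema is read off the data after the fact rather than declared up front. The sketch below is a toy illustration in plain Python, not SMW's implementation; the facts and the `infer_schema` helper are hypothetical.

```python
from collections import defaultdict


def infer_schema(annotations):
    """Derive {property: set of value type names} from (page, property, value) facts."""
    schema = defaultdict(set)
    for _page, prop, value in annotations:
        schema[prop].add(type(value).__name__)
    return dict(schema)


# Hypothetical facts, in the order users happened to write them.
facts = [
    ("Jin Mao Tower", "Floor count", 88),
    ("Jin Mao Tower", "Located in", "China"),
    ("Porsche 996", "Made in", "Germany"),  # a new property appears; nothing to migrate
]

print(infer_schema(facts))
```

Adding the third fact introduces a new property without any prior schema change, which is the contrast with a database that needs a DBA to alter a table first.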
List of Semantic Wikis
– AceWiki
– ArtificialMemory
– Wagn: Ruby on Rails-based
– KiWi: Knowledge in a Wiki
– Knoodl: Semantic collaboration tool and application platform
– Metaweb: the software that powers Freebase
– OntoWiki
– OpenRecord
– PhpWiki
– Semantic MediaWiki: an extension to MediaWiki that turns it into a semantic wiki
– Swirrl: a spreadsheet-based semantic wiki application
– TaOPis: has a semantic wiki subsystem based on Frame logic
– TikiWiki CMS/Groupware: integrates Semantic links as a core feature
– zAgile Wikidsmart: semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
– Open source (GPL), well documented
– Vulcan kicks off SMW+ in 2007
Active development
– Commercial support available
– World-wide community
International Conferences
– Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
– Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has location::Beijing]] with
[[Has population::30000|about 30 thousand]]
students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
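To make the annotation syntax concrete, the sketch below pulls (subject, property, value) triples out of wikitext and shows the text a reader would see. It is a simplified illustration using a regular expression written for this example, not SMW's actual parser, and it handles only the two forms `[[Property::value]]` and `[[Property::value|display text]]`.

```python
import re

# [[Property::value]] or [[Property::value|display text]]
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_triples(page_title: str, wikitext: str):
    """Return (subject, property, value) triples found in the wikitext."""
    return [
        (page_title, m.group(1).strip(), m.group(2).strip())
        for m in ANNOTATION.finditer(wikitext)
    ]


def render(wikitext: str) -> str:
    """Replace each annotation by what a reader sees: display text, else the value."""
    return ANNOTATION.sub(lambda m: m.group(3) or m.group(2), wikitext)


text = ("Beijing University is located in [[Has location::Beijing]] with "
        "[[Has population::30000|about 30 thousand]] students")
print(extract_triples("Beijing University", text))
print(render(text))
```

The same sentence therefore serves two audiences: people read the rendered prose, while the wiki stores the typed facts for querying.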
Special Properties
"Has Type" is a pre-defined "special" property for metadata
– Example: [[Has type::String]]
"Allowed Values" is another special property
– [[Allows value::Low]]
– [[Allows value::Medium]]
– [[Allows value::High]]
In Halo Extensions there is domain and range support
– RDFS expressivity
– Semantic Gardening extension also supports "Cardinality"
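The effect of declaring a type and a set of allowed values can be sketched as a simple validation step. This is an illustration only: the `PROPERTY_SPEC` table, the `Priority` property, and the `is_valid` helper are hypothetical and do not reflect how SMW stores or enforces declarations.

```python
# Hypothetical property declarations, mirroring "Has type" and "Allows value".
PROPERTY_SPEC = {
    "Has population": {"type": "Number"},
    "Priority": {"type": "String", "allowed": {"Low", "Medium", "High"}},
}


def is_valid(prop: str, value: str, spec=PROPERTY_SPEC) -> bool:
    """Check a raw annotation value against the declared type and allowed values."""
    decl = spec.get(prop)
    if decl is None:
        return True  # undeclared properties are accepted (schema-last)
    if decl["type"] == "Number":
        try:
            float(value)
        except ValueError:
            return False
    allowed = decl.get("allowed")
    return allowed is None or value in allowed


print(is_valid("Priority", "Medium"), is_valid("Priority", "Urgent"))
```

Undeclared properties pass by design here, in keeping with the schema-last idea: constraints are added when the community decides it needs them.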
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
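Class inheritance through categories amounts to following subcategory links upward. The sketch below assumes a small hierarchy in which "Skyscrapers in Shanghai" sits under "Skyscrapers in China", which sits under "Skyscrapers by country"; both links are assumptions made for illustration, since the slide names the categories but not how they nest.

```python
# Assumed subcategory links (child -> parents), for illustration only.
PARENTS = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}


def ancestors(category: str, parents=PARENTS) -> set:
    """All categories reachable by following parent links upward."""
    seen = set()
    stack = [category]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:  # guards against cycles in the hierarchy
                seen.add(parent)
                stack.append(parent)
    return seen


def is_member(page_categories, target: str) -> bool:
    """A page belongs to target if tagged with it or with any subcategory of it."""
    return any(c == target or target in ancestors(c) for c in page_categories)


jin_mao = ["1998 architecture", "Skyscrapers in Shanghai", "Hotels in Shanghai"]
print(is_member(jin_mao, "Skyscrapers by country"))
```

A page tagged only with the most specific category is thus still found by a query over the general one.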
Possible Database-style Query over Data
Ex.: Skyscrapers in China higher than 50 stories, built before 2000
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
ASK/SPARQL query target
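The query above is a conjunction of conditions over page properties. A minimal sketch of how such a conjunction could be evaluated follows; it is not SMW's query engine, the page data is illustrative, and it treats `>` and `<` as strict comparisons, which is an assumption (SMW's comparator behavior is configurable).

```python
import re


def parse(query: str):
    """Turn '[[Prop::>50]]'-style conditions into (property, operator, value) tuples."""
    conditions = []
    for body in re.findall(r"\[\[(.+?)\]\]", query):
        if "::" in body:
            prop, raw = body.split("::", 1)
            op = raw[0] if raw[:1] in ("<", ">") else "="
            value = raw[1:] if op != "=" else raw
        else:  # [[Category:Name]]
            prop, value = body.split(":", 1)
            op = "="
        conditions.append((prop.strip(), op, value.strip()))
    return conditions


def matches(page: dict, conditions) -> bool:
    """True if the page satisfies every condition."""
    for prop, op, value in conditions:
        actual = page.get(prop)
        if actual is None:
            return False
        if prop == "Category":
            ok = value in actual
        elif op == "=":
            ok = str(actual) == value
        else:
            ok = actual > float(value) if op == ">" else actual < float(value)
        if not ok:
            return False
    return True


# Illustrative page data, not taken from a real wiki.
pages = {
    "Jin Mao Tower": {"Category": ["Skyscrapers"], "Located in": "China",
                      "Floor count": 88, "Year built": 1998},
    "Shanghai Tower": {"Category": ["Skyscrapers"], "Located in": "China",
                       "Floor count": 128, "Year built": 2015},
    "Tower X": {"Category": ["Skyscrapers"], "Located in": "China",
                "Floor count": 30, "Year built": 1990},
}

query = ("[[Category:Skyscrapers]] [[Located in::China]] "
         "[[Floor count::>50]] [[Year built::<2000]]")
result = [title for title, page in pages.items() if matches(page, parse(query))]
print(result)
```

Because the conditions are typed properties rather than free text, this kind of question becomes a filter over data instead of a keyword search over prose.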
Semantic MediaWiki Stack
– MediaWiki (XAMPP)
– Extension: Semantic MediaWiki
– More Extensions and Applications
MediaWiki SMW+
Halo Extension
– Usability extension to Semantic MediaWiki
– Increases user consensus
– Increases use of semantic data
Semantic MediaWiki
– Core Semantic Wiki engine
– Authoring of explicit knowledge in content
– Basic reasoning capabilities
SMW+
– Shrink-wrap suite of open source software products
– Comes with ready-to-use ontology
– Easy to procure and install
MediaWiki
– Powerful Wiki engine
– Basic CMS feature set
SMW Extensions – Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
– Domain: German Cars (in Wikipedia)
– Wikipedia merged with the power of a database
– Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
– Fill in data (content)
– Develop schema (ontology)
– Design UI: skin, form & templates
– Choose extensions
Video: Semantic Wikis for A New Problem (2008)
Increasing technical complexity →
← Increasing user participation
Social tag-based characterization
– Keyword search over tag data
– Inconsistent semantics
– Easy to engineer
Algorithm-based object characterization
– Database-style search
– Consistent semantics
– Extremely difficult to engineer
Social database-style characterization
– Database search + wiki text search
– Semantic consistency via wiki mechanisms
– Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment: Query Result Highlight Reel
– Commercial look/feel
– Play-by-play video search
– Highlight reel generation
– Search on crowd-defined patterns ("touchdowns with big hits")
– Tree-based navigation widget
– Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead, one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
– (+) Very quick to obtain
– (-) Hard to customize
– (-) Expensive
– Examples: Microsoft Project, Version One, Microsoft SharePoint
Custom Development
– (-) Slow to develop
– (+) Extremely flexible
– (-) High cost to develop and maintain
– Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
– (+) Still quick to program
– (+) Easy to customize
– (+) Low-moderate cost
– Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven, collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate."
  • UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
  • UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
  • UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
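The example sentence yields several UTs, each filed under a concept. A small sketch of that bookkeeping is given below; the `UT` record and the `group_by_concept` helper are hypothetical names for illustration and say nothing about how AURA or the wiki actually store UTs.

```python
from collections import defaultdict
from typing import NamedTuple


class UT(NamedTuple):
    """One universally-quantified sentence, filed under a concept."""
    concept: str
    statement: str


# The three UTs derived from the glycolysis example sentence.
uts = [
    UT("Glycolysis", "Glycolysis occurs in Cytosol"),
    UT("Cellular Respiration",
       "During cellular respiration, glycolysis is the first step"),
    UT("Glycolysis",
       "During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules"),
]


def group_by_concept(items):
    """Collect UT statements per concept, preserving authoring order."""
    grouped = defaultdict(list)
    for ut in items:
        grouped[ut.concept].append(ut.statement)
    return dict(grouped)


for concept, statements in group_by_concept(uts).items():
    print(concept, len(statements))
```

Grouping by concept is what lets one textbook sentence contribute to several concept pages at once.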
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs; previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
– Build a "Knowledge Integration Center"
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise, embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
– HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
– Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
[Bar chart: HW score and quiz score for the Inquire, Ablated, and Textbook groups; group sizes n=24, n=25, n=23; data labels 81, 88, 74, 75, 71, 81]
[Bar chart: Quiz Score Distribution, % of students per grade (A, B, C, D, F) for the Inquire, Ablated, and Textbook groups]
ndash Reasoning and question answering system
ndash AAAI Best Video of 2012
httpwwwyoutubecomwatchv=fTiW31MBtfA
Inquire vs Siri vs Watson vs Search Engines
Project Halorsquos Inquire ndash Subject Matter biology
ndash Questions Question suggestion and simple English queries
ndash Answers Formatted data and textbook content exact answers
Applersquos Siri ndash Subject Matter Actions on iPhone plus Wolfram Alpha searches
ndash Questions Voice commands short dialogs
ndash Answers Performs tasks Alpha-formatted pages
IBMrsquos Watson ndash Subject Matter General knowledge trivia attempting medical
ndash Questions Jepoardy-style ldquoreversedrdquo questions
ndash Answers Single word or short phrase answers
Google Yahoo Bing Baidu ndash Subject Matter Open domain web knowledge
ndash Questions keywords and search queries
ndash Answers web page listings
13
Three Main Components of Inquire
Task Encoding knowledge sentence-by-sentence
ndash Define a computer language for capturing knowledge
ndash Represent background knowledge and explicit knowledge
Systems AURA Semantic Media Wiki (SMW)
Metrics ndash Coverage of book at
various depths
ndash Time per sentence
RampD thrusts ndash SMW Lower authoring
costs by crowdsourcing
ndash SILK Capturing more knowledge
Task Map a student question to an answerable query
ndash English language interpretation
ndash Suggest good questions to student
Systems AURA Question Processor
Metrics ndash Usefulness of suggested
questions
ndash Accuracy of free-form question processing
RampD thrusts ndash Rephrasing
ndash Textual Entailment
ndash Learning suggested question mappings
ndash Question generation from the knowledge base
Task Generate a useful answer to a student question
ndash Derive an answer from the knowledge base
ndash English generation and media selection
ndash Answer presentation and relevance pruning
Systems AURA Cyc Text Analysis
Metrics ndash Answer accuracy
ndash Answer usefulness to student users
ndash Time to answer
RampD thrusts ndash Hybrid system Answer
more questions
Knowledge Entry Question Processing Question Answering
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US both high school and college
Total Core Sentences 28878
56 Chapters 262 Sections ndash Avg 47 sectionschapter 515
sentenceschapter 109 sentencessection
ndash Relevant sentences in book (extrapolation from Chaps 6910) 49 14150 sentences
750 Core library concepts
~6000 total concepts in taxonomy ndash ~3000 concepts in Chaps 2-22
~65000 knowledge base assertions
~50000 suggested questions
4 hourssentence authoring time ndash Design planning encoding testing
refining new relation development
60-70 new encoding relationsyear
Inquirersquos Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly ndash ~4 person-hours per
sentence
ndash Intensive testing and knowledge refinement
Biology Teams and AI experts
represent each sentence
Senior
SME
Senior KR
Expert
Concept
Map
Planning
Concept
Map Entry
Concept
Map
Testing
Content
Review
KR
Review
Assess amp
Correct
Knowledge Entry Group
Delhi India
Knowledge Entry Process
3) Encoding Planning -- 35 time
Group Common UTs ID KRKE Issues
ID Already Encoded Write How to Encode
Pre-Planning QA Check
Status Labeling Encoding Complete KR Issue (Closed)
2) Reaching Consensus -- 14 time
Universal Truth Authoring Concept Chosen QA Check
1) Determining Relevance -- 2 time
Highlighting Diagram Analysis QA Check
Status Labeling Relevant Irrelevant (Closed)
6) Question-Based Testing -- 14 time
Use Minimal Test Suite Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of ldquoPassing Comparison and Relationship Questions
5) Key Term Review -- 25 time
KR Evaluated by Modeling Expert and Biologist Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
4) Encoding -- 10 time
Encode File JIRA Issues QA Check
Status Labeling Encoding Complete KE Issue
Peach = EVS Light Blue = SRI
Knowledge Entry Process
3) Encoding Planning -- 35 time
Group Common UTs ID KRKE Issues
ID Already Encoded Write How to Encode
Pre-Planning QA Check
Status Labeling Encoding Complete KR Issue (Closed)
2) Reaching Consensus -- 14 time
UT Authoring cMap Chosen QA Check
1) Determining Relevance -- 2 time
Highlighting Diagram Analysis QA Check
Status Labeling Relevant Irrelevant (Closed)
6) Question-Based Testing -- 14 time
Use Minimal Test Suite Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of ldquoPassing Comparison and Relationship Questions
5) Key Term Review -- 25 time
KR Evaluated by Modeling Expert and Biologist Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
4) Encoding -- 10 time
Encode File JIRA Issues QA Check
Status Labeling Encoding Complete KE Issue
Peach = EVS Light Blue = SRI
Planning (51)
Testing (39)
Encoding (10)
This is the well-known Knowledge Acquisition Problem ndash AI authoring is too expensive too slow not scalable
Three Possible Solutions ndash Automatic Machine ParsingEditing
bull Not good enough for biology textbook sentences
bull Error rates are too high
bull Need humans in the loop
ndash Mechanical Turk Authoring bull Biology expertise is difficult to get
bull Mechanical Turk uses individuals but the Knowledge Entry task appears to require coordination judgment discussion and working together
ndash Social Authoring and Crowdsourcing bull Wikipedia showed this could work for text
bull In 2008 Vulcan started a pilot project to explore social knowledge authoring
How Can We Improve Knowledge Entry
19
Crowdsourcing for Better Knowledge Acquisition
20
edit wow I can change the web
letrsquos share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
27
Actual number of articles on enwikipediaorg (thick
blue line) compared with a Gompertz model that leads
eventually to a maximum of about 44 million articles
(thin green line)
A Key Feature of Wiki
28
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration ndash ~17 editspage on average in
Wikipedia (with high variance)
ndash Wikipediarsquos Neutral Point of View
Convention ndash Users follow customs and
conventions to engage with articles effectively
29
Software Support Makes Wikis Successful
Trivial to edit by anyone Tracking of all changes one-
step rollback Every article has a ldquoTalkrdquo page
for discussion Notification facility allows anyone
to ldquowatchrdquo an article Sufficient security on pages
logins can be required A hierarchy of administrators
gardeners and editors Software Bots recognize certain
kinds of vandalism and auto-revert or recognize articles that need work and flag them for editors
30
However Finding Information Is hellip
Wikipedia has articles abouthellip
bull hellip all movies with cast director
budget running time grosshellip
hellip all German cars with engine
size accelerating datahellip
Can you find
Sci-Fi movies made between
2000-2008 that cost less than
$10000 and made $30000
Skyscrapers with 50+ floors
and built after 2000 in
Shanghai (or Chinese cities
with 1000000+ people)
31
Or German(Porsche) cars that
accelerate from 0-100kmh in 5
seconds
Can Search Solve the Problem
32
Answer is Hidden Deeply in
33
To Find More Info
All Porsche vehicles made in
Germany that accelerate from 1-
100 kmh less than 4 seconds
Sci-Fi movies made between
2000-2008 that cost less than
$10M and gross more than $30M
A map showing where all
Mercedes-Benz vehicles are
manufactured
All skyscrapers in China (Japan
Thailandhellip) of 50 (406070) floors
or more and built in year 2000
(20012002) and after sorted by
built year floorshellip grouped by
cities regionshellip
35
What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
36
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
37
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
ndash CategoryTags Namespaces Title Versioning
Typed Content (built-ins + user created eg categories)
ndash PageCard Date Number URLEmail String hellip
Typed Links (eg properties)
ndash ldquocapital_ofrdquo ldquocontainsrdquo ldquoborn_inrdquohellip
Querying Interface Support
ndash Eg ldquo[[CategoryMember]] [[Agelt30]]rdquo (in SMW)
41
Characteristics of Semantic Wikis
42
What is the Promise of Semantic Wikis
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed user-maintained user-defined
Easy to use as an extension of text authoring
43
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are ldquoSchema-Lastrdquo Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi ndash Knowledge in a Wiki
Knoodl ndash Semantic
Collaboration tool and
application platform
Metaweb - the software that
powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an
extension to MediaWiki that
turns it into a semantic wiki
Swirrl - a spreadsheet-based
semantic wiki application
TaOPis - has a semantic wiki
subsystem based on Frame
logic
TikiWiki CMSGroupware
integrates Semantic links as a
core feature
zAgile Wikidsmart - semantically
enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
ndash Open source (GPL) well documented
ndash Vulcan kicks off SMW+ in 2007
Active development
ndash Commercial support available
ndash World-wide community
International Conferences
ndash Last SMWCon 425-27 2012 in Carlsbad CA
ndash Next SMWCon 1024-26 2012 in Koln Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has locationBeijing]] with
[[Has population30000|about 30 thousands]]
students
In page PropertyHas locationrdquo [[Has typePage]]
In page PropertyHas populationrdquo [[Has typenumber]]
Special Properties
ldquoHas Typerdquo is a pre-defined ldquospecialrdquo property for meta-
data
ndash Example [[Has typeString]]
ldquoAllowed Valuesrdquo is another special property
ndash [[Allows valueLow]]
ndash [[Allows valueMedium]]
ndash [[Allows valueHigh]]
In Halo Extensions there are domain and range support
ndash RDFs expressivity
ndash Semantic Gardening extension also supports ldquoCardinalityrdquo
49
Define Classes
50
Beijing is a city in [[Has
countryChina]] with population
[[Has population2200000]]
[[CategoryCities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in hellip
[[Categories 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore Owings and Merrill buildings]]
CategorySkyscrapers in China Category Skyscrapers by country
Possible Database-style Query over Data
51
ask
[[CategorySkyscrapers]]
[[Located inChina]]
[[Floor countgt50]]
[[Year builtlt2000]]
hellip
Ex Skyscrapers in China higher than
50 stories built before 2000
ASKSPARQL query target
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension Semantic MediaWiki
More Extensions and Applications
52
MediaWiki SMW+
Halo Extension
Usability extension to Semantic MediaWiki
Increases user consensus
Increases use of semantic data
Semantic MediaWiki
Core Semantic Wiki engine
Authoring of explicit knowledge in content
Basic reasoning capabilities
SMW+
Shrink wrap suite of open source software products
Comes with ready to use ontology
Easy to procure and install
MediaWiki
Powerful Wiki engine
Basic CMS feature set
SMW Extensions ndashTowards Great Apps
56
bull Halo Extensions Wiki Object Model Semantic Forms Notification hellip
Data IO
bull Semantic Toolbar Semantic Drilldown Faceted Searchhellip
Query and Browsing
bull Semantic Result Printers Tree View Exhibit Flash chartshellip
Visualization
bull HaloACL Deployment Triplestore Connector Simple Ruleshellip
bull Semantic WikiTags and Subversion Integration extensions
bull Linked Data Extension with R2R and SILK from FUBerlin
Other useful extensions
Semantic Wikipedia
Ultrapedia: an SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended With an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, forms & templates
Choose extensions
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity →
← Increasing user participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment: Query Result Highlight Reel
Commercial look and feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns ("touchdowns with big hits")
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not:
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to:
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software (e.g. Microsoft Project, Version One, Microsoft SharePoint)
+ Very quick to obtain
- Hard to customize
- Expensive
Custom Development (e.g. .NET Framework, J2EE, …, Ruby on Rails)
- Slow to develop
+ Extremely flexible
- High cost to develop and maintain
SMW+ Suite (e.g. Mini-Olympics, Scrum Wiki, RPI map)
+ Still quick to program
+ Easy to customize
+ Low-moderate cost
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate."
    UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
    UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
- Build a "Knowledge Integration Center"
Current Page Designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
HW scores: Full vs Ablated 0.12; Full vs Textbook 0.02; Ablated vs Textbook 0.52
Quiz scores: Full vs Ablated 0.002; Full vs Textbook 0.05; Ablated vs Textbook 0.18
[Chart: HW score and quiz score by group (Inquire, Ablated, Textbook); group sizes n=24, n=25, n=23; bar labels 81, 88, 74, 75, 71, 81]
[Chart: Quiz Score Distribution, % of students at each grade (A, B, C, D, F) for Inquire, Ablated textbook, and Textbook]
Where to Begin
AI Grand Challenge
- Read a chapter and answer questions in the back of the book (Reddy 2003)
Focus on Science
- Focus on science, where knowledge is explicitly stated
US Advanced Placement Exam
- Use this US-based test of human competence as a metric
Systems AI
- Include the knowledge, knowledge acquisition, reasoning, and question answering pieces
Project Halo Development Strategy
Pilot (2002-2003)
- Authors: KR expert
- Uses: logic queries
- Coverage: 3 partial AP syllabi
Phase II (2004-2009)
- Authors: single domain expert
- Uses: AP question answering
- Coverage: single partial AP syllabus
Inquire (2010-2015)
- Authors: small team of domain and KR experts
- Uses: AP QA, general QA, education
- Coverage: textbook
DA (2016-)
- Authors: community of scientists, teachers and KR experts
- Uses: AP QA, general QA, education, research
- Coverage: complete domain
What is Inquire?
Inquire is a concept for a new kind of electronic textbook
- Based on Campbell and Reese, Biology
Inquire is a standard electronic textbook, but also
- Contains a full underlying knowledge base of the book's contents
- Answers the reader's questions and provides tailored instruction
- Core technology applies to other areas of biology and other exact sciences (chemistry, physics, geology, etc.)
  • Is less applicable in history, art, languages, philosophy, etc.
- Should support other languages
Inquire's Design Goals
An Inquire user can
✓ READ the textbook contents dynamically, interactively
✓ ASK questions and get explanations on any subject in the book
- LEARN and master the subject through individualized tutoring
- CREATE and explore their own conceptualizations of the material
A Calculator for Biology
- Simple questions are answered from specific knowledge in the text
- Fully embedded in the textbook
Building a Question-Answering Textbook
Knowledge Authoring: Domain experts (biologists) enter knowledge from Campbell's Biology textbook into a structured logical knowledge base.
Inquire Application: The knowledge base is connected to an electronic version of the textbook running on an iPad that enables students to ask questions about the material as they read.
Question Answering: Inquire answers a wide range of questions by performing logical inference on the knowledge entered into the knowledge base.
Educational Impact: Measure retention and understanding for students using Inquire against students using either a paper or electronic version of the textbook.
[Chart categories: Factual, Relationship, Biology Core Themes, All Questions]
Inquire Demo
Digital Textbook iPad Application
- eBook reader (with standard features)
Knowledge-based features
- Glossary pages (from KB)
- Suggested questions and answers
- Free-form QA dialog
AURA Knowledge Server
- Concept taxonomy from Campbell Biology
- AURA Knowledge Base (21 chapters)
- Suggested question generation
- Reasoning and question answering system
AAAI Best Video of 2012
http://www.youtube.com/watch?v=fTiW31MBtfA
Inquire vs Siri vs Watson vs Search Engines
Project Halo's Inquire
- Subject matter: biology
- Questions: question suggestion and simple English queries
- Answers: formatted data and textbook content, exact answers
Apple's Siri
- Subject matter: actions on iPhone, plus Wolfram Alpha searches
- Questions: voice commands, short dialogs
- Answers: performs tasks, Alpha-formatted pages
IBM's Watson
- Subject matter: general knowledge, trivia, attempting medical
- Questions: Jeopardy-style "reversed" questions
- Answers: single word or short phrase answers
Google, Yahoo, Bing, Baidu
- Subject matter: open domain web knowledge
- Questions: keywords and search queries
- Answers: web page listings
Three Main Components of Inquire
Knowledge Entry
Task: Encoding knowledge sentence-by-sentence
- Define a computer language for capturing knowledge
- Represent background knowledge and explicit knowledge
Systems: AURA, Semantic MediaWiki (SMW)
Metrics
- Coverage of book at various depths
- Time per sentence
R&D thrusts
- SMW: Lower authoring costs by crowdsourcing
- SILK: Capturing more knowledge
Question Processing
Task: Map a student question to an answerable query
- English language interpretation
- Suggest good questions to student
Systems: AURA Question Processor
Metrics
- Usefulness of suggested questions
- Accuracy of free-form question processing
R&D thrusts
- Rephrasing
- Textual Entailment
- Learning suggested question mappings
- Question generation from the knowledge base
Question Answering
Task: Generate a useful answer to a student question
- Derive an answer from the knowledge base
- English generation and media selection
- Answer presentation and relevance pruning
Systems: AURA, Cyc, Text Analysis
Metrics
- Answer accuracy
- Answer usefulness to student users
- Time to answer
R&D thrusts
- Hybrid system: Answer more questions
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US, both high school and college
Total core sentences: 28,878
56 chapters, 262 sections
- Avg 4.7 sections/chapter, 515 sentences/chapter, 109 sentences/section
- Relevant sentences in book (extrapolation from Chaps 6, 9, 10): 49% (14,150 sentences)
750 core library concepts
~6,000 total concepts in taxonomy
- ~3,000 concepts in Chaps 2-22
~65,000 knowledge base assertions
~50,000 suggested questions
4 hours/sentence authoring time
- Design, planning, encoding, testing, refining, new relation development
60-70 new encoding relations/year
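The figures above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the averages from the slide's own counts and then multiplies the relevant-sentence count by the 4 hours/sentence authoring time. The resulting total is an extrapolation under the assumption that every relevant sentence costs the same, which the slide does not state. The slide's rounded per-section average may have been computed from slightly different counts.

```python
# Counts taken from the slide.
core_sentences = 28878
chapters = 56
sections = 262
relevant_fraction = 0.49   # share of sentences judged relevant
hours_per_sentence = 4     # authoring time per encoded sentence

sections_per_chapter = sections / chapters
sentences_per_chapter = core_sentences / chapters
sentences_per_section = core_sentences / sections

# Relevant sentences, rounded to a whole sentence count.
relevant_sentences = round(core_sentences * relevant_fraction)

# Assumption: uniform cost per sentence, so total effort is a simple product.
total_person_hours = relevant_sentences * hours_per_sentence

print(f"{sections_per_chapter:.1f} sections/chapter")
print(f"{sentences_per_chapter:.0f} sentences/chapter")
print(f"{sentences_per_section:.0f} sentences/section")
print(f"{relevant_sentences} relevant sentences")
print(f"{total_person_hours} person-hours to encode them all")
```

Under that uniform-cost assumption the relevant portion of the book alone would take tens of thousands of person-hours, which is the scale that motivates cheaper authoring methods.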
Inquire's Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly
- ~4 person-hours per sentence
- Intensive testing and knowledge refinement
Biology teams and AI experts represent each sentence
[Diagram: Senior SME and Senior KR Expert; Concept Map Planning, Concept Map Entry, Concept Map Testing, Content Review, KR Review, Assess & Correct]
Knowledge Entry Group, Delhi, India
Knowledge Entry Process
1) Determining Relevance (2% of time)
- Highlighting, Diagram Analysis, QA Check
- Status labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus (14% of time)
- Universal Truth (UT) Authoring, Concept Chosen, QA Check
3) Encoding Planning (35% of time)
- Group Common UTs, ID KR/KE Issues
- ID Already Encoded, Write How to Encode
- Pre-Planning QA Check
- Status labeling: Encoding Complete, KR Issue (Closed)
4) Encoding (10% of time)
- Encode File, JIRA Issues, QA Check
- Status labeling: Encoding Complete, KE Issue
5) Key Term Review (25% of time)
- KR Evaluated by Modeling Expert and Biologist, Encoder Makes Changes
- KR Evaluated by Modeling Expert and Biologist
- QA Check
6) Question-Based Testing (14% of time)
- Use Minimal Test Suite, Reasoning JIRA Issues Filed
- Encoder Fills KB Gaps
- QA Check with Screenshots of "Passing" Comparison and Relationship Questions
Color key: Peach = EVS, Light Blue = SRI
Planning (51%)
Testing (39%)
Encoding (10%)
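The three summary percentages appear to be sums of the six step percentages. The sketch below checks that reading and converts the shares into hours, using the ~4 person-hours per sentence quoted elsewhere in the talk. The assignment of steps to Planning, Encoding, and Testing is an assumption inferred from the totals; the slides do not spell it out.

```python
from collections import defaultdict

# (step, share of time in percent, assumed phase). The phase column is inferred.
steps = [
    ("Determining Relevance", 2, "Planning"),
    ("Reaching Consensus", 14, "Planning"),
    ("Encoding Planning", 35, "Planning"),
    ("Encoding", 10, "Encoding"),
    ("Key Term Review", 25, "Testing"),
    ("Question-Based Testing", 14, "Testing"),
]

# Sum the step percentages per phase.
totals = defaultdict(int)
for _name, percent, phase in steps:
    totals[phase] += percent

# Convert shares to hours, assuming ~4 person-hours per sentence overall.
hours_per_sentence = 4
hours = {phase: hours_per_sentence * pct / 100 for phase, pct in totals.items()}

for phase, pct in totals.items():
    print(f"{phase}: {pct}% (~{hours[phase]:.2f} h per sentence)")
```

On this reading, actual encoding is the smallest part of the work; most of the cost sits in planning and testing.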
This is the well-known Knowledge Acquisition Problem
- AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
- Automatic Machine Parsing/Editing
  • Not good enough for biology textbook sentences
  • Error rates are too high
  • Need humans in the loop
- Mechanical Turk Authoring
  • Biology expertise is difficult to get
  • Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
- Social Authoring and Crowdsourcing
  • Wikipedia showed this could work for text
  • In 2008, Vulcan started a pilot project to explore social knowledge authoring
How Can We Improve Knowledge Entry?
Crowdsourcing for Better Knowledge Acquisition
edit: wow, I can change the web
let's share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
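For readers unfamiliar with the Gompertz model, the sketch below shows the general shape of the curve: slow start, rapid growth, then saturation at a ceiling. Only the 4.4 million ceiling comes from the caption. The shape parameters `B` and `C` are hypothetical values chosen for illustration and are not fitted to Wikipedia data.

```python
import math


def gompertz(t, a, b, c):
    """Gompertz growth curve a * exp(-b * exp(-c * t)); approaches a as t grows."""
    return a * math.exp(-b * math.exp(-c * t))


CEILING = 4.4e6   # asymptote quoted in the caption (about 4.4 million articles)
B, C = 5.0, 0.3   # hypothetical shape parameters, not fitted to real data

early = gompertz(0, CEILING, B, C)    # small fraction of the ceiling
middle = gompertz(10, CEILING, B, C)  # growth phase
late = gompertz(100, CEILING, B, C)   # effectively at the ceiling

print(f"t=0:   {early:,.0f}")
print(f"t=10:  {middle:,.0f}")
print(f"t=100: {late:,.0f}")
```

The curve never exceeds the ceiling `a`, which is why such a model predicts a maximum article count rather than unbounded growth.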
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
- ~17 edits/page on average in Wikipedia (with high variance)
- Wikipedia's Neutral Point of View
Convention
- Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
- Trivial to edit by anyone
- Tracking of all changes, one-step rollback
- Every article has a "Talk" page for discussion
- Notification facility allows anyone to "watch" an article
- Sufficient security on pages, logins can be required
- A hierarchy of administrators, gardeners, and editors
- Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross, …
• … all German cars, with engine size, acceleration data, …
Can you find:
• Sci-Fi movies made between 2000-2008 that cost less than $10000 and made $30000?
• Skyscrapers with 50+ floors, built after 2000, in Shanghai (or Chinese cities with 1000000+ people)?
• Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds?
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
• All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
• Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
• A map showing where all Mercedes-Benz vehicles are manufactured
• All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more, built in year 2000 (2001, 2002) and after, sorted by built year, floors, …, grouped by cities, regions, …
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web compatible
Semantic Wiki: Two Perspectives
Wikis for Metadata
Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
- Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
- Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
- "capital_of", "contains", "born_in", …
Querying Interface Support
- E.g. "[[Category:Member]] [[Age::<30]]" (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic wikis facilitate consensus over data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic wikis are "schema-last"
Databases require DBAs and schema design
Semantic wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi - Knowledge in a Wiki
Knoodl - semantic collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware - integrates semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
Semantic MediaWiki and SEMANTIC EXTENSIONS
Semantic MediaWiki Software
Born at AIFB in 2005
- Open source (GPL), well documented
- Vulcan kicks off SMW+ in 2007
Active development
- Commercial support available
- World-wide community
International Conferences
- Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
- Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students.
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
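To show what such annotations amount to as data, here is a small Python sketch that pulls `[[Property::value]]` and `[[Property::value|label]]` annotations out of a page's wikitext and turns them into (page, property, value) triples. It is an illustration only: the regular expression and the `extract_annotations` helper are hypothetical, and SMW's real parser handles many more cases.

```python
import re

# Matches [[Property::value]] or [[Property::value|display text]].
# Plain links and category links (single colon) are deliberately not matched.
ANNOTATION = re.compile(r"\[\[([^\[\]|:]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(page_title, wikitext):
    """Return (page, property, value) triples for each semantic annotation."""
    triples = []
    for match in ANNOTATION.finditer(wikitext):
        prop, value, _label = match.groups()  # the display label is not stored
        triples.append((page_title, prop.strip(), value.strip()))
    return triples


text = (
    "Beijing University is located in [[Has location::Beijing]] with "
    "[[Has population::30000|about 30 thousand]] students. [[Category:Universities]]"
)
triples = extract_annotations("Beijing University", text)
for triple in triples:
    print(triple)
```

The display text after the pipe changes only what the reader sees; the stored value stays the machine-readable `30000`.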
Special Properties
"Has type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allows value" (allowed values) is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
In Halo Extensions there is domain and range support
- RDFS expressivity
- Semantic Gardening extension also supports "Cardinality"
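The effect of type and allowed-value declarations can be sketched as a simple validation step. The Python below is a hedged illustration, not SMW's implementation: the `property_definitions` table, the `validate` helper, and the "Has priority" property are hypothetical. Unknown properties are accepted, on the assumption that a schema-last wiki lets users annotate first and declare the property later.

```python
# Hypothetical property declarations, mirroring "Has type" and "Allows value".
property_definitions = {
    "Has priority": {"type": "String", "allowed": {"Low", "Medium", "High"}},
    "Has population": {"type": "Number", "allowed": None},
}


def validate(prop, value):
    """Check a raw annotation value against its declared type and allowed values."""
    definition = property_definitions.get(prop)
    if definition is None:
        return True  # assumption: undeclared properties are accepted (schema-last)
    if definition["type"] == "Number":
        try:
            float(value)
        except ValueError:
            return False
    allowed = definition["allowed"]
    return allowed is None or value in allowed


print(validate("Has priority", "Medium"))
print(validate("Has priority", "Urgent"))
print(validate("Has population", "2200000"))
print(validate("Has population", "many"))
```

Keeping the declarations in ordinary data, next to the content they constrain, is what lets wiki users evolve the schema without a database administrator.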
Define Classes
Project Halo Development Strategy
KR
Expert
Single Domain
Expert
Small Team of
Domain and KR
Experts
Community of
Scientists
Teachers and KR
Experts
Logic
Queries
AP Question
Answering
AP QA
General QA
Education
AP QA
General QA
Education
Research
Authors
Uses
Pilot Phase II Inquire DA
3 Partial
AP Syllabi
Single
Textbook Partial
AP Syllabus Complete
Domain
2002-2003 2004-2009 2010-2015 2016-
What is Inquire
Inquire is a concept for a new kind of electronic textbook ndash Based on Campbell and Reese
Biology
Inquire is a standard electronic textbook but also ndash Contains a full underlying knowledge
base of the bookrsquos contents
ndash Answers the readerrsquos questions and provides tailored instruction
ndash Core technology applies to other areas of biology and other exact sciences (chemistry physics geology etc)
bull Is less applicable in history art languages philosophy etc
ndash Should support other languages
Inquirersquos Design Goals
An Inquire user can radic READ the textbook contents
dynamically interactively
radic ASK questions and get explanations on any subject in the book
ndash LEARN and master the subject through individualized tutoring
ndash CREATE and explore their own conceptualizations of the material
A Calculator for Biology ndash Simple questions are answered
from specific knowledge in the text
ndash Fully embedded in the textbook
Building a Question-Answering Textbook
Inquire Application
Knowledge Authoring
Question Answering
Educational Impact
Factual
Relationship
Biology Core Themes
All Questions
Domain experts
(biologists) enter
knowledge from
Campbellrsquos Biology
textbook into a
structured logical
knowledge base
The knowledge
base is connected
to a electronic
version of the
textbook running
on an iPad that
enables students
to ask questions
about the material
as they read
Measure retention
and understanding
for students using
Inquire against
students using
either a paper or
electronic version
of the textbook
Inquire answers a
wide range of
questions by
performing logical
inference on the
knowledge entered
into the knowledge
base
Inquire Demo
Digital Textbook iPad Application ndash eBook reader (with standard features)
Knowledge-based features ndash Glossary pages (from KB)
ndash Suggested questions and answers
ndash Free-form QA dialog
AURA Knowledge Server ndash Concept taxonomy from Campbell Biology
ndash AURA Knowledge Base (21 chapters)
ndash Suggested question generation
ndash Reasoning and question answering system
ndash AAAI Best Video of 2012
httpwwwyoutubecomwatchv=fTiW31MBtfA
Inquire vs Siri vs Watson vs Search Engines
Project Halorsquos Inquire ndash Subject Matter biology
ndash Questions Question suggestion and simple English queries
ndash Answers Formatted data and textbook content exact answers
Applersquos Siri ndash Subject Matter Actions on iPhone plus Wolfram Alpha searches
ndash Questions Voice commands short dialogs
ndash Answers Performs tasks Alpha-formatted pages
IBMrsquos Watson ndash Subject Matter General knowledge trivia attempting medical
ndash Questions Jepoardy-style ldquoreversedrdquo questions
ndash Answers Single word or short phrase answers
Google Yahoo Bing Baidu ndash Subject Matter Open domain web knowledge
ndash Questions keywords and search queries
ndash Answers web page listings
13
Three Main Components of Inquire
Task Encoding knowledge sentence-by-sentence
ndash Define a computer language for capturing knowledge
ndash Represent background knowledge and explicit knowledge
Systems AURA Semantic Media Wiki (SMW)
Metrics ndash Coverage of book at
various depths
ndash Time per sentence
RampD thrusts ndash SMW Lower authoring
costs by crowdsourcing
ndash SILK Capturing more knowledge
Task Map a student question to an answerable query
ndash English language interpretation
ndash Suggest good questions to student
Systems AURA Question Processor
Metrics ndash Usefulness of suggested
questions
ndash Accuracy of free-form question processing
RampD thrusts ndash Rephrasing
ndash Textual Entailment
ndash Learning suggested question mappings
ndash Question generation from the knowledge base
Task Generate a useful answer to a student question
ndash Derive an answer from the knowledge base
ndash English generation and media selection
ndash Answer presentation and relevance pruning
Systems AURA Cyc Text Analysis
Metrics ndash Answer accuracy
ndash Answer usefulness to student users
ndash Time to answer
RampD thrusts ndash Hybrid system Answer
more questions
Knowledge Entry Question Processing Question Answering
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US both high school and college
Total Core Sentences 28878
56 Chapters 262 Sections ndash Avg 47 sectionschapter 515
sentenceschapter 109 sentencessection
ndash Relevant sentences in book (extrapolation from Chaps 6910) 49 14150 sentences
750 Core library concepts
~6000 total concepts in taxonomy ndash ~3000 concepts in Chaps 2-22
~65000 knowledge base assertions
~50000 suggested questions
4 hourssentence authoring time ndash Design planning encoding testing
refining new relation development
60-70 new encoding relationsyear
Inquire's Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly
- ~4 person-hours per sentence
- Intensive testing and knowledge refinement
Biology teams and AI experts represent each sentence
[Workflow diagram. Roles: Senior SME, Senior KR Expert. Steps: Concept Map Planning, Concept Map Entry, Concept Map Testing, Content Review, KR Review, Assess & Correct. Knowledge Entry Group: Delhi, India]
Knowledge Entry Process
1) Determining Relevance -- 2% time
- Highlighting; Diagram Analysis; QA Check
- Status Labeling: Relevant / Irrelevant (Closed)
2) Reaching Consensus -- 14% time
- Universal Truth (UT) Authoring; Concept Chosen; QA Check
3) Encoding Planning -- 35% time
- Group Common UTs; ID KR/KE Issues
- ID Already Encoded; Write How to Encode
- Pre-Planning QA Check
- Status Labeling: Encoding Complete / KR Issue (Closed)
4) Encoding -- 10% time
- Encode File; JIRA Issues; QA Check
- Status Labeling: Encoding Complete / KE Issue
5) Key Term Review -- 25% time
- KR Evaluated by Modeling Expert and Biologist; Encoder Makes Changes
- KR Evaluated by Modeling Expert and Biologist
- QA Check
6) Question-Based Testing -- 14% time
- Use Minimal Test Suite; Reasoning; JIRA Issues Filed
- Encoder Fills KB Gaps
- QA Check with Screenshots of "Passing" Comparison and Relationship Questions
Color key: Peach = EVS; Light Blue = SRI
Planning (51%)
Testing (39%)
Encoding (10%)
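The three summary percentages appear to be sums of the six per-step shares from the Knowledge Entry Process slide. The sketch below checks that reading. The assignment of steps to phases is an assumption inferred from the totals, not something the slides state.

```python
# Per-step share of authoring time (percent), as listed on the process slide.
STEP_SHARE = {
    "Determining Relevance": 2,
    "Reaching Consensus": 14,
    "Encoding Planning": 35,
    "Encoding": 10,
    "Key Term Review": 25,
    "Question-Based Testing": 14,
}

# Assumed grouping of the six steps into the three summary phases.
PHASES = {
    "Planning": ["Determining Relevance", "Reaching Consensus", "Encoding Planning"],
    "Encoding": ["Encoding"],
    "Testing": ["Key Term Review", "Question-Based Testing"],
}


def phase_totals(step_share, phases):
    """Sum the per-step percentages inside each phase."""
    return {
        phase: sum(step_share[step] for step in steps)
        for phase, steps in phases.items()
    }


totals = phase_totals(STEP_SHARE, PHASES)
print(totals)
```

With this grouping, the encoding step itself is the smallest part of the effort; planning and review dominate.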
How Can We Improve Knowledge Entry?
This is the well-known Knowledge Acquisition Problem
- AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
- Automatic Machine Parsing/Editing
  • Not good enough for biology textbook sentences
  • Error rates are too high
  • Need humans in the loop
- Mechanical Turk Authoring
  • Biology expertise is difficult to get
  • Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
- Social Authoring and Crowdsourcing
  • Wikipedia showed this could work for text
  • In 2008, Vulcan started a pilot project to explore social knowledge authoring
Crowdsourcing for Better Knowledge Acquisition
"Edit": wow, I can change the web!
Let's share and publish knowledge to make an [[encyclopedia]]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
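For readers unfamiliar with the Gompertz model, a minimal sketch of the curve follows. The ceiling of 4.4 million reads the slide's figure as 4.4 million; the displacement and rate parameters are purely hypothetical and are not fitted to any Wikipedia data.

```python
import math


def gompertz(t, ceiling, displacement, rate):
    """Gompertz growth curve: ceiling * exp(-displacement * exp(-rate * t))."""
    return ceiling * math.exp(-displacement * math.exp(-rate * t))


CEILING = 4_400_000    # assumed asymptote (articles)
DISPLACEMENT = 15.0    # hypothetical: sets how low the curve starts
RATE = 0.4             # hypothetical: growth rate per year

for year in (0, 5, 10, 20, 40):
    print(year, round(gompertz(year, CEILING, DISPLACEMENT, RATE)))
```

The curve grows slowly at first, then quickly, then flattens as it approaches the ceiling, which is why the model predicts a maximum article count.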
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
- ~17 edits/page on average in Wikipedia (with high variance)
- Wikipedia's Neutral Point of View
Convention
- Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
• Trivial to edit by anyone
• Tracking of all changes, one-step rollback
• Every article has a "Talk" page for discussion
• Notification facility allows anyone to "watch" an article
• Sufficient security on pages; logins can be required
• A hierarchy of administrators, gardeners, and editors
• Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
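Change tracking with one-step rollback can be illustrated with a toy revision history. This is a minimal sketch of the idea only: the class and method names are hypothetical and do not mirror MediaWiki's internals.

```python
class WikiPage:
    """Toy page that keeps every revision so any edit can be undone."""

    def __init__(self, title, text=""):
        self.title = title
        self.revisions = [("system", text)]   # list of (editor, text)

    @property
    def text(self):
        return self.revisions[-1][1]

    def edit(self, editor, new_text):
        self.revisions.append((editor, new_text))

    def rollback(self, editor):
        """Restore the text from before the latest edit, as a new revision."""
        if len(self.revisions) < 2:
            return False
        previous_text = self.revisions[-2][1]
        self.revisions.append((editor, previous_text))
        return True


page = WikiPage("Glycolysis", "Glycolysis occurs in the cytosol.")
page.edit("vandal", "lol")
page.rollback("gardener")
print(page.text)
print(len(page.revisions), "revisions kept")
```

The rollback is itself recorded as a revision, so nothing is ever lost and even a rollback can be reviewed or undone.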
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross…
• … all German cars, with engine size, acceleration data…
Can you find
• Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000?
• Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)?
• Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds?
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
• All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
• Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
• A map showing where all Mercedes-Benz vehicles are manufactured
• All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more and built in year 2000 (2001, 2002) and after, sorted by built year, floors, …, grouped by cities, regions, …
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web compatible
Semantic Wiki
Two Perspectives
- Wikis for Metadata
- Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
- Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
- Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
- "capital_of", "contains", "born_in", …
Querying Interface Support
- E.g. "[[Category:Member]] [[Age::<30]]" (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
- Databases require DBAs and schema design
- Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
• AceWiki
• ArtificialMemory
• Wagn - Ruby on Rails-based
• KiWi - Knowledge in a Wiki
• Knoodl - Semantic Collaboration tool and application platform
• Metaweb - the software that powers Freebase
• OntoWiki
• OpenRecord
• PhpWiki
• Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
• Swirrl - a spreadsheet-based semantic wiki application
• TaOPis - has a semantic wiki subsystem based on Frame logic
• TikiWiki CMS/Groupware - integrates Semantic links as a core feature
• zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
- Open source (GPL), well documented
- Vulcan kicks off SMW+ in 2007
Active development
- Commercial support available
- World-wide community
International Conferences
- Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
- Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
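To make the markup concrete, here is a minimal sketch of how inline annotations of the form [[Property::Value|display text]] could be turned into subject-property-value triples and into reader-facing text. The regular expression and function names are illustrative assumptions, not SMW's actual parser.

```python
import re

# Matches [[Property::Value]] and [[Property::Value|display text]].
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(page_title, wikitext):
    """Return (subject, property, value) triples found in the wikitext."""
    return [
        (page_title, m.group(1).strip(), m.group(2).strip())
        for m in ANNOTATION.finditer(wikitext)
    ]


def render(wikitext):
    """Replace each annotation with its display text, or the raw value."""
    return ANNOTATION.sub(
        lambda m: m.group(3) if m.group(3) is not None else m.group(2),
        wikitext,
    )


text = (
    "Beijing University is located in [[Has location::Beijing]] with "
    "[[Has population::30000|about 30 thousand]] students."
)
triples = extract_annotations("Beijing University", text)
print(triples)
print(render(text))
```

The same sentence serves both audiences: readers see ordinary prose, while the wiki stores two machine-readable facts about the page.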
Special Properties
"Has type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allowed Values" is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
In Halo Extensions there is domain and range support
- RDFS expressivity
- Semantic Gardening extension also supports "Cardinality"
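A rough sketch of what type and allowed-value declarations buy you: values can be checked against the property's declaration. The schema dictionary and the "Has priority" property below are hypothetical stand-ins for property pages, and the check is a simplification of whatever validation SMW itself performs.

```python
# Hypothetical stand-in for declarations made on property pages.
PROPERTY_SCHEMA = {
    "Has population": {"type": "Number"},
    "Has location": {"type": "Page"},
    "Has priority": {"type": "String", "allowed": {"Low", "Medium", "High"}},
}


def validate(prop, value):
    """Return a list of problems; an empty list means the value is acceptable."""
    schema = PROPERTY_SCHEMA.get(prop)
    if schema is None:
        # Sketch assumption: undeclared properties are accepted (schema-last).
        return []
    problems = []
    if schema["type"] == "Number":
        try:
            float(value)
        except ValueError:
            problems.append(f"{prop}: {value!r} is not a number")
    allowed = schema.get("allowed")
    if allowed is not None and value not in allowed:
        problems.append(f"{prop}: {value!r} is not one of {sorted(allowed)}")
    return problems


print(validate("Has population", "30000"))
print(validate("Has population", "many"))
print(validate("Has priority", "Urgent"))
```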
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China; Category:Skyscrapers by country
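Class inheritance through categories amounts to following subcategory links upward. The sketch below assumes that "Skyscrapers in Shanghai" is a subcategory of "Skyscrapers in China", which in turn sits under "Skyscrapers by country". The dictionaries are illustrative, not real wiki data.

```python
# Assumed subcategory links: child category -> parent categories.
PARENTS = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}

# Categories placed directly on a page (subset of the slide's example).
PAGE_CATEGORIES = {
    "Jin Mao Tower": ["Skyscrapers in Shanghai", "Hotels in Shanghai"],
}


def all_categories(page):
    """Direct categories of a page plus every ancestor category."""
    seen = set()
    stack = list(PAGE_CATEGORIES.get(page, []))
    while stack:
        category = stack.pop()
        if category not in seen:
            seen.add(category)
            stack.extend(PARENTS.get(category, []))
    return seen


print(sorted(all_categories("Jin Mao Tower")))
```

A query for pages in "Skyscrapers in China" can then match Jin Mao Tower even though the page is only tagged with the Shanghai category.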
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
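The query above behaves like a filter over annotated pages. The following sketch emulates it over a small in-memory list. The tower names and values are made up for illustration, and the strict greater-than/less-than comparison is an assumption of this sketch (how SMW treats these comparators depends on its configuration).

```python
import operator

# Hypothetical page data; real values would come from the wiki's annotations.
PAGES = [
    {"title": "Tower A", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 88, "Year built": 1998},
    {"title": "Tower B", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 101, "Year built": 2008},
    {"title": "Tower C", "categories": {"Skyscrapers"},
     "Located in": "Japan", "Floor count": 60, "Year built": 1993},
    {"title": "Tower D", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 40, "Year built": 1990},
]

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def ask(pages, category, conditions):
    """Titles of pages in `category` meeting every (property, op, value)."""
    results = []
    for page in pages:
        if category not in page["categories"]:
            continue
        if all(prop in page and OPS[op](page[prop], value)
               for prop, op, value in conditions):
            results.append(page["title"])
    return results


matches = ask(PAGES, "Skyscrapers", [
    ("Located in", "=", "China"),
    ("Floor count", ">", 50),
    ("Year built", "<", 2000),
])
print(matches)
```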
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
- Usability extension to Semantic MediaWiki
- Increases user consensus
- Increases use of semantic data
Semantic MediaWiki
- Core Semantic Wiki engine
- Authoring of explicit knowledge in content
- Basic reasoning capabilities
SMW+
- Shrink-wrap suite of open source software products
- Comes with ready-to-use ontology
- Easy to procure and install
MediaWiki
- Powerful Wiki engine
- Basic CMS feature set
SMW Extensions - Towards Great Apps
• Data I/O: Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
• Query and Browsing: Semantic Toolbar, Semantic Drilldown, Faceted Search, …
• Visualization: Semantic Result Printers, Tree View, Exhibit, Flash charts, …
• Other useful extensions
  - HaloACL, Deployment, Triplestore Connector, Simple Rules, …
  - Semantic WikiTags and Subversion Integration extensions
  - Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
- Domain: German cars (in Wikipedia)
- Wikipedia merged with the power of a database
- Help readers and writers be more productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION
We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
- Fill in data (content)
- Develop schema (ontology)
- Design UI: skin, form & templates
- Choose extensions
Video Semantic Wikis for A New Problem (2008)
Axes: Increasing technical complexity →; ← Increasing user participation
• Social tag-based characterization
  - Keyword search over tag data
  - Inconsistent semantics
  - Easy to engineer
• Algorithm-based object characterization
  - Database-style search
  - Consistent semantics
  - Extremely difficult to engineer
• Semantic Entertainment Wiki: social database-style characterization
  - Database search + wiki text search
  - Semantic consistency via wiki mechanisms
  - Easy to engineer
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns ("touchdowns with big hits")
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT
With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
- Pro: Very quick to obtain
- Con: Hard to customize
- Con: Expensive
- Examples: Microsoft Project, Version One, Microsoft SharePoint
Custom Development
- Con: Slow to develop
- Pro: Extremely flexible
- Con: High cost to develop and maintain
- Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
- Pro: Still quick to program
- Pro: Easy to customize
- Pro: Low-moderate cost
- Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO
Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate."
    - UT: Glycolysis occurs in cytosol (concept: Glycolysis)
    - UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    - UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
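One way to picture the hand-off from the wiki to AURA is as a list of UT records, each tied to a concept and grouped by concept before encoding. The sketch below uses the three UTs from the glycolysis example. The record layout and function name are assumptions, not the AURAwiki data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UniversalTruth:
    text: str      # the universally-quantified statement
    concept: str   # the concept the statement is filed under


# The three UTs derived from the example textbook sentence.
UTS = [
    UniversalTruth("Glycolysis occurs in cytosol", "Glycolysis"),
    UniversalTruth("During cellular respiration, glycolysis is the first step",
                   "Cellular Respiration"),
    UniversalTruth("During glycolysis, breakdown of 1 glucose results in "
                   "2 pyruvate molecules", "Glycolysis"),
]


def group_by_concept(uts):
    """Collect UT texts under the concept they describe."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


by_concept = group_by_concept(UTS)
for concept, statements in by_concept.items():
    print(concept, "->", len(statements), "UT(s)")
```

Grouping by concept mirrors the "Group Common UTs" step of encoding planning, so an encoder can work through one concept at a time.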
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs; previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
- Build a "Knowledge Integration Center"
Current Page Designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise, embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
- HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
- Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
[Bar chart: HW score and quiz score for the Inquire, Ablated, and Textbook groups on an A-D grade scale; bar labels 81, 88, 74, 75, 71, 81; group sizes n=24, n=25, n=23]
[Chart: Quiz Score Distribution; students (axis 0-50) by grade A, B, C, D, F for Inquire, Ablated, and Textbook]
What is Inquire?
Inquire is a concept for a new kind of electronic textbook
- Based on Campbell and Reece, Biology
Inquire is a standard electronic textbook, but also
- Contains a full underlying knowledge base of the book's contents
- Answers the reader's questions and provides tailored instruction
- Core technology applies to other areas of biology and other exact sciences (chemistry, physics, geology, etc.)
  • Is less applicable in history, art, languages, philosophy, etc.
- Should support other languages
Inquire's Design Goals
An Inquire user can
✓ READ the textbook contents dynamically, interactively
✓ ASK questions and get explanations on any subject in the book
- LEARN and master the subject through individualized tutoring
- CREATE and explore their own conceptualizations of the material
A Calculator for Biology
- Simple questions are answered from specific knowledge in the text
- Fully embedded in the textbook
Building a Question-Answering Textbook
Knowledge Authoring: Domain experts (biologists) enter knowledge from Campbell's Biology textbook into a structured logical knowledge base.
Inquire Application: The knowledge base is connected to an electronic version of the textbook running on an iPad that enables students to ask questions about the material as they read.
Educational Impact: Measure retention and understanding for students using Inquire against students using either a paper or electronic version of the textbook.
Question Answering: Inquire answers a wide range of questions by performing logical inference on the knowledge entered into the knowledge base.
Factual | Relationship | Biology Core Themes | All Questions
Inquire Demo
Digital Textbook iPad Application
- eBook reader (with standard features)
Knowledge-based features
- Glossary pages (from KB)
- Suggested questions and answers
- Free-form QA dialog
AURA Knowledge Server
- Concept taxonomy from Campbell Biology
- AURA Knowledge Base (21 chapters)
- Suggested question generation
- Reasoning and question answering system
- AAAI Best Video of 2012
http://www.youtube.com/watch?v=fTiW31MBtfA
Inquire vs. Siri vs. Watson vs. Search Engines
Project Halo's Inquire
- Subject Matter: biology
- Questions: Question suggestion and simple English queries
- Answers: Formatted data and textbook content, exact answers
Apple's Siri
- Subject Matter: Actions on iPhone, plus Wolfram Alpha searches
- Questions: Voice commands, short dialogs
- Answers: Performs tasks, Alpha-formatted pages
IBM's Watson
- Subject Matter: General knowledge, trivia, attempting medical
- Questions: Jeopardy-style "reversed" questions
- Answers: Single word or short phrase answers
Google, Yahoo, Bing, Baidu
- Subject Matter: Open domain, web knowledge
- Questions: keywords and search queries
- Answers: web page listings
Three Main Components of Inquire
Task Encoding knowledge sentence-by-sentence
ndash Define a computer language for capturing knowledge
ndash Represent background knowledge and explicit knowledge
Systems AURA Semantic Media Wiki (SMW)
Metrics ndash Coverage of book at
various depths
ndash Time per sentence
RampD thrusts ndash SMW Lower authoring
costs by crowdsourcing
ndash SILK Capturing more knowledge
Task Map a student question to an answerable query
ndash English language interpretation
ndash Suggest good questions to student
Systems AURA Question Processor
Metrics ndash Usefulness of suggested
questions
ndash Accuracy of free-form question processing
RampD thrusts ndash Rephrasing
ndash Textual Entailment
ndash Learning suggested question mappings
ndash Question generation from the knowledge base
Task Generate a useful answer to a student question
ndash Derive an answer from the knowledge base
ndash English generation and media selection
ndash Answer presentation and relevance pruning
Systems AURA Cyc Text Analysis
Metrics ndash Answer accuracy
ndash Answer usefulness to student users
ndash Time to answer
RampD thrusts ndash Hybrid system Answer
more questions
Knowledge Entry Question Processing Question Answering
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US both high school and college
Total Core Sentences 28878
56 Chapters 262 Sections ndash Avg 47 sectionschapter 515
sentenceschapter 109 sentencessection
ndash Relevant sentences in book (extrapolation from Chaps 6910) 49 14150 sentences
750 Core library concepts
~6000 total concepts in taxonomy ndash ~3000 concepts in Chaps 2-22
~65000 knowledge base assertions
~50000 suggested questions
4 hourssentence authoring time ndash Design planning encoding testing
refining new relation development
60-70 new encoding relationsyear
Inquirersquos Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly ndash ~4 person-hours per
sentence
ndash Intensive testing and knowledge refinement
Biology Teams and AI experts
represent each sentence
Senior
SME
Senior KR
Expert
Concept
Map
Planning
Concept
Map Entry
Concept
Map
Testing
Content
Review
KR
Review
Assess amp
Correct
Knowledge Entry Group
Delhi India
Knowledge Entry Process
3) Encoding Planning -- 35 time
Group Common UTs ID KRKE Issues
ID Already Encoded Write How to Encode
Pre-Planning QA Check
Status Labeling Encoding Complete KR Issue (Closed)
2) Reaching Consensus -- 14 time
Universal Truth Authoring Concept Chosen QA Check
1) Determining Relevance -- 2 time
Highlighting Diagram Analysis QA Check
Status Labeling Relevant Irrelevant (Closed)
6) Question-Based Testing -- 14 time
Use Minimal Test Suite Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of ldquoPassing Comparison and Relationship Questions
5) Key Term Review -- 25 time
KR Evaluated by Modeling Expert and Biologist Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
4) Encoding -- 10 time
Encode File JIRA Issues QA Check
Status Labeling Encoding Complete KE Issue
Peach = EVS Light Blue = SRI
Knowledge Entry Process
3) Encoding Planning -- 35 time
Group Common UTs ID KRKE Issues
ID Already Encoded Write How to Encode
Pre-Planning QA Check
Status Labeling Encoding Complete KR Issue (Closed)
2) Reaching Consensus -- 14 time
UT Authoring cMap Chosen QA Check
1) Determining Relevance -- 2 time
Highlighting Diagram Analysis QA Check
Status Labeling Relevant Irrelevant (Closed)
6) Question-Based Testing -- 14 time
Use Minimal Test Suite Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of ldquoPassing Comparison and Relationship Questions
5) Key Term Review -- 25 time
KR Evaluated by Modeling Expert and Biologist Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
4) Encoding -- 10 time
Encode File JIRA Issues QA Check
Status Labeling Encoding Complete KE Issue
Peach = EVS Light Blue = SRI
Planning (51)
Testing (39)
Encoding (10)
This is the well-known Knowledge Acquisition Problem ndash AI authoring is too expensive too slow not scalable
Three Possible Solutions ndash Automatic Machine ParsingEditing
bull Not good enough for biology textbook sentences
bull Error rates are too high
bull Need humans in the loop
ndash Mechanical Turk Authoring bull Biology expertise is difficult to get
bull Mechanical Turk uses individuals but the Knowledge Entry task appears to require coordination judgment discussion and working together
ndash Social Authoring and Crowdsourcing bull Wikipedia showed this could work for text
bull In 2008 Vulcan started a pilot project to explore social knowledge authoring
How Can We Improve Knowledge Entry
19
Crowdsourcing for Better Knowledge Acquisition
20
edit wow I can change the web
letrsquos share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
27
Actual number of articles on enwikipediaorg (thick
blue line) compared with a Gompertz model that leads
eventually to a maximum of about 44 million articles
(thin green line)
A Key Feature of Wiki
28
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration ndash ~17 editspage on average in
Wikipedia (with high variance)
ndash Wikipediarsquos Neutral Point of View
Convention ndash Users follow customs and
conventions to engage with articles effectively
29
Software Support Makes Wikis Successful
Trivial to edit by anyone Tracking of all changes one-
step rollback Every article has a ldquoTalkrdquo page
for discussion Notification facility allows anyone
to ldquowatchrdquo an article Sufficient security on pages
logins can be required A hierarchy of administrators
gardeners and editors Software Bots recognize certain
kinds of vandalism and auto-revert or recognize articles that need work and flag them for editors
30
However Finding Information Is hellip
Wikipedia has articles abouthellip
bull hellip all movies with cast director
budget running time grosshellip
hellip all German cars with engine
size accelerating datahellip
Can you find
Sci-Fi movies made between
2000-2008 that cost less than
$10000 and made $30000
Skyscrapers with 50+ floors
and built after 2000 in
Shanghai (or Chinese cities
with 1000000+ people)
31
Or German(Porsche) cars that
accelerate from 0-100kmh in 5
seconds
Can Search Solve the Problem
32
Answer is Hidden Deeply in
33
To Find More Info
All Porsche vehicles made in
Germany that accelerate from 1-
100 kmh less than 4 seconds
Sci-Fi movies made between
2000-2008 that cost less than
$10M and gross more than $30M
A map showing where all
Mercedes-Benz vehicles are
manufactured
All skyscrapers in China (Japan
Thailandhellip) of 50 (406070) floors
or more and built in year 2000
(20012002) and after sorted by
built year floorshellip grouped by
cities regionshellip
35
What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
36
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
37
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
ndash CategoryTags Namespaces Title Versioning
Typed Content (built-ins + user created eg categories)
ndash PageCard Date Number URLEmail String hellip
Typed Links (eg properties)
ndash ldquocapital_ofrdquo ldquocontainsrdquo ldquoborn_inrdquohellip
Querying Interface Support
ndash Eg ldquo[[CategoryMember]] [[Agelt30]]rdquo (in SMW)
41
Characteristics of Semantic Wikis
42
What is the Promise of Semantic Wikis
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed user-maintained user-defined
Easy to use as an extension of text authoring
43
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are ldquoSchema-Lastrdquo Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi ndash Knowledge in a Wiki
Knoodl ndash Semantic
Collaboration tool and
application platform
Metaweb - the software that
powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an
extension to MediaWiki that
turns it into a semantic wiki
Swirrl - a spreadsheet-based
semantic wiki application
TaOPis - has a semantic wiki
subsystem based on Frame
logic
TikiWiki CMSGroupware
integrates Semantic links as a
core feature
zAgile Wikidsmart - semantically
enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
ndash Open source (GPL) well documented
ndash Vulcan kicks off SMW+ in 2007
Active development
ndash Commercial support available
ndash World-wide community
International Conferences
ndash Last SMWCon 425-27 2012 in Carlsbad CA
ndash Next SMWCon 1024-26 2012 in Koln Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has locationBeijing]] with
[[Has population30000|about 30 thousands]]
students
In page PropertyHas locationrdquo [[Has typePage]]
In page PropertyHas populationrdquo [[Has typenumber]]
Special Properties
ldquoHas Typerdquo is a pre-defined ldquospecialrdquo property for meta-
data
ndash Example [[Has typeString]]
ldquoAllowed Valuesrdquo is another special property
ndash [[Allows valueLow]]
ndash [[Allows valueMedium]]
ndash [[Allows valueHigh]]
In Halo Extensions there are domain and range support
ndash RDFs expressivity
ndash Semantic Gardening extension also supports ldquoCardinalityrdquo
49
Define Classes
50
Beijing is a city in [[Has
countryChina]] with population
[[Has population2200000]]
[[CategoryCities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in hellip
[[Categories 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore Owings and Merrill buildings]]
CategorySkyscrapers in China Category Skyscrapers by country
Possible Database-style Query over Data
51
ask
[[CategorySkyscrapers]]
[[Located inChina]]
[[Floor countgt50]]
[[Year builtlt2000]]
hellip
Ex Skyscrapers in China higher than
50 stories built before 2000
ASKSPARQL query target
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension Semantic MediaWiki
More Extensions and Applications
52
MediaWiki SMW+
Halo Extension
Usability extension to Semantic MediaWiki
Increases user consensus
Increases use of semantic data
Semantic MediaWiki
Core Semantic Wiki engine
Authoring of explicit knowledge in content
Basic reasoning capabilities
SMW+
Shrink wrap suite of open source software products
Comes with ready to use ontology
Easy to procure and install
MediaWiki
Powerful Wiki engine
Basic CMS feature set
SMW Extensions ndashTowards Great Apps
56
bull Halo Extensions Wiki Object Model Semantic Forms Notification hellip
Data IO
bull Semantic Toolbar Semantic Drilldown Faceted Searchhellip
Query and Browsing
bull Semantic Result Printers Tree View Exhibit Flash chartshellip
Visualization
bull HaloACL Deployment Triplestore Connector Simple Ruleshellip
bull Semantic WikiTags and Subversion Integration extensions
bull Linked Data Extension with R2R and SILK from FUBerlin
Other useful extensions
Semantic Wikipedia
Ultrapedia An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
57
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
62
FROM WIKI TO APPLICATION We Started With a Wiki Ended with an App
63
Build Social Semantic Web Apps
64
Fill in data (content)
Develop schema (ontology)
Design UI skin form amp templates
Choose extensions
65
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity rarr
larr Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic
Entertainment
Wiki
Semantic Seahawks Football Wiki
66
Semantic Entertainment Query Result Highlight Reel
Commercial LookFeel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (ldquotouchdowns with big hitsrdquo)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
72
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application Mini-Olympics
77
78
79
Requirements for Wiki ldquoDevelopersrdquo
One need not
ndash Write code like a hardcore programmer
ndash Design setup RDBMS or make frequent
schema changes
ndash Possess knowledge of a senior system
admin
Instead one need
ndash Configure the wiki with desired extensions
ndash Design and evolve the data model
(schema)
ndash Design Content
bull Customize templates forms styles skin etc
80
We Can Build
81
Effectiveness of SMW as a Platform Choice
Packaged Software
Very quick to
obtain
N Hard to customize
N Expensive
Microsoft Project
Version One
Microsoft
SharePoint
Custom Development
N Slow to develop
Extremely flexible
N High cost to develop
and maintain
NET Framework
J2EE hellip
Ruby on rails
82
SMW+ Suite
Still quick to
program
Easy to customize
Low-moderate cost
Mini-Olympics
Scrum Wiki
RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: “Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate”
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
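One way to picture the UTs in the glycolysis example as wiki data is a list of (sentence, concept) records grouped by concept. This is only a sketch: the `UT` class and the `group_by_concept` helper are hypothetical names introduced for illustration, not part of AURA or SMW+.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """A universal truth tied to the concept it describes (hypothetical record)."""
    text: str
    concept: str


# The three UTs from the glycolysis example.
UTS = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration, glycolysis is the first step", "Cellular Respiration"),
    UT("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules", "Glycolysis"),
]


def group_by_concept(uts):
    """Return a dict mapping each concept to the UT texts filed under it."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


by_concept = group_by_concept(UTS)
```

Grouping by concept mirrors the slide's framing: a single textbook sentence can yield UTs that belong to different concept pages.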
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAWiki
– Build a “Knowledge Integration Center”
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise, embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
[Bar chart: HW score and quiz score by group (Inquire, Ablated, Textbook); group sizes n=24, n=25, n=23; bar labels 81, 88, 74, 75, 71, 81]
P-values from two-tailed t-tests
– HW scores: Full vs. Ablated 0.12, Full vs. Textbook 0.02, Ablated vs. Textbook 0.52
– Quiz scores: Full vs. Ablated 0.002, Full vs. Textbook 0.05, Ablated vs. Textbook 0.18
Quiz Score Distribution
[Bar chart: % of students (0 to 50) receiving grades A, B, C, D, F; series: Inquire, Ablated, Textbook]
Inquire’s Design Goals
An Inquire user can
√ READ the textbook contents dynamically and interactively
√ ASK questions and get explanations on any subject in the book
– LEARN and master the subject through individualized tutoring
– CREATE and explore their own conceptualizations of the material
A Calculator for Biology
– Simple questions are answered from specific knowledge in the text
– Fully embedded in the textbook
Building a Question-Answering Textbook
Knowledge Authoring
– Domain experts (biologists) enter knowledge from Campbell’s Biology textbook into a structured logical knowledge base
Inquire Application
– The knowledge base is connected to an electronic version of the textbook running on an iPad that enables students to ask questions about the material as they read
Question Answering
– Inquire answers a wide range of questions by performing logical inference on the knowledge entered into the knowledge base
Educational Impact
– Measure retention and understanding for students using Inquire against students using either a paper or electronic version of the textbook
[Chart labels: Factual, Relationship, Biology Core Themes, All Questions]
Inquire Demo
Digital Textbook iPad Application
– eBook reader (with standard features)
Knowledge-based features
– Glossary pages (from KB)
– Suggested questions and answers
– Free-form QA dialog
AURA Knowledge Server
– Concept taxonomy from Campbell Biology
– AURA Knowledge Base (21 chapters)
– Suggested question generation
– Reasoning and question answering system
– AAAI Best Video of 2012
http://www.youtube.com/watch?v=fTiW31MBtfA
Inquire vs. Siri vs. Watson vs. Search Engines
Project Halo’s Inquire
– Subject matter: biology
– Questions: question suggestion and simple English queries
– Answers: formatted data and textbook content, exact answers
Apple’s Siri
– Subject matter: actions on iPhone, plus Wolfram Alpha searches
– Questions: voice commands, short dialogs
– Answers: performs tasks, Alpha-formatted pages
IBM’s Watson
– Subject matter: general knowledge, trivia; attempting medical
– Questions: Jeopardy!-style “reversed” questions
– Answers: single-word or short-phrase answers
Google, Yahoo, Bing, Baidu
– Subject matter: open-domain web knowledge
– Questions: keywords and search queries
– Answers: web page listings
Three Main Components of Inquire
Knowledge Entry
Task: Encoding knowledge sentence-by-sentence
– Define a computer language for capturing knowledge
– Represent background knowledge and explicit knowledge
Systems: AURA, Semantic MediaWiki (SMW)
Metrics
– Coverage of book at various depths
– Time per sentence
R&D thrusts
– SMW: Lower authoring costs by crowdsourcing
– SILK: Capturing more knowledge
Question Processing
Task: Map a student question to an answerable query
– English language interpretation
– Suggest good questions to student
Systems: AURA Question Processor
Metrics
– Usefulness of suggested questions
– Accuracy of free-form question processing
R&D thrusts
– Rephrasing
– Textual Entailment
– Learning suggested question mappings
– Question generation from the knowledge base
Question Answering
Task: Generate a useful answer to a student question
– Derive an answer from the knowledge base
– English generation and media selection
– Answer presentation and relevance pruning
Systems: AURA, Cyc, Text Analysis
Metrics
– Answer accuracy
– Answer usefulness to student users
– Time to answer
R&D thrusts
– Hybrid system: Answer more questions
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US, both high school and college
Total core sentences: 28,878
56 chapters, 262 sections
– Avg. 4.7 sections/chapter, 515 sentences/chapter, 109 sentences/section
– Relevant sentences in book (extrapolation from Chaps. 6, 9, 10): 49%, 14,150 sentences
750 core library concepts
~6,000 total concepts in taxonomy
– ~3,000 concepts in Chaps. 2-22
~65,000 knowledge base assertions
~50,000 suggested questions
4 hours/sentence authoring time
– Design, planning, encoding, testing, refining, new relation development
60-70 new encoding relations/year
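The figures on this slide can be cross-checked with a few lines of arithmetic, and they also suggest the scale of the authoring effort. The sketch below uses only numbers from the slide. The total-hours estimate assumes every relevant sentence costs the full 4 hours, which the slide does not state.

```python
# Figures taken from the slide.
TOTAL_CORE_SENTENCES = 28_878
CHAPTERS = 56
SECTIONS = 262
RELEVANT_FRACTION = 0.49   # extrapolated from Chaps. 6, 9, 10
HOURS_PER_SENTENCE = 4     # authoring time per sentence

# Averages; these land close to the slide's rounded figures.
sections_per_chapter = SECTIONS / CHAPTERS
sentences_per_chapter = TOTAL_CORE_SENTENCES / CHAPTERS
sentences_per_section = TOTAL_CORE_SENTENCES / SECTIONS

# Number of sentences judged relevant for encoding.
relevant_sentences = round(TOTAL_CORE_SENTENCES * RELEVANT_FRACTION)

# Assumption: each relevant sentence needs the full authoring time.
estimated_authoring_hours = relevant_sentences * HOURS_PER_SENTENCE

print(f"{sections_per_chapter:.1f} sections/chapter")
print(f"{relevant_sentences} relevant sentences")
print(f"{estimated_authoring_hours} estimated authoring hours")
```

Under that assumption, encoding the whole book would take tens of thousands of person-hours, which is the cost pressure behind the crowdsourcing approach discussed later.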
Inquire’s Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly
– ~4 person-hours per sentence
– Intensive testing and knowledge refinement
Biology teams and AI experts represent each sentence
[Workflow diagram labels: Senior SME; Senior KR Expert; Concept Map Planning; Concept Map Entry; Concept Map Testing; Content Review; KR Review; Assess & Correct; Knowledge Entry Group (Delhi, India)]
Knowledge Entry Process
1) Determining Relevance -- 2% time
– Highlighting; Diagram Analysis; QA Check
– Status labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus -- 14% time
– Universal Truth (UT) Authoring; Concept Chosen; QA Check
3) Encoding Planning -- 35% time
– Group Common UTs; ID KR/KE Issues
– ID Already Encoded; Write How to Encode
– Pre-Planning QA Check
– Status labeling: Encoding Complete, KR Issue (Closed)
4) Encoding -- 10% time
– Encode File; JIRA Issues; QA Check
– Status labeling: Encoding Complete, KE Issue
5) Key Term Review -- 25% time
– KR Evaluated by Modeling Expert and Biologist; Encoder Makes Changes
– KR Evaluated by Modeling Expert and Biologist
– QA Check
6) Question-Based Testing -- 14% time
– Use Minimal Test Suite; Reasoning JIRA Issues Filed
– Encoder Fills KB Gaps
– QA Check with Screenshots of “Passing” Comparison and Relationship Questions
Color key: Peach = EVS, Light Blue = SRI
Planning (51%)
Testing (39%)
Encoding (10%)
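The phase totals follow from the per-step shares. The sketch below adds them up and converts them to hours at ~4 person-hours per sentence. The grouping of steps into Planning, Encoding, and Testing is an assumption inferred from the totals matching; the slide does not spell it out.

```python
# Per-step share of authoring time, in percent, as listed on the slide.
STEP_SHARE = {
    "Determining Relevance": 2,
    "Reaching Consensus": 14,
    "Encoding Planning": 35,
    "Encoding": 10,
    "Key Term Review": 25,
    "Question-Based Testing": 14,
}

# Assumed grouping of steps into phases (inferred, not stated in the source).
PHASES = {
    "Planning": ["Determining Relevance", "Reaching Consensus", "Encoding Planning"],
    "Encoding": ["Encoding"],
    "Testing": ["Key Term Review", "Question-Based Testing"],
}

HOURS_PER_SENTENCE = 4  # ~4 person-hours per sentence

# Percentage of total time spent in each phase.
phase_share = {
    phase: sum(STEP_SHARE[step] for step in steps)
    for phase, steps in PHASES.items()
}

# Person-hours per sentence spent in each phase.
phase_hours = {
    phase: HOURS_PER_SENTENCE * pct / 100
    for phase, pct in phase_share.items()
}
```

On these numbers, actual encoding is a small slice of the effort; planning and testing together account for 90% of the time per sentence.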
How Can We Improve Knowledge Entry?
This is the well-known Knowledge Acquisition Problem
– AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
– Automatic Machine Parsing/Editing
• Not good enough for biology textbook sentences
• Error rates are too high
• Need humans in the loop
– Mechanical Turk Authoring
• Biology expertise is difficult to get
• Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
– Social Authoring and Crowdsourcing
• Wikipedia showed this could work for text
• In 2008 Vulcan started a pilot project to explore social knowledge authoring
Crowdsourcing for Better Knowledge Acquisition
edit: wow, I can change the web
let’s share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
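For readers unfamiliar with the Gompertz model named in the caption, it is an S-shaped growth curve that levels off at a fixed ceiling. The sketch below uses the 4.4 million ceiling from the caption. The shape parameters `b` and `c` are placeholder values chosen for illustration and are not fitted to Wikipedia data.

```python
import math

ASYMPTOTE = 4_400_000  # maximum article count cited in the caption


def gompertz(t, a=ASYMPTOTE, b=15.0, c=0.4):
    """Gompertz growth curve a * exp(-b * exp(-c * t)).

    a is the ceiling the curve approaches; b shifts the curve in time and
    c sets the growth rate. b and c here are illustrative placeholders.
    """
    return a * math.exp(-b * math.exp(-c * t))


# Growth is slow at first, then fast, then flattens toward the ceiling.
early, middle, late = gompertz(0), gompertz(10), gompertz(1000)
```

The curve never exceeds `a`, which is why such a model predicts a maximum number of articles.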
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
– ~17 edits/page on average in Wikipedia (with high variance)
– Wikipedia’s Neutral Point of View
Convention
– Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
Trivial to edit by anyone
Tracking of all changes, one-step rollback
Every article has a “Talk” page for discussion
Notification facility allows anyone to “watch” an article
Sufficient security on pages; logins can be required
A hierarchy of administrators, gardeners, and editors
Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross…
• … all German cars, with engine size, acceleration data…
Can you find
Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000
Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)
Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
A map showing where all Mercedes-Benz vehicles are manufactured
All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more and built in year 2000 (2001, 2002) and after, sorted by built year, floors, …, grouped by cities, regions, …
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
– Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user-created, e.g. categories)
– Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
– “capital_of”, “contains”, “born_in”, …
Querying Interface Support
– E.g. “[[Category:Member]] [[Age::<30]]” (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are “Schema-Last”
Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi - Knowledge in a Wiki
Knoodl - Semantic collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware - integrates semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
– Open source (GPL), well documented
– Vulcan kicks off SMW+ in 2007
Active development
– Commercial support available
– World-wide community
International Conferences
– Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
– Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has location::Beijing]] with
[[Has population::30000|about 30 thousand]]
students
In page “Property:Has location”: [[Has type::Page]]
In page “Property:Has population”: [[Has type::Number]]
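To make the annotation syntax concrete, the sketch below pulls property/value pairs out of the Beijing University sentence and shows the text a reader would see. It is a simplified illustration, not SMW's actual parser; the `extract_annotations` and `render` helpers are hypothetical names.

```python
import re

# Matches [[Property::Value]] and [[Property::Value|display text]].
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(wikitext):
    """Return {property: value} for every [[Property::Value]] annotation."""
    return {
        match.group(1).strip(): match.group(2).strip()
        for match in ANNOTATION.finditer(wikitext)
    }


def render(wikitext):
    """Return the visible text: the display text if given, otherwise the value."""
    def visible(match):
        return match.group(3) if match.group(3) is not None else match.group(2)
    return ANNOTATION.sub(visible, wikitext)


text = (
    "Beijing University is located in [[Has location::Beijing]] with "
    "[[Has population::30000|about 30 thousand]] students"
)
facts = extract_annotations(text)
shown = render(text)
```

The same sentence therefore serves two audiences: readers see ordinary prose, while the wiki stores `Has location` and `Has population` as queryable data.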
Special Properties
“Has Type” is a pre-defined “special” property for metadata
– Example: [[Has type::String]]
“Allowed Values” is another special property
– [[Allows value::Low]]
– [[Allows value::Medium]]
– [[Allows value::High]]
In Halo Extensions there is domain and range support
– RDFS expressivity
– Semantic Gardening extension also supports “Cardinality”
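The effect of these two special properties can be pictured as a small validation step: a declared type and an optional list of allowed values constrain what a property may hold. The sketch below is a toy model, not SMW code. The `PropertyDef` class and the “Has priority” property are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyDef:
    """Toy stand-in for a Property: page (hypothetical, not an SMW class)."""
    name: str
    has_type: str                                      # e.g. "String", "Number", "Page"
    allows_value: list = field(default_factory=list)   # empty means unrestricted

    def accepts(self, value):
        """Check a raw string value against the declared type and allowed values."""
        if self.has_type == "Number":
            try:
                float(value)
            except ValueError:
                return False
        if self.allows_value and value not in self.allows_value:
            return False
        return True


# "Has priority" is a made-up property using the Low/Medium/High values above.
priority = PropertyDef("Has priority", "String", ["Low", "Medium", "High"])
population = PropertyDef("Has population", "Number")
```

Because these definitions live on wiki pages, users can tighten or relax them later without a database migration.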
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
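Class inheritance through categories can be sketched as a walk up a subcategory graph: a page belongs to a class if any of its categories leads to that class. The parent links below are assumptions made for illustration and are not taken from the wiki itself.

```python
# Child category -> parent categories. All links are assumed for illustration.
SUBCATEGORY_OF = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
    "Skyscrapers by country": ["Skyscrapers"],
}


def ancestors(category, graph=SUBCATEGORY_OF):
    """Return every category reachable by following parent links."""
    seen = set()
    stack = [category]
    while stack:
        for parent in graph.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_member(page_categories, target):
    """True if the page is in `target` directly or through an inherited category."""
    return any(c == target or target in ancestors(c) for c in page_categories)


# Two of the Jin Mao Tower's categories from the slide.
jin_mao = ["Skyscrapers in Shanghai", "Hotels in Shanghai"]
```

Under the assumed links, `is_member(jin_mao, "Skyscrapers")` holds even though the page never names that category directly.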
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
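What such a query does can be sketched as a filter over pages and their property values. The example below is illustrative only and does not reproduce SMW's query engine. Strict comparisons are assumed, the Jin Mao Tower values are read off the earlier slide (88 stories, “1998 architecture”), and “Example Tower” is a made-up page added for contrast.

```python
import operator

# In-memory stand-in for annotated wiki pages.
PAGES = [
    {"title": "Jin Mao Tower", "Category": {"Skyscrapers"}, "Located in": "China",
     "Floor count": 88, "Year built": 1998},
    {"title": "Example Tower", "Category": {"Skyscrapers"}, "Located in": "China",
     "Floor count": 40, "Year built": 2005},  # hypothetical page
]

# Each condition is (property, comparison, value), mirroring [[Property::>value]].
CONDITIONS = [
    ("Located in", operator.eq, "China"),
    ("Floor count", operator.gt, 50),
    ("Year built", operator.lt, 2000),
]


def ask(pages, category, conditions):
    """Return titles of pages in `category` that satisfy every condition."""
    results = []
    for page in pages:
        if category not in page["Category"]:
            continue
        if all(prop in page and compare(page[prop], value)
               for prop, compare, value in conditions):
            results.append(page["title"])
    return results


matches = ask(PAGES, "Skyscrapers", CONDITIONS)
```

Each bracketed condition in the query narrows the result set, much like a WHERE clause, but the "columns" are properties that wiki users defined themselves.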
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
– Usability extension to Semantic MediaWiki
– Increases user consensus
– Increases use of semantic data
Semantic MediaWiki
– Core semantic wiki engine
– Authoring of explicit knowledge in content
– Basic reasoning capabilities
SMW+
– Shrink-wrapped suite of open source software products
– Comes with a ready-to-use ontology
– Easy to procure and install
MediaWiki
– Powerful wiki engine
– Basic CMS feature set
SMW Extensions – Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, form & templates
Choose extensions
Building a Question-Answering Textbook
Inquire Application
Knowledge Authoring
Question Answering
Educational Impact
Factual
Relationship
Biology Core Themes
All Questions
Domain experts
(biologists) enter
knowledge from
Campbellrsquos Biology
textbook into a
structured logical
knowledge base
The knowledge
base is connected
to a electronic
version of the
textbook running
on an iPad that
enables students
to ask questions
about the material
as they read
Measure retention
and understanding
for students using
Inquire against
students using
either a paper or
electronic version
of the textbook
Inquire answers a
wide range of
questions by
performing logical
inference on the
knowledge entered
into the knowledge
base
Inquire Demo
Digital Textbook iPad Application ndash eBook reader (with standard features)
Knowledge-based features ndash Glossary pages (from KB)
ndash Suggested questions and answers
ndash Free-form QA dialog
AURA Knowledge Server ndash Concept taxonomy from Campbell Biology
ndash AURA Knowledge Base (21 chapters)
ndash Suggested question generation
ndash Reasoning and question answering system
ndash AAAI Best Video of 2012
httpwwwyoutubecomwatchv=fTiW31MBtfA
Inquire vs Siri vs Watson vs Search Engines
Project Halorsquos Inquire ndash Subject Matter biology
ndash Questions Question suggestion and simple English queries
ndash Answers Formatted data and textbook content exact answers
Applersquos Siri ndash Subject Matter Actions on iPhone plus Wolfram Alpha searches
ndash Questions Voice commands short dialogs
ndash Answers Performs tasks Alpha-formatted pages
IBMrsquos Watson ndash Subject Matter General knowledge trivia attempting medical
ndash Questions Jepoardy-style ldquoreversedrdquo questions
ndash Answers Single word or short phrase answers
Google Yahoo Bing Baidu ndash Subject Matter Open domain web knowledge
ndash Questions keywords and search queries
ndash Answers web page listings
13
Three Main Components of Inquire
Task Encoding knowledge sentence-by-sentence
ndash Define a computer language for capturing knowledge
ndash Represent background knowledge and explicit knowledge
Systems AURA Semantic Media Wiki (SMW)
Metrics ndash Coverage of book at
various depths
ndash Time per sentence
RampD thrusts ndash SMW Lower authoring
costs by crowdsourcing
ndash SILK Capturing more knowledge
Task Map a student question to an answerable query
ndash English language interpretation
ndash Suggest good questions to student
Systems AURA Question Processor
Metrics ndash Usefulness of suggested
questions
ndash Accuracy of free-form question processing
RampD thrusts ndash Rephrasing
ndash Textual Entailment
ndash Learning suggested question mappings
ndash Question generation from the knowledge base
Task Generate a useful answer to a student question
ndash Derive an answer from the knowledge base
ndash English generation and media selection
ndash Answer presentation and relevance pruning
Systems AURA Cyc Text Analysis
Metrics ndash Answer accuracy
ndash Answer usefulness to student users
ndash Time to answer
RampD thrusts ndash Hybrid system Answer
more questions
Knowledge Entry Question Processing Question Answering
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US both high school and college
Total Core Sentences 28878
56 Chapters 262 Sections ndash Avg 47 sectionschapter 515
sentenceschapter 109 sentencessection
ndash Relevant sentences in book (extrapolation from Chaps 6910) 49 14150 sentences
750 Core library concepts
~6000 total concepts in taxonomy ndash ~3000 concepts in Chaps 2-22
~65000 knowledge base assertions
~50000 suggested questions
4 hourssentence authoring time ndash Design planning encoding testing
refining new relation development
60-70 new encoding relationsyear
Inquirersquos Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly ndash ~4 person-hours per
sentence
ndash Intensive testing and knowledge refinement
Biology Teams and AI experts
represent each sentence
Senior
SME
Senior KR
Expert
Concept
Map
Planning
Concept
Map Entry
Concept
Map
Testing
Content
Review
KR
Review
Assess amp
Correct
Knowledge Entry Group
Delhi India
Knowledge Entry Process
3) Encoding Planning -- 35 time
Group Common UTs ID KRKE Issues
ID Already Encoded Write How to Encode
Pre-Planning QA Check
Status Labeling Encoding Complete KR Issue (Closed)
2) Reaching Consensus -- 14 time
Universal Truth Authoring Concept Chosen QA Check
1) Determining Relevance -- 2 time
Highlighting Diagram Analysis QA Check
Status Labeling Relevant Irrelevant (Closed)
6) Question-Based Testing -- 14 time
Use Minimal Test Suite Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of ldquoPassing Comparison and Relationship Questions
5) Key Term Review -- 25 time
KR Evaluated by Modeling Expert and Biologist Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
4) Encoding -- 10 time
Encode File JIRA Issues QA Check
Status Labeling Encoding Complete KE Issue
Peach = EVS Light Blue = SRI
Knowledge Entry Process
3) Encoding Planning -- 35 time
Group Common UTs ID KRKE Issues
ID Already Encoded Write How to Encode
Pre-Planning QA Check
Status Labeling Encoding Complete KR Issue (Closed)
2) Reaching Consensus -- 14 time
UT Authoring cMap Chosen QA Check
1) Determining Relevance -- 2 time
Highlighting Diagram Analysis QA Check
Status Labeling Relevant Irrelevant (Closed)
6) Question-Based Testing -- 14 time
Use Minimal Test Suite Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of ldquoPassing Comparison and Relationship Questions
5) Key Term Review -- 25 time
KR Evaluated by Modeling Expert and Biologist Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
4) Encoding -- 10 time
Encode File JIRA Issues QA Check
Status Labeling Encoding Complete KE Issue
Peach = EVS Light Blue = SRI
Planning (51)
Testing (39)
Encoding (10)
This is the well-known Knowledge Acquisition Problem ndash AI authoring is too expensive too slow not scalable
Three Possible Solutions ndash Automatic Machine ParsingEditing
bull Not good enough for biology textbook sentences
bull Error rates are too high
bull Need humans in the loop
ndash Mechanical Turk Authoring bull Biology expertise is difficult to get
bull Mechanical Turk uses individuals but the Knowledge Entry task appears to require coordination judgment discussion and working together
ndash Social Authoring and Crowdsourcing bull Wikipedia showed this could work for text
bull In 2008 Vulcan started a pilot project to explore social knowledge authoring
How Can We Improve Knowledge Entry
19
Crowdsourcing for Better Knowledge Acquisition
20
edit wow I can change the web
letrsquos share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
27
Actual number of articles on enwikipediaorg (thick
blue line) compared with a Gompertz model that leads
eventually to a maximum of about 44 million articles
(thin green line)
A Key Feature of Wiki
28
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration ndash ~17 editspage on average in
Wikipedia (with high variance)
ndash Wikipediarsquos Neutral Point of View
Convention ndash Users follow customs and
conventions to engage with articles effectively
29
Software Support Makes Wikis Successful
Trivial to edit by anyone Tracking of all changes one-
step rollback Every article has a ldquoTalkrdquo page
for discussion Notification facility allows anyone
to ldquowatchrdquo an article Sufficient security on pages
logins can be required A hierarchy of administrators
gardeners and editors Software Bots recognize certain
kinds of vandalism and auto-revert or recognize articles that need work and flag them for editors
30
However Finding Information Is hellip
Wikipedia has articles abouthellip
bull hellip all movies with cast director
budget running time grosshellip
hellip all German cars with engine
size accelerating datahellip
Can you find
Sci-Fi movies made between
2000-2008 that cost less than
$10000 and made $30000
Skyscrapers with 50+ floors
and built after 2000 in
Shanghai (or Chinese cities
with 1000000+ people)
31
Or German(Porsche) cars that
accelerate from 0-100kmh in 5
seconds
Can Search Solve the Problem
32
Answer is Hidden Deeply in
33
To Find More Info
All Porsche vehicles made in
Germany that accelerate from 1-
100 kmh less than 4 seconds
Sci-Fi movies made between
2000-2008 that cost less than
$10M and gross more than $30M
A map showing where all
Mercedes-Benz vehicles are
manufactured
All skyscrapers in China (Japan
Thailandhellip) of 50 (406070) floors
or more and built in year 2000
(20012002) and after sorted by
built year floorshellip grouped by
cities regionshellip
35
What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
36
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
37
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
- Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
- Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
- "capital_of", "contains", "born_in", …
Querying Interface Support
- E.g. "[[Category:Member]] [[Age::<30]]" (in SMW)
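The combination of typed content, typed links, and a query interface can be pictured with a small in-memory model. The Python sketch below is an illustration only and does not reflect how SMW stores data; the `Page` class, the property names, and the sample pages are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """A wiki page with categories and typed property values (hypothetical model)."""
    title: str
    categories: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


def members_younger_than(pages, max_age):
    """Rough analogue of the query [[Category:Member]] [[Age::<30]]."""
    return [
        p.title
        for p in pages
        if "Member" in p.categories
        and isinstance(p.properties.get("Age"), (int, float))
        and p.properties["Age"] < max_age
    ]


# Made-up sample pages: "born_in" acts as a typed link, "Age" as a typed Number.
pages = [
    Page("Alice", {"Member"}, {"Age": 28, "born_in": "Seattle"}),
    Page("Bob", {"Member"}, {"Age": 41, "born_in": "Berlin"}),
    Page("Seattle", {"City"}, {}),
]

print(members_younger_than(pages, 30))
```

Because "Age" is stored as a number rather than free text, the comparison is a real numeric test, which is what keyword search over page text cannot offer.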
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
- Semantic Wikis facilitate Consensus over Data
- Combine low-expressivity data authorship with the best features of traditional wikis
- User-governed, user-maintained, user-defined
- Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
- Databases require DBAs and schema design
- Semantic Wikis develop and maintain the schema in the wiki
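One way to read "schema-last" is that the schema is whatever the accumulated annotations imply, rather than something designed up front. The following Python sketch assumes annotations arrive as (page, property, value) tuples; that representation, the `infer_schema` helper, and the sample rows are hypothetical.

```python
from collections import defaultdict


def infer_schema(annotations):
    """Derive a property -> set of observed value types mapping from existing data."""
    schema = defaultdict(set)
    for _page, prop, value in annotations:
        schema[prop].add(type(value).__name__)
    return dict(schema)


# Hypothetical annotations added by wiki users over time: (page, property, value).
annotations = [
    ("Blade Runner", "Has director", "Ridley Scott"),
    ("Blade Runner", "Has release year", 1982),
    ("Porsche 996", "Has manufacturer", "Porsche"),
]

schema = infer_schema(annotations)
print(schema)
```

A new property needs no migration here: the first page that uses it extends the schema.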
List of Semantic Wikis
- AceWiki
- ArtificialMemory
- Wagn - Ruby on Rails-based
- KiWi - Knowledge in a Wiki
- Knoodl - Semantic collaboration tool and application platform
- Metaweb - the software that powers Freebase
- OntoWiki
- OpenRecord
- PhpWiki
- Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
- Swirrl - a spreadsheet-based semantic wiki application
- TaOPis - has a semantic wiki subsystem based on Frame logic
- TikiWiki CMS/Groupware - integrates Semantic links as a core feature
- zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS: Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
- Open source (GPL), well documented
- Vulcan kicks off SMW+ in 2007
Active development
- Commercial support available
- World-wide community
International Conferences
- Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
- Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
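SMW annotations take the form `[[Property::value]]`, optionally followed by `|display label`. To make the structure concrete, here is a small Python sketch that pulls such annotations out of a line of wikitext with a regular expression. It is a simplified illustration, not SMW's parser; the `extract_annotations` helper is hypothetical, and edge cases such as multi-valued or nested annotations are ignored.

```python
import re

# Matches [[Property::value]] and [[Property::value|display label]].
ANNOTATION = re.compile(r"\[\[([^\[\]|:]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(wikitext):
    """Return (property, value, label) triples found in the wikitext."""
    return [
        (m.group(1).strip(), m.group(2).strip(), m.group(3))
        for m in ANNOTATION.finditer(wikitext)
    ]


text = (
    "Beijing University is located in [[Has location::Beijing]] with "
    "[[Has population::30000|about 30 thousand]] students"
)
for triple in extract_annotations(text):
    print(triple)
```

The label is `None` when the annotation has no display text, so the page shows the raw value.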
Special Properties
"Has type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allowed Values" is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
In Halo Extensions there is domain and range support
- RDFS expressivity
- Semantic Gardening extension also supports "Cardinality"
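The effect of "Allows value" declarations can be sketched as a simple membership check. This Python fragment is an illustration under assumptions: the property name "Has priority" and the `validate` helper are hypothetical, and SMW's real handling of disallowed values is not modelled.

```python
def validate(prop, value, allowed_values):
    """Return True if value is permitted for prop, or prop has no restriction."""
    allowed = allowed_values.get(prop)
    return allowed is None or value in allowed


# Hypothetical property carrying the three "Allows value" declarations shown above.
allowed_values = {"Has priority": {"Low", "Medium", "High"}}

print(validate("Has priority", "Medium", allowed_values))
print(validate("Has priority", "Urgent", allowed_values))
print(validate("Has location", "Beijing", allowed_values))
```

Properties without any declared allowed values accept anything, which matches the open, schema-last style of the wiki.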
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
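Class inheritance through categories amounts to following subcategory links upward. The Python sketch below shows that idea; the specific parent links are assumed for illustration and are not taken from any real wiki, and the `ancestors` helper is hypothetical.

```python
def ancestors(category, parents):
    """Return all direct and indirect supercategories of a category."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Assumed subcategory links, for illustration only.
parents = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}

print(sorted(ancestors("Skyscrapers in Shanghai", parents)))
```

A page tagged only with "Skyscrapers in Shanghai" is thereby also treated as a member of every category returned.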
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
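To show what such a query asks of the data, the Python sketch below evaluates the same four conditions against a few made-up pages. It is an illustration, not the SMW query engine: the `matches` helper and the tower records are hypothetical, and `>` and `<` are treated as strict comparisons, which may differ from how a given SMW installation interprets them.

```python
import re

# [[Category:Name]] or [[Property::value]], where value may start with < or >.
CATEGORY = re.compile(r"\[\[Category:([^\[\]]+)\]\]")
PROPERTY = re.compile(r"\[\[([^\[\]:]+)::([<>]?)([^\[\]]+)\]\]")


def matches(page, query):
    """Check one page (a dict) against every condition in the query string."""
    for name in CATEGORY.findall(query):
        if name not in page["categories"]:
            return False
    for prop, comparator, raw in PROPERTY.findall(query):
        actual = page["properties"].get(prop)
        if actual is None:
            return False
        if comparator:
            bound = float(raw)
            ok = actual > bound if comparator == ">" else actual < bound
        else:
            ok = str(actual) == raw
        if not ok:
            return False
    return True


# Made-up pages for illustration; values are not real building data.
pages = [
    {"title": "Tower A", "categories": {"Skyscrapers"},
     "properties": {"Located in": "China", "Floor count": 88, "Year built": 1998}},
    {"title": "Tower B", "categories": {"Skyscrapers"},
     "properties": {"Located in": "China", "Floor count": 101, "Year built": 2008}},
    {"title": "Tower C", "categories": {"Skyscrapers"},
     "properties": {"Located in": "Japan", "Floor count": 60, "Year built": 1993}},
]

query = (
    "[[Category:Skyscrapers]] [[Located in::China]] "
    "[[Floor count::>50]] [[Year built::<2000]]"
)
print([p["title"] for p in pages if matches(p, query)])
```

All conditions are combined with a logical AND, so a page must satisfy every bracketed condition to appear in the result.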
Semantic MediaWiki Stack
- MediaWiki (XAMPP)
- Extension: Semantic MediaWiki
- More Extensions and Applications
MediaWiki SMW+
Halo Extension
- Usability extension to Semantic MediaWiki
- Increases user consensus
- Increases use of semantic data
Semantic MediaWiki
- Core Semantic Wiki engine
- Authoring of explicit knowledge in content
- Basic reasoning capabilities
SMW+
- Shrink-wrapped suite of open source software products
- Comes with ready-to-use ontology
- Easy to procure and install
MediaWiki
- Powerful Wiki engine
- Basic CMS feature set
SMW Extensions: Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
- Domain: German Cars (in Wikipedia)
- Wikipedia merged with the power of a database
- Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
- Fill in data (content)
- Develop schema (ontology)
- Design UI: skin, form & templates
- Choose extensions
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
- Keyword search over tag data
- Inconsistent semantics
- Easy to engineer
Increasing technical complexity →
← Increasing User Participation
Algorithm-based object characterization
- Database-style search
- Consistent semantics
- Extremely difficult to engineer
Semantic Entertainment Wiki: Social database-style characterization
- Database search + wiki text search
- Semantic consistency via wiki mechanisms
- Easy to engineer
Semantic Seahawks Football Wiki
Semantic Entertainment: Query Result Highlight Reel
- Commercial Look/Feel
- Play-by-play video search
- Highlight reel generation
- Search on crowd-defined patterns ("touchdowns with big hits")
- Tree-based navigation widget
- Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
- (+) Very quick to obtain
- (-) Hard to customize
- (-) Expensive
- Examples: Microsoft Project, Version One, Microsoft SharePoint
Custom Development
- (-) Slow to develop
- (+) Extremely flexible
- (-) High cost to develop and maintain
- Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
- (+) Still quick to program
- (+) Easy to customize
- (+) Low-moderate cost
- Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate."
    - UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
    - UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    - UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
- Build a "Knowledge Integration Center"
Current Page Designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from 2-tailed t-tests
- HW scores: Full vs Ablated 0.12; Full vs Textbook 0.02; Ablated vs Textbook 0.52
- Quiz scores: Full vs Ablated 0.002; Full vs Textbook 0.05; Ablated vs Textbook 0.18
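[Figure: HW score and quiz score by group, with labels A, B, C, D; values shown: 81, 88, 74, 75, 71, 81; group sizes n=24, n=25, n=23]
[Figure: Quiz Score Distribution, % of students (axis 0-50) by grade (A, B, C, D, F) for the Inquire, Ablated, and Textbook groups]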
Inquire Demo
Digital Textbook iPad Application
- eBook reader (with standard features)
Knowledge-based features
- Glossary pages (from KB)
- Suggested questions and answers
- Free-form QA dialog
AURA Knowledge Server
- Concept taxonomy from Campbell Biology
- AURA Knowledge Base (21 chapters)
- Suggested question generation
- Reasoning and question answering system
- AAAI Best Video of 2012
http://www.youtube.com/watch?v=fTiW31MBtfA
Inquire vs Siri vs Watson vs Search Engines
Project Halo's Inquire
- Subject Matter: biology
- Questions: Question suggestion and simple English queries
- Answers: Formatted data and textbook content, exact answers
Apple's Siri
- Subject Matter: Actions on iPhone plus Wolfram Alpha searches
- Questions: Voice commands, short dialogs
- Answers: Performs tasks, Alpha-formatted pages
IBM's Watson
- Subject Matter: General knowledge, trivia, attempting medical
- Questions: Jeopardy-style "reversed" questions
- Answers: Single word or short phrase answers
Google, Yahoo, Bing, Baidu
- Subject Matter: Open domain web knowledge
- Questions: keywords and search queries
- Answers: web page listings
Three Main Components of Inquire
Knowledge Entry
- Task: Encoding knowledge sentence-by-sentence
  • Define a computer language for capturing knowledge
  • Represent background knowledge and explicit knowledge
- Systems: AURA, Semantic MediaWiki (SMW)
- Metrics
  • Coverage of book at various depths
  • Time per sentence
- R&D thrusts
  • SMW: Lower authoring costs by crowdsourcing
  • SILK: Capturing more knowledge
Question Processing
- Task: Map a student question to an answerable query
  • English language interpretation
  • Suggest good questions to student
- Systems: AURA, Question Processor
- Metrics
  • Usefulness of suggested questions
  • Accuracy of free-form question processing
- R&D thrusts
  • Rephrasing
  • Textual Entailment
  • Learning suggested question mappings
  • Question generation from the knowledge base
Question Answering
- Task: Generate a useful answer to a student question
  • Derive an answer from the knowledge base
  • English generation and media selection
  • Answer presentation and relevance pruning
- Systems: AURA, Cyc, Text Analysis
- Metrics
  • Answer accuracy
  • Answer usefulness to student users
  • Time to answer
- R&D thrusts
  • Hybrid system: Answer more questions
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US, both high school and college
Total Core Sentences: 28,878
56 Chapters, 262 Sections
- Avg 4.7 sections/chapter, 515 sentences/chapter, 109 sentences/section
- Relevant sentences in book (extrapolation from Chaps 6, 9, 10): 49%, 14,150 sentences
750 Core library concepts
~6,000 total concepts in taxonomy
- ~3,000 concepts in Chaps 2-22
~65,000 knowledge base assertions
~50,000 suggested questions
4 hours/sentence authoring time
- Design, planning, encoding, testing, refining, new relation development
60-70 new encoding relations/year
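The averages and the relevant-sentence count follow from the totals by simple division, and the same numbers give a rough sense of the authoring effort. The Python sketch below redoes that arithmetic. The total authoring-hours figure is a derived illustration, not a number from the slides, and it assumes every relevant sentence costs the full 4 hours.

```python
total_sentences = 28_878
chapters = 56
sections = 262
relevant_fraction = 0.49   # 49% of sentences judged relevant, per the slide
hours_per_sentence = 4     # authoring time per sentence, per the slide

sections_per_chapter = sections / chapters
sentences_per_chapter = total_sentences / chapters
sentences_per_section = total_sentences / sections
relevant_sentences = total_sentences * relevant_fraction

# Assumption for illustration: all relevant sentences take the full 4 hours.
authoring_hours = relevant_sentences * hours_per_sentence

print(f"sections/chapter:   {sections_per_chapter:.1f}")
print(f"sentences/chapter:  {sentences_per_chapter:.0f}")
print(f"sentences/section:  {sentences_per_section:.0f}")
print(f"relevant sentences: {relevant_sentences:.0f}")
print(f"authoring hours:    {authoring_hours:.0f}")
```

The computed averages land within a sentence or two of the slide's rounded figures, and the implied effort of tens of thousands of person-hours is what motivates the search for cheaper authoring.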
Inquire's Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly
- ~4 person-hours per sentence
- Intensive testing and knowledge refinement
Biology Teams and AI experts represent each sentence
[Diagram: Senior SME and Senior KR Expert; workflow steps Concept Map Planning, Concept Map Entry, Concept Map Testing, Content Review, KR Review, Assess & Correct]
Knowledge Entry Group: Delhi, India
Knowledge Entry Process
1) Determining Relevance -- 2% time
- Highlighting, Diagram Analysis, QA Check
- Status Labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus -- 14% time
- Universal Truth Authoring, Concept Chosen, QA Check
3) Encoding Planning -- 35% time
- Group Common UTs, ID KR/KE Issues
- ID Already Encoded, Write How to Encode
- Pre-Planning QA Check
- Status Labeling: Encoding Complete, KR Issue (Closed)
4) Encoding -- 10% time
- Encode, File JIRA Issues, QA Check
- Status Labeling: Encoding Complete, KE Issue
5) Key Term Review -- 25% time
- KR Evaluated by Modeling Expert and Biologist, Encoder Makes Changes
- KR Evaluated by Modeling Expert and Biologist
- QA Check
6) Question-Based Testing -- 14% time
- Use Minimal Test Suite, Reasoning JIRA Issues Filed
- Encoder Fills KB Gaps
- QA Check with Screenshots of "Passing" Comparison and Relationship Questions
Peach = EVS, Light Blue = SRI
Planning (51%)
Testing (39%)
Encoding (10%)
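The three summary percentages can be reproduced from the six per-stage time shares. In the Python sketch below, the assignment of stages to the Planning, Encoding, and Testing groups is an assumption chosen because it matches the slide's totals; the hours per group reuse the roughly 4 person-hours per sentence quoted for the existing process.

```python
# Share of authoring time per stage, in percent, as listed on the slide.
stage_share = {
    "Determining Relevance": 2,
    "Reaching Consensus": 14,
    "Encoding Planning": 35,
    "Encoding": 10,
    "Key Term Review": 25,
    "Question-Based Testing": 14,
}

# Assumed grouping of stages; it happens to reproduce the 51/39/10 split.
groups = {
    "Planning": ["Determining Relevance", "Reaching Consensus", "Encoding Planning"],
    "Encoding": ["Encoding"],
    "Testing": ["Key Term Review", "Question-Based Testing"],
}

group_share = {
    name: sum(stage_share[stage] for stage in stages)
    for name, stages in groups.items()
}

HOURS_PER_SENTENCE = 4  # approximate figure from the slides
for name, share in group_share.items():
    hours = HOURS_PER_SENTENCE * share / 100
    print(f"{name}: {share}% of time, about {hours:.2f} hours per sentence")
```

Read this way, actual encoding is the smallest part of the cost; planning and testing together account for 90% of the time.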
This is the well-known Knowledge Acquisition Problem
- AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
- Automatic Machine Parsing/Editing
  • Not good enough for biology textbook sentences
  • Error rates are too high
  • Need humans in the loop
- Mechanical Turk Authoring
  • Biology expertise is difficult to get
  • Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
- Social Authoring and Crowdsourcing
  • Wikipedia showed this could work for text
  • In 2008 Vulcan started a pilot project to explore social knowledge authoring
How Can We Improve Knowledge Entry?
Crowdsourcing for Better Knowledge Acquisition
edit: wow, I can change the web
let's share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
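A Gompertz curve grows quickly at first and then levels off toward a fixed ceiling, which is why it predicts a maximum article count. The Python sketch below shows the general shape. Only the ceiling of about 4.4 million comes from the caption; the other two parameters are hypothetical values chosen for illustration, not the ones fitted to Wikipedia's data.

```python
import math


def gompertz(t, a, b, c):
    """Gompertz curve a * exp(-b * exp(-c * t)); a is the upper asymptote."""
    return a * math.exp(-b * math.exp(-c * t))


A = 4.4e6   # asymptote from the caption: about 4.4 million articles
B = 15.0    # hypothetical displacement parameter
C = 0.4     # hypothetical growth rate per year

for year in (0, 5, 10, 20, 40):
    print(year, round(gompertz(year, A, B, C)))
```

However the curve is parameterised, the values rise monotonically and never exceed the asymptote `A`.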
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
- ~17 edits/page on average in Wikipedia (with high variance)
- Wikipedia's Neutral Point of View
Convention
- Users follow customs and conventions to engage with articles effectively
29
Software Support Makes Wikis Successful
Trivial to edit by anyone Tracking of all changes one-
step rollback Every article has a ldquoTalkrdquo page
for discussion Notification facility allows anyone
to ldquowatchrdquo an article Sufficient security on pages
logins can be required A hierarchy of administrators
gardeners and editors Software Bots recognize certain
kinds of vandalism and auto-revert or recognize articles that need work and flag them for editors
30
However Finding Information Is hellip
Wikipedia has articles abouthellip
bull hellip all movies with cast director
budget running time grosshellip
hellip all German cars with engine
size accelerating datahellip
Can you find
Sci-Fi movies made between
2000-2008 that cost less than
$10000 and made $30000
Skyscrapers with 50+ floors
and built after 2000 in
Shanghai (or Chinese cities
with 1000000+ people)
31
Or German(Porsche) cars that
accelerate from 0-100kmh in 5
seconds
Can Search Solve the Problem
32
Answer is Hidden Deeply in
33
To Find More Info
All Porsche vehicles made in
Germany that accelerate from 1-
100 kmh less than 4 seconds
Sci-Fi movies made between
2000-2008 that cost less than
$10M and gross more than $30M
A map showing where all
Mercedes-Benz vehicles are
manufactured
All skyscrapers in China (Japan
Thailandhellip) of 50 (406070) floors
or more and built in year 2000
(20012002) and after sorted by
built year floorshellip grouped by
cities regionshellip
35
What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
36
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
37
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
ndash CategoryTags Namespaces Title Versioning
Typed Content (built-ins + user created eg categories)
ndash PageCard Date Number URLEmail String hellip
Typed Links (eg properties)
ndash ldquocapital_ofrdquo ldquocontainsrdquo ldquoborn_inrdquohellip
Querying Interface Support
ndash Eg ldquo[[CategoryMember]] [[Agelt30]]rdquo (in SMW)
41
Characteristics of Semantic Wikis
42
What is the Promise of Semantic Wikis
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed user-maintained user-defined
Easy to use as an extension of text authoring
43
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are ldquoSchema-Lastrdquo Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi ndash Knowledge in a Wiki
Knoodl ndash Semantic
Collaboration tool and
application platform
Metaweb - the software that
powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an
extension to MediaWiki that
turns it into a semantic wiki
Swirrl - a spreadsheet-based
semantic wiki application
TaOPis - has a semantic wiki
subsystem based on Frame
logic
TikiWiki CMSGroupware
integrates Semantic links as a
core feature
zAgile Wikidsmart - semantically
enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
ndash Open source (GPL) well documented
ndash Vulcan kicks off SMW+ in 2007
Active development
ndash Commercial support available
ndash World-wide community
International Conferences
ndash Last SMWCon 425-27 2012 in Carlsbad CA
ndash Next SMWCon 1024-26 2012 in Koln Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has locationBeijing]] with
[[Has population30000|about 30 thousands]]
students
In page PropertyHas locationrdquo [[Has typePage]]
In page PropertyHas populationrdquo [[Has typenumber]]
Special Properties
ldquoHas Typerdquo is a pre-defined ldquospecialrdquo property for meta-
data
ndash Example [[Has typeString]]
ldquoAllowed Valuesrdquo is another special property
ndash [[Allows valueLow]]
ndash [[Allows valueMedium]]
ndash [[Allows valueHigh]]
In Halo Extensions there are domain and range support
ndash RDFs expressivity
ndash Semantic Gardening extension also supports ldquoCardinalityrdquo
49
Define Classes
50
Beijing is a city in [[Has
countryChina]] with population
[[Has population2200000]]
[[CategoryCities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in hellip
[[Categories 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore Owings and Merrill buildings]]
CategorySkyscrapers in China Category Skyscrapers by country
Possible Database-style Query over Data
51
ask
[[CategorySkyscrapers]]
[[Located inChina]]
[[Floor countgt50]]
[[Year builtlt2000]]
hellip
Ex Skyscrapers in China higher than
50 stories built before 2000
ASKSPARQL query target
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension Semantic MediaWiki
More Extensions and Applications
52
MediaWiki SMW+
Halo Extension
Usability extension to Semantic MediaWiki
Increases user consensus
Increases use of semantic data
Semantic MediaWiki
Core Semantic Wiki engine
Authoring of explicit knowledge in content
Basic reasoning capabilities
SMW+
Shrink wrap suite of open source software products
Comes with ready to use ontology
Easy to procure and install
MediaWiki
Powerful Wiki engine
Basic CMS feature set
SMW Extensions ndashTowards Great Apps
56
bull Halo Extensions Wiki Object Model Semantic Forms Notification hellip
Data IO
bull Semantic Toolbar Semantic Drilldown Faceted Searchhellip
Query and Browsing
bull Semantic Result Printers Tree View Exhibit Flash chartshellip
Visualization
bull HaloACL Deployment Triplestore Connector Simple Ruleshellip
bull Semantic WikiTags and Subversion Integration extensions
bull Linked Data Extension with R2R and SILK from FUBerlin
Other useful extensions
Semantic Wikipedia
Ultrapedia An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
57
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
62
FROM WIKI TO APPLICATION We Started With a Wiki Ended with an App
63
Build Social Semantic Web Apps
64
Fill in data (content)
Develop schema (ontology)
Design UI skin form amp templates
Choose extensions
65
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity rarr
larr Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic
Entertainment
Wiki
Semantic Seahawks Football Wiki
66
Semantic Entertainment Query Result Highlight Reel
Commercial LookFeel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (ldquotouchdowns with big hitsrdquo)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
72
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application Mini-Olympics
77
78
79
Requirements for Wiki ldquoDevelopersrdquo
One need not
ndash Write code like a hardcore programmer
ndash Design setup RDBMS or make frequent
schema changes
ndash Possess knowledge of a senior system
admin
Instead one need
ndash Configure the wiki with desired extensions
ndash Design and evolve the data model
(schema)
ndash Design Content
bull Customize templates forms styles skin etc
80
We Can Build
81
Effectiveness of SMW as a Platform Choice
Packaged Software
Very quick to
obtain
N Hard to customize
N Expensive
Microsoft Project
Version One
Microsoft
SharePoint
Custom Development
N Slow to develop
Extremely flexible
N High cost to develop
and maintain
NET Framework
J2EE hellip
Ruby on rails
82
SMW+ Suite
Still quick to
program
Easy to customize
Low-moderate cost
Mini-Olympics
Scrum Wiki
RPI map
Neurowiki An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information ndash From data focus (movies) to relationship focus (Neurowiki) ndash A Neurowiki user cannot contribute data without also contributing
relationships
Experimental goals ndash Create an SMW instance for acquiring both instance and relationship
information between data elements ndash Integrate SMW with the Berlin LDIF system
Product goals ndash A community-driven collaborative framework for neuroscience data
integration bull Support for community-built neuroscience vocabularies bull A library of visualizations and simple analytics
ndash Neuroscientist users can createmodify pages and metadata bull Scientists can own the evolving knowledge structure bull Low barrier for scientist to publish data
ndash Support scientific discovery
NEUROWIKI DEMO Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring ndash A real high-quality application of SMW+
ndash Focus on creating universally-quantified sentences (UTs)
bull Example ldquoGlycolysis which occurs in the cytosol begins the degradation process by breaking glucose into two molecules of a compound called pyruvaterdquo
ndash UT Glycolysis occurs in Cytosol (concept Glycolysis)
ndash UT During cellular respiration glycolysis is the first step (concept Cellular Respiration)
ndash UT During glycolysis breakdown of 1 glucose results in 2 pyruvate molecules (concept Glycolysis)
ndash Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process ndash Potential savings of over 50 of AURA authoring time
Use the social semantic web for real AI authoring 85
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring ndash Top-down Encoding Flow
bull Suggestions and commentary from KE (Knowledge Engineers) and SMEs previous contributions
bull Supported by ontological terms and relationships questions guidelines testing
ndash Bottom-up Encoding Flow bull UT suggestions based upon algorithms (ReVerb by University of Washington)
bull All suggestions validated via Mechanical Turk before use in AURAWiki
ndash Build a ldquoKnowledge Integration Centerrdquo
Current Page Designs ndash Main Page
ndash User Page
ndash Authoring Page
Will perform initial testing by September 15
86
Conclusion
Inquire is a new application of AI technology ndash Question-answering in textbooks
ndash Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem ndash Crowdsourcing addresses the cost and update problems
ndash Validate using Inquire authoring process
87
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
89
Thank You
wwwvulcancom
wwwprojecthalocom
wwwinquireprojectcom
wwwsmwpluscom
Quiz and Homework Scores
P-value from 2 tailed t-tests HW scores Full vs Ablated 012 Full vs textbook 002 Ablated vs Text 052
Quiz scores Full vs Ablated 0002 Full vs textbook 005 Ablated vs Text 018
81
88
74 75 71
81
HW SCORE QUIZ SCORE
A
B
C
D
n=24
n=25
n=23
0
10
20
30
40
50
A B C D F
o
f st
ud
ents
Inquire Ablated Textbook
textbook
Quiz Score Distribution
Inquire vs Siri vs Watson vs Search Engines
Project Halorsquos Inquire ndash Subject Matter biology
ndash Questions Question suggestion and simple English queries
ndash Answers Formatted data and textbook content exact answers
Applersquos Siri ndash Subject Matter Actions on iPhone plus Wolfram Alpha searches
ndash Questions Voice commands short dialogs
ndash Answers Performs tasks Alpha-formatted pages
IBMrsquos Watson ndash Subject Matter General knowledge trivia attempting medical
ndash Questions Jepoardy-style ldquoreversedrdquo questions
ndash Answers Single word or short phrase answers
Google Yahoo Bing Baidu ndash Subject Matter Open domain web knowledge
ndash Questions keywords and search queries
ndash Answers web page listings
13
Three Main Components of Inquire
Task Encoding knowledge sentence-by-sentence
ndash Define a computer language for capturing knowledge
ndash Represent background knowledge and explicit knowledge
Systems AURA Semantic Media Wiki (SMW)
Metrics ndash Coverage of book at
various depths
ndash Time per sentence
RampD thrusts ndash SMW Lower authoring
costs by crowdsourcing
ndash SILK Capturing more knowledge
Task Map a student question to an answerable query
ndash English language interpretation
ndash Suggest good questions to student
Systems AURA Question Processor
Metrics ndash Usefulness of suggested
questions
ndash Accuracy of free-form question processing
RampD thrusts ndash Rephrasing
ndash Textual Entailment
ndash Learning suggested question mappings
ndash Question generation from the knowledge base
Task Generate a useful answer to a student question
ndash Derive an answer from the knowledge base
ndash English generation and media selection
ndash Answer presentation and relevance pruning
Systems AURA Cyc Text Analysis
Metrics ndash Answer accuracy
ndash Answer usefulness to student users
ndash Time to answer
RampD thrusts ndash Hybrid system Answer
more questions
Knowledge Entry Question Processing Question Answering
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US both high school and college
Total Core Sentences 28878
56 Chapters 262 Sections ndash Avg 47 sectionschapter 515
sentenceschapter 109 sentencessection
ndash Relevant sentences in book (extrapolation from Chaps 6910) 49 14150 sentences
750 Core library concepts
~6000 total concepts in taxonomy ndash ~3000 concepts in Chaps 2-22
~65000 knowledge base assertions
~50000 suggested questions
4 hourssentence authoring time ndash Design planning encoding testing
refining new relation development
60-70 new encoding relationsyear
Inquirersquos Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly ndash ~4 person-hours per
sentence
ndash Intensive testing and knowledge refinement
Biology Teams and AI experts
represent each sentence
Senior
SME
Senior KR
Expert
Concept
Map
Planning
Concept
Map Entry
Concept
Map
Testing
Content
Review
KR
Review
Assess amp
Correct
Knowledge Entry Group
Delhi India
Knowledge Entry Process
1) Determining Relevance (2% of time)
- Highlighting, Diagram Analysis, QA Check
- Status labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus (14% of time)
- Universal Truth (UT) Authoring, Concept Chosen, QA Check
3) Encoding Planning (35% of time)
- Group Common UTs, ID KR/KE Issues
- ID Already Encoded, Write How to Encode
- Pre-Planning QA Check
- Status labeling: Encoding Complete, KR Issue (Closed)
4) Encoding (10% of time)
- Encode File, JIRA Issues, QA Check
- Status labeling: Encoding Complete, KE Issue
5) Key Term Review (25% of time)
- KR Evaluated by Modeling Expert and Biologist
- Encoder Makes Changes
- KR Evaluated by Modeling Expert and Biologist
- QA Check
6) Question-Based Testing (14% of time)
- Use Minimal Test Suite; Reasoning JIRA Issues Filed
- Encoder Fills KB Gaps
- QA Check with screenshots of passing Comparison and Relationship questions
Color key: Peach = EVS, Light Blue = SRI
Planning (51%)
Testing (39%)
Encoding (10%)
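The three phase totals appear to be roll-ups of the six process steps. A minimal sketch, assuming steps 1-3 count as Planning, step 4 as Encoding, and steps 5-6 as Testing (this grouping is an inference that happens to reproduce the slide's totals; the deck does not state it):

```python
from collections import defaultdict

# (step number, step name, share of time in %, assumed phase)
steps = [
    (1, "Determining Relevance", 2, "Planning"),
    (2, "Reaching Consensus", 14, "Planning"),
    (3, "Encoding Planning", 35, "Planning"),
    (4, "Encoding", 10, "Encoding"),
    (5, "Key Term Review", 25, "Testing"),
    (6, "Question-Based Testing", 14, "Testing"),
]

phase_totals = defaultdict(int)
for _number, _name, percent, phase in steps:
    phase_totals[phase] += percent

for phase, percent in phase_totals.items():
    print(f"{phase}: {percent}%")
```

Under this grouping, only a tenth of the effort is the encoding itself; the rest is coordination and verification, which is why the deck looks at social approaches next.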
This is the well-known Knowledge Acquisition Problem
- AI authoring is too expensive, too slow, and not scalable
How Can We Improve Knowledge Entry?
Three possible solutions
- Automatic machine parsing/editing
  • Not good enough for biology textbook sentences
  • Error rates are too high
  • Need humans in the loop
- Mechanical Turk authoring
  • Biology expertise is difficult to get
  • Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
- Social authoring and crowdsourcing
  • Wikipedia showed this could work for text
  • In 2008 Vulcan started a pilot project to explore social knowledge authoring
Crowdsourcing for Better Knowledge Acquisition
"Edit: wow, I can change the web"
"Let's share and publish knowledge to make an [[encyclopedia]]"
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
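For readers unfamiliar with the Gompertz model named in the caption, here is a minimal sketch of the curve's shape. It takes the caption's ceiling to be 4.4 million articles; the `DISPLACEMENT` and `RATE` values are hypothetical placeholders chosen only to produce a plausible S-curve, not the parameters fitted to Wikipedia's data.

```python
import math


def gompertz(t: float, ceiling: float, displacement: float, rate: float) -> float:
    """Gompertz growth curve: ceiling * exp(-displacement * exp(-rate * t))."""
    return ceiling * math.exp(-displacement * math.exp(-rate * t))


CEILING = 4.4e6       # asymptote, read from the caption as 4.4 million articles
DISPLACEMENT = 15.0   # hypothetical, not fitted
RATE = 0.4            # hypothetical growth rate per year, not fitted

# Sample the curve every five years from an arbitrary start.
curve = [gompertz(year, CEILING, DISPLACEMENT, RATE) for year in range(0, 31, 5)]
for year, articles in zip(range(0, 31, 5), curve):
    print(f"year {year:2d}: {articles:,.0f} articles")
```

The curve rises slowly, accelerates, and then flattens toward the ceiling without ever exceeding it, which is what the thin green line on the slide depicts.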
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
- ~17 edits/page on average in Wikipedia (with high variance)
- Wikipedia's Neutral Point of View
Convention
- Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
- Trivial to edit by anyone
- Tracking of all changes, one-step rollback
- Every article has a "Talk" page for discussion
- Notification facility allows anyone to "watch" an article
- Sufficient security on pages; logins can be required
- A hierarchy of administrators, gardeners, and editors
- Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about …
• … all movies, with cast, director, budget, running time, gross …
• … all German cars, with engine size, acceleration data …
Can you find
- Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000
- Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)
- Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
- All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
- Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
- A map showing where all Mercedes-Benz vehicles are manufactured
- All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more and built in year 2000 (2001, 2002) and after, sorted by built year, floors, …, grouped by cities, regions, …
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
- To allow users to make their knowledge explicit and formal
- Semantic Web compatible
Semantic Wiki: Two Perspectives
- Wikis for Metadata
- Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
- Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
- Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
- "capital_of", "contains", "born_in", …
Querying Interface Support
- E.g. "[[Category:Member]] [[Age::<30]]" (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic wikis facilitate consensus over data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic wikis are "schema-last"
- Databases require DBAs and schema design
- Semantic wikis develop and maintain the schema in the wiki
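The "schema-last" idea can be illustrated with a toy fact store in which data arrives first and the schema is simply whatever the data implies so far. This is an illustrative sketch only; the class `SchemaLastStore` and its methods are hypothetical and do not reflect how Semantic MediaWiki is implemented.

```python
from collections import defaultdict


class SchemaLastStore:
    """Facts are stored first; the schema is derived from them afterwards."""

    def __init__(self):
        self.facts = []  # list of (page, property, value) triples

    def annotate(self, page, prop, value):
        """Record a fact without requiring the property to be declared first."""
        self.facts.append((page, prop, value))

    def schema(self):
        """Map each property seen so far to the value type names observed for it."""
        observed = defaultdict(set)
        for _page, prop, value in self.facts:
            observed[prop].add(type(value).__name__)
        return {prop: sorted(types) for prop, types in observed.items()}


store = SchemaLastStore()
store.annotate("Beijing", "Has country", "China")
store.annotate("Beijing", "Has population", 2200000)
schema_before = store.schema()

# A new property appears later; no migration or DBA step is needed.
store.annotate("Jin Mao Tower", "Floor count", 88)
schema_after = store.schema()
print(schema_after)
```

In a relational database the `Floor count` column would have to be designed before any row could use it; here the property exists as soon as someone writes it, and the community can refine its definition afterwards.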
List of Semantic Wikis
- AceWiki
- ArtificialMemory
- Wagn: Ruby on Rails-based
- KiWi: Knowledge in a Wiki
- Knoodl: semantic collaboration tool and application platform
- Metaweb: the software that powers Freebase
- OntoWiki
- OpenRecord
- PhpWiki
- Semantic MediaWiki: an extension to MediaWiki that turns it into a semantic wiki
- Swirrl: a spreadsheet-based semantic wiki application
- TaOPis: has a semantic wiki subsystem based on Frame logic
- TikiWiki CMS/Groupware: integrates semantic links as a core feature
- zAgile Wikidsmart: semantically enables Confluence
SEMANTIC MEDIAWIKI AND SEMANTIC EXTENSIONS
Semantic MediaWiki Software
Born at AIFB in 2005
- Open source (GPL), well documented
- Vulcan kicks off SMW+ in 2007
Active development
- Commercial support available
- World-wide community
International conferences
- Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
- Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
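To make the markup concrete, the sketch below pulls `[[Property::Value|label]]` annotations out of a line of wikitext and shows what a reader would see once they are rendered. It is a simplified illustration using a regular expression, not SMW's actual parser, and it ignores cases the real syntax supports (for example values containing colons or several properties in one link). The names `parse_annotations` and `render` are hypothetical.

```python
import re

# [[Property::Value]] or [[Property::Value|display label]]
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def parse_annotations(wikitext):
    """Return (property, value, label) triples; the label falls back to the value."""
    triples = []
    for match in ANNOTATION.finditer(wikitext):
        prop, value, label = match.group(1), match.group(2), match.group(3)
        triples.append((prop.strip(), value.strip(), (label or value).strip()))
    return triples


def render(wikitext):
    """Replace each annotation with the text a reader would see."""
    return ANNOTATION.sub(lambda m: m.group(3) or m.group(2), wikitext)


text = (
    "Beijing University is located in [[Has location::Beijing]] with "
    "[[Has population::30000|about 30 thousand]] students"
)
triples = parse_annotations(text)
rendered = render(text)
print(triples)
print(rendered)
```

The same sentence thus serves two audiences: readers see ordinary prose, while the wiki stores two machine-readable facts about the page.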
Special Properties
"Has Type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allowed Values" is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
In Halo Extensions there is domain and range support
- RDFS expressivity
- Semantic Gardening extension also supports "Cardinality"
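The effect of declaring a type and a set of allowed values can be sketched as a small validator. This is a hypothetical model for illustration, not SMW code: `PropertyDef` and `accepts` are invented names, "Has priority" is an invented property, and only the `Number` type is checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyDef:
    """Hypothetical stand-in for a property page with its special properties."""
    name: str
    has_type: str                       # e.g. "String", "Number", "Page"
    allowed_values: frozenset = frozenset()

    def accepts(self, value: str) -> bool:
        """Check a raw annotation value against the declared constraints."""
        if self.allowed_values and value not in self.allowed_values:
            return False
        if self.has_type == "Number":
            try:
                float(value)
            except ValueError:
                return False
        return True


# "Has priority" is an invented example property restricted to three values.
priority = PropertyDef("Has priority", "String", frozenset({"Low", "Medium", "High"}))
population = PropertyDef("Has population", "Number")

print(priority.accepts("High"), priority.accepts("Urgent"))
print(population.accepts("2200000"), population.accepts("many"))
```

Because such declarations live on ordinary wiki pages, tightening or loosening a constraint is itself a wiki edit that the community can review and roll back.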
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China; Category:Skyscrapers by country
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex.: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
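The query above is a conjunction: a page must be in the category and satisfy every property condition. A minimal in-memory sketch of that semantics follows. The function `ask` and the "Hypothetical Tower" records are invented for illustration; the Jin Mao Tower values are taken from the earlier slide (88 stories, year from its "1998 architecture" category). The comparisons here are strict, following the slide's English description; whether SMW itself treats `>` and `<` as strict or inclusive depends on its configuration.

```python
pages = [
    {"title": "Jin Mao Tower", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 88, "Year built": 1998},
    {"title": "Hypothetical Tower A", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 60, "Year built": 2008},
    {"title": "Hypothetical Tower B", "categories": {"Skyscrapers"},
     "Located in": "Japan", "Floor count": 70, "Year built": 1993},
    {"title": "Hypothetical Tower C", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 40, "Year built": 1990},
]


def ask(pages, category, conditions):
    """Return titles of pages in `category` that satisfy every (property, predicate)."""
    results = []
    for page in pages:
        if category not in page["categories"]:
            continue
        if all(prop in page and predicate(page[prop]) for prop, predicate in conditions):
            results.append(page["title"])
    return results


matches = ask(pages, "Skyscrapers", [
    ("Located in", lambda value: value == "China"),
    ("Floor count", lambda value: value > 50),
    ("Year built", lambda value: value < 2000),
])
print(matches)
```

Swapping a condition (for example a different country or floor threshold) needs no new code, which is the kind of ad hoc question the earlier "Can you find" slides said plain text search cannot answer.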
Semantic MediaWiki Stack
- MediaWiki (XAMPP)
- Extension: Semantic MediaWiki
- More extensions and applications
MediaWiki SMW+
Halo Extension
- Usability extension to Semantic MediaWiki
- Increases user consensus
- Increases use of semantic data
Semantic MediaWiki
- Core semantic wiki engine
- Authoring of explicit knowledge in content
- Basic reasoning capabilities
SMW+
- Shrink-wrap suite of open source software products
- Comes with ready-to-use ontology
- Easy to procure and install
MediaWiki
- Powerful wiki engine
- Basic CMS feature set
SMW Extensions - Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
- Domain: German cars (in Wikipedia)
- Wikipedia merged with the power of a database
- Help readers and writers be more productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
- Fill in data (content)
- Develop schema (ontology)
- Design UI: skin, form & templates
- Choose extensions
Video: Semantic Wikis for a New Problem (2008)
Increasing technical complexity →
← Increasing user participation
Social tag-based characterization
- Keyword search over tag data
- Inconsistent semantics
- Easy to engineer
Algorithm-based object characterization
- Database-style search
- Consistent semantics
- Extremely difficult to engineer
Semantic Entertainment Wiki: social database-style characterization
- Database search + wiki text search
- Semantic consistency via wiki mechanisms
- Easy to engineer
Semantic Seahawks Football Wiki
Semantic Entertainment: Query Result Highlight Reel
- Commercial look/feel
- Play-by-play video search
- Highlight reel generation
- Search on crowd-defined patterns ("touchdowns with big hits")
- Tree-based navigation widget
- Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
- Pro: Very quick to obtain
- Con: Hard to customize
- Con: Expensive
- Examples: Microsoft Project, VersionOne, Microsoft SharePoint
Custom Development
- Con: Slow to develop
- Pro: Extremely flexible
- Con: High cost to develop and maintain
- Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
- Pro: Still quick to program
- Pro: Easy to customize
- Pro: Low-moderate cost
- Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate"
    - UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
    - UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    - UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
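The glycolysis example shows one textbook sentence being split into several UTs, each filed under a concept. A minimal sketch of that data shape follows, using the three UTs from the slide. The `UniversalTruth` class and `group_by_concept` function are hypothetical names and do not describe AURA's or AURAwiki's actual storage format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UniversalTruth:
    """One universally-quantified statement, filed under a single concept."""
    text: str
    concept: str


SOURCE_SENTENCE = (
    "Glycolysis, which occurs in the cytosol, begins the degradation process by "
    "breaking glucose into two molecules of a compound called pyruvate"
)

# The three UTs the slide derives from SOURCE_SENTENCE.
uts = [
    UniversalTruth("Glycolysis occurs in Cytosol", "Glycolysis"),
    UniversalTruth("During cellular respiration, glycolysis is the first step",
                   "Cellular Respiration"),
    UniversalTruth("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules",
                   "Glycolysis"),
]


def group_by_concept(truths):
    """Collect UT texts per concept, preserving the order they were authored in."""
    grouped = defaultdict(list)
    for truth in truths:
        grouped[truth.concept].append(truth.text)
    return dict(grouped)


by_concept = group_by_concept(uts)
for concept, texts in by_concept.items():
    print(f"{concept}: {len(texts)} UT(s)")
```

Grouping by concept matters because encoders work concept by concept: everything the crowd has said about Glycolysis can be handed over as one batch.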
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down encoding flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs; previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up encoding flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAwiki
- Build a "Knowledge Integration Center"
Current page designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise, embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
- HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
- Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
[Bar chart: HW score and quiz score for the Inquire, Ablated, and Textbook groups (n=24, n=25, n=23); data labels 81, 88, 74, 75, 71, 81]
[Chart: Quiz Score Distribution, % of students (0-50) by grade (A, B, C, D, F) for the Inquire, Ablated, and Textbook groups]
Knowledge from Campbell Biology in Inquire
Most-used biology textbook in the US, both high school and college
Total Core Sentences: 28878
56 Chapters, 262 Sections
– Avg 4.7 sections/chapter, 515 sentences/chapter, 109 sentences/section
– Relevant sentences in book (extrapolation from Chaps 6, 9, 10): 49%, 14150 sentences
750 Core library concepts
~6000 total concepts in taxonomy
– ~3000 concepts in Chaps 2-22
~65000 knowledge base assertions
~50000 suggested questions
4 hours/sentence authoring time
– Design, planning, encoding, testing, refining, new relation development
60-70 new encoding relations/year
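The averages and the overall authoring cost implied by these figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above. The total-hours estimate assumes every relevant sentence costs the full 4 hours, which the slide does not state outright. Direct division gives about 110 sentences per section rather than the quoted 109, so the slide may have used a slightly different count.

```python
# Figures quoted on the slide.
TOTAL_SENTENCES = 28878
CHAPTERS = 56
SECTIONS = 262
RELEVANT_SENTENCES = 14150
HOURS_PER_SENTENCE = 4  # approximate authoring cost per sentence

sections_per_chapter = SECTIONS / CHAPTERS
sentences_per_chapter = TOTAL_SENTENCES / CHAPTERS
sentences_per_section = TOTAL_SENTENCES / SECTIONS
relevant_share = RELEVANT_SENTENCES / TOTAL_SENTENCES

# Assumption: every relevant sentence costs the full 4 hours to encode.
estimated_hours = RELEVANT_SENTENCES * HOURS_PER_SENTENCE

print(f"sections/chapter:  {sections_per_chapter:.1f}")
print(f"sentences/chapter: {sentences_per_chapter:.0f}")
print(f"sentences/section: {sentences_per_section:.0f}")
print(f"relevant share:    {relevant_share:.0%}")
print(f"estimated hours:   {estimated_hours}")
```

At this rate the relevant part of the book alone represents tens of thousands of person-hours, which is the cost problem the rest of the talk addresses.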
Inquire's Existing Knowledge Entry Process
Multiple 3-person teams
Highly accurate but costly
– ~4 person-hours per sentence
– Intensive testing and knowledge refinement
Biology teams and AI experts represent each sentence
[Diagram labels: Senior SME; Senior KR Expert; Concept Map Planning; Concept Map Entry; Concept Map Testing; Content Review; KR Review; Assess & Correct; Knowledge Entry Group, Delhi, India]
Knowledge Entry Process
1) Determining Relevance -- 2% time
Highlighting, Diagram Analysis, QA Check
Status Labeling: Relevant, Irrelevant (Closed)
2) Reaching Consensus -- 14% time
Universal Truth Authoring, Concept Chosen, QA Check
3) Encoding Planning -- 35% time
Group Common UTs, ID KR/KE Issues
ID Already Encoded, Write How to Encode
Pre-Planning QA Check
Status Labeling: Encoding Complete, KR Issue (Closed)
4) Encoding -- 10% time
Encode File, JIRA Issues, QA Check
Status Labeling: Encoding Complete, KE Issue
5) Key Term Review -- 25% time
KR Evaluated by Modeling Expert and Biologist, Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
6) Question-Based Testing -- 14% time
Use Minimal Test Suite, Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of "Passing" Comparison and Relationship Questions
Color key: Peach = EVS, Light Blue = SRI
Time by phase: Planning (51%), Testing (39%), Encoding (10%)
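The phase totals follow from the six stage percentages of the Knowledge Entry Process. The sketch below sums them under an assumed stage-to-phase mapping. That mapping is not given on the slides. It is chosen here because it is the grouping that reproduces 51, 39, and 10.

```python
# Per-stage share of authoring time (percent), as given on the slides.
STAGE_PERCENT = {
    "Determining Relevance": 2,
    "Reaching Consensus": 14,
    "Encoding Planning": 35,
    "Encoding": 10,
    "Key Term Review": 25,
    "Question-Based Testing": 14,
}

# Assumed mapping of stages to phases (not stated in the source).
STAGE_PHASE = {
    "Determining Relevance": "Planning",
    "Reaching Consensus": "Planning",
    "Encoding Planning": "Planning",
    "Encoding": "Encoding",
    "Key Term Review": "Testing",
    "Question-Based Testing": "Testing",
}


def phase_totals(stage_percent, stage_phase):
    """Sum the stage percentages within each phase."""
    totals = {}
    for stage, pct in stage_percent.items():
        phase = stage_phase[stage]
        totals[phase] = totals.get(phase, 0) + pct
    return totals


totals = phase_totals(STAGE_PERCENT, STAGE_PHASE)
print(totals)
```

Under this grouping, only a tenth of the effort is the encoding itself. Planning and testing dominate, which is why the later slides look to social authoring to cut the planning cost.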
How Can We Improve Knowledge Entry?
This is the well-known Knowledge Acquisition Problem
– AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
– Automatic Machine Parsing/Editing
• Not good enough for biology textbook sentences
• Error rates are too high
• Need humans in the loop
– Mechanical Turk Authoring
• Biology expertise is difficult to get
• Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
– Social Authoring and Crowdsourcing
• Wikipedia showed this could work for text
• In 2008, Vulcan started a pilot project to explore social knowledge authoring
Crowdsourcing for Better Knowledge Acquisition
Edit: wow, I can change the web
Let's share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
[Figure: actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)]
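For readers unfamiliar with the Gompertz model, the sketch below shows the general shape of such a curve: slow start, rapid growth, then saturation at an upper limit. Only the 4.4 million asymptote comes from the slide. The parameters `B` and `C` are hypothetical values picked to give a plausible curve and are not the fitted values behind the figure.

```python
import math


def gompertz(t, a, b, c):
    """Gompertz growth curve a * exp(-b * exp(-c * t)); approaches a as t grows."""
    return a * math.exp(-b * math.exp(-c * t))


A_MAX = 4.4e6  # asymptote quoted on the slide (articles)
B = 15.0       # hypothetical displacement parameter
C = 0.4        # hypothetical growth rate per year

for years in (0, 5, 10, 20, 40):
    print(years, round(gompertz(years, A_MAX, B, C)))
```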
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
– ~17 edits/page on average in Wikipedia (with high variance)
– Wikipedia's Neutral Point of View
Convention
– Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
Trivial to edit by anyone
Tracking of all changes, one-step rollback
Every article has a "Talk" page for discussion
Notification facility allows anyone to "watch" an article
Sufficient security on pages, logins can be required
A hierarchy of administrators, gardeners, and editors
Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross…
• … all German cars, with engine size, acceleration data…
Can you find
• Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000
• Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)
• Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
• All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
• Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
• A map showing where all Mercedes-Benz vehicles are manufactured
• All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more and built in year 2000 (2001, 2002) and after, sorted by built year, floors…, grouped by cities, regions…
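Questions like these are easy once the facts exist as structured records rather than prose. As a minimal sketch, the movie question above becomes a simple filter over typed fields. The `Movie` class and all records below are invented for illustration and do not describe real films.

```python
from dataclasses import dataclass


@dataclass
class Movie:
    title: str
    genre: str
    year: int
    budget_usd: float
    gross_usd: float


# Hypothetical records, invented for illustration only.
MOVIES = [
    Movie("Example A", "Sci-Fi", 2004, 8_000_000, 45_000_000),
    Movie("Example B", "Sci-Fi", 2006, 60_000_000, 200_000_000),
    Movie("Example C", "Drama", 2005, 5_000_000, 40_000_000),
    Movie("Example D", "Sci-Fi", 1999, 9_000_000, 50_000_000),
    Movie("Example E", "Sci-Fi", 2008, 9_500_000, 12_000_000),
]


def cheap_scifi_hits(movies):
    """Sci-Fi movies from 2000-2008 costing under $10M and grossing over $30M."""
    matches = (
        m for m in movies
        if m.genre == "Sci-Fi"
        and 2000 <= m.year <= 2008
        and m.budget_usd < 10_000_000
        and m.gross_usd > 30_000_000
    )
    return sorted(matches, key=lambda m: m.year)


result = [m.title for m in cheap_scifi_hits(MOVIES)]
print(result)
```

A keyword search over article text cannot express the numeric and date constraints. A query over typed data can.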
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki: Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
– Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
– Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
– "capital_of", "contains", "born_in"…
Querying Interface Support
– E.g. "[[Category:Member]] [[Age::<30]]" (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
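A rough way to picture "schema-last" is a store where the schema is derived from whatever annotations users have entered so far, rather than declared up front. The sketch below is an illustration of that idea only and is not how SMW is implemented. The page names, property names, and the `infer_schema` helper are all hypothetical.

```python
from collections import defaultdict


def infer_schema(pages):
    """Collect, per property, the set of value type names seen in the page data."""
    schema = defaultdict(set)
    for annotations in pages.values():
        for prop, value in annotations.items():
            schema[prop].add(type(value).__name__)
    return dict(schema)


# Hypothetical pages and properties, invented for illustration.
pages = {}
pages["Movie X"] = {"Has director": "Person Y"}
schema_v1 = infer_schema(pages)

# A later edit introduces a new property with no migration step.
pages["Movie Z"] = {"Has director": "Person W", "Has budget": 10_000_000}
schema_v2 = infer_schema(pages)

print(schema_v1)
print(schema_v2)
```

The second edit extends the schema simply by using a new property, which is the contrast with an RDBMS that the slide draws.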
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi - Knowledge in a Wiki
Knoodl - Semantic collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware - integrates Semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
– Open source (GPL), well documented
– Vulcan kicks off SMW+ in 2007
Active development
– Commercial support available
– World-wide community
International Conferences
– Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
– Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
Special Properties
"Has type" is a pre-defined "special" property for metadata
– Example: [[Has type::String]]
"Allowed Values" is another special property
– [[Allows value::Low]]
– [[Allows value::Medium]]
– [[Allows value::High]]
In Halo Extensions, there is domain and range support
– RDFS expressivity
– Semantic Gardening extension also supports "Cardinality"
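The effect of type and allowed-value declarations can be sketched as a small validation step. This illustrates the idea only and is not SMW or Halo code. The property name "Priority" and the `validate` helper are hypothetical. Treating undeclared properties as acceptable is an assumption made here in keeping with the schema-last approach.

```python
# Property declarations, mirroring what would live on "Property:" pages.
# "Priority" is a hypothetical property used only for this sketch.
PROPERTY_DECLS = {
    "Priority": {"type": "String", "allowed": {"Low", "Medium", "High"}},
    "Has population": {"type": "Number", "allowed": None},
}


def validate(prop, value):
    """Return a list of problems for one (property, value) annotation."""
    decl = PROPERTY_DECLS.get(prop)
    if decl is None:
        # Assumption: undeclared properties are accepted (schema-last).
        return []
    problems = []
    if decl["type"] == "Number":
        try:
            float(value)
        except ValueError:
            problems.append(f"{prop}: {value!r} is not a number")
    if decl["allowed"] is not None and value not in decl["allowed"]:
        problems.append(f"{prop}: {value!r} is not an allowed value")
    return problems


print(validate("Priority", "Medium"))
print(validate("Priority", "Urgent"))
print(validate("Has population", "many"))
```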
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
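Class inheritance through categories amounts to following subcategory links upward. The sketch below assumes the chain suggested by the slide (Skyscrapers in Shanghai, then Skyscrapers in China, then Skyscrapers by country) and computes every category a page belongs to, directly or by inheritance. The dictionaries and the `all_categories` helper are illustrative and not part of MediaWiki.

```python
# Direct "subcategory of" links, following the slide's example chain (assumed).
PARENT_CATEGORIES = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}

# Direct category membership of pages (a subset of the slide's list).
PAGE_CATEGORIES = {
    "Jin Mao Tower": ["Skyscrapers in Shanghai", "Hotels in Shanghai"],
}


def all_categories(page):
    """Return the page's direct categories plus all inherited ancestor categories."""
    seen = set()
    stack = list(PAGE_CATEGORIES.get(page, []))
    while stack:
        category = stack.pop()
        if category in seen:
            continue  # guards against cycles in the category graph
        seen.add(category)
        stack.extend(PARENT_CATEGORIES.get(category, []))
    return seen


print(sorted(all_categories("Jin Mao Tower")))
```

A query for skyscrapers in China can then match the Jin Mao Tower page even though the page itself is only tagged with the Shanghai category.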
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
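The query conditions read as a conjunction of filters over page annotations. The toy evaluator below illustrates that reading over an in-memory dictionary. It is not SMW's query engine. The page data is invented, and `>` and `<` are treated as strict comparisons for simplicity, whereas SMW's own comparator semantics depend on its version and configuration.

```python
import operator
import re

CONDITION = re.compile(r"\[\[(.+?)\]\]")


def parse_query(query):
    """Turn '[[Category:X]] [[Prop::>N]] ...' into a list of predicate functions."""
    predicates = []
    for body in CONDITION.findall(query):
        if "::" in body:
            prop, raw = (s.strip() for s in body.split("::", 1))
            if raw[0] in "<>":
                # Assumption: strict numeric comparison.
                compare = operator.gt if raw[0] == ">" else operator.lt
                bound = float(raw[1:])
                predicates.append(
                    lambda page, p=prop, c=compare, b=bound: p in page and c(page[p], b))
            else:
                predicates.append(lambda page, p=prop, v=raw: page.get(p) == v)
        elif body.startswith("Category:"):
            cat = body[len("Category:"):].strip()
            predicates.append(lambda page, c=cat: c in page.get("categories", ()))
    return predicates


def ask(pages, query):
    """Return the names of pages that satisfy every condition in the query."""
    predicates = parse_query(query)
    return sorted(name for name, page in pages.items()
                  if all(pred(page) for pred in predicates))


# Hypothetical page data, invented for illustration.
PAGES = {
    "Tower A": {"categories": ["Skyscrapers"], "Located in": "China",
                "Floor count": 88, "Year built": 1998},
    "Tower B": {"categories": ["Skyscrapers"], "Located in": "China",
                "Floor count": 101, "Year built": 2008},
    "Tower C": {"categories": ["Skyscrapers"], "Located in": "Japan",
                "Floor count": 60, "Year built": 1993},
    "Tower D": {"categories": ["Skyscrapers"], "Located in": "China",
                "Floor count": 40, "Year built": 1990},
}

QUERY = ("[[Category:Skyscrapers]] [[Located in::China]] "
         "[[Floor count::>50]] [[Year built::<2000]]")
result = ask(PAGES, QUERY)
print(result)
```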
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
– Usability extension to Semantic MediaWiki
– Increases user consensus
– Increases use of semantic data
Semantic MediaWiki
– Core Semantic Wiki engine
– Authoring of explicit knowledge in content
– Basic reasoning capabilities
SMW+
– Shrink-wrap suite of open source software products
– Comes with ready-to-use ontology
– Easy to procure and install
MediaWiki
– Powerful Wiki engine
– Basic CMS feature set
SMW Extensions – Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: an SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, form & templates
Choose extensions
Video: Semantic Wikis for a New Problem (2008)
Increasing technical complexity →
← Increasing user participation
Social tag-based characterization
– Keyword search over tag data
– Inconsistent semantics
– Easy to engineer
Algorithm-based object characterization
– Database-style search
– Consistent semantics
– Extremely difficult to engineer
Social database-style characterization
– Database search + wiki text search
– Semantic consistency via wiki mechanisms
– Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment: Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns ("touchdowns with big hits")
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead, one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
• Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
(+) Very quick to obtain
(-) Hard to customize
(-) Expensive
Microsoft Project, Version One, Microsoft SharePoint
Custom Development
(-) Slow to develop
(+) Extremely flexible
(-) High cost to develop and maintain
.NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
(+) Still quick to program
(+) Easy to customize
(+) Low-moderate cost
Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate"
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
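One way to picture the hand-off from the wiki to AURA is as a list of UT records, each tagged with the concept it belongs to, grouped so that an encoder sees all statements about a concept together. The sketch below uses the three UTs from the glycolysis example. The `UT` class and `group_by_concept` helper are hypothetical and do not describe AURA's actual input format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """A universally-quantified statement and the concept it is filed under."""
    text: str
    concept: str


# The three UTs derived from the example sentence on the slide.
UTS = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration, glycolysis is the first step",
       "Cellular Respiration"),
    UT("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules",
       "Glycolysis"),
]


def group_by_concept(uts):
    """Group UT texts by concept, preserving the order in which they were authored."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


grouped = group_by_concept(UTS)
for concept, statements in grouped.items():
    print(concept, len(statements))
```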
Knowledge Entry Process
3) Encoding Planning -- 35 time
Group Common UTs ID KRKE Issues
ID Already Encoded Write How to Encode
Pre-Planning QA Check
Status Labeling Encoding Complete KR Issue (Closed)
2) Reaching Consensus -- 14 time
Universal Truth Authoring Concept Chosen QA Check
1) Determining Relevance -- 2 time
Highlighting Diagram Analysis QA Check
Status Labeling Relevant Irrelevant (Closed)
6) Question-Based Testing -- 14 time
Use Minimal Test Suite Reasoning JIRA Issues Filed
Encoder Fills KB Gaps
QA Check with Screenshots of ldquoPassing Comparison and Relationship Questions
5) Key Term Review -- 25 time
KR Evaluated by Modeling Expert and Biologist Encoder Makes Changes
KR Evaluated by Modeling Expert and Biologist
QA Check
4) Encoding -- 10 time
Encode File JIRA Issues QA Check
Status Labeling Encoding Complete KE Issue
Peach = EVS Light Blue = SRI
Planning (51%)
Testing (39%)
Encoding (10%)
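The three phase totals can be checked against the six per-step shares on the Knowledge Entry Process slide. The sketch below does that arithmetic. The grouping of steps into phases (steps 1-3 as planning, step 4 as encoding, steps 5-6 as testing) is an assumption inferred from the sums; the slides do not state it explicitly.

```python
# Share of total authoring time per step (percent), from the Knowledge Entry Process slide.
STEP_SHARE = {
    "1) Determining Relevance": 2,
    "2) Reaching Consensus": 14,
    "3) Encoding Planning": 35,
    "4) Encoding": 10,
    "5) Key Term Review": 25,
    "6) Question-Based Testing": 14,
}

# Assumed grouping of steps into phases (inferred from the totals, not stated on the slides).
PHASES = {
    "Planning": ["1) Determining Relevance", "2) Reaching Consensus", "3) Encoding Planning"],
    "Encoding": ["4) Encoding"],
    "Testing": ["5) Key Term Review", "6) Question-Based Testing"],
}


def phase_totals(step_share, phases):
    """Sum the per-step percentages within each phase."""
    return {phase: sum(step_share[s] for s in steps) for phase, steps in phases.items()}


totals = phase_totals(STEP_SHARE, PHASES)
print(totals)
```

Under this grouping, only a tenth of the effort is the encoding itself; the rest is planning and review, which is the part social authoring is meant to reduce.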
This is the well-known Knowledge Acquisition Problem
- AI authoring is too expensive, too slow, not scalable
Three Possible Solutions
- Automatic Machine Parsing/Editing
  • Not good enough for biology textbook sentences
  • Error rates are too high
  • Need humans in the loop
- Mechanical Turk Authoring
  • Biology expertise is difficult to get
  • Mechanical Turk uses individuals, but the Knowledge Entry task appears to require coordination, judgment, discussion, and working together
- Social Authoring and Crowdsourcing
  • Wikipedia showed this could work for text
  • In 2008, Vulcan started a pilot project to explore social knowledge authoring
How Can We Improve Knowledge Entry?
Crowdsourcing for Better Knowledge Acquisition
edit: wow, I can change the web
let's share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
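For readers unfamiliar with the Gompertz model named in the caption: it is an S-shaped growth curve of the form N(t) = a * exp(-b * exp(-c * t)), where a is the ceiling the curve approaches. The sketch below evaluates that form, taking the caption's maximum as 4.4 million articles. The values of b and c are hypothetical, chosen only to produce a plausible-looking curve; they are not the fitted parameters behind the chart.

```python
import math


def gompertz(t, a, b, c):
    """Gompertz growth curve: a * exp(-b * exp(-c * t)); approaches a as t grows."""
    return a * math.exp(-b * math.exp(-c * t))


A_MAX = 4.4e6  # asymptote, reading the caption's maximum as about 4.4 million articles
B = 15.0       # hypothetical displacement parameter, for illustration only
C = 0.4        # hypothetical growth rate per year, for illustration only

for years in (0, 5, 10, 20, 40):
    print(years, round(gompertz(years, A_MAX, B, C)))
```

The curve starts near zero, grows fastest in the middle, and flattens as it nears the ceiling, which is the saturation behaviour the thin green line describes.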
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
- ~17 edits/page on average in Wikipedia (with high variance)
- Wikipedia's Neutral Point of View
Convention
- Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
Trivial to edit by anyone
Tracking of all changes, one-step rollback
Every article has a "Talk" page for discussion
Notification facility allows anyone to "watch" an article
Sufficient security on pages; logins can be required
A hierarchy of administrators, gardeners, and editors
Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross, …
• … all German cars, with engine size, acceleration data, …
Can you find
Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000
Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)
Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
A map showing where all Mercedes-Benz vehicles are manufactured
All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more, built in year 2000 (2001, 2002) and after, sorted by built year, floors, …, grouped by cities, regions, …
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
- Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
- Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
- "capital_of", "contains", "born_in", …
Querying Interface Support
- E.g. "[[Category:Member]] [[Age::<30]]" (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
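One way to picture "schema-last" is that the schema is read off the data after users have entered it, instead of being designed first. The sketch below illustrates only that idea. The page contents, property names, and the type-inference rule are assumptions made for the example; it does not show how SMW itself stores property types.

```python
from collections import defaultdict

# Hypothetical page annotations, entered by users with no schema defined up front.
pages = {
    "Beijing": {"Has country": "China", "Has population": 2200000},
    "Jin Mao Tower": {"Located in": "China", "Floor count": 88},
    "Shanghai": {"Has country": "China"},
}


def infer_schema(pages):
    """Collect each property seen in the data with the Python types of its values."""
    schema = defaultdict(set)
    for annotations in pages.values():
        for prop, value in annotations.items():
            schema[prop].add(type(value).__name__)
    return {prop: sorted(types) for prop, types in schema.items()}


schema = infer_schema(pages)
for prop in sorted(schema):
    print(prop, schema[prop])
```

Adding a page with a new property simply extends the derived schema; no migration step is needed, which is the contrast with the DBA-led design mentioned on the slide.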
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi - Knowledge in a Wiki
Knoodl - Semantic Collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware - integrates Semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
- Open source (GPL), well documented
- Vulcan kicks off SMW+ in 2007
Active development
- Commercial support available
- World-wide community
International Conferences
- Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
- Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
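To make the markup concrete, here is a minimal sketch of how inline annotations of the form `[[Property::value|label]]` could be pulled out of page text and how the reader-facing text could be rendered. The regular expression and function names are illustrative assumptions; this is not SMW's actual parser and it handles only the simple form shown on the slide.

```python
import re

# Matches [[Property::value]] and [[Property::value|display label]].
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(wikitext):
    """Return (property, value, label) triples; label is None when absent."""
    return [(m.group(1).strip(), m.group(2).strip(), m.group(3))
            for m in ANNOTATION.finditer(wikitext)]


def render(wikitext):
    """Replace each annotation with its display text (the label, else the value)."""
    return ANNOTATION.sub(
        lambda m: m.group(3) if m.group(3) is not None else m.group(2), wikitext)


text = ("Beijing University is located in [[Has location::Beijing]] with "
        "[[Has population::30000|about 30 thousand]] students")
print(extract_annotations(text))
print(render(text))
```

The same sentence therefore carries two things at once: readable prose for people and property-value pairs for queries.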
Special Properties
"Has type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allowed Values" is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
In Halo Extensions there is domain and range support
- RDFS expressivity
- Semantic Gardening extension also supports "Cardinality"
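The effect of a declared type and a list of allowed values can be sketched as a simple validation step. In the example below, the property name "Has priority", the declaration table, and the checking logic are all hypothetical; they illustrate the constraint, not how SMW or the Halo extensions implement it.

```python
# Hypothetical property declarations, mirroring what a Property page would state.
PROPERTY_DECLARATIONS = {
    "Has priority": {"type": "String", "allowed": {"Low", "Medium", "High"}},
    "Has population": {"type": "Number", "allowed": None},
}


def is_valid(prop, value):
    """Check a raw annotation value against the property's declared type and allowed values."""
    decl = PROPERTY_DECLARATIONS.get(prop)
    if decl is None:
        return True  # schema-last: undeclared properties are accepted as-is
    if decl["type"] == "Number":
        try:
            float(value)
        except ValueError:
            return False
    if decl["allowed"] is not None and value not in decl["allowed"]:
        return False
    return True


print(is_valid("Has priority", "Medium"))
print(is_valid("Has priority", "Urgent"))
print(is_valid("Has population", "30000"))
```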
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
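Class inheritance through categories amounts to following subcategory links upward. The sketch below shows that lookup. The parent links between the categories are assumed for illustration; the slide names the categories but this transcript does not preserve how they were linked.

```python
# Assumed subcategory links (child -> parents), for illustration only.
PARENTS = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
    "Skyscrapers by country": ["Skyscrapers"],
}


def ancestors(category, parents=PARENTS):
    """Return every category reachable by following parent links."""
    seen = set()
    stack = [category]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance_of(page_categories, target):
    """A page belongs to a class if any of its categories is, or inherits from, the target."""
    return any(c == target or target in ancestors(c) for c in page_categories)


jin_mao = ["1998 architecture", "Skyscrapers in Shanghai", "Hotels in Shanghai"]
print(is_instance_of(jin_mao, "Skyscrapers in China"))
```

With links like these, a page tagged only with "Skyscrapers in Shanghai" is still found by a query over a broader skyscraper class.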
Possible Database-style Query over Data
{{#ask:
 [[Category:Skyscrapers]]
 [[Located in::China]]
 [[Floor count::>50]]
 [[Year built::<2000]]
 …
}}
Ex.: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
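The query above is a conjunction of a category test and property comparisons. A minimal in-memory sketch of that evaluation follows. The "Example Tower" pages and their values are invented for illustration; the Jin Mao Tower's 88 floors and its 1998 category are taken from the slides. Comparators are treated as strict here to match the slide's wording, although an SMW installation may treat them as inclusive depending on its configuration.

```python
import operator

# Illustrative pages; only the Jin Mao Tower values are taken from the slides.
PAGES = {
    "Jin Mao Tower": {"categories": {"Skyscrapers"}, "Located in": "China",
                      "Floor count": 88, "Year built": 1998},
    "Example Tower A": {"categories": {"Skyscrapers"}, "Located in": "China",
                        "Floor count": 45, "Year built": 1995},
    "Example Tower B": {"categories": {"Skyscrapers"}, "Located in": "Japan",
                        "Floor count": 60, "Year built": 1993},
    "Example Tower C": {"categories": {"Skyscrapers"}, "Located in": "China",
                        "Floor count": 70, "Year built": 2008},
}

# Each condition is (property, comparator, value); comparators are strict in this sketch.
CONDITIONS = [
    ("Located in", operator.eq, "China"),
    ("Floor count", operator.gt, 50),
    ("Year built", operator.lt, 2000),
]


def ask(pages, category, conditions):
    """Return names of pages in the category that satisfy every condition."""
    results = []
    for name, data in pages.items():
        if category not in data["categories"]:
            continue
        if all(prop in data and compare(data[prop], value)
               for prop, compare, value in conditions):
            results.append(name)
    return sorted(results)


result = ask(PAGES, "Skyscrapers", CONDITIONS)
print(result)
```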
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
- Usability extension to Semantic MediaWiki
- Increases user consensus
- Increases use of semantic data
Semantic MediaWiki
- Core Semantic Wiki engine
- Authoring of explicit knowledge in content
- Basic reasoning capabilities
SMW+
- Shrink-wrap suite of open source software products
- Comes with ready-to-use ontology
- Easy to procure and install
MediaWiki
- Powerful Wiki engine
- Basic CMS feature set
SMW Extensions - Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, form & templates
Choose extensions
Video Semantic Wikis for A New Problem (2008)
Increasing technical complexity →
← Increasing User Participation
Social tag-based characterization
- Keyword search over tag data
- Inconsistent semantics
- Easy to engineer
Algorithm-based object characterization
- Database-style search
- Consistent semantics
- Extremely difficult to engineer
Social database-style characterization
- Database search + wiki text search
- Semantic consistency via wiki mechanisms
- Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (ldquotouchdowns with big hitsrdquo)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application Mini-Olympics
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design Content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
- (+) Very quick to obtain
- (-) Hard to customize
- (-) Expensive
- Examples: Microsoft Project, Version One, Microsoft SharePoint
Custom Development
- (-) Slow to develop
- (+) Extremely flexible
- (-) High cost to develop and maintain
- Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
- (+) Still quick to program
- (+) Easy to customize
- (+) Low-moderate cost
- Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate"
    - UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
    - UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    - UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
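The glycolysis example splits one textbook sentence into several UTs, each filed under a concept. A small sketch of that data shape is given below. The `UT` record and the grouping helper are assumptions made for illustration; the slides do not describe how AURA or the wiki actually store these statements.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """One UT statement tagged with the concept it is filed under (hypothetical record)."""
    text: str
    concept: str


# The three UTs given on the slide for the glycolysis sentence.
uts = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration, glycolysis is the first step", "Cellular Respiration"),
    UT("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules", "Glycolysis"),
]


def group_by_concept(uts):
    """Index UT texts by concept so each concept's statements can be handled together."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


by_concept = group_by_concept(uts)
for concept, texts in by_concept.items():
    print(concept, len(texts))
```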
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs; previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
This is the well-known Knowledge Acquisition Problem ndash AI authoring is too expensive too slow not scalable
Three Possible Solutions ndash Automatic Machine ParsingEditing
bull Not good enough for biology textbook sentences
bull Error rates are too high
bull Need humans in the loop
ndash Mechanical Turk Authoring bull Biology expertise is difficult to get
bull Mechanical Turk uses individuals but the Knowledge Entry task appears to require coordination judgment discussion and working together
ndash Social Authoring and Crowdsourcing bull Wikipedia showed this could work for text
bull In 2008 Vulcan started a pilot project to explore social knowledge authoring
How Can We Improve Knowledge Entry
19
Crowdsourcing for Better Knowledge Acquisition
20
edit wow I can change the web
letrsquos share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
27
Actual number of articles on enwikipediaorg (thick
blue line) compared with a Gompertz model that leads
eventually to a maximum of about 44 million articles
(thin green line)
A Key Feature of Wiki
28
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration ndash ~17 editspage on average in
Wikipedia (with high variance)
ndash Wikipediarsquos Neutral Point of View
Convention ndash Users follow customs and
conventions to engage with articles effectively
29
Software Support Makes Wikis Successful
Trivial to edit by anyone Tracking of all changes one-
step rollback Every article has a ldquoTalkrdquo page
for discussion Notification facility allows anyone
to ldquowatchrdquo an article Sufficient security on pages
logins can be required A hierarchy of administrators
gardeners and editors Software Bots recognize certain
kinds of vandalism and auto-revert or recognize articles that need work and flag them for editors
30
However Finding Information Is hellip
Wikipedia has articles abouthellip
bull hellip all movies with cast director
budget running time grosshellip
hellip all German cars with engine
size accelerating datahellip
Can you find
Sci-Fi movies made between
2000-2008 that cost less than
$10000 and made $30000
Skyscrapers with 50+ floors
and built after 2000 in
Shanghai (or Chinese cities
with 1000000+ people)
31
Or German(Porsche) cars that
accelerate from 0-100kmh in 5
seconds
Can Search Solve the Problem
32
Answer is Hidden Deeply in
33
To Find More Info
All Porsche vehicles made in
Germany that accelerate from 1-
100 kmh less than 4 seconds
Sci-Fi movies made between
2000-2008 that cost less than
$10M and gross more than $30M
A map showing where all
Mercedes-Benz vehicles are
manufactured
All skyscrapers in China (Japan
Thailandhellip) of 50 (406070) floors
or more and built in year 2000
(20012002) and after sorted by
built year floorshellip grouped by
cities regionshellip
35
What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
36
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
37
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
ndash CategoryTags Namespaces Title Versioning
Typed Content (built-ins + user created eg categories)
ndash PageCard Date Number URLEmail String hellip
Typed Links (eg properties)
ndash ldquocapital_ofrdquo ldquocontainsrdquo ldquoborn_inrdquohellip
Querying Interface Support
ndash Eg ldquo[[CategoryMember]] [[Agelt30]]rdquo (in SMW)
41
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi - Knowledge in a Wiki
Knoodl - Semantic Collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware - integrates Semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
- Open source (GPL), well documented
- Vulcan kicks off SMW+ in 2007
Active development
- Commercial support available
- World-wide community
International Conferences
- Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
- Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
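To make the inline annotation syntax concrete, here is a minimal Python sketch of how `[[Property::value|label]]` markup could be turned into subject-property-value triples and into display text. It is an illustration only, not SMW's actual parser; the function names `extract_annotations` and `render` and the regular expression are assumptions made for this example.

```python
import re

# Matches [[Property::value]] or [[Property::value|label]].
# Plain wiki links such as [[Beijing]] contain no "::" and are ignored.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(page_title, wikitext):
    """Return (subject, property, value) triples found in the wikitext."""
    return [
        (page_title, m.group(1).strip(), m.group(2).strip())
        for m in ANNOTATION.finditer(wikitext)
    ]


def render(wikitext):
    """Replace each annotation with its label, or with its value if unlabeled."""
    def display(m):
        return m.group(3) if m.group(3) is not None else m.group(2)
    return ANNOTATION.sub(display, wikitext)


text = (
    "Beijing University is located in [[Has location::Beijing]] with "
    "[[Has population::30000|about 30 thousand]] students"
)
triples = extract_annotations("Beijing University", text)
rendered = render(text)
print(triples)
print(rendered)
```

The page title acts as the subject of every triple, which is why annotating text in place is enough to build structured data about the page.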
Special Properties
"Has type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allowed Values" is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
In Halo Extensions there is domain and range support
- RDFS expressivity
- Semantic Gardening extension also supports "Cardinality"
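The effect of "Has type" and "Allows value" declarations can be sketched as a small validation step. The Python below is a hypothetical illustration, not SMW code: the dictionary layout (`type`, `allowed_values`) and the function `check_value` are assumptions chosen for this example.

```python
def check_value(prop_schema, value):
    """Return a list of problems for one property value (empty list means OK)."""
    problems = []
    # A property declared as Number should only accept numeric strings.
    if prop_schema.get("type") == "Number":
        try:
            float(value)
        except ValueError:
            problems.append(f"{value!r} is not a number")
    # A property with declared allowed values accepts only those values.
    allowed = prop_schema.get("allowed_values")
    if allowed and value not in allowed:
        problems.append(f"{value!r} is not one of {sorted(allowed)}")
    return problems


# Hypothetical property declarations mirroring the slide's examples.
priority = {"type": "String", "allowed_values": {"Low", "Medium", "High"}}
population = {"type": "Number"}

ok = check_value(priority, "Medium")
bad_choice = check_value(priority, "Urgent")
bad_number = check_value(population, "many")
print(ok, bad_choice, bad_number)
```

Because the declarations live on wiki pages, a community could tighten or relax them by editing, which fits the "schema-last" idea described earlier.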
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in ...
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore Owings and Merrill buildings]]
Category:Skyscrapers in China → Category:Skyscrapers by country
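Class inheritance through categories can be pictured as following subcategory links upward. The sketch below is a hypothetical Python illustration; the link from "Skyscrapers in Shanghai" to "Skyscrapers in China" is an assumption added for the example, and the helper names are not part of any wiki software.

```python
# Child category -> parent categories. The first link is assumed for illustration.
SUBCATEGORY_OF = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}


def ancestors(category, parents=SUBCATEGORY_OF):
    """All categories reachable by following subcategory links upward."""
    seen, stack = set(), [category]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_member(page_categories, target):
    """True if the page is in the target category directly or by inheritance."""
    return any(c == target or target in ancestors(c) for c in page_categories)


jin_mao = ["1998 architecture", "Skyscrapers in Shanghai", "Hotels in Shanghai"]
print(is_member(jin_mao, "Skyscrapers by country"))
```

A page tagged only with the most specific category is still found by a query on a broader one, which is the practical benefit of using categories as classes.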
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
...
}}
Ex: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
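What such a query does can be sketched as a filter over annotated pages. The Python below is an assumption-laden illustration, not SMW's query engine: `ask` and `matches` are hypothetical helpers, the two "Example Tower" pages are invented placeholders, and the comparators are treated as strict, whereas SMW's handling of boundary values may differ depending on configuration.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt}


def matches(page, conditions):
    """Check every (property, condition) pair against one page's annotations."""
    for prop, cond in conditions:
        value = page.get(prop)
        if value is None:
            return False
        if cond[0] in OPS:
            # Comparison condition such as ">50" or "<2000".
            if not OPS[cond[0]](value, float(cond[1:])):
                return False
        elif value != cond:
            return False
    return True


def ask(pages, category, conditions):
    """Return sorted titles of pages in the category that satisfy all conditions."""
    return sorted(
        title
        for title, page in pages.items()
        if category in page["categories"] and matches(page, conditions)
    )


pages = {
    # 88 floors and 1998 come from the Jin Mao Tower text in this deck.
    "Jin Mao Tower": {"categories": {"Skyscrapers"}, "Located in": "China",
                      "Floor count": 88, "Year built": 1998},
    # Hypothetical placeholder pages, invented for the example.
    "Example Tower A": {"categories": {"Skyscrapers"}, "Located in": "China",
                        "Floor count": 40, "Year built": 1995},
    "Example Tower B": {"categories": {"Skyscrapers"}, "Located in": "China",
                        "Floor count": 60, "Year built": 2005},
}

result = ask(pages, "Skyscrapers",
             [("Located in", "China"), ("Floor count", ">50"), ("Year built", "<2000")])
print(result)
```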
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
Usability extension to Semantic MediaWiki
Increases user consensus
Increases use of semantic data
Semantic MediaWiki
Core Semantic Wiki engine
Authoring of explicit knowledge in content
Basic reasoning capabilities
SMW+
Shrink-wrapped suite of open source software products
Comes with a ready-to-use ontology
Easy to procure and install
MediaWiki
Powerful Wiki engine
Basic CMS feature set
SMW Extensions - Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, ...
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, ...
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, ...
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, ...
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, forms & templates
Choose extensions
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity →
← Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (ldquotouchdowns with big hitsrdquo)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application Mini-Olympics
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
- Pro: Very quick to obtain
- Con: Hard to customize
- Con: Expensive
- Examples: Microsoft Project, Version One, Microsoft SharePoint
Custom Development
- Con: Slow to develop
- Pro: Extremely flexible
- Con: High cost to develop and maintain
- Examples: .NET Framework, J2EE, ..., Ruby on Rails
SMW+ Suite
- Pro: Still quick to program
- Pro: Easy to customize
- Pro: Low-moderate cost
- Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate"
    - UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
    - UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    - UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
- Build a "Knowledge Integration Center"
Current Page Designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
HW scores: Full vs. Ablated 0.12, Full vs. Textbook 0.02, Ablated vs. Textbook 0.52
Quiz scores: Full vs. Ablated 0.002, Full vs. Textbook 0.05, Ablated vs. Textbook 0.18
[Figure: bar chart of HW SCORE and QUIZ SCORE for the Inquire, Ablated, and Textbook groups; bar labels 81, 88, 74, 75, 71, 81; group sizes n=24, n=25, n=23]
[Figure: Quiz Score Distribution, % of students (0-50) by grade (A, B, C, D, F) for the Inquire, Ablated, and Textbook groups]
Crowdsourcing for Better Knowledge Acquisition
edit: wow, I can change the web
let's share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
- ~17 edits/page on average in Wikipedia (with high variance)
- Wikipedia's Neutral Point of View
Convention
- Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
Trivial to edit by anyone
Tracking of all changes, one-step rollback
Every article has a "Talk" page for discussion
Notification facility allows anyone to "watch" an article
Sufficient security on pages, logins can be required
A hierarchy of administrators, gardeners, and editors
Software Bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
edit wow I can change the web
letrsquos share and publish knowledge
to make an [[encyclopedia]]
Success of Wikis
27
Actual number of articles on enwikipediaorg (thick
blue line) compared with a Gompertz model that leads
eventually to a maximum of about 44 million articles
(thin green line)
A Key Feature of Wiki
28
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration ndash ~17 editspage on average in
Wikipedia (with high variance)
ndash Wikipediarsquos Neutral Point of View
Convention ndash Users follow customs and
conventions to engage with articles effectively
29
Software Support Makes Wikis Successful
Trivial to edit by anyone Tracking of all changes one-
step rollback Every article has a ldquoTalkrdquo page
for discussion Notification facility allows anyone
to ldquowatchrdquo an article Sufficient security on pages
logins can be required A hierarchy of administrators
gardeners and editors Software Bots recognize certain
kinds of vandalism and auto-revert or recognize articles that need work and flag them for editors
30
However Finding Information Is hellip
Wikipedia has articles abouthellip
bull hellip all movies with cast director
budget running time grosshellip
hellip all German cars with engine
size accelerating datahellip
Can you find
Sci-Fi movies made between
2000-2008 that cost less than
$10000 and made $30000
Skyscrapers with 50+ floors
and built after 2000 in
Shanghai (or Chinese cities
with 1000000+ people)
31
Or German(Porsche) cars that
accelerate from 0-100kmh in 5
seconds
Can Search Solve the Problem
32
Answer is Hidden Deeply in
33
To Find More Info
All Porsche vehicles made in
Germany that accelerate from 1-
100 kmh less than 4 seconds
Sci-Fi movies made between
2000-2008 that cost less than
$10M and gross more than $30M
A map showing where all
Mercedes-Benz vehicles are
manufactured
All skyscrapers in China (Japan
Thailandhellip) of 50 (406070) floors
or more and built in year 2000
(20012002) and after sorted by
built year floorshellip grouped by
cities regionshellip
35
What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
36
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
37
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
ndash CategoryTags Namespaces Title Versioning
Typed Content (built-ins + user created eg categories)
ndash PageCard Date Number URLEmail String hellip
Typed Links (eg properties)
ndash ldquocapital_ofrdquo ldquocontainsrdquo ldquoborn_inrdquohellip
Querying Interface Support
ndash Eg ldquo[[CategoryMember]] [[Agelt30]]rdquo (in SMW)
41
Characteristics of Semantic Wikis
42
What is the Promise of Semantic Wikis
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed user-maintained user-defined
Easy to use as an extension of text authoring
43
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are ldquoSchema-Lastrdquo Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi ndash Knowledge in a Wiki
Knoodl ndash Semantic
Collaboration tool and
application platform
Metaweb - the software that
powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an
extension to MediaWiki that
turns it into a semantic wiki
Swirrl - a spreadsheet-based
semantic wiki application
TaOPis - has a semantic wiki
subsystem based on Frame
logic
TikiWiki CMSGroupware
integrates Semantic links as a
core feature
zAgile Wikidsmart - semantically
enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
ndash Open source (GPL) well documented
ndash Vulcan kicks off SMW+ in 2007
Active development
ndash Commercial support available
ndash World-wide community
International Conferences
ndash Last SMWCon 425-27 2012 in Carlsbad CA
ndash Next SMWCon 1024-26 2012 in Koln Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has locationBeijing]] with
[[Has population30000|about 30 thousands]]
students
In page PropertyHas locationrdquo [[Has typePage]]
In page PropertyHas populationrdquo [[Has typenumber]]
Special Properties
ldquoHas Typerdquo is a pre-defined ldquospecialrdquo property for meta-
data
ndash Example [[Has typeString]]
ldquoAllowed Valuesrdquo is another special property
ndash [[Allows valueLow]]
ndash [[Allows valueMedium]]
ndash [[Allows valueHigh]]
In Halo Extensions there are domain and range support
ndash RDFs expressivity
ndash Semantic Gardening extension also supports ldquoCardinalityrdquo
49
Define Classes
50
Beijing is a city in [[Has
countryChina]] with population
[[Has population2200000]]
[[CategoryCities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in hellip
[[Categories 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore Owings and Merrill buildings]]
CategorySkyscrapers in China Category Skyscrapers by country
Possible Database-style Query over Data
51
ask
[[CategorySkyscrapers]]
[[Located inChina]]
[[Floor countgt50]]
[[Year builtlt2000]]
hellip
Ex Skyscrapers in China higher than
50 stories built before 2000
ASKSPARQL query target
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension Semantic MediaWiki
More Extensions and Applications
52
MediaWiki SMW+
Halo Extension
Usability extension to Semantic MediaWiki
Increases user consensus
Increases use of semantic data
Semantic MediaWiki
Core Semantic Wiki engine
Authoring of explicit knowledge in content
Basic reasoning capabilities
SMW+
Shrink wrap suite of open source software products
Comes with ready to use ontology
Easy to procure and install
MediaWiki
Powerful Wiki engine
Basic CMS feature set
SMW Extensions ndashTowards Great Apps
56
bull Halo Extensions Wiki Object Model Semantic Forms Notification hellip
Data IO
bull Semantic Toolbar Semantic Drilldown Faceted Searchhellip
Query and Browsing
bull Semantic Result Printers Tree View Exhibit Flash chartshellip
Visualization
bull HaloACL Deployment Triplestore Connector Simple Ruleshellip
bull Semantic WikiTags and Subversion Integration extensions
bull Linked Data Extension with R2R and SILK from FUBerlin
Other useful extensions
Semantic Wikipedia
Ultrapedia An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
57
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
62
FROM WIKI TO APPLICATION We Started With a Wiki Ended with an App
63
Build Social Semantic Web Apps
64
Fill in data (content)
Develop schema (ontology)
Design UI skin form amp templates
Choose extensions
65
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity rarr
larr Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic
Entertainment
Wiki
Semantic Seahawks Football Wiki
66
Semantic Entertainment Query Result Highlight Reel
Commercial LookFeel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (ldquotouchdowns with big hitsrdquo)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
72
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application Mini-Olympics
77
78
79
Requirements for Wiki ldquoDevelopersrdquo
One need not
ndash Write code like a hardcore programmer
ndash Design setup RDBMS or make frequent
schema changes
ndash Possess knowledge of a senior system
admin
Instead one need
ndash Configure the wiki with desired extensions
ndash Design and evolve the data model
(schema)
ndash Design Content
bull Customize templates forms styles skin etc
80
We Can Build
81
Effectiveness of SMW as a Platform Choice
Packaged Software
Very quick to
obtain
N Hard to customize
N Expensive
Microsoft Project
Version One
Microsoft
SharePoint
Custom Development
N Slow to develop
Extremely flexible
N High cost to develop
and maintain
NET Framework
J2EE hellip
Ruby on rails
82
SMW+ Suite
Still quick to
program
Easy to customize
Low-moderate cost
Mini-Olympics
Scrum Wiki
RPI map
Neurowiki An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information ndash From data focus (movies) to relationship focus (Neurowiki) ndash A Neurowiki user cannot contribute data without also contributing
relationships
Experimental goals ndash Create an SMW instance for acquiring both instance and relationship
information between data elements ndash Integrate SMW with the Berlin LDIF system
Product goals ndash A community-driven collaborative framework for neuroscience data
integration bull Support for community-built neuroscience vocabularies bull A library of visualizations and simple analytics
ndash Neuroscientist users can createmodify pages and metadata bull Scientists can own the evolving knowledge structure bull Low barrier for scientist to publish data
ndash Support scientific discovery
NEUROWIKI DEMO Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: “Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate”
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
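One way to picture the UTs above as data is a small record that pairs each sentence with its concept, so that statements can be grouped per concept before they are handed to authors. The sketch below is only an illustration: the `UT` record and `group_by_concept` helper are hypothetical names, and the slides do not describe the actual input format used by AURA.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """Hypothetical record: one universally-quantified sentence and its concept."""
    sentence: str
    concept: str


# The three UTs from the glycolysis example on the slide.
UTS = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration, glycolysis is the first step",
       "Cellular Respiration"),
    UT("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules",
       "Glycolysis"),
]


def group_by_concept(uts):
    """Collect UT sentences under the concept they describe."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.sentence)
    return dict(grouped)


by_concept = group_by_concept(UTS)
```

Grouping by concept mirrors how the slide tags each UT with a concept name; any further processing is outside what the slides state.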
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAWiki
– Build a “Knowledge Integration Center”
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
– HW scores: Full vs Ablated 0.12; Full vs Textbook 0.02; Ablated vs Textbook 0.52
– Quiz scores: Full vs Ablated 0.002; Full vs Textbook 0.05; Ablated vs Textbook 0.18
[Chart: HW score and quiz score by group; group sizes n=24, n=25, n=23; value labels 81, 88, 74, 75, 71, 81]
[Chart: Quiz Score Distribution; share of students (axis 0 to 50) per grade A, B, C, D, F for the Inquire, Ablated, and Textbook groups]
Success of Wikis
Actual number of articles on en.wikipedia.org (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)
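As a rough illustration of the Gompertz curve named in the caption, the sketch below evaluates the standard form ceiling * exp(-displacement * exp(-rate * t)). The parameter names and sample values are assumptions chosen for illustration, with the ceiling normalized to 1.0; they are not the fitted values behind the chart.

```python
import math


def gompertz(t, ceiling, displacement, rate):
    """Standard Gompertz curve: slow start, rapid growth, then saturation."""
    return ceiling * math.exp(-displacement * math.exp(-rate * t))


# Illustrative (made-up) parameters, with the ceiling normalized to 1.0.
CEILING, DISPLACEMENT, RATE = 1.0, 5.0, 0.5

# Evaluate the curve at t = 0, 1, ..., 29.
samples = [gompertz(t, CEILING, DISPLACEMENT, RATE) for t in range(30)]
```

The curve rises monotonically and flattens as it approaches the ceiling, which is why such a model predicts an eventual maximum article count.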
A Key Feature of Wiki
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
– ~17 edits/page on average in Wikipedia (with high variance)
– Wikipedia’s Neutral Point of View
Convention
– Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
– Trivial to edit by anyone
– Tracking of all changes, one-step rollback
– Every article has a “Talk” page for discussion
– Notification facility allows anyone to “watch” an article
– Sufficient security on pages; logins can be required
– A hierarchy of administrators, gardeners, and editors
– Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross…
• … all German cars, with engine size, acceleration data…
Can you find
• Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000
• Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)
• Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
• All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
• Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
• A map showing where all Mercedes-Benz vehicles are manufactured
• All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more and built in year 2000 (2001, 2002) and after, sorted by built year, floors, …, grouped by cities, regions, …
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
– Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
– Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
– “capital_of”, “contains”, “born_in”, …
Querying Interface Support
– E.g. “[[Category:Member]] [[Age::<30]]” (in SMW)
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are “Schema-Last”
Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi - Knowledge in a Wiki
Knoodl - Semantic Collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware - integrates Semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
– Open source (GPL), well documented
– Vulcan kicks off SMW+ in 2007
Active development
– Commercial support available
– World-wide community
International Conferences
– Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
– Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page “Property:Has location”: [[Has type::Page]]
In page “Property:Has population”: [[Has type::Number]]
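To make the annotation syntax concrete, here is a minimal Python sketch that pulls property/value pairs out of wikitext written in the `[[Property::value|label]]` form and renders the reader-visible text. It is an illustration only, not how Semantic MediaWiki itself parses pages; the regular expression and the helper names `extract_annotations` and `render` are assumptions made for this example.

```python
import re

# Matches [[Property::value]] or [[Property::value|display label]].
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(page_title, wikitext):
    """Return (subject, property, value) triples found in the wikitext."""
    triples = []
    for match in ANNOTATION.finditer(wikitext):
        prop, value, _label = match.groups()
        triples.append((page_title, prop.strip(), value.strip()))
    return triples


def render(wikitext):
    """Show each annotation as its display label, or as its value if unlabeled."""
    def visible(match):
        label = match.group(3)
        return label if label is not None else match.group(2)
    return ANNOTATION.sub(visible, wikitext)


text = ("Beijing University is located in [[Has location::Beijing]] with "
        "[[Has population::30000|about 30 thousand]] students")
triples = extract_annotations("Beijing University", text)
rendered = render(text)
```

The same sentence therefore serves two audiences: readers see ordinary prose, while the wiki stores one fact per annotation with the page as the subject.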
Special Properties
“Has type” is a pre-defined “special” property for metadata
– Example: [[Has type::String]]
“Allowed Values” is another special property
– [[Allows value::Low]]
– [[Allows value::Medium]]
– [[Allows value::High]]
In Halo Extensions there is domain and range support
– RDFS expressivity
– Semantic Gardening extension also supports “Cardinality”
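The effect of type and allowed-value declarations can be sketched as a small validation step. The Python below is a simplified illustration under stated assumptions: the `PROPERTY_DECLARATIONS` table, the `Priority` property, and the `validate` helper are hypothetical, and accepting undeclared properties is a simplification meant to echo the schema-last idea rather than a description of SMW's internals.

```python
# Hypothetical declarations, standing in for what property pages might declare
# through "Has type" and "Allows value" annotations.
PROPERTY_DECLARATIONS = {
    "Has population": {"type": "Number"},
    "Priority": {"type": "String", "allowed": {"Low", "Medium", "High"}},
}


def validate(prop, raw_value):
    """Check a raw annotation value against the property's declaration."""
    declaration = PROPERTY_DECLARATIONS.get(prop)
    if declaration is None:
        # Simplification: a property with no declaration yet is accepted.
        return True
    if declaration["type"] == "Number":
        try:
            float(raw_value)
        except ValueError:
            return False
    allowed = declaration.get("allowed")
    return allowed is None or raw_value in allowed


checks = {
    ("Has population", "30000"): validate("Has population", "30000"),
    ("Has population", "many"): validate("Has population", "many"),
    ("Priority", "Low"): validate("Priority", "Low"),
    ("Priority", "Urgent"): validate("Priority", "Urgent"),
}
```

Because the declarations live alongside the data, tightening a property later only means editing its declaration, not migrating a database schema.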
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
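Class inheritance through categories can be illustrated by walking up a category hierarchy. In the sketch below, the `PARENTS` table and the `all_categories` helper are hypothetical, and the parent links themselves are assumptions: the slide only names the categories, so placing "Skyscrapers in Shanghai" under "Skyscrapers in China" under "Skyscrapers by country" is a guess made for the example.

```python
# Assumed parent links between the categories named on the slide.
PARENTS = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}


def all_categories(direct_categories):
    """Return the direct categories plus every ancestor reachable from them."""
    seen = set()
    stack = list(direct_categories)
    while stack:
        category = stack.pop()
        if category not in seen:
            seen.add(category)
            stack.extend(PARENTS.get(category, []))
    return seen


jin_mao_classes = all_categories(["Skyscrapers in Shanghai", "Hotels in Shanghai"])
```

Under these assumed links, a page tagged only with "Skyscrapers in Shanghai" also counts as a member of the broader skyscraper categories, which is the inheritance behaviour the slide refers to.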
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex.: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
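The query above can be read as a conjunction of a category test and property comparisons. The Python sketch below evaluates such a conjunction over in-memory page records to show the semantics; it is not SMW's query engine. The `ask` helper, the record layout, and the "Example Tower" entry are hypothetical, and the Jin Mao Tower values follow the slide's "88-story" and "1998 architecture" details. The sketch treats `>` and `<` as strict, whereas SMW's own handling of these comparators may depend on wiki configuration.

```python
import operator

# Comparators supported by this sketch (strict comparisons).
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def ask(pages, category, conditions):
    """Return titles of pages in the category that satisfy every condition."""
    results = []
    for page in pages:
        if category not in page["categories"]:
            continue
        props = page["props"]
        if all(prop in props and OPS[op](props[prop], value)
               for prop, op, value in conditions):
            results.append(page["title"])
    return results


pages = [
    {"title": "Jin Mao Tower", "categories": {"Skyscrapers"},
     "props": {"Located in": "China", "Floor count": 88, "Year built": 1998}},
    # Hypothetical page, included only to show a non-matching result.
    {"title": "Example Tower", "categories": {"Skyscrapers"},
     "props": {"Located in": "China", "Floor count": 60, "Year built": 2005}},
]

query = [("Located in", "=", "China"),
         ("Floor count", ">", 50),
         ("Year built", "<", 2000)]
matches = ask(pages, "Skyscrapers", query)
```

Each bracketed condition in the wiki query corresponds to one tuple here, so adding a constraint means adding one more line rather than rewriting the query.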
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
Usability extension to Semantic MediaWiki
Increases user consensus
Increases use of semantic data
Semantic MediaWiki
Core Semantic Wiki engine
Authoring of explicit knowledge in content
Basic reasoning capabilities
SMW+
Shrink-wrapped suite of open source software products
Comes with a ready-to-use ontology
Easy to procure and install
MediaWiki
Powerful Wiki engine
Basic CMS feature set
SMW Extensions – Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, form & templates
Choose extensions
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity →
← Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
Commercial Look and Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (“touchdowns with big hits”)
Tree-based navigation widget
Very favorable economics
Consensus in Wikis Comes from
Collaboration ndash ~17 editspage on average in
Wikipedia (with high variance)
ndash Wikipediarsquos Neutral Point of View
Convention ndash Users follow customs and
conventions to engage with articles effectively
29
Software Support Makes Wikis Successful
Trivial to edit by anyone Tracking of all changes one-
step rollback Every article has a ldquoTalkrdquo page
for discussion Notification facility allows anyone
to ldquowatchrdquo an article Sufficient security on pages
logins can be required A hierarchy of administrators
gardeners and editors Software Bots recognize certain
kinds of vandalism and auto-revert or recognize articles that need work and flag them for editors
30
However Finding Information Is hellip
Wikipedia has articles abouthellip
bull hellip all movies with cast director
budget running time grosshellip
hellip all German cars with engine
size accelerating datahellip
Can you find
Sci-Fi movies made between
2000-2008 that cost less than
$10000 and made $30000
Skyscrapers with 50+ floors
and built after 2000 in
Shanghai (or Chinese cities
with 1000000+ people)
31
Or German(Porsche) cars that
accelerate from 0-100kmh in 5
seconds
Can Search Solve the Problem
32
Answer is Hidden Deeply in
33
To Find More Info
All Porsche vehicles made in
Germany that accelerate from 1-
100 kmh less than 4 seconds
Sci-Fi movies made between
2000-2008 that cost less than
$10M and gross more than $30M
A map showing where all
Mercedes-Benz vehicles are
manufactured
All skyscrapers in China (Japan
Thailandhellip) of 50 (406070) floors
or more and built in year 2000
(20012002) and after sorted by
built year floorshellip grouped by
cities regionshellip
35
What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
36
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
37
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
ndash CategoryTags Namespaces Title Versioning
Typed Content (built-ins + user created eg categories)
ndash PageCard Date Number URLEmail String hellip
Typed Links (eg properties)
ndash ldquocapital_ofrdquo ldquocontainsrdquo ldquoborn_inrdquohellip
Querying Interface Support
ndash Eg ldquo[[CategoryMember]] [[Agelt30]]rdquo (in SMW)
41
Characteristics of Semantic Wikis
42
What is the Promise of Semantic Wikis
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed user-maintained user-defined
Easy to use as an extension of text authoring
43
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are ldquoSchema-Lastrdquo Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi ndash Knowledge in a Wiki
Knoodl ndash Semantic
Collaboration tool and
application platform
Metaweb - the software that
powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an
extension to MediaWiki that
turns it into a semantic wiki
Swirrl - a spreadsheet-based
semantic wiki application
TaOPis - has a semantic wiki
subsystem based on Frame
logic
TikiWiki CMSGroupware
integrates Semantic links as a
core feature
zAgile Wikidsmart - semantically
enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
ndash Open source (GPL) well documented
ndash Vulcan kicks off SMW+ in 2007
Active development
ndash Commercial support available
ndash World-wide community
International Conferences
ndash Last SMWCon 425-27 2012 in Carlsbad CA
ndash Next SMWCon 1024-26 2012 in Koln Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has locationBeijing]] with
[[Has population30000|about 30 thousands]]
students
In page PropertyHas locationrdquo [[Has typePage]]
In page PropertyHas populationrdquo [[Has typenumber]]
Special Properties
ldquoHas Typerdquo is a pre-defined ldquospecialrdquo property for meta-
data
ndash Example [[Has typeString]]
ldquoAllowed Valuesrdquo is another special property
ndash [[Allows valueLow]]
ndash [[Allows valueMedium]]
ndash [[Allows valueHigh]]
In Halo Extensions there are domain and range support
ndash RDFs expressivity
ndash Semantic Gardening extension also supports ldquoCardinalityrdquo
49
Define Classes
Beijing is a city in [[Has country::China]] with population
[[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
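Because categories can themselves belong to other categories, a page inherits every ancestor of the categories it is tagged with. The sketch below walks that hierarchy in Python. It assumes, purely for illustration, that "Skyscrapers in Shanghai" sits under "Skyscrapers in China", which sits under "Skyscrapers by country"; the dictionaries and the `all_categories` helper are hypothetical.

```python
# Direct parent categories of each category (assumed for illustration).
PARENTS = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
    "Skyscrapers by country": [],
}

# Categories stated directly on a page.
PAGE_CATEGORIES = {
    "Jin Mao Tower": ["Skyscrapers in Shanghai"],
}


def all_categories(page: str) -> set[str]:
    """Return the page's direct categories plus every inherited ancestor."""
    seen: set[str] = set()
    stack = list(PAGE_CATEGORIES.get(page, []))
    while stack:
        cat = stack.pop()
        if cat in seen:          # guards against cycles in the category graph
            continue
        seen.add(cat)
        stack.extend(PARENTS.get(cat, []))
    return seen


print(sorted(all_categories("Jin Mao Tower")))
```

A query for members of a parent category can then match pages that were only ever tagged with one of its subcategories.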
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
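The query above is a conjunction: one category condition plus one condition per property. As a sketch of that logic only, the Python below evaluates the same four conditions over a few in-memory records. The `Page` class, the `ask` function and the "Tower A" to "Tower D" records are invented for this example and do not describe real buildings or SMW internals.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    categories: set[str]
    properties: dict[str, object]


# Made-up sample pages; the values are illustrative, not real building data.
PAGES = [
    Page("Tower A", {"Skyscrapers"}, {"Located in": "China", "Floor count": 88, "Year built": 1998}),
    Page("Tower B", {"Skyscrapers"}, {"Located in": "China", "Floor count": 101, "Year built": 2008}),
    Page("Tower C", {"Skyscrapers"}, {"Located in": "Japan", "Floor count": 70, "Year built": 1993}),
    Page("Tower D", {"Skyscrapers"}, {"Located in": "China", "Floor count": 45, "Year built": 1990}),
]


def ask(pages, category, conditions):
    """Return titles of pages in `category` whose properties satisfy every condition.

    `conditions` maps a property name to a predicate over that property's value.
    """
    results = []
    for page in pages:
        if category not in page.categories:
            continue
        if all(prop in page.properties and test(page.properties[prop])
               for prop, test in conditions.items()):
            results.append(page.title)
    return results


matches = ask(PAGES, "Skyscrapers", {
    "Located in": lambda v: v == "China",
    "Floor count": lambda v: v > 50,     # strict comparison, following the slide's wording
    "Year built": lambda v: v < 2000,
})
print(matches)
```

The sketch uses strict comparisons to match the English description. How SMW itself treats values exactly at the boundary of > and < depends on its configuration, so check the wiki's settings when the threshold matters.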
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
– Usability extension to Semantic MediaWiki
– Increases user consensus
– Increases use of semantic data
Semantic MediaWiki
– Core semantic wiki engine
– Authoring of explicit knowledge in content
– Basic reasoning capabilities
SMW+
– Shrink-wrapped suite of open source software products
– Comes with a ready-to-use ontology
– Easy to procure and install
MediaWiki
– Powerful wiki engine
– Basic CMS feature set
SMW Extensions - Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended With an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, form & templates
Choose extensions
Video: Semantic Wikis for a New Problem (2008)
Social tag-based characterization
– Keyword search over tag data
– Inconsistent semantics
– Easy to engineer
Algorithm-based object characterization
– Database-style search
– Consistent semantics
– Extremely difficult to engineer
Social database-style characterization
– Database search + wiki text search
– Semantic consistency via wiki mechanisms
– Easy to engineer
Increasing technical complexity →
← Increasing user participation
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment: Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (“touchdowns with big hits”)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki “Developers”
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead, one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
• Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
– Pro: Very quick to obtain
– Con: Hard to customize
– Con: Expensive
– Examples: Microsoft Project, Version One, Microsoft SharePoint
Custom Development
– Con: Slow to develop
– Pro: Extremely flexible
– Con: High cost to develop and maintain
– Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
– Pro: Still quick to program
– Pro: Easy to customize
– Pro: Low-moderate cost
– Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: “Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate”
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
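One way to picture the UTs above is as short statements, each tagged with the concept it belongs to, so that everything contributed about a concept can be handed to an author in one batch. The Python sketch below shows that grouping. The `UT` record and `group_by_concept` function are assumptions made for illustration; how the AURA authoring process actually consumes UTs is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """One crowd-authored statement tagged with the concept it describes."""
    text: str
    concept: str


uts = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration, glycolysis is the first step", "Cellular Respiration"),
    UT("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules", "Glycolysis"),
]


def group_by_concept(statements):
    """Bucket statements by concept so each concept's inputs can be reviewed together."""
    grouped = defaultdict(list)
    for ut in statements:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


by_concept = group_by_concept(uts)
for concept, texts in by_concept.items():
    print(concept, len(texts))
```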
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAwiki
– Build a “Knowledge Integration Center”
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
– HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
– Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
[Bar chart: HW score and quiz score by group (Inquire, Ablated, Textbook); group sizes n=24, n=25, n=23; bar labels as extracted: 81, 88, 74, 75, 71, 81; grade bands A to D]
[Chart: Quiz Score Distribution; x-axis grades A B C D F, y-axis "of students" from 0 to 50; series Inquire, Ablated, Textbook]
Software Support Makes Wikis Successful
– Trivial to edit by anyone
– Tracking of all changes, one-step rollback
– Every article has a “Talk” page for discussion
– Notification facility allows anyone to “watch” an article
– Sufficient security on pages; logins can be required
– A hierarchy of administrators, gardeners, and editors
– Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
However, Finding Information Is …
Wikipedia has articles about…
• … all movies, with cast, director, budget, running time, gross, …
• … all German cars, with engine size, acceleration data, …
Can you find
• Sci-Fi movies made between 2000-2008 that cost less than $10,000 and made $30,000?
• Skyscrapers with 50+ floors and built after 2000 in Shanghai (or Chinese cities with 1,000,000+ people)?
• Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds?
Can Search Solve the Problem?
Answer is Hidden Deeply in
To Find More Info
• All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
• Sci-Fi movies made between 2000-2008 that cost less than $10M and gross more than $30M
• A map showing where all Mercedes-Benz vehicles are manufactured
• All skyscrapers in China (Japan, Thailand, …) of 50 (40, 60, 70) floors or more and built in year 2000 (2001, 2002) and after, sorted by built year, floors, …, grouped by cities, regions, …
What is a Semantic Wiki?
A wiki that has an underlying model of the knowledge described in its pages
– Allows users to make their knowledge explicit and formal
– Semantic Web compatible
Semantic Wiki: Two Perspectives
– Wikis for Metadata
– Metadata for Wikis
A SIMPLE EXAMPLE: SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki, with regular wiki features
– Category/Tags, Namespaces, Title, Versioning
Typed Content (built-ins + user created, e.g. categories)
– Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
– “capital_of”, “contains”, “born_in”, …
Querying Interface Support
– E.g. “[[Category:Member]] [[Age::<30]]” (in SMW)
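A typed link can be read as a (subject, property, object) triple, which is what lets a semantic wiki follow links by meaning and not just by target page. The Python sketch below stores a few such triples and follows them in both directions. The `TRIPLES` list and the two helper functions are hypothetical; real semantic wikis keep this data in a database or triple store.

```python
# Each typed link is stored as a (subject, property, object) triple.
# The facts below are familiar examples chosen for illustration.
TRIPLES = [
    ("Beijing", "capital_of", "China"),
    ("Berlin", "capital_of", "Germany"),
    ("China", "contains", "Shanghai"),
    ("Paul Allen", "born_in", "Seattle"),
]


def objects(subject: str, prop: str) -> list[str]:
    """Follow a typed link forward: what does `subject` point to via `prop`?"""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


def subjects(prop: str, obj: str) -> list[str]:
    """Follow a typed link backward: which pages point to `obj` via `prop`?"""
    return [s for s, p, o in TRIPLES if p == prop and o == obj]


print(objects("Beijing", "capital_of"))
print(subjects("capital_of", "Germany"))
```

An untyped wiki link records only that two pages are connected; the property name is what makes the backward lookup meaningful.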
Characteristics of Semantic Wikis
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate consensus over data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are “Schema-Last”
– Databases require DBAs and schema design
– Semantic Wikis develop and maintain the schema in the wiki
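The contrast with a conventional database can be sketched in a few lines of Python: in a schema-last store, a property exists as soon as someone uses it, and the schema is whatever the data currently implies. The `SchemaLastStore` class and the "Has time zone" property are invented for this illustration and are not how SMW stores data internally.

```python
class SchemaLastStore:
    """Toy store where the schema is derived from the data, not declared up front."""

    def __init__(self):
        self.pages = {}    # title -> {property: value}
        self.schema = {}   # property -> set of observed Python type names

    def annotate(self, title, prop, value):
        """Record a fact; a property comes into existence the first time it is used."""
        self.pages.setdefault(title, {})[prop] = value
        self.schema.setdefault(prop, set()).add(type(value).__name__)


store = SchemaLastStore()
store.annotate("Beijing", "Has country", "China")
store.annotate("Beijing", "Has population", 2200000)
# A contributor later introduces a new property; no migration step is needed.
store.annotate("Beijing", "Has time zone", "UTC+8")

print(sorted(store.schema))
print(store.schema["Has population"])
```

In a schema-first database the third call would need an ALTER TABLE or a new table before the fact could be stored; here the community can tighten the schema later, once usage has settled.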
Can Search Solve the Problem
32
Answer is Hidden Deeply in
33
To Find More Info
All Porsche vehicles made in
Germany that accelerate from 1-
100 kmh less than 4 seconds
Sci-Fi movies made between
2000-2008 that cost less than
$10M and gross more than $30M
A map showing where all
Mercedes-Benz vehicles are
manufactured
All skyscrapers in China (Japan
Thailandhellip) of 50 (406070) floors
or more and built in year 2000
(20012002) and after sorted by
built year floorshellip grouped by
cities regionshellip
35
What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
36
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
37
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
ndash CategoryTags Namespaces Title Versioning
Typed Content (built-ins + user created eg categories)
ndash PageCard Date Number URLEmail String hellip
Typed Links (eg properties)
ndash ldquocapital_ofrdquo ldquocontainsrdquo ldquoborn_inrdquohellip
Querying Interface Support
ndash Eg ldquo[[CategoryMember]] [[Agelt30]]rdquo (in SMW)
41
Characteristics of Semantic Wikis
42
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
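As a rough illustration of the schema-last idea, the sketch below records facts first and derives the schema from them afterwards. The `SchemaLastStore` class and its method names are hypothetical, invented for this example; they are not part of SMW or any other semantic wiki.

```python
from collections import defaultdict


class SchemaLastStore:
    """Toy fact store with no schema declared up front (hypothetical, not an SMW API)."""

    def __init__(self):
        # page title -> property name -> list of values
        self.facts = defaultdict(lambda: defaultdict(list))

    def annotate(self, page, prop, value):
        """Record a fact; a property that has never been seen is simply accepted."""
        self.facts[page][prop].append(value)

    def schema(self):
        """Derive the current schema: each property mapped to the value types seen so far."""
        seen = defaultdict(set)
        for props in self.facts.values():
            for prop, values in props.items():
                for value in values:
                    seen[prop].add(type(value).__name__)
        return {prop: sorted(types) for prop, types in seen.items()}


store = SchemaLastStore()
store.annotate("Beijing", "Has country", "China")
store.annotate("Beijing", "Has population", 2200000)
# A new property appears later without any migration step.
store.annotate("Jin Mao Tower", "Floor count", 88)

print(store.schema())
```

The point of the sketch is only the ordering: data arrives before any schema exists, and the schema is something the community can inspect and refine later.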
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi - Knowledge in a Wiki
Knoodl - Semantic Collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware - integrates Semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
- Open source (GPL), well documented
- Vulcan kicks off SMW+ in 2007
Active development
- Commercial support available
- World-wide community
International Conferences
- Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
- Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
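SMW inline annotations take the form `[[Property::Value]]`, optionally followed by `|display text`. The sketch below shows, under simplifying assumptions, how such annotations can be read as (page, property, value) triples and how the reader-visible text falls out. The regular expression and the function names are illustrative only; this is not how SMW's own parser is implemented, and it ignores nested links and multi-valued annotations.

```python
import re

# Matches [[Property::Value]] or [[Property::Value|display text]]
ANNOTATION = re.compile(r"\[\[([^\[\]|:]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_triples(page, wikitext):
    """Return a (page, property, value) triple for every inline annotation."""
    return [
        (page, m.group(1).strip(), m.group(2).strip())
        for m in ANNOTATION.finditer(wikitext)
    ]


def render(wikitext):
    """Replace each annotation with its display text, or its value if none is given."""
    return ANNOTATION.sub(
        lambda m: m.group(3) if m.group(3) is not None else m.group(2),
        wikitext,
    )


text = (
    "Beijing University is located in [[Has location::Beijing]] with "
    "[[Has population::30000|about 30 thousand]] students"
)

print(extract_triples("Beijing University", text))
print(render(text))
```

The same sentence thus serves two audiences: readers see ordinary prose, while queries see typed facts.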
Special Properties
"Has type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allowed Values" is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
In Halo Extensions there is domain and range support
- RDFS expressivity
- Semantic Gardening extension also supports "Cardinality"
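To make the role of these special properties concrete, here is a minimal sketch of how a declared type and a list of allowed values could be used to check an incoming annotation. The `PROPERTY_DEFS` table, the `validate` function, and the property name "Has priority" are all assumptions made for the example; SMW performs its own checks internally and does not expose this function.

```python
PROPERTY_DEFS = {
    # "Has priority" is a hypothetical property invented for this example.
    "Has priority": {"type": "String", "allowed": ["Low", "Medium", "High"]},
    "Has population": {"type": "Number"},
}


def validate(prop, raw_value, prop_defs=PROPERTY_DEFS):
    """Return the typed value if it satisfies the property definition, else raise ValueError."""
    definition = prop_defs[prop]
    value = raw_value
    if definition["type"] == "Number":
        try:
            value = float(raw_value)
        except ValueError:
            raise ValueError(f"{prop}: {raw_value!r} is not a number") from None
    allowed = definition.get("allowed")
    if allowed is not None and value not in allowed:
        raise ValueError(f"{prop}: {raw_value!r} is not one of {allowed}")
    return value


print(validate("Has priority", "Medium"))
print(validate("Has population", "30000"))
try:
    validate("Has priority", "Urgent")
except ValueError as err:
    print("rejected:", err)
```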
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
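Class inheritance through categories can be sketched as a walk up the subcategory links. In the example below, the parent links are assumptions made for illustration (in particular that "Skyscrapers in Shanghai" sits under "Skyscrapers in China", which in turn sits under "Skyscrapers by country"); the function names are hypothetical as well.

```python
def ancestors(category, parents):
    """Collect every category reachable by following subcategory-of links upward."""
    found, stack = set(), [category]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_instance_of(page, category, page_categories, parents):
    """A page belongs to a class if any of its categories is that class or a descendant of it."""
    return any(
        direct == category or category in ancestors(direct, parents)
        for direct in page_categories.get(page, ())
    )


# Assumed subcategory links, for illustration only.
PARENTS = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}
PAGE_CATEGORIES = {
    "Jin Mao Tower": ["Skyscrapers in Shanghai", "Hotels in Shanghai"],
}

print(is_instance_of("Jin Mao Tower", "Skyscrapers by country", PAGE_CATEGORIES, PARENTS))
print(is_instance_of("Jin Mao Tower", "Cities", PAGE_CATEGORIES, PARENTS))
```

Tagging a page with one specific category is enough for it to be found by queries over any of the broader classes above it.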
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex.: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
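The query above reads as a conjunction of conditions over a category and typed properties. A minimal sketch of that evaluation in plain Python follows. The `ask` function and the page records are assumptions made for illustration: only the Jin Mao Tower figures (88 stories, "1998 architecture") come from the slides, and the three "Example Tower" entries are invented test data. The comparisons here are strict, following the slide's wording; an actual SMW installation may treat `>` and `<` as inclusive depending on its configuration.

```python
import operator

OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def ask(pages, category, conditions):
    """Return titles of pages in `category` that satisfy every (property, op, value) condition."""
    results = []
    for title, page in pages.items():
        if category not in page["categories"]:
            continue
        props = page["properties"]
        if all(
            prop in props and OPS[op](props[prop], value)
            for prop, op, value in conditions
        ):
            results.append(title)
    return sorted(results)


PAGES = {
    "Jin Mao Tower": {
        "categories": {"Skyscrapers"},
        "properties": {"Located in": "China", "Floor count": 88, "Year built": 1998},
    },
    # The entries below are invented purely to exercise the filter.
    "Example Tower A": {
        "categories": {"Skyscrapers"},
        "properties": {"Located in": "China", "Floor count": 40, "Year built": 1995},
    },
    "Example Tower B": {
        "categories": {"Skyscrapers"},
        "properties": {"Located in": "China", "Floor count": 60, "Year built": 2005},
    },
    "Example Tower C": {
        "categories": {"Skyscrapers"},
        "properties": {"Located in": "Japan", "Floor count": 70, "Year built": 1990},
    },
}

query = [("Located in", "=", "China"), ("Floor count", ">", 50), ("Year built", "<", 2000)]
print(ask(PAGES, "Skyscrapers", query))
```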
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
- Usability extension to Semantic MediaWiki
- Increases user consensus
- Increases use of semantic data
Semantic MediaWiki
- Core Semantic Wiki engine
- Authoring of explicit knowledge in content
- Basic reasoning capabilities
SMW+
- Shrink-wrapped suite of open source software products
- Comes with ready-to-use ontology
- Easy to procure and install
MediaWiki
- Powerful Wiki engine
- Basic CMS feature set
SMW Extensions - Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, form & templates
Choose extensions
Video: Semantic Wikis for A New Problem (2008)
Social tag-based characterization
- Keyword search over tag data
- Inconsistent semantics
- Easy to engineer
Algorithm-based object characterization
- Database-style search
- Consistent semantics
- Extremely difficult to engineer
Semantic Entertainment Wiki: social database-style characterization
- Database search + wiki text search
- Semantic consistency via wiki mechanisms
- Easy to engineer
Axes: Increasing technical complexity →, ← Increasing User Participation
Semantic Seahawks Football Wiki
Semantic Entertainment: Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns ("touchdowns with big hits")
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software (Microsoft Project, Version One, Microsoft SharePoint)
- Pro: Very quick to obtain
- Con: Hard to customize
- Con: Expensive
Custom Development (.NET Framework, J2EE, Ruby on Rails, …)
- Con: Slow to develop
- Pro: Extremely flexible
- Con: High cost to develop and maintain
SMW+ Suite (Mini-Olympics, Scrum Wiki, RPI map)
- Pro: Still quick to program
- Pro: Easy to customize
- Pro: Low-moderate cost
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate"
    - UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
    - UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    - UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
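One way to picture the UT data that such a wiki would hand to the authoring process is a list of sentence records keyed by concept. The sketch below is only an assumed representation: the `UT` dataclass and `group_by_concept` helper are hypothetical names, and the slides do not describe the actual AURA input format. The three records are the UTs from the glycolysis example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """One universally-quantified sentence and the concept it is filed under."""
    text: str
    concept: str


def group_by_concept(uts):
    """Bundle UT sentences per concept, preserving the order in which they were contributed."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


uts = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration glycolysis is the first step", "Cellular Respiration"),
    UT("During glycolysis breakdown of 1 glucose results in 2 pyruvate molecules", "Glycolysis"),
]

for concept, sentences in group_by_concept(uts).items():
    print(concept, "->", sentences)
```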
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
- Build a "Knowledge Integration Center"
Current Page Designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-value from 2 tailed t-tests HW scores Full vs Ablated 012 Full vs textbook 002 Ablated vs Text 052
Quiz scores Full vs Ablated 0002 Full vs textbook 005 Ablated vs Text 018
81
88
74 75 71
81
HW SCORE QUIZ SCORE
A
B
C
D
n=24
n=25
n=23
0
10
20
30
40
50
A B C D F
o
f st
ud
ents
Inquire Ablated Textbook
textbook
Quiz Score Distribution
What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
36
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
37
A SIMPLE EXAMPLE SEMANTIC MOVIES
Sci-Fi Movie Wiki Live Demo
Basics of Semantic Wikis
Still a wiki with regular wiki features
ndash CategoryTags Namespaces Title Versioning
Typed Content (built-ins + user created eg categories)
ndash PageCard Date Number URLEmail String hellip
Typed Links (eg properties)
ndash ldquocapital_ofrdquo ldquocontainsrdquo ldquoborn_inrdquohellip
Querying Interface Support
ndash Eg ldquo[[CategoryMember]] [[Agelt30]]rdquo (in SMW)
41
Characteristics of Semantic Wikis
42
What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are "Schema-Last"
Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn – Ruby on Rails-based
KiWi – Knowledge in a Wiki
Knoodl – Semantic Collaboration tool and application platform
Metaweb – the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki – an extension to MediaWiki that turns it into a semantic wiki
Swirrl – a spreadsheet-based semantic wiki application
TaOPis – has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware – integrates Semantic links as a core feature
zAgile Wikidsmart – semantically enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
– Open source (GPL), well documented
– Vulcan kicks off SMW+ in 2007
Active development
– Commercial support available
– World-wide community
International Conferences
– Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
– Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page "Property:Has location": [[Has type::Page]]
In page "Property:Has population": [[Has type::Number]]
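The annotation markup above can be read mechanically. The following is a minimal Python sketch, assuming the standard SMW in-text form `[[Property::value|display text]]`; the function name `extract_annotations` and the regular expression are illustrative only and are not part of SMW itself.

```python
import re

# Matches [[Property::value]] and [[Property::value|display text]].
# Plain links such as [[Beijing]] or [[Category:Cities]] contain no "::"
# and are therefore skipped.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(wikitext):
    """Return (property, value, display) triples found in a wikitext string."""
    triples = []
    for match in ANNOTATION.finditer(wikitext):
        prop, value, display = match.groups()
        # Without alternate text, the value itself is what readers see.
        triples.append((prop.strip(), value.strip(), (display or value).strip()))
    return triples


text = (
    "Beijing University is located in [[Has location::Beijing]] with "
    "[[Has population::30000|about 30 thousand]] students"
)
for triple in extract_annotations(text):
    print(triple)
```

Each triple pairs the page being edited (the subject) with a property and a value, which is the data a semantic wiki stores alongside the readable text.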
Special Properties
"Has type" is a pre-defined "special" property for metadata
– Example: [[Has type::String]]
"Allowed Values" is another special property
– [[Allows value::Low]]
– [[Allows value::Medium]]
– [[Allows value::High]]
The Halo Extensions add domain and range support
– RDFS expressivity
– Semantic Gardening extension also supports "Cardinality"
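How type and allowed-value declarations constrain data can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SMW's implementation: the dictionary `PROPERTY_DECLARATIONS`, the property name "Has priority", and the function `check_value` are all hypothetical.

```python
# Hypothetical in-memory copy of what the Property: pages declare.
# "Has priority" is an invented property used only for illustration.
PROPERTY_DECLARATIONS = {
    "Has population": {"type": "Number"},
    "Has priority": {"type": "String", "allowed": {"Low", "Medium", "High"}},
}


def is_number(text):
    """True if the text parses as a number."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def check_value(prop, value):
    """Return a list of problems; an empty list means the value is acceptable."""
    decl = PROPERTY_DECLARATIONS.get(prop)
    if decl is None:
        # Design choice of this sketch: undeclared properties are accepted,
        # in the "schema-last" spirit described earlier.
        return []
    problems = []
    if decl.get("type") == "Number" and not is_number(value):
        problems.append(f"{prop}: {value!r} is not a number")
    allowed = decl.get("allowed")
    if allowed is not None and value not in allowed:
        problems.append(f"{prop}: {value!r} is not one of {sorted(allowed)}")
    return problems


print(check_value("Has priority", "Urgent"))
print(check_value("Has population", "30000"))
```

Because the declarations live in ordinary wiki pages, tightening or loosening a constraint is itself a wiki edit rather than a database migration.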
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
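Class inheritance through categories amounts to following subcategory links upward. The Python sketch below illustrates the idea; the `PARENTS` hierarchy is an assumption loosely modelled on the category names on the slide, and the helper names are hypothetical.

```python
# Hypothetical subcategory links (child -> parents). The exact hierarchy is
# an assumption made for illustration; the slide only names the categories.
PARENTS = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
    "Skyscrapers by country": ["Skyscrapers"],
}


def ancestors(category):
    """All categories reachable by following parent links (cycle-safe)."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in PARENTS.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_member(page_categories, target):
    """True if the page is in `target` directly or through a subcategory."""
    return any(c == target or target in ancestors(c) for c in page_categories)


# Three of the Jin Mao Tower categories listed on the slide.
jin_mao = ["1998 architecture", "Skyscrapers in Shanghai", "Hotels in Shanghai"]
print(is_member(jin_mao, "Skyscrapers"))
```

A page tagged only with a specific category is thereby also an instance of every broader class above it, which is what makes categories convenient as classes.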
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex.: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
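The query above is a conjunction of a category test and property comparisons. A minimal Python sketch of that evaluation over in-memory records follows. It is not how SMW executes queries; `PAGES`, `CONDITIONS`, and `ask` are hypothetical names, the second record is invented, and the Jin Mao Tower year is taken from its "1998 architecture" category as an assumption. The sketch uses strict comparisons to match the slide's wording; SMW's own comparator semantics may differ depending on configuration.

```python
import operator

PAGES = [
    # Floor count comes from the slide text; the year is assumed from the
    # "1998 architecture" category.
    {"title": "Jin Mao Tower", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 88, "Year built": 1998},
    # Purely illustrative record, not a real building.
    {"title": "Example Tower", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 40, "Year built": 2005},
]

# One (property, comparison, wanted value) triple per query condition.
CONDITIONS = [
    ("Located in", operator.eq, "China"),
    ("Floor count", operator.gt, 50),
    ("Year built", operator.lt, 2000),
]


def ask(pages, category, conditions):
    """Return titles of pages in `category` that satisfy every condition."""
    results = []
    for page in pages:
        if category not in page["categories"]:
            continue
        if all(prop in page and test(page[prop], wanted)
               for prop, test, wanted in conditions):
            results.append(page["title"])
    return results


print(ask(PAGES, "Skyscrapers", CONDITIONS))
```

Pages lacking a queried property simply fail that condition, so incomplete pages drop out of the result instead of causing an error.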
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
Usability extension to Semantic MediaWiki
Increases user consensus
Increases use of semantic data
Semantic MediaWiki
Core Semantic Wiki engine
Authoring of explicit knowledge in content
Basic reasoning capabilities
SMW+
Shrink-wrapped suite of open source software products
Comes with a ready-to-use ontology
Easy to procure and install
MediaWiki
Powerful Wiki engine
Basic CMS feature set
SMW Extensions – Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended With an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, form & templates
Choose extensions
Video: Semantic Wikis for a New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity →
← Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns ("touchdowns with big hits")
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki "Developers"
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead, one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
• Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
– Very quick to obtain
– ✗ Hard to customize
– ✗ Expensive
– Microsoft Project, Version One, Microsoft SharePoint
Custom Development
– ✗ Slow to develop
– Extremely flexible
– ✗ High cost to develop and maintain
– .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
– Still quick to program
– Easy to customize
– Low-moderate cost
– Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate"
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAWiki
– Build a "Knowledge Integration Center"
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
– HW scores: Full vs Ablated 0.12, Full vs Textbook 0.02, Ablated vs Textbook 0.52
– Quiz scores: Full vs Ablated 0.002, Full vs Textbook 0.05, Ablated vs Textbook 0.18
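For reference, a two-tailed t-test between two groups can be written as follows. This is a sketch assuming independent samples with unequal variances (Welch's form); the slides do not say which variant was used. Here the sample means, sample variances, and group sizes of the two groups are denoted by x-bar, s squared, and n with subscripts 1 and 2, and T with subscript nu is a t-distributed variable with nu degrees of freedom.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}},
\qquad
p = 2\,\Pr\left(T_\nu \ge |t|\right)
```

A small p-value indicates that a difference in mean scores as large as the observed one would be unlikely if the two groups performed equally.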
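[Figure: HW score and quiz score by group (Inquire, Ablated, Textbook); group sizes n=24, n=25, n=23; bar labels as extracted: 81, 88, 74, 75, 71, 81]
[Figure: Quiz Score Distribution; x-axis: grades A, B, C, D, F; y-axis: of students, 0 to 50; series: Inquire, Ablated, Textbook]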
Basics of Semantic Wikis
Still a wiki with regular wiki features
ndash CategoryTags Namespaces Title Versioning
Typed Content (built-ins + user created eg categories)
ndash PageCard Date Number URLEmail String hellip
Typed Links (eg properties)
ndash ldquocapital_ofrdquo ldquocontainsrdquo ldquoborn_inrdquohellip
Querying Interface Support
ndash Eg ldquo[[CategoryMember]] [[Agelt30]]rdquo (in SMW)
41
Characteristics of Semantic Wikis
42
What is the Promise of Semantic Wikis
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed user-maintained user-defined
Easy to use as an extension of text authoring
43
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are ldquoSchema-Lastrdquo Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi ndash Knowledge in a Wiki
Knoodl ndash Semantic
Collaboration tool and
application platform
Metaweb - the software that
powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an
extension to MediaWiki that
turns it into a semantic wiki
Swirrl - a spreadsheet-based
semantic wiki application
TaOPis - has a semantic wiki
subsystem based on Frame
logic
TikiWiki CMSGroupware
integrates Semantic links as a
core feature
zAgile Wikidsmart - semantically
enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
ndash Open source (GPL) well documented
ndash Vulcan kicks off SMW+ in 2007
Active development
ndash Commercial support available
ndash World-wide community
International Conferences
ndash Last SMWCon 425-27 2012 in Carlsbad CA
ndash Next SMWCon 1024-26 2012 in Koln Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has locationBeijing]] with
[[Has population30000|about 30 thousands]]
students
In page PropertyHas locationrdquo [[Has typePage]]
In page PropertyHas populationrdquo [[Has typenumber]]
Special Properties
"Has Type" is a pre-defined "special" property for metadata
- Example: [[Has type::String]]
"Allowed Values" is another special property
- [[Allows value::Low]]
- [[Allows value::Medium]]
- [[Allows value::High]]
The Halo Extensions add domain and range support
- RDFS expressivity
- The Semantic Gardening extension also supports "Cardinality"
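As a rough illustration of what type and allowed-value declarations buy you, the following Python sketch checks a value against a small in-memory schema. The `SCHEMA` dictionary, the property name "Has priority", and the `validate` function are hypothetical; they only mimic the idea of "Has type" and "Allows value" and do not reflect how SMW stores or enforces declarations.

```python
# Hypothetical in-memory schema: property name -> declared type and allowed values.
SCHEMA = {
    "Has priority": {"type": "String", "allows": {"Low", "Medium", "High"}},
    "Has population": {"type": "Number", "allows": None},
}


def validate(prop, value):
    """Return a list of problems; an empty list means the value is acceptable."""
    decl = SCHEMA.get(prop)
    if decl is None:
        # Schema-last: a property with no declaration yet is still accepted.
        return []
    problems = []
    if decl["type"] == "Number":
        try:
            float(value)
        except ValueError:
            problems.append(f"{value!r} is not a number")
    if decl["allows"] is not None and value not in decl["allows"]:
        problems.append(f"{value!r} is not an allowed value for {prop}")
    return problems
```

Accepting undeclared properties is a deliberate choice in this sketch: it mirrors the schema-last idea that data can be entered first and constrained later.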
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China, Category:Skyscrapers by country
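The point about class inheritance can be sketched in a few lines of Python: if each category lists its parent categories, membership in a broad class follows by walking up the hierarchy. The `PARENTS` links below are assumed for illustration (the slides name the categories but not how they are linked), and `ancestors` and `is_member` are hypothetical helper names.

```python
# Hypothetical parent links: category -> set of direct parent categories.
PARENTS = {
    "Skyscrapers in Shanghai": {"Skyscrapers in China"},
    "Skyscrapers in China": {"Skyscrapers by country"},
    "Skyscrapers by country": {"Skyscrapers"},
}


def ancestors(category):
    """Return every category reachable by following parent links upward."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in PARENTS.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_member(page_categories, target):
    """True if any of the page's categories is the target or inherits from it."""
    return any(c == target or target in ancestors(c) for c in page_categories)
```

With links like these, a page tagged only with "Skyscrapers in Shanghai" would also count as a member of "Skyscrapers", which is what lets a query on the broad class find it.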
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex.: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
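The query above is a conjunction of a category test and property comparisons. A minimal Python sketch of that evaluation over in-memory page data follows. The `PAGES` dictionary is illustrative: the Jin Mao Tower figures follow the slide text (88 stories, 1998), while the "Example Tower" entries are invented placeholders. The `ask` function is a hypothetical stand-in, not SMW's query engine, and it treats `>` and `<` as strict comparisons to match the slide's wording; how SMW itself interprets them is not covered here.

```python
import operator

# Illustrative page data; the "Example Tower" entries are made up for the sketch.
PAGES = {
    "Jin Mao Tower": {"categories": {"Skyscrapers"}, "Located in": "China",
                      "Floor count": 88, "Year built": 1998},
    "Example Tower B": {"categories": {"Skyscrapers"}, "Located in": "China",
                        "Floor count": 40, "Year built": 1995},
    "Example Tower C": {"categories": {"Skyscrapers"}, "Located in": "China",
                        "Floor count": 101, "Year built": 2008},
    "Example Tower D": {"categories": {"Skyscrapers"}, "Located in": "Japan",
                        "Floor count": 60, "Year built": 1990},
}

# Comparison operators supported by the sketch (strict comparisons assumed).
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def ask(pages, category, conditions):
    """Return sorted titles of pages in `category` that satisfy every condition.

    `conditions` is a list of (property, operator, value) tuples.
    """
    results = []
    for title, data in pages.items():
        if category not in data["categories"]:
            continue
        if all(prop in data and OPS[op](data[prop], value)
               for prop, op, value in conditions):
            results.append(title)
    return sorted(results)


matches = ask(PAGES, "Skyscrapers", [
    ("Located in", "=", "China"),
    ("Floor count", ">", 50),
    ("Year built", "<", 2000),
])
```

Each bracketed condition in the wiki query maps to one tuple here, so adding a constraint means adding a line rather than rewriting the query.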
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
- Usability extension to Semantic MediaWiki
- Increases user consensus
- Increases use of semantic data
Semantic MediaWiki
- Core semantic wiki engine
- Authoring of explicit knowledge in content
- Basic reasoning capabilities
SMW+
- Shrink-wrapped suite of open source software products
- Comes with a ready-to-use ontology
- Easy to procure and install
MediaWiki
- Powerful wiki engine
- Basic CMS feature set
SMW Extensions - Towards Great Apps
Data I/O
• Halo Extensions, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended with an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, form & templates
Choose extensions
Video: Semantic Wikis for a New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity →
← Increasing user participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment: Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (ldquotouchdowns with big hitsrdquo)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application Mini-Olympics
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
- Pro: Very quick to obtain
- Con: Hard to customize
- Con: Expensive
- Examples: Microsoft Project, Version One, Microsoft SharePoint
Custom Development
- Con: Slow to develop
- Pro: Extremely flexible
- Con: High cost to develop and maintain
- Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
- Pro: Still quick to program
- Pro: Easy to customize
- Pro: Low-moderate cost
- Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
- From data focus (movies) to relationship focus (Neurowiki)
- A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
- Create an SMW instance for acquiring both instance and relationship information between data elements
- Integrate SMW with the Berlin LDIF system
Product goals
- A community-driven collaborative framework for neuroscience data integration
  • Support for community-built neuroscience vocabularies
  • A library of visualizations and simple analytics
- Neuroscientist users can create/modify pages and metadata
  • Scientists can own the evolving knowledge structure
  • Low barrier for scientists to publish data
- Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate"
    - UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
    - UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    - UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
  • Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
- Build a "Knowledge Integration Center"
Current Page Designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
Comparison            HW scores   Quiz scores
Full vs. Ablated      0.12        0.002
Full vs. Textbook     0.02        0.05
Ablated vs. Textbook  0.52        0.18
[Figure: HW score and quiz score bar chart for the Inquire, Ablated, and Textbook groups; bar labels 81, 88, 74, 75, 71, 81; group sizes n=24, n=25, n=23]
[Figure: Quiz Score Distribution; axes: grade (A, B, C, D, F) vs. students (0 to 50); series: Inquire, Ablated, Textbook]
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are ldquoSchema-Lastrdquo Databases require DBAs and schema design
Semantic Wikis develop and maintain the schema in the wiki
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi ndash Knowledge in a Wiki
Knoodl ndash Semantic
Collaboration tool and
application platform
Metaweb - the software that
powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an
extension to MediaWiki that
turns it into a semantic wiki
Swirrl - a spreadsheet-based
semantic wiki application
TaOPis - has a semantic wiki
subsystem based on Frame
logic
TikiWiki CMSGroupware
integrates Semantic links as a
core feature
zAgile Wikidsmart - semantically
enables Confluence
SEMANTIC EXTENSIONS Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
ndash Open source (GPL) well documented
ndash Vulcan kicks off SMW+ in 2007
Active development
ndash Commercial support available
ndash World-wide community
International Conferences
ndash Last SMWCon 425-27 2012 in Carlsbad CA
ndash Next SMWCon 1024-26 2012 in Koln Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has locationBeijing]] with
[[Has population30000|about 30 thousands]]
students
In page PropertyHas locationrdquo [[Has typePage]]
In page PropertyHas populationrdquo [[Has typenumber]]
Special Properties
ldquoHas Typerdquo is a pre-defined ldquospecialrdquo property for meta-
data
ndash Example [[Has typeString]]
ldquoAllowed Valuesrdquo is another special property
ndash [[Allows valueLow]]
ndash [[Allows valueMedium]]
ndash [[Allows valueHigh]]
In Halo Extensions there are domain and range support
ndash RDFs expressivity
ndash Semantic Gardening extension also supports ldquoCardinalityrdquo
49
Define Classes
50
Beijing is a city in [[Has
countryChina]] with population
[[Has population2200000]]
[[CategoryCities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in hellip
[[Categories 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore Owings and Merrill buildings]]
CategorySkyscrapers in China Category Skyscrapers by country
Possible Database-style Query over Data
51
ask
[[CategorySkyscrapers]]
[[Located inChina]]
[[Floor countgt50]]
[[Year builtlt2000]]
hellip
Ex Skyscrapers in China higher than
50 stories built before 2000
ASKSPARQL query target
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension Semantic MediaWiki
More Extensions and Applications
52
MediaWiki SMW+
Halo Extension
Usability extension to Semantic MediaWiki
Increases user consensus
Increases use of semantic data
Semantic MediaWiki
Core Semantic Wiki engine
Authoring of explicit knowledge in content
Basic reasoning capabilities
SMW+
Shrink wrap suite of open source software products
Comes with ready to use ontology
Easy to procure and install
MediaWiki
Powerful Wiki engine
Basic CMS feature set
SMW Extensions ndashTowards Great Apps
56
bull Halo Extensions Wiki Object Model Semantic Forms Notification hellip
Data IO
bull Semantic Toolbar Semantic Drilldown Faceted Searchhellip
Query and Browsing
bull Semantic Result Printers Tree View Exhibit Flash chartshellip
Visualization
bull HaloACL Deployment Triplestore Connector Simple Ruleshellip
bull Semantic WikiTags and Subversion Integration extensions
bull Linked Data Extension with R2R and SILK from FUBerlin
Other useful extensions
Semantic Wikipedia
Ultrapedia An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
57
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
62
FROM WIKI TO APPLICATION We Started With a Wiki Ended with an App
63
Build Social Semantic Web Apps
64
Fill in data (content)
Develop schema (ontology)
Design UI skin form amp templates
Choose extensions
65
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity rarr
larr Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
Commercial look/feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (“touchdowns with big hits”)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT: With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application Mini-Olympics
Requirements for Wiki “Developers”
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead, one needs to
– Configure the wiki with the desired extensions
– Design and evolve the data model (schema)
– Design content
• Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
– Pro: very quick to obtain
– Con: hard to customize
– Con: expensive
– Examples: Microsoft Project, Version One, Microsoft SharePoint
Custom Development
– Con: slow to develop
– Pro: extremely flexible
– Con: high cost to develop and maintain
– Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
– Pro: still quick to program
– Pro: easy to customize
– Pro: low-to-moderate cost
– Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: “Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate.”
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
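One way to picture how UTs could be handed to an authoring process is as simple records keyed by the concept they describe. The following is a minimal Python sketch under that assumption; the `UT` class and `group_by_concept` function are hypothetical names for illustration and are not part of AURA or SMW+. The three sentences are the ones from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """One universally-quantified sentence and the concept it is filed under."""
    sentence: str
    concept: str


# The three UTs derived from the glycolysis example sentence on the slide.
UTS = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration, glycolysis is the first step",
       "Cellular Respiration"),
    UT("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules",
       "Glycolysis"),
]


def group_by_concept(uts):
    """Collect UT sentences per concept, preserving input order."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.sentence)
    return dict(grouped)


if __name__ == "__main__":
    for concept, sentences in group_by_concept(UTS).items():
        print(concept)
        for sentence in sentences:
            print("  -", sentence)
```

Grouping by concept mirrors the slide's annotation of each UT with a concept, so that all statements about one concept can be reviewed together.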
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAWiki
– Build a “Knowledge Integration Center”
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
– HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
– Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
[Figure: bar chart of HW score and quiz score by group; group sizes n=24, n=25, n=23; bar labels as extracted: 81, 88, 74, 75, 71, 81]
[Figure: Quiz Score Distribution; % of students per grade (A, B, C, D, F) for the Inquire, Ablated, and Textbook groups]
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn – Ruby on Rails-based
KiWi – Knowledge in a Wiki
Knoodl – Semantic collaboration tool and application platform
Metaweb – the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki – an extension to MediaWiki that turns it into a semantic wiki
Swirrl – a spreadsheet-based semantic wiki application
TaOPis – has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware – integrates Semantic links as a core feature
zAgile Wikidsmart – semantically enables Confluence
SEMANTIC EXTENSIONS: Semantic MediaWiki and
Semantic MediaWiki Software
Born at AIFB in 2005
– Open source (GPL), well documented
– Vulcan kicks off SMW+ in 2007
Active development
– Commercial support available
– World-wide community
International Conferences
– Last SMWCon: 4/25-27, 2012 in Carlsbad, CA
– Next SMWCon: 10/24-26, 2012 in Köln, Germany
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in [[Has location::Beijing]] with [[Has population::30000|about 30 thousand]] students
In page “Property:Has location”: [[Has type::Page]]
In page “Property:Has population”: [[Has type::Number]]
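SMW's inline annotation syntax is `[[Property::value]]`, optionally with a display label as `[[Property::value|label]]`. As a rough illustration of what such markup encodes, the Python sketch below pulls (property, value, label) triples out of a line of wikitext and renders the text a reader would see. It is a simplified regular-expression reading and not SMW's actual parser; the function names `extract_annotations` and `render` are hypothetical.

```python
import re

# Matches [[Property::value]] and [[Property::value|label]].
# Simplified: nested links and multi-valued annotations are not handled.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(wikitext):
    """Return (property, value, label) triples; the label defaults to the value."""
    triples = []
    for prop, value, label in ANNOTATION.findall(wikitext):
        triples.append((prop.strip(), value.strip(), (label or value).strip()))
    return triples


def render(wikitext):
    """Replace each annotation with its display text (label if given, else value)."""
    return ANNOTATION.sub(lambda m: m.group(3) or m.group(2), wikitext)


text = ("Beijing University is located in [[Has location::Beijing]] with "
        "[[Has population::30000|about 30 thousand]] students")

if __name__ == "__main__":
    print(extract_annotations(text))
    print(render(text))
```

The point of the sketch is that one line of wiki text carries both the readable sentence and the machine-readable facts, which is what allows database-style queries over wiki pages.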
Special Properties
“Has type” is a pre-defined “special” property for metadata
– Example: [[Has type::String]]
“Allows value” (allowed values) is another special property
– [[Allows value::Low]]
– [[Allows value::Medium]]
– [[Allows value::High]]
The Halo Extensions add domain and range support
– RDFS expressivity
– Semantic Gardening extension also supports “Cardinality”
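A property declaration with a type and a list of allowed values can be thought of as a small validation rule. The following Python sketch illustrates that idea only; it is not how SMW implements these special properties, and the `PropertyDecl` class, its `accepts` method, and the property name `Has priority` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyDecl:
    """A property with a declared type and an optional list of allowed values."""
    name: str
    has_type: str
    allows_values: list = field(default_factory=list)

    def accepts(self, value):
        """Check a raw string value against the allowed values and the type."""
        if self.allows_values and value not in self.allows_values:
            return False
        if self.has_type == "Number":
            try:
                float(value)
            except ValueError:
                return False
        return True


# Hypothetical declarations mirroring the slide's examples.
priority = PropertyDecl("Has priority", "String", ["Low", "Medium", "High"])
population = PropertyDecl("Has population", "Number")

if __name__ == "__main__":
    print(priority.accepts("Medium"))    # allowed value
    print(priority.accepts("Urgent"))    # not in the allowed list
    print(population.accepts("30000"))   # parses as a number
```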
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore Owings and Merrill buildings]]
Category:Skyscrapers in China; Category:Skyscrapers by country
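Class inheritance through categories amounts to following subcategory links transitively: a page in a subcategory is also an instance of every ancestor category. The Python sketch below shows that lookup on a tiny hand-written hierarchy. The link from "Skyscrapers in Shanghai" to "Skyscrapers in China" is an assumption made for illustration, and the names `PARENTS`, `ancestors`, and `is_instance_of` are hypothetical.

```python
# Child category -> list of parent categories (illustrative hierarchy).
PARENTS = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],   # assumed link
    "Skyscrapers in China": ["Skyscrapers by country"],
}


def ancestors(category, parents=PARENTS):
    """Return every category reachable by following parent links upward."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance_of(page_categories, target, parents=PARENTS):
    """True if any of the page's categories is the target or inherits from it."""
    return any(c == target or target in ancestors(c, parents)
               for c in page_categories)


if __name__ == "__main__":
    jin_mao = ["1998 architecture", "Skyscrapers in Shanghai", "Hotels in Shanghai"]
    print(is_instance_of(jin_mao, "Skyscrapers in China"))
```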
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Ex.: Skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
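The conditions of such a query can be read as a conjunction of filters: a category membership test plus one comparison per property. The Python sketch below evaluates that reading over a small in-memory list of pages. It is only an illustration of the query's meaning, not SMW's query engine; the rows other than the Jin Mao Tower are invented placeholders, all field values are illustrative, and the comparisons are strict to follow the slide's wording ("higher than", "before"), whereas the comparator semantics in a real SMW installation depend on its configuration.

```python
import operator

# Illustrative data only; "Example Tower A/B" are hypothetical pages.
PAGES = [
    {"title": "Jin Mao Tower", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 88, "Year built": 1998},
    {"title": "Example Tower A", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 40, "Year built": 1995},
    {"title": "Example Tower B", "categories": {"Skyscrapers"},
     "Located in": "China", "Floor count": 60, "Year built": 2008},
]

# (property, comparison, value) triples mirroring the slide's conditions.
CONDITIONS = [
    ("Located in", operator.eq, "China"),
    ("Floor count", operator.gt, 50),
    ("Year built", operator.lt, 2000),
]


def ask(pages, category, conditions):
    """Return titles of pages in the category that satisfy every condition."""
    results = []
    for page in pages:
        if category not in page["categories"]:
            continue
        if all(prop in page and compare(page[prop], value)
               for prop, compare, value in conditions):
            results.append(page["title"])
    return results


if __name__ == "__main__":
    print(ask(PAGES, "Skyscrapers", CONDITIONS))
```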
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
Semantic MediaWiki (SMW) Markup Syntax
Beijing University is located in
[[Has locationBeijing]] with
[[Has population30000|about 30 thousands]]
students
In page PropertyHas locationrdquo [[Has typePage]]
In page PropertyHas populationrdquo [[Has typenumber]]
Special Properties
ldquoHas Typerdquo is a pre-defined ldquospecialrdquo property for meta-
data
ndash Example [[Has typeString]]
ldquoAllowed Valuesrdquo is another special property
ndash [[Allows valueLow]]
ndash [[Allows valueMedium]]
ndash [[Allows valueHigh]]
In Halo Extensions there are domain and range support
ndash RDFs expressivity
ndash Semantic Gardening extension also supports ldquoCardinalityrdquo
49
Define Classes
Beijing is a city in [[Has country::China]] with population [[Has population::2200000]]
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China
Category:Skyscrapers by country
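Class inheritance through categories can be pictured as walking up the subcategory links. The sketch below is a minimal illustration with hand-written data. The link from "Skyscrapers in Shanghai" to "Skyscrapers in China" is an assumption made for the example, and the dictionaries and function name are hypothetical.

```python
# Hypothetical category graph: child category -> parent categories.
SUBCATEGORY_OF = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],  # assumed link
    "Skyscrapers in China": ["Skyscrapers by country"],
}

# Categories assigned directly on a page (a subset, for brevity).
PAGE_CATEGORIES = {
    "Jin Mao Tower": ["Skyscrapers in Shanghai", "Hotels in Shanghai"],
}


def all_categories(page):
    """Return direct and inherited categories of a page."""
    seen = set()
    stack = list(PAGE_CATEGORIES.get(page, []))
    while stack:
        category = stack.pop()
        if category not in seen:
            seen.add(category)
            # Climb to parent categories; `seen` guards against cycles.
            stack.extend(SUBCATEGORY_OF.get(category, []))
    return seen


inherited = all_categories("Jin Mao Tower")
```

With this closure, a query for members of "Skyscrapers by country" would also find the Jin Mao Tower even though the page never names that category directly.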
Possible Database-style Query over Data
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
Example: skyscrapers in China higher than 50 stories, built before 2000
ASK/SPARQL query target
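Conceptually, such a query is a conjunction of a category test and property conditions. The following sketch evaluates that kind of query over a small in-memory dictionary. The page names and values are invented placeholders, and the comparators are treated as strict, which is an assumption of this sketch and not a statement about how SMW interprets > and <.

```python
import operator

# Placeholder data: page title -> categories and property values (invented).
PAGES = {
    "Tower A": {"categories": {"Skyscrapers"}, "Located in": "China",
                "Floor count": 88, "Year built": 1999},
    "Tower B": {"categories": {"Skyscrapers"}, "Located in": "China",
                "Floor count": 101, "Year built": 2008},
    "Tower C": {"categories": {"Skyscrapers"}, "Located in": "China",
                "Floor count": 40, "Year built": 1990},
    "Tower D": {"categories": {"Skyscrapers"}, "Located in": "Malaysia",
                "Floor count": 88, "Year built": 1998},
}

OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def ask(pages, category, conditions):
    """Return sorted titles in `category` that satisfy every condition."""
    results = []
    for title, data in pages.items():
        if category not in data["categories"]:
            continue
        if all(
            prop in data and OPS[op](data[prop], value)
            for prop, op, value in conditions
        ):
            results.append(title)
    return sorted(results)


matches = ask(
    PAGES,
    "Skyscrapers",
    [("Located in", "=", "China"), ("Floor count", ">", 50), ("Year built", "<", 2000)],
)
```

In a real wiki the same conditions would be answered by the SMW store or a connected triplestore, not by a Python loop. The point is only that annotated pages can be filtered like database rows.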
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension: Semantic MediaWiki
More Extensions and Applications
MediaWiki SMW+
Halo Extension
– Usability extension to Semantic MediaWiki
– Increases user consensus
– Increases use of semantic data
Semantic MediaWiki
– Core semantic wiki engine
– Authoring of explicit knowledge in content
– Basic reasoning capabilities
SMW+
– Shrink-wrapped suite of open-source software products
– Comes with a ready-to-use ontology
– Easy to procure and install
MediaWiki
– Powerful wiki engine
– Basic CMS feature set
SMW Extensions – Towards Great Apps
Data I/O
• Halo Extension, Wiki Object Model, Semantic Forms, Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Faceted Search, …
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts, …
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules, …
• Semantic WikiTags and Subversion Integration extensions
• Linked Data Extension with R2R and SILK from FU Berlin
Semantic Wikipedia
Ultrapedia: An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain: German cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended With an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, forms & templates
Choose extensions
Video Semantic Wikis for a New Problem (2008)
Social tag-based characterization
– Keyword search over tag data
– Inconsistent semantics
– Easy to engineer
Algorithm-based object characterization
– Database-style search
– Consistent semantics
– Extremely difficult to engineer
Social database-style characterization
– Database search + wiki text search
– Semantic consistency via wiki mechanisms
– Easy to engineer
Axes: increasing technical complexity →, ← increasing user participation
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
– Commercial look and feel
– Play-by-play video search
– Highlight reel generation
– Search on crowd-defined patterns (“touchdowns with big hits”)
– Tree-based navigation widget
– Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki “Developers”
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead, one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
• Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
+ Very quick to obtain
- Hard to customize
- Expensive
Examples: Microsoft Project, VersionOne, Microsoft SharePoint
Custom Development
- Slow to develop
+ Extremely flexible
- High cost to develop and maintain
Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
+ Still quick to program
+ Easy to customize
+ Low-moderate cost
Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: “Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate.”
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAwiki
– Build a “Knowledge Integration Center”
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
Semantic MediaWiki Stack
MediaWiki (XAMPP)
Extension Semantic MediaWiki
More Extensions and Applications
52
MediaWiki SMW+
Halo Extension
Usability extension to Semantic MediaWiki
Increases user consensus
Increases use of semantic data
Semantic MediaWiki
Core Semantic Wiki engine
Authoring of explicit knowledge in content
Basic reasoning capabilities
SMW+
Shrink wrap suite of open source software products
Comes with ready to use ontology
Easy to procure and install
MediaWiki
Powerful Wiki engine
Basic CMS feature set
SMW Extensions ndashTowards Great Apps
56
bull Halo Extensions Wiki Object Model Semantic Forms Notification hellip
Data IO
bull Semantic Toolbar Semantic Drilldown Faceted Searchhellip
Query and Browsing
bull Semantic Result Printers Tree View Exhibit Flash chartshellip
Visualization
bull HaloACL Deployment Triplestore Connector Simple Ruleshellip
bull Semantic WikiTags and Subversion Integration extensions
bull Linked Data Extension with R2R and SILK from FUBerlin
Other useful extensions
Semantic Wikipedia
Ultrapedia An SMW proof of concept built to explore general knowledge acquisition in a wiki
Domain German Cars (in Wikipedia)
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
57
Standard View of the Wiki Data
httpwikingvulcancomupindexphpPorsche_996
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
62
FROM WIKI TO APPLICATION We Started With a Wiki Ended with an App
63
Build Social Semantic Web Apps
64
Fill in data (content)
Develop schema (ontology)
Design UI skin form amp templates
Choose extensions
65
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
– Keyword search over tag data
– Inconsistent semantics
– Easy to engineer
Increasing technical complexity →
← Increasing user participation
Algorithm-based object characterization
– Database-style search
– Consistent semantics
– Extremely difficult to engineer
Social database-style characterization
– Database search + wiki text search
– Semantic consistency via wiki mechanisms
– Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Semantic Entertainment Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (“touchdowns with big hits”)
Tree-based navigation widget
Very favorable economics
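A rough sketch of how a search on a crowd-defined pattern could drive highlight reel generation. Everything here is assumed for illustration: the Play record, the tag names, the game labels, and the timings are hypothetical, and a pattern is modeled simply as a named set of required tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Play:
    """One play in a game, with crowd-supplied tags and its position in the video."""
    game: str
    start_sec: int
    end_sec: int
    tags: frozenset


# Hypothetical play-by-play annotations.
plays = [
    Play("Game 1", 120, 150, frozenset({"touchdown", "big hit"})),
    Play("Game 1", 900, 925, frozenset({"touchdown"})),
    Play("Game 2", 300, 340, frozenset({"big hit", "fumble"})),
    Play("Game 2", 60, 95, frozenset({"touchdown", "big hit"})),
]

# A crowd-defined pattern, modeled as a named set of tags that must all be present.
patterns = {"touchdowns with big hits": frozenset({"touchdown", "big hit"})}


def search(plays, pattern_name):
    """Return plays whose tags include every tag required by the pattern."""
    required = patterns[pattern_name]
    return [p for p in plays if required <= p.tags]


def highlight_reel(matches):
    """Order matching clips by game and start time, as (game, start, end) segments."""
    ordered = sorted(matches, key=lambda p: (p.game, p.start_sec))
    return [(p.game, p.start_sec, p.end_sec) for p in ordered]


reel = highlight_reel(search(plays, "touchdowns with big hits"))
print(reel)
```

Because the pattern is just data, new patterns can be added by users without changing the search code, which is one reading of why the economics are favorable.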
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application: Mini-Olympics
Requirements for Wiki “Developers”
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead, one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
• Customize templates, forms, styles, skin, etc.
We Can Build
Effectiveness of SMW as a Platform Choice
Packaged Software
– Pro: Very quick to obtain
– Con: Hard to customize
– Con: Expensive
– Examples: Microsoft Project, Version One, Microsoft SharePoint
Custom Development
– Con: Slow to develop
– Pro: Extremely flexible
– Con: High cost to develop and maintain
– Examples: .NET Framework, J2EE, …, Ruby on Rails
SMW+ Suite
– Pro: Still quick to program
– Pro: Easy to customize
– Pro: Low-moderate cost
– Examples: Mini-Olympics, Scrum Wiki, RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: “Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate”
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
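A minimal sketch of how the UTs from the glycolysis example could be held as records and grouped by concept before being handed on. The UT class and the group_by_concept helper are hypothetical names chosen for illustration; the slides do not describe how AURA or the wiki actually stores these statements.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """One statement derived from a textbook sentence, filed under a concept."""
    text: str
    concept: str


# The three UTs from the glycolysis example sentence.
uts = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration, glycolysis is the first step", "Cellular Respiration"),
    UT("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules", "Glycolysis"),
]


def group_by_concept(uts):
    """Index UT texts by concept so each concept's statements can be reviewed together."""
    index = defaultdict(list)
    for ut in uts:
        index[ut.concept].append(ut.text)
    return dict(index)


by_concept = group_by_concept(uts)
for concept, statements in by_concept.items():
    print(concept, len(statements))
```

Note that one source sentence can yield UTs for more than one concept, so the grouping key is the concept rather than the sentence.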
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAWiki
– Build a “Knowledge Integration Center”
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
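The bottom-up flow (algorithmic suggestions, then crowd validation, then use in the wiki) can be sketched as a simple filter. This is an assumption-heavy illustration: the slides only say that suggestions are validated via Mechanical Turk, so the voting rule, the thresholds, and the sample votes below are invented, and no real ReVerb or Mechanical Turk interface is called.

```python
def approved_suggestions(votes_by_suggestion, min_votes=3, min_approval=2 / 3):
    """Keep suggestions with enough worker votes and a high enough approval share.

    Both thresholds are hypothetical defaults, not values from the project.
    """
    approved = []
    for suggestion, votes in votes_by_suggestion.items():
        if len(votes) < min_votes:
            continue  # not enough judgments yet, so hold the suggestion back
        if sum(votes) / len(votes) >= min_approval:
            approved.append(suggestion)
    return approved


# Hypothetical candidate UTs, each with True/False judgments from crowd workers.
candidate_votes = {
    "Glycolysis occurs in Cytosol": [True, True, True],
    "Glucose is a compound called pyruvate": [False, True, False],
    "Glycolysis is the first step": [True, True],
}

accepted = approved_suggestions(candidate_votes)
print(accepted)
```

Only accepted suggestions would be shown to authors; rejected or under-voted ones stay out of the wiki until they gather enough support.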
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
HW scores: Full vs Ablated 0.12, Full vs Textbook 0.02, Ablated vs Textbook 0.52
Quiz scores: Full vs Ablated 0.002, Full vs Textbook 0.05, Ablated vs Textbook 0.18
Figure: HW score and quiz score by group (Inquire, Ablated Textbook, Textbook; group sizes n=24, n=25, n=23); bar labels 81, 88, 74, 75, 71, 81
Figure: Quiz Score Distribution, students (axis 0 to 50) by letter grade A, B, C, D, F
Dynamic View of the Acceleration Data
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
62
FROM WIKI TO APPLICATION We Started With a Wiki Ended with an App
63
Build Social Semantic Web Apps
64
Fill in data (content)
Develop schema (ontology)
Design UI skin form amp templates
Choose extensions
65
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity rarr
larr Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic
Entertainment
Wiki
Semantic Seahawks Football Wiki
66
Semantic Entertainment Query Result Highlight Reel
Commercial LookFeel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (ldquotouchdowns with big hitsrdquo)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
72
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application Mini-Olympics
77
78
79
Requirements for Wiki ldquoDevelopersrdquo
One need not
ndash Write code like a hardcore programmer
ndash Design setup RDBMS or make frequent
schema changes
ndash Possess knowledge of a senior system
admin
Instead one need
ndash Configure the wiki with desired extensions
ndash Design and evolve the data model
(schema)
ndash Design Content
bull Customize templates forms styles skin etc
80
We Can Build
81
Effectiveness of SMW as a Platform Choice
Packaged Software
Very quick to
obtain
N Hard to customize
N Expensive
Microsoft Project
Version One
Microsoft
SharePoint
Custom Development
N Slow to develop
Extremely flexible
N High cost to develop
and maintain
NET Framework
J2EE hellip
Ruby on rails
82
SMW+ Suite
Still quick to
program
Easy to customize
Low-moderate cost
Mini-Olympics
Scrum Wiki
RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
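The rule that a Neurowiki user cannot contribute data without also contributing relationships could be enforced along these lines. This is a minimal Python sketch under assumed names (`KnowledgeStore`, `Relationship`, `contribute` are all hypothetical); it does not reflect how SMW or LDIF actually store or validate data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A subject-predicate-object statement linking two data elements."""
    subject: str
    predicate: str
    obj: str


class RelationshipRequiredError(ValueError):
    """Raised when an instance is contributed without any relationship."""


class KnowledgeStore:
    """Toy store that accepts an instance only together with its relationships."""

    def __init__(self):
        self.instances = set()
        self.relationships = set()

    def contribute(self, instance, relationships):
        # Keep only the statements that actually mention the new instance.
        relevant = [r for r in relationships if instance in (r.subject, r.obj)]
        if not relevant:
            raise RelationshipRequiredError(
                f"'{instance}' must be linked to at least one other element"
            )
        self.instances.add(instance)
        self.relationships.update(relevant)


store = KnowledgeStore()
store.contribute(
    "hippocampus",
    [Relationship("hippocampus", "part of", "temporal lobe")],
)

try:
    store.contribute("amygdala", [])  # data without relationships is rejected
except RelationshipRequiredError as err:
    print("rejected:", err)
```

Rejecting the contribution outright is one design choice; a form-based wiki could equally make the relationship field mandatory in the edit form.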
NEUROWIKI DEMO Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: “Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate.”
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
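The glycolysis example above, where one textbook sentence yields several UTs that are each filed under a concept, might be represented as follows. This is a Python sketch only; the `UT` record and `group_by_concept` helper are hypothetical names, not part of AURA or SMW+.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """A universally-quantified sentence filed under one concept."""
    text: str
    concept: str


# The three UTs derived from the glycolysis sentence on the slide.
uts = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration, glycolysis is the first step",
       "Cellular Respiration"),
    UT("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules",
       "Glycolysis"),
]


def group_by_concept(items):
    """Collect UT texts per concept, preserving their original order."""
    grouped = defaultdict(list)
    for ut in items:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


for concept, sentences in group_by_concept(uts).items():
    print(concept)
    for sentence in sentences:
        print("  -", sentence)
```

Grouping by concept mirrors the idea that each UT becomes an input to the authoring of one concept, so an author working on Glycolysis sees both of its sentences together.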
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
• Suggestions and commentary from KE (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAWiki
– Build a “Knowledge Integration Center”
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
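The bottom-up encoding flow, in which algorithmic UT suggestions are admitted only after crowd validation, can be pictured as a simple gate. The sketch below is an assumption-laden illustration in Python: it calls neither ReVerb nor Mechanical Turk, the `Suggestion` record is hypothetical, and the thresholds (`min_votes`, `min_agreement`) are invented for the example rather than taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """A candidate UT plus the crowd judgments collected for it."""
    text: str
    votes: list  # True means a worker judged the sentence valid


def is_validated(suggestion, min_votes=3, min_agreement=0.8):
    """Accept a suggestion only with enough votes and enough agreement.

    Both thresholds are assumed values chosen for illustration.
    """
    if len(suggestion.votes) < min_votes:
        return False
    return sum(suggestion.votes) / len(suggestion.votes) >= min_agreement


def admit(suggestions, **thresholds):
    """Return the texts of the suggestions that pass crowd validation."""
    return [s.text for s in suggestions if is_validated(s, **thresholds)]


# Made-up judgments, for illustration only.
candidates = [
    Suggestion("Glycolysis occurs in Cytosol", [True, True, True]),
    Suggestion("Glucose occurs in pyruvate", [True, False, False]),
    Suggestion("Glycolysis is the first step", [True, True]),  # too few votes
]

print(admit(candidates))
```

Only the first candidate passes with the default thresholds; the second lacks agreement and the third lacks votes.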
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
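Quiz and Homework Scores
P-values from two-tailed t-tests
– HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
– Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
[Figure: HW score and quiz score for the Inquire, Ablated, and Textbook groups (group sizes n=24, n=25, n=23); bar labels 81, 88, 74, 75, 71, 81; letter-grade bands A to D]
[Figure: Quiz Score Distribution, students per letter grade (A B C D F) for the Inquire, Ablated, and Textbook groups; axis ticks 0 to 50]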
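The p-values on this slide come from two-tailed t-tests between pairs of groups. The slide does not say which variant was used, so the Python sketch below assumes Welch's unequal-variance test and computes the p-value by numerically integrating the Student t density with the standard library only. The score lists are invented for illustration and are not the study's data.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1 - 2 * central)


def welch_t_test(xs, ys):
    """Return (t, df, p) for Welch's two-sample, two-tailed t-test."""
    nx, ny = len(xs), len(ys)
    vx, vy = variance(xs) / nx, variance(ys) / ny
    t = (mean(xs) - mean(ys)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df, two_tailed_p(t, df)


# Hypothetical quiz scores for two small groups (not the study's data).
group_a = [92, 85, 88, 95, 79, 90]
group_b = [81, 74, 86, 70, 77, 83]

t, df, p = welch_t_test(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.1f}, two-tailed p = {p:.4f}")
```

A small p-value means a difference in group means this large would be unlikely if the groups truly performed the same; with three groups, each pair is compared separately, as on the slide.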
Graph View of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
FROM WIKI TO APPLICATION: We Started With a Wiki, Ended With an App
Build Social Semantic Web Apps
Fill in data (content)
Develop schema (ontology)
Design UI: skin, forms & templates
Choose extensions
Video Semantic Wikis for A New Problem (2008)
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity rarr
larr Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic
Entertainment
Wiki
Semantic Seahawks Football Wiki
66
Semantic Entertainment Query Result Highlight Reel
Commercial LookFeel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (ldquotouchdowns with big hitsrdquo)
Tree-based navigation widget
Very favorable economics
AGILE APPLICATION DEVELOPMENT With Semantic MediaWiki+
72
MINI-OLYMPIC APPLICATION IN 2 DAYS
From Agile to Quick Agile
Application Mini-Olympics
77
78
79
Requirements for Wiki ldquoDevelopersrdquo
One need not
ndash Write code like a hardcore programmer
ndash Design setup RDBMS or make frequent
schema changes
ndash Possess knowledge of a senior system
admin
Instead one need
ndash Configure the wiki with desired extensions
ndash Design and evolve the data model
(schema)
ndash Design Content
bull Customize templates forms styles skin etc
80
We Can Build
81
Effectiveness of SMW as a Platform Choice
Packaged Software
Very quick to
obtain
N Hard to customize
N Expensive
Microsoft Project
Version One
Microsoft
SharePoint
Custom Development
N Slow to develop
Extremely flexible
N High cost to develop
and maintain
NET Framework
J2EE hellip
Ruby on rails
82
SMW+ Suite
Still quick to
program
Easy to customize
Low-moderate cost
Mini-Olympics
Scrum Wiki
RPI map
Neurowiki: An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information?
– From data focus (movies) to relationship focus (Neurowiki)
– A Neurowiki user cannot contribute data without also contributing relationships
Experimental goals
– Create an SMW instance for acquiring both instance and relationship information between data elements
– Integrate SMW with the Berlin LDIF system
Product goals
– A community-driven collaborative framework for neuroscience data integration
• Support for community-built neuroscience vocabularies
• A library of visualizations and simple analytics
– Neuroscientist users can create/modify pages and metadata
• Scientists can own the evolving knowledge structure
• Low barrier for scientists to publish data
– Support scientific discovery
NEUROWIKI DEMO: Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
– A real, high-quality application of SMW+
– Focus on creating universally-quantified sentences (UTs)
• Example: “Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate.”
– UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
– UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
– UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
– Use these UTs as inputs to the AURA authoring process
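One way to picture the decomposition above is as a set of small records, each pairing a UT with the concept it is filed under. The sketch below is only an illustration: the `UT` class and `group_by_concept` helper are hypothetical names, and nothing here reflects the actual AURA or SMW+ data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UT:
    """A universally-quantified sentence and the concept it is filed under."""
    text: str
    concept: str


# The three UTs derived from the glycolysis example sentence on the slide.
UTS = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration, glycolysis is the first step",
       "Cellular Respiration"),
    UT("During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules",
       "Glycolysis"),
]


def group_by_concept(uts):
    """Collect UT texts per concept, preserving the order they were given in."""
    grouped = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


by_concept = group_by_concept(UTS)
```

Grouping by concept mirrors the idea that one textbook sentence can contribute statements to more than one concept page.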
Use SMW-acquired statements as inputs to the process
– Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
– Top-down Encoding Flow
• Suggestions and commentary from KEs (Knowledge Engineers) and SMEs, previous contributions
• Supported by ontological terms and relationships, questions, guidelines, testing
– Bottom-up Encoding Flow
• UT suggestions based upon algorithms (ReVerb by University of Washington)
• All suggestions validated via Mechanical Turk before use in AURAwiki
– Build a “Knowledge Integration Center”
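The bottom-up flow above (algorithmic suggestions, then crowd validation, then use in the wiki) can be sketched as a simple filter. This is a minimal sketch under stated assumptions: the vote lists stand in for crowd judgments, the thresholds `min_votes` and `min_agreement` are invented for illustration, the sample suggestions are made up, and no ReVerb or Mechanical Turk API is called.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """A candidate UT plus crowd judgments (True means 'this is correct')."""
    text: str
    votes: list


def is_validated(suggestion, min_votes=3, min_agreement=2 / 3):
    """Accept a suggestion only with enough votes and enough agreement.

    Both thresholds are assumptions, not values from the slides.
    """
    if len(suggestion.votes) < min_votes:
        return False
    return sum(suggestion.votes) / len(suggestion.votes) >= min_agreement


def accepted_suggestions(suggestions, **thresholds):
    """Return the texts of suggestions that pass crowd validation."""
    return [s.text for s in suggestions if is_validated(s, **thresholds)]


# Hypothetical extractor output with hypothetical crowd votes.
CANDIDATES = [
    Suggestion("Glycolysis occurs in Cytosol", [True, True, True]),
    Suggestion("Glucose occurs in pyruvate", [False, True, False]),
    Suggestion("Glycolysis is the first step of cellular respiration", [True, True]),
]

ready_for_wiki = accepted_suggestions(CANDIDATES)
```

With the default thresholds only the first candidate passes: the second lacks agreement and the third lacks votes, so neither would reach authors.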
Current Page Designs
– Main Page
– User Page
– Authoring Page
Will perform initial testing by September 15
Conclusion
Inquire is a new application of AI technology
– Question-answering in textbooks
– Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
– Crowdsourcing addresses the cost and update problems
– Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests
– HW scores: Full vs. Ablated 0.12; Full vs. Textbook 0.02; Ablated vs. Textbook 0.52
– Quiz scores: Full vs. Ablated 0.002; Full vs. Textbook 0.05; Ablated vs. Textbook 0.18
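Figure: HW score and quiz score by group (Inquire, Ablated, Textbook; n=24, n=25, n=23). Plotted values as listed: 81, 88, 74, 75, 71, 81.
Figure: Quiz Score Distribution. Share of students at each grade (A, B, C, D, F) for the Inquire, Ablated, and Textbook groups.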
77
78
79
Requirements for Wiki ldquoDevelopersrdquo
One need not
ndash Write code like a hardcore programmer
ndash Design setup RDBMS or make frequent
schema changes
ndash Possess knowledge of a senior system
admin
Instead one need
ndash Configure the wiki with desired extensions
ndash Design and evolve the data model
(schema)
ndash Design Content
bull Customize templates forms styles skin etc
80
We Can Build
81
Effectiveness of SMW as a Platform Choice
Packaged Software
Very quick to
obtain
N Hard to customize
N Expensive
Microsoft Project
Version One
Microsoft
SharePoint
Custom Development
N Slow to develop
Extremely flexible
N High cost to develop
and maintain
NET Framework
J2EE hellip
Ruby on rails
82
SMW+ Suite
Still quick to
program
Easy to customize
Low-moderate cost
Mini-Olympics
Scrum Wiki
RPI map
Neurowiki An SMW to Acquire Neuroscience Knowledge
Can we use SMW to acquire relationship information ndash From data focus (movies) to relationship focus (Neurowiki) ndash A Neurowiki user cannot contribute data without also contributing
relationships
Experimental goals ndash Create an SMW instance for acquiring both instance and relationship
information between data elements ndash Integrate SMW with the Berlin LDIF system
Product goals ndash A community-driven collaborative framework for neuroscience data
integration bull Support for community-built neuroscience vocabularies bull A library of visualizations and simple analytics
ndash Neuroscientist users can createmodify pages and metadata bull Scientists can own the evolving knowledge structure bull Low barrier for scientist to publish data
ndash Support scientific discovery
NEUROWIKI DEMO Using SMW+ to crowdsource knowledge of mapping and linking
AURA Wiki: Bringing Together Inquire and SMW+
Reduce costs and increase scale of AURA authoring
- A real, high-quality application of SMW+
- Focus on creating universally-quantified sentences (UTs)
  • Example: "Glycolysis, which occurs in the cytosol, begins the degradation process by breaking glucose into two molecules of a compound called pyruvate."
    - UT: Glycolysis occurs in Cytosol (concept: Glycolysis)
    - UT: During cellular respiration, glycolysis is the first step (concept: Cellular Respiration)
    - UT: During glycolysis, breakdown of 1 glucose results in 2 pyruvate molecules (concept: Glycolysis)
- Use these UTs as inputs to the AURA authoring process
Use SMW-acquired statements as inputs to the process
- Potential savings of over 50% of AURA authoring time
Use the social semantic web for real AI authoring
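The decomposition of one textbook sentence into concept-tagged UTs can be pictured as a small data structure. The following is only an illustrative sketch: the `UT` class and `group_by_concept` helper are hypothetical names, not part of AURA or SMW+, and the real AURAwiki schema is not described in these slides.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class UT:
    """One universally-quantified sentence, filed under a single concept (hypothetical model)."""
    text: str
    concept: str


# The textbook sentence and the three UTs quoted on the slide.
SOURCE_SENTENCE = (
    "Glycolysis, which occurs in the cytosol, begins the degradation process "
    "by breaking glucose into two molecules of a compound called pyruvate."
)
UTS = [
    UT("Glycolysis occurs in Cytosol", "Glycolysis"),
    UT("During cellular respiration glycolysis is the first step", "Cellular Respiration"),
    UT("During glycolysis breakdown of 1 glucose results in 2 pyruvate molecules", "Glycolysis"),
]


def group_by_concept(uts: Iterable[UT]) -> Dict[str, List[str]]:
    """Collect UT texts under the concept they are filed against."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ut in uts:
        grouped[ut.concept].append(ut.text)
    return dict(grouped)


by_concept = group_by_concept(UTS)
```

Grouping by concept mirrors the slide's point that each UT is attached to exactly one concept, so a single source sentence can feed several concept pages.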
AURAwiki Strategy
Provide a community of authors with everything they need to collaborate on high-quality UT authoring
- Top-down Encoding Flow
  • Suggestions and commentary from KE (Knowledge Engineers) and SMEs, previous contributions
  • Supported by ontological terms and relationships, questions, guidelines, testing
- Bottom-up Encoding Flow
  • UT suggestions based upon algorithms (ReVerb by University of Washington)
  • All suggestions validated via Mechanical Turk before use in AURAWiki
- Build a "Knowledge Integration Center"
Current Page Designs
- Main Page
- User Page
- Authoring Page
Will perform initial testing by September 15
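The bottom-up flow (algorithmic UT suggestions that must pass crowd validation before use) could look roughly like the filter below. This is a sketch under stated assumptions: the slides do not say how Mechanical Turk judgments are aggregated, so the majority-approval rule, the `min_votes` threshold, and the function name `validated_suggestions` are all hypothetical, and no real Mechanical Turk or ReVerb API is called.

```python
from typing import Dict, List


def validated_suggestions(
    votes: Dict[str, List[bool]],
    min_votes: int = 3,          # assumed minimum number of crowd judgments
    min_approval: float = 0.5,   # assumed approval share that must be exceeded
) -> List[str]:
    """Keep only suggestions that enough crowd workers approved."""
    accepted = []
    for suggestion, ballots in votes.items():
        if len(ballots) < min_votes:
            continue  # not enough judgments yet, so hold the suggestion back
        approval = sum(ballots) / len(ballots)
        if approval > min_approval:
            accepted.append(suggestion)
    return accepted


# Hypothetical ballots for illustration only (True = worker approved).
example_votes = {
    "Glycolysis occurs in Cytosol": [True, True, False],
    "Glucose occurs in pyruvate": [True, False, False],
    "Glycolysis is the first step": [True],
}
accepted = validated_suggestions(example_votes)
```

Holding back suggestions with too few judgments keeps unvalidated text out of the wiki, which is the constraint the slide states; the exact thresholds would need tuning against real worker agreement.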
Conclusion
Inquire is a new application of AI technology
- Question-answering in textbooks
- Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem
- Crowdsourcing addresses the cost and update problems
- Validate using Inquire authoring process
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
Thank You
www.vulcan.com
www.projecthalo.com
www.inquireproject.com
www.smwplus.com
Quiz and Homework Scores
P-values from two-tailed t-tests:
Comparison            HW scores   Quiz scores
Full vs Ablated       0.12        0.002
Full vs Textbook      0.02        0.05
Ablated vs Textbook   0.52        0.18
[Figure: bar chart of HW score and quiz score against letter-grade bands A-D; bar labels as extracted: 81, 88, 74, 75, 71, 81; group sizes n=24, n=25, n=23; legend: Inquire, Ablated, Textbook]
[Figure: Quiz Score Distribution, % of students per letter grade (A, B, C, D, F)]
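For readers who want to see what sits behind a two-tailed t-test comparison like the ones reported for these scores, the sketch below computes a t statistic and its degrees of freedom using only the Python standard library. It assumes Welch's unequal-variance form, which the slides do not specify, and the score lists are hypothetical placeholders, not data from the study.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    sa = variance(a) / na
    sb = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df


# Hypothetical placeholder scores, NOT taken from the Inquire study.
hypothetical_group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
hypothetical_group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(hypothetical_group_a, hypothetical_group_b)
```

Turning the statistic into a two-tailed p-value needs the CDF of Student's t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) performs the whole test in one call.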
Requirements for Wiki "Developers"
One need not
- Write code like a hardcore programmer
- Design or set up an RDBMS, or make frequent schema changes
- Possess the knowledge of a senior system admin
Instead, one needs to
- Configure the wiki with desired extensions
- Design and evolve the data model (schema)
- Design content
  • Customize templates, forms, styles, skin, etc.
We Can Build
Conclusion
Inquire is a new application of AI technology ndash Question-answering in textbooks
ndash Precise embedded answers over a complex subject matter
Semantic wikis and SMW+ use social and crowd phenomena as a new way to address the knowledge acquisition problem ndash Crowdsourcing addresses the cost and update problems
ndash Validate using Inquire authoring process
87
From text (textbook sentences)
to data (knowledge encodings)
to knowledge (use in a QA system)
The Social Semantic Web in Action
Project Halo Team
DIQA
89
Thank You
wwwvulcancom
wwwprojecthalocom
wwwinquireprojectcom
wwwsmwpluscom
Quiz and Homework Scores
P-value from 2 tailed t-tests HW scores Full vs Ablated 012 Full vs textbook 002 Ablated vs Text 052
Quiz scores Full vs Ablated 0002 Full vs textbook 005 Ablated vs Text 018
81
88
74 75 71
81
HW SCORE QUIZ SCORE
A
B
C
D
n=24
n=25
n=23
0
10
20
30
40
50
A B C D F
o
f st
ud
ents
Inquire Ablated Textbook
textbook
Quiz Score Distribution
Project Halo Team
DIQA
89
Thank You
wwwvulcancom
wwwprojecthalocom
wwwinquireprojectcom
wwwsmwpluscom
Quiz and Homework Scores
P-value from 2 tailed t-tests HW scores Full vs Ablated 012 Full vs textbook 002 Ablated vs Text 052
Quiz scores Full vs Ablated 0002 Full vs textbook 005 Ablated vs Text 018
81
88
74 75 71
81
HW SCORE QUIZ SCORE
A
B
C
D
n=24
n=25
n=23
0
10
20
30
40
50
A B C D F
o
f st
ud
ents
Inquire Ablated Textbook
textbook
Quiz Score Distribution
89
Thank You
wwwvulcancom
wwwprojecthalocom
wwwinquireprojectcom
wwwsmwpluscom
Quiz and Homework Scores
P-value from 2 tailed t-tests HW scores Full vs Ablated 012 Full vs textbook 002 Ablated vs Text 052
Quiz scores Full vs Ablated 0002 Full vs textbook 005 Ablated vs Text 018
81
88
74 75 71
81
HW SCORE QUIZ SCORE
A
B
C
D
n=24
n=25
n=23
0
10
20
30
40
50
A B C D F
o
f st
ud
ents
Inquire Ablated Textbook
textbook
Quiz Score Distribution
Quiz and Homework Scores
P-value from 2 tailed t-tests HW scores Full vs Ablated 012 Full vs textbook 002 Ablated vs Text 052
Quiz scores Full vs Ablated 0002 Full vs textbook 005 Ablated vs Text 018
81
88
74 75 71
81
HW SCORE QUIZ SCORE
A
B
C
D
n=24
n=25
n=23
0
10
20
30
40
50
A B C D F
o
f st
ud
ents
Inquire Ablated Textbook
textbook
Quiz Score Distribution