Diadem 1.0
DIADEM: Domain-centric, Intelligent, Automated Data Extraction
Tim Furche, Georg Gottlob, Giorgio Orsi
May 11th, 2011 @ Oxford University Computing Laboratory
joint work with Giovanni Grasso, Omer Gunes, Xiaonan Guo, Andrey Kravchenko, Thomas Lukasiewicz, Christian Schallhart, Andrew Sellers, Gerardo Simari, Cheng Wang
1 Web Data Extraction
Section 1: Web Data Extraction
Data on the Web
there is more of it than we can use
no longer availability, but finding, integrating, analysing, …
Surface vs. Deep Web
estimated 500 × surface web
estimated 400 000 deep web databases
What?
Products (stores)
Directories (yellow pages)
Catalogs (libraries)
Public DBs (publications, census, data.gov,…)
Public services (weather, location, …)
And it’s not just one haystack …
7 bedrooms
5 bedrooms
The Web is more than HTML
Overview
Introducing Web Data Extraction
Scenarios
Why now?
Supervised Web Data Extraction
Unsupervised Web Data Extraction
DIADEM
OPAL
AMBER
OXPath
IVLIA
Datalog±
1.1 Web Data Extraction: Scenarios
The Need for Web Data Extraction
information
drives business (decision making, trend analysis, …)
available in troves on the internet
but: as HTML made for humans, not as structured data
companies need
product specifications
pricing information
market trends
regulatory information
keyword search fails
Scenario ➀: Electronics retailer
electronics retailer: online market intelligence
comprehensive overview of the market
daily information on price, shipping costs, trends, product mix
by product, geographical region, or competitor
thousands of products
hundreds of competitors
nowadays: specialised companies
mostly manual, interpolation
large cost
Scenario ➁: Supermarket chain
supermarket chain
competitors’ product prices
special offer or promotion (time sensitive)
new products, product formats & packaging
Scenario ➂: Hotel Agency
online travel agency
best price guarantee
prices of competing agencies
average market price
Scenario ➃: Hedge Fund
house price index
published in regular intervals by national statistics agency
affects share values of various industries
hedge fund
online market intelligence to predict the house price index
And a lot more …
monitor blogs and forums
market intelligence, e.g., complaints, common problems
customer opinions
ranking and analysing product reviews
financial analysts
monitor trends and stats for products of a certain company / category
interest rates from financial institutions
press releases and financial reports
patent search & analysis
…
1.2 Web Data Extraction: Why Now?
Scale
Applications
How to book a flight?
Structured Data
Why Web Data Extraction Now?
Why now? Trends
Trend ➊: scale—every business is online
automation at scale
Trend ➋: web applications rather than web documents
automated form filling (deep web navigation)
Trend ➌: structured, common-sense data available
allows more sophisticated automated analysis
also a tool for improved data extraction?
2 Web Data Extraction: Supervised
manual: (e.g., Web Harvest)
user writes the wrapper, sometimes using wrapping libraries
supervised: (e.g., Lixto)
user provides examples and refines the wrapper
semi-supervised:
user provides examples (per site), wrapper is automatically learned
unsupervised: entirely automated (e.g., DIADEM)
some systems omit examples and run analysis directly on all pages
some systems automatically guess examples
Section 2: Supervised Web Data Extraction
Supervised Web Data Extraction
User interaction needed to
record interaction sequences (such as form fillings)
visually select examples for data
rather than manually writing a wrapper in a programming language
Current gold standard for high-accuracy extraction
Examples:
Lixto
Automation Anywhere
Web Harvest
…
Lixto: Extraction & Analysis
Lixto: sophisticated, visual semi-automated extraction tool
visual selection, automatic pattern derivation, verification
highly scalable extraction and processing with Lixto server
but also: data integration & business analytics suite
data cleaning
data flow scenarios: merge & filter from different web sites
market intelligence & analytics
3 Web Data Extraction: Unsupervised
17 000 real estate sites in the UK alone
Section 3: Unsupervised Web Data Extraction
Why Automate Data Extraction?
Too many fish in the pond
> 17 000 UK real estate sites
similar for restaurants, travel, airlines, pharmacies, retail shops, …
aggregators cover only a fraction
updated slowly
⇒ per site manual work infeasible
wrapper construction too expensive
tracking changes
rules out manual & (semi-)supervised approaches
Why Automate Data Extraction?
All the fish are different
large, modern aggregators (>100000)
nation-wide agencies (>10000)
agencies for a single quarter (< 15)
⇒ no single unsupervised wrapper can do this today
… and we really need it!
search engine providers (Google, Microsoft, Yahoo!) all work on
information and data extraction for
“vertical”, “object” and “semantic” search
turn search engines into knowledge bases for decision support
“no one really has done this successfully at scale yet”
Raghu Ramakrishnan, Yahoo!, March 2009
“Current technologies are not good enough yet to provide what search engines really need. [...] Any successful approach would probably need a combination of knowledge and learning.”
Alon Halevy, Google, Feb. 2009
Unsupervised: The Story so Far
Key observation:
“database” web sites are generated using templates
wrapper generators need to automatically identify templates
Two major approaches
machine learning from a few hand-labeled examples
similar to semi-supervised, but only one set of examples for an entire domain
high precision only for simple domains (single entity type, few attributes)
fully automatically exploit the repeated structure of result pages
good precision needs a lot of data (many records per page, many pages)
doesn’t work for forms (no repetition)
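The second approach, exploiting repeated structure, can be illustrated with a minimal Python sketch. It is not the algorithm of any particular system: it only counts how often each root-to-element tag path occurs on a page and reports the paths that repeat, under the assumption that records produced by the same template share a tag path. The class `TagPathCounter`, the function `repeated_paths` and the sample page are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class TagPathCounter(HTMLParser):
    """Counts root-to-element tag paths (hypothetical helper, not from the talk)."""

    def __init__(self):
        super().__init__()
        self.stack = []          # currently open tags
        self.paths = Counter()   # tag path -> number of occurrences

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        self.paths["/".join(self.stack)] += 1

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def repeated_paths(html, min_count=2):
    """Return the tag paths that occur at least min_count times."""
    parser = TagPathCounter()
    parser.feed(html)
    parser.close()
    return {path: n for path, n in parser.paths.items() if n >= min_count}


page = """
<html><body>
  <h1>Results</h1>
  <ul>
    <li><span>7 bedrooms</span></li>
    <li><span>5 bedrooms</span></li>
    <li><span>2 bedrooms</span></li>
  </ul>
</body></html>
"""

candidates = repeated_paths(page)
print(candidates)
```

On a search form the same function would typically report nothing useful, since a form has no repeated records; this is the limitation the last bullet refers to.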
?
4 DIADEM
Section 4: DIADEM
Domain-Centric Data Extraction
Blackbox analyser that
turns any of the thousands of websites of a domain
into structured data
host of domain specific annotators
domain ontology & phenomenology
+ everything the others are doing
machine learning for classification
template discovery
DIADEM: Overview
DIADEM combines
host of domain-specific annotators with
gives us a first “guess” to automatically generate examples
high-level ontology about domain entities and
their phenomenology on web sites of the domain
allows us to verify & refine examples
+ advances in existing techniques for
repeated structure analysis
page & block classification
bottom-up understanding & top-down reasoning
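The interplay of annotators (a first guess) and domain knowledge (verification) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the two regular expressions stand in for the domain-specific annotators, and the constraint "exactly one price, at most one bedroom count" stands in for the domain ontology. None of it is taken from the DIADEM implementation.

```python
import re

# Hypothetical annotators for the real-estate domain.
ANNOTATORS = {
    "price": re.compile(r"£\s?\d[\d,]*"),
    "bedrooms": re.compile(r"\b\d+\s+bedrooms?\b", re.IGNORECASE),
}


def annotate(text):
    """Bottom-up step: run every annotator over a text block."""
    return {name: rx.findall(text) for name, rx in ANNOTATORS.items()}


def is_plausible_property(annotations):
    """Hypothetical domain constraint used to verify a guessed example."""
    return len(annotations["price"]) == 1 and len(annotations["bedrooms"]) <= 1


def guess_examples(blocks):
    """Top-down step: keep only blocks whose annotations satisfy the constraint."""
    examples = []
    for block in blocks:
        found = annotate(block)
        if is_plausible_property(found):
            examples.append((block, found))
    return examples


blocks = [
    "7 bedrooms detached house, £1,250,000",
    "Contact our Oxford office today",
    "Prices from £200,000 to £900,000",
]
examples = guess_examples(blocks)
print(examples)
```

Only the first block survives: the second has no price annotation, and the third has two, so the constraint rejects it as a property record.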
4.1 DEMO
DIADEM 0.1: First prototype
4.2 OPAL: Ontologies for Form Analysis
Diversity
Section 4: DIADEM » OPAL
OPAL: Overview
Three step process:
browser extraction and annotation
labelling & segmentation
classification (phenomenological mapping)
Model-based, knowledge driven
latter two steps are model transformations
thin layer of domain-dependent concepts
field types and labels
triggers for field & form creation
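A minimal Python sketch of the three steps, under strong simplifying assumptions: fields are read from static HTML rather than a live browser, labelling uses only explicit `<label for=...>` links (no segmentation or visual proximity), and the "thin layer of domain-dependent concepts" is reduced to a hypothetical keyword table `FIELD_TYPES`. It illustrates the shape of the pipeline, not OPAL's actual model transformations.

```python
from html.parser import HTMLParser

# Hypothetical thin domain layer: keywords that trigger a field type.
FIELD_TYPES = {
    "location": ("location", "town", "postcode"),
    "min_price": ("min price", "price from"),
    "bedrooms": ("bedroom",),
}


class FormLabeller(HTMLParser):
    """Steps 1 and 2: collect fields and attach explicit label texts."""

    def __init__(self):
        super().__init__()
        self.fields = []          # field ids in document order
        self.labels = {}          # field id -> label text
        self._current_for = None  # target of the <label> being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self._current_for = attrs.get("for")
        elif tag in ("input", "select") and "id" in attrs:
            self.fields.append(attrs["id"])

    def handle_data(self, data):
        if self._current_for and data.strip():
            self.labels[self._current_for] = data.strip()

    def handle_endtag(self, tag):
        if tag == "label":
            self._current_for = None


def classify(label_text):
    """Step 3: map a label to a domain field type, or None if nothing triggers."""
    text = label_text.lower()
    for field_type, keywords in FIELD_TYPES.items():
        if any(keyword in text for keyword in keywords):
            return field_type
    return None


form = """
<form>
  <label for="q">Town or postcode</label><input id="q">
  <label for="p">Min price</label><select id="p"></select>
  <label for="b">Bedrooms</label><select id="b"></select>
  <input id="go" type="submit">
</form>
"""

labeller = FormLabeller()
labeller.feed(form)
model = {f: classify(labeller.labels.get(f, "")) for f in labeller.fields}
print(model)
```

Moving to another domain would, in this sketch, only require replacing `FIELD_TYPES`; the labelling code stays unchanged.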
ICQ Data Set: Application to Other Domains
4.3 AMBER: Ontologies for Record Extraction
7 bedrooms
5 bedrooms
just the opposite of OPAL
AMBER: Overview
Three step process like OPAL
browser extraction and annotation
classification (phenomenological mapping)
record segmentation (much harder than in OPAL)
Model-based, knowledge driven
latter two steps are model transformations
thin layer of domain-dependent concepts
record and attribute types
triggers for record & attribute creation
Section 4: DIADEM » AMBER
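To make "classification first, then record segmentation" concrete, here is a deliberately simplified Python sketch. It assumes the page has already been flattened into a sequence of text nodes with attribute annotations, and that every record contains exactly one price, which serves as a hypothetical trigger for record creation. AMBER itself works on the page structure; this flat version only illustrates the idea.

```python
# Hypothetical input: text nodes in document order, already annotated with
# an attribute type (None means "no annotation").
nodes = [
    ("Sort by price", None),
    ("£1,250,000", "price"),
    ("7 bedrooms", "bedrooms"),
    ("Detached house, Oxford", "location"),
    ("£650,000", "price"),
    ("5 bedrooms", "bedrooms"),
    ("Terraced house, Oxford", "location"),
]

MANDATORY = "price"  # assumed trigger: each record has exactly one price


def segment_records(annotated_nodes, trigger=MANDATORY):
    """Start a new record at every trigger attribute; skip unannotated nodes."""
    records = []
    for text, attr in annotated_nodes:
        if attr == trigger:
            records.append({attr: text})
        elif attr is not None and records:
            # Attach to the current record unless that attribute is already set.
            records[-1].setdefault(attr, text)
    return records


records = segment_records(nodes)
print(records)
```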
Repeating
Similarity
4.4 OXPath: Scalable, Memory-Efficient Web Extraction
How to book a flight?
Section 4: DIADEM » OXPath
How to find a flat?
How to find a flat with OXPath
Start at rightmove.co.uk: doc("rightmove.co.uk")
Fill “oxford” into the first visible field: /descendant::field()[1]/{"oxford"}
Click on the second next button: /following::field()[2]/{click /}
On the refinement form just continue by clicking on the last field: /descendant::field()[last()]/{click /}
Grab all the prices: //p.price
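The five steps read naturally as one path expression. The short Python snippet below only concatenates the step strings exactly as they appear on the slide, to show what the complete expression would look like. Joining them this way is an assumption of this sketch; no OXPath engine is invoked and the result has not been run against the site.

```python
# The five steps from the slide, in order (strings copied verbatim).
steps = [
    'doc("rightmove.co.uk")',
    '/descendant::field()[1]/{"oxford"}',
    '/following::field()[2]/{click /}',
    '/descendant::field()[last()]/{click /}',
    '//p.price',
]

# Assumed composition: steps are chained left to right into a single path.
expression = "".join(steps)
print(expression)
```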
State of Web Extraction
No interaction with rich, scripted interfaces
no actions other than form filling and submission
➀ Imperative extraction scripts
explicit variable assignments, flow control, etc.
either proprietary selection language or mix of XPath & external flow control
➁ Focus on automation and visual interfaces
no or very limited extraction language, only ad-hoc extractions
no multiway navigation, no optimization
Summary of Complexity

                               Time                 Space
OXPath w/o Actions & Kleene    O(n^6 · q^2)         O(n^5 · q^2)
OXPath w/o Kleene              O((p·n)^6 · q^3)     O(n^5 · q^3)
OXPath w/o unbounded Kleene    O((p·n)^6 · q^3)     O(n^5 · q∑^3)
OXPath (full)                  O((p·n)^6 · q^3)     O(n^5 · (q+d)^3)
                               O(n^4 · q^2)         O(n^3 · q^2)

Combined: PTime-hard PTime-hard
Data: NLogSpace LogSpace

Extraction marker = n-ary, nested queries
Contextual actions (action free prefix)
Actions = multiple pages
Buffer bounded by page depth
Constant Memory
even faster
4.5 IVLIA: Ontologies for PDF Extraction
PDF Analysis
Section 4: DIADEM » IVLIA
Semantic Analysis and Annotation
4.6 Datalog±: Ontological Reasoning at Web Scale
Relational Schema
person(ssn, name, birthdate)
employee(ssn, empID, name, birthdate, department)
department(depName, building)
project(projID, startDate, duration)
supervision(supervisor, supervised)
assignment(employee, project)
E/R Schema Object Relational Schema
Ontological Databases
Section 4: DIADEM » Datalog±
![Page 83: Diadem 1.0](https://reader033.fdocuments.us/reader033/viewer/2022061221/54bea7d84a7959705b8b45d4/html5/thumbnails/83.jpg)
114
Ontological Constraints
Taxonomy Definitions
employee(X,Y,Z,W) → ∃V person(V,Y,Z)
project(X,Y,Z) → activity(X,Y,Z)
employee(X1,Y1,Z1,W1,U1), supervision(Y1,Y2), employee(X2,Y2,Z2,W2,U2) → supervisor(X1,Y1,Z1,W1,U1)
(An employee who supervises another employee is a supervisor)
Concept Definitions
generalManager(X1,Y1,Z1,W1,U1) → supervision(Y1,Y1)
(A general manager supervises him/herself)
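How such rules derive new facts can be sketched with one round of forward chaining in Python. This is an illustration under assumptions, not a Datalog± engine: the sample data is invented, `employee` is taken to be 5-ary as in the relational schema, `supervision` is assumed to relate employee IDs, and the existential rule for `person` is left out because it would require inventing fresh labelled nulls.

```python
# Facts follow the relational schema of the talk; the data itself is invented.
employee = {
    ("111", "e1", "Ada", "1970-01-01", "R&D"),
    ("222", "e2", "Bob", "1980-02-02", "R&D"),
}
general_manager = {("111", "e1", "Ada", "1970-01-01", "R&D")}
supervision = {("e1", "e2")}  # (supervisor empID, supervised empID)
project = {("p1", "2011-05-11", 12)}


def apply_rules(employee, general_manager, supervision, project):
    """Apply each rule once and return the derived relations."""
    # project(X,Y,Z) -> activity(X,Y,Z)
    activity = set(project)
    # generalManager(X1,Y1,Z1,W1,U1) -> supervision(Y1,Y1)
    supervision = supervision | {(gm[1], gm[1]) for gm in general_manager}
    # employee(..Y1..), supervision(Y1,Y2), employee(..Y2..) -> supervisor(..Y1..)
    emp_ids = {e[1] for e in employee}
    supervisor = {
        e
        for e in employee
        for (boss, sub) in supervision
        if e[1] == boss and sub in emp_ids
    }
    return activity, supervision, supervisor


activity, supervision, supervisor = apply_rules(
    employee, general_manager, supervision, project
)
print(sorted(supervision), sorted(supervisor))
```

A single pass is enough here because no rule feeds back into its own body once the general-manager rule has been applied before the supervisor rule; a general evaluator would iterate to a fixpoint.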
Big Picture
[Figure: KR and DB compared by expressiveness and efficiency]