1 Concepts, Ontologies, and Project TANGO Deryle Lonsdale BYU Linguistics and English Language...
![Page 1: 1 Concepts, Ontologies, and Project TANGO Deryle Lonsdale BYU Linguistics and English Language lonz@byu.edu.](https://reader031.fdocuments.us/reader031/viewer/2022032800/56649d2a5503460f949ffae9/html5/thumbnails/1.jpg)
1
Concepts, Ontologies, and Project TANGO
Deryle Lonsdale
BYU Linguistics and English Language
2
Outline
NSF projects
Semantic Web
Concepts
Project TIDIE
Ontologies
Project TANGO
Tables
Ontology generation
3
Acknowledgements
NSF
David Embley (BYU CS), Steve Liddle (BYU Marriott School) and Yuri Tijerino
BYU Data Extraction Group members
4
The National Science Foundation
Federal agency, $5.5 billion budget, funds 20% of all federally supported basic research conducted by America’s colleges and universities
7 directorates
  Biological Sciences; Computer and Information Science and Engineering; Engineering; Geosciences; Mathematics and Physical Sciences; Social, Behavioral and Economic Sciences; Education and Human Resources
200,000 scientists, engineers, educators and students at universities, laboratories and field sites
10,000 awards/year, 3 years duration (avg.)
5
The NSF Nifty 50 (general)
ACCELERATING, EXPANDING UNIVERSE
ANTARCTIC OZONE HOLE RESEARCH
ARABIDOPSIS: A PLANT GENOME PROJECT
BAR CODES
BLACK HOLES CONFIRMED
BUCKY BALLS
COMPUTER VISUALIZATION TECHNIQUES
DATA COMPRESSION TECHNOLOGY
DISCOVERY OF PLANETS
DOPPLER RADAR
EFFECTS OF ACID RAIN
EL NIÑO AND LA NIÑA PREDICTIONS
FIBER OPTICS
GEMINI TELESCOPES
HANTAVIRUS IDENTIFICATION
DNA FINGERPRINTING
MRI: MAGNETIC RESONANCE IMAGING
NANOTECHNOLOGY
THE NATIONAL OBSERVATORIES
OVERCOMING HEAVY METALS
OVERCOMING SALT TOXICITY
TISSUE ENGINEERING
TUMOR DETECTION
VOLCANIC ERUPTION DETECTION
YELLOW BARRELS
6
Language-related Nifty 50
AMERICAN SIGN LANGUAGE DICTIONARY DEVELOPMENT
COMPUTER VISUALIZATION TECHNIQUES
THE DARCI CARD
DATA COMPRESSION TECHNOLOGY
THE "EYE CHIP" OR RETINA CHIP
THE INTERNET
PERSONS WITH DISABILITIES ACCESS TO THE WEB
PROJECT LISTEN
SPEECH RECOGNITION TECHNOLOGY
vBNS: VERY HIGH SPEED BACKBONE NETWORK SYSTEM
WEB BROWSERS
7
Browsing the Semantic Web
[Figure labels: Hypernym, Synonym, Annotation, The search query]
8
Browsing the Semantic Web
Ranking based on content data and structure
Grouping results by their conceptual relationships
Using lexical semantics for similarity search
[Figure labels: movie, astronomy, sports]
9
Desirable, not (yet) possible
Word sense disambiguation
Other types of queries (e.g. services)
What is the cheapest available round-trip flight to Cancun the day after finals this semester?
Set up an appointment with my optometrist for next week.
List available 4-person BYU-approved apartments in Orem for under $150/month.
Find me a linguistics job in Tahiti.
10
Project TIDIE
Apr 10, 2001 – May 12, 2005
11
Overview of TIDIE
3-year NSF project at BYU
Total amount about $430,000
PI David Embley (BYU CS), 4 co-PIs from BYU
18 grad students, 45 publications
Demos, tools, papers, presentations at website (www.deg.byu.edu/)
12
Goal of TIDIE
Target-Based Independent-of-Document Information Extraction
Target-based: user specifies what to find
  Not just keyword search, but concept-based search using an ontology
Document independent
  Should work even if pages change over time, on new documents
IE: match, merge, retrieve, format information
Present in way that user can search, query results
13
Document-based IE
14
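Recognition and extraction

| Car | Year | Make | Model | Mileage | Price | PhoneNr |
|---|---|---|---|---|---|---|
| 0001 | 1989 | Subaru | SW | | $1900 | (336)835-8597 |
| 0002 | 1998 | | Elantra | | | (336)526-5444 |
| 0003 | 1994 | HONDA | ACCORD EX | 100K | | (336)526-1081 |

| Car | Feature |
|---|---|
| 0001 | Auto |
| 0001 | AC |
| 0002 | Black |
| 0002 | 4 door |
| 0002 | tinted windows |
| 0002 | Auto |
| 0002 | pb |
| 0002 | ps |
| 0002 | cruise |
| 0002 | am/fm |
| 0002 | cassette stereo |
| 0002 | a/c |
| 0003 | Auto |
| 0003 | jade green |
| 0003 | gold |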
15
Concepts
What drives the matching process for IE
Inherent in words, numbers, phrases, text
Linguistics: lexical semantics
Denotations: entities, attributes
Location: relationships
Occurrences: constraints
16
Concept matching
We use exhaustive concept matching techniques to find concepts in documents, including:
  Lexical information (lexicons)
  Natural language processing (NLP) techniques
  Similarity of values
  Features of value
  Data frames
  Constraints
17
Lexicons
Repositories of enumerable classes of lexical information
FirstNames, LastNames, USStates, ProvoOremApts, CarMakes, Drugs, CampGroundFeats, etc.
WordNet (synonyms, word senses, hypernyms/hyponyms)
18
The data-frame library
Snippets of real-world knowledge about data (type, length, nearby keywords, patterns [as in regexps], functional relations, etc.)
Low-level patterns implemented as regular expressions
Match items such as email addresses, phone numbers, names, etc.

Mileage matches [8]
  constant
    { extract "\b[1-9]\d{0,2}k"; substitute "[kK]" -> "000"; },
    { extract "[1-9]\d{0,2}?,\d{3}"; context "[^\$\d][1-9]\d{0,2}?,\d{3}[^\d]"; substitute "," -> ""; },
    { extract "[1-9]\d{0,2}?,\d{3}"; context "(mileage\:\s*)[^\$\d][1-9]\d{0,2}?,\d{3}[^\d]"; substitute "," -> ""; },
    { extract "[1-9]\d{3,6}"; context "[^\$\d][1-9]\d{3,6}\s*mi(\.|\b\les\b)"; },
    { extract "[1-9]\d{3,6}"; context "(mileage\:\s*)[^\$\d][1-9]\d{3,6}\b"; };
  keyword "\bmiles\b", "\bmi\.", "\bmi\b", "\bmileage\b";
end;
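As a rough illustration of how the extract, context, and substitute clauses of a data frame might be applied, here is a minimal Python sketch. The names Rule, RULES, and apply_rules are hypothetical, and the two patterns are simplified stand-ins for the Mileage rules, not the project's actual implementation.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    """One data-frame rule (hypothetical structure for illustration)."""
    extract: str                                  # pattern for the value itself
    context: Optional[str] = None                 # wider pattern the value must sit in
    substitute: Optional[Tuple[str, str]] = None  # (pattern, replacement) to normalize


# Simplified stand-ins for two of the Mileage rules.
RULES = [
    Rule(extract=r"\b[1-9]\d{0,2}k\b", substitute=(r"[kK]", "000")),
    Rule(
        extract=r"[1-9]\d{0,2},\d{3}",
        context=r"[1-9]\d{0,2},\d{3}\s*mi(?:les)?\b",
        substitute=(r",", ""),
    ),
]


def apply_rules(text: str, rules: List[Rule] = RULES) -> List[str]:
    """Return normalized values found in text by any rule."""
    values = []
    for rule in rules:
        # Without a context clause the whole text is searched;
        # with one, only the spans matching the context are searched.
        scopes = [text]
        if rule.context:
            scopes = [m.group(0) for m in re.finditer(rule.context, text, re.IGNORECASE)]
        for scope in scopes:
            for m in re.finditer(rule.extract, scope, re.IGNORECASE):
                value = m.group(0)
                if rule.substitute:
                    value = re.sub(rule.substitute[0], rule.substitute[1], value)
                values.append(value)
    return values


print(apply_rules("only 7,000 miles. Asking only $11,995"))
print(apply_rules("1994 HONDA ACCORD EX 100K"))
```

The context clause is what keeps the price $11,995 from being read as a mileage in this sketch: the number is only accepted when "mi" or "miles" follows it.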
19
Isolated concepts are OK, but...
We’re also interested in the relations between concepts
This is often best done graphically
Ontology: arrangement of concepts that makes their relations and constraints explicit
Conceptual modeling: field of CS / linguistics that deals with formalizing concepts, using such information
BYU has its own well-known conceptual modeling framework (OSM)
20
Conceptual modeling (OSM)
[Figure: OSM diagram. Object sets: Car, Year, Make, Model, Mileage, Price, Feature, PhoneNr, Extension. Relationship sets are labeled "has" and "is for", with participation constraints such as 0..1, 0..* and 1..*.]
21
Ontologies and IE
Source Target
22
'97 CHEVY Cavalier, Red, 5 spd, only 7,000 miles. Previous owner heart broken! Asking only $11,995. #1415. JERRY SEINER MIDVALE, 566-3800 or 566-3888
Constant/keyword recognition
Descriptor/String/Position(start/end)
Year|97|2|3
Make|CHEV|5|8
Make|CHEVY|5|9
Model|Cavalier|11|18
Feature|Red|21|23
Feature|5 spd|26|30
Mileage|7,000|38|42
KEYWORD(Mileage)|miles|44|48
Price|11,995|100|105
Mileage|11,995|100|105
PhoneNr|566-3800|136|143
PhoneNr|566-3888|148|155
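A small Python sketch of how descriptor/string/position records like these could be produced. It assumes a hypothetical RECOGNIZERS table with four simplified patterns and runs on the opening of the ad only; positions are 1-based and inclusive, matching the records on the slide.

```python
import re

# Opening of the ad from the slide.
AD = "'97 CHEVY Cavalier, Red, 5 spd, only 7,000 miles."

# Hypothetical recognizers: descriptor -> pattern with one capturing group.
RECOGNIZERS = {
    "Year": r"'(\d{2})\b",
    "Make": r"\b(CHEVY)\b",
    "Mileage": r"\b([1-9]\d{0,2},\d{3})\b",
    "KEYWORD(Mileage)": r"\b(miles)\b",
}


def recognize(text, recognizers=RECOGNIZERS):
    """Return (descriptor, string, start, end) tuples sorted by start position."""
    records = []
    for descriptor, pattern in recognizers.items():
        for m in re.finditer(pattern, text):
            # Convert Python's 0-based, end-exclusive span to 1-based, inclusive.
            records.append((descriptor, m.group(1), m.start(1) + 1, m.end(1)))
    return sorted(records, key=lambda r: r[2])


for record in recognize(AD):
    print("|".join(str(field) for field in record))
```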
23
Database instance generator
insert into Car values(1001, "97", "CHEVY", "Cavalier", "7,000", "11,995", "566-3800")
insert into CarFeature values(1001, "Red")
insert into CarFeature values(1001, "5 spd")
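The step from recognized records to database rows can be sketched as follows. The heuristics here are assumptions made for illustration (pick the candidate nearest a matching keyword, otherwise the earliest and longest candidate); the slides do not spell out the generator's actual rules. The names pick, build_rows, and load are hypothetical, and the table layout mirrors the insert statements on the slide.

```python
import sqlite3

RECORDS = [
    ("Year", "97", 2, 3), ("Make", "CHEV", 5, 8), ("Make", "CHEVY", 5, 9),
    ("Model", "Cavalier", 11, 18), ("Feature", "Red", 21, 23),
    ("Feature", "5 spd", 26, 30), ("Mileage", "7,000", 38, 42),
    ("KEYWORD(Mileage)", "miles", 44, 48), ("Price", "11,995", 100, 105),
    ("Mileage", "11,995", 100, 105), ("PhoneNr", "566-3800", 136, 143),
    ("PhoneNr", "566-3888", 148, 155),
]
SINGLE_VALUED = ["Year", "Make", "Model", "Mileage", "Price", "PhoneNr"]


def pick(records, name):
    """Choose one value for a single-valued field (assumed heuristics)."""
    candidates = [r for r in records if r[0] == name]
    if not candidates:
        return None
    keywords = [r for r in records if r[0] == f"KEYWORD({name})"]
    if keywords:
        # Prefer the candidate that starts closest to the keyword.
        kw_start = keywords[0][2]
        return min(candidates, key=lambda r: abs(r[2] - kw_start))[1]
    # Otherwise take the earliest candidate; on a tie prefer the longer string.
    return min(candidates, key=lambda r: (r[2], -len(r[1])))[1]


def build_rows(car_id, records):
    """Build one Car row and the matching CarFeature rows."""
    car = (car_id, *(pick(records, name) for name in SINGLE_VALUED))
    features = [(car_id, r[1]) for r in records if r[0] == "Feature"]
    return car, features


def load(conn, car, features):
    """Create the two tables and insert the rows."""
    conn.execute("create table Car (id, Year, Make, Model, Mileage, Price, PhoneNr)")
    conn.execute("create table CarFeature (id, Feature)")
    conn.execute("insert into Car values (?, ?, ?, ?, ?, ?, ?)", car)
    conn.executemany("insert into CarFeature values (?, ?)", features)


car, features = build_rows(1001, RECORDS)
connection = sqlite3.connect(":memory:")
load(connection, car, features)
print(connection.execute("select * from Car").fetchall())
```

In this sketch the keyword "miles" settles the Mileage/Price ambiguity for 11,995, and the longer match CHEVY wins over CHEV.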
24
Car ads extraction ontology
[Figure: ontology diagram. Object sets: CarAds, Color, Feature, Accessory, BodyType, OtherFeature, Engine, Transmission, Mileage, ModelTrim, Trim, Model, Year, Make, Price, PhoneNr. Relationship sets are labeled "has", with participation constraints such as 0:1, 0:*, 1:*, 0:0.7:1, 0:0.78:1 and 0:0.9:1.]
25
Car ads ontology (textual)
Car [->object];
Car [0..1] has Year [1..*];
Car [0..1] has Make [1..*];
Car [0..1] has Model [1..*];
Car [0..1] has Mileage [1..*];
Car [0..*] has Feature [1..*];
Car [0..1] has Price [1..*];
PhoneNr [1..*] is for Car [0..*];
PhoneNr [0..1] has Extension [1..*];
Year matches [4]
  constant { extract "\d{2}"; context "([^\$\d]|^)[4-9]\d[^\d]"; substitute "^" -> "19"; },
  …
  …
End;
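To show how relationship declarations in this textual notation could be read into a data structure, here is a minimal Python sketch. The function name parse_relationships and the tuple layout are hypothetical, and only the "has" and "is for" relationship forms shown on the slide are handled.

```python
import re

# One declaration: Subject [min..max] verb Object [min..max];
DECLARATION = re.compile(
    r"(\w+)\s*\[(\d+)\.\.(\d+|\*)\]\s+(has|is for)\s+(\w+)\s*\[(\d+)\.\.(\d+|\*)\];"
)


def _bound(text):
    """Turn an upper bound into an int, or None for the unbounded '*'."""
    return None if text == "*" else int(text)


def parse_relationships(source):
    """Return (subject, (min, max), verb, object, (min, max)) tuples."""
    relationships = []
    for m in DECLARATION.finditer(source):
        subj, smin, smax, verb, obj, omin, omax = m.groups()
        relationships.append(
            (subj, (int(smin), _bound(smax)), verb, obj, (int(omin), _bound(omax)))
        )
    return relationships


ONTOLOGY = """
Car [0..1] has Year [1..*];
Car [0..*] has Feature [1..*];
PhoneNr [1..*] is for Car [0..*];
"""

for relationship in parse_relationships(ONTOLOGY):
    print(relationship)
```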
26
A gene ontology
27
A genealogy data model
28
Finding jobs in linguistics
Built ontology for linguistics jobs: what defines a linguistics job
Data frames and lexicons: language names (www.ethnologue.com), subfields of linguistics (www.linguistlist.org), tools linguists use, programming languages, activities, responsibilities, country names
Documents: 3500 web pages + emails to me
Complete results reported in DLLS 2003
29
Sample query
30
Sample output
31
Subfield expertise sought
[Bar chart, axis 0 to 700: IE/IR, Morpho, NLP, Phonetics, Phonology, Pragmatics, Speech, Syntax, Semantics, MT, TESOL/EFL, Translation]
[Bar chart, axis 0 to 800: Psycho, Neuro, Historical, Typological, Acquisition, Cognition, Socioling, Lexicography, Philology, Philosophy, Anthropo]
32
Technical skills sought
[Bar chart, axis 0 to 700: C/C++, CGI, HTML/SGML, Java/Jscript, Lisp, Perl, Prolog, SQL, Tcl, VB, XML/XSLT]
[Bar chart, axis 0 to 300: Machine learning, Finite-state, Statistical, Stoch/Prob, Math, Generative, Field Methods]
33
Sample observations
270 don’t have linguist* (!)
Computer/computational background required for almost 1/3 (1116)
Noticeable amount of headhunting, particularly in Seattle, DC areas
Often a job title is not even listed (!)
Great need for ontologies related to linguistics
  job titles
  theoretical frameworks, subfields
  typical linguist job activities
  linguistic research/development venues
34
An engineering discipline?
160 linguistics jobs ending in “engineer”
Software development cycle
  research e., software design e.
  development e., software e.
  software quality e., linguistic test e., linguistic quality e.
  linguistic support e., user experience e.
  presales e., technical sales e.
Specific subfields
  web site e.
  speech e., voice recognition e., speech recognition application e., speech e., ASR tuning e., audio e.
  dialog e.
  tools e.
  AI e., NLP e.
  knowledge e., ontology e.
  linguist e., natural language e.
  staff e.
  human factors e., user interface e.
35
A recent ontologist job ad
Date: Thu, 28 Jul 2005 11:44:40
Subject: General Linguistics: Ontologist, Denver, USA
Job Rank: Ontologist
Specialty Areas: General Linguistics
Position Summary: Ontologist
This person will be responsible for modifying & editing Ontology structures.
Skills:
  Basic computer skills such as Internet, email, and spreadsheet programs
  In-depth knowledge of any major industry, such as Health Care, Automotive, Legal, Construction, and so forth helpful
  Superior communication skills, both oral and written. Ability to communicate effectively with reports, peers, superiors, and customers essential
  Travel &/or foreign language experience desired
Personal Characteristics:
  A healthy sense of logic, and a love for details
  A deep and abiding love of language, and of rule-governed classification systems. This person should be excited by the challenge of figuring out the precise place where a word belongs, and be delighted with the prospect of performing such tasks as the major part of their job
Position Qualifications:
  -Bachelor's degree, preferably in Linguistics, Library Science, English, or related field
36
Another recent ontologist ad
Position Summary: Lead Ontologist
The Lead Ontologist will be responsible for creating & designing Ontology and Ontology structures. This person will be responsible for innovation and general Ontology development as Ontology requirements change. They will serve as Team Lead on various Ontology projects, and they will assist the Director with certain aspects of management, including the development of department culture and standards. They will also serve as a liaison between the Director and the rest of the team.
Skills:
  Ability to edit & manipulate text highly desired, using tools such as Emacs and Perl. High level programming language experience and SQL also desired
  Knowledge of Ontology structures, and experience with developing and maintaining such structures
  Ability to assist with Ontology development and use problem-solving skills to overcome obstacles
  Ability to QA own Ontology work, and work of others
  Ability to lead projects from set-up through to QA
  Leadership or management experience a plus
Position Qualifications:
  -Bachelor's degree in Linguistics, Library Science, or related field
  -2-3 years experience in Ontology or related field
Application Deadline: Open until filled.
37
Matching request with ontology
“Tell me about cruises on San Francisco Bay. I’d like to know scheduled times, cost, and the duration of cruises on Friday of next week.”
38
Building a query
[Figure: query built from the request. Projection: scheduled times, cost, duration. Selection constants: Friday, Oct. 29th; San Francisco Bay. These are applied over a join path to give the result.]
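One way to picture this step is as a mapping from phrases in the request to projection attributes and selection constants. The sketch below is only that: the phrase tables, attribute names (StartTime, Price, Duration, Place, Day), and the function build_query are hypothetical, and resolving "Friday of next week" to a calendar date is left out.

```python
import re

REQUEST = (
    "Tell me about cruises on San Francisco Bay. I'd like to know scheduled "
    "times, cost, and the duration of cruises on Friday of next week."
)

# Hypothetical phrase-to-attribute lexicon for the projection.
PROJECTION_PHRASES = {
    r"scheduled times?": "StartTime",
    r"\bcost\b": "Price",
    r"\bduration\b": "Duration",
}

# Hypothetical recognizers for selection constants.
SELECTION_PATTERNS = {
    "Place": r"San Francisco Bay",
    "Day": r"\b(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day\b",
}


def build_query(request):
    """Return (projection attributes, selection constants) found in the request."""
    projection = [
        attribute
        for pattern, attribute in PROJECTION_PHRASES.items()
        if re.search(pattern, request, re.IGNORECASE)
    ]
    selection = {}
    for attribute, pattern in SELECTION_PATTERNS.items():
        match = re.search(pattern, request)
        if match:
            selection[attribute] = match.group(0)
    return projection, selection


print(build_query(REQUEST))
```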
39
| StartTime | Price | Duration | Source |
|---|---|---|---|
| 10:45 am, 12:00 pm, 1:15, 2:30, 4:00 | $20.00, $16.00, $12.00 | | 1 |
| 10:00 am, 10:45 am, 11:15 am, 12:00 pm, 12:30 pm, 1:15 pm, 1:45 pm, 2:30 pm, 3:00 pm, 3:45 pm, 4:15 pm, 5:00 pm | $17.00, $16.00, $12.00 | 1 Hour | 2 |
40
Another example
Service Request
Match with Task Ontology Domain Ontology Process Ontology
Complete, Negotiate, Finalize
I want to see a dermatologist next week; any day would be ok for me, at 4:00 p.m. The dermatologist must be within 20 miles from my home and must accept my insurance.
41
Service domain ontology
[Figure: service domain ontology diagram. Object sets: Appointment, Place, Insurance, Service Provider, Person, Name, Doctor, Pediatrician, Service Description, Duration, Medical Service Provider, Auto Service Provider, Auto Mechanic, Dermatologist, Address, Cost, Date, Time. Relationship labels: has, is at, is on, provides, accepts, is with, is for. Values shown: "IHC", "DMBA".]
43
Relevant mini-ontology
[Figure: mini-ontology diagram. Object sets: Appointment, Place, Dermatologist, Person, Name, Address, Date, Time. Relationship labels: is at, is on, has, is with, is for.]
44
Ontologies: issues
Most successful in data-rich, narrow-domain applications
Ambiguities are problematic; context only partially eliminates them
Incompleteness: implicit information
Commonsense world pragmatics is elusive
Knowledge prerequisites are steep
Major efforts in creation, maintenance
  Must be created by experts
  Experts are biased in knowledge, agreement needed
  Ontologies continually change; upkeep a massive task
45
Ontologies: possible solutions
Some automation is needed
Current automatic generation of ontologies is not successful, because the ontologies are extracted from free-form, unstructured text
A more effective alternative is to extract ontologies from structured data on the web (tables, charts, etc.)
TANGO project
  Part 1: Extract tables from the web
  Part 2: Define mini-ontologies from tables
  Part 3: Merge into growing domain ontology
46
Project TANGO
47
Overview
Table ANalysis for Generating Ontologies
3-year NSF-funded project
Joint BYU/RPI project
Uses and extends TIDIE concepts, ontologies
Goal is to process tables, generate ontologies, use results for IE
48
Motivation
Keyword or link analysis search not enough to search for information in tables
Structure in tables can lead to domain knowledge which includes concepts, relationships and constraints (ontologies)
Tables on web created for human use can lead to robust domain ontologies
49
Table understanding
What is a table?
Why table normalization?
What is table understanding?
What is mini-ontology generation?
50
What is a table?
“…a two-dimensional assembly of cells used to present information…” (Lopresti and Nagy)
Normalized tables (row-column format)
Small paper (using OCR) and/or electronic tables (marked up) intended for human use
51
?
Olympus C-750 Ultra Zoom
Sensor Resolution: 4.2 megapixels
Optical Zoom: 10 x
Digital Zoom: 4 x
Installed Memory: 16 MB
Lens Aperture: F/8-2.8/3.7
Focal Length min: 6.3 mm
Focal Length max: 63.0 mm
54
?
Olympus C-750 Ultra Zoom
| Sensor Resolution | 4.2 megapixels |
| Optical Zoom | 10 x |
| Digital Zoom | 4 x |
| Installed Memory | 16 MB |
| Lens Aperture | F/8-2.8/3.7 |
| Focal Length min | 6.3 mm |
| Focal Length max | 63.0 mm |
55
Digital Camera
Olympus C-750 Ultra Zoom
Sensor Resolution: 4.2 megapixels
Optical Zoom: 10 x
Digital Zoom: 4 x
Installed Memory: 16 MB
Lens Aperture: F/8-2.8/3.7
Focal Length min: 6.3 mm
Focal Length max: 63.0 mm
56
?
| Flight # | Class | From | Time/Date | To | Time/Date | Stops |
|---|---|---|---|---|---|---|
| Delta 16 | Coach | JFK | 6:05 pm, 02 01 04 | CDG | 7:35 am, 03 01 04 | 0 |
| Delta 119 | Coach | CDG | 10:20 am, 09 01 04 | JFK | 1:00 pm, 09 01 04 | 0 |
58
Airline Itinerary
| Flight # | Class | From | Time/Date | To | Time/Date | Stops |
|---|---|---|---|---|---|---|
| Delta 16 | Coach | JFK | 6:05 pm, 02 01 04 | CDG | 7:35 am, 03 01 04 | 0 |
| Delta 119 | Coach | CDG | 10:20 am, 09 01 04 | JFK | 1:00 pm, 09 01 04 | 0 |
59
?
| Place | Bonnie Lake |
| County | Duchesne |
| State | Utah |
| Type | Lake |
| Elevation | 10,000 feet |
| USGS Quad | Mirror Lake |
| Latitude | 40.711°N |
| Longitude | 110.876°W |
Maps
Place: Bonnie Lake
County: Duchesne
State: Utah
Type: Lake
Elevation: 10,100 feet
USGS Quad: Mirror Lake
Latitude: 40.711ºN
Longitude: 110.876ºW
Table normalization
Take any table and produce a standard row-column table in which every data cell contains an expanded value and type information

| Country | GDP/PPP | GDP/PPP Per Capita | Real-Growth Rate | Inflation |
|---|---|---|---|---|
| Afghanistan | $21,000,000,000 | $800 | ? | ? |
| Albania | $13,200,000,000 | $3,800 | 7.3% | 3.0% |
| Algeria | $177,000,000,000 | $5,600 | 3.8% | 3.0% |
| Andorra | $1,300,000,000 | $19,000 | 3.8% | 4.3% |
| Angola | $13,300,000,000 | $1,330 | 5.4% | 110.0% |
| Antigua and Barbuda | $674,000,000 | $10,000 | 3.5% | 0.4% |
| … | … | … | … | … |

Raw table
Normalized table
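
As a rough illustration of "expanded values and type information", the sketch below turns raw cell strings into (value, type) pairs. It is only a minimal sketch: the function names `normalize_cell` and `normalize_table`, the type labels, and the regular expressions are assumptions made for this example, not part of the TANGO system.

```python
import re


def normalize_cell(raw):
    """Expand one raw cell string into a (value, type) pair.

    The type labels used here are illustrative assumptions.
    """
    text = raw.strip()
    if text in {"", "?"}:
        # Missing or unknown value in the source table.
        return (None, "unknown")
    if re.fullmatch(r"\$[\d,]+(\.\d+)?", text):
        # Dollar amount: drop the currency sign and thousands separators.
        return (float(text[1:].replace(",", "")), "dollar amount")
    if re.fullmatch(r"-?\d+(\.\d+)?%", text):
        # Percentage: keep the number, record the unit as the type.
        return (float(text[:-1]), "percentage")
    return (text, "string")


def normalize_table(header, rows):
    """Return one dict per row, mapping column name to (value, type)."""
    return [
        {name: normalize_cell(cell) for name, cell in zip(header, row)}
        for row in rows
    ]


header = ["Country", "GDP/PPP", "GDP/PPP Per Capita", "Real-Growth Rate", "Inflation"]
rows = [
    ["Afghanistan", "$21,000,000,000", "$800", "?", "?"],
    ["Albania", "$13,200,000,000", "$3,800", "7.3%", "3.0%"],
]
normalized = normalize_table(header, rows)
```

A real normalizer would also have to handle captions, footnotes, and unit information embedded in headers; this sketch only covers the cell formats visible in the sample table.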
Normalizing across hyperlinks
Normalized table

| ?? | Population | Population Growth Rate | Population Density | Birth Rate | Death Rate | Migration Rate | Life Expectancy Male | Life Expectancy Female | Infant Mortality |
|---|---|---|---|---|---|---|---|---|---|
| Afghanistan | 25,824,882 | 3.95% | 39.88 persons/km² | 4.19% | 1.70% | 1.46% | 47.82 years | 46.82 years | 14.06% |
| Albania | 3,364,571 | 1.05% | 122.79 persons/km² | 2.07% | 0.74% | -0.29% | 65.92 years | 72.33 years | 4.29% |
| Algeria | 31,133,486 | 2.10% | 13.07 persons/km² | 2.70% | 0.55% | -0.05% | 68.07 years | 70.46 years | 4.38% |
| American Samoa | 63,786 | 2.64% | 320.53 persons/km² | 2.65% | 0.40% | 0.39% | 71.23 years | 79.95 years | 1.02% |
| Andorra | 65,939 | 2.24% | 146.53 persons/km² | 1.03% | 0.55% | 1.76% | 80.55 years | 86.55 years | 0.41% |
| Angola | 11,510 | 2.84% | 8.97 persons/km² | 4.31% | 1.64% | 0.16% | 46.08 years | 50.82 years | 12.92% |
| … | … | … | … | … | … | … | … | … | … |
| Western Sahara | 239,333 | 2.34% | 0.90 persons/km² | 4.54% | 1.66% | -0.54% | 47.98 years | 50.57 years | 13.67% |
| World | 5,995,544,836 | 1.30% | 14.42 persons/km² | 2.20% | 0.90% | ? | 61.00 years | 65.00 years | 5.60% |
| Yemen | 16,942,230 | 3.34% | 32.09 persons/km² | 4.33% | 0.99% | 0.00% | 58.17 years | 61.88 years | 6.98% |
| Zambia | 9,663,535 | 2.12% | 13.05 persons/km² | 4.45% | 2.26% | 0.08% | 36.72 years | 37.21 years | 9.19% |
| Zimbabwe | 11,163,160 | 1.02% | 28.87 persons/km² | 3.06% | 2.04% | ? | 38.77 years | 38.94 years | 6.12% |
How to understand tables
Captions – in the vicinity of the table (above, below, etc.)
Footnotes – on annotated column labels or data cells
Embedded information – in rows, columns or cells {e.g., $, %, (1,000), billions, etc.}
Links to other views of the table, possibly with new information
Use of normalized data
Take a table as input and produce standard records in the form of attribute-value pairs as output
Discover constraints among columns
Understand the data values

| Country | GDP/PPP | GDP/PPP Per Capita | Real-Growth Rate | Inflation |
|---|---|---|---|---|
| Afghanistan | $21,000,000,000 | $800 | ? | ? |
| Albania | $13,200,000,000 | $3,800 | 7.3% | 3.0% |
| Algeria | $177,000,000,000 | $5,600 | 3.8% | 3.0% |
| Andorra | $1,300,000,000 | $19,000 | 3.8% | 4.3% |
| Angola | $13,300,000,000 | $1,330 | 5.4% | 110.0% |
| Antigua and Barbuda | $674,000,000 | $10,000 | 3.5% | 0.4% |
| … | … | … | … | … |

{has(Country, GDP/PPP), has(Country, GDP/PPP Per Capita), has(Country, Real-growth rate*), has(Country, Inflation*)}
Left-most, primary key
Dollar amount (from data frame)
Percentage (from data frame)
Country names (from data frame)
{<Country: Afghanistan>, <GDP/PPP: $21,000,000,000>, <GDP/PPP per capita: $800>, <Real-growth rate: ?>, <Inflation: ?>}
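
The step from a normalized table to attribute-value records and `has(...)` relationships might be sketched as follows. This is a minimal sketch under stated assumptions: the helper names `to_records` and `has_relationships` are hypothetical, the left-most column is taken as the key (as the slide annotation suggests), and the trailing `*` is assumed here to mark columns that contain unknown (`?`) values.

```python
def to_records(header, rows):
    """Turn each table row into a dict of attribute-value pairs."""
    return [dict(zip(header, row)) for row in rows]


def has_relationships(header, rows, unknown="?"):
    """Relate the left-most (key) column to every other column.

    Assumption: a trailing '*' marks a column in which at least one
    value is unknown.
    """
    key = header[0]
    relationships = []
    for index, name in enumerate(header[1:], start=1):
        optional = any(row[index] == unknown for row in rows)
        suffix = "*" if optional else ""
        relationships.append(f"has({key}, {name}{suffix})")
    return relationships


header = ["Country", "GDP/PPP", "GDP/PPP Per Capita", "Real-growth rate", "Inflation"]
rows = [
    ["Afghanistan", "$21,000,000,000", "$800", "?", "?"],
    ["Albania", "$13,200,000,000", "$3,800", "7.3%", "3.0%"],
]
records = to_records(header, rows)
relationships = has_relationships(header, rows)
```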
Ontology generation overview
Concepts of Interest
Concepts with Relations
Data extraction ontology
Sample Documents
Example: Creating a domain ontology
Has associated data frames
Includes procedural knowledge
Distances
Duration between time zones
Name Geopolitical Entity
Time
Location
Longitude Latitude
has names
Latitude and longitude designates location
Country City
Has GMT
Example: Table understanding to mini-ontology generation

| Agglomeration | Population | Continent | Country |
|---|---|---|---|
| Tokyo | 31,139,900 | Asia | Japan |
| New York-Philadelphia | 30,286,900 | The Americas | United States of America |
| Mexico | 21,233,900 | The Americas | Mexico |
| Seoul | 19,969,100 | Asia | Korea (South) |
| Sao Paulo | 18,847,400 | The Americas | Brazil |
| Jakarta | 17,891,000 | Asia | Indonesia |
| Osaka-Kobe-Kyoto | 17,621,500 | Asia | Japan |
| … | … | … | … |
| Niigata | 503,500 | Asia | Japan |
| Raurkela | 503,300 | Asia | India |
| Homjel | 502,200 | Europe | Belarus |
| Zunyi | 501,900 | Asia | China |
| Santiago | 501,800 | The Americas | Dominican Republic |
| Pingdingshan | 501,500 | Asia | China |
| Fargona | 501,000 | Asia | Uzbekistan |
| Kirov | 500,200 | Europe | Russia |
| Newcastle | 500,000 | Australia/Oceania | Australia |

Agglomeration Population
Country Continent
Example: Concept matching to ontology merging
Merge
Results
Agglomeration Population
Country Continent
Time
Location
Longitude Latitude
has names
Latitude and longitude designates location
Country City
Name Geopolitical Entity
Continent
Location
Longitude Latitude
Latitude and longitude designates location
Name Geopolitical Entity
Population
City Agglomeration Country
Has GMT
Time
Location
Longitude Latitude
has names
Latitude and longitude designates location
Country City
Name Geopolitical Entity
Has GMT
Ontology merging/growing
Direct merge (no conflicts)
Use results of matching phase to find similar concepts in ontologies (e.g., data value similarities, data frames, NLP, etc.)
Conflict resolution
Interactively identify evidence and counter-evidence of functional relationships among mini-ontologies using constraint resolution
IDS: interaction with human knowledge engineer
Issues – identify
Default strategy – apply
Suggestions – make
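
One of the matching cues named above, data value similarity, can be illustrated with a simple set-overlap score. The sketch below is an assumption-laden toy: `jaccard` and `match_concepts` are hypothetical names, the 0.5 threshold is arbitrary, and the real matching phase also draws on data frames and NLP, which are not modeled here.

```python
def jaccard(values_a, values_b):
    """Jaccard similarity between two collections of data values."""
    a, b = set(values_a), set(values_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_concepts(ontology_a, ontology_b, threshold=0.5):
    """Propose concept pairs whose names or sample values agree.

    Each ontology is a dict: concept name -> sample data values.
    Returns a list of (concept_a, concept_b, score) candidates.
    """
    matches = []
    for name_a, values_a in ontology_a.items():
        for name_b, values_b in ontology_b.items():
            if name_a.lower() == name_b.lower():
                score = 1.0  # identical names count as a full match
            else:
                score = jaccard(values_a, values_b)
            if score >= threshold:
                matches.append((name_a, name_b, score))
    return matches


# Illustrative sample values only.
mini = {"Country": ["Japan", "Mexico", "Brazil"], "Agglomeration": ["Tokyo", "Seoul"]}
domain = {"Country": ["Japan", "France"], "City": ["Tokyo", "Seoul", "Paris"]}
candidates = match_concepts(mini, domain)
```

Pairs that score below the threshold, or that conflict with existing constraints, would be the ones handed to the knowledge engineer in the IDS step.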
Example: Another mini-ontology generation
Place
Longitude Latitude
Elevation
USGS Quad
Area
Mine Reservoir Lake City/town
Country
State
Place Name
⊎
Example: Another mini-ontology generation
Place
Longitude Latitude
Elevation
USGS Quad
Area
Mine Reservoir Lake City/town
Country
State
Place Name
⊎
Location
Longitude Latitude
Latitude and longitude designates location
Name Geopolitical Entity
Population
City Agglomeration Country
Merge
Continent
Time
has names, has GMT
Example: Concept Mapping to Ontology Merging
Place
Elevation
USGS Quad
Area
Mine Reservoir Lake
Country
State
⊎
Location
Longitude Latitude
Latitude and longitude designates location
Name Geopolitical Entity
Population
Agglomeration Country Continent
Time
has names, has GMT
Geopolitical Entity with population
City/town
Recognize Table Information
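
| Country | Population (July 2001 est.) | Religion: Albanian Orthodox | Religion: Muslim | Religion: Roman Catholic | Religion: Shi’a Muslim | Religion: Sunni Muslim | Religion: other |
|---|---|---|---|---|---|---|---|
| Afghanistan | 26,813,057 | | | | 15% | 84% | 1% |
| Albania | 3,510,484 | 20% | 70% | 30% | | | |
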
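Construct Mini-Ontology

| Country | Population (July 2001 est.) | Religion: Albanian Orthodox | Religion: Muslim | Religion: Roman Catholic | Religion: Shi’a Muslim | Religion: Sunni Muslim | Religion: other |
|---|---|---|---|---|---|---|---|
| Afghanistan | 26,813,057 | | | | 15% | 84% | 1% |
| Albania | 3,510,484 | 20% | 70% | 30% | | | |
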
Discover Mappings
Merge
Review: the TANGO process
Start out with a normalized table
Generate likely candidates for:
Object sets
Relationship sets
Functional constraints
Inclusion constraints/hierarchical structure
Get help from user when needed
Choose best candidate for the ontology
Generate concepts
Create list of candidate concepts (usually column names)
Example 1: Generate Concepts
Determine lexicalization (columns with associated values are lexical)
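
The two concept-generation steps (list candidate concepts from column names, then decide lexicalization) could be sketched roughly as below. `generate_concepts` is a hypothetical helper, and treating empty strings and `?` as "no value" is an assumption made for this example.

```python
def generate_concepts(header, rows, missing=("", "?")):
    """List one candidate concept per column and flag it as lexical
    when the column carries at least one actual value."""
    concepts = []
    for index, name in enumerate(header):
        values = [row[index] for row in rows if row[index] not in missing]
        concepts.append({"name": name, "lexical": bool(values)})
    return concepts


# Illustrative table; the empty "Notes" column is made up for the example.
header = ["Country", "GDP/PPP", "Notes"]
rows = [
    ["Albania", "$13,200,000,000", ""],
    ["Algeria", "$177,000,000,000", ""],
]
concepts = generate_concepts(header, rows)
```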
Example 1: Generate Concepts
Current ontology
Example 1: Generate Relationships
Decide relationship sets
Exponential number of combinations
Basic assumption: one main concept relates to all others (attributes)
Goal: find central column of interest
Example 1: Generate Relationships
Look for mapping between one column and title of table
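
A naive version of this title-to-column mapping is sketched below. The function name `find_central_column`, the crude plural stripping, the fallback to the left-most column, and the sample title are all assumptions for illustration; the slides do not say how the mapping is actually computed.

```python
def find_central_column(title, header):
    """Pick the column whose name appears in the table title.

    Words are compared case-insensitively with a trailing 's' removed
    (a crude plural check). Falls back to the left-most column.
    """
    title_words = {word.lower().rstrip("s") for word in title.split()}
    for name in header:
        if name.lower().rstrip("s") in title_words:
            return name
    return header[0]  # assumption: left-most column as default


# Hypothetical title and header.
central = find_central_column(
    "Languages of the World", ["Family", "Language", "Speakers"]
)
```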
Example 1: Generate Relationships
Current ontology
Example 1: Generate Constraints
FDs and participation constraints
FD definition: X → Y iff (X[i] = X[j]) → (Y[i] = Y[j]) for all row indexes i and j.
Unless solid case (two or more same values), only consider FDs from central object to attributes
Use heuristics for setting exact participation (0:1, 1:*, etc.)
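
The FD definition above can be checked directly against table rows. The following is a minimal sketch, assuming rows are Python dicts keyed by column name; `holds_fd` and `is_solid_case` are hypothetical helper names, and reading "solid case" as "the determining column repeats at least one value" is an interpretation of the slide rather than something it states.

```python
from collections import Counter


def holds_fd(rows, x, y):
    """Check X -> Y: rows that agree on column x must agree on column y."""
    seen = {}
    for row in rows:
        if row[x] in seen and seen[row[x]] != row[y]:
            return False
        seen[row[x]] = row[y]
    return True


def is_solid_case(rows, x):
    """True if some value of column x occurs two or more times, so that
    an FD from x is actually tested by the data (interpretation)."""
    counts = Counter(row[x] for row in rows)
    return any(count >= 2 for count in counts.values())


rows = [
    {"Agglomeration": "Tokyo", "Country": "Japan", "Continent": "Asia"},
    {"Agglomeration": "Osaka-Kobe-Kyoto", "Country": "Japan", "Continent": "Asia"},
    {"Agglomeration": "Seoul", "Country": "Korea (South)", "Continent": "Asia"},
]
country_determines_continent = holds_fd(rows, "Country", "Continent")
continent_determines_country = holds_fd(rows, "Continent", "Country")
```

Note that an FD holding on a sample is only evidence, not proof; a column with all-distinct values trivially determines every other column, which is why the repeated-value check matters.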
Example 1: Generate Concepts
Numerical values are usually functionally determined by the column of interest and have a 0:* participation constraint.
Example 1: Generate Constraints
Completed mini-ontology
Example 2: Generate Concepts
SubFamily, Group, and SubGroup are generic types
Enumerate column values as object sets because there are fewer than 5 divisions (recursively)
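
The "fewer than 5 divisions" rule might look like the sketch below. `enumerated_object_sets` is a hypothetical name, the column values "A" through "E" are placeholders (the slide's actual data is not in the transcript), and the recursive application to nested columns is not shown.

```python
def enumerated_object_sets(rows, column, max_divisions=5):
    """Return the distinct values of a column as candidate object sets
    when there are fewer than max_divisions of them, else None."""
    distinct = sorted({row[column] for row in rows})
    if len(distinct) < max_divisions:
        return distinct
    return None


# Placeholder values for illustration only.
rows = [{"Group": "A"}, {"Group": "B"}, {"Group": "A"}]
object_sets = enumerated_object_sets(rows, "Group")
```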
Example 2: Generate Relationships
Found mapping of central column of interest to title (Language)
Exceptions to basic assumption
Hierarchy (enumerated object sets)
Transitive FDs (X → Y, Y → Z, remove X → Z)
Create ISA hierarchy from table structure
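
The transitive-FD rule (given X → Y and Y → Z, remove X → Z) can be sketched as a single pass over a set of FD pairs. This is a minimal sketch: `remove_transitive` is a hypothetical name, FDs are assumed to be single-column pairs, and the set is assumed to be free of cycles. The column names come from the slide; which FDs actually hold among them is made up for the example.

```python
def remove_transitive(fds):
    """Drop every FD (x, z) that is implied by some (x, y) and (y, z).

    fds is a set of (determinant, dependent) column-name pairs.
    Assumes no cycles among the FDs.
    """
    fds = set(fds)
    redundant = {
        (x, z)
        for (x, y) in fds
        for (y2, z) in fds
        if y == y2 and x != y and y != z and x != z and (x, z) in fds
    }
    return fds - redundant


fds = {
    ("Language", "SubGroup"),
    ("SubGroup", "Group"),
    ("Language", "Group"),  # implied by the two FDs above
}
reduced = remove_transitive(fds)
```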
Example 2: Generate Relationships
Current ontology
Example 2: Generate Hierarchical Constraints
Assign members to each object set for easy calculation
Find inclusion dependencies:
Union – all members of the parent are members of one or more children
Intersection (less common) – child members are always in both parents
Mutual exclusion – the intersection of the member sets of any two children is empty
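
Once each object set has its members assigned, the three inclusion checks reduce to plain set operations. The sketch below uses hypothetical function names and toy member sets; it is meant only to make the definitions concrete.

```python
from itertools import combinations


def is_union(parent, children):
    """Every member of the parent belongs to at least one child."""
    return set(parent) <= set().union(*children)


def is_intersection(child, parents):
    """Every member of the child belongs to all of the parents."""
    return all(set(child) <= set(parent) for parent in parents)


def is_mutually_exclusive(children):
    """No two children share a member."""
    return all(not (set(a) & set(b)) for a, b in combinations(children, 2))


# Toy member sets for illustration.
parent = {"a", "b", "c"}
disjoint_children = [{"a"}, {"b", "c"}]
overlapping_children = [{"a", "b"}, {"b"}]
```

When both the union and mutual-exclusion checks succeed, the children split the parent's members without gaps or overlap.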
Example 2: Generate Hierarchical Constraints
Completed mini-ontology
Future direction
Start with multiple tables (or URLs) and generate mini-ontologies
Identify most suitable mini-ontologies to merge by calculating which tables have most overlap of concepts
Generate multiple domain ontologies
Integrate with form-based data extraction tools (smarter Web search engines)
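
Choosing which mini-ontologies to merge first by concept overlap could be sketched as below. Everything here is an assumption for illustration: the function name `best_merge_pair`, the use of exact concept-name equality as the overlap measure, and the three sample concept sets (loosely based on tables shown earlier).

```python
from itertools import combinations


def best_merge_pair(mini_ontologies):
    """Return (name_a, name_b, overlap) for the pair of mini-ontologies
    sharing the most concept names, or None if fewer than two exist.

    mini_ontologies maps an ontology name to its set of concept names.
    """
    best = None
    for (name_a, concepts_a), (name_b, concepts_b) in combinations(
        mini_ontologies.items(), 2
    ):
        overlap = len(set(concepts_a) & set(concepts_b))
        if best is None or overlap > best[2]:
            best = (name_a, name_b, overlap)
    return best


minis = {
    "agglomerations": {"Agglomeration", "Population", "Country", "Continent"},
    "places": {"Place", "Longitude", "Latitude", "Elevation", "Country", "State"},
    "religions": {"Country", "Population", "Religion"},
}
pair = best_merge_pair(minis)
```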