A First Attempt towards a Logical Model for the PBMS
PANDA Meeting, Milano, 18 April 2002
National Technical University of Athens
PANDA: Patterns for Next-Generation Database Systems
P. Vassiliadis. PANDA Meeting, Milano, 18 April 2002
Overview
• General Understanding of the PBMS
• Mathematical Background
• MetaModel: Entities and Language
• The Software Engineering Perspective
• Conclusions
General Framework
[Figure: layered view of the PBMS.
• Meta-Pattern Type Layer (with its Language): Meta-Pattern Type.
• Pattern Type Layer: Association Rule Type, DBSCAN Cluster Type, Decision Tree Type; each belongs to the Meta-Pattern Type.
• Pattern Layer: Cluster 1 to 3, Assoc. Rule 1 to n, Decision Tree 1; each belongs to its pattern type.
• Raw Data, with the Ass. Rule Algorithm, Dec. Tree Algorithm, and DBSCAN Cluster Algorithm.
Meta-Pattern Type + Pattern Types = PBMS Catalog; Pattern Layer = PBMS Content.]
General Idea
Each PBMS notion is paired with its relational counterpart:
• Meta-Pattern Type + Language (a Name, a Condensed Expression, an Extension, and a Language) corresponds to Relation + Language (a Name, a Schema, an Extension, and Relational Calculus).
• Pattern Type (AssociationRuleType; head :- body; ext(AssociationRuleType)) corresponds to Relational Table (Buys; session_id, date, item, price; ext(Buys)).
• Pattern (Buys(x,_,beer,_) :- Buys(x,_,pampers,_)) corresponds to Tuple (Buys(34,4/4/2002,beer,2)).
Mathematical Background
Assumptions from the definition:
• There exists a data space and a pattern space.
• There always exist M:N relationships among data and patterns.
[Figure: Data Space and Pattern Space]
Characteristics of data and pattern space
• Each data item is characterized by a finite number of features, N.
• dom(x) denotes the domain of feature x.
• Data space: $D^N \subseteq \mathrm{dom}(A_1) \times \dots \times \mathrm{dom}(A_N)$
• Proposal: all dom(x) are countably infinite; consider both cases for $D^N$ (finite or not).
• Each pattern is characterized by a finite number of features, M.
• Pattern space: $D^M \subseteq \mathrm{dom}(A_1) \times \dots \times \mathrm{dom}(A_M)$
• Proposal: all dom(x) are countably infinite; $D^M$ is clearly finite.
Statistical Measures
The data-pattern relationship $f_{DP}$ has:
• participation measures for the relationship;
• importance measures for a data item;
• importance measures for a pattern.
[Figure: Data Space and Pattern Space]
Statistical Measures
• Richness of the representation = $\frac{\text{relationships captured by the condensed representation}}{\text{total number of relationships}}$
• Compactness of the representation = $\frac{\mathrm{size}(D^M) \cdot M}{\mathrm{size}(D^N) \cdot N}$
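As a rough illustration of the two ratios above, the following Python sketch computes them for a toy data space and pattern space. The function names, the encoding of relationships as (data id, pattern id) pairs, and the sample figures are assumptions made for illustration; they do not come from the slides.

```python
from fractions import Fraction


def richness(captured_relationships, all_relationships):
    """Share of data-pattern relationships that the condensed representation captures."""
    all_rel = set(all_relationships)
    if not all_rel:
        raise ValueError("the data-pattern relationship is empty")
    return Fraction(len(set(captured_relationships) & all_rel), len(all_rel))


def compactness(num_patterns, m_features, num_data_items, n_features):
    """size(D^M) * M divided by size(D^N) * N; only defined when D^N is finite."""
    if num_data_items <= 0 or n_features <= 0:
        raise ValueError("the data space must be finite and non-empty")
    return Fraction(num_patterns * m_features, num_data_items * n_features)


# Hypothetical toy input: 4 relationships in total, 3 of them captured.
all_pairs = {("d1", "p1"), ("d2", "p1"), ("d3", "p2"), ("d4", "p2")}
captured = {("d1", "p1"), ("d2", "p1"), ("d3", "p2")}

print(richness(captured, all_pairs))   # 3/4
print(compactness(2, 3, 1000, 4))      # 3/2000
```

A compactness well below 1 means the pattern space is much smaller than the data it summarizes; the ratio is meaningful only in the case where $D^N$ is finite.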
Pattern Types
• Intensional description of a pattern type, consisting of:
– PID
– Explicit Relationship: $f_{DP_i}: D^N \to D^M_i$
– Relationship Expression
– Statistical Measures
• Extensional description (or pattern extension) of a pattern type: a finite set of patterns
• Data extension of a pattern type: a countable (?) set of data items
Example
Each field of the pattern type intensional description is followed by its value in [a small part of] the pattern type extensional description:
• PID: PID123
• Explicit Relationship: $f_{DP_i}: D^N \to D^M_i$ = {(PID123,RID124),…}
• Relationship Expression: Buys(x,_,beer,_) :- Buys(x,_,pampers,_)
• Statistical Measures: Coverage=80%, Confidence=90%
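The slides do not define Coverage and Confidence, so the sketch below assumes common definitions: coverage is the share of sessions that satisfy the rule body, and confidence is the share of those sessions that also satisfy the head. The Buys tuples and the function name `rule_measures` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical tuples of Buys(session_id, date, item, price).
BUYS = [
    (34, "4/4/2002", "beer", 2),
    (34, "4/4/2002", "pampers", 9),
    (35, "4/4/2002", "pampers", 9),
    (36, "5/4/2002", "milk", 1),
    (37, "5/4/2002", "pampers", 9),
    (37, "5/4/2002", "beer", 2),
]


def rule_measures(buys, body_item, head_item):
    """Evaluate Buys(x,_,head,_) :- Buys(x,_,body,_) over the given tuples.

    Assumed definitions (the slides do not define the measures):
      coverage   = sessions satisfying the body / all sessions
      confidence = sessions satisfying body and head / sessions satisfying the body
    """
    items_by_session = defaultdict(set)
    for session_id, _date, item, _price in buys:
        items_by_session[session_id].add(item)

    body_sessions = {s for s, items in items_by_session.items() if body_item in items}
    both_sessions = {s for s in body_sessions if head_item in items_by_session[s]}

    coverage = len(body_sessions) / len(items_by_session)
    confidence = len(both_sessions) / len(body_sessions) if body_sessions else 0.0
    return coverage, confidence, sorted(both_sessions)


coverage, confidence, supporting = rule_measures(BUYS, "pampers", "beer")
print(coverage, confidence, supporting)
```

The returned list of supporting sessions plays the role of the data side of the explicit relationship for this one pattern.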
Meta-Pattern Types
• Intensional description of a meta-pattern type, consisting of:
– Name
– Condensed Expression
– [Meta]Statistical Measures
– ?? Schema Attributes ??
• Extensional description of a meta-pattern type: a finite set of pattern types
Example
Meta-pattern type: each field of the intensional description is followed by its value in [a small part of] the extensional description:
• Name: AssociationRuleType
• Condensed Expression: head :- body
• [Meta]Statistical Measures: Coverage: Float[0..1], Confidence: Float[0..1]
• Schema Attributes ??: PID, Head, Body ??
Pattern type: each field of the intensional description is followed by its value in [a small part of] the extensional description:
• PID: PID123
• Explicit Relationship: $f_{DP_i}: D^N \to D^M_i$ = {(PID123,RID124),…}
• Relationship Expression: Buys(x,_,beer,_) :- Buys(x,_,pampers,_)
• Statistical Measures: Coverage=80%, Confidence=90%
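One way to picture the two levels of this example is as two record types, with the meta-pattern type constraining the measures of its pattern types. The sketch below is illustrative only: the class names, field names, and the `conforms` check are assumptions, and Float[0..1] is modeled as a pair of bounds.

```python
from dataclasses import dataclass, field


@dataclass
class MetaPatternType:
    """Catalog entry describing a family of pattern types (names are illustrative)."""
    name: str
    condensed_expression: str
    measure_domains: dict  # measure name -> (low, high) bounds


@dataclass
class PatternType:
    """One member of a meta-pattern type's extension."""
    pid: str
    relationship_expression: str
    measures: dict
    explicit_relationship: set = field(default_factory=set)  # (PID, RID) pairs


def conforms(pattern_type, meta):
    """Check that every declared measure is present and lies within its domain."""
    for measure, (low, high) in meta.measure_domains.items():
        value = pattern_type.measures.get(measure)
        if value is None or not (low <= value <= high):
            return False
    return True


association_rule_type = MetaPatternType(
    name="AssociationRuleType",
    condensed_expression="head :- body",
    measure_domains={"Coverage": (0.0, 1.0), "Confidence": (0.0, 1.0)},
)

pid123 = PatternType(
    pid="PID123",
    relationship_expression="Buys(x,_,beer,_) :- Buys(x,_,pampers,_)",
    measures={"Coverage": 0.80, "Confidence": 0.90},
    explicit_relationship={("PID123", "RID124")},
)

print(conforms(pid123, association_rule_type))  # True
```

The sketch checks only the statistical measures; whether the relationship expression matches the condensed expression head :- body is exactly the language question the slides leave open.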
Which language to choose?
• Relational Calculus, Datalog and Stratified Datalog?
– Powerful but not elegant for all the patterns that we might want to express…
• Constraint database approach?
– We cannot guarantee a finite representation of the result for non-linear constraints…
Which language to choose?
• Remove recursion?
– Cannot express interesting patterns like transitive closure…
• Only linear constraints?
– Cannot express interesting patterns like cyclic clusters…
– Approximation of polynomials through sets of linear constraints? Not elegant…
• Forget constraints and describe every pattern type as a simple predicate?
– Loss of all the declarative information on the nature of the pattern type…
• So, what to do? Possible dead-end due to the paradigm?
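To make the recursion point concrete: transitive closure is usually written in Datalog with one recursive rule, which is what would be lost if recursion were removed. The Python sketch below evaluates such a rule pair by naive bottom-up iteration to a fixpoint. The predicate names `edge` and `path` and the sample facts are assumptions chosen for illustration.

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of:
        path(x, y) :- edge(x, y).
        path(x, z) :- path(x, y), edge(y, z).
    Iterates until no new path facts are derived (the least fixpoint).
    """
    edges = set(edges)
    path = set(edges)
    while True:
        derived = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2}
        new_facts = derived - path
        if not new_facts:
            return path
        path |= new_facts


edges = {("a", "b"), ("b", "c"), ("c", "d")}
print(sorted(transitive_closure(edges)))
# [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
```

The number of iterations depends on the data (the longest chain of edges), which is why a fixed, non-recursive rule set cannot express it in general.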
How to build it?
• Each of the pattern types implemented as a Class.
• The different pattern types defined as specializations of a Generic Pattern Class.
• Treat pattern types as predicates, with semantics computed by a computationally complete procedural language [e.g., PL/SQL, C++, …]?
– Instead of fundamental research we turn to feasibility issues…
• What about behavior?
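A minimal sketch of this class-based option, in Python rather than PL/SQL or C++: a generic pattern class with two specializations whose semantics are plain procedural code. The class names, the `matches` method, and the sample patterns are hypothetical and only illustrate the idea.

```python
from abc import ABC, abstractmethod


class GenericPattern(ABC):
    """Generic Pattern Class: state shared by every pattern type."""

    def __init__(self, pid, measures):
        self.pid = pid
        self.measures = dict(measures)

    @abstractmethod
    def matches(self, data_item):
        """Procedural semantics: does this pattern represent the data item?"""


class AssociationRule(GenericPattern):
    """Specialization for rules of the form head :- body over item sets."""

    def __init__(self, pid, body, head, measures):
        super().__init__(pid, measures)
        self.body = frozenset(body)
        self.head = frozenset(head)

    def matches(self, data_item):
        # data_item is assumed to be the set of items bought in one session.
        return (self.body | self.head) <= set(data_item)


class Cluster(GenericPattern):
    """Specialization for a circular cluster given by a center and a radius."""

    def __init__(self, pid, center, radius, measures):
        super().__init__(pid, measures)
        self.center = center
        self.radius = radius

    def matches(self, data_item):
        # data_item is assumed to be a 2-D point (x, y).
        dx = data_item[0] - self.center[0]
        dy = data_item[1] - self.center[1]
        return dx * dx + dy * dy <= self.radius ** 2


rule = AssociationRule("PID123", {"pampers"}, {"beer"}, {"Confidence": 0.9})
cluster = Cluster("PID200", (0.0, 0.0), 1.0, {})
print(rule.matches({"pampers", "beer", "milk"}), cluster.matches((2.0, 0.0)))
# True False
```

The circular cluster uses a non-linear test that procedural code handles without difficulty, but the meaning of each pattern type is now hidden inside a method body, which is the loss of declarative information the earlier slides warn about.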
General Framework: How to build it?
[Figure: class-based view of the PBMS.
• Generic Class, accompanied by a set of DDL/DML languages.
• Association Rule Class, Cluster Class, Decision Tree Class; each ISA Generic Class.
• Pattern Layer: Cluster 1 to 3, Assoc. Rule 1 to n, Decision Tree 1; each is an instance (IN) of its class.
Meta-Pattern Type + Pattern Types = PBMS Catalog; Pattern Layer = PBMS Content.]
Conclusions
• Followed the Datalog paradigm (need for deductive capabilities), enhanced with constraints (need for elegance).
• Reduced the problem to the specification of a proper language for the description of pattern types.
• Fundamental language limitations arise once constraints are considered.
• Dilemma:
– Change paradigm?
– Stick with this paradigm and focus on engineering issues?
– … Any other suggestions? …
Thank you …
Definitions from the minutes of Athens meeting
• A pattern is a compact and semantically rich representation of raw data.
• A Pattern-Based Management System (PBMS) is a system for handling (storing / processing / retrieving) patterns extracted from raw data in order to efficiently support pattern matching and to exploit pattern-related operations generating intensional information.
Issues around the pattern definition
• The mapping from the original raw data space to a less populated (compact) pattern space is always possible, preserving (or documenting) as much knowledge as possible from the raw data space (rich in semantics).
• An M:N mapping between raw data space and pattern space is permitted.
• Perhaps several levels of representation / abstraction exist (different levels of granularity, multi-dimensionality, recursion, hierarchies, etc.).
Issues around the PBMS definition
• A PBMS will cooperate with a DBMS storing raw data;
• A PBMS processes different kinds of queries (because of different user needs) on raw data and returns more intuitive results to users;
• A PBMS is useful in order to process those queries more efficiently than a normal DBMS would do;
• A PBMS will have its own mechanisms for representing and storing its entries (patterns), posing and processing queries, efficiently retrieving its entries.
Query Language Issues
• Given a datum, which pattern does it refer to? Which are the data that correspond to this pattern?
• Zoom-in, zoom-out a pattern. Pattern union, difference.
• Composition of patterns (i.e., if A → B and B → C, then derive A → C).
• What are the values of the statistical measures for this pattern? Which patterns fulfill a certain constraint on a statistical measure?
• Which are the patterns in the PBMS catalog? Which are the attributes or the statistical measures for this pattern type? Which pattern types relate to a certain statistical measure?
• Closed form of the language.
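A few of the queries listed above can be sketched over an in-memory catalog. Everything in this Python example is hypothetical: the pattern ids, the encoding of each pattern as (body, head, confidence), the explicit relationship stored as (data id, pattern id) pairs, and the function names.

```python
# Hypothetical catalog: pattern id -> (body item, head item, confidence).
PATTERNS = {
    "P1": ("pampers", "beer", 0.90),
    "P2": ("beer", "chips", 0.70),
    "P3": ("milk", "bread", 0.40),
}

# Hypothetical explicit relationship f_DP as (data id, pattern id) pairs.
F_DP = {("RID124", "P1"), ("RID125", "P1"), ("RID125", "P2"), ("RID130", "P3")}


def patterns_of(datum):
    """Given a datum, which patterns does it refer to?"""
    return sorted(p for d, p in F_DP if d == datum)


def data_of(pattern):
    """Which data items correspond to this pattern?"""
    return sorted(d for d, p in F_DP if p == pattern)


def with_min_confidence(threshold):
    """Which patterns fulfill a constraint on a statistical measure?"""
    return sorted(pid for pid, (_b, _h, conf) in PATTERNS.items() if conf >= threshold)


def compose():
    """If A -> B and B -> C are stored, derive A -> C (measures are not derived here)."""
    return sorted(
        (body1, head2)
        for body1, head1, _ in PATTERNS.values()
        for body2, head2, _ in PATTERNS.values()
        if head1 == body2
    )


print(patterns_of("RID125"))        # ['P1', 'P2']
print(data_of("P1"))                # ['RID124', 'RID125']
print(with_min_confidence(0.7))     # ['P1', 'P2']
print(compose())                    # [('pampers', 'chips')]
```

These functions return plain lists rather than patterns, so the sketch does not address the closed-form requirement; a closed language would have to return results that are themselves patterns, including their statistical measures.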