5.1 Validation of Graph-Based Models (Analysis...


Page 1: ¾ 5.1 Validation of Graph-Based Models (Analysis …st.inf.tu-dresden.de/files/teaching/ss09/stII09/05a...What You Always Wanted to Know About Datalog (And Never Dared to Ask). IEEE

Fakultät Informatik, Institut für Software- und Multimediatechnik, Lehrstuhl für Softwaretechnologie

5.1 Validation of Graph-Based Models
(Analysis and Consistency)

Prof. Dr. U. Aßmann
Technische Universität Dresden
Institut für Software- und Multimediatechnik
Gruppe Softwaretechnologie
http://st.inf.tu-dresden.de


Contents

Different kinds of relations: lists, trees, dags, graphs
Treating graph-based models: the graph-logic isomorphism
Analysis, querying, searching graph-based models
    The Same Generation Problem
    Datalog and EARS
    Transitive Closure
Consistency checking of graph-based specifications (aka model validation)
Projections of graphs
Transformation of graphs

TU Dresden, 28.04.2009

Sebastian Richly

Folie

2 von 101

Model Consistency


Obligatory Reading

Jazayeri, Chap. 3
If you have Balzert, Maciaszek, or Pfleeger, read the lecture slides carefully and do the exercise sheets
J. Pan et al. Ontology Driven Architectures and Potential Uses of the Semantic Web in Systems and Software Engineering. http://www.w3.org/2001/sw/BestPractices/SE/ODA/
D. Calvanese, M. Lenzerini, D. Nardi. Description Logics for Data Modeling. In J. Chomicki, G. Saake (eds.). Logics for Databases and Information Systems. Kluwer, 1998.
Uwe Aßmann, Steffen Zschaler, and Gerd Wagner. Ontologies, Meta-Models, and the Model-Driven Paradigm. Handbook of Ontologies in Software Engineering. Springer, 2006.
Holger Knublauch, Daniel Oberle, Phil Tetlow, Evan Wallace (ed.). A Semantic Web Primer for Object-Oriented Software Developers. http://www.w3.org/2001/sw/BestPractices/SE/ODSD/


References

S. Ceri, G. Gottlob, L. Tanca. What You Always Wanted to Know About Datalog (And Never Dared to Ask). IEEE Transactions on Knowledge and Data Engineering, March 1989, (1) 1, pp. 146-166.
S. Ceri, G. Gottlob, L. Tanca. Logic Programming and Databases. Springer, 1989.
Ullman, J. D. Principles of Database and Knowledge Base Systems. Computer Science Press, 1989.
Benjamin Grosof, Ian Horrocks, Raphael Volz, and Stefan Decker. Description logic programs: Combining logic programs with description logics. In Proc. of World Wide Web Conference (WWW) 2003, Budapest, Hungary, 05 2003. ACM Press.
Preprints available on my home page:
    U. Aßmann. On Edge Addition Rewrite Systems and their Relevance for Program Analysis. 1994.
    U. Aßmann. Graph Rewrite Systems for Program Optimization. ACM Transactions on Programming Languages and Systems, June 2000.
    U. Aßmann. OPTIMIX, A Tool for Rewriting and Optimizing Programs. Graph Grammar Handbook, Vol. II, 1999. Chapman&Hall.
    U. Aßmann. Reuse in Semantic Applications. REWERSE Summer School. July 2005. Malta. LNCS, Springer.


Goals

Understand that software models can become very large
    the need for appropriate techniques to handle large models
        in manual development
        automatic analysis of the models
Learn how to use graph-based techniques (Datalog, Description Logic, EARS, graph transformations) to analyze and check models for consistency, well-formedness, integrity
Understand some basic concepts of simplicity in software models


Motivation

Software engineers must be able to
    handle big design specifications (design models) during development
    work with consistent models
    measure models and implementations
    validate models and implementations
Real models and systems become very complex
Most specifications are graph-based
    We have to deal with basic graph theory to be able to measure well
Every analysis method is very welcome
Every structuring method is very welcome


The Problem: How to Master Large Models

Large models have large graphs

They can be hard to understand

Figures taken from Goose Reengineering Tool, analysing a Java class system [Goose, FZI Karlsruhe]


Partially Collapsed


Totally Collapsed


Requirements for Modeling in Requirements and Design

We need guidelines on how to develop simple models
We need analysis techniques to
    Check the consistency of the models
    Find out about their complexity
    Find out about simplifications


What Happens in a Software Tool?

Some Relationships (Graphs) in Software Systems


All Specifications Have an Internal Graph-Based Representation

Texts are parsed to abstract syntax trees (AST)
    Two-step procedure
        Concrete Syntax Tree
        Abstract Syntax Tree
Through name analysis, they become abstract syntax graphs (ASG)
Through def-use analysis, they become Use-Def-Use Graphs (UDUG)

[Figure: AST, ASG, UDUG]


CST – Example

Expr ::= ‘(’ Expr ‘)’
       | Expr ‘&&’ Expr
       | Expr ‘||’ Expr
       | ‘!’ Expr
       | Lit .
Lit ::= Var | ‘true’ | ‘false’ .
Var ::= [a-z][a-z0-9_]+ .

Parsing this string:
(( looking || true) && !found )


CST - Example

[Figure: concrete syntax tree of (( looking || true) && !found ). The root Expr expands to ‘(’ Expr ‘)’; the inner Expr is Expr ‘&&’ Expr. The left operand is ‘(’ Expr ‘||’ Expr ‘)’ over Var (id = looking) and true; the right operand is ‘!’ Expr over Var (id = found).]


AST

[Figure: the AST of the example next to its concrete syntax tree. The AST has && at the root, with || (over Var id = looking and True) and ! (over Var id = found) as children.]


AST

Parse trees waste a fair amount of space for representation of terminal symbols and productions
Compilers post-process parse trees into ASTs
ASTs are the fundamental data structure of IDEs (ASTView in Eclipse JDT)
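The difference between parse tree and AST can be illustrated with a small recursive-descent parser for the example grammar. This is only a sketch under assumptions that the slides do not state: the precedence (`!` binds tighter than `&&`, which binds tighter than `||`), the left-associativity, and the tuple encoding of AST nodes are choices made here. Note that the parser builds the AST directly instead of post-processing a parse tree.

```python
import re

# Terminal symbols of the example grammar; Var follows [a-z][a-z0-9_]+
TOKEN = re.compile(r"\s*(\(|\)|&&|\|\||!|[a-z][a-z0-9_]+)")
VAR = re.compile(r"[a-z][a-z0-9_]+")


def tokenize(text):
    """Split the input into terminal symbols."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Return the AST as nested tuples; parentheses leave no node behind."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def or_expr():  # assumed lowest precedence
        node = and_expr()
        while peek() == "||":
            take()
            node = ("||", node, and_expr())
        return node

    def and_expr():
        node = not_expr()
        while peek() == "&&":
            take()
            node = ("&&", node, not_expr())
        return node

    def not_expr():  # assumed highest precedence
        if peek() == "!":
            take()
            return ("!", not_expr())
        return atom()

    def atom():
        token = take()
        if token == "(":
            node = or_expr()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token in ("true", "false"):
            return ("Lit", token)
        if token is not None and VAR.fullmatch(token):
            return ("Var", token)
        raise SyntaxError(f"unexpected token {token!r}")

    tree = or_expr()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree


ast = parse("(( looking || true) && !found )")
```

The resulting tuple contains one node per operator and operand. The four parentheses of the input string, which each occupy a node in the concrete syntax tree, no longer appear.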


AST

Problem with ASTs: they do not support static semantic checks, refactoring, and browsing operations, e.g.:
• Have all used variables been declared?
• Have all used classes been imported?
• Are the types used in expressions / assignments compatible?
• Navigate to the declaration of a method call / variable reference / type


ASG

boolean looking, found;
…
if (looking && !found) { … }

[Figure: abstract syntax graph of the snippet. A Block contains two VarDecl nodes (type = boolean; VarName id = looking, VarName id = found) and an IfStmt whose condition is && over looking and ! found, followed by a nested Block.]


ASG

Abstract Syntax Graphs have additional edges that reflect semantic relationships, e.g. declare/use
These edges are maintained during static semantic checks
They are used in refactoring operations (e.g. renaming a class).
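How the additional declare/use edges might be computed can be sketched as follows. The node identifiers (`VarDecl#1`, `VarRef#1`, ...), the undeclared name `done`, and the rule that the first declaration wins without any scoping are assumptions made for this illustration; they are not taken from the slides.

```python
def resolve_names(declarations, uses):
    """Name analysis sketch: link every use to its declaration.

    declarations: list of (name, declaration node id)
    uses:         list of (name, use node id)
    Returns the use -> declaration edges and the uses that could not be resolved.
    """
    declaration_of = {}
    for name, decl_id in declarations:
        # Assumption: a single flat scope, the first declaration wins
        declaration_of.setdefault(name, decl_id)

    use_def_edges, undeclared = [], []
    for name, use_id in uses:
        if name in declaration_of:
            use_def_edges.append((use_id, declaration_of[name]))
        else:
            undeclared.append((name, use_id))
    return use_def_edges, undeclared


# Declarations and uses of the snippet "boolean looking, found; if (looking && !found)"
declarations = [("looking", "VarDecl#1"), ("found", "VarDecl#2")]
uses = [("looking", "VarRef#1"), ("found", "VarRef#2"), ("done", "VarRef#3")]

use_def_edges, undeclared = resolve_names(declarations, uses)
```

A non-empty `undeclared` list answers the static semantic check "have all used variables been declared?" with no, and the edges in `use_def_edges` are what a "navigate to declaration" operation would follow.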


Example: Rename Refactorings in Programs

Refactor the name Person to Human:

class Person { .. }                        // Definition
class Course {
  Person teacher = new Person("Jim");      // Reference (Use)
  Person student = new Person("John");
}

class Human { .. }
class Course {
  Human teacher = new Human("Jim");
  Human student = new Human("John");
}


Name-Resolved Graphs (Use-Definition Graphs, Use-Def Graphs)

Every language and notation has
    Definitions of items (definition of the variable Foo)
    Uses of items (references to Foo)
We talk in specifications or programs about names of objects and their use
    Definitions are done in a data definition language (DDL)
    Uses are part of a data manipulation language (DML)
Starting from the abstract syntax, the name analysis finds out about the definitions, uses, and their relations (the Use-Def graph)
    This resolves the meaning of used names to definitions


Refactoring on Complete Name-Resolved Graphs (Def-Use Graphs, Use-Def-Use Graphs)

For renaming of a definition, all uses have to be changed, too
    We need to trace all uses of a definition in the Use-Def graph, resulting in its inverse, the Def-Use graph
    Refactoring always works on Def-Use graphs and Use-Def graphs, the complete name-resolved graph (the Use-Def-Use graph)
Refactoring always works in the same way:
    Change a definition
    Find all dependent references
    Change them
    Recurse, handling other dependent definitions
Refactoring can be supported by tools
    The Use-Def-Use graph forms the basis of refactoring tools
However, building the Use-Def-Use graph for a complete program costs a lot of space and is a difficult program analysis task
Every method that structures this graph immediately benefits refactoring, either simplifying or accelerating it
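The first three refactoring steps can be sketched on the Person/Human example. The node identifiers (`class#Person`, `teacher.type`, ...) are hypothetical, and the sketch assumes that the Use-Def edges have already been computed by name analysis. The fourth step, recursing into other dependent definitions, is left out.

```python
from collections import defaultdict


def invert(use_def_edges):
    """Turn use -> definition edges into the inverse Def-Use index."""
    def_use = defaultdict(list)
    for use, definition in use_def_edges:
        def_use[definition].append(use)
    return def_use


def rename(names, use_def_edges, definition, new_name):
    """Change a definition, find all dependent references, change them."""
    renamed = dict(names)
    renamed[definition] = new_name
    for use in invert(use_def_edges)[definition]:
        renamed[use] = new_name
    return renamed


# Hypothetical node ids for the Person/Course example
names = {
    "class#Person": "Person",
    "class#Course": "Course",
    "teacher.type": "Person",
    "teacher.new": "Person",
    "student.type": "Person",
    "student.new": "Person",
}
use_def_edges = [
    (use, "class#Person")
    for use in ("teacher.type", "teacher.new", "student.type", "student.new")
]

result = rename(names, use_def_edges, "class#Person", "Human")
```

The rename touches only nodes reachable through the Def-Use index, so `Course` keeps its name, and the original `names` mapping is left unchanged.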


All Specifications Have an Internal Graph-Based Representation

Control-flow analysis -> CFG, CLG
Data-flow analysis -> DFG
The same remark holds for graphic specifications
Hence, all specifications are graph-based!

[Figure: CFG, CLG, DFG]


Control-Flow Graphs

Describe the control flow in a program
Typically, if statements and switch statements split control flow
    Their ends join control flow
Control-flow graphs resolve symbolic labels
Nested loops are described by nested control-flow graphs

[Figure: control-flow graph with the nodes if, while, a+=5;, print a, print a++, return]


Simple (Flow-Insensitive) Call Graph (CLG)

Describe the call relationship between the procedures

main = procedure () {
  array int[] a = read();
  print(a);
  quicksort(a);
  print(a);
}
quicksort = procedure(a: array[0..n]) {
  int pivot = searchPivot(a);
  quicksort(a[0..pivot-1]);
  quicksort(a[pivot+1..n]);
}

[Figure: call graph with the nodes main, read, print, quicksort, searchPivot]
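The call graph of this example can be written down as an adjacency mapping and then queried. The following is a minimal sketch; the dictionary encoding and the two helper functions are choices made here, not part of the slides.

```python
# Flow-insensitive call graph of the example: one edge per caller/callee pair,
# no matter how often or in which order the call occurs.
calls = {
    "main": ["read", "print", "quicksort"],
    "quicksort": ["searchPivot", "quicksort"],
    "read": [],
    "print": [],
    "searchPivot": [],
}


def reachable_from(graph, start):
    """All procedures that may be called, directly or indirectly, from start."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for callee in graph[node]:
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


def directly_recursive(graph):
    """Procedures with a call edge to themselves."""
    return sorted(proc for proc, callees in graph.items() if proc in callees)
```

Because the graph is flow-insensitive, the two `print(a)` calls in `main` collapse into a single edge, and the self-loop at `quicksort` shows up as direct recursion.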


Data-Flow Graphs (DFG)

Describe the flow of data through the variables
Are based on control-flow graphs
    Building the data-flow graph is called data-flow analysis

[Figure: data-flow graph over the control-flow graph with the nodes a=0, if, while, a=a+5;, print a, print a++, b=a]


Inheritance Tree or Inheritance Lattice

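A lattice is a partial order in which any two elements have a least upper bound and a greatest lower bound; a finite lattice therefore has a largest and a smallest element

[Figure: inheritance lattice with the nodes Object, Don’t Know, Person, Man, Woman, Undefined]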


UML Graphs

All diagram sublanguages of UML are graphs
    They can be analyzed and checked with graph techniques

Hence, graph techniques are an essential tool of the software engineer


Remark: All Specifications Have a Graph-Based Representation

Texts are parsed to abstract syntax trees (AST)

Through name analysis, they become abstract syntax graphs (ASG)

Through def-use-analysis, they become Use-def-Use Graphs (UDUG)

Control-flow Analysis -> CFG, CLG

Data-Flow Analysis -> DFG

[Figure: AST, ASG, UDUG; CFG, CLG, DFG]


Types of Graphs in Specifications

Lists, Trees, Dags, Graphs
Structural constraints on graphs

(background information)


Modeling Graphs on Two Abstraction Levels

We deal here mostly with directed graphs (digraphs)
    lists, trees, dags, overlay graphs, reducible (di-)graphs, graphs
There are two different abstraction levels; we are interested in the logical level:
Logical level (abstract, often declarative, problem oriented)
    Methods to specify graphs and algorithms on graphs:
        Relational algebra
        Datalog, description logic
        Graph rewrite systems, graph grammars
        Recursion schemas
Physical level (concrete, often imperative, machine oriented)
    Data type adjacency list, boolean (bit) matrix
    Imperative algorithms
    Pointer-based algorithms


Definitions

Fan-in
    In-degree of a node under a certain relation
    Fan-in(n) = 0: n is a root node (source)
    Fan-in(n) > 0: n is reachable from other nodes
Fan-out
    Out-degree of a node under a certain relation
    Fan-out(n) = 0: n is a leaf node (sink)
    An inner node is neither a root nor a leaf
Path
    A path p = (n1, n2, …, nk) is a sequence of k nodes in which every node is linked to its successor by an edge
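These definitions translate directly into a few lines of code. The sketch below assumes that a relation is given as a list of (source, target) pairs; the example graph with the nodes a to d is hypothetical.

```python
def degrees(nodes, edges):
    """Return the fan-in and fan-out of every node under the given relation."""
    fan_in = {n: 0 for n in nodes}
    fan_out = {n: 0 for n in nodes}
    for source, target in edges:
        fan_out[source] += 1
        fan_in[target] += 1
    return fan_in, fan_out


def roots(nodes, edges):
    """Nodes with fan-in 0 (sources)."""
    fan_in, _ = degrees(nodes, edges)
    return [n for n in nodes if fan_in[n] == 0]


def leaves(nodes, edges):
    """Nodes with fan-out 0 (sinks)."""
    _, fan_out = degrees(nodes, edges)
    return [n for n in nodes if fan_out[n] == 0]


def is_path(sequence, edges):
    """True if every node of the sequence is linked to its successor by an edge."""
    edge_set = set(edges)
    return all(pair in edge_set for pair in zip(sequence, sequence[1:]))


# Hypothetical example: a diamond-shaped relation
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
```

In this example `a` is the only root, `d` the only leaf, and `b` and `c` are inner nodes.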


Lists

One source (root)
One sink
Every other node has fan-in 1, fan-out 1
Represents a total order (sequentialization)
Gives
    Prioritization
    Execution order

[Figure: a list from root to sink]


Trees

One source (root)
Many sinks (leaves)
Every node has fan-in <= 1
Hierarchical abstraction:
    A node represents or abstracts all nodes of a subtree
Example
    SA function trees
    Organization trees (line organization)

[Figure: a tree from root to sinks]


Directed Acyclic Graphs

Many sources (roots)
    A jungle (term graph) is a dag with one root
Many sinks
Fan-in, fan-out arbitrary
Represents a partial order
    Fewer constraints than in a total order
Weaker hierarchical abstraction feature
    Can be layered
Example
    UML inheritance dags
    Inheritance lattices

[Figure: a dag with several roots and sinks]
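The structural constraints for lists, trees, and dags can be checked mechanically. The following sketch classifies a non-empty relation, assuming it is given as (source, target) pairs without duplicate edges. The acyclicity test uses Kahn's topological sorting algorithm, which is a choice made for this illustration and not prescribed by the slides.

```python
def classify(nodes, edges):
    """Return the strongest assertion that holds: 'list', 'tree', 'dag', or 'graph'."""
    fan_in = {n: 0 for n in nodes}
    fan_out = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for source, target in edges:
        fan_out[source] += 1
        fan_in[target] += 1
        successors[source].append(target)

    # Kahn's algorithm: repeatedly remove nodes without remaining predecessors
    remaining = dict(fan_in)
    queue = [n for n in nodes if remaining[n] == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for successor in successors[node]:
            remaining[successor] -= 1
            if remaining[successor] == 0:
                queue.append(successor)
    if removed != len(nodes):
        return "graph"  # some nodes lie on a cycle

    sources = [n for n in nodes if fan_in[n] == 0]
    if len(sources) == 1 and all(fan_in[n] <= 1 for n in nodes):
        if all(fan_out[n] <= 1 for n in nodes):
            return "list"  # one source, one sink, no branching
        return "tree"      # one source, every node has fan-in <= 1
    return "dag"
```

A relation that is acyclic, has exactly one source, and has fan-in at most 1 everywhere is necessarily connected, because following the unique predecessor of any node must end at the single source.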


Skeleton Trees with Overlay Graphs (Trees with Secondary Graphs)

Skeleton tree with overlay graph (secondary links)
    Skeleton tree is primary
    Overlay graph is secondary: “less important”
Advantage of an overlay graph
    Tree can be used as a conceptual hierarchy
    References to other parts are possible
Example
    XML, e.g., XHTML. Structure is described by XML Schema/DTD, links form the secondary relations
    AST with name relationships after name analysis (name-resolved trees, abstract syntax graphs)

[Figure: skeleton tree from roots to sinks with overlay edges]


Reducible Graphs (Graphs with Skeleton Trees)

Graph with cycles, however, only between siblings
    No cycles between hierarchy levels
Graph can be “reduced” to one node
Advantage
    Tree can be used as a conceptual hierarchy
Example
    UML statecharts
    Control-flow graphs of Modula, Ada, Java (not C, C++)
    SA data-flow diagrams

[Figure: reducible graph from roots to sinks]


Reducible Graph

[Figure: reducible graph with the blocks B1, B2, B3, B4 and the refinements B1a, B1b, B3a]


Layerable Graphs with Skeleton Dags

Like reducible graphs, however, sharing between different parts of the skeleton trees
    Graph cannot be “reduced” to one node
Advantage
    Skeleton can be used to layer the graph
    Cycles only within one layer
Example
    Layered system architectures


Wild Unstructured (Directed) Graphs

Wild, unstructured graphs are the worst structure we can get
    Wild, unstructured, irreducible cycles
    Unlayerable, no abstraction possible
    No overview possible
Many roots
    A digraph with one source is called a flow graph
Many sinks
Example
    Many diagrammatic methods in Software Engineering
    UML class diagrams


Strength of Assertions in Models

[Figure: structures ordered along the axis “Ease of Understanding”]
List: strong assertion: total order (sequential)
Tree: still abstraction possible (hierarchies)
Dag: still layering possible (partial order, layered)
Graph: the worst case (unstructured)


Strength of Assertions in Models

Saying that a relation is
    A list: very strong assertion, total order!
    A tree: still a strong assertion: hierarchies possible, easy to think about
    A dag: still layering possible, still a partial order
    A layerable graph: still layering possible, but no partial order
    A graph: hopefully, some structuring or analysis is possible. Otherwise, it’s the worst case

And those propositions hold for every kind of diagram in Software Engineering!

Try to achieve dags, trees, or lists in your specifications, models, and designs

Systems will be easier, more efficient


Structuring Improves Worst Case

[Figure: structures ordered along the axis “Ease of Understanding”]
List: strong assertion: total order (sequential)
Tree: still abstraction possible (hierarchies)
Dag: still layering possible (partial order, layered)
Structured graph: graph with analyzed features (structured)
Graph: the worst case (unstructured)


Methods and Tools for Analysis of Graph-Based Models


The Graph-Logic Isomorphism

In the following, we will make use of the graph-logic isomorphism:
Graphs can be used to represent logic
    Nodes correspond to constants
    (Directed) edges correspond to binary predicates
    Hyperedges (n-edges) correspond to n-ary predicates
Consequence:
    Graph algorithms can be used to test logic queries on graph-based specifications
    Graph rewrite systems can be used for deduction

[Figure: graph with the nodes Carl Gustav, Silvia, and Victoria and the edges married, father, mother]

married(CarlGustav,Silvia).
married(Silvia,CarlGustav).
father(CarlGustav,Victoria).
mother(Silvia,Victoria).


Graphs and Fact Data Bases

Graphs can also be noted textually
    Graphs consist of nodes, relations
    Relations link nodes
Fact data bases consist of constants (data) and predicates
Nodes of graphs can be regarded as constants, edges as predicates between constants (facts):

[Figure: node Adam with isParentOf edges to GustavAdolf and Sibylla]

// Triples
Adam isParentOf GustavAdolf.
Adam isParentOf Sibylla.

// Facts
isParentOf(Adam,GustavAdolf).
isParentOf(Adam,Sibylla).
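The correspondence between facts, triples, and graph edges can be made concrete with a short sketch. The regular expression below only covers binary facts in the notation used above, and the function names are chosen for this illustration.

```python
import re

# Matches binary facts such as "isParentOf(Adam,GustavAdolf)."
FACT = re.compile(r"\s*(\w+)\(\s*(\w+)\s*,\s*(\w+)\s*\)\.\s*")


def parse_fact(text):
    """Parse 'predicate(subject,object).' into (predicate, subject, object)."""
    match = FACT.fullmatch(text)
    if match is None:
        raise ValueError(f"not a binary fact: {text!r}")
    return match.groups()


def to_triple(fact):
    """Write a fact in triple notation: 'subject predicate object.'"""
    predicate, subject, obj = fact
    return f"{subject} {predicate} {obj}."


def to_graph(facts):
    """Build labeled adjacency lists: node -> [(edge label, target node)]."""
    graph = {}
    for predicate, subject, obj in facts:
        graph.setdefault(subject, []).append((predicate, obj))
        graph.setdefault(obj, [])
    return graph


facts = [
    parse_fact("isParentOf(Adam,GustavAdolf)."),
    parse_fact("isParentOf(Adam,Sibylla)."),
]
graph = to_graph(facts)
```

All three notations carry the same information: the constants become nodes, and each fact becomes one labeled edge.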


Queries on Graph-Based Models Make Implicit Knowledge Explicit

Since graph-based models are a mess, we try to analyze them
Knowledge is either
    Explicit, i.e., represented in the model as edges and nodes
    Implicit, i.e., hidden, not directly represented, and must be analyzed
Query and analysis problems try to make implicit knowledge explicit
    E.g. Does the graph have one root? How many leaves do we have? Is this subgraph a tree? Can I reach that node from this node?
Determining features of nodes and edges
    Finding certain nodes, or patterns
Determining global features of the model
    Finding paths between two nodes (e.g., connected, reachable)
    Finding paths that satisfy additional constraints
    Finding subgraphs that satisfy additional constraints
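The question "can I reach that node from this node?" is a typical case of implicit knowledge: the edges are explicit, reachability has to be derived. A minimal sketch is a naive fixpoint computation of the transitive closure, written here after the two usual Datalog-style rules (shown as comments). The node names are hypothetical.

```python
# reachable(X, Y) :- edge(X, Y).
# reachable(X, Z) :- reachable(X, Y), edge(Y, Z).


def transitive_closure(edges):
    """Apply the two rules until no new reachable pair can be derived."""
    edges = set(edges)
    reachable = set(edges)  # first rule
    while True:
        derived = {
            (x, z)
            for (x, y) in reachable
            for (y2, z) in edges
            if y == y2
        }  # second rule
        if derived <= reachable:
            return reachable  # fixpoint: nothing new was derived
        reachable |= derived


edges = {("a", "b"), ("b", "c"), ("c", "d")}
reachable = transitive_closure(edges)
```

After the computation, the pair `("a", "d")` is an explicit fact although the model contains no edge from `a` to `d`. The naive loop re-derives known pairs in every round; real Datalog engines avoid this, for example with semi-naive evaluation.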


Queries for Checking Consistency (Model Validation)

Queries can be used to find out whether a graph is consistent (i.e., valid, well-formed)
    Due to the graph-logic isomorphism, constraint specifications can be phrased in logic and applied to graphs
    Business people call these constraint specifications business rules
Example:
    if a person hasn't died yet, her town should not list her in the list of dead people
    if a car is exported to England, steering wheel and pedals should be on the right side; otherwise on the left
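The car rule can be phrased as a query over a set of facts. In the sketch below the predicate names `exportedTo` and `steering`, the car identifiers, and the sample facts are hypothetical; only the rule itself comes from the example above.

```python
# Hypothetical fact base as (subject, predicate, object) triples
facts = {
    ("car1", "exportedTo", "England"), ("car1", "steering", "right"),
    ("car2", "exportedTo", "England"), ("car2", "steering", "left"),
    ("car3", "exportedTo", "Sweden"), ("car3", "steering", "left"),
}


def objects(facts, subject, predicate):
    """All objects o for which the fact predicate(subject, o) holds."""
    return {o for s, p, o in facts if s == subject and p == predicate}


def steering_violations(facts):
    """Cars whose steering side contradicts the export rule."""
    cars = sorted({s for s, p, o in facts if p == "exportedTo"})
    violations = []
    for car in cars:
        exported_to_england = "England" in objects(facts, car, "exportedTo")
        expected = "right" if exported_to_england else "left"
        if objects(facts, car, "steering") != {expected}:
            violations.append(car)
    return violations
```

The model is consistent with respect to this rule exactly when the query returns an empty list. Here `car2` is reported, because it is exported to England but has its steering wheel on the left.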


Example: How to Analyze a System for Layers

And the Same Generation Problem
How to query a dag and search in a dag
How to layer a dag – a simple structuring problem
http://susning.nu/Drottning_Silvia


Layering of Systems

To be comprehensible, a system should be structured in layers
Several relations in a system can be used to structure it, e.g., the
  Call graph: layered call graph
  Layered definition-use graph
  Layered USES relationship

A layered architecture is the dominating style for large systems
  Outer, upper layers use inner, lower layers (USES relationship)
  Legacy systems can be analyzed for layering, and if they do not have a layered architecture, their structure can be improved towards this principle


Layering of Acyclic Graphs

Given any acyclic relation, it can be made layered
SameGeneration analysis computes layers in trees or dags

Example: layering a family tree:
  Who is whose contemporary?
  Who is ancestor of whom?

[Figure: family tree with the nodes Desiree, GustavAdolf, CarlGustav, Victoria, Madeleine, Sibylla, Silvia, Walter, Adam, Ralf, Alice]


Pattern and Rules

Parenthood can be described by a graph pattern

We can write the graph pattern also in logic:

isParentOf(Parent,Child1) && isParentOf(Parent,Child2)

And define the rule

if isParentOf(Parent,Child1) && isParentOf(Parent,Child2)
then sameGeneration(Child1,Child2)

[Figure: the rule as a graph pattern with nodes Parent, Child 1 and Child 2, shown before and after rule application]


Impact of Rule on Family Graph

[Figure: the family graph (Desiree, GustavAdolf, CarlGustav, Victoria, Madeleine, Silvia, Sibylla, Walter, Adam, Ralf, Alice) before and after applying the rule]


Same Generation

Beyond sisters and brothers, we can link all people of the same generation

Base rule:
[Figure: rule pattern with nodes Parent, Child 1 and Child 2]

Additional rule (transitive): enters new levels into the graph
[Figure: rule pattern with nodes Parent 1, Parent 2, Child 1 and Child 2]


Impact of Transitive Rule

[Figure: the family graph (Desiree, CarlGustav, GustavAdolf, Victoria, Madeleine, Sibylla, Silvia, Walter, Adam, Ralf, Alice) after applying the transitive rule]


Same Generation Introduces Layers

Computes all nodes that belong to one layer of a dag
  If backedges are neglected, also for an arbitrary graph

Algorithm:
  Compute Same Generation
  Go through all layers and number them

Applications:
  Compute layers in a call graph
    Find out the call depth of a procedure from the main procedure
  Restructuring of legacy software (refactoring)
    Compute layers of systems by analyzing the USES relationships (ST-I)
    Insert facade classes for each layer (Facade design pattern)
      Every call into the layer must go through the facade
    As a result, the application is much more structured
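
The layer numbering can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the call edges and procedure names are invented, and the sketch numbers layers directly by longest-path depth from the roots instead of first materializing sameGeneration edges. It assumes the input is acyclic; on a cyclic graph the recursion would not terminate.

```python
from collections import defaultdict

# Hypothetical USES/call edges (caller, callee); all names are invented.
CALLS = [
    ("main", "parse"),
    ("main", "report"),
    ("parse", "read_file"),
    ("report", "format"),
    ("format", "read_file"),
]


def layers(edges):
    """Number the layers of a dag: roots get 0, every other node gets
    1 + the largest layer number among its predecessors."""
    preds = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        preds[dst].add(src)
        nodes.update((src, dst))

    layer = {}

    def depth(node):
        # Memoized longest-path depth; roots have no predecessors.
        if node not in layer:
            layer[node] = 1 + max((depth(p) for p in preds[node]), default=-1)
        return layer[node]

    for node in nodes:
        depth(node)
    return layer


layer_of = layers(CALLS)
for name in sorted(layer_of, key=lambda n: (layer_of[n], n)):
    print(layer_of[name], name)
```

Nodes with the same number form one layer; a facade per layer could then be derived from this grouping.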


Searching Graphs – Searching in Specifications with Datalog and EARS


SameGeneration as a Graph Rewrite System

The rule system SameGeneration only adds edges. Edge addition rewrite systems (EARS) add edges
  They enlarge the graph, but the new edges can be marked such that they are not put permanently into the graph
EARS are declarative (no specification of control flow, and an abstract representation)
  Confluence: The result is independent of the order in which rules are applied
  Recursion: The system is recursive, since relation sameGeneration is used and defined
  Termination: terminates when all possible edges have been added, at the latest when the graph is complete

EARS compute reachabilities (graph query, graph analysis)
  SameGeneration can be used for graph analysis


Rule Systems in EARS and Datalog

Rule systems can be noted textually or graphically (DATALOG or EARS)

Datalog contains textual if-then rules, which test predicates about the constants
  Rules contain variables to match many constants

EARS (edge addition rewrite systems) contain graph rewrite rules, which add edges
  Rule nodes contain variables

// conclusion
sameGeneration(Child1, Child2)
:- // say: "if"
// premise
isParentOf(Parent,Child1),
isParentOf(Parent,Child2).

// premise
if isParentOf(Parent,Child1) &&
   isParentOf(Parent,Child2)
then // conclusion
   sameGeneration(Child1,Child2)

[Figure: the same rule as an EARS graph rewrite rule with nodes Parent, Child1 and Child2]


Same Generation Datalog Program

isParentOf(Adam,GustavAdolf).
isParentOf(Adam,Sibylla).
....

if isParentOf(Parent,Child1), isParentOf(Parent,Child2)
then sameGeneration(Child1, Child2).

if sameGeneration(Parent1,Parent2),
   isParentOf(Parent1,Child1), isParentOf(Parent2,Child2)
then sameGeneration(Child1, Child2).
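
To see how such a program is evaluated, the two rules can be run bottom-up until no new edge appears. The following Python sketch is one naive way to do that and is not how a Datalog engine is actually implemented. Only the two Adam facts come from the slides; the other edges are assumptions added so that the recursive rule has something to derive.

```python
# Toy fact base. Only the two Adam facts come from the slides;
# the remaining edges are invented for illustration.
IS_PARENT_OF = {
    ("Adam", "GustavAdolf"),
    ("Adam", "Sibylla"),
    ("GustavAdolf", "CarlGustav"),  # assumed
    ("Sibylla", "Walter"),          # assumed
}


def same_generation(is_parent_of):
    """Naive bottom-up evaluation of the two sameGeneration rules."""
    # Base rule: two children of the same parent are in the same generation.
    result = {
        (c1, c2)
        for (p1, c1) in is_parent_of
        for (p2, c2) in is_parent_of
        if p1 == p2
    }
    # Recursive rule: children of same-generation parents are in the
    # same generation. Repeat until no new edge can be added (fixpoint).
    while True:
        new_edges = {
            (c1, c2)
            for (p1, p2) in result
            for (q1, c1) in is_parent_of
            if q1 == p1
            for (q2, c2) in is_parent_of
            if q2 == p2
        }
        if new_edges <= result:
            return result
        result |= new_edges


sg = same_generation(IS_PARENT_OF)
print(("CarlGustav", "Walter") in sg)
```

Because the rules only ever add edges to a finite graph, the loop is guaranteed to stop.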


Searching is Easy With Datalog

# An SMPP problem (searching from a single source for a set of multiple targets)
descendant(Adam,X)?
X={ Silvia, Carl-Gustav, Victoria, ....}

# An MSPP problem (multiple source, single target)
descendant(X,Silvia)?
X={Walter, Adam, Alice}

# An MMPP problem (multiple source, multiple target)
ancestor(X,Y)?
{X=Walter, Y={Adam}
 X=Victoria, Y={CarlGustav, Silvia, Sibylla, ...}}
Y = Adam, Walter, ...
# Victoria, Madeleine, CarlPhilip not in the set


Description Logic (DL)

A special form of typed binary Datalog (typed EARS)
  Only with unary and binary relations
  Classes and objects, instead of untyped Datalog constants
  Relationship types and relations
All knowledge is specified with triples, simple sentences of Subject-Predicate-Object
OWL (Web Ontology Language)

Adam instanceOf Person.
Sibylla instanceOf Person.
GustavAdolf instanceOf Person.
King isA Person.
Princess isA Person.
...
Adam parentOf GustavAdolf.
Adam parentOf Sibylla.
....

[Figure: typed family graph with nodes Desiree:Person, GustavAdolf:Person, CarlGustav:King, Victoria:Princess, Madeleine:Princess, Sibylla:Person, Silvia:Person, Walter:Person, Adam:Person, Ralf:Person, Alice:Person]


Datalog, DL, OCL, and EARS: Extended Relational Algebra

Datalog, DL and EARS correspond to relational algebra with recursion (see lecture on databases).
  SQL has no recursion, SQL-3 has
  Negation can be added
  Datalog is a simple variant of Prolog

DL languages:
  OWL (Web Ontology Language)
  OIL (Ontology Interchange Language)

OCL does not have transitive closure, but iteration

[Figure: language hierarchy: Relational Algebra (SQL); Description Logic (OWL); Datalog (with recursion; SQL3); Datalog with negation and recursion; OCL]


Datalog, DL, OCL, and EARS: Extended Relational Algebra

[Figure: language hierarchy with the labels "decidable" and "Business rules": Relational Algebra (SQL); OCL; Description Logic (OWL), classes = unary predicates; Binary Datalog (EARS), binary predicates; Datalog (with recursion) (SQL3); Datalog with negation and recursion; Prolog with negation and recursion]


Application Areas of Datalog, DL, OCL, and EARS

Graph query problems (searching graphs)
  Reachability of nodes (transitive closure)
  SSPP, etc.

Consistency checking of graph-based specifications
  Name analysis (building def-use graphs)
  Data analysis
  Program analysis
    Building control-flow graphs
    Data-flow analysis
  Model analysis (UML, OWL)

Structurings and algorithms on structured graphs
  Layering of system relations
  Reducibility
  Strongly connected components

Specification of contracts for procedures and services
  Prover can statically prove the validity of the contract


Example for Model Validation: Search in UML Diagrams

Step 1: encode the diagram into a Datalog or DL fact base
Step 2: define integrity constraint rules
Step 3: let the rules run


Example: The Domain Model of the Web-Based Course System

[Figure: UML class diagram of the domain model. Classes: Education, Pupil, Teacher, Course, CourseStatus, CourseOwner, CourseModifier, Module, ModuleStatus, Question, QuestionStatus, Answer, Alternative, Link. Associations: teacher, hasPupil, hasCourse, hasModule, linksTo, with {OR} constraints. Attributes include name, description, lastChanged, changedBy, active, beginDate, endDate, ready, resultProcent, status, category, text, URL]


Searching with Datalog or DL: Queries on UML Class Diagrams

// Step 1: construct fact base: the UML class diagram
// in Datalog fact syntax.
teacher(programming,john).
hasCourse(programming, lisp).
hasPupil(programming,mary).
hasModule(lisp,closures).

// Step 2: construct integrity constraint rules
// (hasPupil facts have the form hasPupil(Education,Pupil))
reads(Person,Module) :-
   hasPupil(E,Person), hasCourse(E,C), hasModule(C,Module).

// Step 3: let rules run: form and execute a query
:- reads(mary, Module)

// the answer
Module = closures
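
The three steps can be mimicked in plain Python to show what the rule evaluation amounts to: a join over the fact relations. This is a sketch, not a Datalog engine. It reads the hasPupil facts as (education, pupil) pairs, which is an interpretation of the fact base above; the function name `reads` mirrors the rule.

```python
# Step 1: fact base copied from the slide, as sets of tuples.
teacher = {("programming", "john")}
has_course = {("programming", "lisp")}
has_pupil = {("programming", "mary")}
has_module = {("lisp", "closures")}


# Step 2: the rule, written as a join.
def reads(person):
    """Modules a person reads: join hasPupil, hasCourse and hasModule
    on the shared education (E) and course (C) variables."""
    return {
        module
        for (e1, pupil) in has_pupil
        if pupil == person
        for (e2, course) in has_course
        if e2 == e1
        for (c, module) in has_module
        if c == course
    }


# Step 3: run the query.
print(reads("mary"))
```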


Example: Web Queries with Logic

The Web is a gigantic graph
  Pages are trees, but links create real graphs
  Links are a secondary structure which overlays the primary tree structure
  Interpret tree and links as relations
  Graph algorithms and queries can be applied to the web!

RDF (Resource Description Framework, a simple graph language)
OWL (description logic, www.w3c.org) adds classes, inheritance, inheritance on binary relations, expressions and queries on binary relations
Other experimental languages: SPARQL (Manchester), Flora/XSB (NY Stony Brook, www.ontoprise.com), Florijd (Freiburg)
New languages are being developed
  In the European network REWERSE (www.rewerse.net)


Reachability Queries with Transitive Closure

The Swiss-Knife of Graph Analysis


Who is Descendant of Whom?

Sometimes we need to know transitive edges, i.e., edges after edges of the same color
  Question: what is reachable from a node?
  Which descendants does Adam have?

Answer: Transitive closure calculates reachability over nodes
  It contracts a graph, inserting masses of edges to all reachable nodes
  It contracts all paths to single edges
  It makes reachability information explicit

After transitive closure, it can easily be decided whether a node is reachable or not

Basic premise: base relation is not changed (offline problem)


Transitive Closure as Datalog Rule System

Basic rule
  descendant(V,N) :- isChildOf(V,N).
[Figure: rule pattern with nodes Parent and Child]

Transitive rule (recursion rule)
  left recursive: descendant(V,N) :- descendant(V,X), isChildOf(X,N).
  right recursive: descendant(V,N) :- isChildOf(V,X), descendant(X,N).
[Figure: rule pattern with nodes Parent, Child and GrandCh]
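
A minimal Python sketch of these rules, evaluated bottom-up: the basic rule seeds the result, and the right-recursive rule is applied until nothing new is derived. The isChildOf facts are the inverse of the two Adam facts from the earlier slides plus two invented edges, so treat the data as an assumption.

```python
# isChildOf(child, parent). The two Adam edges mirror the slides' facts;
# the other edges are invented for illustration.
IS_CHILD_OF = {
    ("GustavAdolf", "Adam"),
    ("Sibylla", "Adam"),
    ("CarlGustav", "GustavAdolf"),  # assumed
    ("Victoria", "CarlGustav"),     # assumed
}


def descendant(is_child_of):
    """Apply the basic rule once, then the right-recursive rule until fixpoint."""
    closure = set(is_child_of)  # descendant(V,N) :- isChildOf(V,N).
    while True:
        # descendant(V,N) :- isChildOf(V,X), descendant(X,N).
        derived = {
            (v, n)
            for (v, x) in is_child_of
            for (x2, n) in closure
            if x == x2
        }
        if derived <= closure:
            return closure
        closure |= derived


desc = descendant(IS_CHILD_OF)
print(sorted(v for (v, n) in desc if n == "Adam"))
```

The left-recursive variant derives the same set of edges; only the join order differs.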


Impact of Basic Rule

[Figure: the family graph (Desiree, CarlGustav, GustavAdolf, Victoria, Madeleine, Sibylla, Silvia, Walter, Adam, Ralf, Alice) before and after applying the basic rule]


Impact of Recursion Rule

[Figure: the family graph (Desiree, CarlGustav, GustavAdolf, Victoria, Madeleine, Sibylla, Silvia, Walter, Adam, Ralf, Alice) after applying the recursion rule]

Impact only shown for Adam, but is applied to other nodes too


[S|M][S|M]PP Path Problems: Variants of Graph Reachability

Single Source Single Target Path Problem, SSPP:
  Test whether there is a path from a source to a target

Single Source Multiple Target, SMPP:
  Test whether there is a path from a source to several targets
  Or: find n targets reachable from one source

Multiple Source Single Target, MSPP:
  Test whether there is a path from n sources to one target

Multiple Source Multiple Target, MMPP:
  Test whether a path from n sources to n targets exists

All can be computed with transitive closure:
  Compute transitive closure
  Test sources and targets for direct neighborship
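
The two-step recipe (compute the closure, then test direct neighborship) might look as follows in Python. The graph and node names are invented, and the closure is built row by row with breadth-first search, which is just one convenient choice for a sketch.

```python
from collections import deque

# Hypothetical directed graph as adjacency lists; node names are invented.
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
    "e": ["a"],
}


def reachable(graph, source):
    """One row of the transitive closure: all nodes reachable from source
    by a path of length >= 1 (breadth-first search)."""
    seen = set()
    queue = deque(graph.get(source, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(graph.get(node, []))
    return seen


def transitive_closure(graph):
    return {node: reachable(graph, node) for node in graph}


tc = transitive_closure(GRAPH)

# After the closure, every variant is a plain membership test.
sspp = "d" in tc["a"]                                # one source, one target
smpp = {t for t in ("b", "d", "e") if t in tc["a"]}  # one source, many targets
mspp = {s for s in ("b", "c", "e") if "d" in tc[s]}  # many sources, one target
mmpp = {(s, t) for s in ("a", "e") for t in ("d", "e") if t in tc[s]}
```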


Exercise: Railway Routes

Base (Facts):
  directlyLinked(Berlin, Potsdam).
  directlyLinked(Potsdam, Braunschweig).
  directlyLinked(Braunschweig, Hannover).

Define the predicates
  linked(A,B)
  alsoLinked(A,B)
  unreachable(A,B)

Answer the queries
  linked(Berlin,X)
  unreachable(Berlin, Hannover)


Cost of Transitive Closure

Transitive closure (TC) has many implementations
  Naive: multiplication of boolean matrices, O(n^3)
  Multiplication of boolean matrices with Russian Method is O(n^2.4)
  Nested-loop joins from relational algebra: O(n^3)
    Gets better with semi-naive evaluation, hashed joins, semi-joins, and indices
  Munro/Purdom algorithm is almost linear, but costs space
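
For orientation, here is one well-known cubic-time implementation on a boolean matrix, Warshall's algorithm. The slide does not name it, so take it as a swapped-in example of the O(n^3) class rather than the method the lecture has in mind; the chain graph is invented test data.

```python
def warshall(adjacency):
    """Warshall's algorithm on a boolean adjacency matrix, O(n^3) time."""
    n = len(adjacency)
    reach = [row[:] for row in adjacency]  # copy, leave the input unchanged
    for k in range(n):          # allow node k as an intermediate stop
        for i in range(n):
            if reach[i][k]:     # skip rows that cannot reach k at all
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


# Chain 0 -> 1 -> 2 -> 3 (invented example data).
chain = [
    [False, True, False, False],
    [False, False, True, False],
    [False, False, False, True],
    [False, False, False, False],
]
closure = warshall(chain)
print(closure[0])
```

The matrix needs n^2 bits of space regardless of how sparse the graph is, which is the usual price for constant-time reachability tests afterwards.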


Transitive Closure and Several Relations

Transitive closure works on one relation
If we want to know whether a certain node is reachable under several relations
  Compute transitive closure on all of them
  Test neighborship directly

This delivers an implementation of the existential quantifier for logic


Central Theorem of Datalog/DL/EARS

Any Datalog program or EARS graph rewrite system can be transformed into an equivalent one
  That is free of recursion
  And only applies the operator Transitive Closure (the transitive closure uses direct recursion, but encapsulates it)

What does this mean in practice? (Remember, Datalog/EARS can be used to specify consistency constraints on graph-based specifications)


SameGeneration as Non-Recursive System

Basic rule as before
[Figure: rule pattern with nodes Parent, Child 1 and Child 2, connected by isChildOf edges]

Additional non-recursive rule (descendant is the transitive closure of isChildOf, i.e., isChildOf*)
[Figure: rule pattern with nodes Parent 1, Child 1 and Child 2, connected by descendant edges]


Applications of Graph Reachability in Consistency Checking

Corollary: To solve an arbitrary reachability problem, use a non-recursive query and the operator TransitiveClosure.
Consequence: should a graph-based specification be checked for consistency (by evaluation of consistency constraints),
  it can be done with a non-recursive Datalog query and the operator TransitiveClosure
  And solved with the complexity of a good TransitiveClosure algorithm

Precondition: the input graphs are fixed, i.e., do not change (static problem)

Since the relation is one of the qualities of the world, this is a central problem of computer science and IT
  Similar to searching and sorting


Generic Datalog Queries

Transitive closure is a general graph operator
  Computing reachability
  Can be applied generically to all relations!

Many other Datalog rule systems are also generic
  sameGeneration
  stronglyConnectedComponents
  dominators

And that's why we consider them here: They can be applied to design graphs
  Is class X reachable from class Y?
  Show me the ancestors in the inheritance graph of class Y
  Is there a cycle in this cross-referencing graph?
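
The genericity can be illustrated with a closure function that knows nothing about the relation it is given. In this Python sketch the same function answers a reachability question on an inheritance relation and a cycle question on a cross-reference relation; both relations and all class names are invented.

```python
def transitive_closure(edges):
    """Closure of any binary relation given as a set of (source, target) pairs."""
    closure = set(edges)
    while True:
        derived = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if derived <= closure:
            return closure
        closure |= derived


def has_cycle(edges):
    """A relation is cyclic iff some node reaches itself in the closure."""
    return any(a == b for (a, b) in transitive_closure(edges))


# Hypothetical design graphs; class names are invented.
inherits = {("Square", "Rectangle"), ("Rectangle", "Shape")}
references = {("Order", "Customer"), ("Customer", "Invoice"), ("Invoice", "Order")}

print(("Square", "Shape") in transitive_closure(inherits))
print(has_cycle(inherits), has_cycle(references))
```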


Application: Consistency Checking of Graph-Based Models

When a specification becomes big...


Example 1: Consistency Checking for Car Specifications

Car data specifications in the MOST standard
  Thousands of parts, described for an entire supplier industry
  Many inconsistencies possible
    Due to human errors

Global variants of the cars must be described
Examples of context conditions for global variants of cars:
  The problem of English cars: A steering wheel on the right implies accelerator, brake, clutch on the right
  Automatic gears: an automatic gear box requires an automatic gear-shift lever


First Idea

Define a context-free grammar for the car data
  From that, derive an XML schema for the car data
  Enrich the grammar nonterminals with attributes

Parse the data and validate it according to its context-free structure


Second Idea

Analyze consistency of the specifications by regarding them as graphs
  Check definition criterion (name analysis)
    "Is every name I refer to defined elsewhere?"
  Analyze layers with SameGeneration
    How many layers does my car specification have?
    Is it acyclic?

Write a query that checks the consistency of global variants
  If the car is to be exported to England, the steering wheel and the pedals should be on the right side
  If the car has an automatic gear box, it must have an automatic gear-shift lever
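
As a rough idea of what such a query does, the two context conditions can be written as checks over variant descriptions. This Python sketch is hypothetical throughout: the attribute names, values and car records are invented and do not come from the MOST standard.

```python
# Hypothetical car variant descriptions; attribute names are invented.
CARS = [
    {"id": "c1", "export": "England", "steering": "right", "pedals": "right",
     "gearbox": "automatic", "lever": "automatic"},
    {"id": "c2", "export": "England", "steering": "left", "pedals": "left",
     "gearbox": "manual", "lever": "manual"},
    {"id": "c3", "export": "Germany", "steering": "left", "pedals": "left",
     "gearbox": "automatic", "lever": "manual"},
]


def violations(car):
    """Return the names of the context conditions this car violates."""
    found = []
    if car["export"] == "England" and not (
        car["steering"] == "right" and car["pedals"] == "right"
    ):
        found.append("right-hand drive required for England")
    if car["gearbox"] == "automatic" and car["lever"] != "automatic":
        found.append("automatic gear box requires automatic lever")
    return found


# Keep only the inconsistent variants.
report = {car["id"]: violations(car) for car in CARS if violations(car)}
print(report)
```

In a rule language each check would be one integrity constraint, and the report corresponds to the answer set of the violation query.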


Third Idea: Use Ontology Language

OWL (description logic) can be used for consistency constraints, also of car specifications

Result: an ontology, a vocabulary of classes with consistency constraints
  OWL engines (RACER, Triple) can evaluate the consistency of car specifications
  Ontologies can formulate consistency criteria for an entire supplier chain [Aßmann2005]


Example 2: Consistency Checking of Tax Declarations

Task: you have been hired by the tax authorities. Write a program that checks the income tax declarations for consistency

Represent the tax declarations with graphs.
  How many graphs will you get?
  How big are they?
  How much memory do you need at least?


First Idea

Write a context-free grammar for the tax declarations
  From that, derive an XML schema
  Enrich the grammar nonterminals with attributes

Check the context-free structure of the tax declarations with the XML parser (context-free consistency)
  This is usually assured by the tax form
  It is, however, nevertheless necessary if the forms have been fed into a computer, to avoid feeding problems.


Second Idea

Write queries that check document-local, but global constraints
  Are there bills for all claimed tax reductions?
  Are the appendices consistent with the main tax document?

Global constraints over all tax declarations:
  Have all bills for all claimed tax reductions really been paid by the tax payer?
  Is a reduction for a debt reduced only once per couple?
  ....

Write an OCL invariant specification for the tax UML class diagram that checks the constraints

Use the Dresden OCL toolkit to solve the problem http://dresden-ocl.sf.net


Third Idea: Use Ontology Language

OWL (description logic) can be used for consistency constraints, also of tax declarations

Result: a tax ontology, a vocabulary of classes with consistency constraints
  OWL engines (RACER, Triple) can evaluate the consistency of tax specifications
  Ontologies can formulate consistency criteria for an entire administrative workflow [Aßmann2005]


Example 3: UML Specifications in Software Engineering

Imagine a UML model of the Java Development Kit JDK.
  7000 classes
  Inheritance tree on classes
  Inheritance dag on interfaces
  Definition-use graph: how big?

Task: You are the release manager of the new JDK 1.6. It has 1000 classes more.

Ensure consistency please. - How?


Ideas

Build up inheritance graphs and definition-use graphs in a database

Analyse conditions such as
  Depth of inheritance tree: how easy is it to use the library?
  Hot-spot methods and classes: Most-used methods and classes (e.g., String)
    Optimize them
  Does every class/package have a tutorial?
  Is every class contained in a roadmap for a certain user group? (i.e., does the documentation explain how to use a class?)


Exam Enrollment

Check if a student can enroll in a lecture
Check if a student has passed his master degree


What Have We Learned

Graphs and logic are isomorphic to each other
Using logic or graph rewrite systems, models can be validated
  Analyzed
  Queried
  Checked for consistency
  Structured

Applications are manifold, using all kinds of system relationships
  Consistency of UML class models (domain, requirement, design models)
  Structuring (layering) of USES relationships

Logic and graph rewriting technology involves reachability questions

Logic and graph rewrite systems are the Swiss army knife of the validating modeler


The End
