Fakultät Informatik, Institut für Software- und Multimediatechnik, Lehrstuhl für Softwaretechnologie
5.1 Validation of Graph-Based Models
(Analysis and Consistency)
Prof. Dr. U. Aßmann
Technische Universität Dresden
Institut für Software- und Multimediatechnik
Gruppe Softwaretechnologie
http://st.inf.tu-dresden.de
Contents
• Different kinds of relations: Lists, Trees, Dags, Graphs
• Treating graph-based models – The graph-logic isomorphism
• Analysis, querying, searching graph-based models
  • The Same Generation Problem
  • Datalog and EARS
  • Transitive Closure
• Consistency checking of graph-based specifications (aka model validation)
  • Projections of graphs
  • Transformation of graphs
TU Dresden, 28.04.2009
Sebastian Richly
Folie
2 von 101
Model Consistency
Obligatory Reading
• Jazayeri, Chap. 3
• If you have Balzert, Maciaszek or Pfleeger, read the lecture slides carefully and do the exercise sheets
• J. Pan et al. Ontology Driven Architectures and Potential Uses of the Semantic Web in Systems and Software Engineering. http://www.w3.org/2001/sw/BestPractices/SE/ODA/
• D. Calvanese, M. Lenzerini, D. Nardi. Description Logics for Data Modeling. In J. Chomicki, G. Saake (eds.). Logics for Databases and Information Systems. Kluwer, 1998.
• Uwe Aßmann, Steffen Zschaler, and Gerd Wagner. Ontologies, Meta-Models, and the Model-Driven Paradigm. Handbook of Ontologies in Software Engineering. Springer, 2006.
• Holger Knublauch, Daniel Oberle, Phil Tetlow, Evan Wallace (eds.). A Semantic Web Primer for Object-Oriented Software Developers. http://www.w3.org/2001/sw/BestPractices/SE/ODSD/
References
• S. Ceri, G. Gottlob, L. Tanca. What You Always Wanted to Know About Datalog (And Never Dared to Ask). IEEE Transactions on Knowledge and Data Engineering, March 1989, (1) 1, pp. 146-166.
• S. Ceri, G. Gottlob, L. Tanca. Logic Programming and Databases. Springer, 1989.
• J. D. Ullman. Principles of Database and Knowledge Base Systems. Computer Science Press, 1989.
• Benjamin Grosof, Ian Horrocks, Raphael Volz, and Stefan Decker. Description logic programs: Combining logic programs with description logics. In Proc. of World Wide Web Conference (WWW) 2003, Budapest, Hungary, 05 2003. ACM Press.
• Preprints available on my home page:
  • U. Aßmann. On Edge Addition Rewrite Systems and their Relevance for Program Analysis. 1994.
  • U. Aßmann. Graph Rewrite Systems for Program Optimization. ACM Transactions on Programming Languages and Systems, June 2000.
  • U. Aßmann. OPTIMIX, A Tool for Rewriting and Optimizing Programs. Graph Grammar Handbook, Vol. II, 1999. Chapman&Hall.
  • U. Aßmann. Reuse in Semantic Applications. REWERSE Summer School. July 2005. Malta. LNCS, Springer.
Goals
• Understand that software models can become very large
  • the need for appropriate techniques to handle large models
    • in manual development
    • in automatic analysis of the models
• Learn how to use graph-based techniques (Datalog, Description Logic, EARS, graph transformations) to analyze and check models for consistency, well-formedness, integrity
• Understand some basic concepts of simplicity in software models
Motivation
• Software engineers must be able to
  • handle big design specifications (design models) during development
  • work with consistent models
  • measure models and implementations
  • validate models and implementations
• Real models and systems become very complex
• Most specifications are graph-based
  • We have to deal with basic graph theory to be able to measure well
• Every analysis method is very welcome
• Every structuring method is very welcome
The Problem: How to Master Large Models
Large models have large graphs
They can be hard to understand
Figures taken from the Goose Reengineering Tool, analysing a Java class system [Goose, FZI Karlsruhe]
Partially Collapsed
Totally Collapsed
Requirements for Modeling in Requirements and Design
• We need guidelines on how to develop simple models
• We need analysis techniques to
  • Check the consistency of the models
  • Find out about their complexity
  • Find out about simplifications
What Happens in a Software Tool?
Some Relationships (Graphs) in Software Systems
All Specifications Have an Internal Graph-Based Representation
• Texts are parsed to abstract syntax trees (AST)
  • Two-step procedure:
    • Concrete Syntax Tree
    • Abstract Syntax Tree
• Through name analysis, they become abstract syntax graphs (ASG)
• Through def-use analysis, they become Use-Def-Use Graphs (UDUG)
[Figure: AST → ASG → UDUG]
CST – Example
Expr ::= ‘(’ Expr ‘)’
       | Expr ‘&&’ Expr
       | Expr ‘||’ Expr
       | ‘!’ Expr
       | Lit .
Lit ::= Var | ‘true’ | ‘false’ .
Var ::= [a-z][a-z0-9_]+ .

Parsing this string:
(( looking || true) && !found )
CST - Example
[Figure: concrete syntax tree for (( looking || true) && !found ): nested Expr nodes for the parenthesis, &&, || and ! productions, with the leaves Var (id = looking), true, and Var (id = found)]
AST
[Figure: the concrete syntax tree next to the corresponding AST. The AST has && at the root, with the children || (over Var id = looking and True) and ! (over Var id = found)]
AST
• Parse trees waste a fair amount of space for the representation of terminal symbols and productions
• Compilers post-process parse trees into ASTs
• ASTs are the fundamental data structure of IDEs (ASTView in Eclipse JDT)
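As a minimal sketch of how a parser can drop parentheses and productions and keep only the AST, the following Python fragment parses the example string into nested tuples. The tokenizer, the function names, and the tuple encoding are assumptions made for illustration. The precedence (! binds tighter than &&, which binds tighter than ||) is also an assumption, since the grammar on the slide is ambiguous and does not fix one.

```python
import re

# One token: parenthesis, operator, or a name as in the Var rule of the grammar.
TOKEN = re.compile(r"\s*(\(|\)|&&|\|\||!|[a-z][a-z0-9_]+)")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def parse_or():
        node = parse_and()
        while peek() == "||":
            take()
            node = ("||", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() == "&&":
            take()
            node = ("&&", node, parse_not())
        return node

    def parse_not():
        if peek() == "!":
            take()
            return ("!", parse_not())
        return parse_atom()

    def parse_atom():
        tok = take()
        if tok == "(":
            node = parse_or()
            take(")")
            return node  # parentheses leave no node in the AST
        if tok in ("true", "false"):
            return ("Lit", tok)
        if re.fullmatch(r"[a-z][a-z0-9_]+", tok):
            return ("Var", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = parse_or()
    if peek() is not None:
        raise SyntaxError(f"trailing token {peek()!r}")
    return tree

tree = parse("(( looking || true) && !found )")
```

The resulting tuple has && at the root, which mirrors the AST figure: the two pairs of parentheses from the input string no longer appear.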
AST
Problem with ASTs: They do not support static semantic checks, refactoring, and browsing operations, e.g.:
• Have all used variables been declared?
• Have all classes used been imported?
• Are the types used in expressions / assignments compatible?
• Navigate to the declaration of a method call / variable reference / type
ASG
boolean looking, found;
…
if (looking && !found) {…}

[Figure: ASG for this fragment: a Block containing two VarDecl nodes (type = boolean) with VarName id = looking and VarName id = found, and an IfStmt whose condition is && over looking and ! found and whose body is a Block]
ASG
• Abstract Syntax Graphs have additional edges that reflect semantic relationships, e.g. declare/use
• These edges are maintained during static semantic checks
• They are used in refactoring operations (e.g. renaming a class).
Example: Rename Refactorings in Programs
Refactor the name Person to Human:
class Person { .. }                        // Definition
class Course {
  Person teacher = new Person("Jim");      // Reference (Use)
  Person student = new Person("John");
}

After renaming:

class Human { .. }
class Course {
  Human teacher = new Human("Jim");
  Human student = new Human("John");
}
Name-Resolved Graphs (Use-Definition Graphs, Use-Def Graphs)
• Every language and notation has
  • Definitions of items (definition of the variable Foo)
  • Uses of items (references to Foo)
• We talk in specifications or programs about names of objects and their use
  • Definitions are done in a data definition language (DDL)
  • Uses are part of a data manipulation language (DML)
• Starting from the abstract syntax, the name analysis finds out about the definitions, uses, and their relations (the Use-Def graph)
  • This resolves the meaning of used names to definitions
Refactoring on Complete Name-Resolved Graphs (Def-Use Graphs, Use-Def-Use Graphs)
• For renaming of a definition, all uses have to be changed, too
  • We need to trace all uses of a definition in the Use-Def graph, resulting in its inverse, the Def-Use graph
  • Refactoring always works on both Def-Use graphs and Use-Def graphs, i.e., on the complete name-resolved graph (the Use-Def-Use graph)
• Refactoring always works in the same way:
  • Change a definition
  • Find all dependent references
  • Change them
  • Recurse, handling other dependent definitions
• Refactoring can be supported by tools
  • The Use-Def-Use graph forms the basis of refactoring tools
• However, building the Use-Def-Use graph for a complete program costs a lot of space and is a difficult program analysis task
  • Every method that structures this graph immediately benefits refactoring, either simplifying or accelerating it
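The rename procedure can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the program is reduced to a flat list of name occurrences, there is a single global scope, and the names `build_use_def`, `invert`, and `rename` are hypothetical helpers, not part of any tool mentioned in the lecture.

```python
from collections import defaultdict

# Each occurrence is (id, kind, name); kind is "def" or "use".
occurrences = [
    (0, "def", "Person"),
    (1, "def", "Course"),
    (2, "use", "Person"),  # type of field teacher
    (3, "use", "Person"),  # new Person("Jim")
    (4, "use", "Person"),  # type of field student
    (5, "use", "Person"),  # new Person("John")
]

def build_use_def(occs):
    """Name analysis: map every use to the definition it refers to."""
    defs = {name: oid for oid, kind, name in occs if kind == "def"}
    use_def = {}
    for oid, kind, name in occs:
        if kind == "use":
            if name not in defs:
                raise NameError(f"{name} is used but never defined")
            use_def[oid] = defs[name]
    return use_def

def invert(use_def):
    """Def-Use graph: the inverse of the Use-Def graph."""
    def_use = defaultdict(list)
    for use, definition in use_def.items():
        def_use[definition].append(use)
    return dict(def_use)

def rename(occs, old, new):
    """Change the definition of `old` and all uses that depend on it."""
    def_use = invert(build_use_def(occs))
    target = next(oid for oid, kind, name in occs
                  if kind == "def" and name == old)
    affected = {target, *def_use.get(target, [])}
    return [(oid, kind, new if oid in affected else name)
            for oid, kind, name in occs]

renamed = rename(occurrences, "Person", "Human")
```

A real refactoring tool would additionally handle nested scopes and the recursion into dependent definitions; this sketch omits both.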
All Specifications Have an Internal Graph-Based Representation
• Control-flow analysis -> CFG, CLG
• Data-flow analysis -> DFG
• The same remark holds for graphic specifications
• Hence, all specifications are graph-based!
[Figure: CFG, CLG → DFG]
Control-Flow Graphs
• Describe the control flow in a program
• Typically, if statements and switch statements split control flow
  • Their ends join control flow
• Control-flow graphs resolve symbolic labels
• Nested loops are described by nested control-flow graphs
[Figure: control-flow graph with the nodes if, while, a+=5;, print a, print a++, return]
Simple (Flow-Insensitive) Call Graph (CLG)
Describe the call relationship between the procedures

main = procedure () {
  array int[] a = read();
  print(a);
  quicksort(a);
  print(a);
}
quicksort = procedure(a: array[0..n]) {
  int pivot = searchPivot(a);
  quicksort(a[0..pivot-1]);
  quicksort(a[pivot+1..n]);
}

[Figure: call graph with the nodes main, read, print, quicksort, searchPivot]
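A call graph like this one can be stored as a plain adjacency list. The following Python sketch encodes the calls that appear in the pseudocode above and answers two simple queries; the helper names `reachable` and `recursive_procedures` are assumptions for illustration.

```python
# Caller -> list of callees, read off the pseudocode of main and quicksort.
call_graph = {
    "main": ["read", "print", "quicksort"],
    "quicksort": ["searchPivot", "quicksort"],
    "read": [],
    "print": [],
    "searchPivot": [],
}

def reachable(graph, start):
    """All procedures that may be called, directly or indirectly, from start."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for callee in graph[node]:
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen

def recursive_procedures(graph):
    """Procedures that can reach themselves, i.e. lie on a cycle."""
    return {proc for proc in graph if proc in reachable(graph, proc)}
```

Here `reachable(call_graph, "main")` contains every other procedure, and `recursive_procedures(call_graph)` contains only quicksort.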
Data-Flow Graphs (DFG)
• Describe the flow of data through the variables
• Are based on control-flow graphs
• Building the data-flow graph is called data-flow analysis
[Figure: data-flow graph with the nodes a=0, if, while, a=a+5;, print a, print a++, b=a]
Inheritance Tree or Inheritance Lattice
A lattice is a partial order in which every two elements have a least upper bound and a greatest lower bound; a finite lattice therefore has a largest and a smallest element

[Figure: inheritance lattice with the elements Object, Don’t Know, Person, Man, Woman, Undefined, linked by inheritance]
UML Graphs
• All diagram sublanguages of UML are graphs
  • They can be analyzed and checked with graph techniques
• Hence, graph techniques are an essential tool of the software engineer
Remark: All Specifications Have a Graph-Based Representation
Texts are parsed to abstract syntax trees (AST)
Through name analysis, they become abstract syntax graphs (ASG)
Through def-use-analysis, they become Use-def-Use Graphs (UDUG)
Control-flow Analysis -> CFG, CLG
Data-Flow Analysis -> DFG
[Figure: AST → ASG → UDUG; CFG, CLG → DFG]
Types of Graphs in Specifications
Lists, Trees, Dags, Graphs
Structural constraints on graphs
(background information)
Modeling Graphs on Two Abstraction Levels
• We deal here mostly with directed graphs (digraphs)
  • lists, trees, dags, overlay graphs, reducible (di-)graphs, graphs
• There are two different abstraction levels; we are interested in the logical level:
• Logical level (abstract, often declarative, problem oriented)
  • Methods to specify graphs and algorithms on graphs:
    • Relational algebra
    • Datalog, description logic
    • Graph rewrite systems, graph grammars
    • Recursion schemas
• Physical level (concrete, often imperative, machine oriented)
  • Data types: adjacency list, boolean (bit) matrix
  • Imperative algorithms
  • Pointer-based algorithms
Definitions
• Fan-in
  • In-degree of a node under a certain relation
  • Fan-in(n) = 0: n is a root node (source)
  • Fan-in(n) > 0: n is reachable from other nodes
• Fan-out
  • Out-degree of a node under a certain relation
  • Fan-out(n) = 0: n is a leaf node (sink)
• An inner node is neither a root nor a leaf
• Path
  • A path p = (n1, n2, …, nk) is a sequence of k nodes in which every node is linked to its successor by an edge
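These definitions translate directly into code. The sketch below computes fan-in and fan-out for a small directed graph given as a set of edges and derives roots, leaves, and inner nodes; the example graph and the function names are assumptions chosen for illustration.

```python
def degrees(nodes, edges):
    """Return (fan_in, fan_out) dictionaries for a set of directed edges."""
    fan_in = {n: 0 for n in nodes}
    fan_out = {n: 0 for n in nodes}
    for src, dst in edges:
        fan_out[src] += 1
        fan_in[dst] += 1
    return fan_in, fan_out

def is_path(edges, sequence):
    """A node sequence is a path if consecutive nodes are linked by an edge."""
    return all((a, b) in edges for a, b in zip(sequence, sequence[1:]))

nodes = {"a", "b", "c", "d"}
edges = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}

fan_in, fan_out = degrees(nodes, edges)
roots = {n for n in nodes if fan_in[n] == 0}    # sources
leaves = {n for n in nodes if fan_out[n] == 0}  # sinks
inner = nodes - roots - leaves
```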
Lists
• One source (root)
• One sink
• Every other node has fan-in 1, fan-out 1
• Represents a total order (sequentialization)
• Gives
  • Prioritization
  • Execution order
[Figure: list running from the root to the sink]
Trees
• One source (root)
• Many sinks (leaves)
• Every node has fan-in <= 1
• Hierarchical abstraction:
  • A node represents or abstracts all nodes of a subtree
• Example
  • SA function trees
  • Organization trees (line organization)
[Figure: tree running from the root to the sinks]
Directed Acyclic Graphs
• Many sources (roots)
  • A jungle (term graph) is a dag with one root
• Many sinks
• Fan-in, fan-out arbitrary
• Represents a partial order
  • Fewer constraints than in a total order
• Weaker hierarchical abstraction feature
  • Can be layered
• Example
  • UML inheritance dags
  • Inheritance lattices
[Figure: dag with several roots and several sinks]
Skeleton Trees with Overlay Graphs (Trees with Secondary Graphs)
• Skeleton tree with overlay graph (secondary links)
  • Skeleton tree is primary
  • Overlay graph is secondary: “less important”
• Advantage of an overlay graph
  • Tree can be used as a conceptual hierarchy
  • References to other parts are possible
• Example
  • XML, e.g., XHTML. The structure is described by XML Schema/DTD; links form the secondary relations
  • AST with name relationships after name analysis (name-resolved trees, abstract syntax graphs)
[Figure: skeleton tree from roots to sinks, overlaid with secondary links]
Reducible Graphs (Graphs with Skeleton Trees)
• Graph with cycles, however, only between siblings
  • No cycles between hierarchy levels
• Graph can be “reduced” to one node
• Advantage
  • Tree can be used as a conceptual hierarchy
• Example
  • UML statecharts
  • Control-flow graphs of Modula, Ada, Java (not C, C++)
  • SA data-flow diagrams
[Figure: reducible graph from roots to sinks]
Reducible Graph
[Figure: stepwise reduction of a graph with the blocks B1, B2, B3, B4 and the intermediate nodes B1a, B1b, B3a]
Layerable Graphs with Skeleton Dags
• Like reducible graphs, however, with sharing between different parts of the skeleton trees
  • Graph cannot be “reduced” to one node
• Advantage
  • Skeleton can be used to layer the graph
  • Cycles only within one layer
• Example
  • Layered system architectures
Wild Unstructured (Directed) Graphs
• Wild, unstructured graphs are the worst structure we can get
  • Wild, unstructured, irreducible cycles
  • Unlayerable, no abstraction possible
  • No overview possible
• Many roots
  • A digraph with one source is called a flow graph
• Many sinks
• Example
  • Many diagrammatic methods in Software Engineering
  • UML class diagrams
Strength of Assertions in Models
Ease of Understanding (from easiest to hardest):
• List: strong assertion: total order (Sequential)
• Tree: still abstraction possible (Hierarchies)
• Dag: still layering possible (Partial order, Layered)
• Graph: the worst case (Unstructured)
Strength of Assertions in Models
• Saying that a relation is
  • A list: very strong assertion, total order!
  • A tree: still a strong assertion: hierarchies possible, easy to reason about
  • A dag: still layering possible, still a partial order
  • A layerable graph: still layering possible, but no partial order
  • A graph: hopefully, some structuring or analysis is possible. Otherwise, it’s the worst case
• And those propositions hold for every kind of diagram in Software Engineering!
• Try to achieve dags, trees, or lists in your specifications, models, and designs
  • Systems will be easier, more efficient
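To check which of these assertions holds for a given relation, a small classifier is enough. The sketch below is an illustration only: the function name `classify` is hypothetical, the edge set is assumed to contain no duplicate edges, acyclicity is tested with Kahn's algorithm, and the intermediate class of layerable graphs is not distinguished from general graphs.

```python
def classify(nodes, edges):
    """Return the strongest assertion that holds: list, tree, dag, or graph."""
    fan_in = {n: 0 for n in nodes}
    fan_out = {n: 0 for n in nodes}
    succ = {n: [] for n in nodes}
    for src, dst in edges:
        fan_out[src] += 1
        fan_in[dst] += 1
        succ[src].append(dst)

    # Kahn's algorithm: repeatedly remove nodes without incoming edges.
    remaining = dict(fan_in)
    ready = [n for n in nodes if remaining[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for nxt in succ[node]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                ready.append(nxt)
    if removed != len(nodes):
        return "graph"  # some nodes lie on a cycle

    roots = [n for n in nodes if fan_in[n] == 0]
    if len(roots) == 1 and all(fan_in[n] <= 1 for n in nodes):
        if all(fan_out[n] <= 1 for n in nodes):
            return "list"
        return "tree"
    return "dag"  # acyclic, but with several roots or shared nodes
```

For example, a chain a → b → c is reported as a list, a diamond with a shared sink as a dag, and any cyclic relation as a graph.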
Structuring Improves Worst Case
Ease of Understanding (from easiest to hardest):
• List: strong assertion: total order (Sequential)
• Tree: still abstraction possible (Hierarchies)
• Dag: still layering possible (Partial order, Layered)
• Structured graph: graph with analyzed features (Structured)
• Graph: the worst case (Unstructured)
Methods and Tools for Analysis of Graph-Based Models
The Graph-Logic Isomorphism
• In the following, we will make use of the graph-logic isomorphism:
• Graphs can be used to represent logic
  • Nodes correspond to constants
  • (Directed) edges correspond to binary predicates
  • Hyperedges (n-edges) correspond to n-ary predicates
• Consequence:
  • Graph algorithms can be used to test logic queries on graph-based specifications
  • Graph rewrite systems can be used for deduction

married(CarlGustav,Silvia).
married(Silvia,CarlGustav).
father(CarlGustav,Victoria).
mother(Silvia,Victoria).

[Figure: graph with the nodes Carl Gustav, Silvia, Victoria and the edges married, father, mother]
Graphs and Fact Data Bases
• Graphs can also be noted textually
• Graphs consist of nodes and relations; relations link nodes
• Fact data bases consist of constants (data) and predicates
• Nodes of graphs can be regarded as constants, edges as predicates between constants (facts):

// Triples
Adam isParentOf GustavAdolf.
Adam isParentOf Sibylla.

// Facts
isParentOf(Adam,GustavAdolf).
isParentOf(Adam,Sibylla).

[Figure: node Adam with isParentOf edges to GustavAdolf and Sibylla]
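The correspondence between triples, facts, and graph edges can be made concrete with a short Python sketch. The tuple encodings and the three conversion functions are assumptions for illustration; only the two isParentOf statements come from the slide.

```python
from collections import defaultdict

# Triples in subject-predicate-object order, as written on the slide.
triples = [
    ("Adam", "isParentOf", "GustavAdolf"),
    ("Adam", "isParentOf", "Sibylla"),
]

def triples_to_facts(triples):
    """Rewrite 'Adam isParentOf Sibylla.' as the fact isParentOf(Adam, Sibylla)."""
    return {(pred, subj, obj) for subj, pred, obj in triples}

def facts_to_graph(facts):
    """One adjacency map per binary predicate: predicate -> source -> targets."""
    graph = defaultdict(lambda: defaultdict(set))
    for pred, src, dst in facts:
        graph[pred][src].add(dst)
    return graph

def graph_to_facts(graph):
    """Inverse direction: every labelled edge becomes a fact again."""
    return {(pred, src, dst)
            for pred, adjacency in graph.items()
            for src, targets in adjacency.items()
            for dst in targets}

facts = triples_to_facts(triples)
graph = facts_to_graph(facts)
```

Converting the graph back with `graph_to_facts` yields the original fact set, which is the round trip the isomorphism promises for binary predicates.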
Queries on Graph-Based Models Make Implicit Knowledge Explicit
• Since graph-based models are a mess, we try to analyze them
• Knowledge is either
  • Explicit, i.e., represented in the model as edges and nodes
  • Implicit, i.e., hidden, not directly represented, and must be analyzed
• Query and analysis problems try to make implicit knowledge explicit
  • E.g. Does the graph have one root? How many leaves do we have? Is this subgraph a tree? Can I reach that node from this node?
• Determining features of nodes and edges
  • Finding certain nodes, or patterns
• Determining global features of the model
  • Finding paths between two nodes (e.g., connected, reachable)
  • Finding paths that satisfy additional constraints
  • Finding subgraphs that satisfy additional constraints
Queries for Checking Consistency (Model Validation)
• Queries can be used to find out whether a graph is consistent (i.e., valid, well-formed)
  • Due to the graph-logic isomorphism, constraint specifications can be phrased in logic and applied to graphs
  • Business people call these constraint specifications business rules
• Example:
  • if a person hasn't died yet, their town should not list them in the list of dead people
  • if a car is exported to England, steering wheel and pedals should be on the right side; otherwise on the left
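The first business rule can be phrased as a query that returns all violations. The following Python sketch is only an illustration: the predicates hasDied and listedAsDead, the constants, and the function name are hypothetical. Note that the rule needs negation ("has not died"), so in logic terms it goes beyond plain Datalog without negation.

```python
# Hypothetical fact base: (predicate, arguments...).
facts = {
    ("hasDied", "bert"),
    ("listedAsDead", "dresden", "bert"),
    ("listedAsDead", "dresden", "anna"),  # inconsistent: anna has not died
}

def wrongly_listed(facts):
    """Violations of: a town lists a person as dead only if the person has died."""
    dead = {f[1] for f in facts if f[0] == "hasDied"}
    listed = [(f[1], f[2]) for f in facts if f[0] == "listedAsDead"]
    return sorted((town, person) for town, person in listed
                  if person not in dead)

violations = wrongly_listed(facts)
```

The model is consistent with respect to this rule exactly when the list of violations is empty; here it contains the single pair (dresden, anna).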
Example: How to Analyze a System for Layers
And the Same Generation Problem
• How to query a dag and search in a dag
• How to layer a dag – a simple structuring problem
http://susning.nu/Drottning_Silvia
Layering of Systems
• To be comprehensible, a system should be structured in layers
• Several relations in a system can be used to structure it, e.g., the
  • Call graph: layered call graph
  • Layered definition-use graph
  • Layered USES relationship
• A layered architecture is the dominating style for large systems
  • Outer, upper layers use inner, lower layers (USES relationship)
• Legacy systems can be analyzed for layering, and if they do not have a layered architecture, their structure can be improved towards this principle
Layering of Acyclic Graphs
• Given any acyclic relation, it can be made layered
  • SameGeneration analysis computes the layers in trees or dags
• Example: layering a family tree:
  • Who is whose contemporary?
  • Who is ancestor of whom?
[Figure: family graph with the nodes Desiree, GustavAdolf, Sibylla, CarlGustav, Silvia, Victoria, Madeleine, Walter, Adam, Ralf, Alice]
Pattern and Rules
• Parenthood can be described by a graph pattern
• We can write the graph pattern also in logic:

isParentOf(Parent,Child1) && isParentOf(Parent,Child2)

• And define the rule

if isParentOf(Parent,Child1) && isParentOf(Parent,Child2)
then sameGeneration(Child1,Child2)

[Figure: graph pattern with Parent linked to Child 1 and Child 2, before and after the rule is applied]
Impact of Rule on Family Graph
[Figure: the family graph (Desiree, GustavAdolf, Sibylla, CarlGustav, Silvia, Victoria, Madeleine, Walter, Adam, Ralf, Alice) before and after applying the rule]
Same Generation
• Beyond sisters and brothers, we can link all people of the same generation
• Base rule:
[Figure: Parent with Child 1 and Child 2, before and after the rule is applied]
• Additional rule (transitive): enters new levels into the graph
[Figure: Parent 1 with Child 1 and Parent 2 with Child 2, before and after the rule is applied]
Impact of Transitive Rule
[Figure: the family graph (Desiree, CarlGustav, GustavAdolf, Victoria, Madeleine, Sibylla, Silvia, Walter, Adam, Ralf, Alice) after applying the transitive rule]
Same Generation Introduces Layers
• Computes all nodes that belong to one layer of a dag
  • If back edges are neglected, also for an arbitrary graph
• Algorithm:
  • Compute Same Generation
  • Go through all layers and number them
• Applications:
  • Compute layers in a call graph
    • Find out the call depth of a procedure from the main procedure
  • Restructuring of legacy software (refactoring)
    • Compute layers of systems by analyzing the USES relationships (ST-I)
    • Insert facade classes for each layer (Facade design pattern)
      • Every call into the layer must go through the facade
    • As a result, the application is much more structured
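As a rough illustration of layer numbering, the sketch below assigns each node of an acyclic USES or call relation to a layer. It deliberately swaps in a simpler technique than the one named on the slide: instead of computing sameGeneration first, it numbers each node by its longest distance from a root. The function name `layers` and the example graph are assumptions; only self-loops are neglected as back edges, so any other cycle is outside the scope of this sketch.

```python
def layers(graph):
    """graph: node -> list of successors; every node must appear as a key."""
    preds = {n: [] for n in graph}
    for src, targets in graph.items():
        for dst in targets:
            if dst != src:  # neglect self-loops (back edges)
                preds[dst].append(src)

    depth = {}

    def depth_of(node):
        # Longest distance from any root; roots get depth 0.
        if node not in depth:
            depth[node] = 1 + max((depth_of(p) for p in preds[node]), default=-1)
        return depth[node]

    result = {}
    for node in graph:
        result.setdefault(depth_of(node), set()).add(node)
    return result

call_graph = {
    "main": ["read", "print", "quicksort"],
    "quicksort": ["searchPivot", "quicksort"],
    "read": [],
    "print": [],
    "searchPivot": [],
}
call_layers = layers(call_graph)
```

The layer number of a procedure is then its call depth from main, and each layer is a candidate for a facade.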
Searching Graphs – Searching in Specifications with Datalog and EARS
SameGeneration as a Graph Rewrite System
• The rule system SameGeneration only adds edges.
• Edge addition rewrite systems (EARS) add edges
  • They enlarge the graph, but the new edges can be marked such that they are not put permanently into the graph
  • EARS are declarative (no specification of control flow, and an abstract representation)
• Confluence: The result is independent of the order in which rules are applied
• Recursion: The system is recursive, since the relation sameGeneration is both used and defined
• Termination: The system terminates when all possible edges have been added, at the latest when the graph is complete
• EARS compute reachabilities (graph query, graph analysis)
• SameGeneration can be used for graph analysis
Rule Systems in EARS and Datalog
• Rule systems can be noted textually or graphically (Datalog or EARS)
• Datalog contains textual if-then rules, which test predicates about the constants
  • Rules contain variables to match many constants
• EARS (edge addition rewrite systems) contain graph rewrite rules, which add edges
  • Rule nodes contain variables

// conclusion
sameGeneration(Child1, Child2)
:- // say: "if"
// premise
isParentOf(Parent,Child1), isParentOf(Parent,Child2).

// premise
if isParentOf(Parent,Child1) && isParentOf(Parent,Child2)
then // conclusion
sameGeneration(Child1,Child2)

[Figure: EARS rule with Parent linked to Child1 and Child2 on both sides of the rule]
Same Generation Datalog Program
isParentOf(Adam,GustavAdolf).
isParentOf(Adam,Sibylla).
......
if isParentOf(Parent,Child1), isParentOf(Parent,Child2)
then sameGeneration(Child1, Child2).
if sameGeneration(Parent1,Parent2),
   isParentOf(Parent1,Child1), isParentOf(Parent2,Child2)
then sameGeneration(Child1, Child2).
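How such a rule system runs can be illustrated with a naive fixpoint iteration in Python: apply both rules until no new sameGeneration edge appears. This is a sketch, not how a Datalog engine or an EARS tool is actually implemented. The first two facts come from the slide; the other two are hypothetical and only added so that the recursive rule has something to derive.

```python
parent_of = {
    ("Adam", "GustavAdolf"),
    ("Adam", "Sibylla"),
    # Hypothetical facts, added only to exercise the recursive rule.
    ("GustavAdolf", "CarlGustav"),
    ("Sibylla", "Ralf"),
}

def same_generation(parent_of):
    """Naive bottom-up evaluation: iterate the two rules up to the fixpoint."""
    derived = set()
    while True:
        new = set()
        for parent1, child1 in parent_of:
            for parent2, child2 in parent_of:
                # Base rule: children of the same parent.
                if parent1 == parent2:
                    new.add((child1, child2))
                # Recursive rule: children of parents in the same generation.
                if (parent1, parent2) in derived:
                    new.add((child1, child2))
        if new <= derived:  # nothing new: fixpoint reached
            return derived
        derived |= new

sg = same_generation(parent_of)
```

Because the rules only add edges, the order of rule application does not matter and the loop stops as soon as an iteration adds nothing. As written on the slide, the base rule also relates every child to itself, since Child1 and Child2 may match the same node.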
Searching is Easy With Datalog
# An SMPP problem (searching, for a single source, a set of multiple targets)
descendant(Adam,X)?
X={ Silvia, Carl-Gustav, Victoria, ....}

# An MSPP problem (multiple sources, single target)
descendant(X,Silvia)?
X={Walter, Adam, Alice}

# An MMPP problem (multiple sources, multiple targets)
ancestor(X,Y)?
{X=Walter, Y={Adam}
X=Victoria, Y={CarlGustav, Silvia, Sibylla, ...}
Y = Adam, Walter, ...   # Victoria, Madeleine, CarlPhilip not in the set
Description Logic (DL)
• A special form of typed binary Datalog (typed EARS)
  • Only with unary and binary relations
  • Classes and objects, instead of untyped Datalog constants
  • Relationship types and relations
  • All knowledge is specified with triples, simple sentences of the form Subject-Predicate-Object
  • OWL (Web Ontology Language)

Adam instanceOf Person.
Sibylla instanceOf Person.
GustavAdolf instanceOf Person.
King isA Person.
Princess isA Person.
...
Adam parentOf GustavAdolf.
Adam parentOf Sibylla.
...

[Figure: typed family graph with the nodes Desiree:Person, GustavAdolf:Person, CarlGustav:King, Victoria:Princess, Madeleine:Princess, Sibylla:Person, Silvia:Person, Walter:Person, Adam:Person, Ralf:Person, Alice:Person]
Datalog, DL, OCL, and EARS: Extended Relational Algebra
• Datalog, DL and EARS correspond to relational algebra with recursion (see lecture on data bases).
  • SQL has no recursion, SQL-3 has
  • Negation can be added
  • Datalog is a simple variant of Prolog
• DL languages:
  • OWL (Web Ontology Language)
  • OIL (Ontology Interchange Language)
• OCL does not have transitive closure, but iteration
[Figure: language hierarchy: Relational Algebra (SQL); Description Logic (OWL); Datalog (with recursion; SQL3); Datalog with negation and recursion; OCL]
Datalog, DL, OCL, and EARS: Extended Relational Algebra
[Figure: language hierarchy with a region marked “decidable” and a region marked “Business rules”: Relational Algebra (SQL) and OCL; Description Logic (OWL), where classes = unary predicates; Binary Datalog (EARS), with binary predicates; Datalog (with recursion) (SQL3); Datalog with negation and recursion; Prolog with negation and recursion]
Application Areas of Datalog, DL, OCL, and EARS
• Graph query problems (searching graphs)
  • Reachability of nodes (transitive closure)
  • SSPP, etc.
• Consistency checking of graph-based specifications
  • Name analysis (building def-use graphs)
  • Data analysis
  • Program analysis
    • Building control-flow graphs
    • Data-flow analysis
  • Model analysis (UML, OWL)
• Structurings and algorithms on structured graphs
  • Layering of system relations
  • Reducibility
  • Strongly connected components
• Specification of contracts for procedures and services
  • Prover can statically prove the validity of the contract
Example for Model Validation: Search in UML Diagrams
• Step 1: encode the diagram into a Datalog or DL fact base
• Step 2: define integrity constraint rules
• Step 3: let the rules run
Example: The Domain Model of the Web-Based Course System
[Figure: UML class diagram of the course system with the classes Pupil, Education, Teacher, Course, CourseStatus, CourseOwner, CourseModifier, Module, ModuleStatus, Answer, Alternative, Question, QuestionStatus, Link; the associations teacher, hasPupil, hasCourse, hasModule, linksTo (partly constrained by {OR}); and attributes such as name, description, lastChanged, changedBy, active, beginDate, endDate, ready, resultProcent, category, text, status, URL]
Searching with Datalog or DL: Queries on UML Class Diagrams

// Step 1: construct fact base: the UML class diagram
// in Datalog fact syntax.
teacher(programming,john).
hasCourse(programming,lisp).
hasPupil(programming,mary).
hasModule(lisp,closures).

// Step 2: construct integrity constraint rules
// (hasPupil takes the education first, as in the fact above)
reads(Person,Module) :-
    hasPupil(E,Person), hasCourse(E,C), hasModule(C,Module).

// Step 3: let rules run: form and execute a query
:- reads(mary, Module)

// the answer
Module = closures
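The three steps can be mimicked in Python with sets of tuples as the fact base. This is a sketch under the assumption that hasPupil is read as hasPupil(Education, Pupil), following the argument order of the fact hasPupil(programming,mary); the dictionary layout and the function name `reads` are illustrative choices.

```python
# Step 1: fact base, one set of tuples per predicate.
facts = {
    "teacher": {("programming", "john")},
    "hasCourse": {("programming", "lisp")},
    "hasPupil": {("programming", "mary")},
    "hasModule": {("lisp", "closures")},
}

# Step 2: the rule reads(Person, Module) as a join over three relations.
def reads(facts):
    return {(person, module)
            for education, person in facts["hasPupil"]
            for education2, course in facts["hasCourse"]
            if education2 == education
            for course2, module in facts["hasModule"]
            if course2 == course}

# Step 3: the query reads(mary, Module).
modules_of_mary = {module for person, module in reads(facts) if person == "mary"}
```

The shared variables E and C of the Datalog rule turn into the two equality tests of the join.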
Example: Web Queries with Logic
The Web is a gigantic graph
  Pages are trees, but links create real graphs
  Links are a secondary structure which overlays the primary tree structure
  Interpret tree and links as relations
  Graph algorithms and queries can be applied to the web!
RDF (Resource Description Framework, a simple graph language)
OWL (description logic, www.w3c.org) adds classes, inheritance, inheritance on binary relations, expressions and queries on binary relations
Other experimental languages: SPARQL (Manchester), Flora/XSB (NY Stony Brook, www.ontoprise.com), Florid (Freiburg)
New languages are being developed
  In the European network REWERSE (www.rewerse.net)
Reachability Queries with Transitive Closure
The Swiss Army Knife of Graph Analysis
Who is Descendant of Whom?
Sometimes we need to know transitive edges, i.e., edges that follow edges of the same color
Question: what is reachable from a node?
  Which descendants does Adam have?
Answer: transitive closure calculates reachability over nodes
  It contracts a graph, inserting masses of edges to all reachable nodes
  It contracts all paths to single edges
  It makes reachability information explicit
After transitive closure, it can easily be decided whether a node is reachable or not
Basic premise: the base relation is not changed (offline problem)
Transitive Closure as Datalog Rule System
Basic rule
  descendant(V,N) :- isChildOf(V,N).
[Figure: the basic rule drawn on the nodes Parent and Child]
Transitive rule (recursion rule)
  left recursive:  descendant(V,N) :- descendant(V,X), isChildOf(X,N).
  right recursive: descendant(V,N) :- isChildOf(V,X), descendant(X,N).
[Figure: the recursion rule drawn on the nodes Parent, Child, and GrandCh]
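How such a rule system is evaluated bottom-up can be sketched in a few lines of Python. This is a naive fixpoint iteration over the basic rule and the right-recursive rule, written only for illustration; the function name descendant mirrors the predicate, and the sample facts (carol, bob, adam) are hypothetical.

```python
def descendant(is_child_of):
    """Naive bottom-up evaluation of the two descendant rules."""
    # Basic rule: descendant(V, N) :- isChildOf(V, N).
    result = set(is_child_of)
    while True:
        # Right-recursive rule:
        # descendant(V, N) :- isChildOf(V, X), descendant(X, N).
        derived = {
            (v, n)
            for (v, x) in is_child_of
            for (x2, n) in result
            if x == x2
        }
        if derived <= result:   # fixpoint reached: no new facts
            return result
        result |= derived

# Hypothetical facts: carol is a child of bob, bob is a child of adam.
facts = {("carol", "bob"), ("bob", "adam")}
print(sorted(descendant(facts)))
# [('bob', 'adam'), ('carol', 'adam'), ('carol', 'bob')]
```

The loop re-derives all known facts in every round, which is why practical engines use the semi-naive variant instead.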
Impact of Basic Rule
[Figure: family graph with the nodes Desiree, CarlGustav, GustavAdolf, Victoria, Madeleine, Sibylla, Silvia, Walter, Adam, Ralf, and Alice, drawn twice to show the impact of the basic rule]
Impact of Recursion Rule
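[Figure: family graph with the nodes Desiree, CarlGustav, GustavAdolf, Victoria, Madeleine, Sibylla, Silvia, Walter, Adam, Ralf, and Alice, showing the impact of the recursion rule]
Impact is only shown for Adam, but the rule is applied to the other nodes too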
[S|M][S|M]PP Path Problems: Variants of Graph Reachability
Single Source Single Target Path Problem, SSPP:
  Test whether there is a path from a source to a target
Single Source Multiple Target, SMPP:
  Test whether there is a path from a source to several targets
  Or: find n targets reachable from one source
Multiple Source Single Target, MSPP:
  Test whether there is a path from n sources to one target
Multiple Source Multiple Target, MMPP:
  Test whether there is a path from n sources to n targets
All can be computed with transitive closure:
  Compute the transitive closure
  Test whether sources and targets are direct neighbors in the closure
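The two-step recipe (compute the closure, then test direct neighborship) can be sketched as follows. The closure here uses Warshall's algorithm, which is one possible choice and not prescribed by the slides. The helper names and the sample graph are hypothetical, and "path from n sources to n targets" is read as "every source reaches every target", which is an assumption.

```python
def transitive_closure(nodes, edges):
    """Warshall's algorithm on a set of edge pairs, O(n^3) membership tests."""
    closure = set(edges)
    for k in nodes:
        for i in nodes:
            if (i, k) in closure:
                for j in nodes:
                    if (k, j) in closure:
                        closure.add((i, j))
    return closure

def path_exists(closure, sources, targets):
    """True if every source reaches every target (covers SSPP to MMPP)."""
    return all((s, t) in closure for s in sources for t in targets)

def reachable_targets(closure, source, candidates):
    """The candidates reachable from one source (second reading of SMPP)."""
    return {t for t in candidates if (source, t) in closure}

nodes = ["a", "b", "c", "d"]
edges = {("a", "b"), ("b", "c"), ("d", "c")}    # hypothetical graph
tc = transitive_closure(nodes, edges)

print(path_exists(tc, ["a"], ["c"]))             # SSPP: True
print(path_exists(tc, ["a"], ["b", "c"]))        # SMPP: True
print(path_exists(tc, ["a", "d"], ["c"]))        # MSPP: True
print(path_exists(tc, ["a", "d"], ["b", "c"]))   # MMPP: False, d does not reach b
```

Once the closure exists, each of the four problem variants is a plain membership test.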
Exercise: Railway Routes
Base (facts; constants are written in lower case, since capitalized names are variables in Datalog):
  directlyLinked(berlin,potsdam).
  directlyLinked(potsdam,braunschweig).
  directlyLinked(braunschweig,hannover).
Define the predicates
  linked(A,B)
  alsoLinked(A,B)
  unreachable(A,B)
Answer the queries
  linked(berlin,X)
  unreachable(berlin,hannover)
Cost of Transitive Closure
Transitive closure (TC) has many implementations
  Naive: multiplication of boolean matrices, O(n^3)
  Multiplication of boolean matrices with Russian Method is O(n^2.4)
  Nested-loop joins from relational algebra: O(n^3)
    Gets better with semi-naive evaluation, hashed joins, semi-joins, and indices
  Munro/Purdue algorithm is almost linear, but costs space
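Why semi-naive evaluation and hashed joins help can be seen in a small sketch. In each round, only the facts derived in the previous round (the delta) are joined with the base relation, and a dictionary of successors plays the role of the hash index. This is an illustrative Python version under those assumptions, not a tuned implementation; the name semi_naive_closure is made up for this example.

```python
def semi_naive_closure(edges):
    """Transitive closure with semi-naive evaluation and a hash index."""
    # Hash index on the join column: successors of each node.
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)

    closure = set(edges)
    delta = set(edges)          # facts that are new in the current round
    while delta:
        new = set()
        # Join only the delta with the base relation, not the whole closure.
        for (a, b) in delta:
            for c in succ.get(b, ()):
                if (a, c) not in closure:
                    new.add((a, c))
        closure |= new
        delta = new
    return closure

chain = {(0, 1), (1, 2), (2, 3)}
print(sorted(semi_naive_closure(chain)))
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Each derived pair enters the delta exactly once, so no round repeats the joins of earlier rounds.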
Transitive Closure and Several Relations
Transitive closure works on one relation
If we want to know whether a certain node is reachable under several relations
  Compute transitive closure on all of them
  Test neighborship directly
This delivers an implementation of the existential quantifier for logic
Central Theorem of Datalog/DL/EARS
Any Datalog program or EARS graph rewrite system can be transformed into an equivalent one
  That is free of recursion
  And only applies the operator Transitive Closure
  (The transitive closure uses direct recursion, but encapsulates it)
What does this mean in practice? (Remember, Datalog/EARS can be used to specify consistency constraints on graph-based specifications)
SameGeneration as Non-Recursive System
Basic rule as before
[Figure: the basic rule drawn on the nodes Parent, Child 1, and Child 2, connected by isChildOf edges]
Additional non-recursive rule (descendant is the transitive closure of isChildOf)
[Figure: the additional rule drawn on the nodes Parent 1, Child 1, and Child 2, connected by descendant (isChildOf*) edges]
Applications of Graph Reachability in Consistency Checking
Corollary: To solve an arbitrary reachability problem, use a non-recursive query and the operator TransitiveClosure.
Consequence: if a graph-based specification is to be checked for consistency (by evaluation of consistency constraints),
  it can be done with a non-recursive Datalog query and the operator TransitiveClosure
  And solved with the complexity of a good TransitiveClosure algorithm
Precondition: the input graphs are fixed, i.e., do not change (static problem)
Since the relation is one of the qualities of the world, this is a central problem of computer science and IT
  Similar to searching and sorting
Generic Datalog Queries
Transitive closure is a general graph operator
  Computing reachability
  Can be applied generically to all relations!
Many other Datalog rule systems are also generic
  sameGeneration
  stronglyConnectedComponents
  dominators
And that's why we consider them here:
  They can be applied to design graphs
  Is class X reachable from class Y?
  Show me the ancestors in the inheritance graph of class Y
  Is there a cycle in this cross-referencing graph?
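The three design-graph questions can be answered by one generic reachability function that does not care which relation it is given. The following Python sketch uses a depth-first search per start node instead of a full closure, which is an implementation choice of this example. The class names and both relations are hypothetical.

```python
def reachable_from(edges, start):
    """All nodes reachable from start via one or more edges (depth-first)."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, []).append(b)
    seen, stack = set(), list(succ.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(succ.get(node, []))
    return seen

def has_cycle(edges):
    """A graph has a cycle if and only if some node reaches itself."""
    return any(a in reachable_from(edges, a) for (a, _) in edges)

# Hypothetical inheritance relation: (subclass, superclass).
inherits = {("Square", "Rectangle"), ("Rectangle", "Shape")}
# Hypothetical cross-referencing relation with a cycle.
xref = {("a", "b"), ("b", "c"), ("c", "a")}

print("Shape" in reachable_from(inherits, "Square"))  # reachability: True
print(sorted(reachable_from(inherits, "Square")))     # ancestors: ['Rectangle', 'Shape']
print(has_cycle(xref), has_cycle(inherits))           # True False
```

The same two functions work unchanged on USES, call, or containment relations, which is the sense in which the queries are generic.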
Application: Consistency Checking of Graph-Based Models
When a specification becomes big...
Example 1: Consistency Checking for Car Specifications
Car data specifications in the MOST standard
  Thousands of parts, described for an entire supplier industry
  Many inconsistencies possible
  Due to human errors
Global variants of the cars must be described
Examples of context conditions for global variants of cars:
  The problem of English cars: A steering wheel on the right implies accelerator, brake, clutch on the right
  Automatic gears: an automatic gear box requires an automatic gear-shift lever
First Idea
Define a context-free grammar for the car data
  From that, derive an XML schema for the car data
Enrich the grammar nonterminals with attributes
Parse the data and validate it according to its context-free structure
Second Idea
Analyze consistency of the specifications by regarding them as graphs
Check definition criterion (name analysis)
  “Is every name I refer to defined elsewhere?”
Analyze layers with SameGeneration
  How many layers does my car specification have?
  Is it acyclic?
Write a query that checks the consistency of global variants
  If the car is to be exported to England, the steering wheel and the pedals should be on the right side
  If the car has an automatic gear box, it must have an automatic gear-shift lever
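Two of these checks, name analysis and the variant conditions, are sketched below in Python. The data layout (one dictionary per car variant), the property names, and the sample variants are invented for this illustration and are not taken from the MOST standard.

```python
def undefined_names(defined, referenced):
    """Name analysis: names that are referenced but never defined."""
    return sorted(set(referenced) - set(defined))

def variant_violations(variants):
    """Check the two context conditions on every car variant."""
    found = []
    for name in sorted(variants):
        spec = variants[name]
        # Steering wheel on the right implies pedals on the right.
        if spec.get("steering_wheel") == "right" and spec.get("pedals") != "right":
            found.append((name, "pedals must be on the right"))
        # Automatic gear box requires an automatic gear-shift lever.
        if (spec.get("gear_box") == "automatic"
                and spec.get("gear_shift_lever") != "automatic"):
            found.append((name, "automatic gear box needs automatic gear-shift lever"))
    return found

variants = {                                    # hypothetical specifications
    "uk_model": {"steering_wheel": "right", "pedals": "left",
                 "gear_box": "manual", "gear_shift_lever": "manual"},
    "us_model": {"steering_wheel": "left", "pedals": "left",
                 "gear_box": "automatic", "gear_shift_lever": "automatic"},
}
print(variant_violations(variants))
# [('uk_model', 'pedals must be on the right')]
print(undefined_names(defined={"door", "wheel"}, referenced={"wheel", "mirror"}))
# ['mirror']
```

In a Datalog setting each condition would be one rule that derives a violation fact; an empty result means the specification is consistent with respect to these rules.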
Third Idea: Use Ontology Language
OWL (description logic) can be used for consistency constraints, also of car specifications
Result: an ontology, a vocabulary of classes with consistency constraints
  OWL engines (RACER, Triple) can evaluate the consistency of car specifications
  Ontologies can formulate consistency criteria for an entire supplier chain [Aßmann2005]
Example 2: Consistency Checking of Tax Declarations
Task: you have been hired by the tax authorities. Write a program that checks the income tax declarations for consistency
Represent the tax declarations with graphs.
  How many graphs will you get?
  How big are they?
  How much memory do you need at least?
First Idea
Write a context-free grammar for the tax declarations
  From that, derive an XML schema
Enrich the grammar nonterminals with attributes
Check the context-free structure of the tax declarations with the XML parser (context-free consistency)
  This is usually assured by the tax form
  It is nevertheless necessary once the forms have been fed into a computer, to catch data-entry problems.
Second Idea
Write queries that check constraints that are local to one document but global within it
  Are there bills for all claimed tax reductions?
  Are the appendices consistent with the main tax document?
Global constraints over all tax declarations:
  Have all bills for all claimed tax reductions really been paid by the taxpayer?
  Is a reduction for a debt claimed only once per couple?
  ...
Write an OCL invariant specification for the tax UML class diagram that checks the constraints
Use the Dresden OCL toolkit to solve the problem: http://dresden-ocl.sf.net
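To make the difference between the two kinds of constraints concrete, here is a Python sketch with one document-local check and one check over all declarations. It stands in for the OCL invariants and does not use the Dresden OCL toolkit. The record layout (keys couple, claims, bills, debt_claims) and the sample data are hypothetical.

```python
from collections import Counter

def missing_bills(declaration):
    """Document-local check: claimed reductions without a matching bill."""
    return sorted(set(declaration["claims"]) - set(declaration["bills"]))

def double_claims(declarations):
    """Global check: (couple, debt) pairs claimed in more than one declaration."""
    counts = Counter(
        (d["couple"], debt)
        for d in declarations
        for debt in set(d["debt_claims"])
    )
    return sorted(key for key, n in counts.items() if n > 1)

declarations = [                                # hypothetical declarations
    {"couple": "c1", "claims": ["donation", "commute"],
     "bills": ["donation"], "debt_claims": ["mortgage"]},
    {"couple": "c1", "claims": [], "bills": [], "debt_claims": ["mortgage"]},
    {"couple": "c2", "claims": ["donation"],
     "bills": ["donation"], "debt_claims": ["mortgage"]},
]
print(missing_bills(declarations[0]))   # ['commute']
print(double_claims(declarations))      # [('c1', 'mortgage')]
```

The first check needs only one declaration in memory, while the second has to see all of them, which is what drives the memory question raised in the task.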
Third Idea: Use Ontology Language
OWL (description logic) can be used for consistency constraints, also of tax declarations
Result: a tax ontology, a vocabulary of classes with consistency constraints
  OWL engines (RACER, Triple) can evaluate the consistency of tax specifications
  Ontologies can formulate consistency criteria for an entire administrative workflow [Aßmann2005]
Example 3: UML Specifications in Software Engineering
Imagine a UML model of the Java Development Kit (JDK).
  7000 classes
  Inheritance tree on classes
  Inheritance dag on interfaces
  Definition-use graph: how big?
Task: You are the release manager of the new JDK 1.6. It has 1000 more classes.
  Ensure consistency, please. How?
Ideas
Build up inheritance graphs and definition-use graphs in a database
Analyse conditions such as
  Depth of inheritance tree: how easy is it to use the library?
  Hot-spot methods and classes: most-used methods and classes (e.g., String)
    Optimize them
  Does every class/package have a tutorial?
  Is every class contained in a roadmap for a certain user group? (i.e., does the documentation explain how to use a class?)
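Two of these analyses are easy to sketch once the graphs are available as relations. The Python below computes the depth of a class in a single-inheritance tree and ranks the most-used classes of a definition-use relation. All class names and the data layout are hypothetical; the depth function assumes the inheritance relation really is a tree (no cycles).

```python
from collections import Counter

def inheritance_depth(superclass_of, cls):
    """Number of inheritance steps from cls up to its root class."""
    depth = 0
    while cls in superclass_of:     # assumes an acyclic inheritance tree
        cls = superclass_of[cls]
        depth += 1
    return depth

def hot_spots(uses, top=1):
    """Most-used classes in a definition-use relation of (user, used) pairs."""
    return Counter(used for (_, used) in uses).most_common(top)

superclass_of = {"B": "A", "C": "B"}                      # C extends B extends A
uses = [("X", "String"), ("Y", "String"), ("Y", "List")]

print(inheritance_depth(superclass_of, "C"))                 # 2
print(max(inheritance_depth(superclass_of, c) for c in "ABC"))  # 2
print(hot_spots(uses))                                       # [('String', 2)]
```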
Exam Enrollment
Check whether a student can enroll in a lecture
Check whether a student has passed the master's degree
What Have We Learned
Graphs and logic are isomorphic to each other
Using logic or graph rewrite systems, models can be validated
  Analyzed
  Queried
  Checked for consistency
  Structured
Applications are manifold, using all kinds of system relationships
  Consistency of UML class models (domain, requirement, design models)
  Structuring (layering) of USES relationships
Logic and graph rewriting technology involves reachability questions
Logic and graph rewrite systems are the Swiss army knife of the validating modeler
The End