Planning to Learn with a Knowledge Discovery Ontology

Monika Žáková, Petr Křemen, Filip Železný (Czech Technical University, Prague); Nada Lavrač (Institute Jozef Stefan, Ljubljana)



FP6 SEVENPRO project: “semantic engineering environment”

- integration of knowledge from various sources (e.g. different CAD software, ERP) by means of a layer of semantic annotations

- a significant part of engineering knowledge has a rich relational structure (CAD designs, documents, simulation models, ERP databases), so traditional ML techniques and tools are unsuitable

Goals:

- make implicit knowledge contained e.g. in CAD designs explicit for reuse, training, quality control

- develop a tool for relational data mining (RDM) capable of dealing with semantic annotations and producing results in a semantic format

Design Example

In the CAD ontology:

    <rdfs:Class rdf:ID="PrismSolFeature"></rdfs:Class>
    <rdfs:Class rdf:ID="SolidExtrude">
      <rdfs:subClassOf rdf:resource="#PrismSolFeature"/>
    </rdfs:Class>

Declaring it in background knowledge:

    subclass(prismSolFeature, solidExtrude).
    hasFeature(B, F1) :- hasFeature(B, F2), subclassTC(F1, F2).

Problem with subsumption:

    C = liner(P) :- hasBody(P, B), hasFeature(B, prismSolFeature).
    D = liner(P) :- hasBody(P, B), hasFeature(B, solidExtrude).

- C does not θ-subsume D, because the two clauses differ in a constant

- so clause D is not obtained by applying a specialization refinement operator to clause C

- our approach: extend the refinement operator with taxonomies on predicates and terms

Sorted Refinement

Downward Δ,Σ-refinement

- extension of sorted refinement proposed by Frisch

- defined using 3 refinement rules:

1. adding a literal to the conjunction

2. replacing a sort τ1 in a literal pred1(x1:τ1, …, xn:τn) with one of its direct subsorts τ1', giving pred1(x1:τ1', …, xn:τn)

3. replacing a literal pred1(x1:τ1, …, xn:τn) with one of its direct subrelations pred2(x1:τ1, …, xn:τn)
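
Rules 2 and 3 can be sketched in Python as follows. This is only an illustration: the representation of a literal as a predicate name plus a tuple of argument sorts, the function name `refine`, and the dictionaries `DIRECT_SUBSORTS` and `DIRECT_SUBRELATIONS` are assumptions, with entries borrowed from the sort theory shown in the RDM Core Overview. Rule 1 (adding a literal) is left out.

```python
from typing import Dict, List, Tuple

Literal = Tuple[str, Tuple[str, ...]]  # (predicate name, sorts of its arguments)
Clause = Tuple[Literal, ...]           # a conjunction of literals

# Hypothetical taxonomies, stored as parent -> direct children.
DIRECT_SUBSORTS: Dict[str, List[str]] = {"CADEntity": ["CADPart"]}
DIRECT_SUBRELATIONS: Dict[str, List[str]] = {
    "hasSketch": ["hasCircularSketch"],
    "hasFeature": ["firstFeature"],
}


def refine(clause: Clause) -> List[Clause]:
    """Apply rule 2 and rule 3 once at every position of the clause."""
    results: List[Clause] = []
    for i, (pred, sorts) in enumerate(clause):
        # Rule 2: replace one argument sort with one of its direct subsorts.
        for k, sort in enumerate(sorts):
            for sub in DIRECT_SUBSORTS.get(sort, []):
                new_sorts = sorts[:k] + (sub,) + sorts[k + 1:]
                results.append(clause[:i] + ((pred, new_sorts),) + clause[i + 1:])
        # Rule 3: replace the predicate with one of its direct subrelations.
        for sub_pred in DIRECT_SUBRELATIONS.get(pred, []):
            results.append(clause[:i] + ((sub_pred, sorts),) + clause[i + 1:])
    return results


body: Clause = (("hasSketch", ("CADEntity", "Sketch")),)
refinements = refine(body)
for r in refinements:
    print(r)
```

Each returned clause differs from the input in exactly one sort or one predicate, which is what makes the steps "direct" refinements.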

Feature Taxonomy

- information about feature subsumption hierarchy stored and passed to the propositional learner

- assume that features $f_1, \dots, f_n$ have been generated with corresponding conjunctive bodies $b_1, \dots, b_n$

- elementary subsumption matrix $E$ of $n$ rows and $n$ columns is defined such that $E_{i,j} = 1$ whenever $b_i \in \rho_{\Delta,\Sigma}(b_j)$ and $E_{i,j} = 0$ otherwise

- exclusion matrix $X$ of $n$ rows and $n$ columns is defined such that $X_{i,j} = 1$ whenever $i = j$ or $b_i \in \rho_{\Delta,\Sigma}(\rho_{\Delta,\Sigma}(\dots \rho_{\Delta,\Sigma}(b_j) \dots))$ and $X_{i,j} = 0$ otherwise
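
As a minimal sketch, X can be derived from E as the reflexive-transitive closure of the direct-refinement relation. The code reads E[i][j] = 1 as "b_i is a direct refinement of b_j"; the function name `exclusion_matrix` and the three-feature example are assumptions for illustration, and Warshall's algorithm is just one straightforward way to compute the closure.

```python
from typing import List


def exclusion_matrix(E: List[List[int]]) -> List[List[int]]:
    """X[i][j] = 1 iff i == j or b_i is reachable from b_j by refinement steps."""
    n = len(E)
    X = [row[:] for row in E]
    # Warshall's algorithm: if b_i refines b_k and b_k refines b_j,
    # then b_i is a (possibly indirect) refinement of b_j.
    for k in range(n):
        for i in range(n):
            if X[i][k]:
                for j in range(n):
                    if X[k][j]:
                        X[i][j] = 1
    for i in range(n):
        X[i][i] = 1  # the i = j case of the definition
    return X


# Hypothetical chain of three features: b_1 refines b_0, b_2 refines b_1.
E = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
X = exclusion_matrix(E)
for row in X:
    print(row)
```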

Propositional Rule Learning

2 propositional algorithms adapted to utilize matrices E, X:

1. Top-down deterministic algorithm

– stems from the rule inducer of RSD

2. Stochastic local DNF algorithm (Rückert 2003, Paes 2006)

– search in the space of DNF formulas

– refinement done by local non-deterministic DNF term changes

Using matrices E, X we can:

– prevent the combination of a feature and its subsumee within the conjunction (both algorithms)

– specialize a conjunction by replacing a feature with its direct subsumee (top-down only)
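
The two uses of the matrices can be sketched as follows. Conjunctions are represented as sets of feature indices, and the helper names `is_admissible` and `specialize` are hypothetical; the code assumes E[i][j] = 1 (respectively X[i][j] = 1) means feature i is a direct (respectively direct or indirect) subsumee of feature j. It is not the adapted RSD or DNF learner itself.

```python
from typing import FrozenSet, List


def is_admissible(conj: FrozenSet[int], X: List[List[int]]) -> bool:
    """Reject a conjunction that contains a feature together with one of its subsumees."""
    return not any(X[i][j] for i in conj for j in conj if i != j)


def specialize(conj: FrozenSet[int], E: List[List[int]]) -> List[FrozenSet[int]]:
    """Replace one feature j of the conjunction with a direct subsumee i (E[i][j] == 1)."""
    out: List[FrozenSet[int]] = []
    for j in sorted(conj):
        for i in range(len(E)):
            if E[i][j] and i not in conj:
                out.append((conj - {j}) | {i})
    return out


# Hypothetical example: feature 1 is a direct subsumee of feature 0, feature 2 is unrelated.
E = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
X = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]

print(is_admissible(frozenset({0, 2}), X))  # features 0 and 2 are unrelated
print(is_admissible(frozenset({0, 1}), X))  # feature 1 is a subsumee of feature 0
print(specialize(frozenset({0, 2}), E))
```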

RDM Core Overview

[Diagram labels: Feature construction; Feature subsumption; Subsumption and exclusion; Propositional rule learning (adapted)]

Predicate declarations:

    mode hasBody(+CADPart, -Body).
    mode hasMaterial(+CADPart, -Material).
    mode hasSketch(+CADPart, -Sketch).
    mode hasLength(+Sketch, -float).

Sort theory:

    subClassOf(CADPart, CADEntity).
    subPropertyOf(hasCircularSketch, hasSketch).
    subPropertyOf(firstFeature, hasFeature).

Examples:

    eItem(eItemT_BA1341).
    eItem(eItemT_BA1342).
    eItem(eItemT_BA1343).

Background knowledge (Horn logic)

[Figure: CAD sketch illustration with labels "has Constraint", "x1, y1", "x2, y2"]

Propositional learning (Weka, R)

RDM Manager

= tool developed for running the RDM tasks

Functionalities:

1. Obtaining relevant data by means of a SPARQL query to the semantic repository

2. Converting data from the semantic representation into a format acceptable by the DM algorithms (Prolog, arff, csv, etc.)

3. Propositionalization by generating first-order features

4. Enhanced propositional rule learning algorithms

5. Third-party propositional learning algorithms integrated by means of wrappers, e.g.

– rule learner RIPPER (Cohen 1995)

– association rules: Apriori

– decision trees: J48 algorithm

(for all of the above, the WEKA implementation is used)

– clustering: distance-based PCA (implemented in R)

6. Storing information about DM processes and their results in a semantic representation
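
As an illustration of functionality 2, the sketch below writes a propositionalized table (one 0/1 column per first-order feature plus a class label) in the ARFF text format. The function name `to_arff`, the feature names, and the class labels are hypothetical; this is not the RDM Manager's converter.

```python
from typing import List, Sequence, Tuple


def to_arff(relation: str,
            feature_names: Sequence[str],
            rows: Sequence[Tuple[Sequence[int], str]],
            class_values: Sequence[str]) -> str:
    """Render boolean feature vectors with class labels as ARFF text."""
    lines: List[str] = [f"@relation {relation}", ""]
    for name in feature_names:
        # Each generated feature becomes a nominal attribute with values 0 and 1.
        lines.append(f"@attribute {name} {{0,1}}")
    lines.append(f"@attribute class {{{','.join(class_values)}}}")
    lines += ["", "@data"]
    for values, label in rows:
        lines.append(",".join(str(v) for v in values) + f",{label}")
    return "\n".join(lines) + "\n"


arff = to_arff(
    "cad_items",                          # hypothetical relation name
    ["f1", "f2"],                         # hypothetical feature names
    [([1, 0], "pos"), ([0, 1], "neg")],   # hypothetical examples
    ["pos", "neg"],
)
print(arff)
```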

Knowledge Discovery Ontology

Foreseen queries that guided the design of the ontology

- User:

– Give me all rule-based classifiers found for class C on dataset D with error estimate < 5%

– Give me the rule-based algorithm with the shortest average runtime for datasets D, E and F

- Developer:

– Give me all pairs of model classes with equivalent expressiveness for which no conversion program is available

– Give me all parameter settings for experiments with dataset D and algorithm A and their respective runtimes and accuracy results

Example Queries to the KD ontology

- Obvious idea: if the system knows all it can do, it can plan complex KD workflows

- Example: queries a planning system poses to the ontology to generate a decision tree from a relational dataset through propositionalization

– Give me a program that takes a classified relational dataset represented as Prolog facts and produces an arff file

– Give me a program that takes an arff file and produces a decision tree

Motivation for Workflow Generation

- user:

– RDM algorithms utilizing background knowledge, and relational learning through propositionalization followed by propositional learning, are quite complex

– we want to hide as much of the complexity as possible from the user

- developer/data miner:

– storing information about the whole process enables repeatability of experiments

– individual components are developed by different people, who can focus on experimenting with parameters of some components and view the others as black boxes

Main Classes of KD Ontology

- main notions: Knowledge and Algorithm

- representation language: OWL-DL

– densely interlinked knowledge structures, not just taxonomies

– highly optimized reasoners available (Pellet, RacerPro, FaCT++, ...)

5 subclasses:

- Dataset

- LogicalKnowledge

- NonLogicalKnowledge

- Pattern = MiningResult

- multiple formats may be attached to each Knowledge class

– each knowledge instance has a specified KnowledgeFormat

Class expressions shown in the diagram:

    Knowledge and example some Example
    subclassOf Knowledge and hasExpressivity some Expressivity and hasFormat some KnowledgeFormat
    Knowledge and not LogicalKnowledge
    Knowledge and producedBy some AlgorithmExecution

[Figure: expressivity hierarchy shown in Protégé]

Algorithm

- a mapping from knowledge to knowledge

- not just induction, all executable elements incl. preprocessing, ...

- definition of inputs, outputs and parameters

    Apriori subclassOf NamedAlgorithm
      and input some (Dataset
        and hasExpressivity only SingleRelationStructure
        and format only {ARFF, CSV})
      and output some (MiningResult and contains only AssociationRule)
      and minMetric some double
      and minSupport some double
      and numOfRules some positiveInteger

Algorithms (2)

- atomic (named) vs. composite (workflows)

- types of algorithms modeled as classes, e.g. ClusteringAlgorithm

- each algorithm description is modeled as a subclass of class NamedAlgorithm (like Apriori above)

- instances of class AlgorithmExecution represent executions of algorithms

- thus, to access a particular algorithm, we need to pose a schema query to the OWL ontology – SPARQL-DL

- Result of a data mining algorithm

- Describes a mapping from knowledge to knowledge

- Defined as:

    Knowledge and producedBy some AlgorithmExecution
      subclassOf contains only (AtomicKnowledge and singleResultAnnotation some anySimpleType)

- Example: association rules

    AssociationRule subclassOf AtomicKnowledge
      and antecedent some And
      and consequent some And
      and confidence some double
      and support some double

    MiningResult and producedBy only AssociationRulesAlgorithmExecution
      and contains only AssociationRule

Anticipated Usage of the KD Ontology

- a specialization of relevant OWL-S ontology parts, mainly the Process class

- during planning, inputs and outputs will be matched w.r.t. their format and expressivity to filter out invalid algorithm bindings

- beyond workflow generation:

– management of state-of-the-art (SoA) knowledge in the KD domain

– storing and managing KD workflow results, for example for meta-learning and experiment repeatability
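
The matching of inputs and outputs can be pictured with the following sketch. The `KnowledgeSpec` record and the `compatible` function are hypothetical, and expressivity is compared by plain equality here; an actual matcher would presumably consult the inferred class hierarchy of the ontology instead.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class KnowledgeSpec:
    kind: str                 # e.g. "Dataset" or "MiningResult"
    expressivity: str         # e.g. "SingleRelationStructure"
    formats: FrozenSet[str]   # admissible formats, e.g. {"ARFF", "CSV"}


def compatible(produced: KnowledgeSpec, required: KnowledgeSpec) -> bool:
    """A produced item can bind to a required input if kind and expressivity
    agree and at least one format is shared (simplifying assumption)."""
    return (produced.kind == required.kind
            and produced.expressivity == required.expressivity
            and bool(produced.formats & required.formats))


# Input of Apriori as described in these slides.
apriori_input = KnowledgeSpec("Dataset", "SingleRelationStructure",
                              frozenset({"ARFF", "CSV"}))
# Hypothetical outputs of two earlier workflow steps.
propositionalized = KnowledgeSpec("Dataset", "SingleRelationStructure",
                                  frozenset({"ARFF"}))
prolog_facts = KnowledgeSpec("Dataset", "RelationalStructure",
                             frozenset({"Prolog"}))

print(compatible(propositionalized, apriori_input))
print(compatible(prolog_facts, apriori_input))
```

The second binding is rejected on both expressivity and format, which is the kind of invalid algorithm binding the planner should never consider.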

Workflow Construction

Automatic workflow construction:

1. Converting a KD task described using classes from the KD ontology into a planning problem described in PDDL

2. Generating a plan using a planning algorithm

3. Storing the generated abstract workflow in the form of a semantic annotation

4. Instantiating the abstract workflow with specific algorithm configurations available in the KD ontology

Workflow-related Classes of KD Ontology

KD ontology extended with workflow-related classes:

- ProblemDescription, defined using properties

– init, specifying the available input data and knowledge

– goal, specifying the desired results

- Action, defined by

– Algorithm, which is executed

– startTime, duration

– immediately preceding Actions

- Workflow: currently a DAG of Actions with a link to the ProblemDescription from which it was generated
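
Because a Workflow is a DAG in which each Action records its immediately preceding Actions, an execution order can be obtained by topological sorting. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the action names are hypothetical and not taken from the ontology.

```python
from graphlib import TopologicalSorter

# Each action maps to the set of its immediately preceding actions.
# Action names are hypothetical.
preceding = {
    "FeatureConstruction": {"LoadDataset", "LoadBackgroundKnowledge"},
    "Propositionalization": {"FeatureConstruction"},
    "Apriori": {"Propositionalization"},
}

# static_order() yields every action after all of its predecessors and
# raises graphlib.CycleError if the graph is not a DAG.
order = list(TopologicalSorter(preceding).static_order())
print(order)
```

The two loading actions have no mutual ordering constraint, so either may come first; this is the freedom a DAG-shaped workflow allows over a linear plan.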

Problem Description Example

- Example: generating relational association rules from a classified relational dataset with relational background knowledge expressed in OWL-DL

    RelationalAssociationRules subClassOf ProblemDescription
      and goal some (MiningResult and contains only AssociationRule)
      and init some (LogicalKnowledge and hasExpressivity some OWL-DL and hasFormat some {RDFXML})
      and init some (LogicalKnowledge and hasExpressivity some RelationalStructure and hasFormat some {RDFXML})
      and init some (ClassifiedInstanceSet and hasFormat some {RDFXML})

Conversion into a Planning Task Described in PDDL

- ontology classified using FACT reasoner to generate inferred hierarchy on algorithms, knowledge and patterns

- names generated for classes defined using OWL restrictions

- domain description in PDDL

– generated by converting Algorithms into PDDL actions, with inputs specifying the preconditions and outputs specifying the effects

– both inputs and outputs are currently restricted to conjunctions of OWL classes

- problem description in PDDL

– generated in the same way from ProblemDescription

Algorithm Definition Example

Description in KD ontology (in DL formalism):

    Apriori subClassOf NamedAlgorithm
      and input some (Dataset
        and hasExpressivity only SingleRelationStructure
        and format only {ARFF})
      and output some (MiningResult and contains only AssociationRule)
      and minMetric some double
      and minSupport some double
      and numOfRules some positiveInteger

Description used for planning (in PDDL):

    (:action AprioriAlgorithm
      :parameters (?v0 - Dataset_SingleRelationStructure
                   ?v1 - ARFF
                   ?v2 - MiningResult_contains_AssociationRule)
      :precondition (and (available ?v0) (format ?v0 ?v1))
      :effect (and (available ?v2)))
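
The rewriting of an algorithm annotation into a PDDL action can be sketched as simple string generation. The function `pddl_action` and its argument layout are assumptions made for illustration; only the predicates `available` and `format` and the generated type names come from the Apriori example.

```python
from typing import List, Tuple


def pddl_action(name: str,
                inputs: List[Tuple[str, str]],
                outputs: List[str]) -> str:
    """inputs: (knowledge type, format type) pairs; outputs: knowledge types."""
    params: List[str] = []
    pre: List[str] = []
    eff: List[str] = []
    n = 0
    for ktype, fmt in inputs:
        k, f = f"?v{n}", f"?v{n + 1}"
        n += 2
        params += [f"{k} - {ktype}", f"{f} - {fmt}"]
        # Each input must be available and stored in the required format.
        pre += [f"(available {k})", f"(format {k} {f})"]
    for ktype in outputs:
        v = f"?v{n}"
        n += 1
        params.append(f"{v} - {ktype}")
        # Each output becomes available once the algorithm has run.
        eff.append(f"(available {v})")
    return (f"(:action {name}\n"
            f"  :parameters ({' '.join(params)})\n"
            f"  :precondition (and {' '.join(pre)})\n"
            f"  :effect (and {' '.join(eff)}))")


text = pddl_action(
    "AprioriAlgorithm",
    [("Dataset_SingleRelationStructure", "ARFF")],
    ["MiningResult_contains_AssociationRule"],
)
print(text)
```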

Planning Algorithm

- based on the Fast-Forward planning system (Hoffmann, 2001)

- enforced hill climbing algorithm to perform forward state space search

- goal distances estimated using relaxed GRAPHPLAN

– i.e. ignoring delete lists of the operators

- returns the discovered workflows with the lowest number of processing steps
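
A minimal sketch of enforced hill climbing over a delete-relaxed heuristic follows. It is a simplification rather than the Fast-Forward implementation: the heuristic counts relaxed reachability layers instead of extracting a relaxed plan, and the two actions and all fact names are hypothetical, loosely based on the decision-tree example from the ontology queries.

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str] = frozenset()


def relaxed_distance(state, goal, actions) -> Optional[int]:
    """Relaxed layers (delete lists ignored) until the goal holds; None if unreachable."""
    facts, layers = set(state), 0
    while not goal <= facts:
        new = set()
        for a in actions:
            if a.pre <= facts:
                new |= a.add
        if new <= facts:          # fixpoint reached without satisfying the goal
            return None
        facts |= new
        layers += 1
    return layers


def apply_action(state: FrozenSet[str], a: Action) -> FrozenSet[str]:
    return frozenset((state - a.delete) | a.add)


def enforced_hill_climbing(init, goal, actions) -> Optional[List[str]]:
    state, plan = frozenset(init), []
    h = relaxed_distance(state, goal, actions)
    while h is not None and h > 0:
        # Breadth-first search for a state with a strictly smaller heuristic value.
        queue, seen, found = deque([(state, [])]), {state}, None
        while queue and found is None:
            s, path = queue.popleft()
            for a in actions:
                if not a.pre <= s:
                    continue
                t = apply_action(s, a)
                if t in seen:
                    continue
                seen.add(t)
                ht = relaxed_distance(t, goal, actions)
                if ht is not None and ht < h:
                    found = (t, path + [a.name], ht)
                    break
                queue.append((t, path + [a.name]))
        if found is None:
            return None           # EHC is incomplete and may fail here
        state, steps, h = found
        plan += steps
    return plan if h == 0 else None


actions = [
    Action("Propositionalize", frozenset({"prolog_dataset"}), frozenset({"arff_dataset"})),
    Action("J48", frozenset({"arff_dataset"}), frozenset({"decision_tree"})),
]
plan = enforced_hill_climbing({"prolog_dataset"}, frozenset({"decision_tree"}), actions)
print(plan)
```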

Generated Workflow for CAD Designs

RDM Manager implementation

[Architecture diagram labels: RDM GUI; Semantic Server Agent; RDM Manager Tool; RDM Web Service; RDM Engine; Algorithm Implementation 1 … Algorithm Implementation n]



Related Work (planning to learn)

- Most relevant: NEXT System [Bernstein & Deanzer]

- (Our best understanding:)

– Linear plans

– Preprocessing-Induction-Postprocessing template

- We try for a template-free plan (DAG)

[Diagram labels: Multi-relational; Feature construction (inductive); Feature evaluation (deductive); Propositionalized Data; Propositional learning (inductive)]

Related Work (DM workflows and DM assistants )

- workflows for DM

– myGrid/Taverna, Triana, DataMiningGrid, Kepler, KnowledgeGrid, CAMLET, Pegasus, MiningMart

– manual workflow composition, focus on workflow execution

– focus on DM from relational databases

– relevant efforts in formalization of DM processes

- DM assistants

– MetaL, StatLog: classification of DM methods, metrics for comparing the methods, finding suitable methods for a given dataset

Related Work (DM ontologies)

- existing DM ontologies

– ontologies for classical DM with 3 stages: induction, pre- and post-processing

– focus on hierarchy of DM algorithms and propositional dataset description

– DAMON: KnowledgeGrid project [Cannataro & Comito]

– DataMiningGrid application description schema [Stankovski et al.]

– DM ontology for IDEA [Bernstein et al.]

– myGrid ontology: for bioinformatics, includes biological domain concepts

- other work towards KD process formalization

– CinQ and IQ projects (EU FP6)

– Sašo Džeroski: Towards a General Framework for Data Mining

Related Work (Semantic Web Service Composition)

- essentially creating workflows based on semantic description of the ingredients

- popular approach: convert semantic description to PDDL and use suitably adapted planning techniques [Klusch et al.], [Liu et al.]

- we have adapted this approach for DM workflows using KD ontology

- future work: individual DM algorithms as web services?

Open Issues

- Reactive planning / exploration

– Currently planning towards a desired kind of result, not quality

- Conversion of knowledge

– From more to less expressive

– How can we constrain what should remain from the original information?

– Can this be done at all without semantic


Open Issues

- Tighter integration of the ontology with planning

– Currently: simple rewriting of algorithm annotations into PDDL actions

– Work-in-progress: planner poses SPARQL queries to retrieve relevant actions

- Computational platform:

– GRID or web services?