Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)


Page 1: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Knowledge Base Diagnostics

Richard Fikes (Stanford KSL)
Adam Pease (Teknowledge)
Mala Mehrotra (Pragati Synergetic Research Inc.)
Yolanda Gil (USC ISI)
Deborah McGuinness (Stanford KSL)

10/18/01

Page 2: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Knowledge Systems Laboratory, Stanford University

Knowledge Evolution Tools

KB development requires knowledge evolution
  Debugging, refining, structuring, modularizing, …
Power tools are needed to support KB evolution
  KB diagnosis
    > Bugs, omissions, heuristic warnings, architectural advice
  KB partitioning
    > To enable effective reasoning
    > To produce reusable KB building blocks
  KB merging
    > To enable interoperation of KBs with overlapping content
KSL is developing knowledge evolution tools

Page 3: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Chimaera
A Knowledge Evolution Tool Environment

Tools for KB diagnosis and merging
Available as a Web service or an OKBC client
  www.ksl.stanford.edu/software/chimaera
  Usable from a Web browser
  Online user manual, tutorial, and demonstration movie
Performs KB diagnostics in batch mode
  Uploads and analyzes user’s KB
  Accepts KBs in OKBC, KIF, MELD, RDF, DAML, …
  Provides results as HTML pages linked to frames and axioms
  Provides user selectable set of diagnostic tests
Analyzes both the structure and content of a KB
  Uses reasoners to analyze content

Page 4: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Classification of Diagnostic Results

Errors
  Logical inconsistencies
    > E.g., contradictory type constraints
  Content structure errors
    > E.g., terms used but not defined
Anomalies
  Missing information
    > E.g., type constraints
  Redundancies
    > E.g., redundant superclass and type links
  Extraneous structure or content
    > E.g., terms defined but not used
Summaries
  E.g., counts of term references
Suggestions
  E.g., use consistent naming conventions
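
The categories above can be illustrated on a toy KB. The following is a minimal sketch, not Chimaera's implementation: the dictionary-based KB representation and the `diagnose` function are assumptions made for illustration only. It reports terms used but not defined (an error), terms defined but not used (an anomaly), and counts of term references (a summary).

```python
from collections import Counter

# Hypothetical toy KB: each defined term maps to the terms its definition references.
definitions = {
    "Animal": [],
    "Dog": ["Animal"],
    "Cat": ["Animal", "Mammal"],   # "Mammal" is referenced but never defined
    "Owner": ["Dog", "Cat"],
    "Rock": [],                    # "Rock" is defined but never referenced
}

def diagnose(definitions):
    """Return (used-but-undefined terms, defined-but-unused terms, reference counts)."""
    references = Counter(t for refs in definitions.values() for t in refs)
    defined = set(definitions)
    undefined = sorted(set(references) - defined)   # error: content structure
    unused = sorted(defined - set(references))      # anomaly: extraneous content
    return undefined, unused, dict(references)

undefined, unused, counts = diagnose(definitions)
print("Errors (used but not defined):", undefined)
print("Anomalies (defined but not used):", unused)
print("Summary (reference counts):", counts)
```

In this toy representation a top-level term such as "Owner" is also reported as unused, which is why such findings are classed as anomalies for review rather than errors.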

Page 5: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

“Background” Reasoning Analysis

Reasoning diagnostics that may take substantial time
  Performed in background
  Results incrementally posted on Web page
  Completion notification sent to user via e-mail
Example reasoning diagnostics
  Redundant axioms that are inferred by the KB (anomaly)
  Inconsistent axioms whose negations are inferred by the KB (error)
  Determine which relations in KB are primitive and non-primitive (summary)
    > Show relations on which each non-primitive relation depends
  Determine classes that are disjoint (suggest adding results to KB)
  Derive subclass and instance links (suggest adding links to KB)
    > I.e., classification and recognition
  Suggest reordering of an implication’s antecedents based on number of inferable instances of each antecedent (suggestion)
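
The antecedent reordering suggestion can be sketched as follows. The slide does not say which ordering is preferred; this sketch assumes the common heuristic of placing the antecedent with the fewest inferable instances first. The predicate names and instance counts are hypothetical.

```python
def reorder_antecedents(antecedents, instance_counts):
    """Order antecedents so those with the fewest inferable instances come first.

    Assumption: fewest-first is the desired order; the source only says the
    reordering is based on the number of inferable instances.
    """
    return sorted(antecedents, key=lambda a: instance_counts[a])

# Hypothetical counts of inferable instances per antecedent.
instance_counts = {"Person(x)": 10000, "Pilot(x)": 40, "OwnsAircraft(x)": 5}
rule_antecedents = ["Person(x)", "Pilot(x)", "OwnsAircraft(x)"]

suggested = reorder_antecedents(rule_antecedents, instance_counts)
print(suggested)
```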

Page 6: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Integration Into SHAKEN

Chimaera is a KB diagnostics tool in the SHAKEN system
  Used to diagnose both pump priming and SME KBs
OKBC was used to do the integration
  Chimaera is an OKBC client
    > Interacts with any OKBC server using the OKBC API
    > The Chimaera Web service uses Ontolingua as its OKBC server
  SRI added an OKBC wrapper to the KM system
    > Enabled KM to be an OKBC server usable by OKBC clients
    > Enabled Chimaera’s diagnostics to run directly on KM KBs

Page 7: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Chimaera Useful To SRI Team

“Overall, we found that Chimaera was quite useful. It found 2 concepts (Indole and Imidazole) that were corrupted, several occurrences of redundant superclasses, and several incorrect domain and range constraints (due to our poor representation of "Information").

We're currently fixing the bugs it revealed. It would be helpful if we could run Chimera on the component library frequently.”

– Bruce Porter

Page 8: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Next Steps: SME-Oriented Support

Provide interactive repair oriented follow-up to diagnostics
  Identify KB content on which diagnosis result is based
  Suggest repairs or repair strategies
  Guide user through repair procedure
Examples
  Class is a direct subclass of “THING”
    > Provide direct subclasses of THING as candidate superclasses
    > Step down through the class hierarchy
  Class has redundant superclass links
    > Suggest removal of link(s) to most general classes
  Type, cardinality, or bounds conflict
    > Suggest changing local conflicting constraint(s)
  Missing information
    > Initiate acquisition dialogues for missing information
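
The redundant superclass repair can be sketched on a toy class hierarchy. This is an illustration under assumptions, not Chimaera's code: the `direct_supers` mapping and both function names are hypothetical. A direct link is reported as removable when its target is already an ancestor of another direct superclass, which matches the suggestion to remove links to the most general classes.

```python
def ancestors(cls, direct_supers):
    """All classes reachable from cls by following direct superclass links."""
    seen, stack = set(), list(direct_supers.get(cls, ()))
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(direct_supers.get(c, ()))
    return seen

def redundant_superclass_links(direct_supers):
    """Return (class, superclass) links already implied by another direct link."""
    result = []
    for cls, supers in direct_supers.items():
        for s in supers:
            others = [o for o in supers if o != s]
            if any(s in ancestors(o, direct_supers) for o in others):
                result.append((cls, s))
    return sorted(result)

# Hypothetical hierarchy: class name -> direct superclasses.
direct_supers = {
    "Thing": [],
    "Animal": ["Thing"],
    "Dog": ["Animal", "Thing"],   # Dog -> Thing is implied via Animal
}
print(redundant_superclass_links(direct_supers))
```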

Page 9: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Next Steps: Architectural Analysis

Summarize architectural features of a KB
  Percentage of
    > Relations that are functions
    > Axioms that are propositional, first order, higher order
    > Axioms that are not Horn clauses
  Distribution of
    > Axioms by type (using the HPKB, RKF types)
    > Axiom lengths by number of literals
    > Functions by number of arguments
    > Relations by number of arguments
    > Direct subclasses per class
    > Direct subproperties per property
    > Restrictions per object
    > Property values per object
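
A few of these summaries can be sketched over a toy KB. The clause representation (lists of literal/polarity pairs) and the `direct_subclasses` mapping are assumptions chosen for illustration; the slides do not specify how Chimaera represents axioms internally.

```python
from collections import Counter

# Hypothetical axioms in clause form: each clause is a list of (literal, is_positive) pairs.
axioms = [
    [("Dog(x)", False), ("Animal(x)", True)],                        # Horn: one positive literal
    [("Bird(x)", False), ("Flies(x)", True), ("Penguin(x)", True)],  # not Horn: two positive literals
    [("Cat(x)", False), ("Dog(x)", False)],                          # Horn: no positive literal
]
# Hypothetical hierarchy: class name -> direct subclasses.
direct_subclasses = {
    "Thing": ["Animal", "Plant"],
    "Animal": ["Dog", "Cat", "Bird"],
    "Plant": [],
}

def is_horn(clause):
    """A Horn clause has at most one positive literal."""
    return sum(1 for _, positive in clause if positive) <= 1

# Percentage of axioms that are not Horn clauses.
pct_not_horn = 100.0 * sum(not is_horn(c) for c in axioms) / len(axioms)
# Distribution of axiom lengths by number of literals.
length_distribution = Counter(len(c) for c in axioms)
# Distribution of direct subclasses per class.
fanout_distribution = Counter(len(subs) for subs in direct_subclasses.values())

print(f"Not Horn: {pct_not_horn:.1f}%")
print("Axiom lengths:", dict(length_distribution))
print("Direct subclasses per class:", dict(fanout_distribution))
```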

Page 10: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Next Steps: Partitioning and Beyond

Integration of KB partitioning tools into Chimaera
  Provide automatic KB partitioning to enhance usability
Automatic running of test cases
  E.g., queries and expected answers
  Support regression testing of evolving KB
  Provide result summaries from failed tests
Help with typographical errors
  Spelling correction for undefined names
    > E.g., classes, slots, relations, functions, constants
  Spelling correction for anomalously occurring variables
    > Suggest that it is the same as another variable in the sentence
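
Spelling correction for undefined names can be sketched with Python's standard `difflib` module. This is one possible approach, not the method the slides commit to; the name lists, the `suggest_corrections` helper, and the 0.8 similarity cutoff are all assumptions.

```python
import difflib

# Hypothetical names defined in the KB, and names used but not defined.
defined_names = ["Aircraft", "Airport", "Pilot", "has-owner", "located-in"]
undefined_names = ["Aircarft", "Pilott", "weight"]

def suggest_corrections(undefined, defined, cutoff=0.8):
    """Map each undefined name to its closest defined name, if any is similar enough."""
    return {
        name: difflib.get_close_matches(name, defined, n=1, cutoff=cutoff)
        for name in undefined
    }

suggestions = suggest_corrections(undefined_names, defined_names)
for name, candidates in suggestions.items():
    if candidates:
        print(f"{name}: did you mean {candidates[0]}?")
    else:
        print(f"{name}: no similar defined name; possibly a missing definition")
```

The same comparison could be applied to variables within a single sentence, with the other variables of that sentence as the candidate list.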

Page 11: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Summary

KSL is developing Chimaera to support KB evolution
Chimaera was integrated into the SHAKEN Y1 system
  Using OKBC(!)
Incrementally adding diagnostics
  E.g., “background” diagnostics that use sophisticated reasoning
Next steps
  KB partitioning tools
  Repair dialogues for SMEs
  KB architectural analysis
  Regression testing

Page 12: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Role of Diagnostics in Systems

KE support
SME support
  Increase productivity (“lightly trained”)
  Step in managing KB development
  Focus attention (e.g., redundant links)
Evaluation support
  Diagnose KBs produced during evaluation
  Batch mode
  Foreground
  Background
  Changes in “patterns” in the KB between versions

Page 13: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Sharing Diagnostics Information

Diagnostic specifications
  Logical specifications
  English specifications
  Test cases
Diagnostic classifications
Learnings
Tricks of the trade
Sharing facilitators:
  Working group
  Mailing list
Findings data
  Author, group, or team specific
Repair strategies
Alignments during collaborative development

Page 14: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Developer Needs and Desires

Reasoner-specific diagnostics
Highly informative diagnostic results
Reporting architectural bias in a KB
  Binary versus higher order relations
  First order versus higher order axioms
    > Weakly versus strongly higher order
  Disjunctions or conjunctions
  Existential versus universal quantifiers
  Frames to axioms ratios
  Horn clauses
  Axiom lengths
  Functions
Confusion of existential and universal quantifiers
Type restrictions too general
Misspelling of variables

Page 15: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Developer Needs and Desires

Domain-specific tests
Semantic tests
Maintainability measures
Recognizing typographical errors
  Spell check undefined or unused terms
Redefining (e.g., breaking up) a predicate
Large scale modification techniques
Prioritizing diagnostics

Page 16: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Integration Issues

Architecture
  Use hosted services (like KSL)
  Integrate special code
  Take specifications from library
API
Interaction Mode - Batch versus Interactive/Repair
Translation issues
  One major use of diagnostics is also in testing translators
  Certain translations need to be done to do better analysis
Output integration

Page 17: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Evaluation

Record types and numbers of errors
  Comparing KBs produced by SMEs versus KEs
Record use of repair strategies
Evaluate during testing
Feedback from SMEs about diagnostics

Page 18: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Classification of Diagnostic Results

Errors
  Logical inconsistencies
  Content structure errors
  (See Randy Davis thesis)
Anomalies
  Missing information
    > Missing portions of descriptions
  Redundancies
  Extraneous structure or content
Summaries
  Architectural biases
Suggestions
  Stylistic suggestions
Static versus operational tests
Use of expertise about KR paradigms

Page 19: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Diagnostic Issues/Goals

Role of Diagnostics in Systems
  KE support, SME support
  Evaluators of KBs
How to Share Diagnostics
  Working Group?
  Logical specification, English descriptions, tests, …
Know the Main Contributors
Possible Diagnostics
  What do users want?
  What can tool builders provide?
Integration Issues
Developer Needs/Desires
Evaluation

Page 20: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

The Role of KB Diagnostics

KE support
SME support
Increase productivity (“lightly trained”)
Management of KB
Inference dependent quality improvement
Focus attention (e.g., redundant links)
Evaluation support
Abstract patterns – average fanout of specialization, statistics of number of uses of a predicate – big picture view
Version comparison
Regression testing

Page 21: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Diagnostic Sharing

Diagnostic specifications
  Logical specifications
  English specifications
  Test cases
Diagnostic classifications
  Taxonomy of errors – bottlenecks, quantification
Alignments across systems – inconsistencies among SMEs
Repair strategies
How informative a system is (core dump vs. useful explanation)
Learnings
Tricks of the trade
Sharing facilitators:
  Working group
  Mailing list

Page 22: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Sharing facilities

Working group
Mailing list
Posting of papers
Utilize Teknowledge

Page 23: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Biases

Binary vs. higher arity
First order vs. higher order
  Weakly vs. strongly higher order
Universal over existential
Disjunction vs. conjunction
Frame-ism
Horn clauses
Lisp style
Relations -> functions
Depth vs. breadth in hierarchy
…
Maybe report in summarizations
At least document biases

Page 24: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Organizations/People

Cycorp – many special purpose – Kahlert
ISI – Why Not? – Chalupsky
    – KANAL – Gil
    – EXPECT – Gil
Pragati – Clustering – Mehrotra
Stanford FRG/KSL – Partitioning – McCarthy, Amir, McIlraith
Stanford KSL – Chimaera – Fikes, McGuinness

Page 25: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Diagnostics

Errors – provable logical inconsistencies
Anomalies – redundancies, cycles, …
Summaries – word counts, …
Suggestions – naming conventions
Incompletenesses – explicit salient assertions or statistics
Stylistics – length of rule, …, bad factoring
Randy Davis – errors – incompleteness, inconsistency
Get this: top ten list of things people do wrong in Cyc – Goolsbey
Perspectives/units:
  Frame-like content vs. axioms vs. problem solving technology vs. learning to correct components

Page 26: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Style

Static Reasoner
Simulation / execution
Using examples
Summarization/improvements/critiquer

Page 27: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Integration Issues

Architecture
  Use hosted services (like KSL)
  Integrate special code
  Take specifications from library
API
Interaction Mode – Batch vs. Interactive/Repair
Translation issues
  One major use of diagnostics is also in testing translators
  Certain translations need to be done to do better analysis
  Background ontologies – MELD starter ontology
Output integration

Page 28: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Developer Needs/Desires

Missing existentials
Too high a type specification
Variable name mismatch
Semantic requests:
  Wrong semantic paradigm?
Typos
Spell check
Large scale modification tools and their integration
  E.g., removal/fixing top level
Prioritizing
Diagnostics to minimize cost, ease maintenance

Page 29: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Evaluation

Record types of errors
  Fine granularity
KB differences across SME vs. KE developed ontologies across team
Record use of repair strategies…
Evaluate during testing…
Feedback from SMEs on features, usefulness, etc.
Attempt to keep extremely complete audit trails for future analysis
Important to be careful with diagnostic reporting

Page 30: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Action Items

Working Group
Diagnostics repository
Web site
Follow up briefing
Mailing list

Page 31: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Chimaera
A Knowledge Evolution Environment

Tools for KB diagnosis and merging
Available as a Web service
  www-ksl-svc.stanford.edu
  www.ksl.stanford.edu/software/chimaera
  Usable from a Web browser
  Online user manual, tutorial, and demonstration movie
  Provides user selectable set of diagnostic tests
Performs KB diagnostics in batch mode
  Uploads and analyzes user’s KB
  Accepts KBs in MELD, KIF, OKBC, DAML, RDF, XML, …
  Provides results as HTML pages linked to frames and axioms
Analyzes both the structure and content of a KB
  Uses hybrid reasoners to analyze content
  Currently runs 28 diagnostic tests

Page 32: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Collection/Specification

Logical Specification of diagnostic
English Specification
Example KB that triggers diagnostic output
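
One way to package the three parts of a shared diagnostic is sketched below. The `DiagnosticSpec` record, its field names, the toy KB type, and the sample diagnostic are all hypothetical; the slides name only the three parts, not a format. Storing an example KB alongside an executable check lets the entry double as a regression test case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

KB = Dict[str, List[str]]  # hypothetical toy KB: class name -> direct superclasses

@dataclass
class DiagnosticSpec:
    name: str
    logical_spec: str                 # logical specification of the diagnostic
    english_spec: str                 # English specification
    check: Callable[[KB], List[str]]  # executable form of the diagnostic (an assumption)
    example_kb: KB                    # example KB that should trigger diagnostic output

    def self_test(self) -> bool:
        """True if the example KB produces at least one finding."""
        return bool(self.check(self.example_kb))

def undefined_superclasses(kb: KB) -> List[str]:
    """Superclasses that are named in the KB but not defined in it."""
    return sorted({s for supers in kb.values() for s in supers if s not in kb})

spec = DiagnosticSpec(
    name="undefined-superclass",
    logical_spec="exists c, s: superclass(c, s) and not defined(s)",
    english_spec="A class names a superclass that is not defined in the KB.",
    check=undefined_superclasses,
    example_kb={"Dog": ["Animal"]},
)
print(spec.name, "triggers on its example KB:", spec.self_test())
```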

Page 33: Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge)

Classification of Diagnostic Results II

Axiom Analysis
  Axiom Syntax Problems
    > E.g., no consequent to an implication
  Axiom Redundancy
    > E.g., 1. A => B 2. A => C 3. C => B means 1 is redundant
  Axiom Variable Usage
    > E.g., variable used in antecedent but not in consequent
  Axiom Consistency
    > E.g., A => not A
  Axiom Tautology
    > E.g., consequent repeats (portion of) antecedent
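
The axiom redundancy example above can be checked mechanically. This is a minimal sketch restricted to propositional implications between single atoms; the tuple representation and function names are assumptions, and a real KB would need a full reasoner. An implication is reported as redundant when its consequent can be derived from its antecedent by chaining the remaining implications.

```python
def derivable(start, goal, implications):
    """True if goal follows from start by chaining the given implications."""
    seen, frontier = {start}, [start]
    while frontier:
        p = frontier.pop()
        for antecedent, consequent in implications:
            if antecedent == p and consequent not in seen:
                seen.add(consequent)
                frontier.append(consequent)
    return goal in seen

def redundant_axioms(implications):
    """Return 1-based indices of implications derivable from all the others."""
    redundant = []
    for i, (antecedent, consequent) in enumerate(implications):
        rest = implications[:i] + implications[i + 1:]
        if derivable(antecedent, consequent, rest):
            redundant.append(i + 1)
    return redundant

# The slide's example: 1. A => B   2. A => C   3. C => B
axioms = [("A", "B"), ("A", "C"), ("C", "B")]
print(redundant_axioms(axioms))
```

Each axiom is tested against all of the others, so two identical axioms would both be reported; a repair step would remove only one. A tautology such as A => A is also reported, since its consequent is trivially derivable.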