http://www.sekt-project.com
AIFB
Iuriservice II Ontology Development
Núria Casellas, Denny Vrandečić, Joan Josep Vallbé, Aleks Jakulin, Mercedes Blázquez
Workshop on Artificial Intelligence and Law
XXII World Congress of Philosophy of Law and Social Philosophy
Granada, May 2005
May 25, 2005 2 http://www.sekt-project.com
Agenda
• Introduction to SEKT Project and Legal Case Study
• Methodology
• OPJK
• Improving knowledge discovery on the competency questions
• Architecture
The inSEKTs
BT
University of Sheffield
Vrije Universiteit Amsterdam
Sirma AI
Empolis
Universität Karlsruhe
Ontoprise
Universitat Autònoma de Barcelona
Universität Innsbruck
Jozef Stefan Institute
iSOCO
Kea-pro
SEKT
• Main goals of SEKT
• European Leadership in Semantic Technologies
• Core Research
• Combine Human Language Technologies, Knowledge Discovery and Ontology Technologies
• Provide intelligent knowledge access
Description of the Problem: Legal Domain
• In General:
  • Complaints about the diligence of the legal administration.
  • Judges are overworked.
• In Particular:
  • New judges
    • A lot of theoretical knowledge, but little practical knowledge
    • On duty
    • When they are confronted with situations in which they are not sure what to do
    • They "disturb" experienced judges with typical questions.
    • Usually his/her former tutor (Preparador)
• Existing Technology
  • Legal databases
    • Essential in their daily work
    • Based on keywords and Boolean operators
    • A search retrieves a huge number of hits
Description of the Problem: Legal Domain
• Solution:
  • Design an intelligent system to help new judges with their typical problems.
  • Extended FAQ system using Semantic Web technologies
  • Connect the FAQ system with the existing jurisprudence.
  • Search jurisprudence using Semantic Web technologies.
State of the Art in Legal Ontologies
• LLD [Language for Legal Discourse, L.T. McCarty, 1989]: Atomic formulas, Rules and Modalities.
• NOR [Norma, R.K. Stamper, 1991, 1996]: Agents, Behavioral invariants, Realizations.
• LFU [Functional Ontology for Law, A. Valente, 1995]: Normative knowledge, World knowledge, Responsibility knowledge, Reactive knowledge and Creative knowledge.
• FBO [Frame-Based Ontology of Law, R.W. van Kralingen; P.R.S. Visser, 1995]: Norms, Acts and Concept descriptions.
• LRI-Core Legal Ontology [J. Breuker et al., 2002]: Objects, Processes, Physical entities, Mental entities, Agents, Communicative Acts.
• IKF-IF-LEX Ontology for Norm Comparison [A. Gangemi et al., 2001]: Agents, Institutive norms, Instrumental provisions, Regulative norms, Open-textured legal notions, Norm dynamics.
Conceptual distinctions
• Professional Knowledge (PK)
• Legal Knowledge (LK)
Legal Core Ontologies (LCO) [based on General Theories of Law]
• Legal Professional Knowledge (LPK) OPLK
• Judicial Professional Knowledge (JPK) OPJK
Ethnographic survey
[Figure: survey map with numeric labels per region: 14, 7, 5, 298, 16, 8, 8, 10, 12, 10, 6, 116]
Total Autonomous Communities: 14 (out of 17)
Preliminary exploitation of data
Statistical analysis of results
• Judicial units: heterogeneity
• Judge's profile
Protocols of analysis
• Literal transcripts
• Completed questionnaires
• List of extracted questions
OPJK Modeling
• Identification of possible concepts through ALCESTE's results and TextToOnto conceptual distribution
• Domain detection
• Competency questions discussion and concept extraction
Intuitive ontological subdomains
• JUDGE
• ON-DUTY
• FAMILY ISSUES
• IMMIGRATION
• REAL ESTATE
• DECISION-MAKING & JUDGMENTS
• PROCEEDINGS
• JUDICIAL CLERKS
• COMMERCIAL LAW
• CONTRACT LAW
• CRIMINAL LAW
• GENDER VIOLENCE
• ORDER OF PROTECTION / INJUNCTION
Term extraction using TextToOnto
Term extraction using TextToOnto and Spanish Gate
1. Identify important concepts that should be represented
2. Hierarchy construction
3. Identify relations between them
4. Redefine the ontology by repeating steps 1-4
Competency question discussion
Selecting (underlining) all the nouns (usually concepts) and adjectives (usually properties) contained in the competency questions.
• How should complaints (denuncias) be handled when they are manifestly implausible or concern facts that evidently do not constitute a criminal offence?
• And what if it is a criminal complaint (querella) that meets all the other procedural requirements, but the facts it concerns lack criminal relevance or are manifestly false?
• What happens if a person appears at the court wishing to report facts that are hardly credible and unrelated to one another, and the judge doubts the complainant's mental capacity?
• Before whom must the appeal for reconsideration (recurso de reforma) against imprisonment be filed: the on-duty judge, or the judge who issued the corresponding imprisonment order?
OPJK classes identified
OPJK and Proton Integration
Improving knowledge discovery on the competency questions
Data and Method
Data: 3 text corpora (judges' questions):
• Corpus 1: Scholar "on duty" questions (Spanish Judicial School = 99)
• Corpus 2: Practical "on duty" questions (= 163) (field work)
• Corpus 3: All practical questions (= 756) (field work)
Method:
• TEXT GARDEN (J. Stefan Institute, Ljubljana)
• ALCESTE: Analysis of the co-occurring lexemes within the simple statements of a text [Reinert 2002, 2003]
Analysis of Text
The text needs to be represented in an appropriate way for statistical analysis:
1. Breaking text into "units" (lines, sentences, …)
2. Morphological categorization (adjectives, prepositions, …)
3. Putting words into canonical form:
   a) Lemmatization (is, was, are → be)
   b) Stemming (loved, loving → lov+)
4. Analysis:
   a) Clustering
   b) Latent semantic indexing
   c) Correspondence analysis
   d) Classification
   e) Visualization
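The preparation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lemma table `LEMMAS` is a tiny hand-made stand-in for a real lemmatizer, the function names are hypothetical, step 2 (morphological categorization) is skipped, and the output is a simple bag of words per unit that an analysis step (clustering, LSI, and so on) could consume.

```python
import re
from collections import Counter

# Hypothetical, hand-made lemma table; a real system would use a lemmatizer.
LEMMAS = {"is": "be", "was": "be", "are": "be", "judges": "judge"}

def split_units(text):
    """Step 1: break the text into sentence-like units."""
    return [u.strip() for u in re.split(r"[.!?]+", text) if u.strip()]

def canonical_tokens(unit):
    """Step 3a: lowercase, tokenize, and map each word to its lemma."""
    words = re.findall(r"[a-záéíóúüñ]+", unit.lower())
    return [LEMMAS.get(w, w) for w in words]

def unit_term_counts(text):
    """Build one bag-of-words Counter per unit, as input for step 4."""
    return [Counter(canonical_tokens(u)) for u in split_units(text)]

counts = unit_term_counts("The judge is on duty. The judges are overworked!")
for bag in counts:
    print(dict(bag))
```

Both units end up sharing the forms "judge" and "be", which is what lets a later statistical step see them as related.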
ALCESTE (Reinert, 1988)
• Corpus
• Segmented in chunks
• Classes of related chunks (hierarchical descending clustering)
• List of typical words related to each class
• Geometric representation (correspondence analysis)
Folch & Habert (2000)
Example of Correspondence Analysis and Visualization
ALCESTE:
+-----|---------|---------|---------+---------|---------|---------|-----+
20 | solo | |
19 | | parte+ |
18 | | monitorio demand+ |
17 | | archiv+ accion+ |
16 | present+ | falta+ vehiculo+ fase+ |
15 | | seguir procurador+ |
14 | | recurso+ pago+ quiebra+ |
13 | ofici+ | gasto+ ejecut+ ejecucion+ |
12 | sido dia+ finca+ embarg+ verbal+ |
11 | interes+ trafico acto+ notificacion entrega+ |
10 | momento celebr+ hall+ cuantia+ resolver |
9 | valor+ | auto+ admit+ qued+ juicio+ deposit+ |
8 | lesion+ venir dinero notific+ pericial+ |
7 | | si vista+ aport+ inform+ |
6 | madre acord+ viviend+ | cabo solicit+ |
5 | victima+ marido empresa+ | llev+ ya prueba+ abogado+ |
4 | tratos proteccion | |
3 | senor+ alejamiento | responsabili |
2 | tema+ mujer+ malo+ violencia | |
1 | denunci+ medida+ visitas | |
0 +-- separacion+ orden+ ---------------+----- venir fiscal+ ------------------+
1 | pide presun+ | |
2 | | |
3 | | |
4 | | |
5 | | |
6 | | |
7 | dict+ | |
8 | | |
9 | | |
10 | | |
11 | | |
12 | | |
13 | | |
14 | | un |
15 | | |
16 | | levantamient |
17 | | tener deten+ libertad forense |
18 | | person+ hacer causa+ asunto+ |
19 | servicio+ judicial+ actuacion+ |
20 | guardia+ juez llam+ policia detenido+ |
21 | | partido+ |
+-----|---------|---------|---------+---------|---------|---------|-----+
TEXT GARDEN: [figure]
Example of Clustering
Class 1: Judicial unit
funcionar+ (21), juzgar (26), oficina (11), trabaj+ (13), decir (26), llam+ (16), mand+ (12), acudir (11), adjunto (4), busc+ (4), consult+ (4), dato (6), hablar (4), jurisprudencia (3), local+ (3), material (6), necesit+ (7), policia (14), prensa (4), sala (4), funerari+ (2), hurto (3), informacion (5), miedo (3), robo (3), servicio+ (7), sustitu+ (4), tecnico (2), venir (15)
Class 2: Family law
alejamiento (22), malo (22), medida (16), orden+ (23), proteccion (17), senor+ (13), trat+ (22), victima (11), mujer (11), padre (7), denunci+ (12), domestico (8), violencia (8), agresor (4), dict+ (10), madre (7), marido (6), nino (5), pension (4), psicolog+ (5), separacion (5), abus+ (5), alimento (3), ayud+ (4), casa (3), cautelar+ (3), divorcio (2), empresa (3), hijo (4), lesion+ (6)
Class 3: Proceedings
escrit+ (9), fiscal+ (13), instruccion (9), ordinario (5), seguir (11), acumular (5), audiencia-provincia (2), conform+ (2), contradictori+ (3), criterio+ (10), cuantia (5), falt+ (7), injusto (3), interpretacion (3), ley (6), motiv+ (3), pendiente (2), perito (5)
Class 4: Enforcement (judgment)
ejecucion (14), ejecut+ (15), embarg+ (11), finca+ (9), depositar+ (6), interes+ (6), pago (6), suspension (5), deposito (6), entreg+ (6), quiebra (5), sentencia (9), solicit+ (9), vehiculo (4), acreedor (3), administracion (4), cantidad (4), conden+ (4), cost+ (4), dinero (4), edicto (2), imposibilidad (3), multa (3), notificacion (4), pagar+ (4)
Stemming vs Lemmatization
Stemming: the longest string of characters that is common to different words:
For all the variants of 'love', but also for 'lover' (noun) and 'lovely' (adjective), it can offer the stem: lov+
Lemmatization respects the category:
3 different lemmas: love (verb), lover (noun), lovely (adjective)
If we apply this process to Spanish or Catalan (or any Romance language), which are highly inflected (60 forms per verb, without taking compound forms into account), stemming would hide a lot of information.
EXAMPLES
Stem            Lemma
acumulacion     acumulación
acumularse      acumular
acumul+         ---
admision        admisión
admit+          admitir
celebracion     celebración
celebr+         celebrar
misma+          mismo
mismo+          ---
suspenderse     suspender
suspend+        ---
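The contrast can be made concrete with a small Python sketch. It follows the slide's simplified definition of a stem (the longest string shared by the word forms), not the rule-based algorithms that real stemmers use, and `LEMMA_TABLE` is a hypothetical hand-written lookup standing in for a real lemmatizer.

```python
from os.path import commonprefix

def stem_of(words):
    """Longest shared leading string, marked with '+' as on the slide."""
    return commonprefix(list(words)) + "+"

# Hypothetical lemma table (form -> (lemma, category)), for illustration only.
LEMMA_TABLE = {
    "loved": ("love", "verb"),
    "loving": ("love", "verb"),
    "lover": ("lover", "noun"),
    "lovely": ("lovely", "adjective"),
}

words = ["loved", "loving", "lover", "lovely"]
stem = stem_of(words)                               # one stem for all four forms
lemmas = sorted({LEMMA_TABLE[w] for w in words})    # category-aware lemmas

print(stem)
print(lemmas)
```

Stemming collapses all four forms into a single item, while lemmatization keeps three items apart by category. That separation is the information the slide warns would be lost for a highly inflected language.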
Quantitative Comparison
                          Stemmed Corpus    Lemmatized Corpus
Num. different forms      3074              2064
Num. occurrences          19861             19946
Max. freq. of a form      1230              2208
Hapax                     1666              934
• Lemmatized corpus has fewer word-forms than the stemmed version.
• The LSI on the lemmatized corpus is able to reconstruct documents better, especially in few dimensions.
• The lemmatized corpus clustering is more detailed.
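For readers unfamiliar with the four measures in the table, the following Python sketch shows how they could be computed from a tokenized corpus. The function name `corpus_stats` and the six-token sample are invented for illustration; the figures in the table come from the authors' corpora, not from this code.

```python
from collections import Counter

def corpus_stats(tokens):
    """Compute the four table measures for a list of word forms."""
    freq = Counter(tokens)
    return {
        "different_forms": len(freq),                       # distinct forms
        "occurrences": sum(freq.values()),                  # total tokens
        "max_freq": max(freq.values(), default=0),          # most frequent form
        "hapax": sum(1 for n in freq.values() if n == 1),   # forms seen once
    }

# Invented toy sample, already lemmatized.
tokens = ["juez", "guardia", "juez", "auto", "juez", "prision"]
stats = corpus_stats(tokens)
print(stats)
```

A hapax is a form that occurs exactly once, so a lower hapax count means fewer forms that the statistical analysis cannot relate to anything else.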
Comparison of Clustering Results
1. Clustering with stemmed corpus offers us 4 classes:
   1. 'On-duty' actions (mixed with Judicial Office) (54.06%)
   2. Proceedings and Trial (18.10%)
   3. Enforcement (judgements) (14.39%)
   4. Family Law (gender violence, divorce, separation…) (13.46%)
2. Clustering with lemmatized corpus is more detailed and offers 6 classes:
   1. Judicial Office (20.11%)
   2. 'On-duty' actions (27.25%)
   3. Family Law (gender violence, divorce, separation…) (14.55%)
   4. Proceedings (15.61%)
   5. Trial (8.47%)
   6. Enforcement (judgements) (14.02%)
Take-Home Messages
• Do text analysis of legal documents!
• If you do that, do lemmatization!
Methodology
Initial Methodology
+ Based on 800 competency questions
+ Questions were clustered
+ Middle-out strategy
– Usage of ontology not considered
– Repetitive discussions
– Long discussions
Considering the "Why"
• No normative knowledge
• Stick to the questions as sources
• Model the questions, not the answers
Wiki visualization
DILIGENT Argumentation Ontology
• Argumentation ontology defined
• Based on Case Studies to identify the most effective types of arguments
• Argument type recognition based on RST
Methodology changes
Using DILIGENT made the ontology engineering…
• … much faster
• … amenable to distributed development
• … better documented
• … trackable
• … more manageable
DILIGENT itself was also changed!
Outlook
• Better tool support – off-the-shelf wiki had weaknesses
• Moderator support in discussions
• Competency question clustering
• Gathering further experience from legal and other case studies
Architecture
High Level Requirements
• Judges should not be bothered with a complex user interface.
  • A simple natural language interface is probably appropriate.
• The decision as to whether a new question is similar to a stored question (with its corresponding answer) should be based on semantics rather than on simple word matching.
  • An ontology can be used to perform this semantic matching of questions.
• The questions included in the system should be of high quality.
  • Be rather exhaustive and reflect the actual situation
  • An extensive survey with more than 250 Spanish judges forms the basis for the questions.
• Justify the answer provided by the system with existing jurisprudence.
  • Jurisprudence databases.
  • Metadata and ontology processing of documents.
• Knowledge Management at all levels
Example Question-Answer
• Question:
  • What problems can we foresee with the analysis of small amounts of drugs, where the identification test destroys the drugs?
• Answer:
  • This is an unrepeatable piece of evidence at the trial. In these cases, the Spanish Criminal Procedure Act states that the adversarial principle should be respected. While the trial proceedings are prepared, the judge must explain to all parties that they may choose an expert to perform these tests.
Example of judgment: parts
• Court and docket number
• Names of the magistrates
• Date and place
• Prefatory statement
• History of the Case
• Grounds of Decision
Relations between the Question/Answer & Judgment
• FAQ: Question, Answer
• Judgement: Summary, Case History, Decision Grounds, Ruling
• OPJK: Practical Knowledge, Instances
Architecture
[Figure: architecture diagram with the following components]
• Web browser
• Natural Language
• Questions-Answers
• Expert Knowledge
• Semantic Matching
• DB 1 Decisions … DB N Decisions
• Ontology Learning & feeding
• Ontology Merging
• Jurisprudence
• Ontology Alignment
Expert Knowledge Retrieval
Design - Technological considerations
iFAQ System
• Multistage Searching Subsystem: Ontology Domain Detection, Keyword Matching, Ontology Graph Path Matching
• Ontology Technology
• Natural Language Processing
• Caching subsystem
• Persistence subsystem
• Axes: Efficiency, Accuracy
Expert Knowledge Retrieval
• Chain of Responsibility pattern
iFAQ Search Engine
• User Question
• FAQ Search Factory
• Plugged Searching Stages: Ontology Domain Detection, Keyword/synonym matching stage, Ontology graph path matching, Other search engines ...
• FAQ Candidates: FAQ, FAQ, FAQ
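A minimal Python sketch of how pluggable search stages could be chained in the Chain of Responsibility style. Everything here is an assumption made for illustration: the class names, the keyword sets, the FAQ records and the early-stop rule are invented, and the actual iFAQ stages (in particular ontology graph path matching) are not reproduced.

```python
class SearchStage:
    """Base handler: narrows the candidates, then delegates to the next stage."""

    def __init__(self, successor=None):
        self.successor = successor

    def narrow(self, question, candidates):
        raise NotImplementedError

    def handle(self, question, candidates):
        remaining = self.narrow(question, candidates)
        # Stop early when at most one candidate is left or the chain ends.
        if len(remaining) <= 1 or self.successor is None:
            return remaining
        return self.successor.handle(question, remaining)


class DomainDetection(SearchStage):
    # Hypothetical domain vocabularies.
    DOMAIN_WORDS = {"family": {"divorce", "custody"}, "criminal": {"drugs", "arrest"}}

    def narrow(self, question, candidates):
        words = set(question.lower().split())
        domains = {d for d, kws in self.DOMAIN_WORDS.items() if words & kws}
        # Fall back to all candidates when no domain is recognized.
        return [c for c in candidates if c["domain"] in domains] or candidates


class KeywordMatching(SearchStage):
    def narrow(self, question, candidates):
        if not candidates:
            return []
        words = set(question.lower().split())
        scored = [(len(words & set(c["text"].split())), c) for c in candidates]
        best = max(score for score, _ in scored)
        return [c for score, c in scored if score == best]


# Invented stored questions.
FAQS = [
    {"domain": "family", "text": "who decides custody after divorce"},
    {"domain": "criminal", "text": "how to analyse small amounts of drugs"},
    {"domain": "criminal", "text": "when to order arrest of a suspect"},
]

chain = DomainDetection(KeywordMatching())
result = chain.handle("analysis of small amounts of drugs", FAQS)
print(result)
```

Cheap stages run first and prune the candidate set, so a more expensive stage only sees the questions that survive. Adding another search engine means writing one more `SearchStage` subclass and plugging it into the chain.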
Expert Knowledge Retrieval
Semantic Similarity: Main steps
• NL query
• NLP: POS list (lemmas)
• Ontology Linking (Ontology)
• Semantic Distance Calculation: semantic distance between queries
• Term Coverage Calculation between queries
• Best match of stored queries
Expert Knowledge Retrieval
Semantic Similarity
• The semantic distance is based on the weighted navigation distance between terms in the ontology.
• Navigation through the ontology means that one moves from one concept to another concept, via one of its relations or attributes.
  • Is a
  • Follows
  • Actor
  • Etc.
• The task of associating distance costs:
  • Is domain specific
  • Needs to be performed by a legal expert.
[Figure: ontology fragment with nodes Actions, Accuse, Denounce, Follow, Mother, Son]
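One way to read "weighted navigation distance" is as the cheapest path between two concepts in a graph whose edges are ontology relations with per-relation costs. The Python sketch below uses Dijkstra's algorithm for that. It is a guess at the idea, not the authors' implementation: the node names echo the slide's figure, but the edges, the cost values and the choice to traverse relations in both directions are all assumptions.

```python
import heapq

# Hypothetical ontology fragment: (concept, relation, concept).
EDGES = [
    ("Accuse", "is_a", "Actions"),
    ("Denounce", "is_a", "Actions"),
    ("Denounce", "follows", "Accuse"),
    ("Mother", "actor", "Denounce"),
    ("Son", "actor", "Accuse"),
]
# Invented costs; the slide says a legal expert would have to assign these.
COSTS = {"is_a": 1.0, "follows": 0.5, "actor": 2.0}


def build_graph(edges, costs):
    """Adjacency list; relations are treated as traversable both ways."""
    graph = {}
    for a, rel, b in edges:
        w = costs[rel]
        graph.setdefault(a, []).append((b, w))
        graph.setdefault(b, []).append((a, w))
    return graph


def semantic_distance(graph, start, goal):
    """Cheapest weighted path from start to goal (Dijkstra's algorithm)."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, w in graph.get(node, []):
            nd = dist + w
            if nd < best.get(nxt, float("inf")):
                best[nxt] = nd
                heapq.heappush(queue, (nd, nxt))
    return float("inf")  # concepts are not connected


GRAPH = build_graph(EDGES, COSTS)
print(semantic_distance(GRAPH, "Accuse", "Denounce"))
print(semantic_distance(GRAPH, "Mother", "Son"))
```

With these invented costs, two actions linked by a cheap "follows" relation come out closer than two actors who are only connected through the actions they take part in. Changing the cost table changes which stored question counts as nearest, which is why the slide leaves that task to a domain expert.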
Conclusions
• Decision support system for inexperienced judges
• Using Semantic Web technology for handling knowledge
  • Provide knowledge for decision making process
  • Capture knowledge from experts
  • Share knowledge among all users
• Extended understanding capacities
  • Background knowledge: Professional Legal Ontology
  • Decision Explanation
  • Improved Knowledge Acquisition