From legal Language to computer language (2009)
From Legal Language to Computer Language
Radboud Winkels
Emile de Maat
Outline
Leibniz Center for Law
From sources of law to ICT applications
Structure
References
Content
Empirical results
Conclusions and current research
09/08/2010
Leibniz Center for Law
Computational Legal Theory and
Legal Knowledge Management
(Formal) Models of:
Legal Knowledge
Sources? Elementary legal concepts?
Constituents of norms, coherence, …
Valid Legal Reasoning
Case assessment, causality, legal comparison,
…
Leibniz Center for Law (2)
Applied Topics:
Improve quality of legal products
Legislation, decisions, advice, etc.
Improve access to legal information and knowledge
Support teaching and learning of legal knowledge and skills
Legal organisations and change management
“Legal Engineering”
Legislation can be seen as the specification of a normative system.
Legislation is underspecified.
It suffers from anomalies:
• inconsistencies
• circular reasoning
• open evaluative terms
• ambiguities
[Figure labels: doctrine; case law on legislation]
From Sources of Law to ICT Applications
[Figure: from sources of law to ICT applications. Labels: sources (case law, legislation); concepts (p1, p2, …; q1, q2, …); norms; meta-knowledge; formal models (FOLaw, LLD, Sartor, …, LRI-core; O(α | β)); tasks and reasoning; applications (e-Court, CLIME). Annotation: "GTerm: This means that and has relations with those"]
Sources of Law
Most important source of "knowledge"
Explicit links between sources and
knowledge models essential for:
Validation
Maintenance (traceability)
Justification
Link at right level of detail (granularity)
From Sources of Law to Formal Models
Automatic support:
Increases the quality of models and the efficiency of the process
Increases inter-coder reliability
[Figure: pipeline from NL text, via structured text with explicit and typed refs and a model of individual provisions, to an integrated model of meaning. Step labels: recognizing and classifying; model fragment suggestions]
Text Structure
Structure Marking
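Hoofdstuk 1 (Chapter 1)
    Paragraaf 1 (Section 1)
        Artikel 1 (Article 1)
            Lid 1 (Member 1)
            Lid 2 (Member 2)
        Artikel 2
        Artikel 3
    Paragraaf 2
        …
    …
Hoofdstuk 2
Levels: Hoofdstuk (chapter), Paragraaf (section), Artikel (article), Lid (member)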
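Structure marking of this kind can be sketched in a few lines of Python. This is only an illustration, not the parser used in the project: it assumes that every heading sits on its own line and starts with one of the four Dutch labels (Hoofdstuk, Paragraaf, Artikel, Lid) followed by a number. The names `LEVELS`, `HEADING` and `mark_structure` are hypothetical.

```python
import re

# Assumed nesting depth of the Dutch heading labels (illustrative only).
LEVELS = {"Hoofdstuk": 0, "Paragraaf": 1, "Artikel": 2, "Lid": 3}
HEADING = re.compile(r"^(Hoofdstuk|Paragraaf|Artikel|Lid)\s+(\d+)\b")


def mark_structure(lines):
    """Return (path, text) pairs; path is the chain of enclosing headings."""
    path = []    # current stack of (label, number) headings
    marked = []
    for line in lines:
        text = line.strip()
        match = HEADING.match(text)
        if match:
            label, number = match.group(1), match.group(2)
            depth = LEVELS[label]
            # Close every heading at the same or a deeper level, then open the new one.
            path = [h for h in path if LEVELS[h[0]] < depth]
            path.append((label, number))
        elif text:
            # Body text is attached to the innermost open heading.
            marked.append((" / ".join(f"{lab} {num}" for lab, num in path), text))
    return marked
```

Each body line then carries an explicit position such as `Hoofdstuk 1 / Paragraaf 1 / Artikel 1 / Lid 2`, which is the level of detail a reference like "article 1, second member" needs to point at.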
Relations between Sources of Law
Legislation
Adm. Case Law
Case Law
Doctrine
Characteristics of Sources of Law
Legislation
Precise grammar for reference, clear
identity and version criteria
(Adm.) Case Law
Precise grammar for reference, precise
identity, no versions
Doctrine
Sloppy reference, no identity markings,
sloppy versioning
The Structure of References: Simple References
Simple references
Name
Customs Law
Label and number
Article 1
Label, number and publication date
The law of April 13th, 2006
Indirect references
That article
The Structure of References: Complex References
Multi-valued references
Articles 1, 5 and 12
Multi-layered references
Customs Law, article 5, first member
Multi-valued, multi-layered references
Customs Law, articles 1, 5, first
member, and 12
The Structure of References: Ordering
Zooming in
Customs Law, article 5, first member
Zooming out
first member, article 5, Customs Law
Zooming in, then zooming out
article 5, first member, Customs Law
The Structure of References: Miscellaneous
Opening words
Article 12, opening words and parts 1
and 2
Exceptions
Articles 5-21, with the exception of
article 9
Each time
Articles 5-10, each time the first
member
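A range with an exception denotes a set of articles, so a reference such as "Articles 5-21, with the exception of article 9" can be expanded into the individual articles it covers. The sketch below assumes English wording exactly as in the example above; the function name `expand_articles` is hypothetical, and "each time" references are not handled.

```python
import re


def expand_articles(reference):
    """Expand 'Articles M-N[, with the exception of article(s) ...]' into numbers."""
    span = re.match(r"Articles (\d+)-(\d+)", reference)
    if not span:
        raise ValueError("not a range reference")
    start, end = int(span.group(1)), int(span.group(2))

    excluded = set()
    # Assumed wording of the exception clause; real texts will vary.
    exception = re.search(r"with the exception of articles? ([\d, and]+)", reference)
    if exception:
        excluded = {int(n) for n in re.findall(r"\d+", exception.group(1))}

    return [n for n in range(start, end + 1) if n not in excluded]
```

For the example above this yields articles 5 through 21 without article 9.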
Complete and incomplete references
Complete references
Does mention the document that is being
referred to
Customs Law, article 5, first member
Incomplete references
Does not mention the document that is
being referred to
Article 5, first member
Finding references
Use a context-free grammar, e.g.:
<article> ::=
    "article"
    <designation> [[","] <lower_level>]
    [ "-" <designation> [[","] <lower_level>] ]
    [
        ( [","] <designation> [[","] <lower_level>] )*
        "and" <designation> [[","] <lower_level>]
    ]
    [[","] ["or"] <higher_level>]
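Because the article rule has no recursion, it can be approximated with a single regular expression. The sketch below is an illustration under several assumptions, not the project's parser: the vocabulary for designations, lower levels and names is invented for the example, an optional comma is allowed before "and" (as in "articles 1, 5, first member, and 12"), and the optional "or" token is kept as it appears in the rule.

```python
import re

# All vocabulary below is an illustrative assumption, not the parser's lexicon.
DESIGNATION = r"\d+[a-z]?"                      # e.g. 5 or 5a
LOWER_LEVEL = r"(?:first|second|third) member"  # e.g. first member
HIGHER_LEVEL = r"(?:Customs Law|Mining Act)"    # names supplied as a list

# <designation> [[","] <lower_level>]
ITEM = rf"{DESIGNATION}(?:,? {LOWER_LEVEL})?"

ARTICLE = re.compile(
    rf"\barticles? {ITEM}"                  # "article" plus the first item
    rf"(?:-{ITEM})?"                        # optional range: 5-21
    rf"(?:(?:,? {ITEM})*,? and {ITEM})?"    # optional enumeration: 1, 5 and 12
    rf"(?:,?(?: or)? {HIGHER_LEVEL})?",     # optional higher level
    re.IGNORECASE,
)


def find_references(text):
    """Return every substring of text that matches the article rule."""
    return [match.group(0) for match in ARTICLE.finditer(text)]
```

A call such as `find_references("See article 5, first member, Customs Law.")` returns the whole layered reference as one match. A heading like "Article 5" matches as well, which is the false-positive problem listed under Problems.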
Problems
Names cannot be recognised
Add names as a list to the grammar
Headings will (falsely) be recognised
as a reference
Mark headings beforehand; use MetaLex as
input
Resolving references
Incomplete references
Reference needs to be completed from
context
Within a regulation, an incomplete
reference refers to the regulation itself
Within commentaries, incomplete
references refer back to an earlier
complete reference
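The two completion rules can be sketched as follows. The data layout is an assumption made for the example: each reference is a `(document, article)` pair in text order, with `document` set to `None` when the reference is incomplete. The function name `resolve` and the `kind` values are hypothetical.

```python
def resolve(references, kind, own_title=None):
    """Fill in the missing document of incomplete references.

    references: (document, article) pairs in text order; document is None
                for an incomplete reference.
    kind:       "regulation" or "commentary".
    own_title:  title of the regulation being parsed (regulations only).
    """
    resolved = []
    last_document = None  # most recent complete reference seen so far
    for document, article in references:
        if document is None:
            if kind == "regulation":
                # Within a regulation, the reference points to the regulation itself.
                document = own_title
            else:
                # Within a commentary, fall back on the last complete reference.
                document = last_document
        else:
            last_document = document
        resolved.append((document, article))
    return resolved
```

An incomplete reference in a commentary that precedes any complete reference stays unresolved (`None`) in this sketch.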
Automatic Parsing
1. Determine identity source
In doc: Title, citation title
In metadata
2. Parse document
“Natural language” – model sentences
3. Find references
4. Determine type reference
E.g. attribution and delegation of power;
definitions; enactment; change
5. Determine identity goal
I.e. the thing it refers to
Results simple parser
99% of all simple references correctly
identified
95% of all complex references
correctly identified
Few false positives
Works, with adaptation, for Flemish law
Opsomer (2009)
Causes of errors
Failing to detect a reference
Missing labels or names
Textual errors
False positives
Homonyms: a label has a second
meaning in addition to being part of a
reference
the first member
Conclusions
Automatic detection of references is
entirely feasible
No complicated methods are needed;
regular grammars may suffice
From Sources of Law to Formal Models
From structured text to models of individual sentences…
[Figure: pipeline from NL text, via structured text with explicit and typed refs and a model of individual provisions, to an integrated model of meaning. Step labels: recognizing and classifying; model fragment suggestions]
Towards Automatic modelling
Automatic modelling – Sentences (1)
Start with sentences
Independent unit.
Often marked, otherwise easy to
recognize
Different types of sentences require
different translation, different model
Conclusions From Earlier Research
Dutch Law:
Provisions usually match one sentence
Several types of sentences can be easily
distinguished
Limited amount of language constructs
per type
Automatic recognition and
classification seems doable
Types not specific for Dutch law
(cf. Tiscornia et al. for Italian law)
Categories
1. Definitions
2. Deeming Provision
3. Norm – Right/Permission
4. Norm – Obligation/Duty
5. Application Provision
6. Value Assignment
7. Change*
8. Delegation
9. Enactment Date
10. Citation Title
11. Penalization
Each category uses specific language
constructs that can be used to identify
them.
Example: Penalisation Provision
Penalisation provisions set punishments
for breaking the law, and mark such an
act as either a misdemeanour or a crime.
Mining Act, article 133
1. Breaking article 43, sub 2, is punished with a monetary fine of the second category.
2. The fact marked as punishable by this article is a misdemeanour.
Example: Norms (1)
Normative sentences form the core of
each regulation, stating obligations
and rights
Rights can be denoted by a wide
range of verbs: can, may, is allowed
to, has a right to, …
Similarly, obligations can be denoted
by the use of certain verbs: is
prohibited, is charged with
Many variations
Example: Norms (2)
However, obligations are often represented
as a “statement of fact”
Funeral Act, article 46, section 1
No bodies are interred in a closed cemetery.
May be about any subject
No common signal words or patterns
Preferred by the Guidelines for Legal
Drafting
Experiment (1)
Classifier
Based on 88 patterns
Implemented in Java
Based on input in which
sentences and quoted text have
already been marked (MetaLex)
Assumes a statement of fact
norm if no explicit pattern is used
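The classification idea, explicit patterns first and a statement-of-fact norm as the fallback, can be sketched as below. The four patterns are English stand-ins taken from the verbs mentioned on the example slides; the actual patterns are not listed in this deck, so everything in `PATTERNS` is an assumption, as are the names `classify` and `DEFAULT`.

```python
import re

# Illustrative English stand-ins; the real patterns are not given in the slides.
PATTERNS = [
    ("Definition", re.compile(r"\bis understood by\b")),
    ("Penalisation", re.compile(r"\bis punished with\b")),
    ("Norm - Right/Permission",
     re.compile(r"\b(?:may|can|is allowed to|has a right to)\b")),
    ("Norm - Obligation/Duty",
     re.compile(r"\b(?:is prohibited|is charged with)\b")),
]
DEFAULT = "Norm - Statement of Fact"


def classify(sentence):
    """Return the first category whose pattern occurs in the sentence."""
    for category, pattern in PATTERNS:
        if pattern.search(sentence):
            return category
    # No explicit pattern: assume an obligation phrased as a statement of fact.
    return DEFAULT
```

Because the patterns are tried in order, a sentence containing several signal phrases gets the category listed first. A signal phrase inside a conditional clause ("If x may …") is matched too, so a sketch like this would also misfire on auxiliary sentences.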
Experiment (2) - Lists
Lists are classified based on their header, if it contains a pattern; otherwise, each item is classified separately (without the header)
Tobacco Act, article 1
In this law, and in the stipulations based on it, is understood by:
a. tobacco products: … ;
b. Our Minister: …;
c. appendix: …;
…
Experiment – Test Set
18 texts
One royal decree
Three new bills
Fourteen amending bills
All "recent"
No overlap with the training set
654 sentences
592 "regular" sentences
62 lists
Results per Document (1)
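| Source | Sentences: total | Sentences: correct | Sentences: % | Lists: total | Lists: correct | Lists: partial | Lists: % | Type |
|---|---|---|---|---|---|---|---|---|
| Royal Decree Stb. 1945, F 214 | 26 | 23 | 88% | 4 | 4 | 0 | 100% | New |
| Bill 20 585 nr. 2 | 31 | 30 | 97% | 4 | 3 | 1 | 75% | New |
| Bill 22 139 nr. 2 | 22 | 20 | 91% | 2 | 2 | | 100% | New |
| Bill 27 570 nr. 4 | 21 | 16 | 76% | | | | | Change |
| Bill 27 611 nr. 2 | 11 | 11 | 100% | 1 | 1 | | 100% | Change |
| Bill 30 411 nr. 2 | 141 | 128 | 91% | 25 | 20 | 3 | 80% | New |
| Bill 30 435 nr. 2 | 40 | 39 | 98% | 4 | 3 | 1 | 75% | Change |
| Bill 30 583 nr. A | 27 | 27 | 100% | | | | | Change |
| Bill 31 531 nr. 2 | 3 | 3 | 100% | | | | | Change |

Slide annotation: relatively low score due to a misapplied pattern (3x)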
Results per Document (2)
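| Source | Sentences: total | Sentences: correct | Sentences: % | Lists: total | Lists: correct | Lists: partial | Lists: % | Type |
|---|---|---|---|---|---|---|---|---|
| Bill 31 537 nr. 2 | 29 | 29 | 100% | 2 | 2 | 0 | 100% | Change |
| Bill 31 540 nr. 2 | 7 | 7 | 100% | | | | | Change |
| Bill 31 541 nr. 2 | 8 | 8 | 100% | | | | | Change |
| Bill 31 713 nr. 2 | 7 | 6 | 86% | 2 | 2 | 0 | 100% | Change |
| Bill 31 722 nr. 2 | 31 | 22 | 71% | 6 | 5 | 0 | 83% | Change |
| Bill 31 726 nr. 2 | 78 | 67 | 86% | 2 | 1 | 1 | 50% | Change |
| Bill 31 832 nr. 2 | 7 | 7 | 100% | 3 | 3 | | 100% | Change |
| Bill 31 833 nr. 2 | 4 | 4 | 100% | | | | | Change |
| Bill 31 835 nr. 2 | 99 | 90 | 91% | 7 | 4 | 3 | 57% | Change |
| Total | 592 | 537 | 91% | 62 | 50 | 9 | 81% | |

Slide annotation: relatively low score due to a pattern appearing in an auxiliary sentence (5x)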
Overall Results
91% of all regular sentences have
been correctly classified
71%-100% over laws
81% of all lists have been correctly
classified
50%-100% over laws
Results per Type (1)
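| Type | Share of corpus | In corpus | Missed | False |
|---|---|---|---|---|
| Definition | 2% | 12 | 1 | 0 |
| Norm - Right/Permission | 11% | 64 | 4 | 13 |
| Norm - Duty | 5% | 29 | 0 | 1 |
| Delegation | 3% | 19 | 6 | 0 |
| Publication Provision | 1% | 4 | 0 | 0 |
| Application Provision | 7% | 40 | 1 | 8 |
| Enactment Date | 3% | 17 | 1 | 0 |
| Citation Title | 1% | 3 | 0 | 0 |
| Value Assignment/Change | 0% | 1 | 0 | 0 |
| Penalisation | 0% | 0 | 0 | 2 |
| Change | 41% | 241 | 16 | 8 |
| Mixed Type | 1% | 3 | 3 | 0 |
| Norm - Statement of Fact (default) | 27% | 159 | 23 | 23 |
| Total | | 592 | 55 | 55 |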
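The per-type counts can be turned into recall and precision figures. The sketch below rests on an interpretation that the slide does not state explicitly: "Missed" is read as sentences of a type that received another label, and "False" as sentences that wrongly received this label. The counts are copied from the table; the names `ROWS`, `recall` and `precision` are hypothetical.

```python
# type: (in corpus, missed, false), copied from the per-type results
ROWS = {
    "Definition": (12, 1, 0),
    "Norm - Right/Permission": (64, 4, 13),
    "Norm - Duty": (29, 0, 1),
    "Delegation": (19, 6, 0),
    "Publication Provision": (4, 0, 0),
    "Application Provision": (40, 1, 8),
    "Enactment Date": (17, 1, 0),
    "Citation Title": (3, 0, 0),
    "Value Assignment/Change": (1, 0, 0),
    "Penalisation": (0, 0, 2),
    "Change": (241, 16, 8),
    "Mixed Type": (3, 3, 0),
    "Norm - Statement of Fact (default)": (159, 23, 23),
}


def recall(in_corpus, missed):
    """Share of the sentences of a type that were labelled correctly."""
    return (in_corpus - missed) / in_corpus if in_corpus else None


def precision(in_corpus, missed, false):
    """Share of the sentences given a label that really have that type."""
    correct = in_corpus - missed
    return correct / (correct + false) if correct + false else None


total = sum(n for n, _, _ in ROWS.values())
missed_total = sum(m for _, m, _ in ROWS.values())
overall_accuracy = (total - missed_total) / total  # 537 / 592
```

Under this reading the 55 missed sentences out of 592 leave 537 correct ones, which agrees with the 91% reported for regular sentences.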
Results per Type (2)
Mostly norms and
modifications
right/permission 11%
obligation/duty 27% + 5%
change 41%
Several definitions and
application provisions
Barely any of the others
Results – Patterns Used
| Type | Patterns known | Patterns used |
|---|---|---|
| Definition | 14 | 5 |
| Norm - Right/Permission | 17 | 3 |
| Norm - Obligation/Duty | 15 | 8 |
| Delegation | 7 | 5 |
| Publication Provision | 1 | 1 |
| Application Provision | 5 | 5 |
| Enactment Date | 1 | 1 |
| Citation Title | 2 | 2 |
| Value Assignment | 8 | 1 |
| Penalisation | 3 | 1 |
| Change - Scope | 2 | 2 |
| Change - Insertion | 4 | 4 |
| Change - Replacement | 3 | 3 |
| Change - Repeal | 2 | 1 |
| Change - Renumbering | 3 | 2 |
| Total | 87 | 44 |

About 50% of the known patterns have been used
Difference in age between test and training set?
Underrepresented sentence types
Problems (1)
Patterns appearing in auxiliary
sentences instead of the main
sentence
Mostly happens with rights and
application provisions:
If x has the right to …
If x is able to …
If x applies …
Problems (2)
Lists need a more serious approach
Some can be classified by the header only;
Some can be classified by the list item only;
Some can only be classified by the header combined with the item.
Lists need to be converted to individual sentences (header plus list item)
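Converting a list into individual sentences amounts to joining the header with each item. The sketch below assumes that items carry a short label such as "a." and end in ";" or "."; the function name `list_to_sentences` and the sample list in the usage note are hypothetical.

```python
import re


def list_to_sentences(header, items):
    """Combine a list header with each item into one stand-alone sentence."""
    # Drop the colon that introduces the list.
    head = header.strip().rstrip(":").rstrip()
    sentences = []
    for item in items:
        # Strip an item label such as "a." or "1." and the trailing ";" or ".".
        body = re.sub(r"^[a-z0-9]+\.\s*", "", item.strip())
        body = body.rstrip(";.").strip()
        sentences.append(f"{head} {body}.")
    return sentences
```

For an invented list with the header "It is prohibited to:" and the items "a. sell tobacco products;" and "b. advertise tobacco products.", this gives two sentences that each carry the signal phrase from the header, so each can be classified on its own.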
Minor problems
Missing patterns
Mixed sentences
Difficult to solve, but does not occur often
Patterns used for other purposes
Repeal of fines instead of repeal of
regulations
Specific patterns for specific laws
E.g. Tax Law (value assignment)
Conclusions
This (symbolic) approach is feasible
Using obligation as a default category
seems acceptable
No major categories are missing
We expect it to generalise to other
Dutch regulations
The approach could be used for other
(civil) jurisdictions and languages
Biagioli et al. (2005) obtained similar results for
Italian law, but with a statistical approach
Next Step
[Figure: pipeline from NL text, via structured text with explicit and typed refs and a model of individual provisions, to an integrated model of meaning. Step labels: recognizing and classifying; model fragment suggestions]
Next Step
Divide sentence in different terms that
are linked through relations
Classification (and base pattern) gives
a rough division, and a rough relation
More detailed division of the
sentences is needed
Use of Dutch grammar parsers
Current Research (1)
Automatic modelling – Reference parser
References are important in legal texts
Useful when the computer understands
these better
Better understanding is possible
References do not fit well in “normal
Dutch sentence structure”
Separate reference parser
Things to think about – Granularity
Granularity – How far do we want to go
with the splitting of text?
Liquor: those drinks that, at a
temperature of twenty degrees
Celsius, consist of at least
fifteen percent alcohol by volume, with the
exception of wine.
Things to think about – Norms
Classification distinguishes only a
limited set of norms
Do we need more distinction?
For computer calculations?
For interaction with the user
Things to think about - Procedures
Procedures use the same language
constructs as other norms (at least in
Dutch), but:
Procedures have a more specific
context
Procedures have a stronger ordering
Overall Conclusions
Distance from Legal Language to Computer
Language is too big to cross in one step
Automatic modelling support is already
partially possible:
Structure and References
Classification of sentences in legislation
Generalisation to all Dutch legislation
possible
Same method for other languages and
jurisdictions
Generalisation to other sources of law more
difficult