From legal Language to computer language (2009)

Post on 22-Jun-2015

Overview of our research to (semi-)automatically get from sources of law in natural language to formal computer models of these sources.

From Legal Language to Computer Language

Radboud Winkels

Emile de Maat

Outline

Leibniz Center for Law

From sources of law to ICT applications

Structure

References

Content

Empirical results

Conclusions and current research

09/08/2010

Leibniz Center for Law

Computational Legal Theory and Legal Knowledge Management

(Formal) Models of:

Legal Knowledge

Sources? Elementary legal concepts?

Constituents of norms, coherence, …

Valid Legal Reasoning

Case assessment, causality, legal comparison, …

Leibniz Center for Law (2)

Applied Topics:

Improve quality of legal products

Legislation, decisions, advice, etc.

Improve access to legal information and knowledge

Support teaching and learning of legal knowledge and skills

Legal organisations and change management

Norms and Language

“Legal Engineering”

Legislation can be seen as the specification of a normative system.

Legislation is underspecified.

It suffers from anomalies:

• inconsistencies

• circular reasoning

• open evaluative terms

• ambiguities

Doctrine

Case law on legislation

From Sources of Law to ICT Applications

[Diagram. Labels: Sources (case law, legislation); concepts (p1, p2, …; q1, q2, …); norms, e.g. O(α | β); meta-knowledge; GTerm ("This means that and has relations with those"); Formal Models (FOLaw, LLD, Sartor, LRI-core, …); Tasks and reasoning; Applications (e-Court, CLIME)]

Sources of Law

Most important source of "knowledge"

Explicit links between sources and knowledge models are essential for:

Validation

Maintenance (traceability)

Justification

Link at right level of detail (granularity)

From Sources of Law to Formal Models

Automatic support:

Increase the quality of the models and the efficiency of the process

Increase inter-coder reliability

[Diagram: NL text → structured text with explicit and typed refs → model of individual provisions → integrated model of meaning. Supporting steps: recognizing and classifying; model fragment suggestions]

Structure Marking

Hoofdstuk 1 (chapter)
    Paragraaf 1 (section)
        Artikel 1 (article)
            Lid 1 (member)
            Lid 2
        Artikel 2
        Artikel 3
    Paragraaf 2
Hoofdstuk 2

Nesting levels: Hoofdstuk, Paragraaf, Artikel, Lid
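The nesting on this slide can be made explicit by giving every heading its full path. The following is only a sketch: it assumes the headings have already been extracted as plain strings in reading order and that the four Dutch labels form a fixed hierarchy. The names `LEVELS` and `mark_structure` are hypothetical and not taken from the original tooling.

```python
# Assumed fixed hierarchy, from outermost to innermost level.
LEVELS = ["Hoofdstuk", "Paragraaf", "Artikel", "Lid"]

def mark_structure(headings):
    """Return, for each heading, its path from the top of the document."""
    stack = []   # (depth, heading) pairs for the currently open elements
    paths = []
    for heading in headings:
        depth = LEVELS.index(heading.split()[0])
        # Close every open element at the same or a deeper level.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        stack.append((depth, heading))
        paths.append(" > ".join(h for _, h in stack))
    return paths

headings = ["Hoofdstuk 1", "Paragraaf 1", "Artikel 1", "Lid 1", "Lid 2",
            "Artikel 2", "Artikel 3", "Paragraaf 2", "Hoofdstuk 2"]
for path in mark_structure(headings):
    print(path)
```

Using a stack rather than a fixed-length path keeps the sketch usable when a level is skipped, for example an article placed directly under a chapter.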

Relations between Sources of Law

Legislation

Adm. Case Law

Case Law

Doctrine

Characteristics of Sources of Law

Legislation

Precise grammar for reference, clear identity and version criteria

(Adm.) Case Law

Precise grammar for reference, precise identity, no versions

Doctrine

Sloppy reference, no identity markings, sloppy versioning

The Structure of References: Simple References

Simple references

Name

Customs Law

Label and number

Article 1

Label, number and publication date

The law of April 13th, 2006

Indirect references

That article

The Structure of References: Complex References

Multi-valued references

Articles 1, 5 and 12

Multi-layered references

Customs Law, article 5, first member

Multi-valued, multi-layered references

Customs Law, articles 1, 5, first member, and 12

The Structure of References: Ordering

Zooming in

Customs Law, article 5, first member

Zooming out

first member, article 5, Customs Law

Zooming in, then zooming out

article 5, first member, Customs Law

The Structure of References: Miscellaneous

Opening words

Article 12, opening words and parts 1 and 2

Exceptions

Articles 5-21, with the exception of article 9

Each time

Articles 5-10, each time the first member
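To illustrate what a range with an exception stands for, the sketch below expands such a reference into the individual article numbers. It assumes plain integer article numbers and the English wording used on the slide. The function name `expand_range` is hypothetical.

```python
import re

# Matches "Articles A-B", optionally followed by one excepted article.
RANGE = re.compile(
    r"articles (\d+)\s?-\s?(\d+)(?:, with the exception of article (\d+))?",
    re.IGNORECASE,
)

def expand_range(reference):
    """Expand a range reference into the list of article numbers it covers."""
    match = RANGE.fullmatch(reference.strip())
    if match is None:
        raise ValueError(f"not a range reference: {reference!r}")
    first, last = int(match.group(1)), int(match.group(2))
    excluded = {int(match.group(3))} if match.group(3) else set()
    return [n for n in range(first, last + 1) if n not in excluded]

print(expand_range("Articles 5-21, with the exception of article 9"))
```

Article designations such as "5a" would need a richer ordering than integers, which this sketch does not attempt.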

Complete and incomplete references

Complete references

Mention the document that is being referred to

Customs Law, article 5, first member

Incomplete references

Do not mention the document that is being referred to

Article 5, first member

Finding references

Use a context-free grammar, e.g.:

<article> ::=
"article" <designation> [[","] <lower_level>]
[ "-" <designation> [[","] <lower_level>] ]
[
( [","] <designation> [[","] <lower_level>] )*
"and" <designation> [[","] <lower_level>]
]
[[","] ["or"] <higher_level>]
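A rule of this shape can be approximated with a regular expression. The sketch below is an English-only simplification and not the authors' parser: the set of ordinal words, the plural "articles", the optional comma before "and", and the names `ARTICLE_REF` and `find_references` are assumptions, and the `<higher_level>` part is left out.

```python
import re

# Simplified stand-ins for the non-terminals in the rule above.
DESIGNATION = r"\d+[a-z]?"
LOWER_LEVEL = r"(?:first|second|third|fourth|fifth) member"
ITEM = rf"{DESIGNATION}(?:,? {LOWER_LEVEL})?"

ARTICLE_REF = re.compile(
    rf"\barticles? {ITEM}"                 # "article" <designation> [lower level]
    rf"(?:\s?-\s?{ITEM})?"                 # optional range: "-" <designation> ...
    rf"(?:(?:, {ITEM})*,? and {ITEM})?",   # optional enumeration ending in "and"
    re.IGNORECASE,
)

def find_references(text):
    """Return every substring of text that matches the simplified rule."""
    return [match.group(0) for match in ARTICLE_REF.finditer(text)]

print(find_references("See articles 1, 5, first member, and 12."))
```

Because this fragment has no recursion, a regular expression is enough here; a full grammar would still be needed for names and higher-level parts.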

Problems

Names cannot be recognised

Add names as a list to the grammar

Headings will (falsely) be recognised as a reference

Mark headings beforehand; use MetaLex as input

Resolving references

Incomplete references

The reference needs to be completed from context

Within a regulation, an incomplete reference refers to the regulation itself

Within commentaries, incomplete references refer back to an earlier complete reference
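The two completion rules can be sketched as follows. The sketch assumes each detected reference is a pair of a document name (or `None` when the reference is incomplete) and a locator string. The function name `resolve_references` and the parameter `containing_regulation` are hypothetical.

```python
def resolve_references(references, containing_regulation=None):
    """Fill in the missing document of incomplete references.

    references: (document or None, locator) pairs in reading order.
    containing_regulation: name of the regulation the text belongs to,
    or None when the text is a commentary.
    """
    resolved = []
    last_complete = None
    for document, locator in references:
        if document is not None:
            last_complete = document          # remember the complete reference
        elif containing_regulation is not None:
            document = containing_regulation  # rule for regulations
        else:
            document = last_complete          # rule for commentaries
        resolved.append((document, locator))
    return resolved

print(resolve_references([(None, "article 5, first member")],
                         containing_regulation="Customs Law"))
```

An incomplete reference in a commentary that has no earlier complete reference stays unresolved (`None`) in this sketch instead of being guessed.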

Automatic Parsing

1. Determine identity source

In doc: Title, citation title

In metadata

2. Parse document

“Natural language” – model sentences

3. Find references

4. Determine type reference

E.g. attribution and delegation of power;

definitions; enactment; change

5. Determine identity goal

I.e. the thing it refers to

Results simple parser

99% of all simple references correctly identified

95% of all complex references correctly identified

Few false positives

Works adapted for Flemish law

Opsomer (2009)

Causes of errors

Failing to detect a reference

Missing labels or names

Textual errors

False positives

Homonyms: a label has a second meaning in addition to being part of a reference

the first member

Conclusions

Automatic detection of references is entirely feasible

No complicated methods are needed; regular grammars may suffice

From Sources of Law to Formal Models

From structured text to models of individual sentences…

[Diagram: NL text → structured text with explicit and typed refs → model of individual provisions → integrated model of meaning. Supporting steps: recognizing and classifying; model fragment suggestions]

Towards Automatic modelling

Automatic modelling – Sentences (1)

Start with sentences

Independent unit.

Often marked, otherwise easy to recognize

Different types of sentences require a different translation and a different model

Conclusions From Earlier Research

Dutch Law:

Provisions usually match one sentence

Several types of sentences can be easily distinguished

Limited number of language constructs per type

Automatic recognition and classification seems doable

Types are not specific to Dutch law (cf. Tiscornia et al. for Italian law)

Categories

1. Definitions

2. Deeming Provision

3. Norm – Right/Permission

4. Norm – Obligation/Duty

5. Application Provision

6. Value Assignment

7. Change*

8. Delegation

9. Enactment Date

10. Citation Title

11. Penalization

Each category uses specific language constructs that can be used to identify it.

Example: Penalisation Provision

Penalisation provisions set punishments for breaking the law, and mark such an act as either a misdemeanour or a crime.

Mining Act, article 133

1. Breaking article 43, sub 2, is punished with a monetary fine of the second category.

2. The fact marked as punishable by this article is a misdemeanour.

Example: Norms (1)

Normative sentences form the core of each regulation, stating obligations and rights

Rights can be denoted by a wide range of verbs: can, may, is allowed to, has a right to, …

Similarly, obligations can be denoted by the use of certain verbs: is prohibited, is charged with

Many variations

Example: Norms (2)

However, obligations are often represented as a "statement of fact"

Funeral Act, article 46, section 1

No bodies are interred on a closed cemetery.

May be about any subject

No common signal words or patterns

Preferred by the Guidelines for Legal Drafting

Experiment (1)

Classifier

Based on 88 patterns

Java

Based on input in which sentences and quoted text have already been marked (MetaLex)

Assumes a statement-of-fact norm if no explicit pattern is used
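The classification strategy described here, explicit patterns first and statement of fact as the default, can be sketched as follows. The original classifier was written in Java and used 88 patterns for Dutch text that the slides do not list. The English patterns below are hypothetical stand-ins based only on the example phrases in this deck.

```python
import re

# Hypothetical English patterns; the real pattern set is not shown in the slides.
PATTERNS = [
    ("Penalisation", re.compile(r"\bis punished with\b", re.IGNORECASE)),
    ("Definition", re.compile(r"\bis understood by\b", re.IGNORECASE)),
    ("Norm - Right/Permission",
     re.compile(r"\b(?:may|can|is allowed to|has a right to)\b", re.IGNORECASE)),
    ("Norm - Obligation/Duty",
     re.compile(r"\b(?:is prohibited|is charged with)\b", re.IGNORECASE)),
]
DEFAULT = "Norm - Statement of Fact"

def classify(sentence):
    """Return the label of the first matching pattern, or the default label."""
    for label, pattern in PATTERNS:
        if pattern.search(sentence):
            return label
    return DEFAULT

print(classify("No bodies are interred on a closed cemetery."))
```

The order of `PATTERNS` acts as a priority list in this sketch. Whether the original classifier resolved competing matches in the same way is not stated in the slides.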

Experiment (2) - Lists

A list is classified based on its header if the header contains a pattern; otherwise, each item is classified separately (without the header)

Tobacco Act, article 1

In this law, and in the stipulations based on it, is understood by:

a. tobacco products: … ;

b. Our Minister: …;

c. appendix: …;
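The list rule can be sketched like this: try the header first and fall back to the individual items. The patterns, the default label and the second example list are hypothetical illustrations, not the original pattern set or test data.

```python
import re

# Hypothetical English patterns, for illustration only.
PATTERNS = [
    ("Definition", re.compile(r"\bis understood by\b", re.IGNORECASE)),
    ("Norm - Right/Permission", re.compile(r"\b(?:may|is allowed to)\b", re.IGNORECASE)),
]
DEFAULT = "Norm - Statement of Fact"

def match_pattern(text):
    """Return the label of the first matching pattern, or None."""
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return None

def classify_list(header, items):
    """Return one label per list item."""
    header_label = match_pattern(header)
    if header_label is not None:
        # The header decides the type of the whole list.
        return [header_label] * len(items)
    # Otherwise each item is classified on its own, without the header.
    return [match_pattern(item) or DEFAULT for item in items]

header = "In this law, and in the stipulations based on it, is understood by:"
print(classify_list(header, ["tobacco products: …", "Our Minister: …", "appendix: …"]))
```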

Experiment – Test Set

18 texts

One royal decree

Three new bills

Fourteen amending bills

All "recent"

No overlap with the training set

654 sentences

592 "regular" sentences

62 lists

Results per Document (1)

Source                          Sentences             Lists                          Type
                                Total  Correct     %  Total  Correct  Partial     %
Royal Decree Stb. 1945, F 214      26       23   88%      4        4        0  100%  New
Bill 20 585 nr. 2                  31       30   97%      4        3        1   75%  New
Bill 22 139 nr. 2                  22       20   91%      2        2        -  100%  New
Bill 27 570 nr. 4                  21       16   76%      -        -        -     -  Change
Bill 27 611 nr. 2                  11       11  100%      1        1        -  100%  Change
Bill 30 411 nr. 2                 141      128   91%     25       20        3   80%  New
Bill 30 435 nr. 2                  40       39   98%      4        3        1   75%  Change
Bill 30 583 nr. A                  27       27  100%      -        -        -     -  Change
Bill 31 531 nr. 2                   3        3  100%      -        -        -     -  Change

Note on the slide: relatively low score due to a misapplied pattern (3x)

Results per Document (2)

Source                          Sentences             Lists                          Type
                                Total  Correct     %  Total  Correct  Partial     %
Bill 31 537 nr. 2                  29       29  100%      2        2        0  100%  Change
Bill 31 540 nr. 2                   7        7  100%      -        -        -     -  Change
Bill 31 541 nr. 2                   8        8  100%      -        -        -     -  Change
Bill 31 713 nr. 2                   7        6   86%      2        2        0  100%  Change
Bill 31 722 nr. 2                  31       22   71%      6        5        0   83%  Change
Bill 31 726 nr. 2                  78       67   86%      2        1        1   50%  Change
Bill 31 832 nr. 2                   7        7  100%      3        3        -  100%  Change
Bill 31 833 nr. 2                   4        4  100%      -        -        -     -  Change
Bill 31 835 nr. 2                  99       90   91%      7        4        3   57%  Change
Total                             592      537   91%     62       50        9   81%

Note on the slide: relatively low score due to a pattern appearing in an auxiliary sentence (5x)

Overall Results

91% of all regular sentences have been correctly classified

71%-100% over laws

81% of all lists have been correctly classified

50%-100% over laws
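These overall figures follow directly from the per-document counts. As a check, the sketch below recomputes them from the (total, correct) pairs in the two "Results per Document" tables. The variable and function names are hypothetical.

```python
# (total, correct) per document, in the order of the two result tables.
SENTENCES = [(26, 23), (31, 30), (22, 20), (21, 16), (11, 11), (141, 128),
             (40, 39), (27, 27), (3, 3), (29, 29), (7, 7), (8, 8), (7, 6),
             (31, 22), (78, 67), (7, 7), (4, 4), (99, 90)]
# Only documents that contain lists.
LISTS = [(4, 4), (4, 3), (2, 2), (1, 1), (25, 20), (4, 3),
         (2, 2), (2, 2), (6, 5), (2, 1), (3, 3), (7, 4)]

def overall(counts):
    """Return (total, correct, percentage correct) over all documents."""
    total = sum(t for t, _ in counts)
    correct = sum(c for _, c in counts)
    return total, correct, round(100 * correct / total)

def spread(counts):
    """Return the lowest and highest per-document percentage."""
    scores = [round(100 * c / t) for t, c in counts]
    return min(scores), max(scores)

print(overall(SENTENCES), spread(SENTENCES))
print(overall(LISTS), spread(LISTS))
```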

Results per Type (1)

Type                                In corpus (%)  In corpus (n)  Missed  False
Definition                                     2%             12       1      0
Norm - Right/Permission                       11%             64       4     13
Norm - Duty                                    5%             29       0      1
Delegation                                     3%             19       6      0
Publication Provision                          1%              4       0      0
Application Provision                          7%             40       1      8
Enactment Date                                 3%             17       1      0
Citation Title                                 1%              3       0      0
Value Assignment/Change                        0%              1       0      0
Penalisation                                   0%              0       0      2
Change                                        41%            241      16      8
Mixed Type                                     1%              3       3      0
Norm - Statement of Fact (default)            27%            159      23     23
Total                                                        592      55     55
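Per-type recall and precision can be derived from these counts. This rests on an assumed reading that the slide does not spell out: "Missed" is taken as sentences of that type that received another label, and "False" as sentences wrongly given that label. The name `scores` is hypothetical.

```python
# type: (in corpus, missed, false), copied from the per-type results.
RESULTS = {
    "Definition": (12, 1, 0),
    "Norm - Right/Permission": (64, 4, 13),
    "Norm - Duty": (29, 0, 1),
    "Delegation": (19, 6, 0),
    "Publication Provision": (4, 0, 0),
    "Application Provision": (40, 1, 8),
    "Enactment Date": (17, 1, 0),
    "Citation Title": (3, 0, 0),
    "Value Assignment/Change": (1, 0, 0),
    "Penalisation": (0, 0, 2),
    "Change": (241, 16, 8),
    "Mixed Type": (3, 3, 0),
    "Norm - Statement of Fact (default)": (159, 23, 23),
}

def scores(sentence_type):
    """Return (recall, precision); None where the ratio is undefined."""
    in_corpus, missed, false_hits = RESULTS[sentence_type]
    found = in_corpus - missed        # correctly labelled, under the assumed reading
    recall = found / in_corpus if in_corpus else None
    assigned = found + false_hits     # everything that received this label
    precision = found / assigned if assigned else None
    return recall, precision

for name in RESULTS:
    print(name, scores(name))
```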

Results per Type (2)

Mostly norms and modifications

right/permission 11%

obligation/duty 27% + 5%

change 41%

Several definitions and application provisions

Barely any of the others

Results – Patterns Used

Type                       Patterns known  Patterns used
Definition                             14              5
Norm - Right/Permission                17              3
Norm - Obligation/Duty                 15              8
Delegation                              7              5
Publication Provision                   1              1
Application Provision                   5              5
Enactment Date                          1              1
Citation Title                          2              2
Value Assignment                        8              1
Penalisation                            3              1
Change - Scope                          2              2
Change - Insertion                      4              4
Change - Replacement                    3              3
Change - Repeal                         2              1
Change - Renumbering                    3              2
Total                                  87             44

About 50% of the known patterns have been used

Difference in age between test and training set?

Underrepresented sentence types

Problems (1)

Patterns appearing in auxiliary sentences instead of the main sentence

Mostly happens with rights and application provisions:

If x has the right to …

If x is able to …

If x applies …
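One crude way to reduce this problem is to ignore a leading conditional clause before matching patterns. The sketch below is only an illustration on English text, with a hypothetical pattern and a comma-based clause boundary. It is not the approach used in the experiment, and a proper solution would need a grammatical parse.

```python
import re

# Hypothetical pattern for rights, based on the phrases on this slide.
RIGHT = re.compile(r"\b(?:has the right to|is able to|may)\b", re.IGNORECASE)
# A leading "If ..." clause, assumed to end at the first comma.
LEADING_CONDITION = re.compile(r"^\s*if\b[^,]*,\s*", re.IGNORECASE)

def main_clause(sentence):
    """Drop a leading conditional clause, if there is one."""
    return LEADING_CONDITION.sub("", sentence, count=1)

def is_right(sentence):
    """Match the rights pattern against the main clause only."""
    return RIGHT.search(main_clause(sentence)) is not None

sentence = "If x has the right to appeal, the decision is suspended."
print(RIGHT.search(sentence) is not None)  # naive match on the full sentence
print(is_right(sentence))                  # match restricted to the main clause
```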

Problems (2)

Lists need a more serious approach

Some can be classified by the header only;

Some can be classified by the list item only;

Some can only be classified by the header combined with the item.

Lists need to be converted to individual sentences (header plus list item)
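A minimal sketch of that conversion joins the header with each item to form one sentence per item. The trimming rules (colon after the header, semicolon after an item), the function name `expand_list` and the example list are assumptions made for illustration.

```python
def expand_list(header, items):
    """Turn a list into individual sentences: header plus list item."""
    stem = header.strip().rstrip(":").rstrip()
    sentences = []
    for item in items:
        body = item.strip().rstrip(";").strip()
        sentences.append(f"{stem} {body}")
    return sentences

# Hypothetical example list, not taken from the test set.
for sentence in expand_list("It is prohibited to:",
                            ["sell tobacco;", "advertise tobacco;"]):
    print(sentence)
```

Each resulting sentence can then be classified like a regular sentence, which covers the case where neither the header nor the item alone contains a pattern.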

Minor problems

Missing patterns

Mixed sentences

Difficult to solve, but does not occur often

Patterns used for other purposes

Repeal of fines instead of repeal of regulations

Specific patterns for specific laws

E.g. Tax Law (value assignment)

Conclusions

This (symbolic) approach is feasible

Using obligation as a default category seems acceptable

No major categories are missing

We expect it to generalise to other Dutch regulations

The approach could be used for other (civil) jurisdictions and languages

Biagioli et al. (2005) report similar results for Italian law, but with a statistical approach

Next Step

[Diagram: NL text → structured text with explicit and typed refs → model of individual provisions → integrated model of meaning. Supporting steps: recognizing and classifying; model fragment suggestions]

Next Step

Divide each sentence into terms that are linked through relations

Classification (and base pattern) gives a rough division and a rough relation

A more detailed division of the sentences is needed

Use of Dutch grammar parsers

Current Research (1)

Automatic modelling – Reference parser

References are important in legal texts

Useful when the computer understands these better

Better understanding is possible

References do not fit well in "normal Dutch sentence structure"

Separate reference parser

Things to think about – Granularity

Granularity: how far do we want to go with the splitting of text?

Liquor: those drinks that, at a temperature of twenty degrees Celsius, consist of alcohol for at least fifteen percent by volume, with the exception of wine.

Things to think about – Norms

Classification distinguishes only a

limited set of norms

Do we need more distinction?

For computer calculations?

For interaction with the user

Things to think about - Procedures

Procedures use the same language constructs as other norms (at least in Dutch), but:

Procedures have a more specific context

Procedures have a stronger ordering

Overall Conclusions

Distance from Legal Language to Computer Language is too big to cross in one step

Automatic modelling support is already partially possible:

Structure and References

Classification of sentences in legislation

Generalisation to all Dutch legislation possible

Same method for other languages and jurisdictions

Generalisation to other sources of law more difficult

winkels@uva.nl e.demaat@uva.nl

www.LeibnizCenter.org

Questions?