Copyright © Doctrine
Meetup NLP #2 season 4
Structuring legal documents
with Deep Learning
Pauline Chavallard
2019/11/27
Plan
● About Doctrine and legal research
● Motivations
● Modeling
● Results
● Further work
Google for law
Doctrine was created in 2016
Challenges
- volume of data
- heterogeneity
- domain specificity
Legal contents have tons of links
Challenges in data science at Doctrine
Low/weak supervision:
● No labeled data (esp. in French)
High specificity/heterogeneity:
● Language is different between decisions, legislations and commentaries
● Among decisions, depending on courts, structures are different
● Content comes in various formats (papers, images, PDFs, texts)
An example of French court decision
Plan
● About Doctrine and legal research
● Motivations
● Modeling
● Results
● Further work
Motivation
● Four million court decisions handed down each year in France
● Critical information for lawyers
Problem:
● Long and complex documents
● One may be interested in only a very specific part
French court decisions
A French court decision is generally structured into the following sections:
● Metadata (« En-tête » in French): court, number, date, etc., of the trial.
● Parties (« Parties » in French): information about the claimants and defendants
● Composition of the court (« Composition de la cour » in French)
● Facts (« Faits » in French): what happened?
● Pleas in law and main arguments (« Moyens » in French): arguments presented by
the claimant and defendant.
● Grounds (« Motifs » in French): reasons and arguments used by the court
● Operative part of the judgment (« Dispositif » in French): final decision
French court decisions - Example
Cour d'appel de Metz, 28 janvier 2015
French court decisions
Unfortunately, there is no mandatory guideline on how to
release a court decision.
Courts may use:
● different styles in terms of writing
● different styles in terms of organising the document
● all sections from the previous slide, or only a subset
The French Court of Appeal usually has a very unified way of
writing: ~55 % have explicit titles for their categories
French Court of Appeal
Extracted from https://www.doctrine.fr/d/CA/Orleans/2007/SKDD824CCFE8D8D9D93128.
French Court of Appeal (Facts section)
Extracted from https://www.doctrine.fr/d/CA/Orleans/2007/SKDD824CCFE8D8D9D93128.
For the remaining 45 %, it’s harder...
French Court of Appeal
Extracted from https://www.doctrine.fr/d/CA/Metz/2015/RAC1261A1563690C06B77
How would an algorithm automatically generate a table of contents?
Plan
● About Doctrine and legal research
● Motivations
● Modeling
● Results
● Further work
Information needed
To complete this task, a human being would take advantage of:
1. The vocabulary used
2. The order of the paragraphs
Information needed
1. The vocabulary used
Not always so obvious: legislation references appear in both sections...
-> standard encoding approaches (BoW, TF-IDF) performed poorly
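Why shared legal vocabulary hurts these baselines can be illustrated with a tiny hand-rolled TF-IDF sketch (hypothetical paragraphs and a simplified weighting, not Doctrine's pipeline): a legislation reference appearing in every section type gets an IDF of zero, so it carries no discriminative weight.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute sparse TF-IDF vectors for a list of tokenized documents."""
    n = len(docs)
    df = Counter()                       # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
    return vectors

# Two hypothetical paragraphs: a "Moyens" and a "Motifs" paragraph, both
# citing the same legislation reference (article 1240 du code civil).
moyens = "le demandeur invoque l' article 1240 du code civil".split()
motifs = "la cour retient que l' article 1240 du code civil s' applique".split()
vecs = tfidf([moyens, motifs])
# "article" appears in both documents, so its IDF (and weight) is exactly 0.
```

The shared reference is precisely the kind of token a human would use differently depending on context, which is what the sequential model below can exploit.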
Information needed
2. The order of the paragraphs
● Metadata
● Parties
● Composition of the court
● Facts
● Pleas in law and main arguments
● Grounds
● Operative part of the judgment
-> sequential information is important
Modeling
Split decisions into paragraphs (X)
Pre-process: replace rare words by <UNK> with probability p = 0.5
Dataset creation
● Find labeled data from structured decisions with titles
● Remove titles
● Assign each paragraph to its corresponding label (y)
● y ∈ {0, 1, …, 6} (seven classes)
-> Supervised classification
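The dataset-creation step above can be sketched as follows (the title regexes and label ids are illustrative assumptions, not the production rules): detect explicit section titles in a structured decision, drop them, and label every following paragraph with the current section id.

```python
import re

# Hypothetical title patterns, one per section, in the slide's order (0..6).
SECTION_PATTERNS = [
    (0, r"en[\s-]?t[eê]te"),            # Metadata
    (1, r"parties"),
    (2, r"composition de la cour"),
    (3, r"faits"),
    (4, r"moyens"),
    (5, r"motifs"),
    (6, r"dispositif"),
]

def build_examples(paragraphs):
    """Drop explicit titles and label each following paragraph (y in 0..6)."""
    X, y = [], []
    label = None
    for para in paragraphs:
        matched = None
        for lab, pat in SECTION_PATTERNS:
            if re.fullmatch(pat, para.strip().lower()):
                matched = lab
                break
        if matched is not None:
            label = matched             # title line: sets the label, is removed
        elif label is not None:
            X.append(para)
            y.append(label)
        # paragraphs before the first detected title are skipped in this sketch
    return X, y

paras = ["FAITS", "Le 3 mars ...", "MOTIFS", "La cour retient ...",
         "DISPOSITIF", "Condamne ..."]
X, y = build_examples(paras)
# -> X keeps the three body paragraphs, y == [3, 5, 6]
```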
Looks like Named Entity Recognition... at paragraph scale.
With LSTM / CRF, we capture information from
● paragraph inherent properties
● paragraph context (the neighborhood gives insights on the label)
[1] Neural Architectures for Named Entity Recognition. Lample, Ballesteros, Subramanian, Kawakami, Dyer.
NAACL 2016.
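The data flow of such a tagger can be sketched in NumPy (random weights and made-up dimensions, shapes only, not the trained model): each paragraph embedding passes through a bi-LSTM, and the concatenated hidden states produce per-paragraph label scores for the CRF/softmax layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm(X, W, U, b, h0, c0):
    """Single-direction LSTM over X: (T, d_in); returns hidden states (T, d_h)."""
    T, _ = X.shape
    d_h = h0.shape[0]
    h, c = h0, c0
    out = np.zeros((T, d_h))
    for t in range(T):
        z = W @ X[t] + U @ h + b                  # (4*d_h,) pre-activations
        i, f, g, o = np.split(z, 4)               # input, forget, cell, output
        i, f, o = 1/(1+np.exp(-i)), 1/(1+np.exp(-f)), 1/(1+np.exp(-o))
        c = f * c + i * np.tanh(g)
        h = o * np.tanh(c)
        out[t] = h
    return out

T, d_in, d_h, K = 6, 8, 4, 7                      # 6 paragraphs, 7 section labels
X = rng.normal(size=(T, d_in))                    # paragraph embeddings
params = lambda: (rng.normal(size=(4*d_h, d_in)), rng.normal(size=(4*d_h, d_h)),
                  np.zeros(4*d_h), np.zeros(d_h), np.zeros(d_h))
fwd = lstm(X, *params())                          # left-to-right pass
bwd = lstm(X[::-1], *params())[::-1]              # right-to-left pass
H = np.concatenate([fwd, bwd], axis=1)            # (T, 2*d_h) bi-LSTM states
scores = H @ rng.normal(size=(2*d_h, K))          # (T, K) per-paragraph label scores
```

Each row of `scores` is one paragraph's unnormalised label distribution; the CRF layer then decodes the whole sequence jointly.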
Modeling
Modeling: paragraph embedding
source: A structured self-attentive sentence embedding
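A minimal NumPy sketch of the pooling idea (a single-vector simplification of Lin et al.'s self-attentive embedding, with toy vectors rather than learned parameters): score each word representation, softmax the scores, and take the weighted sum as the paragraph embedding.

```python
import numpy as np

def attention_pool(H, w):
    """Self-attentive pooling.
    H: (seq_len, d) word representations; w: (d,) learned scoring vector."""
    scores = H @ w                      # one relevance score per word
    a = np.exp(scores - scores.max())   # numerically stable softmax
    a /= a.sum()                        # attention weights, sum to 1
    return a, a @ H                     # weights and the (d,) pooled embedding

# Toy input: the third "word" aligns strongly with the scoring vector.
H = np.array([[1.0, 0.0], [0.0, 1.0], [10.0, 0.0]])
w = np.array([1.0, 0.0])
a, emb = attention_pool(H, w)
# -> a concentrates on index 2, and emb is close to H[2]
```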
Modeling: all in one
end-to-end training
Plan
● About Doctrine and legal research
● Motivations
● Modeling
● Results
● Further work
Modeling: results
● Trained on 20,000 decisions
● bi-LSTM outperforms mean pooling but is more computationally expensive
● CRF outperforms softmax at the same computational cost
● Attention brings a few points of accuracy at low computational cost
The CRF lets us inspect the learned transition probabilities:
● Each class is most often followed by itself
● Metadata -> Parties
● Metadata -> Composition of the court
● Lower triangular part: green
● Upper triangular part: red
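The decoding step that uses this transition matrix can be sketched with a minimal Viterbi decoder (NumPy sketch with illustrative scores, not the trained model's parameters): transitions that stay in place or move forward are favoured, so the jointly decoded path avoids backward jumps even when one paragraph's emission scores slightly prefer an earlier label.

```python
import numpy as np

def viterbi(emissions, transitions):
    """Best label sequence. emissions: (T, K) per-paragraph label scores;
    transitions: (K, K) score of moving from label i to label j."""
    T, K = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + transitions          # (K, K) all pairs
        back[t] = cand.argmax(axis=0)                # best predecessor per label
        score = cand.max(axis=0) + emissions[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):                    # backtrack
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example with 3 ordered labels: staying or moving forward scores well,
# jumping backward is heavily penalised.
trans = np.array([[0.0, 0.0, -1.0],
                  [-5.0, 0.0, 0.0],
                  [-5.0, -5.0, 0.0]])
emis = np.array([[2.0, 0.0, 0.0],
                 [0.0, 1.0, 0.9],
                 [1.1, 0.0, 1.0]])   # last paragraph slightly prefers label 0
path = viterbi(emis, trans)
# -> path == [0, 1, 2], whereas per-paragraph argmax would output [0, 1, 0]
```

This is exactly why CRF beats an independent softmax here: the softmax would jump back to label 0 on the last paragraph, while the CRF's transition scores keep the section order consistent.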
Modeling: attention
Product outcome
For the 45 % of Court of Appeal decisions with an incomplete table of
contents, this approach now produces 90 % complete ones.
Plan
● About Doctrine and legal research
● Motivations
● Modeling
● Results
● Further work
Errors of the model
Further work
- better paragraph / sentence splitting
- one of the tags is very rare and the model performs poorly on it
- play with optimizers, dropout, …
- try different architectures?
A blog post is available:
Paragraph classification, an article by Doctrine
Thank you for your attention!
Any questions?