
(LP)2, an Adaptive Algorithm for IE from Web-related Texts

Using Shallow NLP in Adaptive IE from Web-related Texts

Fabio Ciravegna
Department of Computer Science, University of Sheffield
(F.Ciravegna@dcs.shef.ac.uk)

(LP)2

• Covering algorithm for adaptive IE
  – Wrapper Induction-like algorithm
    • effective on structured texts
    • effective on free texts
  – Uses shallow NLP for reducing data sparseness
• (LP)2 is the basis for Amilcare!

(LP)2 = Learning Patterns via Language Processing

Talk Overview
• Introduction
• Rule Induction
  – Tagging and Correction Rules
  – Data Sparseness & Overfitting
  – NLP-based Rule Generalisation
  – Rule Set Pruning
• Discussion
  – Role of NLP-based Generalisation
  – Experimental Results
• Conclusion & Future Work

Wrapper-like approaches
• Effective on (semi-)structured texts
• Less effective on free texts:
  – Rules match words
  – Scarce or no NLP knowledge
  – Data sparseness

A simple wrapper: (LP)2

CONDITION: a pattern of conditions on a connected sequence of words

ACTION: operates on single SGML tags (inserting or shifting them)

e.g., </speaker> is inserted independently of <speaker>
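As a concrete illustration, here is a minimal Python sketch of this rule anatomy, assuming tokenised input; the names (Rule, apply_rule) are illustrative, not (LP)2's actual API:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    pattern: list    # conditions on a connected sequence of words
    insert_at: int   # offset of the tag slot inside the matched window
    tag: str         # single SGML tag to insert, e.g. "</speaker>"

def apply_rule(rule, tokens):
    """Insert rule.tag at rule.insert_at wherever the word pattern matches."""
    out, i, n = [], 0, len(rule.pattern)
    while i < len(tokens):
        if tokens[i:i + n] == rule.pattern:
            out.extend(tokens[i:i + rule.insert_at])
            out.append(rule.tag)          # each rule handles one tag only:
            out.extend(tokens[i + rule.insert_at:i + n])
            i += n                        # </speaker> independently of <speaker>
        else:
            out.append(tokens[i])
            i += 1
    return out
```

For example, apply_rule(Rule(["at", "4"], 1, "<stime>"), "the seminar at 4 pm will".split()) yields "the seminar at <stime> 4 pm will".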

Use of Rules for IE

IE on the test corpus:

– Tagging Rules insert tags
– Contextual Rules add further tags:
  • inserted tags used as context
  • applied until no tags are added
– Correction Rules shift misplaced tags
– Tag Validation removes uncoupled tags
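A minimal sketch of this test-time loop, modelling each rule as a function from a token sequence to a (possibly re-tagged) token sequence; all names are illustrative:

```python
from typing import Callable, List

Phase = Callable[[List[str]], List[str]]

def extract(tokens: List[str],
            tagging: List[Phase],
            contextual: List[Phase],
            correction: List[Phase],
            validate: Phase) -> List[str]:
    for rule in tagging:            # 1. tagging rules insert tags
        tokens = rule(tokens)
    while True:                     # 2. contextual rules: previously inserted
        before = list(tokens)       #    tags act as context; loop until no
        for rule in contextual:     #    further tags are added
            tokens = rule(tokens)
        if tokens == before:
            break
    for rule in correction:        # 3. correction rules shift misplaced tags
        tokens = rule(tokens)
    return validate(tokens)        # 4. validation removes uncoupled tags
```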

Tagging Rule: example

Condition on words: the seminar at 4 pm will
Action: insert <time> between "at" and "4"

Result: the seminar at <time> 4 pm will

Initial rule = window of conditions on words
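A sketch of how one tagged training instance becomes such an initial rule, assuming tokenised text and an illustrative window size w:

```python
def initial_rule(tokens, tag_index, tag, w=3):
    """Initial rule from one training instance: the conditions are the
    up-to-w words on each side of the position where `tag` was annotated."""
    left = tokens[max(0, tag_index - w):tag_index]
    right = tokens[tag_index:tag_index + w]
    return {"pattern": left + right,   # e.g. the seminar at | 4 pm will
            "insert_at": len(left),    # tag slot between left and right parts
            "tag": tag}

# initial_rule("the seminar at 4 pm will".split(), 3, "<time>")
#   -> {"pattern": ["the","seminar","at","4","pm","will"],
#       "insert_at": 3, "tag": "<time>"}
```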

Contextual Rules
• Add recall
• Low-precision/high-recall rules
• Used when the best rules are not able to identify/shift some information
  – e.g., a missing end or start tag
• Surrounding tags are used to constrain rule application:

The seminar will be given by <speaker> A. Padua </speaker> at <time> 12.30

Contextual Rule: Example

Rule: word=* </speaker> word=at
• is not a Best Rule (high recall/low precision)
• is reliable if applied only to close an open <speaker>!
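A sketch of that constraint, hard-coding the slide's rule (the helper name is hypothetical): insert </speaker> before "at" only while an opened <speaker> has not yet been closed:

```python
def close_open_speaker(tokens):
    """Low-precision rule 'insert </speaker> before "at"', made reliable
    by firing only while an opened <speaker> is still unclosed."""
    out, speaker_open = [], False
    for tok in tokens:
        if tok == "<speaker>":
            speaker_open = True
        elif tok == "</speaker>":
            speaker_open = False
        elif tok == "at" and speaker_open:
            out.append("</speaker>")   # close the pending tag, then emit "at"
            speaker_open = False
        out.append(tok)
    return out
```

On "The seminar will be given by <speaker> A. Padua at <time> 12.30" it closes the speaker annotation before "at"; on text with no open <speaker> it does nothing.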

Imprecision in Tagging

Border identification can be tricky:
• Up to 5% imprecision for some information
• Correction rules shift tags misplaced by tagging rules
  – Positive examples: tags misplaced within a distance d from the correct position
  – Negative examples: the correct tags

Correction Rule: Example

The seminar at 4 </stime> PM will be held in Room 201

Initial rule:

word | wrong tag | correct tag
at   |           |
4    | </stime>  |
pm   |           | </stime>
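A sketch of this single correction rule (hypothetical function name), shifting the misplaced </stime> one token to the right:

```python
def shift_stime(tokens):
    """Correction rule for the example: '4 </stime> pm' -> '4 pm </stime>'."""
    t = list(tokens)
    for i in range(len(t) - 2):
        if t[i] == "4" and t[i + 1] == "</stime>" and t[i + 2].lower() == "pm":
            t[i + 1], t[i + 2] = t[i + 2], t[i + 1]   # swap tag and "pm"
    return t
```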

Data Sparseness

[Figure: number of rules (y axis, 0–560) vs. cases covered (x axis, 0–20)]

Rules relying on word matching tend to:
• report low coverage
• be unable to relate cases
• overfit the training corpus

50% of rules cover 1 case; 71% cover up to 2 cases

Overfitting
• Rules with limited coverage overfit the training corpus
  – Sensitive to mistakes in tagging
• Some good rules are not accepted
  – Low recall at testing time
• Some bad rules are accepted
  – Low precision at testing time
• Solution:
  – Generalise rules to cover a larger number of cases
  – Prune rule sets

Rule Generalisation

• Each instance is generalised by reducing its pattern in length
• Generalisations are tested on the training corpus
• The best k rules generated from each instance are kept, preferring rules that:
  – report the smallest error rate (wrong/matches)
  – report the greatest number of matches
  – cover different examples

Implemented as a general-to-specific beam search with pruning; see the paper for details.
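A minimal sketch of the idea (exhaustive enumeration rather than the paper's beam search): generate every connected sub-window of the initial pattern that still spans the tag slot, score each candidate on the training corpus, and keep the best k. The matched/wrong scoring callbacks are assumed given; the real algorithm additionally prefers rules that cover different examples and prunes during the search. The lattice on the next slide shows the candidate space such a search explores.

```python
def best_k_generalisations(pattern, insert_at, matched, wrong, k=4):
    """pattern: word conditions; insert_at: tag offset inside the pattern.
    matched(rule)/wrong(rule): counts on the training corpus (assumed given)."""
    candidates = []
    for lo in range(insert_at + 1):                    # connected sub-windows
        for hi in range(insert_at, len(pattern) + 1):  # spanning the tag slot
            if hi <= lo:
                continue
            rule = (tuple(pattern[lo:hi]), insert_at - lo)
            m = matched(rule)
            if m == 0:
                continue
            err = wrong(rule) / m                # error rate = wrong / matches
            candidates.append((err, -m, rule))   # low error, then high coverage
    candidates.sort()
    return [rule for _, _, rule in candidates[:k]]
```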

Example: generalisations of one <stime> instance and their matches on the training corpus, from the most general rule (r0) down to the initial, most specific one (r10):

r0:  action=<stime> — matches: all the corpus
r1:  Word1=at, action=<stime> — matches: 0,12,15,44,72,134,146,230,250
r2:  action=<stime>, Word1=4 — matches: 12,27,72,112,134,230,245
r3:  Word1=seminar, Word2=*, action=<stime> — matches: 0,4,12,134,222,232
r4:  action=<stime>, Word1=*, Word2=pm — matches: 12,27,72,134,245
r5:  Word1=at, action=<stime>, Word2=4 — matches: 12,72,134
r6:  Word1=seminar, Word2=at, action=<stime> — matches: 0,12,134
r7:  action=<stime>, Word1=4, Word2=pm — matches: 12,134
r8:  Word1=seminar, Word2=at, action=<stime>, Word3=4 — matches: 12,134
r9:  Word1=at, action=<stime>, Word2=4, Word3=pm — matches: 12,134
r10: Word1=seminar, Word2=at, action=<stime>, Word3=4, Word4=pm — matches: 12,134

Rule Pruning

• Generation time: prune a rule if
  – error rate > threshold
  – coverage too small
  – coverage subsumed by a more general rule with equivalent error rate
• Testing time:
  – N-fold cross-validation cycle on the training corpus
    • unreliable rules removed
    • error thresholds tuned
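A sketch of the generation-time tests, assuming each rule carries its error rate and the set of covered training cases (names and thresholds are hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScoredRule:
    error: float          # wrong / matches on the training corpus
    covered: frozenset    # ids of the training cases the rule covers

def keep(rule, rule_set, max_error=0.1, min_coverage=2):
    if rule.error > max_error:            # error rate above threshold
        return False
    if len(rule.covered) < min_coverage:  # coverage too small
        return False
    for other in rule_set:                # coverage strictly subsumed by a
        if other is not rule and rule.covered < other.covered \
                and other.error <= rule.error:
            return False                  # more general, no-worse rule
    return True
```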

NLP-based Adaptive IE

• NLP-based approaches
  – Effective on newspaper articles (e.g. MUC)
  – Ineffective on other types of texts
• Why?
  – NLP knowledge may be inadequate
    • Sublanguage phenomena
    • Generic knowledge needs adaptation (requires an IE expert)
  – Unparsability (e.g. HTML pages)

Use of NLP in Wrappers

• Need: bridging the gap between NLP-based and wrapper-like approaches
• Proposed solution: a wrapper that
  – has generic NLP knowledge
  – learns the level of NLP useful for the task at hand by measuring its effectiveness on the training corpus

Wrapper with NLP-based Generalisation: (LP)2

Conditions on words are replaced by information from shallow NLP modules:
• Capitalisation
• Morphological analysis
  – generalises over gender/number
• POS tagging
  – generalises over lexical categories
• User-defined dictionary or gazetteer

All generalisations are tested, and the best k derived from each instance are selected.
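A sketch of the alternatives a single word condition can be generalised to, assuming per-token annotations from external shallow-NLP modules (a POS tagger, a morphological analyser, a gazetteer); each alternative becomes a candidate condition tested on the training corpus:

```python
def candidate_conditions(token, lemma, pos, gaz_class=None):
    """Alternative conditions for one word test, given shallow-NLP
    annotations (assumed provided by external modules)."""
    conds = [("word", token)]                 # the plain wrapper condition
    if token[:1].isupper():
        conds.append(("capitalised", True))   # capitalisation
    conds.append(("lemma", lemma))            # morphology: folds gender/number
    conds.append(("pos", pos))                # POS: folds lexical variation
    if gaz_class is not None:
        conds.append(("class", gaz_class))    # user dictionary / gazetteer
    return conds

# e.g. candidate_conditions("Seminars", "seminar", "NNS")
#   -> [("word", "Seminars"), ("capitalised", True),
#       ("lemma", "seminar"), ("pos", "NNS")]
```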

Effect of Generalisation (1): Effectiveness with NLP

Slot      | (LP)2 G | (LP)2 NG
speaker   | 72.1    | 14.5
location  | 74.1    | 58.2
stime     | 100     | 97.4
etime     | 96.4    | 87.1
All slots | 89.7    | 78.2

(with comparable effectiveness on the training corpus; speaker and location are the most interesting cases)

Effect of Generalisation (2): Reduction in Data Sparseness

[Figure: number of rules (y axis, 0–560) vs. cases covered (x axis, 0–20), with and without generalisation]

NLP-based generalisation: 14% of rules cover 1 case; 42% cover up to 2 cases
Non-NLP: 50% of rules cover 1 case; 71% cover up to 2 cases

Effect of Generalisation (3)

[Two learning curves for <speaker>, accuracy vs. number of training texts (100–243): without generalisation, accuracy stays in the 10–17 range; with generalisation it rises to the 58–74 range. The slide annotates the curves with "Overfitting".]

Effect of Generalisation (4): etime

[Figure: accuracy (50–100) vs. number of training texts (100–243) for <etime>. With generalisation, the curve converges rapidly to its best value; without generalisation, it needs more training data.]

Effect of Generalisation (5)

                          | (LP)2 G   | (LP)2 NG
Average rule coverage     | 10.2      | 6.2
Selected rules            | 887       | 1136
Rules covering 1 case     | 133 (14%) | 560 (50%)
Rules covering >50 cases  | 37        | 15
Rules covering >100 cases | 19        | 3

Effect of Generalisation (6)

[Figure: accuracy (10–100) vs. number of training texts (100–243) for <speaker> and <etime>, each with and without generalisation; annotations: "<speaker>: overfitting", "<etime>: immediate convergence"]

Best Level of Generalisation

• Rules undergo selection during learning
  – NLP-based generalisation is accepted only if it is useful and does not introduce errors
  – The best level of NLP generalisation is selected automatically
• Heuristic (see the sketch below):
  – if a rule matches > n cases, the generalisation is used
  – otherwise the generalisation is refused
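A one-line sketch of that acceptance test; the signature and the default threshold are illustrative, not the paper's:

```python
def accept_generalisation(gen_matches, gen_error, word_error, n=2):
    """Use the NLP-generalised condition only if the generalised rule
    matches more than n training cases (it is useful) and its error rate
    does not exceed the word-level rule's (it introduces no errors)."""
    return gen_matches > n and gen_error <= word_error
```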

Best Level of Generalisation (2)

• Example: ITC seminar announcements (mixed Italian/English)
  – Date, time, location in Italian
  – Speaker, title and abstract in English
  – English POS tagging also used for the Italian part
• The NLP-based version outperforms the other version:
  – Same accuracy when NLP is unreliable (date, time, location)
  – Better accuracy when NLP is reliable (speaker, title)

Conclusion & Future Work

• (LP)2 selects the correct level of NLP processing needed for the task at hand
• Generalisation to be extended to:
  – Parsing: syntactic relations as an alternative to adjacency relations
  – Discourse processing

(LP)2: Experimental Results

• Ported to:
  – 6 scenarios
  – 2 languages
  – 2 text types (free and semi-structured)
• A naïve version is used in LearningPinocchio, a commercial system for adaptive IE
• The basis for Amilcare

Experiments (2)

• Excellent results wrt the state of the art:
  – CMU seminar announcements (English)
  – TX-Austin job announcements (English)
• Application to other corpora:
  – ITC seminar announcements (mixed Italian/English)
• Commercial applications:
  – "Tombstone" data from professional résumés (English)
  – IE from financial news (Italian)
  – IE from classified ads (Italian)
  – (…)

CMU: detailed results

          | (LP)2 | BWI  | HMM  | SRV  | Rapier | Whisk
speaker   | 77.6  | 67.7 | 76.6 | 56.3 | 53.0   | 18.3
location  | 75.0  | 76.7 | 78.6 | 72.3 | 72.7   | 66.4
stime     | 99.0  | 99.6 | 98.5 | 98.5 | 93.4   | 92.6
etime     | 95.5  | 93.9 | 62.1 | 77.9 | 96.2   | 86.0
All slots | 86.0  | 83.9 | 82.0 | 77.1 | 77.3   | 64.9

1. Best overall accuracy
2. Best result on the speaker field
3. No results below 75%

Jobs: Detailed Results

All slots: 84.14 vs. 75.14

Algorithm's Pros

• Single-tag rules
• Contextual rules
• Correction rules
• NLP-based generalisation:
  – reduces data sparseness
  – reduces the needed training corpus size

Thank You!

Amilcare