Semi-supervised methods of text processing, and an application to medical concept extraction Yacine...
![Page 1: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/1.jpg)
Semi-supervised methods of text processing, and an application to
medical concept extraction
Yacine Jernite
Text-as-Data series
September 17, 2015
![Page 2: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/2.jpg)
What do we want from text?
1. Extract information
2. Link to other knowledge sources
3. Use knowledge (Wikipedia, UpToDate,…)
![Page 3: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/3.jpg)
How do we answer those questions?
1. What do people talk about on social media, and how? (Sentiment analysis)
2. What actions are described in a news article? (Semantic parsing)
3. In a medical setting: what symptoms does a patient exhibit?
![Page 4: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/4.jpg)
Pipeline
![Page 5: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/5.jpg)
Pipeline
Text
Information, Features
![Page 6: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/6.jpg)
Pipeline
Text
Low level NLP
Information, Features
IE algorithms
![Page 7: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/7.jpg)
Pipeline
Text
Low level NLP
Information, Features
IE algorithms
End-to-end NLP
![Page 8: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/8.jpg)
Machine learning approach
1. Specify task
2. Specify training algorithm
3. Get data
4. Train
![Page 10: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/10.jpg)
So much text, so few labels
• 5M English Wikipedia articles (3G words)
• 54M Reddit comments
• 1G Words in Gigaword dataset (newswire text)
• 5-grams from 1T words
![Page 11: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/11.jpg)
So much text, so few labels
• 1M words in Penn TreeBank (parsing)
• Machine translation: highly language (and domain) dependent
• A few thousand to few hundred thousand sentence…
• And so many other custom tasks
![Page 12: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/12.jpg)
Presentation outline
1. Literature review on semi-supervised paradigms
a. Label induction
b. Feature learning
2. Current work: Semi-Supervised Medical Entity Linking
![Page 13: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/13.jpg)
Label induction
Overview
1. Labeling data is costly
2. Automatically obtain approximate labeling on larger dataset
3. Train using pseudo-labels
![Page 14: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/14.jpg)
Feature learning
Overview
1. Feature quality affects accuracy
2. Learn features using other sources
3. Train with features on small labeled dataset
![Page 15: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/15.jpg)
So much text, so few labels
• Label induction
• Feature learning
• Domain adaptation
• Multi-view learning
![Page 16: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/16.jpg)
Labels
Overview
• Fine Grained Entity Recognition
  • Ling and Weld, 2012
• Distant Supervision for RE with an incomplete KB
  • Min et al., 2013
• Co-Training for DA
  • Chen et al., 2011
• Semi-Supervised FSP for Unknown Predicates
  • Das and Smith, 2011
![Page 17: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/17.jpg)
Fine Grained Entity Recognition
● Method type: Automatic labeling
● Task: Identify entities in text, and tag them with one of 112 types
● Labeled data: Hand-labelled news reports
● Auxiliary data: Wikipedia, Freebase
![Page 18: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/18.jpg)
Fine Grained Entity Recognition
Freebase
![Page 19: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/19.jpg)
Fine Grained Entity Recognition
1. Automatically label entity spans in Wikipedia text
Don Quixote
…
Meaning
Harold Bloom says that Don Quixote is the writing of radical nihilism and anarchy,…
![Page 20: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/20.jpg)
Fine Grained Entity Recognition
1. Automatically label Wikipedia text
● Spans are obtained from hyperlinks
● Types are obtained from Freebase
Don Quixote
…
Meaning
Harold Bloom says that Don Quixote is the writing of radical nihilism and anarchy,…
Harold Bloom: Topic, Academic, Person, Author, Award winner, Influence node
Nihilism: Topic, Field of study, Literature subject, Religion
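The automatic labeling step can be sketched as follows. This is a minimal illustration, not the pipeline of Ling and Weld: `TOY_TYPES` is a hypothetical stand-in for Freebase, the type names are simplified, and hyperlinks are assumed to arrive as token offsets.

```python
# Hypothetical toy lookup standing in for Freebase: entity -> types.
TOY_TYPES = {
    "Harold Bloom": ["person", "author"],
    "Don Quixote": ["book"],
}

def bio_tags_from_links(tokens, links, kb_types):
    """Turn hyperlink spans into BIO tags plus the types of each span.

    tokens: list of words.
    links: list of (start, end, entity) with end exclusive.
    Spans whose entity is missing from kb_types stay tagged O.
    """
    tags = ["O"] * len(tokens)
    span_types = []
    for start, end, entity in links:
        types = kb_types.get(entity)
        if not types:
            continue
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
        span_types.append((" ".join(tokens[start:end]), types))
    return tags, span_types

tokens = "Harold Bloom says that Don Quixote is".split()
links = [(0, 2, "Harold Bloom"), (4, 6, "Don Quixote")]
tags, span_types = bio_tags_from_links(tokens, links, TOY_TYPES)
print(tags)        # ['B', 'I', 'O', 'O', 'B', 'I', 'O']
print(span_types)
```

The resulting tags and types are pseudo-labels: no human annotation is involved, so they inherit any errors in the hyperlinks or the knowledge base.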
![Page 21: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/21.jpg)
Fine Grained Entity Recognition
1. Train CRF and perceptron on pseudo-labeled data
Harold/B Bloom/I says/O that/O Don/B Quixote/I is/O
Types: Harold Bloom: person; Don Quixote: book
![Page 22: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/22.jpg)
Fine Grained Entity Recognition
● Compares to
  ● Stanford NER: 4 most common classes
  ● Ratinov et al. Named Entity Linking
● Results:
![Page 23: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/23.jpg)
Distant Supervision for Relation Extraction with an incomplete Knowledge Base
● Method type: Automatic labeling, Label inference
● Task: Relation extraction
● Labeled data: TAC 2011 KBP dataset
● Auxiliary data: Wikipedia infoboxes, Freebase
![Page 24: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/24.jpg)
Distant Supervision for Relation Extraction with an incomplete Knowledge Base
● Entity pairs extracted from Wikipedia infoboxes
● Labeled with Freebase relations: origin
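A rough sketch of how distant supervision produces labels. The entity names, the `TOY_KB` dictionary, and the substring matching are all illustrative assumptions, not the setup of Min et al.

```python
# Hypothetical knowledge base: (entity1, entity2) -> relation name.
TOY_KB = {("Alice Example", "Springfield"): "origin"}

def distant_labels(sentences, kb):
    """Label every sentence that mentions both entities of a KB pair.

    Returns (sentence, pair, relation) triples. A pair absent from the KB
    yields no label at all, so the supervision is positive-only.
    """
    labeled = []
    for sentence in sentences:
        for (e1, e2), relation in kb.items():
            if e1 in sentence and e2 in sentence:
                labeled.append((sentence, (e1, e2), relation))
    return labeled

sentences = [
    "Alice Example grew up in Springfield.",
    "Alice Example wrote a novel.",
]
for triple in distant_labels(sentences, TOY_KB):
    print(triple)
```

Because the knowledge base is incomplete, an unlabeled pair is not necessarily a negative example.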
![Page 25: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/25.jpg)
Distant Supervision for Relation Extraction with an incomplete Knowledge Base
● Latent variable algorithm to learn from positive-only labels
• X: entity pair mention
• Z: mention level label
• l: bag level label
• Y: KB entity pair label
• θ: Number of positive labels
![Page 26: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/26.jpg)
Distant Supervision for Relation Extraction with an incomplete Knowledge Base
● Learns with EM, compares to (y = l)
![Page 27: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/27.jpg)
Co-Training for Domain Adaptation
● Method type: Automatic labeling, Domain adaptation
● Task: Text classification - review polarity
● Labeled data: Amazon reviews for books, DVD, electronics, kitchen
● Auxiliary data: Cross-domain training
![Page 28: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/28.jpg)
Self-Training
[Diagram: Labeled data, Unlabeled data, Classifier 1, Pseudo-labeled data, Classifier 2]
![Page 31: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/31.jpg)
Self-Training
● Algorithm
  ● Train System-1 on labeled data
  ● Label some data with System-1
  ● Train System-2 on combined data
● Not much improvement
  ● Less than 1% improvement in parsing accuracy
  ● Somewhat better “portability”
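The three algorithm steps above can be sketched with a toy classifier. The nearest-centroid model and the `min_margin` confidence filter are assumptions chosen to keep the example self-contained; the slide does not say how "some data" is selected.

```python
def train_centroids(examples):
    """Nearest-centroid classifier: average the feature vectors of each label."""
    sums, counts = {}, {}
    for x, y in examples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return (label, margin); margin is the distance gap between the two closest centroids."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, y)
        for y, c in centroids.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin

def self_train(labeled, unlabeled, min_margin=1.0):
    system1 = train_centroids(labeled)            # train System-1 on labeled data
    pseudo = []
    for x in unlabeled:                           # label some data with System-1
        label, margin = predict(system1, x)
        if margin >= min_margin:                  # assumed confidence filter
            pseudo.append((x, label))
    system2 = train_centroids(labeled + pseudo)   # train System-2 on combined data
    return system2, pseudo

labeled = [([0.0], "neg"), ([4.0], "pos")]
unlabeled = [[0.5], [3.5], [2.1]]
system2, pseudo = self_train(labeled, unlabeled)
print(pseudo)   # [([0.5], 'neg'), ([3.5], 'pos')]
print(system2)  # {'neg': [0.25], 'pos': [3.75]}
```

The point near the decision boundary (`[2.1]`) is left out, which shows why self-training tends to add only examples the first system already handles well.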
![Page 32: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/32.jpg)
Co-Training
[Diagram: Labeled data, Unlabeled data, Classifier 1, Classifier 2, Pseudo-labeled data, Selection]
![Page 36: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/36.jpg)
Co-Training
● Algorithm
  ● Train System-1 and System-2 on labeled data with disjoint feature sets
  ● Add data which is confidently labeled by exactly one system
  ● Re-train, iterate
● Theoretical guarantees for “independent” feature sets
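A minimal sketch of the co-training loop above. Each example carries two one-dimensional views standing in for the disjoint feature sets; the centroid classifier, the margin-based notion of confidence, and the `min_margin` and `rounds` parameters are assumptions made for illustration. The sketch expects exactly two classes.

```python
def centroids(examples):
    """Average 1-D feature values per label (a deliberately tiny classifier)."""
    groups = {}
    for value, label in examples:
        groups.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

def classify(model, value):
    """Return (label, margin): margin is the gap between the two nearest centroids."""
    ranked = sorted((abs(value - c), label) for label, c in model.items())
    return ranked[0][1], ranked[1][0] - ranked[0][0]

def co_train(labeled, unlabeled, min_margin=1.5, rounds=3):
    """labeled: list of ((view1, view2), label); unlabeled: list of (view1, view2)."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        model1 = centroids([(x[0], y) for x, y in labeled])  # System-1 sees view 1 only
        model2 = centroids([(x[1], y) for x, y in labeled])  # System-2 sees view 2 only
        added, remaining = [], []
        for x in pool:
            y1, m1 = classify(model1, x[0])
            y2, m2 = classify(model2, x[1])
            confident1, confident2 = m1 >= min_margin, m2 >= min_margin
            if confident1 != confident2:  # exactly one system is confident
                added.append((x, y1 if confident1 else y2))
            else:
                remaining.append(x)
        if not added:
            break
        labeled, pool = labeled + added, remaining  # re-train on the next pass
    return labeled, pool

seed = [((0.0, 0.0), "neg"), ((4.0, 4.0), "pos")]
pool = [(0.2, 2.1), (1.9, 3.9), (2.0, 2.0)]
final_labeled, leftover = co_train(seed, pool)
print(final_labeled[2:])  # [((0.2, 2.1), 'neg'), ((1.9, 3.9), 'pos')]
print(leftover)           # [(2.0, 2.0)]
```

Each added example is one that a single view could label confidently while the other could not, so every round gives the weaker system new information.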
![Page 37: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/37.jpg)
Co-Training for Domain Adaptation
● L1 regularization: starts using more target-domain features
![Page 38: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/38.jpg)
Co-Training for Domain Adaptation
● Best improvement adding a limited number of examples
![Page 39: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/39.jpg)
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
● Method type: Label pre-selection
● Task: Frame-semantic parsing
● Labeled data: SemEval 2007
● Auxiliary data: Gigaword corpus, FrameNet
![Page 40: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/40.jpg)
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
Ted really tried to read Infinite Jest, but was discouraged by the size of the book.
![Page 41: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/41.jpg)
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
● Extracts possible frame targets from unlabeled data
![Page 43: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/43.jpg)
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
● Graph construction
● Distance from dependency parsed text
● About 60,000 targets (about 10,000 in FrameNet)
● Convex quadratic optimization problem
![Page 44: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/44.jpg)
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
• Learned neighbor frame distribution
![Page 45: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/45.jpg)
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
• Parsing results
![Page 46: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/46.jpg)
Features
Overview
• Prototype-Driven Learning for Sequence Models
  • Haghighi and Klein, 2006
• DA with Structural Correspondence Learning
  • Blitzer et al., 2006
• NLP (almost) from scratch
  • Collobert et al., 2011
• On Using Monolingual Corpora in NMT
  • Gulcehre et al., 2015
![Page 47: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/47.jpg)
Prototype-Driven Learning for Sequence Models
● Method type: Feature learning
● Task: POS tagging, Classified ads segmentation
● Labeled data: PTB/CTB, Classifieds
● Auxiliary data: Prototypes
![Page 48: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/48.jpg)
Prototype-Driven Learning for Sequence Models
● Example prototypes:
![Page 49: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/49.jpg)
Prototype-Driven Learning for Sequence Models
● Gives prototypes of tag-token pairs
● Compute a similarity measure on tokens
● Adds similarity to the prototypes as a feature
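One way to picture the three steps above. The similarity measure here (cosine over left and right neighbour counts), the `PROTO=` feature naming, and the threshold are assumptions for the sake of a small runnable sketch, not the exact construction of Haghighi and Klein.

```python
import math
from collections import Counter, defaultdict

def context_vectors(sentences):
    """Count left/right neighbours of every token (a crude distributional profile)."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        for i in range(1, len(padded) - 1):
            vectors[padded[i]]["L=" + padded[i - 1]] += 1
            vectors[padded[i]]["R=" + padded[i + 1]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def prototype_features(token, vectors, prototypes, threshold=0.5):
    """Return 'PROTO=<word>' features for prototypes similar enough to the token."""
    feats = []
    for tag, words in prototypes.items():
        for word in words:
            if cosine(vectors[token], vectors[word]) >= threshold:
                feats.append("PROTO=" + word)
    return sorted(feats)

corpus = [["the", "dog", "runs"], ["the", "cat", "runs"], ["a", "dog", "sleeps"]]
prototypes = {"NOUN": ["dog"], "VERB": ["runs"]}  # hypothetical tag-token prototypes
vectors = context_vectors(corpus)
print(prototype_features("cat", vectors, prototypes))     # ['PROTO=dog']
print(prototype_features("sleeps", vectors, prototypes))  # ['PROTO=runs']
```

The sequence model then sees `PROTO=dog` on "cat" and can generalize the NOUN prototype to a word it was never told about.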
![Page 50: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/50.jpg)
Prototype-Driven Learning for Sequence Models
● Results:
POS tagging
Classifieds segmentation
![Page 51: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/51.jpg)
Domain Adaptation with Structural Correspondence Learning
● Method type: Feature learning, Multi-view learning, Domain adaptation
● Task: POS tagging
● Labeled data: MEDLINE (target domain)
● Auxiliary data: WSJ (source domain)
![Page 52: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/52.jpg)
Domain Adaptation with Structural Correspondence Learning
● Example: pivot features required, from, for
![Page 53: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/53.jpg)
Domain Adaptation with Structural Correspondence Learning
● Defines a set of pivot features, present in both source and target
● Sets up a set of mini-tasks: “predict the presence of pivot feature f”
● Runs SVD on the learned weights wf
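The three steps above can be written out as follows. The notation is an assumption based on the usual presentation of structural correspondence learning and does not appear on the slide: $p_f(x)$ indicates whether pivot feature $f$ occurs in example $x$, $L$ is a classification loss, $\lambda$ a regularization weight, $m$ the number of pivots, and $h$ the number of singular vectors kept.

```latex
% One linear predictor per pivot feature f = 1..m, trained on unlabeled
% data from both domains.
\hat{w}_f = \arg\min_{w} \sum_{j} L\big(w \cdot x_j,\; p_f(x_j)\big) + \lambda \lVert w \rVert^2

% Stack the weight vectors and keep the top h left singular vectors.
W = [\hat{w}_1 \mid \cdots \mid \hat{w}_m], \qquad W = U D V^\top, \qquad \theta = U_{[1:h,\,:]}^\top

% Augmented representation used to train the final tagger.
x \;\mapsto\; [\, x \,;\, \theta x \,]
```

Features that behave alike with respect to the pivots in either domain end up close under the projection $\theta$, which is what lets a tagger trained on source data transfer to the target domain.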
![Page 54: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/54.jpg)
Domain Adaptation with Structural Correspondence Learning
● Projection on first singular vector:
![Page 55: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/55.jpg)
Domain Adaptation with Structural Correspondence Learning
● Results:
![Page 56: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/56.jpg)
NLP (almost) from Scratch
● Method type: Feature learning, Multi-view learning
● Task: POS, chunking, NER, SRL
● Labeled data: PTB, CoNLL
● Auxiliary data: 852M words from Wikipedia + Reuters
![Page 57: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/57.jpg)
NLP (almost) from Scratch
● Neural network architecture
![Page 58: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/58.jpg)
NLP (almost) from Scratch
● First approach: supervised training of neural networks for tasks
![Page 59: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/59.jpg)
NLP (almost) from Scratch
● Second approach: initialize with word representations from LM
![Page 60: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/60.jpg)
NLP (almost) from Scratch
● Finally: joint training
![Page 61: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/61.jpg)
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
● Sentiment analysis using word embeddings and syntactic parses
![Page 62: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/62.jpg)
Skip-Thought Vectors (Kiros et al., NIPS 2015)
● Encodes sentences directly
● Improves sentence-level tasks
  ● Classification
  ● Paraphrase
  ● Image-sentence ranking
![Page 63: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/63.jpg)
On Using Monolingual Corpora in NMT
● Method type: Feature learning, Target distribution
● Task: Machine Translation
● Labeled data: Aligned text
● Auxiliary data: Monolingual corpora
![Page 64: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/64.jpg)
On Using Monolingual Corpora in NMT
● Neural Machine Translation as sequence-to-sequence modeling
● RNN encoder and decoder:
![Page 65: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/65.jpg)
On Using Monolingual Corpora in NMT
● Train Neural Machine Translation system
● Train target language model: RNN
● Shallow fusion: beam search on combined scores
● Deep fusion: add language model hidden state as input to decoder (+controller)
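Shallow fusion can be sketched as one step of beam search over combined scores. The weighting `beta`, the probability floor, and the callable interfaces `tm_probs` and `lm_probs` are hypothetical; they stand in for the trained translation model and target language model.

```python
import math

def shallow_fusion_step(beam, tm_probs, lm_probs, beta=0.5, beam_size=2):
    """Expand each hypothesis by one token and keep the best beam_size.

    beam: list of (tokens, score) with tokens a tuple of words.
    tm_probs / lm_probs: functions mapping a token prefix to
    {next_token: probability}.
    """
    candidates = []
    for tokens, score in beam:
        tm, lm = tm_probs(tokens), lm_probs(tokens)
        for word, p_tm in tm.items():
            p_lm = lm.get(word, 1e-9)  # floor avoids log(0)
            combined = score + math.log(p_tm) + beta * math.log(p_lm)
            candidates.append((tokens + (word,), combined))
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:beam_size]

# Toy models: the translation model cannot decide, the language model can.
toy_tm = lambda prefix: {"cat": 0.5, "hat": 0.5}
toy_lm = lambda prefix: {"cat": 0.9, "hat": 0.1}
best = shallow_fusion_step([((), 0.0)], toy_tm, toy_lm)
print(best[0][0])  # ('cat',)
```

Deep fusion differs in that the language model's hidden state feeds into the decoder itself, so the combination is learned rather than applied only at search time.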
![Page 66: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/66.jpg)
On Using Monolingual Corpora in NMT
Turkish
![Page 67: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/67.jpg)
On Using Monolingual Corpora in NMT
Chinese
![Page 68: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/68.jpg)
Semi-Supervised Learning for Entity Linkage using Variational Inference
Yacine Jernite, Alexander Rush and David Sontag
![Page 69: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/69.jpg)
Semi-Supervised Learning for Entity Linkage using Variational Inference
● Method type: Feature learning, Label inference
● Task: Medical concept extraction
● Labeled data: SemEval 2015 (annotated medical notes)
● Auxiliary data: MIMIC-II (medical text), UMLS
![Page 70: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/70.jpg)
Task description
● We have:
● Medical text from the MIMIC database
● Medical knowledge base UMLS with concept descriptions
● We want to identify concepts in the text and link them to UMLS
![Page 71: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/71.jpg)
UMLS samples
● Ambiguous, incomplete
![Page 73: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/73.jpg)
Step 1: Mention Detection
![Page 74: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/74.jpg)
Step 1: Mention Detection
● B, I, O – ID, OD tagging with CRF
ordered/O for/O L/B sided/I hearing/I loss/I ./O neuro/B exams/OD did/OD not/OD reveal/OD any/OD deficits/ID
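Reading mentions back out of such a tag sequence might look like the sketch below. The tag semantics are inferred from the example and are an assumption: `B` opens a mention, `I` and `ID` extend it, `OD` marks a gap token inside a discontinuous mention, and `O` closes any open mention.

```python
def decode_mentions(tokens, tags):
    """Collect mention strings from B/I/O/ID/OD tags (assumed semantics, see above)."""
    mentions, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                mentions.append(current)
            current = [token]
        elif tag in ("I", "ID") and current:
            current.append(token)
        elif tag == "O":
            if current:
                mentions.append(current)
            current = []
        # "OD" keeps the mention open and skips the gap token.
    if current:
        mentions.append(current)
    return [" ".join(m) for m in mentions]

tokens = "ordered for L sided hearing loss . neuro exams did not reveal any deficits".split()
tags = ["O", "O", "B", "I", "I", "I", "O", "B", "OD", "OD", "OD", "OD", "OD", "ID"]
print(decode_mentions(tokens, tags))  # ['L sided hearing loss', 'neuro deficits']
```

The second mention, "neuro deficits", is discontinuous: a plain B/I/O scheme could not represent it without also absorbing the words in between.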
![Page 75: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/75.jpg)
Step 1: Mention Detection
● Duplicating incompatible examples
L/B sided/I hearing/I loss/I and/O pain/O ./O
L/B sided/I hearing/OD loss/OD and/OD pain/ID ./O
![Page 76: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/76.jpg)
Step 1: Mention Detection
● Run inference on unlabeled and test set
● Approximate marginal probability
● Threshold
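The thresholding step trades precision for recall, as in this small sketch. The candidate mentions and their marginal probabilities are made-up toy values, and the helper names are hypothetical.

```python
def select_mentions(candidates, threshold):
    """Keep candidate mentions whose approximate marginal meets the threshold."""
    return [mention for mention, prob in candidates if prob >= threshold]

def precision_recall(predicted, gold):
    """Precision and recall of a predicted mention set against gold mentions."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 1.0
    recall = hits / len(gold) if gold else 1.0
    return precision, recall

# Toy marginals, not real model output.
candidates = [("hearing loss", 0.9), ("neuro deficits", 0.6), ("exams", 0.2)]
gold = {"hearing loss", "neuro deficits"}
for threshold in (0.1, 0.5, 0.8):
    kept = select_mentions(candidates, threshold)
    print(threshold, precision_recall(kept, gold))
```

Sweeping the threshold over its range and plotting the resulting pairs is one way to obtain a precision-recall curve for the detector.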
![Page 77: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/77.jpg)
Step 1: Mention Detection
● PR curve:
![Page 78: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/78.jpg)
Step 1: Mention Detection
● Other approaches:
● ezDI: A Supervised NLP System for Clinical Narrative Analysis, Pathak et al., 2015
● BIO for continuous, SVM to join
● ULisboa: Recognition and Normalization of Medical Concepts, Leal et al., 2015
● BIOENS tagging scheme, Brown clusters, domain lexicons
![Page 79: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/79.jpg)
Step 2: Mention Identification
![Page 80: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/80.jpg)
Step 2: Mention Identification
● Pathak et al.:
● Simple lookup
● Semi-automated modified descriptions
● Edit distance
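A lookup with an edit-distance fallback might look like the sketch below. It is not Pathak et al.'s implementation: the lexicon, the placeholder identifiers `ID_1` and `ID_2`, and the `max_dist` cutoff are all assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def link(mention, lexicon, max_dist=2):
    """Exact lookup first, then the closest description within max_dist edits."""
    m = mention.lower()
    if m in lexicon:
        return lexicon[m]
    best = min(lexicon, key=lambda desc: edit_distance(m, desc))
    return lexicon[best] if edit_distance(m, best) <= max_dist else None


# Hypothetical lexicon mapping descriptions to placeholder concept IDs.
lexicon = {"hearing loss": "ID_1", "hypertension": "ID_2"}
print(link("Hearing loss", lexicon), link("hypertention", lexicon), link("fracture", lexicon))
```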
![Page 81: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/81.jpg)
Step 2: Mention Identification
● Leal et al.:
● Abbreviation dictionary
● UMLS lookup
● Similarity: Lucene, n-gram and edit distance
● Lowest Information Content (specificity, using UMLS tree structure)
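As an illustration of the n-gram similarity mentioned above, the sketch below compares character trigram sets with a Jaccard score. Which n-gram measure Leal et al. actually used is not stated on the slide, so the choice of trigrams and Jaccard is an assumption.

```python
def char_ngrams(text, n=3):
    """Set of character n-grams of a lowercased string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap between the character n-gram sets of a and b."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# A misspelled mention should stay closer to the right description.
print(ngram_similarity("hypertention", "hypertension"))
print(ngram_similarity("hypertention", "hearing loss"))
```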
![Page 82: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/82.jpg)
Step 2: Mention Identification
● A Generative Entity-Mention Model for Linking Entities with KB (Han and Sun, ACL 2011)
● : translation model from main description
● : unigram language model
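For reference, the factorization in Han and Sun's paper can be written as below. The symbols follow that paper (s is the mention's name, c its context, e the entity) and may not match the notation that was lost from this slide.

```latex
P(m, e) = P(s, c, e) = P(e)\, P(s \mid e)\, P(c \mid e)
```

Here P(e) is an entity popularity term, P(s | e) is the name model (the translation model from the entity's main description), and P(c | e) is the context model (the unigram language model).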
![Page 83: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/83.jpg)
Step 2: Mention Identification
● Our model:
● : multinomial with automatically curated support
● : joint distribution on all entities in the document
![Page 85: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/85.jpg)
Step 2: Mention Identification
● : MRF on CUIs
[Figure: nodes labeled L sided hearing loss, HTN, Hyperlipidemia, Neurological deficits, …]
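A pairwise MRF over the concepts of a document can be sketched as a sum of per-mention scores and pairwise co-occurrence scores. The code below is only an illustration under that assumption: the potentials, the placeholder concept names, and the brute-force search over each mention's candidate support are hypothetical and not taken from the talk.

```python
from itertools import product


def joint_log_score(assignment, unary, pairwise):
    """Unnormalized log-score of one joint assignment of concepts to mentions."""
    score = sum(unary[i][c] for i, c in enumerate(assignment))
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            # Symmetric pairwise potential; absent pairs contribute nothing.
            score += pairwise.get(frozenset((assignment[i], assignment[j])), 0.0)
    return score


def map_assignment(unary, pairwise):
    """Exhaustive search over the restricted support of each mention."""
    supports = [list(u) for u in unary]
    return max(product(*supports),
               key=lambda a: joint_log_score(a, unary, pairwise))


# Two mentions, two candidate concepts each (placeholder names).
unary = [{"c1": 0.0, "c2": -0.1}, {"c3": 0.0, "c4": -0.1}]
pairwise = {frozenset(("c2", "c4")): 1.0}
print(map_assignment(unary, pairwise))
```

The example shows the point of a joint model: each mention alone prefers `c1` and `c3`, but the correlation term makes `c2` and `c4` the better joint choice. Exhaustive search only works for a handful of mentions, which is one reason to turn to approximate inference.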
![Page 86: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/86.jpg)
Step 2: Mention Identification
● Problem: CUIs are latent variables on MIMIC (unlabeled)
● Variational learning, following:
● Auto-Encoding Variational Bayes, Kingma and Welling, ICLR 2014
![Page 87: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/87.jpg)
Step 2: Mention Identification
● Objective:
● Maximize
● Jensen’s inequality:
● Joint maximization in
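The formulas on this slide were lost in extraction. They presumably correspond to the standard variational lower bound, which can be written as below in assumed notation: m denotes the observed mentions, e the latent CUIs, p_θ the generative model and q_φ the approximate posterior. None of these symbols are confirmed by the slide.

```latex
\begin{aligned}
\log p_\theta(m)
  &= \log \sum_{e} q_\phi(e \mid m)\, \frac{p_\theta(m, e)}{q_\phi(e \mid m)} \\
  &\ge \sum_{e} q_\phi(e \mid m)\, \log \frac{p_\theta(m, e)}{q_\phi(e \mid m)}
   \;=:\; \mathcal{L}(\theta, \phi)
\end{aligned}
```

The inequality is Jensen's inequality applied to the concave logarithm, and the bound is then maximized jointly in (θ, φ).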
![Page 88: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/88.jpg)
Step 2: Mention Identification
● Factorized :
[Figure: nodes labeled L sided hearing loss, HTN, Hyperlipidemia, Neurological deficits, …]
![Page 89: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/89.jpg)
Step 2: Mention Identification
● Considers mention and neighbors:
[Figure: nodes labeled L sided hearing loss, HTN, Hyperlipidemia, Neurological deficits, …]
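A factorized approximate posterior that looks at each mention and its neighbors could take the form below. The notation is assumed, and the window of one neighbor on each side is an illustrative choice; the slide does not specify how many neighbors are used.

```latex
q_\phi(e_1, \dots, e_n \mid m_1, \dots, m_n)
  = \prod_{i=1}^{n} q_\phi(e_i \mid m_{i-1}, m_i, m_{i+1})
```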
![Page 90: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/90.jpg)
Step 2: Mention Identification
● Neural network parameterization
● Semi-automated restricted support
● Supervised training gives 2nd best accuracy on 2014 task
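One way to combine a learned scorer with a restricted support is to normalize the scores over the allowed candidates only. The sketch below assumes the scores come from some network (not shown) and that each mention has a precomputed candidate set; the names are hypothetical.

```python
import math


def restricted_softmax(scores, support):
    """Softmax over the candidates in `support` only.

    scores: dict mapping a candidate concept to a real-valued score.
    support: iterable of candidates allowed for this mention.
    """
    support = list(support)
    top = max(scores[c] for c in support)  # subtract the max for stability
    exps = {c: math.exp(scores[c] - top) for c in support}
    total = sum(exps.values())
    return {c: v / total for c, v in exps.items()}


# Placeholder scores; "c3" is outside the support and gets no probability.
probs = restricted_softmax({"c1": 2.0, "c2": 2.0, "c3": 5.0}, support=["c1", "c2"])
print(probs)
```

Restricting the support keeps the normalization cheap even when the full concept vocabulary is very large.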
![Page 91: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/91.jpg)
Step 2: Mention Identification
● Next steps:
● Pre-train parameters
● Use correlation model
● Train with variational algorithm
![Page 92: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/92.jpg)
Review of Semi-Supervised methods
● Automatic labeling of data
● Label pre-selection
● Use prototypes
● Use features learned on larger corpus
![Page 93: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/93.jpg)
Review of Semi-Supervised methods
● Domain adaptation: PubMed
● Multi-view learning
![Page 94: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/94.jpg)
Review of Semi-Supervised methods
● Multi-view learning:
● Other information on the patient: diagnosis codes, procedures, demographics, etc.
● Jointly learn to predict those
![Page 95: Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015.](https://reader035.fdocuments.us/reader035/viewer/2022070413/5697bfa51a28abf838c97f75/html5/thumbnails/95.jpg)
Questions?