CSE 291G : Deep Learning for Sequences
Paper presentation
Topic: Named Entity Recognition
Rithesh
Outline
• Named Entity Recognition and its applications.
• Existing methods
• Character level feature extraction
• RNN: BLSTM-CNNs
Named Entity Recognition (NER)
WHAT ?
Named Entity Recognition (entity identification, entity chunking & entity extraction)
• Locate and classify named entity mentions in unstructured text into predefined categories: person names, organizations, locations, time expressions, etc.
• Ex: Kim bought 500 shares of IBM in 2010. (Kim → person name; IBM → organization; 2010 → time)
• Importance of locating the named entity in a sentence. Ex: Kim bought 500 shares of Bank of America in 2010. (the whole span "Bank of America" is one organization, not the location "America")
Named Entity Recognition (NER)
WHAT ?
WHY ?
Applications of NER
• Content Recommendations
• Customer support
• Classifying content for news providers
• Efficient search algorithms
• Question answering (QA)
• Machine translation systems
• Automatic summarization systems
Named Entity Recognition (NER)
WHAT ?
WHY ?
HOW ?
Approaches:
• ML classification techniques (e.g., SVM, perceptron model, CRF (Conditional Random Fields))
  Drawback: requires hand-crafted features
• Neural network model (Collobert et al. – Natural Language Processing (almost) from Scratch)
  Drawbacks: (i) a simple feedforward NN with a fixed window size; (ii) depends solely on word embeddings and fails to exploit character-level features (prefix, suffix, etc.)
• RNN: LSTM – variable-length input and long-term memory
  – First applied to NER by Hammerton in 2003
RNN: LSTM
• Overcomes drawbacks of the existing systems
• Accounts for variable-length input and long-term memory
• Fails to handle cases in which the i-th word of a sentence S depends on words at positions greater than i in S. Ex: "Teddy bears are on sale." vs. "Teddy Roosevelt was a great president." (the tag for "Teddy" depends on the words that follow it)
SOLUTION: Bi-directional LSTM (BLSTM) – captures information from the past and from the future (see the sketch below).
Still fails to exploit character-level features.
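As a minimal PyTorch sketch of the idea (the layer sizes here are illustrative, not the paper's configuration): the forward pass reads words 1..i and the backward pass reads words N..i, so each word's representation depends on both past and future context.

```python
import torch
import torch.nn as nn

# Minimal BLSTM tagger sketch (hypothetical sizes).
emb_dim, hidden_dim, num_tags = 50, 100, 9

blstm = nn.LSTM(input_size=emb_dim, hidden_size=hidden_dim,
                bidirectional=True, batch_first=True)
to_tags = nn.Linear(2 * hidden_dim, num_tags)  # both directions concatenated

sentence = torch.randn(1, 7, emb_dim)  # a batch of one 7-word sentence
outputs, _ = blstm(sentence)           # shape (1, 7, 2 * hidden_dim)
scores = to_tags(outputs)              # a score for each tag, at each word
```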
Techniques to capture character-level features
• Santos and Labeau (2015) proposed models for character-level feature extraction using CNNs, for NER and POS tagging.
• Ling (2015) proposed a model for character-level feature extraction using a BLSTM, for POS tagging.
• CNN or BLSTM?
  – BLSTM did not perform significantly better than CNN, and BLSTM is computationally more expensive to train.
⇒ BLSTM: word-level feature extraction; CNN: character-level feature extraction
Named Entity Recognition with Bidirectional LSTM-CNNs
Jason P. C. Chiu and Eric Nichols (2016). Named entity recognition with bidirectional LSTM-CNNs. Transactions of the Association for Computational Linguistics, 4:357-370.
• Inspired by:
  – Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011b. Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12:2493-2537.
  – Cícero dos Santos and Victor Guimarães. 2015. Boosting named entity recognition with neural character embeddings. Proceedings of the Fifth Named Entities Workshop, pages 25-33.
Reference paper: Boosting NER with Neural Character Embeddings
• CharWNN deep neural network – uses word-level and character-level representations (embeddings) to perform sequential classification.
• Evaluated on two corpora: HAREM I (Portuguese) and SPA CoNLL-2002 (Spanish).
• CharWNN extends Collobert et al.'s (2011) neural network architecture for sequential classification by adding a convolutional layer to extract character-level representations.
CharWNN
• Input: a sentence S
• Output: for each word in the sentence, a score for each class
S = ⟨w₁, w₂, …, w_N⟩
Each word w_n is mapped to a vector u_n = [r^wrd; r^wch], the concatenation of its word embedding r^wrd and its character-level embedding r^wch (a minimal sketch follows).
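A sketch of this concatenation (the dimensions are illustrative, not fixed by the slide):

```python
import torch

r_wrd = torch.randn(50)  # word-level embedding of w_n (e.g., 50-dim)
r_wch = torch.randn(25)  # character-level embedding of w_n from the CNN
u_n = torch.cat([r_wrd, r_wch])  # u_n = [r^wrd; r^wch], a 75-dim vector
```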
CNN for character embedding
W = ⟨c₁, c₂, …, c_M⟩ (the word as a sequence of M characters)
• Each character is mapped to a character embedding.
• A convolution (matrix-vector operation) with window size k is applied over the character embeddings.
• Max pooling over the word's character windows yields the fixed-size character-level representation r^wch (a sketch follows below).
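A minimal sketch of the character-level convolution with max-over-time pooling, in the spirit of CharWNN (all sizes here are illustrative assumptions):

```python
import torch
import torch.nn as nn

char_vocab, char_emb_dim, k, num_filters = 70, 10, 5, 25  # illustrative sizes

char_table = nn.Embedding(char_vocab, char_emb_dim)
# Convolution over the character sequence with window size k
conv = nn.Conv1d(char_emb_dim, num_filters, kernel_size=k, padding=k // 2)

chars = torch.randint(0, char_vocab, (1, 8))  # one word of 8 characters
e = char_table(chars).transpose(1, 2)         # (1, char_emb_dim, 8)
h = conv(e)                                   # (1, num_filters, 8)
r_wch = h.max(dim=2).values.squeeze(0)        # max over positions -> (25,)
```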
CharWNN (contd.)
With r^wch computed for every word, the sentence S = ⟨w₁, …, w_N⟩ becomes the sequence of joint embeddings ⟨u₁, u₂, …, u_N⟩.
CharWNN
• Input to the word-level convolution layer: ⟨u₁, u₂, …, u_N⟩
• Two neural network layers then produce, for each word position n, a score s(u_n)_t for every tag t.
• For a transition score matrix A, the entry A_{t,u} gives the score for jumping from tag t to tag u; the score of a sentence with tag path t₁ … t_N is

  $S = \sum_{n=1}^{N} \big( A_{t_{n-1}, t_n} + s(u_n)_{t_n} \big)$

(a code sketch of this path score follows below).
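A sketch of the path score under assumed shapes: `net_scores[n, t]` is the network's score for tag t at word n, and `A[t, u]` is the learned transition score (the start transition is omitted here for simplicity):

```python
import torch

def path_score(net_scores, A, tags):
    """Sum of per-word network scores and tag-transition scores along one path.

    net_scores: (N, T) tensor, net_scores[n, t] = score of tag t at word n.
    A: (T, T) tensor, A[t, u] = score for jumping from tag t to tag u.
    tags: list of N tag indices (one candidate path).
    """
    score = net_scores[0, tags[0]]
    for n in range(1, len(tags)):
        score = score + A[tags[n - 1], tags[n]] + net_scores[n, tags[n]]
    return score
```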
Network Training for CharWNN
• CharWNN is trained by minimizing the negative log-likelihood over the training set D.
• The sentence score is interpreted as a conditional probability over a tag path: the score is exponentiated and normalized with respect to all possible paths.
• Stochastic gradient descent (SGD) is used to minimize the negative log-likelihood with respect to the model parameters θ.
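Concretely, writing S(w, t) for the sentence score of tag path t (as on the previous slide), the normalization the slide describes gives the standard sentence-level log-likelihood of Collobert et al.; the log-sum-exp over all paths looks exponential but is computed in O(N·|T|²) by a forward dynamic-programming recursion:

```latex
\log p(t \mid w, \theta) \;=\; S(w, t) \;-\; \log \sum_{t'} e^{S(w, t')}
```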
Embeddings
• Word-level embeddings: for Portuguese NER, the word-level embeddings previously trained by Santos (2014) were used; for Spanish, embeddings were trained on the Spanish Wikipedia.
• Character-level embeddings: unsupervised learning of character-level embeddings was NOT performed. The character-level embeddings are initialized by randomly sampling each value from a uniform distribution.
Corpus: Portuguese & Spanish
Hyperparameters
Comparison of different NNs for the SPA CoNLL-2002 corpus
Comparison with the state-of-the-art for the SPA CoNLL-2002 corpus
Comparison of different NNs for the HAREM I corpus
Comparison with the state-of-the-art for the HAREM I corpus
Chiu, J. P., & Nichols, E. (2016). Named entity recognition with bidirectional LSTM-CNNs. Transactions of the Association for Computational Linguistics, 4:357-370.
BLSTM: word-level feature extraction; CNN: character-level feature extraction
Character Level feature extraction
Word level feature extraction
Embeddings
• Word embeddings: the 50-dimensional word embeddings released by Collobert (2011b), trained on Wikipedia and the Reuters RCV-1 corpus. Stanford's GloVe and Google's word2vec embeddings were also evaluated.
• Character embeddings: a randomly initialized lookup table with values drawn from a uniform distribution with range [−0.5, 0.5], producing character embeddings of 25 dimensions (see the sketch below).
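A sketch of that initialization (the 25 dimensions and the [−0.5, 0.5] range are from the slide; the vocabulary size is a made-up placeholder):

```python
import torch.nn as nn

char_vocab_size = 100  # hypothetical vocabulary size
char_emb = nn.Embedding(char_vocab_size, 25)  # 25-dim, as in the paper
nn.init.uniform_(char_emb.weight, -0.5, 0.5)  # U(-0.5, 0.5) initialization
```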
Additional Features
• Additional word-level features:
  – Capitalization feature: allCaps, upperInitial, lowercase, mixedCaps, noinfo (one possible implementation is sketched below).
  – Lexicons: SENNA and DBpedia
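One plausible way to compute the capitalization feature named above; the exact rules in the paper may differ, and the treatment of non-alphabetic tokens here is a guess:

```python
def capitalization_feature(word: str) -> str:
    """Map a token to one of the five capitalization classes from the slide."""
    if not word.isalpha():
        return "noinfo"  # assumption: non-alphabetic tokens carry no info
    if word.isupper():
        return "allCaps"
    if word.islower():
        return "lowercase"
    if word[0].isupper() and word[1:].islower():
        return "upperInitial"
    return "mixedCaps"
```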
Training and Inference
• Implementation:
  – torch7 library
  – Initial state of the LSTM set to zero vectors.
• Objective: maximize the sentence-level log-likelihood.
  – The objective function and its gradient can be efficiently computed by dynamic programming.
  – The Viterbi algorithm is used to find the optimal tag sequence [i]₁ᵀ that maximizes the sentence score (a sketch follows below).
• Learning: training was done by mini-batch stochastic gradient descent (SGD) with a fixed learning rate; each mini-batch consists of multiple sentences with the same number of tokens.
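A compact NumPy sketch of Viterbi decoding over the per-word tag scores and the transition matrix (shapes as in the earlier path-score sketch; this is illustrative, not the torch7 implementation):

```python
import numpy as np

def viterbi(net_scores, A):
    """Best tag path by dynamic programming.

    net_scores: (N, T) array of per-word tag scores from the network.
    A: (T, T) array of tag-transition scores.
    """
    N, T = net_scores.shape
    delta = net_scores[0].copy()        # best score of any path ending in tag t
    back = np.zeros((N, T), dtype=int)  # backpointers
    for n in range(1, N):
        cand = delta[:, None] + A       # cand[p, c] = best-to-p + transition p->c
        back[n] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + net_scores[n]
    tags = [int(delta.argmax())]        # backtrack from the best final tag
    for n in range(N - 1, 0, -1):
        tags.append(int(back[n, tags[-1]]))
    return tags[::-1]
```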
Results
Results: F1 scores of BLSTM and BLSTM-CNN with various additional features
(emb: Collobert word embeddings, char: character-type feature, caps: capitalization feature, lex: lexicon feature)
Results: Word embeddings
Results: Various dropout values
Questions to discuss
• Why is BLSTM-CNN the best choice?
• Is the proposed model language-independent?
• Is it a good idea to use additional features (capitalization, prefix, suffix, etc.)?
• Possible future work…
Thank you!!