RNN for book classification
Mindlab Group, Ritual Group
Source: lin99.github.io/NLPTM-2016/4.Docs/Image captioning.pdf
Dataset construction
Annotated Dataset
Get top tags from users
| No. | Class | Tags |
|---|---|---|
| 1 | science_fiction | sci-fi, science-fiction |
| 2 | comedy | comedies, comedy, humor |
| ... | ... | ... |
| 9 | religion | christian, religion, christianity,... |
Predict the next word with a neural network (language model)
Learn one vector per word
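A minimal sketch of the idea, assuming the simplest possible setup: a bigram model that predicts the next word from the current word only. The function names, the dimensions, and the plain SGD loop are illustrative assumptions, not the model shown on the slide. The rows of `E` are the per-word vectors that training produces as a by-product.

```python
import numpy as np

def train_next_word_model(token_ids, vocab_size, dim=8, epochs=300, lr=0.2, seed=0):
    """Train a tiny bigram neural language model and return (E, W).

    E[i] is the learned vector for word i; W maps a vector to next-word scores.
    """
    rng = np.random.default_rng(seed)
    E = rng.normal(scale=0.1, size=(vocab_size, dim))   # one vector per word
    W = rng.normal(scale=0.1, size=(dim, vocab_size))   # output projection
    pairs = list(zip(token_ids[:-1], token_ids[1:]))    # (current, next) pairs
    for _ in range(epochs):
        for cur, nxt in pairs:
            h = E[cur].copy()                    # embedding lookup
            logits = h @ W
            p = np.exp(logits - logits.max())
            p /= p.sum()                         # softmax over the vocabulary
            p[nxt] -= 1.0                        # gradient of cross-entropy w.r.t. logits
            grad_h = W @ p                       # computed before W is updated
            W -= lr * np.outer(h, p)
            E[cur] -= lr * grad_h
    return E, W

def predict_next(E, W, word_id):
    """Return the id of the most likely next word."""
    return int(np.argmax(E[word_id] @ W))

# Toy corpus (hypothetical): the word ids cycle 0 -> 1 -> 2 -> 3 -> 0 ...
E, W = train_next_word_model([0, 1, 2, 3] * 5, vocab_size=4)
next_ids = [predict_next(E, W, i) for i in range(4)]
```

Real word2vec training uses a context window and tricks such as negative sampling; this sketch keeps only the core step of learning vectors by predicting words.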
Word embeddings properties
From words to book representation
Represent a sequence of N words by mapping each word into the word2vec embedding space and averaging the word vectors.
Take M sequences of vectors as input for an RNN. Label every sequence with the genre of its source book.
Total Books: 3629
Total Samples: ~68000
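The construction above can be sketched as follows. This is one possible reading, stated as an assumption: each block of N consecutive words becomes one averaged vector, and M consecutive averaged vectors form one RNN sample. The function name, the `embeddings` dictionary (word to vector), and the handling of unknown words are hypothetical.

```python
import numpy as np

def book_to_samples(tokens, embeddings, n_words, m_steps, genre):
    """Turn a tokenized book into RNN samples of shape (m_steps, dim) plus labels."""
    dim = len(next(iter(embeddings.values())))
    # Average the vectors of every block of n_words consecutive words.
    chunk_vectors = []
    for start in range(0, len(tokens) - n_words + 1, n_words):
        block = tokens[start:start + n_words]
        vecs = [embeddings[w] for w in block if w in embeddings]
        # Assumption: a block with no known words becomes a zero vector.
        chunk_vectors.append(np.mean(vecs, axis=0) if vecs else np.zeros(dim))
    # Group m_steps consecutive averaged vectors into one sample.
    samples = [np.stack(chunk_vectors[i:i + m_steps])
               for i in range(0, len(chunk_vectors) - m_steps + 1, m_steps)]
    if not samples:
        return np.empty((0, m_steps, dim)), []
    # Every sample inherits the genre of the source book.
    return np.stack(samples), [genre] * len(samples)

# Toy usage with 2-dimensional vectors (hypothetical values).
toy_embeddings = {"a": np.array([1.0, 0.0]), "b": np.array([0.0, 1.0])}
X, y = book_to_samples(list("abaabbab"), toy_embeddings,
                       n_words=2, m_steps=2, genre="comedy")
```

Here `X` has shape `(2, 2, 2)`: two samples, each a sequence of two averaged vectors.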
Dataset Distribution
RNN Architecture
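The slide's diagram is not reproduced in this transcript. As a generic illustration only, a many-to-one recurrent network reads a sequence of vectors one step at a time and classifies from the final hidden state. The plain tanh cell, the parameter names, and all sizes below are assumptions, not the architecture from the slide.

```python
import numpy as np

def rnn_classify(sequence, W_xh, W_hh, b_h, W_hy, b_y):
    """Run a plain tanh RNN over `sequence` and return class probabilities."""
    h = np.zeros(W_hh.shape[0])
    for x in sequence:                       # one input vector per time step
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
    logits = h @ W_hy + b_y                  # classify from the last hidden state
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# Toy usage with random, untrained weights (hypothetical sizes).
rng = np.random.default_rng(0)
dim, hidden, n_classes = 4, 6, 9
probs = rnn_classify(rng.normal(size=(5, dim)),
                     rng.normal(size=(dim, hidden)), rng.normal(size=(hidden, hidden)),
                     np.zeros(hidden), rng.normal(size=(hidden, n_classes)),
                     np.zeros(n_classes))
```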
Proposed model
Results using RNN
2D visualization of the sample representations
https://goo.gl/e9jO38
Image captioning
Next level of computer vision
Image Captioning
● A step beyond image classification or object detection.
● Requires identifying complex relations between elements in the image.
● Additionally requires a generative model to build meaningful sentences.
● A hard task to evaluate.
● Proposed methods focus on achieving higher BLEU scores rather than on solving the problem.
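To make the evaluation point concrete, here is a simplified sentence-level BLEU sketch: clipped n-gram precision up to bigrams, combined by a geometric mean and a brevity penalty. It is an illustration written for these notes, not a reference implementation; standard BLEU uses up to 4-grams and is computed over a whole corpus.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each n-gram counts at most as often as in any reference."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())

def bleu(candidate, references, max_n=2):
    """Simplified sentence-level BLEU with uniform weights and a brevity penalty."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    # Reference whose length is closest to the candidate (ties: the shorter one).
    closest = min(references, key=lambda r: (abs(len(r) - len(candidate)), len(r)))
    bp = 1.0 if len(candidate) > len(closest) else math.exp(1 - len(closest) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

reference = "the cat is on the mat".split()
exact = bleu(reference, [reference])                                  # identical caption
paraphrase = bleu("a kitten sits upon a rug".split(), [reference])    # no shared words
```

The paraphrase shares no n-gram with the reference, so its score is zero even though a human might accept it as a caption. This is one reason a higher BLEU score does not by itself mean the task is solved.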
Previous approaches
● Detect objects using complex features
● Identify actions and relations in the scene
● Train a language model
● Integrate…
● Sentence retrieval
Neural Image Caption generator
Model
● CNN for images
● RNN for language modeling
● Backpropagation for training
Data
An End-to-End approach:
O. Vinyals, 2015
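A rough sketch of how the CNN and the RNN fit together at generation time, under several assumptions: the CNN is treated as a black box that has already produced `image_features`, a plain tanh RNN stands in for the LSTM used in the paper, and decoding is greedy. All parameter names and sizes are hypothetical, and with untrained random weights the output is meaningless. The point is only the data flow.

```python
import numpy as np

def greedy_caption(image_features, params, start_id, end_id, max_len=20):
    """Generate word ids greedily, conditioning the RNN on the image first."""
    W_img, W_emb = params["W_img"], params["W_emb"]
    W_xh, W_hh, W_out = params["W_xh"], params["W_hh"], params["W_out"]
    h = np.zeros(W_hh.shape[0])
    # Step 0: the projected image features are the first input to the RNN.
    h = np.tanh((image_features @ W_img) @ W_xh + h @ W_hh)
    word, caption = start_id, []
    for _ in range(max_len):
        # Each later step feeds the embedding of the previously chosen word.
        h = np.tanh(W_emb[word] @ W_xh + h @ W_hh)
        word = int(np.argmax(h @ W_out))
        if word == end_id:
            break
        caption.append(word)
    return caption

# Toy usage with random weights (hypothetical sizes).
rng = np.random.default_rng(0)
feat, emb, hid, vocab = 16, 8, 12, 10
params = {
    "W_img": rng.normal(size=(feat, emb)),
    "W_emb": rng.normal(size=(vocab, emb)),
    "W_xh": rng.normal(size=(emb, hid)),
    "W_hh": rng.normal(size=(hid, hid)),
    "W_out": rng.normal(size=(hid, vocab)),
}
caption_ids = greedy_caption(rng.normal(size=feat), params, start_id=0, end_id=1)
```

In training, the same forward pass is run with the ground-truth words as inputs, and backpropagation adjusts all weights jointly, which is what makes the approach end-to-end.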
Generated sentences
Attention models in Translation
D. Bahdanau 2014
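A minimal sketch of additive attention as used in neural translation: each encoder state is scored against the current decoder state, the scores are normalized with a softmax, and the context vector is the weighted sum of the encoder states. The parameter names and shapes are assumptions chosen for illustration.

```python
import numpy as np

def additive_attention(decoder_state, encoder_states, W_s, W_h, v):
    """Return (context, weights) for one decoding step.

    decoder_state: (d_s,)   encoder_states: (T, d_h)
    W_s: (d_s, d_a)         W_h: (d_h, d_a)         v: (d_a,)
    """
    # score_t = v . tanh(W_s^T s + W_h^T h_t), one score per source position
    scores = np.tanh(decoder_state @ W_s + encoder_states @ W_h) @ v
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax over source positions
    context = weights @ encoder_states       # weighted sum of encoder states
    return context, weights

# Toy usage with random values (hypothetical sizes).
rng = np.random.default_rng(0)
context, weights = additive_attention(
    rng.normal(size=5), rng.normal(size=(7, 6)),
    rng.normal(size=(5, 4)), rng.normal(size=(6, 4)), rng.normal(size=4))
```

The same mechanism carries over to image captioning when the encoder states are replaced by feature vectors of image regions.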
Attention models in Image Captioning
K. Xu, 2016
Visual Alignments
Generated phrases