
Text Classification and Convolutional Neural Networks

COSC 7336: Advanced Natural Language Processing, Fall 2017

Some content on these slides was borrowed from J&M

Today's lecture
★ Text Classification: task definition
★ Classical approaches to Text Classification
★ Convolutional Neural Networks (CNNs)
★ Recent work using CNNs for Text Classification problems
★ Demo: CNN for text
★ Practical

What do these books have in common?

Other tasks that can be solved as TC
★ Sentiment classification

★ Native language identification

★ Profiling

Formal definition of the TC task

★ Input:
○ a document d
○ a fixed set of classes C = {c1, c2, …, cJ}

★ Output: a predicted class c ∈ C
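As a minimal illustration of this input/output contract (a hypothetical sketch, not from the slides; the rule inside is a placeholder), a text classifier is just a function from a document to one class out of the fixed set C:

    from typing import Sequence

    CLASSES = ["spam", "ham"]   # the fixed set C = {c1, ..., cJ}

    def classify(d: str, classes: Sequence[str]) -> str:
        """Return a predicted class c in C for document d."""
        # Trivial rule-based placeholder; real systems use the statistical
        # methods discussed on the following slides.
        return classes[0] if "free money" in d.lower() else classes[1]

    print(classify("Get free money now!!!", CLASSES))   # -> spam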

Methods for TC tasks
★ Rule-based approaches
★ Machine Learning algorithms
○ Naive Bayes
○ Support Vector Machines
○ Logistic Regression
○ And now deep learning approaches

Naive Bayes for Text Classification
★ Simple approach
★ Based on the bag-of-words representation

Bag of words
The first reference to bag of words is attributed to a 1954 paper by Zellig Harris.
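As a quick illustration (a minimal sketch of my own, not from the slides), a bag-of-words representation just counts word occurrences and discards order:

    # Minimal bag-of-words sketch: count word occurrences, ignore order.
    from collections import Counter

    doc = "the cat sat on the mat"
    bow = Counter(doc.lower().split())
    print(bow)   # Counter({'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1})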

Naive Bayes
Probabilistic classifier: pick the class c ∈ C that is most probable given the document d (eq. 1):

ĉ = argmax_{c ∈ C} P(c | d)

According to Bayes' rule (eq. 2):

P(c | d) = P(d | c) P(c) / P(d)

Replacing eq. 2 into eq. 1:

ĉ = argmax_{c ∈ C} P(d | c) P(c) / P(d)

Dropping the denominator (P(d) is the same for every class):

ĉ = argmax_{c ∈ C} P(d | c) P(c)

Naive Bayes
A document d is represented as a set of features f1, f2, …, fn:

ĉ = argmax_{c ∈ C} P(f1, f2, …, fn | c) P(c)

How many parameters do we need to learn in this model?

Naive Bayes Assumptions
1. Position doesn't matter (bag of words)
2. Naive Bayes assumption: the probabilities P(fi | c) are independent given the class c and thus we can multiply them:

P(f1, f2, …, fn | c) = P(f1 | c) · P(f2 | c) · … · P(fn | c)

This leads us to:

c_NB = argmax_{c ∈ C} P(c) ∏ P(f | c)

Naive Bayes in Practice
We consider word positions:

c_NB = argmax_{c ∈ C} P(c) ∏_{i ∈ positions} P(wi | c)

We also do everything in log space, so the product becomes a sum:

c_NB = argmax_{c ∈ C} [ log P(c) + Σ_{i ∈ positions} log P(wi | c) ]

Naive Bayes: Training
How do we compute P̂(c) and P̂(wi | c)? With maximum likelihood estimates from the training data, plus add-one (Laplace) smoothing to avoid zero probabilities:

P̂(c) = N_c / N_doc

P̂(wi | c) = (count(wi, c) + 1) / (Σ_{w ∈ V} count(w, c) + |V|)
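A compact sketch of training and prediction with add-one smoothing (plain Python of my own, not the course's demo code):

    # Multinomial Naive Bayes with add-one (Laplace) smoothing: a minimal sketch.
    import math
    from collections import Counter, defaultdict

    def train_nb(docs, labels):
        n_docs = len(docs)
        prior = {c: math.log(n / n_docs) for c, n in Counter(labels).items()}
        counts = defaultdict(Counter)            # counts[c][w] = count(w, c)
        for d, c in zip(docs, labels):
            counts[c].update(d.lower().split())
        vocab = {w for cnt in counts.values() for w in cnt}
        loglik = {c: {w: math.log((counts[c][w] + 1) /
                                  (sum(counts[c].values()) + len(vocab)))
                      for w in vocab}
                  for c in prior}
        return prior, loglik, vocab

    def predict_nb(doc, prior, loglik, vocab):
        words = [w for w in doc.lower().split() if w in vocab]  # skip unknown words
        return max(prior, key=lambda c: prior[c] +
                   sum(loglik[c][w] for w in words))

    prior, loglik, vocab = train_nb(
        ["good great film", "boring bad film"], ["pos", "neg"])
    print(predict_nb("great film", prior, loglik, vocab))   # -> pos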

Is Naive Bayes a good option for TC?

Evaluation in TC
Confusion table:

                          Gold Standard
                          True                     False
Predicted    True         TP = true positives      FP = false positives
             False        FN = false negatives     TN = true negatives

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Evaluation in TC: Issues with Accuracy
Suppose we want to learn to classify each message in a web forum as "extremely negative" or not. We have collected gold standard data:

★ 990 instances are labeled as negative
★ 10 instances are labeled as positive
★ Test data has 100 instances (99 negative and 1 positive)
★ A dumb classifier can get 99% accuracy by always predicting "negative"! (see the baseline sketch below)
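A minimal sketch of that majority-class baseline (plain Python; the numbers mirror the example above):

    # Majority-class baseline: always predict "negative".
    gold = ["negative"] * 99 + ["positive"] * 1   # the 100 test instances
    pred = ["negative"] * len(gold)               # the dumb classifier
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    print(accuracy)   # 0.99 -- high accuracy, yet it never finds a positive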

More Sensible Metrics: Precision, Recall and F-measure

P = TP / (TP + FP)

R = TP / (TP + FN)

F-measure (F1) = 2PR / (P + R), the harmonic mean of P and R

                          Gold Standard
                          True                     False
Predicted    True         TP = true positives      FP = false positives
             False        FN = false negatives     TN = true negatives
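A small sketch computing these metrics from raw confusion-table counts (plain Python; the counts are made up for illustration):

    # Precision, recall, and F1 from confusion-table counts (hypothetical values).
    tp, fp, fn = 8, 2, 4
    precision = tp / (tp + fp)             # 0.8
    recall = tp / (tp + fn)                # ~0.667
    f1 = 2 * precision * recall / (precision + recall)
    print(round(precision, 3), round(recall, 3), round(f1, 3))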

What about Multi-class problems?
● Multi-class: |C| > 2

● P, R, and F-measure are defined for a single class

● We assume classes are mutually exclusive

● We use per class evaluation metrics

For class i:  P_i = TP_i / (TP_i + FP_i)        R_i = TP_i / (TP_i + FN_i)

Micro vs Macro Average
★ Macro average: measure performance per class and then average
★ Micro average: collect predictions for all classes, then compute TP, FP, FN, and TN over the pooled predictions
★ Weighted average: compute performance per label and then average, where each label score is weighted by its support (see the sketch after this list)
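A sketch of the three averaging modes using scikit-learn (assuming it is installed; the labels are made up):

    # Macro vs micro vs weighted averaging with scikit-learn (hypothetical labels).
    from sklearn.metrics import precision_recall_fscore_support

    gold = ["a", "a", "a", "b", "b", "c"]
    pred = ["a", "a", "b", "b", "b", "a"]

    for avg in ("macro", "micro", "weighted"):
        p, r, f, _ = precision_recall_fscore_support(
            gold, pred, average=avg, zero_division=0)
        print(f"{avg:>8}: P={p:.2f} R={r:.2f} F={f:.2f}")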

Example

Train/Test Data Separation

Convolutional Neural Networks

Visual Cortex

Neocognitron (Fukushima, 1980)

LeNet (LeCun, 1998)

Convolution

(Source: Feature extraction using convolution, Stanford Deep Learning Wiki)
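The convolution figures do not survive in this transcript; as a stand-in, here is a minimal numpy sketch of what a valid (no-padding) 2D convolution computes (my example, not the original figure):

    # Valid 2D convolution/cross-correlation: slide a kernel over the input and
    # take a weighted sum at each position (minimal numpy sketch).
    import numpy as np

    def conv2d_valid(image, kernel):
        kh, kw = kernel.shape
        oh = image.shape[0] - kh + 1
        ow = image.shape[1] - kw + 1
        out = np.zeros((oh, ow))
        for i in range(oh):
            for j in range(ow):
                out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
        return out

    image = np.arange(16, dtype=float).reshape(4, 4)
    kernel = np.array([[1.0, 0.0], [0.0, -1.0]])   # a simple edge-like filter
    print(conv2d_valid(image, kernel))              # 3x3 feature map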

Pooling or Subsampling

(Source: Karpathy, CS231n Convolutional Neural Networks for Visual Recognition)
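And a matching sketch of 2×2 max pooling with stride 2 (again my own example, standing in for the missing figure):

    # 2x2 max pooling with stride 2: keep the largest value in each window.
    import numpy as np

    def max_pool_2x2(x):
        h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2   # crop to even size
        x = x[:h, :w].reshape(h // 2, 2, w // 2, 2)
        return x.max(axis=(1, 3))

    x = np.array([[1, 3, 2, 4],
                  [5, 6, 7, 8],
                  [9, 2, 1, 0],
                  [3, 4, 5, 6]], dtype=float)
    print(max_pool_2x2(x))   # [[6. 8.] [9. 6.]]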

Properties
★ Local invariance
★ Compositionality

Adapted from: http://cs.nyu.edu/~fergus/tutorials/deep_learning_cvpr12/

CNNs for NLP
★ Like images, text exhibits local invariance properties that can be modeled by CNNs
★ CNNs are not as popular as recurrent neural networks (to be discussed next class) for text analysis, but there are many cases where they work quite well
★ Big advantage: CNNs can be trained efficiently since they take full advantage of parallelism (see the sketch after this list)
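A minimal sketch of a 1D convolution over word embeddings in Keras (assuming tensorflow.keras is available; all sizes are made-up placeholders, not the course demo):

    # Minimal 1D CNN for text: embed tokens, convolve over time, pool, classify.
    from tensorflow.keras import layers, models

    vocab_size, seq_len, embed_dim = 10000, 100, 128   # hypothetical sizes

    model = models.Sequential([
        layers.Input(shape=(seq_len,)),            # token-id sequences
        layers.Embedding(vocab_size, embed_dim),   # word embeddings
        layers.Conv1D(64, 5, activation="relu"),   # 1D convolution over time
        layers.GlobalMaxPooling1D(),               # max-over-time pooling
        layers.Dense(2, activation="softmax"),     # e.g. binary sentiment
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()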

A character-level CNN

Example from Sebastián Sierra: http://lin99.github.io/NLPTM-2016/4.Docs/cnn%20for%20text.pdf

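The step-by-step animation from the original slides does not survive in this transcript; as a stand-in, here is a minimal character-level CNN sketch in tensorflow.keras (alphabet size, input length, and layer sizes are my assumptions):

    # Minimal character-level CNN sketch (hypothetical sizes).
    # Characters are embedded, then convolved just like word sequences.
    from tensorflow.keras import layers, models

    n_chars, max_len = 70, 256                     # made-up alphabet size and length

    model = models.Sequential([
        layers.Input(shape=(max_len,)),            # character-id sequences
        layers.Embedding(n_chars, 16),             # small character embeddings
        layers.Conv1D(256, 7, activation="relu"),
        layers.MaxPooling1D(3),
        layers.Conv1D(256, 7, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(128, activation="relu"),
        layers.Dense(4, activation="softmax"),     # e.g. 4 topic classes
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    model.summary()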

Convolutional neural networks for sentence classification

Kim, Yoon. "Convolutional neural networks for sentence classification." arXiv preprint arXiv:1408.5882 (2014).

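Kim's model concatenates max-pooled feature maps from several filter widths; the paper uses filter windows of 3, 4, and 5 words with 100 feature maps each and dropout 0.5. A minimal functional-API sketch under the same tensorflow.keras assumption as above (vocabulary and sequence sizes are placeholders):

    # Sketch of a Kim (2014)-style multi-width CNN for sentence classification.
    # Trainable-from-scratch embeddings for brevity (the paper also uses word2vec).
    from tensorflow.keras import layers, models

    vocab_size, seq_len, embed_dim = 10000, 50, 300   # hypothetical sizes

    inp = layers.Input(shape=(seq_len,))
    emb = layers.Embedding(vocab_size, embed_dim)(inp)
    pooled = []
    for width in (3, 4, 5):                            # filter windows of 3, 4, 5 words
        c = layers.Conv1D(100, width, activation="relu")(emb)
        pooled.append(layers.GlobalMaxPooling1D()(c))  # max-over-time pooling
    h = layers.Concatenate()(pooled)
    h = layers.Dropout(0.5)(h)
    out = layers.Dense(2, activation="softmax")(h)

    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    model.summary()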

Recent work using CNNs: Text Classification
(Conneau et al., "Very Deep Convolutional Networks for Text Classification", 2017)
★ Architecture with up to 29 convolutional layers
★ Idea is to learn a hierarchical representation of text
★ Achieves state of the art on most datasets and outperforms earlier work using shallow CNNs
★ They reach state of the art on large data sets (> 630k training examples)
★ No statistical significance tests are reported
★ They couldn't outperform a hierarchical method adapted for multiple sentences

Recent work using CNNs: Authorship Attribution