Text Classification/Categorization

Post on 23-Jan-2017

NLP For Text Categorization/Classification

Domain: Natural Language Processing
Prepared By: Abhishek Oswal
Guide: Jayshree Ghorpade

Some Questions
• What is NLP?
• Detecting -> Patterns, Features, Models
• What is Classification?
• What is Text Classification?
• Why Text Classification?

Problem Type
• Supervised
  • You know about it (the classes are known in advance)
  • Labeled training data
  • Fruits analogy
• Unsupervised
  • You don't know about it
  • Unlabeled data

Supervised

• Regression and Classification
• Regression -> Real estate market: predict the price; the price is a continuous output
• Classification -> Whether it sells for more or less than the asked price; discrete output
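The real estate contrast above can be sketched in a few lines. This is only an illustration under stated assumptions: the toy `houses` data, the least-squares helper `fit_line`, and the function names are hypothetical and do not come from the slides.

```python
# Hypothetical toy data: (floor area, sale price, asked price).
houses = [
    (50, 100.0, 110.0),
    (80, 160.0, 150.0),
    (100, 200.0, 190.0),
    (120, 240.0, 250.0),
]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

areas = [house[0] for house in houses]
prices = [house[1] for house in houses]
slope, intercept = fit_line(areas, prices)

def predict_price(area):
    """Regression: the output is a continuous number."""
    return slope * area + intercept

def sold_above_asking(sale_price, asked_price):
    """Classification: the output is one of two discrete labels."""
    return "more" if sale_price > asked_price else "less"

labels = [sold_above_asking(sale, asked) for _, sale, asked in houses]
```

`predict_price` can return any real value, while `sold_above_asking` can only return one of two labels; that difference in output type is the whole distinction between the two tasks.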

Process of Classification
• Data preprocessing
• Training and test set
• Creation of model
• Algorithm
• Classify
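The five steps above can be walked through end to end with a minimal sketch. The spam/ham documents are made up, and the word-overlap voting rule is a deliberately simple stand-in for the "Algorithm" step, not a method named in the slides.

```python
import re
from collections import Counter, defaultdict

# Hypothetical labeled documents: (text, class).
data = [
    ("Win a free prize now", "spam"),
    ("Free money, claim your prize", "spam"),
    ("Meeting agenda for Monday", "ham"),
    ("Project meeting moved to Friday", "ham"),
    ("Claim your free money now", "spam"),
    ("Agenda for the project review", "ham"),
]

def preprocess(text):
    """Step 1: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Step 2: training and test set (fixed split so the run is reproducible).
train, test = data[:4], data[4:]

def train_model(examples):
    """Steps 3 and 4: the model is one word-frequency table per class."""
    model = defaultdict(Counter)
    for text, label in examples:
        model[label].update(preprocess(text))
    return model

def classify(model, text):
    """Step 5: choose the class whose training words overlap the text most."""
    tokens = preprocess(text)
    scores = {label: sum(counts[t] for t in tokens)
              for label, counts in model.items()}
    return max(scores, key=scores.get)

model = train_model(train)
predictions = [classify(model, text) for text, _ in test]
accuracy = sum(p == label for p, (_, label) in zip(predictions, test)) / len(test)
```

Holding out the last two documents mirrors the "Training and test set" step: the model never sees them during training, so `accuracy` measures how it behaves on unseen text.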

Methods To Represent
• Document-term matrix
• Bag of words
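Both representations can be illustrated together: each document becomes a bag of words (word counts, order ignored), and stacking those counts over a shared vocabulary gives a document-term matrix. The two sample sentences and the `tokenize` helper are assumptions made for this sketch.

```python
import re
from collections import Counter

# Hypothetical two-document corpus.
docs = ["the cat sat", "the dog sat on the cat"]

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Bag of words: one word-count table per document.
bags = [Counter(tokenize(doc)) for doc in docs]

# Shared vocabulary, sorted so the column order is stable.
vocabulary = sorted(set().union(*bags))

# Document-term matrix: one row per document, one column per term.
matrix = [[bag[term] for term in vocabulary] for bag in bags]

# vocabulary -> ['cat', 'dog', 'on', 'sat', 'the']
# matrix     -> [[1, 0, 0, 1, 1],
#                [1, 1, 1, 1, 2]]
```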

Methods To Classify
• Using Probability
  • Naive Bayes
• Using Graphs
  • Support Vector Machine
• Tree
  • Decision Tree

Naive Bayes
• Predictive model
• Conditional probability

Naive Bayes
• Independent features
• Prior
• Likelihood
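The terms on these two slides fit together in the standard Naive Bayes formulation, written out here for reference. The symbols are assumptions introduced for this note: c is a class, d a document, and w_1 ... w_n the words (features) of d. P(c) is the prior, P(w_i | c) the likelihood of each feature, and the product form is where the independence assumption enters.

```latex
P(c \mid d) = \frac{P(c)\, P(d \mid c)}{P(d)}, \qquad
P(d \mid c) \approx \prod_{i=1}^{n} P(w_i \mid c), \qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(w_i \mid c)
```

The denominator P(d) is the same for every class, so it can be dropped when only the most probable class is needed.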

Naive Bayes

Naive Bayes
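A minimal sketch of a Naive Bayes text classifier, assuming a multinomial word-count model with add-one (Laplace) smoothing. The smoothing, the toy training set, and all function names are assumptions added for illustration; the slides do not specify them.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train_naive_bayes(examples):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        tokens = tokenize(text)
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_docs, word_counts, vocabulary

def predict(model, text):
    """Return the class maximizing log prior + sum of log likelihoods."""
    class_docs, word_counts, vocabulary = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (word, class) pairs above zero.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data.
training = [
    ("Win a free prize now", "spam"),
    ("Free money, claim your prize", "spam"),
    ("Meeting agenda for Monday", "ham"),
    ("Project meeting moved to Friday", "ham"),
]
model = train_naive_bayes(training)
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and treating each word independently is exactly the "independent features" assumption.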

Another Method

Support Vector Machine

SVM
• Optimal plane

SVM
• Using margin
• Margin is no man's land

SVM
• Optimal plane would be the one with the biggest margin
• Equation of hyperplane
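The "biggest margin" idea can be made concrete with the usual hyperplane equation w · x + b = 0, where the margin of a separating plane is the distance from the plane to its closest training point. The sketch below does not train an SVM; it only compares two hand-picked candidate planes on made-up 2-D points, and every name and value in it is an assumption for illustration.

```python
import math

# Hypothetical 2-D points with class labels +1 / -1.
points = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

def decision_value(w, b, x):
    """Left-hand side of the hyperplane equation: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin(w, b, data):
    """Smallest signed distance from the plane to any point.

    Positive only if every point lies on its own class's side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * decision_value(w, b, x) / norm for x, y in data)

# Two candidate separating planes, given as (w, b).
candidates = {
    "diagonal": ((1.0, 1.0), -2.0),  # x1 + x2 - 2 = 0
    "vertical": ((1.0, 0.0), -1.0),  # x1 - 1 = 0
}

margins = {name: margin(w, b, points) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
```

Both planes separate the two classes, but the diagonal one leaves a wider empty band ("no man's land") around itself, so it is the better of the two candidates. An actual SVM searches over all planes for the one that maximizes this quantity.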

Applications

• Email classification
• Spam filtering
• News organization
• Classification of documents based on language
• Opinion mining
• E.g.
  • Sakaal Classifieds
  • Gmail spam mail detection

Example

Comparison

Naive Bayes
• Easy
• Fast
• Different classes

SVM
• Difficult
• Slow (training time)
• Binary output

Questions
• My questions first
• Classify my presentation
• Class -> Good / Bad

References
Fei Yu, Jiyao An, Hong Li, Mialiang Zhu and Ouyang Yang, "Intelligence Text Categorization Based on Bayes Algorithm", Proceedings of International Conference on Information Acquisition.