CIS 192 Python Programming: NLP

Harry Smith

University of Pennsylvania

April 12, 2017

Harry Smith (University of Pennsylvania) CIS 192 April 12, 2017

Outline

1 Natural Language Processing (NLP)
  - Motivation
  - Tokenization
  - Counting Word Frequencies
  - Generating Random Sentences
  - Part of Speech Tagging
  - Free Word Association
  - Sentiment Analysis
  - Next Steps

Natural Language Processing

source: researchperspectives.org

Language is Hard

How can a computer:
- Recognize the parts of speech in a sentence?
- Tell whether a sentence is positive or negative?
- Figure out which words are most commonly used?
- Summarize text?

Some Applications of NLP

- Sentiment analysis
- Spam filtering
- Plagiarism detection
- Document categorization
- Summarization
- Text search
- Much more...

Natural Language Tool Kit (NLTK)

- NLP toolkit for English in Python
- Developed at Penn in 2001!
- nltk.org

Terminology

- Corpus: a body of text
- Token: each meaningful "entity" in a string
  - Depending on context, tokens can be words, sentences, or paragraphs
- Part of speech: a category that words are assigned to
  - noun, verb, adjective, ...
- Stopwords: the most common words in a language, filtered out before NLP tasks
  - the, is, at, which, on, ...
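As a minimal sketch of stopword filtering: the `STOPWORDS` set and the `remove_stopwords` helper below are hypothetical names made up for illustration, and the set is a tiny hand-picked sample rather than a real list. NLTK provides a fuller English list through `nltk.corpus.stopwords.words('english')`, which needs a one-time data download.

```python
# Tiny illustrative stopword set (an assumption; real lists are much longer).
STOPWORDS = {"the", "is", "at", "which", "on", "of", "a"}

def remove_stopwords(tokens):
    """Return the tokens that are not stopwords, compared case-insensitively."""
    return [tok for tok in tokens if tok.lower() not in STOPWORDS]

tokens = ["The", "mitochondria", "is", "the", "powerhouse", "of", "the", "cell"]
content_words = remove_stopwords(tokens)
print(content_words)  # ['mitochondria', 'powerhouse', 'cell']
```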

Word Tokenization

>>> import nltk
>>> nltk.word_tokenize('The mitochondria is the powerhouse of the cell.')
['The', 'mitochondria', 'is', 'the', 'powerhouse', 'of', 'the', 'cell', '.']
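To show roughly what a word tokenizer does, here is a standard-library sketch built on a single regular expression. `simple_word_tokenize` is a hypothetical helper, not NLTK's algorithm; it happens to agree with NLTK on this sentence, but it will differ on contractions, abbreviations, and other cases that NLTK handles with extra rules.

```python
import re

def simple_word_tokenize(text):
    """Split text into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = simple_word_tokenize("The mitochondria is the powerhouse of the cell.")
print(tokens)
# ['The', 'mitochondria', 'is', 'the', 'powerhouse', 'of', 'the', 'cell', '.']
```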

Sentence Tokenization

>>> sentences = "Prof. Sanjeev Khanna taught CIS 320 last spring. It was a great class...and I wasn't able to get off the waitlist for CIS 677."
>>> nltk.sent_tokenize(sentences)
['Prof. Sanjeev Khanna taught CIS 320 last spring.',
 "It was a great class...and I wasn't able to get off the waitlist for CIS 677."]
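Sentence tokenization is harder than splitting on periods. The sketch below uses a naive rule (split after `.`, `!` or `?` followed by whitespace) to show the failure mode that the NLTK output on this slide avoids: the abbreviation "Prof." is treated as a sentence end. `naive_sent_split` is a hypothetical helper and the shortened input text is an assumption for illustration.

```python
import re

def naive_sent_split(text):
    """Split after '.', '!' or '?' when followed by whitespace (a naive rule)."""
    return re.split(r"(?<=[.!?])\s+", text)

text = "Prof. Khanna taught CIS 320 last spring. It was a great class."
naive = naive_sent_split(text)
print(naive)
# ['Prof.', 'Khanna taught CIS 320 last spring.', 'It was a great class.']
```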

Counting Words in a Corpus

Before today:

>>> from collections import defaultdict
>>> counts = defaultdict(int)
>>> for word in words:
...     counts[word] += 1

Better:

>>> from nltk import FreqDist
>>> counts = FreqDist(words)
>>> counts.most_common(10)  # => [('the', 49), ...]
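The same counting pattern works with only the standard library, since `FreqDist` builds on `collections.Counter` in current NLTK releases. The sample sentence below is made up for illustration.

```python
from collections import Counter

# Made-up sample text; any list of tokens works here.
words = "the cat sat on the mat and the dog sat too".split()

counts = Counter(words)           # maps each word to its number of occurrences
top_two = counts.most_common(2)   # the two most frequent (word, count) pairs
print(top_two)  # [('the', 3), ('sat', 2)]
```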

Creating "random" sentences from a corpus

- In probability theory, Markov chains are "memoryless"
- "Future state depends on current state only"
- To create a "random" sentence:
  - Take your current word
  - Add a new word that typically appears after your current word
  - Repeat!
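The three steps above can be sketched as a word-level Markov chain. This is one possible implementation under simple assumptions, not necessarily the one used in lecture: `build_chain` and `generate` are hypothetical names, the corpus is a made-up sentence, and the next word is drawn uniformly from the list of observed followers, so frequent followers are chosen more often.

```python
import random
from collections import defaultdict

def build_chain(words):
    """Map each word to the list of words observed immediately after it."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, max_words=10, rng=random):
    """Walk the chain from `start`, picking each next word at random."""
    sentence = [start]
    while len(sentence) < max_words:
        followers = chain.get(sentence[-1])
        if not followers:  # dead end: this word was never followed by anything
            break
        sentence.append(rng.choice(followers))
    return " ".join(sentence)

corpus = "the cat sat on the mat and the dog sat on the rug".split()
chain = build_chain(corpus)
print(generate(chain, "the", rng=random.Random(0)))  # seeded for repeatability
```

Because the next word depends only on the current word, the output is locally plausible but can drift; conditioning on the previous two words is a common refinement.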

Part of Speech Tagging

- Use nltk.pos_tag(list_of_tokens) to identify part-of-speech tags
- nltk.help.upenn_tagset() shows what each tag code means
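`nltk.pos_tag` returns a list of `(token, tag)` pairs, which makes it easy to filter by tag. The sketch below works on a hand-written sample in that shape so it runs without NLTK; the tags are illustrative Penn Treebank codes, not actual tagger output, and `nouns` is a hypothetical helper.

```python
# Hand-written sample in the (token, tag) shape that nltk.pos_tag returns;
# these tags are illustrative, not actual tagger output.
tagged = [("The", "DT"), ("dog", "NN"), ("chased", "VBD"),
          ("two", "CD"), ("cats", "NNS"), (".", ".")]

def nouns(tagged_tokens):
    """Keep tokens whose Penn Treebank tag starts with 'NN' (the noun tags)."""
    return [tok for tok, tag in tagged_tokens if tag.startswith("NN")]

noun_list = nouns(tagged)
print(noun_list)  # ['dog', 'cats']
```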

Free Word Association

- After hearing a word, what's the word that immediately comes to mind?
- "dog" → "cat" → "meow" → ...
- How would we do this? A simple way:
  - For each token in our corpus, count the occurrences of surrounding tokens
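The counting step can be sketched as follows, assuming "surrounding" means within a fixed window of positions on either side. The `cooccurrences` helper, the `window` parameter, and the sample sentence are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def cooccurrences(tokens, window=2):
    """For each token, count the tokens within `window` positions of it."""
    neighbors = defaultdict(Counter)
    for i, token in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own neighbor
                neighbors[token][tokens[j]] += 1
    return neighbors

tokens = "the dog chased the cat and the cat chased the dog".split()
neighbors = cooccurrences(tokens, window=1)
print(neighbors["dog"].most_common())  # [('the', 2), ('chased', 1)]
```

The top neighbor of "dog" here is "the", which is one reason stopwords are usually filtered out before looking up the most common association.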

Sentiment Analysis

- Is a particular text positive or negative (and to what degree)?
- How might we do this?
  - Machine learning (last two lectures)
    - Try to "learn" the sentiment-relevant features of text
    - Needs lots of training data
    - Data-driven approach
  - Rule-based methods
    - "Rule of thumb": uses heuristics to determine sentiment
    - Needs little training data
    - Good for production: fast, but harder to create initially
    - VADER: a popular rule-based model aimed at social media
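To give a feel for the rule-based approach, here is a toy scorer with a word lexicon and one heuristic: a negation word flips the sign of the word right after it. This is not VADER's algorithm, which uses a much larger lexicon and more rules; `LEXICON`, `NEGATIONS`, and `sentiment_score` are hypothetical names with made-up values.

```python
# Tiny hand-made lexicon and negation list (assumptions for illustration only).
LEXICON = {"great": 2, "good": 1, "love": 2, "bad": -1, "terrible": -2, "hate": -2}
NEGATIONS = {"not", "never", "no"}

def sentiment_score(tokens):
    """Sum word scores, flipping the sign of a word right after a negation."""
    score = 0
    negate = False
    for tok in tokens:
        word = tok.lower()
        if word in NEGATIONS:
            negate = True
            continue
        value = LEXICON.get(word, 0)  # words outside the lexicon are neutral
        score += -value if negate else value
        negate = False  # a negation only affects the next word
    return score

print(sentiment_score("It was a great class".split()))  # 2
print(sentiment_score("This is not good".split()))      # -1
```

A positive total suggests positive sentiment and a negative total the opposite; the magnitude is a rough answer to "to what degree?".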

Next Steps

- CIS 526: Machine Translation
- CIS 530: Computational Linguistics
- Kaggle for large datasets, competitions
- awesome-nlp: curated list of NLP resources on GitHub
