Lecture 6: Comparing Things Word Similarity
Methods in Computational Linguistics II
Queens College
Lecture 6: Comparing Things Word Similarity
Today
• List Comprehensions
• Determining Word Similarity
• Co-occurrences
• WordNet
List Comprehensions
• Compact way to process every item in a list.
• [x for x in array]
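As a minimal sketch, the comprehension above does the same work as a for loop that appends each item to a new list. The name array comes from the slide; its contents here are made up for illustration.

```python
# Hypothetical sample data; the slides do not say what `array` holds.
array = ["the", "cat", "sat", "on", "that", "table"]

# Explicit loop: append every item to a new list.
copied_with_loop = []
for x in array:
    copied_with_loop.append(x)

# Equivalent list comprehension.
copied = [x for x in array]

print(copied == copied_with_loop)  # True
```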
Methods
• Functions and methods can be applied to the iteration variable, x.
• The values they return are stored in the resulting list.
• [len(x) for x in array]
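A small illustration of this slide, assuming array holds strings (the sample words are hypothetical): the expression before the for keyword is evaluated once per item, and the results form the new list.

```python
# Hypothetical sample data.
array = ["the", "cat", "sat", "on", "that", "table"]

# Apply a built-in function to each item.
lengths = [len(x) for x in array]
print(lengths)  # [3, 3, 3, 2, 4, 5]

# Apply a string method to each item.
shouted = [x.upper() for x in array]
print(shouted)  # ['THE', 'CAT', 'SAT', 'ON', 'THAT', 'TABLE']
```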
Conditionals
• Elements of the original list can be left out of the resulting list by adding an if condition.
• [x for x in array if len(x) == 3]
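A short sketch of the filter on this slide, again with hypothetical sample words. Only items for which the condition is true reach the result, so the output can be shorter than the input.

```python
# Hypothetical sample data.
array = ["the", "cat", "sat", "on", "that", "table"]

# Keep only the three-letter words.
three_letter = [x for x in array if len(x) == 3]
print(three_letter)  # ['the', 'cat', 'sat']
```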
Building up
• These can be combined to build up complicated lists
• [x.upper() for x in array if len(x) > 3 and x.startswith('t')]
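The combined form can be tried as follows. The sample words are an assumption; the comprehension itself is the one from the slide, which filters first and then transforms the items that pass.

```python
# Hypothetical sample data.
array = ["the", "cat", "sat", "on", "that", "table"]

# Keep words longer than three letters that start with 't', then upper-case them.
result = [x.upper() for x in array if len(x) > 3 and x.startswith('t')]
print(result)  # ['THAT', 'TABLE']
```

Note that "the" is dropped: it starts with 't' but has exactly three letters, and the condition requires more than three.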
Lists Containing Lists
• Lists can contain lists
• [['a', 1], ['b', 2], ['d', 4]]
• ...or tuples
• [('a', 1), ('b', 2), ('d', 4)]
• [[d, d*d] for d in array if d < 4]
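A sketch of the last bullet, assuming array holds small integers (the values are made up). Each item of the result is itself a two-element list or tuple.

```python
# Hypothetical sample data.
array = [1, 2, 3, 4, 5]

# Each result item is a list pairing a number with its square.
squares_as_lists = [[d, d * d] for d in array if d < 4]
print(squares_as_lists)  # [[1, 1], [2, 4], [3, 9]]

# The same idea with tuples, which cannot be modified after creation.
squares_as_tuples = [(d, d * d) for d in array if d < 4]
print(squares_as_tuples)  # [(1, 1), (2, 4), (3, 9)]
```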
Lists within lists are often called 2-d arrays
• This is another way we store tables.
• Similar to nested dictionaries.
• a = [[0, 1], [1, 0]]
• a[1][1]
• a[0][0]
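The indexing on this slide can be checked with a short sketch. The first index selects a row (an inner list) and the second selects an item within that row. The nested dictionary named table is a hypothetical counterpart added only for comparison.

```python
# The 2-d array from the slide.
a = [[0, 1], [1, 0]]

print(a[1][1])  # 0  (row 1, column 1)
print(a[0][0])  # 0  (row 0, column 0)
print(a[0][1])  # 1  (row 0, column 1)

# A nested dictionary storing the same table, keyed by row and then column.
table = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
print(table[0][1])  # 1

# Visit every cell, row by row.
cells = [value for row in a for value in row]
print(cells)  # [0, 1, 1, 0]
```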
Using multiple lists
• Multiple lists can be processed in a single list comprehension; every element of the first list is paired with every element of the second, as in nested for loops
• [x*y for x in array1 for y in array2]
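A brief sketch of what two for clauses do, using made-up numbers. The comprehension behaves like nested loops and yields one result per (x, y) combination. To walk two lists in step instead, the built-in zip is one option; array3 is a hypothetical third list introduced only for that contrast.

```python
# Hypothetical sample data.
array1 = [1, 2, 3]
array2 = [10, 20]

# Two for clauses: every x is combined with every y (3 * 2 = 6 results).
products = [x * y for x in array1 for y in array2]
print(products)  # [10, 20, 20, 40, 30, 60]

# Walking two equal-length lists in step uses zip instead.
array3 = [10, 20, 30]
pairwise = [x * y for x, y in zip(array1, array3)]
print(pairwise)  # [10, 40, 90]
```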
Co-occurrences
• How would you identify common co-occurrences?
• Define a co-occurrence:
– “school bus” vs. “school river”
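One possible answer, sketched under a simplifying assumption: a co-occurrence is taken to mean two words appearing next to each other. The slide leaves the definition open, so this is only one choice among several (a wider window or same-sentence counts would also work). The sample sentence is invented for illustration.

```python
from collections import Counter

# Hypothetical sample text, split on whitespace.
text = "the school bus stopped near the river and the school bus left"
tokens = text.split()

# Assumption: a co-occurrence is a pair of adjacent words (a bigram).
pairs = [(first, second) for first, second in zip(tokens, tokens[1:])]
counts = Counter(pairs)

print(counts[("school", "bus")])    # 2
print(counts[("school", "river")])  # 0
```

Frequent pairs such as "school bus" stand out by count, while pairs that never appear together, such as "school river", get a count of zero.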
How are words related?
Some relations
Anything else?
• What relationships would you like to know about between words?
WordNet
Synsets
Other relationships in WordNet
WordNet Similarity
WordNet Similarity
Word sense disambiguation
Stemming and Lemmatizing
Stemming and Lemmatization in NLTK
WordNet Demo
Next Time
• Word Similarity
– WordNet
• Data structures
– 2-d arrays
– Trees
– Graphs