Introduction to text mining
![Page 1: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/1.jpg)
Introduction to text mining
Lars Juhl Jensen
>10 km
![Page 2: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/2.jpg)
exponential growth
![Page 3: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/3.jpg)
![Page 4: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/4.jpg)
~45 seconds per paper
![Page 5: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/5.jpg)
text mining
![Page 6: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/6.jpg)
information retrieval
![Page 7: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/7.jpg)
find the relevant papers
![Page 8: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/8.jpg)
user-specified query
![Page 9: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/9.jpg)
“yeast AND cell cycle”
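
A query like the one above can be read as a Boolean AND over terms. The following is a minimal sketch of that idea, not the search engine behind the slide: the toy corpus, the function names, and the naive case-insensitive substring matching are all assumptions made for illustration.

```python
def matches_and_query(text, query):
    """Return True if every AND-separated term occurs in the text.

    Matching is a naive, case-insensitive substring test (an assumption;
    real retrieval systems tokenize, stem, and rank).
    """
    terms = [term.strip().lower() for term in query.split(" AND ")]
    lowered = text.lower()
    return all(term in lowered for term in terms)


def retrieve(docs, query):
    """Return the ids of all documents that satisfy the query."""
    return [doc_id for doc_id, text in docs.items()
            if matches_and_query(text, query)]


# Hypothetical toy corpus, invented for this example.
docs = {
    "doc1": "Regulation of the cell cycle in budding yeast.",
    "doc2": "Yeast metabolism under glucose limitation.",
    "doc3": "Cell cycle checkpoints in human cells.",
}

hits = retrieve(docs, "yeast AND cell cycle")
print(hits)
```

Only the document that mentions both "yeast" and "cell cycle" is returned; documents matching a single term are dropped.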
![Page 10: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/10.jpg)
![Page 11: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/11.jpg)
entity recognition
![Page 12: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/12.jpg)
identify the concepts
![Page 13: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/13.jpg)
comprehensive lexicon
![Page 14: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/14.jpg)
orthographic variation
![Page 15: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/15.jpg)
“black list”
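
The three ingredients named on the preceding slides (a lexicon, tolerance for orthographic variation, and a black list) can be combined in a small dictionary-based tagger. This is a rough sketch under stated assumptions: the lexicon entries, the black-listed name, and the rule that variation means case changes plus an optional hyphen or space between letters and digits are all hypothetical choices, and the black list is assumed to hold names that collide with common non-entity terms.

```python
import re


def variant_pattern(name):
    """Build a regex matching a name regardless of case, with an optional
    hyphen or space between its letter and digit parts (e.g. CDC28,
    Cdc-28, cdc 28)."""
    parts = re.findall(r"[A-Za-z]+|[0-9]+", name)
    body = r"[\s\-]?".join(re.escape(part) for part in parts)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


def tag_entities(text, lexicon, blacklist):
    """Return (surface form, identifier) pairs in order of appearance,
    skipping any lexicon name that is on the black list."""
    hits = []
    for name, identifier in lexicon.items():
        if name in blacklist:
            continue
        for match in variant_pattern(name).finditer(text):
            hits.append((match.start(), match.group(0), identifier))
    return [(surface, identifier) for _, surface, identifier in sorted(hits)]


# Hypothetical lexicon and black list, invented for this example.
lexicon = {"CDC28": "CDC28", "HAP1": "HAP1", "SDS": "SDS"}
blacklist = {"SDS"}  # assumed to be ambiguous with the detergent

text = "Cdc-28 and cdc 28 interact with HAP1; samples were lysed in SDS."
entities = tag_entities(text, lexicon, blacklist)
print(entities)
```

Both spellings of CDC28 resolve to the same identifier, while the black-listed name is never tagged even though it appears in the text.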
![Page 16: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/16.jpg)
Reflect
![Page 17: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/17.jpg)
augmented browsing
![Page 18: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/18.jpg)
Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009
![Page 19: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/19.jpg)
used by publishers
![Page 20: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/20.jpg)
![Page 21: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/21.jpg)
information extraction
![Page 22: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/22.jpg)
formalize the facts
![Page 23: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/23.jpg)
co-mentioning
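
Co-mentioning can be illustrated by counting how often two recognized entities appear in the same document. The sketch below assumes entity recognition has already produced one list of names per document; the input data and the function name are hypothetical, and raw counts stand in for whatever scoring a real system would apply.

```python
from collections import Counter
from itertools import combinations


def comention_counts(doc_entities):
    """Count, for every unordered pair of entities, the number of
    documents in which both are mentioned."""
    counts = Counter()
    for entities in doc_entities:
        # Deduplicate within a document and sort so each pair has one key.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return counts


# Hypothetical per-document entity lists, invented for this example.
doc_entities = [
    ["HAP1", "CYC1", "CYC7"],
    ["HAP1", "CYC1", "CYC1"],
    ["CDC28", "CLN2"],
]

counts = comention_counts(doc_entities)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs that recur across documents, such as CYC1 and HAP1 here, rise to the top; pairs seen once are weaker evidence of an association.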
![Page 24: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/24.jpg)
NLP: Natural Language Processing
![Page 25: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/25.jpg)
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
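
To make the bracketed notation above easier to follow, here is a small sketch that reads it into a nested structure and pulls out the chunks carrying a given label. It only parses the notation as printed on the slide; it is not the NLP system that produced it, the function names are hypothetical, and no meaning is assumed for the labels beyond their spelling.

```python
import re


def parse_chunks(annotated):
    """Parse '[label text [label text]]' into a list whose items are
    plain strings or (label, children) tuples."""
    tokens = re.findall(r"\[\w+|\]|[^\[\]]+", annotated)
    pos = 0

    def parse(inside_chunk):
        nonlocal pos
        items = []
        while pos < len(tokens):
            token = tokens[pos]
            pos += 1
            if token == "]":
                if inside_chunk:
                    return items
                raise ValueError("unexpected closing bracket")
            if token.startswith("["):
                items.append((token[1:], parse(True)))
            elif token.strip():
                items.append(token.strip())
        if inside_chunk:
            raise ValueError("unclosed chunk")
        return items

    return parse(False)


def chunks_with_label(tree, label):
    """Collect the direct text of every chunk with the given label,
    in reading order."""
    found = []
    for item in tree:
        if isinstance(item, tuple):
            name, children = item
            if name == label:
                found.append(" ".join(c for c in children if isinstance(c, str)))
            found.extend(chunks_with_label(children, label))
    return found


sentence = ("[nxexpr The expression of [nxgene the cytochrome genes "
            "[nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]")
tree = parse_chunks(sentence)
print(chunks_with_label(tree, "nxpg"))
```

The two `nxpg` chunks sit on either side of the verb phrase "is controlled by", which is the kind of structure the "Verbs for relation extraction" label on the slide points at.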
![Page 26: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/26.jpg)
molecular networks
![Page 27: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/27.jpg)
![Page 28: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/28.jpg)
information on side effects
![Page 29: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/29.jpg)
Campillos & Kuhn et al., Science, 2008
![Page 30: Introduction to text mining](https://reader036.fdocuments.us/reader036/viewer/2022062616/54b3e2b24a7959bf068b4587/html5/thumbnails/30.jpg)
Acknowledgments
Sean O’Donoghue
Sune Frankild
Heiko Horn
Evangelos Pafilis
Michael Kuhn
Reinhard Schneider