Search Pubmed With R Part3
![Page 1: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/1.jpg)
Search PubMed with R
Part 3
![Page 2: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/2.jpg)
Query PubMed titles for systemic lupus erythematosus with R package RISmed [1]
# Type the following in the R console:
library(RISmed)
lupus <- EUtilsSummary('lupus[Ti] erythematosus[ti] systemic[Ti]', retmax=200)
# retmax refers to the maximum number of records to retrieve; the default is 1000.
fetch.lupus <- EUtilsGet(lupus)
fetch.lupus
# Results: PubMed query: lupus[Ti] AND erythematosus[ti] AND systemic[Ti] Records: 200
lupus.tit <- ArticleTitle(fetch.lupus)
lupus.tit[1:10] # to view the first 10 results of titles
# export results to text file
write(lupus.tit, file="lupusRISmedTi.txt")
References
1- RISmed package: Stephanie Kovalchik (2013). RISmed: Download content from NCBI databases. R package version 2.1.0. http://CRAN.R-project.org/package=RISmed
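The query string above combines three title-tagged terms, and the Results line shows PubMed echoing them back joined by AND. As a minimal sketch, such a string could be assembled in base R as follows. `build_title_query` is a hypothetical helper introduced here for illustration (it is not part of RISmed), and writing the AND explicitly is an assumption based on the echoed query.

```r
# Hypothetical helper (not part of RISmed): tag each term with the
# PubMed title field [Ti] and join the terms with AND.
build_title_query <- function(terms) {
  paste0(terms, "[Ti]", collapse = " AND ")
}

query <- build_title_query(c("lupus", "erythematosus", "systemic"))
print(query)

# The resulting string could then be passed on, for example:
# lupus <- EUtilsSummary(query, retmax = 200)
```

Building the string from a vector makes it easier to reuse the same steps for another set of title terms.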
![Page 3: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/3.jpg)
Query PubMed titles for systemic lupus erythematosus using RISmed
![Page 4: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/4.jpg)
View results of the exported text file
Export results to a text file with the R command line:
write(lupus.tit, file="lupusRISmedTi.txt")
# export title results as a text file and open the file in Excel or any other valid text editor
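For a character vector, `write()` puts each element on its own line, which is why the exported file can later be treated as one title per line. A small sketch to check this, using made-up titles and a temporary file (both are assumptions for illustration, not the PubMed results):

```r
# Illustrative titles (made up for this sketch, not real PubMed output).
titles <- c("First example title.",
            "Second example title.",
            "Third example title.")

# write() puts each element of a character vector on its own line.
out_file <- file.path(tempdir(), "exampleTitles.txt")
write(titles, file = out_file)

# Read the file back to confirm there is one title per line.
titles_back <- readLines(out_file)
print(length(titles_back))
print(identical(titles, titles_back))
```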
![Page 5: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/5.jpg)
Find the Title Verb Relation with ReVerb
ReVerb [1] is an open information extraction program, distributed as an executable jar, developed by the University of Washington's Turing Center.
• ReVerb depends on Java; it is not an R program.
• ReVerb is powerful and provides useful information about the relational structure of a text. It is relatively easy to use and runs very fast.
• In our case we will apply ReVerb to our text file of title results.
Reference:
1- @inproceedings{ReVerb2011, author = {Anthony Fader and Stephen Soderland and Oren Etzioni}, title = {Identifying Relations for Open Information Extraction}, booktitle = {Proceedings of the Conference of Empirical Methods in Natural Language Processing ({EMNLP} '11)}, year = {2011}, month = {July 27-31}, address = {Edinburgh, Scotland, UK} }
![Page 6: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/6.jpg)
Install ReVerb
You can download the latest ReVerb jar from http://reverb.cs.washington.edu/reverb-latest.jar
This executable jar file is easy to run from the MS-DOS command line.
At https://github.com/knowitall/reverb/ you can find how to use ReVerb. It provides the following example, which illustrates what it does:
“ReVerb takes raw text as input, and outputs (argument1, relation phrase, argument2) triples. For example, given the sentence "Bananas are an excellent source of potassium," ReVerb will extract the triple (bananas, be source of, potassium).”
In order to run ReVerb you need to have Java installed on your computer. You can install Java from https://www.java.com/en/download/
![Page 7: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/7.jpg)
Use of ReVerb
Place the reverb-latest.jar file and the result file “lupusRISmedTi.txt” in the same folder.
The figure shows an example of the 2 files in the same folder (which we named Reverb-Java).
![Page 8: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/8.jpg)
Use of ReVerb
1- Open the MS-DOS cmd and change to the folder (Reverb-Java in our example) containing both files: reverb-latest.jar and lupusRISmedTi.txt
![Page 9: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/9.jpg)
Use of ReVerb
2- Type the following cmd line to view results on the console:
java -Xmx512m -jar reverb-latest.jar lupusRISmedTi.txt
Results are displayed in the MS-DOS window.
![Page 10: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/10.jpg)
Use of ReVerb - export the results to an xls file
3- Type the following cmd line to export results to a file:
java -Xmx512m -jar reverb-latest.jar lupusRISmedTi.txt > ReverbLupusRISmedTi.txt
(The name given to the file was ReverbLupusRISmedTi.txt. You can use another name, or even give it an xls extension by typing ReverbLupusRISmedTi.xls; the content is still tab-separated text, which Excel can open.)
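The same command could also be launched from within R, so that the whole workflow stays in one script. The sketch below is only an illustration under assumptions: `reverb_args` is a hypothetical helper, the file names are the ones used above, and the call is skipped unless Java and both files are actually found in the working directory.

```r
# Hypothetical helper: build the argument vector for the java call shown above.
reverb_args <- function(input, jar = "reverb-latest.jar", heap = "512m") {
  c(paste0("-Xmx", heap), "-jar", jar, input)
}

java_args <- reverb_args("lupusRISmedTi.txt")
print(java_args)

# Run only when Java and both files are available; otherwise do nothing.
if (nzchar(Sys.which("java")) &&
    file.exists("reverb-latest.jar") &&
    file.exists("lupusRISmedTi.txt")) {
  # stdout = <file name> redirects the output, like ">" on the command line.
  system2("java", args = java_args, stdout = "ReverbLupusRISmedTi.txt")
}
```

Passing a file name to the `stdout` argument of `system2()` plays the role of the `>` redirection in the cmd line.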
![Page 11: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/11.jpg)
Open the ReVerb result file ReverbLupusRISmedTi.txt with MS Excel
![Page 12: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/12.jpg)
ReVerb output
The ReVerb output has 18 columns (see results in the Excel file).
The most interesting are:
Col 3 (Col C): Argument1
Col 4 (Col D): Verb relation phrase
Col 5 (Col E): Argument2
(Col 12 refers to the confidence that this extraction is correct, and col 2 refers to the sentence number where the extraction came from.)
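The column layout described above can also be handled directly in R rather than in Excel. The following sketch reads an 18-column, tab-separated file, keeps columns 2, 3, 4, 5 and 12, and filters on the confidence column. The two sample rows, the placeholder values, the column names, and the 0.5 threshold are all assumptions made up for illustration; they are not real ReVerb output.

```r
# Build made-up 18-column, tab-separated rows (placeholder values only).
make_row <- function(sentence_no, arg1, rel, arg2, conf) {
  fields <- rep("x", 18)   # placeholder for the columns not used here
  fields[2]  <- sentence_no
  fields[3]  <- arg1
  fields[4]  <- rel
  fields[5]  <- arg2
  fields[12] <- conf
  paste(fields, collapse = "\t")
}

sample_file <- file.path(tempdir(), "reverbSample.txt")
writeLines(c(
  make_row(1, "bananas", "are a source of", "potassium", 0.91),
  make_row(2, "example argument", "is related to", "another argument", 0.35)
), sample_file)

d <- read.table(sample_file, sep = "\t", quote = "", comment.char = "",
                header = FALSE, stringsAsFactors = FALSE)

# Keep the columns described above and give them readable (assumed) names.
triples <- d[, c(2, 3, 4, 5, 12)]
names(triples) <- c("sentence", "arg1", "relation", "arg2", "confidence")

# Keep only extractions at or above an arbitrary confidence threshold.
confident <- triples[triples$confidence >= 0.5, ]
print(confident)
```

Filtering on confidence before building the word cloud is optional; the threshold would need to be chosen by inspecting the actual extractions.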
![Page 13: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/13.jpg)
ReVerb Results
Results of the first 5 rows (Excel) from columns 3-5:
1- childhood-onset systemic lupus erythematosus | is associated with | ethnicity
2- renal involvement | are lower in | ACE inhibitor-treated patients
3- Prednisone | induced | two-way myocardial development
4- Acetylated histones | contribute to | the immunostimulatory potential of Neutrophil Extracellular Traps
5- clinical practice | monitor the impact of | systemic lupus erythematosus
Note: on the slide, blue color refers to argument 1, white color is the verb relation, and orange color refers to argument 2; here the three parts are separated by "|".
![Page 14: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/14.jpg)
Prepare ReVerb results data for R Wordcloud
# use the read.table script (from reference 1) as follows:
d <- read.table('ReverbLupusRISmedTi.txt', quote='', comment.char='', allowEscapes=FALSE, sep='\t', header=FALSE, as.is=TRUE, stringsAsFactors=FALSE)
# make sure the data is a data frame (read.table already returns one)
e <- as.data.frame(d)
# merge columns (3-5) into a single text sentence
f <- paste(e$V3, e$V4, e$V5)
f[1:3] # view the first 3 lines
# [1] "childhood-onset systemic lupus erythematosus is associated with ethnicity"
# [2] "renal involvement are lower in ACE inhibitor-treated patients"
# [3] "Prednisone induced two-way myocardial development"
Reference:
1- John Mount. Please stop using Excel-like formats to exchange data. December 7th, 2012.
![Page 15: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/15.jpg)
Represent ReVerb Results in R Wordcloud
library(tm)
my.corpus <- Corpus(VectorSource(f))
summary(my.corpus)
inspect(my.corpus[1:3])
my.corpus <- tm_map(my.corpus, removeWords, stopwords("english"))
#my.corpus <- tm_map(my.corpus, stemDocument)
myTdm <- TermDocumentMatrix(my.corpus, control = list(wordLengths=c(1,Inf)))
myTdm
# A term-document matrix (140 terms, 26 documents)
# Non-/sparse entries: 163/3477
# Sparsity : 96%
# Maximal term length: 22
# Weighting : term frequency (tf)
![Page 16: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/16.jpg)
Represent ReVerb Results in R Wordcloud
findFreqTerms(myTdm, lowfreq=2)
# [1] "associated" "damage" "distinct" "erythematosus"
# [5] "increased" "independently" "lupus" "systemic"
termFrequency <- rowSums(as.matrix(myTdm))
termFrequency <- subset(termFrequency, termFrequency>=10)
m <- as.matrix(myTdm)
wordFreq <- sort(rowSums(m), decreasing=TRUE) # This yields word frequency
library(wordcloud)
library(RColorBrewer) # provides brewer.pal(); loaded explicitly here
set.seed(375)
pal1 <- brewer.pal(6,"Dark2")
wordcloud(words=names(wordFreq), freq=wordFreq, scale=c(2,.9), min.freq=1, random.order=FALSE, colors=pal1)
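As a cross-check on what `wordFreq` contains, the same kind of word count can be sketched in base R without tm. The sentences and the short stop word list below are made up for illustration (tm's `stopwords("english")` list is much longer), so this is an analogy to the steps above rather than a reproduction of them.

```r
# Made-up sentences standing in for the merged triples in f.
f_example <- c("lupus is associated with ethnicity",
               "lupus damage is increased",
               "damage is associated with age")

# Lower-case, split on whitespace, and flatten into one vector of words.
words <- unlist(strsplit(tolower(f_example), "\\s+"))

# Drop a small hand-picked stop word list (an assumption for this sketch).
words <- words[!words %in% c("is", "with", "the", "of", "in")]

# Count and sort, analogous to sort(rowSums(m), decreasing = TRUE).
word_freq <- sort(table(words), decreasing = TRUE)
print(word_freq)
```

The names and counts of `word_freq` correspond to the `words` and `freq` arguments passed to `wordcloud()`.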
![Page 17: Search Pubmed With R Part3](https://reader031.fdocuments.us/reader031/viewer/2022030309/577cc9ca1a28aba711a4a1ac/html5/thumbnails/17.jpg)
R Wordcloud of ReVerb Results