Pruning cooccurrence networks
Bibliometric networks
Density problem
Dense networks are hard to visualize and interpret
Solution: pruning networks
PathFinder (Schvaneveldt, 1990)
Deleting low-weight links (De Nooy, Mrvar, and Batagelj, 2005)
Cocitation and bibliographic coupling (Persson, 2010)
Threshold for cosine values (Leydesdorff, 2007; Egghe & Leydesdorff, 2009)
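To make the last approach in the list concrete, here is a minimal sketch of pruning by a cosine threshold. It assumes binary occurrences, so the cosine of two nodes is their cooccurrence count divided by the square root of the product of their occurrence counts. The function name, the toy data, and the threshold of 0.5 are illustrative assumptions, not values from the cited papers.

```python
from math import sqrt

def cosine_prune(cooc, occurrences, threshold):
    """Keep links whose cosine-normalized cooccurrence meets the threshold.

    cooc: dict mapping (a, b) pairs to raw cooccurrence counts
    occurrences: dict mapping each node to its total occurrence count
    """
    kept = {}
    for (a, b), count in cooc.items():
        cosine = count / sqrt(occurrences[a] * occurrences[b])
        if cosine >= threshold:
            kept[(a, b)] = cosine
    return kept

# Hypothetical toy data: three authors and their cocitation counts.
occurrences = {"A": 4, "B": 4, "C": 9}
cooc = {("A", "B"): 3, ("A", "C"): 2, ("B", "C"): 1}
pruned = cosine_prune(cooc, occurrences, threshold=0.5)
```

Only the A-B link survives here; the choice of threshold is left to the analyst and has no direct statistical interpretation.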
Cooccurrence networks
E.g. cocitation, bibliographic coupling, coauthorship…
Especially prone to density problem
[Figure: a two-mode network (e.g., authors and citing papers) and the corresponding cooccurrence network]
Methods
Steps
Based on Zweig and Kaufmann (2011): we start from a two-mode network
1. Define pattern of interest
2. Determine interestingness of cooccurrence
3. If cooccurrence is interesting, authors are linked
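As a rough illustration of the starting point for these steps, the sketch below counts raw cooccurrences in a small two-mode network. It assumes the pattern of interest is simply "two authors cited by the same paper"; the data structure, function name, and toy papers are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(two_mode):
    """Count, for each pair of authors, how many papers are linked to both.

    two_mode: dict mapping each citing paper to the set of authors it cites.
    """
    counts = Counter()
    for authors in two_mode.values():
        # Every unordered pair of authors in one paper is one cooccurrence.
        for a, b in combinations(sorted(authors), 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical toy network: three citing papers, three cited authors.
two_mode = {
    "p1": {"A", "B"},
    "p2": {"A", "B", "C"},
    "p3": {"B", "C"},
}
counts = cooccurrence_counts(two_mode)
```

These raw counts are the input to step 2; on their own they do not say whether a cooccurrence is interesting.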
Why interestingness?
Highly cited author
High cooccurrence counts with many other authors
Citing paper referring to many authors under consideration
Resulting cooccurrences are less important
Determining interestingness
Here: z-score, z = (Obs - Exp) / σ
How to determine Exp and σ?
Estimate by sampling from Fixed Degree Sequence Model (FDSM): all two-mode networks with same node degrees
Markov Chain Monte Carlo simulation: link swapping
If p < 0.001 (two-sided, i.e., z > 3.29), we consider the link interesting
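The decision rule can be sketched as follows, assuming the z-score takes the usual form (observed - expected) / σ, and that the expected value and σ are estimated as the mean and standard deviation of the cooccurrence count over the sampled networks. The function names and the sampled counts are made up for illustration; the 3.29 cutoff is the one stated on this slide.

```python
from statistics import mean, pstdev

def z_score(observed, sampled_counts):
    """z-score of an observed cooccurrence count against sampled counts."""
    expected = mean(sampled_counts)
    sigma = pstdev(sampled_counts)  # assumption: population standard deviation
    if sigma == 0:
        return None  # undefined when the samples do not vary
    return (observed - expected) / sigma

def is_interesting(observed, sampled_counts, z_threshold=3.29):
    """True if the observed count lies more than z_threshold sigmas above expectation."""
    z = z_score(observed, sampled_counts)
    return z is not None and z > z_threshold

# Hypothetical counts for one author pair across eight sampled networks.
sampled = [2, 3, 4, 3, 2, 4, 3, 3]
link_kept = is_interesting(6, sampled)
link_dropped = is_interesting(4, sampled)
```

In practice far more than eight samples would be needed for stable estimates; the short list only keeps the arithmetic easy to follow.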
Link swapping
[Figure: link swapping in a two-mode network (e.g., authors and citing papers)]
Link swapping
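A minimal sketch of link swapping, under the assumption that one step picks two links (a, p) and (b, q) at random and replaces them with (a, q) and (b, p) unless that would create a duplicate link. All names and the toy network are hypothetical, and the number of swaps needed for good mixing is not addressed here.

```python
import random
from collections import Counter

def swap_links(edges, n_swaps, rng):
    """Randomize a two-mode network while keeping every node's degree fixed.

    edges: iterable of (author, paper) links
    Returns the set of links after n_swaps attempted swaps.
    """
    edge_list = sorted(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, p), (b, q) = edge_list[i], edge_list[j]
        # Reject swaps that would duplicate an existing link or change nothing.
        if a == b or p == q or (a, q) in edge_set or (b, p) in edge_set:
            continue
        edge_set -= {(a, p), (b, q)}
        edge_set |= {(a, q), (b, p)}
        edge_list[i], edge_list[j] = (a, q), (b, p)
    return edge_set

def degrees(edges):
    """Degree of every author and every paper, as two Counters."""
    return Counter(a for a, _ in edges), Counter(p for _, p in edges)

# Hypothetical toy network: three authors, three citing papers.
edges = {("A", "p1"), ("A", "p2"), ("B", "p2"),
         ("B", "p3"), ("C", "p3"), ("C", "p1")}
randomized = swap_links(edges, n_swaps=100, rng=random.Random(42))
```

Each accepted swap moves one link away from and one link onto every node involved, so `degrees(randomized)` equals `degrees(edges)`. Rejected swaps still count as steps of the chain.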
Results
Author cocitation
Author (co-)citations to
12 authors from bibliometrics
12 authors from information retrieval
in Scientometrics and JASIS, 1996-2000
Same data set studied by
Ahlgren, Jarneving & Rousseau (2003)
Egghe & Leydesdorff (2009)
Leydesdorff & Vaughan (2006)
Author cocitations: cosine
Author cocitations: FDSM and z-scores
Bibliographic coupling
Bibliographic coupling of all JASIST articles, 1999-2000
n = 371
12 981 unique references
Two VOSviewer maps
cosine normalization
FDSM and z-scores
Bibliographic coupling: cosine
Bibliographic coupling: FDSM and z-scores
Conclusions
Advantages
1. Both positive and negative cooccurrences
2. Thresholds correspond to specific p-values
3. Accounts for degree variations of bottom nodes
Disadvantages
1. Some nodes may become isolates
2. More computationally intensive than cosine similarity
Thank you!