AMiner II — Toward Understanding
Big Scholar Data
@2006-2015, http://aminer.org
Jie Tang
Tsinghua University
Mining Knowledge from Big Data
Drown or
Survive?
Knowledge
- Researcher profile extraction
- Expert finding
- Social network search
- Topic browser
- Conference analysis
- ArnetApp platform
Person Search
- Basic info.
- Citation statistics
- Ego network
- Research interests
Expert Search
Finding experts for "data mining"
- Demographics: gender, language, location, etc.
- Knowledge about "data mining"
- Similar authors
Organization Ranking
By different metrics
Conference Ranking
Reviewer Suggestion
- Interest matching
- COI (conflict of interest) avoidance
- Load balancing
- Forecast review quality
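The slide only names the criteria, so the following is a minimal sketch of how three of them (interest matching, COI avoidance, load balancing) could be combined in a greedy assignment. The function name, the Jaccard overlap score, and the sample data are all assumptions made for illustration; review-quality forecasting is not modeled, and this is not AMiner's actual algorithm.

```python
from typing import Dict, List, Set


def suggest_reviewers(
    paper_topics: Dict[str, Set[str]],
    reviewer_topics: Dict[str, Set[str]],
    conflicts: Dict[str, Set[str]],
    per_paper: int = 1,
    max_load: int = 1,
) -> Dict[str, List[str]]:
    """Greedy sketch: best topic overlap first, skipping COIs and fully loaded reviewers."""
    load = {r: 0 for r in reviewer_topics}
    assignment: Dict[str, List[str]] = {}
    for paper in sorted(paper_topics):
        topics = paper_topics[paper]

        def overlap(reviewer: str) -> float:
            # Jaccard similarity between paper topics and reviewer interests.
            union = topics | reviewer_topics[reviewer]
            return len(topics & reviewer_topics[reviewer]) / len(union) if union else 0.0

        # COI avoidance and load balancing act as hard filters here.
        candidates = [
            r for r in reviewer_topics
            if r not in conflicts.get(paper, set()) and load[r] < max_load
        ]
        # Interest matching: highest overlap first; ties go to the less loaded reviewer.
        candidates.sort(key=lambda r: (-overlap(r), load[r], r))
        chosen = candidates[:per_paper]
        for r in chosen:
            load[r] += 1
        assignment[paper] = chosen
    return assignment


# Hypothetical data: alice has a conflict of interest with paper p1.
papers = {"p1": {"data mining", "graphs"}, "p2": {"data mining"}}
reviewers = {"alice": {"data mining", "graphs"}, "bob": {"data mining"}, "carol": {"vision"}}
coi = {"p1": {"alice"}}
result = suggest_reviewers(papers, reviewers, coi)
```

A greedy pass like this is easy to follow but not optimal; a real system would more likely solve the assignment jointly over all papers.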
Reviewer Suggestion
Academic Social Network Analysis and Mining system: AMiner (http://aminer.org)
Online since 2006
>39 million researcher profiles
>100 million publications
>241 million requests
>12.35 terabytes of data
100K IP accesses from 170 countries per month
10% increase in visits per month
Deep analysis, mining, and search
AMiner II (ArnetMiner)
6.32 million IP from 220 countries/regions
User Distribution
Top 10 countries
1. USA 6. Canada
2. China 7. Japan
3. Germany 8. Spain
4. India 9. France
5. UK 10. Italy
Technologies: Toward understanding big scholar data
Recent progress…
Ruud Bolle Office: 1S-D58
Letters: IBM T.J. Watson Research Center
P.O. Box 704
Yorktown Heights, NY 10598 USA
Packages: IBM T.J. Watson Research Center
19 Skyline Drive
Hawthorne, NY 10532 USA
Email: [email protected]
Ruud M. Bolle was born in Voorburg, The Netherlands. He received the Bachelor's
Degree in Analog Electronics in 1977 and the Master's Degree in Electrical
Engineering in 1980, both from Delft University of Technology, Delft, The
Netherlands. In 1983 he received the Master's Degree in Applied Mathematics and in
1984 the Ph.D. in Electrical Engineering from Brown University, Providence, Rhode
Island. In 1984 he became a Research Staff Member at the IBM Thomas J. Watson
Research Center in the Artificial Intelligence Department of the Computer Science
Department. In 1988 he became manager of the newly formed Exploratory Computer
Vision Group which is part of the Math Sciences Department.
Currently, his research interests are focused on video database indexing, video
processing, visual human-computer interaction and biometrics applications.
Ruud M. Bolle is a Fellow of the IEEE and the AIPR. He is Area Editor of Computer
Vision and Image Understanding and Associate Editor of Pattern Recognition. Ruud
M. Bolle is a Member of the IBM Academy of Technology.
DBLP: Ruud Bolle
2006
Nalini K. Ratha, Jonathan Connell, Ruud M. Bolle, Sharat Chikkerur: Cancelable Biometrics: A Case Study in Fingerprints. ICPR (4) 2006: 370-373
Sharat Chikkerur, Sharath Pankanti, Alan Jea, Nalini K. Ratha, Ruud M. Bolle: Fingerprint Representation Using Localized Texture Features. ICPR (4) 2006: 521-524
Andrew Senior, Arun Hampapur, Ying-li Tian, Lisa Brown, Sharath Pankanti, Ruud M. Bolle: Appearance models for occlusion handling. Image Vision Comput. 24(11): 1233-1243 (2006)
2005
Ruud M. Bolle, Jonathan H. Connell, Sharath Pankanti, Nalini K. Ratha, Andrew W. Senior: The Relation between the ROC Curve and the CMC. AutoID 2005: 15-20
Sharat Chikkerur, Venu Govindaraju, Sharath Pankanti, Ruud M. Bolle, Nalini K. Ratha: Novel Approaches for Minutiae Verification in Fingerprint Images. WACV. 2005: 111-116
...
Knowledge Acquisition from the Web (ACM TKDD, WWW’12, ISWC’06, ICDM’07, ACL’07)
Annotated blocks in the source page: Contact information; Educational history; Academic services; Publications

Extracted profile
Name: Ruud Bolle
Photo: [image]
Homepage: http://researchweb.watson.ibm.com/ecvg/people/bolle.html
Position: Research Staff
Affiliation: IBM T.J. Watson Research Center
Address: IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598 USA
Address: IBM T.J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532 USA
Phduniv: Brown University
Phdmajor: Electrical Engineering
Phddate: 1984
Msuniv: Delft University of Technology
Msmajor: Electrical Engineering
Msdate: 1980
Msmajor: Applied Mathematics
Bsuniv: Delft University of Technology
Bsmajor: Analog Electronics
Bsdate: 1977
Research_Interest: video database indexing; video processing; visual human-computer interaction; biometrics applications

Publication 1#
Title: Cancelable Biometrics: A Case Study in Fingerprints
Venue: ICPR
Date: 2006
Start_page: 370
End_page: 373

Publication 2#
Title: Fingerprint Representation Using Localized Texture Features
Venue: ICPR
Date: 2006
Start_page: 521
End_page: 524
. . .

[Figure: co-author graph around Ruud Bolle. Labels: Co-author, coauthor, Publication #3, Publication #5, UIUC (affiliation), Professor (position).]
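As a rough illustration of the extracted profile schema, the fields shown on the slide can be held in a small typed record. This is only a sketch: the class names `ResearcherProfile` and `Publication` and the choice of Python dataclasses are assumptions, not AMiner's actual storage format; the field names and values are taken from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Publication:
    title: str
    venue: str
    date: int
    start_page: int
    end_page: int


@dataclass
class ResearcherProfile:
    # Field names mirror the slide's tags (Position, Affiliation, Phduniv, ...).
    name: str
    position: Optional[str] = None
    affiliation: Optional[str] = None
    addresses: List[str] = field(default_factory=list)
    phduniv: Optional[str] = None
    phdmajor: Optional[str] = None
    phddate: Optional[int] = None
    research_interests: List[str] = field(default_factory=list)
    publications: List[Publication] = field(default_factory=list)


profile = ResearcherProfile(
    name="Ruud Bolle",
    position="Research Staff",
    affiliation="IBM T.J. Watson Research Center",
    addresses=[
        "P.O. Box 704, Yorktown Heights, NY 10598 USA",
        "19 Skyline Drive, Hawthorne, NY 10532 USA",
    ],
    phduniv="Brown University",
    phdmajor="Electrical Engineering",
    phddate=1984,
    research_interests=["video database indexing", "video processing"],
    publications=[
        Publication("Cancelable Biometrics: A Case Study in Fingerprints", "ICPR", 2006, 370, 373),
        Publication(
            "Fingerprint Representation Using Localized Texture Features", "ICPR", 2006, 521, 524
        ),
    ],
)
```

Optional fields default to `None` because a homepage rarely exposes every attribute; the extraction step fills in whatever it can find.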
Researcher Profile Database[1]
Extracted more than 1,000,000 researcher profiles from the Web
[1] J. Tang, L. Yao, D. Zhang, and J. Zhang. A Combination Approach to Web User Profiling. ACM Transactions on Knowledge Discovery from Data (TKDD), vol. 5, no. 1, Article 2 (December 2010), 44 pages.
Is this Enough?
Required semantics are distributed in multiple sources
LinkedIn, VideoLectures
Knowledge Linking
• Identifying the same users across multiple heterogeneous networks and integrating the semantics from those networks.
COSNET: Connecting Social Networks with Local and Global Consistency [1]
• Input: $G = \{G_1, G_2, \ldots, G_m\}$, with $G_k = (V_k, E_k, R_k)$
• Formalization: $X = \{x_i\}$ is the set of all possible pairwise matchings, and each $x_i$ corresponds to a label $y_i \in \{1, 0\}$
• COSNET: an energy-based model, $Y^* = \arg\min_Y E(Y, X)$
[1] Yutao Zhang, Jie Tang, Zhilin Yang, Jian Pei, and Philip Yu. COSNET: Connecting Heterogeneous Social Networks with Local and Global Consistency. KDD'15.
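To make the formalization concrete, here is a toy sketch of labeling candidate pairs by minimizing an energy. The specific energy form (a unary cost per candidate pair plus a pairwise penalty when two conflicting pairs are both accepted) and all numbers are assumptions chosen for illustration; the slide does not give COSNET's actual energy terms. Exhaustive search is used only because the example is tiny.

```python
from itertools import product
from typing import Dict, Sequence, Tuple


def energy(
    y: Sequence[int],
    unary: Sequence[float],
    pairwise: Dict[Tuple[int, int], float],
) -> float:
    """E(Y, X): unary costs of accepted pairs plus penalties for conflicting acceptances."""
    total = sum(cost * label for cost, label in zip(unary, y))
    total += sum(w * y[i] * y[j] for (i, j), w in pairwise.items())
    return total


def argmin_energy(
    unary: Sequence[float],
    pairwise: Dict[Tuple[int, int], float],
) -> Tuple[Tuple[int, ...], float]:
    """Brute-force Y* = argmin_Y E(Y, X) over all binary labelings (exponential cost)."""
    best = min(product((0, 1), repeat=len(unary)), key=lambda y: energy(y, unary, pairwise))
    return best, energy(best, unary, pairwise)


# Hypothetical numbers: pairs 0 and 1 both look like matches (negative cost),
# but accepting both is penalized because they compete for the same user.
unary_costs = [-2.0, -1.5, 0.5]
conflict_penalty = {(0, 1): 3.0}
y_star, e_star = argmin_energy(unary_costs, conflict_penalty)
```

Brute force over $2^n$ labelings does not scale, which is why a learned model and relaxation techniques are needed on real networks.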
Local vs. Global consistency
• Given three networks $G_1$, $G_2$, $G_3$
[Figure: three networks $G_1$, $G_2$, $G_3$ with three users each. Two example profiles from different networks: (Username: Ortiz_Brandy, Nation: USA, Gender: female) and (Username: @ortizbrandy, Nation: USA, Gender: female).]
Local vs. Global consistency
• Local matching: matching users by profiles
[Figure: networks $G_1$, $G_2$, $G_3$; the example profiles (Username: Ortiz_Brandy, Nation: USA, Gender: female) and (Username: @ortizbrandy, Nation: USA, Gender: female) are connected by a link labeled "Local consistency".]
Pairwise similarity features
– Username similarity and uniqueness
– Profile content similarity
– Ego network similarity
– Social status
Energy function
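As a rough sketch of the first two pairwise similarity features listed above, the example below scores the slide's two sample accounts. The normalization rule, the use of `difflib.SequenceMatcher`, the attribute-agreement measure, and the weights are all assumptions for illustration; username uniqueness, ego network similarity, and social status are left out, and the actual COSNET feature definitions are not given on the slide.

```python
import re
from difflib import SequenceMatcher
from typing import Dict


def normalize(username: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", username.lower())


def username_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized usernames."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def profile_similarity(p: Dict[str, str], q: Dict[str, str]) -> float:
    """Fraction of shared profile attributes whose values agree."""
    shared = set(p) & set(q)
    if not shared:
        return 0.0
    agree = sum(p[k].strip().lower() == q[k].strip().lower() for k in shared)
    return agree / len(shared)


def local_score(name_a, prof_a, name_b, prof_b, w_name=0.6, w_profile=0.4) -> float:
    """Weighted combination of the two features (weights are arbitrary placeholders)."""
    return (w_name * username_similarity(name_a, name_b)
            + w_profile * profile_similarity(prof_a, prof_b))


# The two example accounts from the slide.
attrs = {"nation": "USA", "gender": "female"}
score = local_score("Ortiz_Brandy", attrs, "@ortizbrandy", dict(attrs))
```

A local energy term could then be defined so that higher scores lower the energy of labeling the pair as a match, for example by negating the score.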
Local vs. Global consistency
• Network matching: matching users' ego networks
[Figure: networks $G_1$, $G_2$, $G_3$ with the example profiles Ortiz_Brandy and @ortizbrandy (Nation: USA, Gender: female); annotations: "Local consistency" and "Network matching" (network consistency).]
Encourage "neighborhood-preserving matching"
Network Matching
• Network matching: matching users’ ego networks
[Figure: input networks $G_1$ and $G_2$ (three users each) and the matching graph $C$, whose nodes are the nine candidate user pairs, one for each combination of a user in $G_1$ and a user in $G_2$.]
Energy function
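One common way to build such a matching graph is sketched below: every cross-network user pair becomes a node, and two nodes are linked when their users are neighbors in both input networks, so that accepting both matches preserves a neighborhood. This linking rule is an assumption based on the "neighborhood-preserving matching" idea; the slide's figure does not spell out the exact rule, and the networks here are made up.

```python
from itertools import combinations, product
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def build_matching_graph(
    nodes1: Iterable[str],
    edges1: Iterable[Tuple[str, str]],
    nodes2: Iterable[str],
    edges2: Iterable[Tuple[str, str]],
) -> Tuple[List[Pair], Set[Tuple[Pair, Pair]]]:
    """Return candidate pairs (nodes) and links between neighborhood-compatible pairs."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    # One matching-graph node per (user in network 1, user in network 2).
    pairs = list(product(nodes1, nodes2))
    links = set()
    for (u1, v1), (u2, v2) in combinations(pairs, 2):
        # Link two candidate matches if both endpoints are adjacent in their own network.
        if frozenset((u1, u2)) in e1 and frozenset((v1, v2)) in e2:
            links.add(((u1, v1), (u2, v2)))
    return pairs, links


# Hypothetical toy networks with three users each.
pairs, links = build_matching_graph(
    ["a1", "a2", "a3"], [("a1", "a2"), ("a2", "a3")],
    ["b1", "b2", "b3"], [("b1", "b2")],
)
```

With three users per network this yields nine candidate pairs, matching the size of the matching graph in the slide's figure.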
Local vs. Global consistency
• Global consistency: matching users by avoiding global inconsistency
[Figure: networks $G_1$, $G_2$, $G_3$ with the example profiles Ortiz_Brandy and @ortizbrandy (Nation: USA, Gender: female); annotations: "Local consistency", "Network matching" (network consistency), and "Global inconsistency" (global consistency).]
Avoid "global inconsistency"
Avoid global inconsistency
Energy function
[Figure: input networks and the corresponding matching graph]
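One plausible reading of global inconsistency is a transitivity violation: if pairwise matches across three or more networks are chained together, the same person ends up identified with two different accounts in one network. The sketch below detects that case with a union-find pass. The interpretation, the function name, and the sample accounts are assumptions for illustration, not the energy term from the slide.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Account = Tuple[str, str]  # (network name, user id)


def find_inconsistent_groups(matches: Iterable[Tuple[Account, Account]]) -> List[List[Account]]:
    """Merge matched accounts and report groups holding two accounts from one network."""
    parent: Dict[Account, Account] = {}

    def find(x: Account) -> Account:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matches:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for account in list(parent):
        groups[find(account)].add(account)

    inconsistent = []
    for members in groups.values():
        networks = [net for net, _ in members]
        if len(networks) != len(set(networks)):
            inconsistent.append(sorted(members))
    return inconsistent


# Hypothetical matches: u~v and v~w imply u~w, yet u is also matched to z in G3.
conflicting = [
    (("G1", "u"), ("G2", "v")),
    (("G2", "v"), ("G3", "w")),
    (("G1", "u"), ("G3", "z")),
]
consistent = conflicting[:2] + [(("G1", "u"), ("G3", "w"))]
bad_groups = find_inconsistent_groups(conflicting)
```

A model that penalizes such groups in its energy discourages pairwise decisions that look fine locally but cannot all be true at once.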
Model Construction
Objective function by combining all the energy functions
Pipeline: Matching Graph Generation → Candidate Pruning → Model Construction
[Figure: (a) two input networks $G_1$ and $G_2$; (b) the generated matching graph; (c) the matching graph after pruning; (d) the constructed model, with a label $y_i$ for each remaining candidate pair $x_i$ ($i = 1, \ldots, 5$), local factors $f_l(x_i)$, and edge factors $f_e(y_1, y_2)$, $f_e(y_2, y_3)$, $f_e(y_2, y_4)$, $f_e(y_2, y_5)$.]
Model Learning
• Max-margin learning
• As the original problem is intractable, we use Lagrangian relaxation to decompose the original objective function into a set of easy-to-solve sub-problems
Model Learning (cont.)
• Dual decomposition
The resulting objective function is convex and non-differentiable, and can be solved by the projected subgradient method.
This provides a lower bound to the original function.
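The mechanics of Lagrangian relaxation with a projected subgradient update can be seen on a toy problem, sketched below. The problem is an assumption chosen for illustration, not COSNET's objective: minimize $\sum_i c_i y_i$ over binary $y$ subject to one coupling constraint $\sum_{i \in S} y_i \le 1$ (at most one of the competing pairs is accepted). Relaxing the constraint with a multiplier $\lambda \ge 0$ lets every variable be solved on its own, and each relaxed value is a lower bound on the constrained optimum.

```python
from typing import List, Set, Tuple


def solve_relaxed(costs: List[float], coupled: Set[int], lam: float) -> Tuple[List[int], float]:
    """Minimize sum_i c_i*y_i + lam*(sum_{i in coupled} y_i - 1) over y in {0,1}^n."""
    y, value = [], -lam
    for i, c in enumerate(costs):
        reduced = c + (lam if i in coupled else 0.0)
        yi = 1 if reduced < 0 else 0  # each variable is its own easy sub-problem
        y.append(yi)
        value += reduced * yi
    return y, value


def projected_subgradient(costs: List[float], coupled: Set[int], steps: int = 50):
    """Maximize the dual bound over lam >= 0 with diminishing steps 1/(t+1)."""
    lam, best_bound = 0.0, float("-inf")
    for t in range(steps):
        y, value = solve_relaxed(costs, coupled, lam)
        best_bound = max(best_bound, value)      # every dual value is a lower bound
        g = sum(y[i] for i in coupled) - 1       # subgradient: constraint violation
        lam = max(0.0, lam + g / (t + 1))        # ascent step, projected onto lam >= 0
    return lam, best_bound


# Hypothetical costs: variables 0 and 1 compete, variable 2 is unconstrained.
lam, bound = projected_subgradient([-2.0, -1.5, 0.5], {0, 1})
```

On this toy instance the best feasible labeling is $y = (1, 0, 0)$ with value $-2.0$, and the dual bound reaches the same value; in general the bound can be strictly lower than the true optimum.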
Results
Connecting AMiner with …
• LinkedIn and VideoLectures
Compared methods:
Name-match: matches users by name only;
SVM: uses a classifier to identify the same user;
MNA: an optimization method;
SiGMa: local propagation;
COSNET: our method;
COSNET-: COSNET without global consistency.
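The simplest baseline above, matching by name only, can be sketched in a few lines together with a precision/recall check. The normalization rule, the function names, and the sample accounts are assumptions for illustration; the actual baseline implementation and evaluation data are not given on the slide.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


def normalize_name(name: str) -> str:
    """Lowercase and collapse whitespace."""
    return " ".join(name.lower().split())


def name_match(names_a: Dict[str, str], names_b: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Predict a match for every cross-network pair with identical normalized names."""
    index = defaultdict(list)
    for uid, name in names_b.items():
        index[normalize_name(name)].append(uid)
    return {
        (a, b)
        for a, name in names_a.items()
        for b in index.get(normalize_name(name), [])
    }


def precision_recall(predicted: Set[Tuple[str, str]], truth: Set[Tuple[str, str]]):
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical accounts: two different people in network B share the name "Ann Lee".
network_a = {"a1": "Bo Chen", "a2": "Ann Lee"}
network_b = {"b1": "bo chen", "b2": "Ann Lee", "b3": "ann  lee"}
ground_truth = {("a1", "b1"), ("a2", "b2")}
predicted = name_match(network_a, network_b)
precision, recall = precision_recall(predicted, ground_truth)
```

The example shows the typical weakness of this baseline: shared names produce false matches, so recall can be high while precision drops.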
Application in AMiner
- Video contents
- Personal profiles
- Business connections
- Skills and expertise
- Patents data
AMiner Today — A brief summary
ArnetMiner’s History
Date      Version   New features
2006/5    V0.1      Profile extraction; person/paper/conference search
2006/8    V1.0      Rewrote all code in Java
2007/7    V2.0      Survey search, research interest, association search
2008/11   V4.0      Graph search, topic mining, NSFC/NSF
2009/4    V5.0      Bole/course search, profile editing, open resources
2009/12   V6.0      Academic statistics, user feedback, refined ranking
2010/5    V7.0      Name disambiguation, reviewer assignment, open API
2011/7    V8.0      AMiner, location search, conference analysis
2012/3    V9.0      New UI, cross-domain collaboration
2013/5    VII       Knowledge graph, new architecture
2014/10   VII 2.0   Organization ranking, conference ranking
2015/4    VII 3.0   Network integration, deep learning
Widely used
The largest publisher: Elsevier
Conferences: KDD 2010, KDD 2011, KDD 2012, WSDM 2011, ICDM 2011, ICDM 2012, SocInfo 2011, ICMLA 2011, WAIM 2011, etc.
AMiner as a platform…
AMiner
PatentMiner: mining knowledge from patents
• Competitor analysis
• Company search
• Patent summarization
QQMiner: mining "QQ"
• Association search
• Influence analysis
• Hot topic detection
PubmedMiner: mining PubMed data
• Expert finding
• Ranking subgraphs
• Novel search
• Instant search
Mining more data…
Opportunity: exploiting social networks and the semantic web in the real world
[Diagram labels, in the order they appear]
- Social search & mining
- Social extraction
- Social mining
- IBM US, Tencent
- IBM CRL
- Scientific literature: users cover >180 countries; >600K researchers; >3M papers; Arnetminer.org (NSFC, 863)
- Advertisement: advertisement recommendation; Sohu
- Mobile context: mobile search & recommendation; Nokia
- Large-scale mining: scalable algorithms for message tagging and community discovery
- Energy trend analysis: energy product evolution; techniques trend; oil company
- Search, browsing, complex query, integration, collaboration, trustable analysis, decision support, intelligent services
- Web, relational data, ontological data, social data
- Data mining and social network techniques
- National "Core Electronic Devices, High-end Generic Chips and Basic Software" (He-Gao-Ji) project (NSFC)
- Natural Science Foundation key project (NSFC Key)
- Service platform for content monitoring and analysis of science and technology information resources (Institute of Scientific and Technical Information, Ministry of Science and Technology of China)
Representative Publications
• Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: Extraction and Mining of Academic Social Networks. KDD'08. (Top 6 cited papers among KDD 2008's papers)
• Jie Tang, Jimeng Sun, Chi Wang, and Zi Yang. Social Influence Analysis in Large-scale Networks. KDD'09. (Top 4 cited papers among KDD 2009's papers)
• Chi Wang, Jiawei Han, Yuntao Jia, Jie Tang, Duo Zhang, Yintao Yu, Jingyi Guo. Mining Advisor-Advisee Relationships from Research Publication Networks. KDD'10.
• Jie Tang, Sen Wu, Jimeng Sun, and Hang Su. Cross-domain Collaboration Recommendation. KDD'12. (Full Presentation & Best Poster Award)
• Yutao Zhang, Jie Tang, Zhilin Yang, Jian Pei, and Philip Yu. COSNET: Connecting Heterogeneous Social Networks with Local and Global Consistency. KDD'15.
• Jie Tang, Limin Yao, Duo Zhang, and Jing Zhang. A Combination Approach to Web User Profiling. ACM TKDD, 2010.
• Jie Tang, Jing Zhang, Ruoming Jin, Zi Yang, Keke Cai, Li Zhang, and Zhong Su. Topic Level Expertise Search over Heterogeneous Networks. Machine Learning Journal, 2011.
• Jie Tang, A.C.M. Fong, Bo Wang, and Jing Zhang. A Unified Probabilistic Framework for Name Disambiguation in Digital Library. IEEE TKDE, 2012.
Thanks!
Jie Tang, KEG, Tsinghua U, http://keg.cs.tsinghua.edu.cn/jietang
Download data & code: http://aminer.org/download
AMiner.org