Visually Analyzing People with Graphs
![Page 1: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/1.jpg)
Visually Analyzing People
Leo Meyerovich (@LMeyerov), CEO
![Page 2: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/2.jpg)
Graphistry is: Supercharging visual analytics through GPU cloud streaming.
(We ♥ tricky graphs.)
![Page 3: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/3.jpg)
![Page 4: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/4.jpg)
CASE STUDY: TWITTER FRAUD
Naïve layouts on 1K+ node graphs give impenetrable hairballs.
Gauss-Seidel Force-Directed Graph, O(N^2) n-body, GPU
Node: Twitter account
Edge: Friendship
Friends and friend-of-friends of a bot who randomly messaged real people and retweeted them.
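The O(N^2) cost named on this slide comes from every node repelling every other node. Below is a minimal CPU sketch of that idea, not the GPU implementation from the talk. The function name `layout_step`, the force constants, and the toy graph are all assumptions made for illustration. Positions are written back immediately within a sweep, which is the Gauss-Seidel flavor of the update.

```python
import math
import random

def layout_step(pos, edges, repulsion=0.01, attraction=0.1, max_move=0.1):
    """One in-place sweep of a naive force-directed layout (hypothetical helper)."""
    neighbors = {n: set() for n in pos}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    for n in pos:
        x, y = pos[n]
        fx = fy = 0.0
        # Repulsion from every other node: O(N) per node, O(N^2) per sweep.
        for m in pos:
            if m == n:
                continue
            dx, dy = x - pos[m][0], y - pos[m][1]
            d2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
            fx += repulsion * dx / d2
            fy += repulsion * dy / d2
        # Attraction along edges pulls connected accounts together.
        for m in neighbors[n]:
            fx += attraction * (pos[m][0] - x)
            fy += attraction * (pos[m][1] - y)
        # Clamp the step so the layout cannot blow up.
        mag = math.hypot(fx, fy)
        if mag > max_move:
            fx, fy = fx * max_move / mag, fy * max_move / mag
        # Written back immediately, so later nodes in this sweep see the new position.
        pos[n] = (x + fx, y + fy)
    return pos

# Toy graph (made-up names): a bot with two friends, plus an unrelated pair.
rng = random.Random(0)
edges = [("bot", "a"), ("bot", "b"), ("a", "b"), ("c", "d")]
pos = {n: (rng.random(), rng.random()) for n in ["bot", "a", "b", "c", "d"]}
for _ in range(200):
    layout_step(pos, edges)
```

With thousands of nodes, the inner all-pairs loop is what makes this approach slow and what motivates approximations with lower asymptotic cost.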
![Page 5: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/5.jpg)
Even on a small graph (77 nodes), smart design starts adding clarity
![Page 6: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/6.jpg)
With smart layouts, fake account clusters pop out
ForceAtlas2 Layout, O(n log n) n-body, GPU
The spambot is an entry point to more bots…
Obviously fake account names
![Page 7: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/7.jpg)
A quiet small business that buys virtual game currency from gamers…
![Page 8: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/8.jpg)
Who somehow got exactly 1 message massively trended & advertised by Twitter
![Page 9: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/9.jpg)
[Figure labels: spammer; laundering accounts; bot retweet network]
It’s a “retweet laundering” botnet! It tricks Twitter into targeting gamers to check out a cyberfraud site. They steal gamers’ money and identities.
![Page 10: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/10.jpg)
Relationships hard to see without graphs with smart layouts & interactions.
Next step: explore the time dimension
Ex: how do mobs launch from Twitter?
![Page 11: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/11.jpg)
THE SOCIOLOGY OF PROGRAMMING LANGUAGES
adoption
Leo A. Meyerovich, @lmeyerov, Graphistry
Ariel S. Rabkin, @asrabkin, Cloudera
![Page 12: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/12.jpg)
http://hammerprinciple.com/therighttool
![Page 13: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/13.jpg)
~14,000 developers
![Page 14: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/14.jpg)
Fastest? C > Java > JavaScript > Pascal
Safest? Java > Pascal > JavaScript > C
Goal: Rank Beliefs
Programmers won’t agree on ranking…
![Page 15: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/15.jpg)
Idea: Chess Ranking
![Page 16: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/16.jpg)
Let’s run a competition for the friendliest language! (Glicko2)
Each survey response is a game match:
1. Person A says Python beats C in friendliness
2. Person A says Java beats C in friendliness
3. Person B says C beats APL in friendliness
…
![Page 17: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/17.jpg)
Score Points set by a Bookie
Every language starts with rank 1000
1. “Person A: Python friendlier than C”: Python’s rank goes up
2. “Person B: Python friendlier than C”: Python already > C, less valuable win
3. “Person C: Haskell friendlier than Python”: Problem: little known about Haskell (“sparse”). Haskell beat a high-rank language: big level increase!
(Bayesian!)
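The three matches above can be sketched with a simplified Elo-style update. This is a stand-in, not Glicko-2: Glicko-2 also tracks a rating deviation and volatility per player, and that is what lets a sparsely observed language such as Haskell move faster. The K-factor of 32 and the function names are assumptions for illustration; only the starting rank of 1000 comes from the slide.

```python
def expected_score(r_winner, r_loser):
    """Probability the eventual winner was expected to win, per the Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400.0))

def record_win(ratings, winner, loser, k=32.0):
    """Apply one survey response as a match; returns the points transferred."""
    rw = ratings.setdefault(winner, 1000.0)  # every language starts at 1000
    rl = ratings.setdefault(loser, 1000.0)
    # An expected win transfers few points; an upset transfers many.
    gain = k * (1.0 - expected_score(rw, rl))
    ratings[winner] = rw + gain
    ratings[loser] = rl - gain
    return gain

ratings = {}
g1 = record_win(ratings, "Python", "C")        # Person A: Python's rank goes up
g2 = record_win(ratings, "Python", "C")        # Person B: same result, smaller gain
g3 = record_win(ratings, "Haskell", "Python")  # Person C: beat a higher-ranked language
print(g1, g2, g3)
```

Here the second Python win is worth less than the first, and Haskell's win over the now higher-rated Python is worth more than either.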
![Page 18: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/18.jpg)
Many Tournaments = Correlation Matrix!
Language x Belief
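One way to read this slide: each tournament yields one rating per language, the ratings stack into a Language x Belief table, and correlating the belief columns gives the matrix. The sketch below assumes that reading. The ratings are made-up illustrative numbers, not survey results, and `pearson` is a hand-written helper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical final ratings from three tournaments (one per belief).
table = {
    "C":          {"fast": 1300, "safe": 800,  "friendly": 900},
    "Java":       {"fast": 1100, "safe": 1200, "friendly": 1000},
    "JavaScript": {"fast": 1000, "safe": 950,  "friendly": 1100},
    "Pascal":     {"fast": 900,  "safe": 1050, "friendly": 950},
}
beliefs = ["fast", "safe", "friendly"]

# One column per belief, with languages in a fixed order.
columns = {b: [table[lang][b] for lang in table] for b in beliefs}

# Belief x Belief correlation matrix, stored as nested dicts.
corr = {a: {b: pearson(columns[a], columns[b]) for b in beliefs} for a in beliefs}
for a in beliefs:
    print(a, [round(corr[a][b], 2) for b in beliefs])
```

Correlating rows instead of columns would give a Language x Language matrix by the same method.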
![Page 19: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/19.jpg)
Cluster (K-Means)
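A minimal sketch of K-Means (Lloyd's algorithm) in plain Python, to show what the clustering step does. The slide does not say how the clustering was configured, so the initialization (the first k points), the iteration count, and the toy 2-D points are all assumptions.

```python
def kmeans(points, k, iterations=20):
    """Naive K-Means; returns (cluster index per point, cluster centers)."""
    centers = [list(p) for p in points[:k]]  # assumption: seed with the first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers

# Hypothetical rows (e.g. two correlation scores per item), forming two groups.
points = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
assignment, centers = kmeans(points, k=2)
print(assignment, centers)
```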
![Page 20: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/20.jpg)
Reduce Dimensionality: Pick fun languages & cluster centers
![Page 21: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/21.jpg)
Graphs are (Adjacency) Matrices
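As a small illustration of this equivalence, an undirected edge list can be turned into an adjacency matrix as follows. The helper name and the three-node graph are made up for the example.

```python
def adjacency_matrix(nodes, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    index = {n: i for i, n in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected: mirror the entry
    return matrix

nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
matrix = adjacency_matrix(nodes, edges)
for row in matrix:
    print(row)
```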
![Page 22: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/22.jpg)
Correlation Matrices are Fuzzy Graphs
[Figure: matrix and graph in which every displayed weight is 0.5]
![Page 23: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/23.jpg)
Weak Edges Are Annoying!
![Page 24: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/24.jpg)
Filter: Only Show Strong Relationships
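A sketch of the filtering step: keep only the pairs whose correlation is strong, and emit them as weighted edges. The 0.7 threshold, the use of absolute value (so strong negative correlations also survive), and the example matrix are assumptions for illustration.

```python
def strong_edges(labels, matrix, threshold=0.7):
    """Return (label_i, label_j, weight) for each strong off-diagonal pair."""
    edges = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):  # upper triangle: skip diagonal and duplicates
            weight = matrix[i][j]
            if abs(weight) >= threshold:
                edges.append((labels[i], labels[j], weight))
    return edges

# Hypothetical correlation matrix over three beliefs.
labels = ["fast", "safe", "friendly"]
matrix = [
    [1.0, -0.8, 0.1],
    [-0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]
edges = strong_edges(labels, matrix)
print(edges)
```

Raising the threshold thins the graph further; lowering it brings the weak edges back.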
![Page 25: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/25.jpg)
Relationships hard to see without graphs with smart layouts & interactions.
Step 2 of analysis is correlate (step 1 is count).
Correlations are relationships, so explore them as graphs!
![Page 26: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/26.jpg)
200K projects (2000-2010) [PLATEAU 2013]
![Page 27: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/27.jpg)
Popularity Across Niches
[Figure: two panels, Java (y-axis from -20% to 60%) and Scheme (y-axis from 0% to 4%). x-axis: Project categories (223); y-axis: Popularity. Labeled categories: blogging, search, build tools.]
![Page 28: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/28.jpg)
Popularity vs. Niche: Dispersion
[Figure: x-axis: Dispersion across niches (σ / μ), from 0 to 4.5; y-axis: Popularity. Labeled languages: Prolog, VBScript, Scheme, Fortran, PL/SQL, Assembly, C#, Java.]
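The dispersion measure σ / μ is the coefficient of variation of a language's popularity across project categories. A minimal sketch follows, assuming the population standard deviation (the slides do not say which estimator was used) and made-up popularity shares.

```python
import math
import statistics

def dispersion(shares):
    """Coefficient of variation (sigma / mu) of per-category popularity shares."""
    mu = statistics.fmean(shares)
    sigma = statistics.pstdev(shares)  # assumption: population standard deviation
    return sigma / mu

# Hypothetical shares across four project categories.
even = [0.2, 0.2, 0.2, 0.2]   # equally popular everywhere
niche = [0.0, 0.0, 0.0, 0.8]  # popular in one niche only

print(dispersion(even), dispersion(niche))
```

A language used evenly across categories scores near 0, while one concentrated in a single niche scores high.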
![Page 29: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/29.jpg)
Language Use (survey)
[Figure: x-axis: Language Rank (Decreasing), ticks 0.127, 1.27, 12.7, 127; y-axis: Proportion of Projects for Language, ticks from 0.0100% to 100.0000%. Annotations: "Java: winner takes all"; "Long Tail: Design for niches and grow".]
![Page 30: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/30.jpg)
Survey of 1,679 Developers
Extrinsic factors dominate!
(on last project)
![Page 31: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/31.jpg)
FUTURE STEP: Now that we’ve counted things, let’s correlate them!
Topics in Free-form Responses
Answer Correlations
![Page 32: Visually Analyzing People with Graphs](https://reader034.fdocuments.us/reader034/viewer/2022042723/5870d7731a28ab64768b6e5d/html5/thumbnails/32.jpg)
Relationships hard to see without graphs with smart layouts & interactions.
Step 2 of analysis is correlate (step 1 is count).
Correlations are relationships, so explore them as graphs!
Powerful because correlations are everywhere: raw features, inferred topics, …