Remco Chang – SEAri Workshop 15
Big Data Visual Analytics: A User Centric Approach
Remco Chang
Assistant Professor, Tufts University
Human + Computer
• Human vs. Artificial Intelligence: Garry Kasparov vs. Deep Blue (1997)
– Computer takes a “brute force” approach without analysis
– “As for how many moves ahead a grandmaster sees,” Kasparov concludes: “Just one, the best one”
• Artificial vs. Augmented Intelligence: Hydra vs. Cyborgs (2005)
– Grandmaster + 1 chess program > Hydra (equiv. of Deep Blue)
– Amateur + 3 chess programs > Grandmaster + 1 chess program [1]
1. http://www.collisiondetection.net/mt/archives/2010/02/why_cyborgs_are.php
Example: What Does (Wire) Fraud Look Like?
• Financial institutions like Bank of America have legal responsibilities to report all suspicious wire transaction activities (money laundering, supporting terrorist activities, etc.)
• Data size: approximately 200,000 transactions per day (73 million transactions per year)
• Problems:
– Automated approach can only detect known patterns
– Bad guys are smart: patterns are constantly changing
– Data is messy: lack of international standards resulting in ambiguous data
• Current methods:
– 10 analysts monitoring and analyzing all transactions
– Using SQL queries and spreadsheet-like interfaces
– Limited time scale (2 weeks)
WireVis: Financial Fraud Analysis
• In collaboration with Bank of America
– Develop a visual analytical tool (WireVis)
– Visualizes 7 million transactions over 1 year
– Beta-deployed at WireWatch
• A great problem for visual analytics:
– Ill-defined problem (how does one define fraud?)
– Limited or no training data (patterns keep changing)
– Requires human judgment in the end (involves law enforcement agencies)
• Design philosophy: “combating human intelligence requires better (augmented) human intelligence”
R. Chang et al., Scalable and interactive visual analysis of financial wire transactions for fraud detection. Information Visualization, 2008.
R. Chang et al., WireVis: Visualization of categorical, time-varying data from financial transactions. IEEE VAST, 2007.
WireVis: A Visual Analytics Approach
• Heatmap View (Accounts to Keywords Relationship)
• Strings and Beads (Relationships over Time)
• Search by Example (Find Similar Accounts)
• Keyword Network (Keyword Relationships)
Visual Analytics = Human + Computer
• Visual analytics is “the science of analytical reasoning facilitated by interactive visual interfaces.” [1]
• By design, it is a collaboration between human and computer to solve hard problems.
1. Thomas and Cook, “Illuminating the Path”, 2005.
“The computer is incredibly fast, accurate, and stupid. Man is unbelievably slow, inaccurate, and brilliant. The marriage of the two is a force beyond calculation.”
-Leo Cherne, 1977 (often attributed to Albert Einstein)
Which Marriage?
Applications of Visual Analytics
• Political Simulation
– Agent-based analysis
– With DARPA
• Global Terrorism Database
– With DHS
• Bridge Maintenance
– With US DOT
– Exploring inspection reports
• Biomechanical Motion
– Interactive motion comparison
R. Chang et al., Two Visualization Tools for Analysis of Agent-Based Simulations in Political Science. IEEE CG&A, 2012
Applications of Visual Analytics
[Figure: panels labeled Where, When, Who, What, Original Data, and Evidence Box]
R. Chang et al., Investigative Visual Analysis of Global Terrorism, Journal of Computer Graphics Forum, 2008.
Applications of Visual Analytics
R. Chang et al., An Interactive Visual Analytics System for Bridge Management, Journal of Computer Graphics Forum, 2010. To Appear.
Applications of Visual Analytics
R. Chang et al., Interactive Coordinated Multiple-View Visualization of Biomechanical Motion Data , IEEE Vis (TVCG) 2009.
Future of Visual Analytics
• Current Approach:
– One command, one response (not quite a collaboration)
• Assumptions:
– A user’s mouse and keyboard actions with a visualization reflect the user’s reasoning process
– If the computer knows the user’s reasoning process, it can better support (collaborate with) the user
• Goals:
– Can we extract higher-level information about the user by analyzing the user’s interactions?
– How will the computer utilize such information?
[Diagram: loop between Visualization and Human. Output to the human: images (visualizations). Input from the human: keyboard, mouse.]
Extracting User Model from Interactions
1. Learning about a User in Real-Time
Who is the user, and what is she doing?
Experiment: Finding Waldo
• Google-Maps style interface
– Left, Right, Up, Down, Zoom In, Zoom Out, Found
Pilot Visualization – Completion Time
[Figure: two panels labeled “Fast completion time” and “Slow completion time”]
Eli Brown et al., Where’s Waldo. IEEE VAST 2014.
Post-hoc Analysis Results
Mean Split (50% Fast, 50% Slow)

| Data Representation | Classification Accuracy | Method |
| --- | --- | --- |
| State Space | 72% | SVM |
| Edge Space | 63% | SVM |
| Action Sequence | 77% | Decision Tree |
| Mouse Event | 62% | SVM |

Fast vs. Slow Split (Mean+0.5σ=Fast, Mean-0.5σ=Slow)

| Data Representation | Classification Accuracy | Method |
| --- | --- | --- |
| State Space | 96% | SVM |
| Edge Space | 83% | SVM |
| Action Sequence | 79% | Decision Tree |
| Mouse Event | 79% | SVM |
“Real-Time” Prediction (Limited Time Observation)
• State-Based
– Linear SVM
– Accuracy: ~70%
• Interaction Sequences
– N-Gram + Decision Tree
– Accuracy: ~80%
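The “N-Gram” representation of an interaction sequence can be sketched as below. This is a minimal illustration, not the study's actual feature pipeline: the action names are hypothetical labels for the interface buttons listed earlier, and the choice of bigrams and relative frequencies is an assumption. The resulting vector is the kind of input a decision tree (or any other classifier) could be trained on.

```python
from collections import Counter


def ngram_counts(actions, n=2):
    """Count every run of n consecutive actions in one user's log."""
    grams = zip(*(actions[i:] for i in range(n)))
    return Counter(grams)


def feature_vector(actions, vocabulary, n=2):
    """Turn n-gram counts into relative frequencies over a fixed vocabulary."""
    counts = ngram_counts(actions, n)
    total = sum(counts.values()) or 1  # avoid dividing by zero on short logs
    return [counts[gram] / total for gram in vocabulary]


# Hypothetical log of one user's actions (names are assumptions).
log = ["zoom_in", "left", "left", "zoom_in", "left", "left", "found"]
bigrams = ngram_counts(log, n=2)
vocabulary = sorted(bigrams)          # fixed ordering of the observed bigrams
features = feature_vector(log, vocabulary, n=2)
```

In practice the vocabulary would be built once from all users' logs so that every user maps to a vector of the same length.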
Predicting a User’s Personality
[Figure: two panels labeled “External Locus of Control” and “Internal Locus of Control”]
Ottley et al., How locus of control influences compatibility with visualization style. IEEE VAST, 2011.
Ottley et al., Understanding visualization by understanding individual users. IEEE CG&A, 2012.
Predicting Users’ Personality Traits
• Noisy data, but can detect the users’ individual traits “Extraversion”, “Neuroticism”, and “Locus of Control” at ~60% accuracy by analyzing the user’s interactions alone.
Predicting user’s “Extraversion”
Linear SVM
Accuracy: ~60%
User-Model Adaptive Databases
2. What Can a System Do If It Knows Something About Its User?
Problem Domain: Big Data Exploration
• Visualization on Commodity Hardware
• Large Data in a Data Warehouse
Problem Statement
• Constraint: Data is too big to fit into the memory or hard drive of the personal computer
– Note: Ignoring various database technologies (OLAP, Column-Store, No-SQL, Array-Based, etc.)
• Goal: Guarantee a result set to a user’s query within X seconds.
– Based on HCI research, the absolute upper bound for X is 10 seconds
– Ideally, we would like to get it down to 1 second or less
• In CS talk: trading accuracy for speed, optimizing to minimize latency (user wait time).
Our Approach: Predictive Pre-Computation and Pre-Fetching
• In collaboration with MIT and Brown
– Models the user based on their past interaction histories
– “Guesses” a set of the user’s possible next moves
– Pre-computes and pre-fetches the necessary data chunks
– If the guesses are right, the user would experience no wait time
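The model, guess, and pre-fetch steps above can be sketched as follows. The slides do not say which model is used, so this example assumes a first-order Markov model over actions as a stand-in; the class, the tile coordinates `(x, y, zoom)`, and the `fetch` callback are all hypothetical names introduced for illustration.

```python
from collections import Counter, defaultdict


class NextMovePredictor:
    """First-order Markov model over user actions (an illustrative stand-in)."""

    def __init__(self):
        self.transitions = defaultdict(Counter)

    def train(self, history):
        # Count how often each action follows each other action.
        for prev, nxt in zip(history, history[1:]):
            self.transitions[prev][nxt] += 1

    def guess(self, last_action, k=2):
        # Return the k actions that most often followed last_action.
        return [a for a, _ in self.transitions[last_action].most_common(k)]


def apply_move(tile, action):
    """Map an action to the tile it would lead to (simplified geometry)."""
    x, y, zoom = tile
    dx, dy = {"left": (-1, 0), "right": (1, 0),
              "up": (0, -1), "down": (0, 1)}.get(action, (0, 0))
    dz = {"zoom_in": 1, "zoom_out": -1}.get(action, 0)
    return (x + dx, y + dy, zoom + dz)


def prefetch(cache, tile, guesses, fetch):
    """Fetch the tile behind every guessed move that is not cached yet."""
    for action in guesses:
        target = apply_move(tile, action)
        if target not in cache:
            cache[target] = fetch(target)


history = ["right", "right", "right", "down", "right", "right", "zoom_in"]
predictor = NextMovePredictor()
predictor.train(history)

cache = {}
prefetch(cache, (5, 5, 0), predictor.guess("right", k=1),
         fetch=lambda t: f"data for {t}")
```

If the user's next request is already in `cache`, it can be served without a round trip to the database, which is the “no wait time” case.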
Interactive Visualization System
[Architecture diagram: three tiers labeled client, middleware, and database. Components shown: a Predictive Engine with three Recommenders; Caching and Query Execution with a Cooked Tile Cache and a Semi-Cooked Tile Cache; three Servers.]
Preliminary System and Evaluation
• Using a simple Waldo-like interface
• 18 users explored the NASA MODIS dataset
– Users were in WA
– Database in Boston
• Tasks include “find 4 areas in Europe that have a snow coverage index above 0.5”
• What happens if the guesses are “wrong”?
Summary
Wrap Up: Visual Analytics Theory and Practice
• Visual analytics offers tremendous opportunities to combine “human + computer” as a collaborative computational unit
• “Increasing the input bandwidth” is a critical challenge. There is a lot of “signal” about the user’s reasoning process and analysis behaviors that can be extracted from analyzing their (past) interactions.
• By modeling the user based on their past interactions, we can design very complex (adaptive) systems to better support the user. The example of “big data” is just one of many potentially rich and impactful examples.
Backup
Prediction Algorithms
• General Idea:
– Lots of “experts” who recommend chunks of data to pre-fetch / pre-compute
– One “manager” who listens to the experts and chooses which experts’ advice to follow
– Each “expert” gets more of their recommendations accepted if they keep guessing correctly
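One way to read the expert/manager idea is as a weighted allocation of cache slots, sketched below. The slides do not give the update rule, so this example assumes a simple multiplicative penalty for experts that guessed wrong; the `Manager` class, the `penalty` parameter, and the toy experts are hypothetical.

```python
class Manager:
    """Gives each expert a share of the cache in proportion to its weight."""

    def __init__(self, experts, slots, penalty=0.5):
        self.experts = experts    # name -> function(history) -> ranked block ids
        self.slots = slots        # number of blocks that can be pre-fetched
        self.penalty = penalty    # weight multiplier applied after a wrong guess
        self.weights = {name: 1.0 for name in experts}
        self.last_recs = {}

    def recommend(self, history):
        # Ask every expert, then fill the slots starting with the most trusted.
        self.last_recs = {n: list(e(history)) for n, e in self.experts.items()}
        total = sum(self.weights.values())
        chosen = []
        for name in sorted(self.weights, key=self.weights.get, reverse=True):
            share = max(1, round(self.slots * self.weights[name] / total))
            for block in self.last_recs[name][:share]:
                if block not in chosen and len(chosen) < self.slots:
                    chosen.append(block)
        return chosen

    def observe(self, requested_block):
        # Experts that did not recommend the requested block lose weight.
        for name, recs in self.last_recs.items():
            if requested_block not in recs:
                self.weights[name] *= self.penalty


# Two toy experts with fixed recommendations (block ids are illustrative).
experts = {"good": lambda h: [13, 48], "bad": lambda h: [99, 3]}
manager = Manager(experts, slots=3)
first = manager.recommend(history=[])
manager.observe(13)   # the user asked for block 13
```

After a few rounds, an expert that keeps missing contributes only a single recommendation, while reliable experts fill most of the cache.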
13 48 11 3 99
2 13 99 67 45
82 7 22 42 31 Iteration: 0
User Requests Data Block 13
4 12 34 88 27
5 23 1 92 34
42 12 31 32 13 Iteration: 1
Training
• Instead of training the manager in real-time, this process can be done offline
– Using past user interaction logs
• This approach is similar to how databases are currently tuned
– Instead of a DBA manually tuning the performance of a database
– Past SQL logs are used to automatically tune the database for an organization’s specific needs (e.g. read-mostly, write-often, etc.)
How to Determine the “Experts”?
• More detail on this later
• Some obvious ones include:
– Momentum-based
– Data similarity-based
– Frequency (hot-spot)-based
– Past action sequence-based
• Generally speaking, given the “manager” approach, we want as many different types of “experts” as possible
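Two of the expert types listed above can be sketched in a few lines. These are illustrative guesses at what “momentum-based” and “frequency (hot-spot)-based” might mean for a tiled map; the function names and the `(x, y)` tile representation are assumptions, not the system's actual experts.

```python
from collections import Counter


def momentum_expert(history, k=2):
    """Assume the user keeps moving in the direction of the last step."""
    if len(history) < 2:
        return []
    (x0, y0), (x1, y1) = history[-2], history[-1]
    dx, dy = x1 - x0, y1 - y0
    if (dx, dy) == (0, 0):
        return []  # no movement, so no direction to extrapolate
    return [(x1 + dx * i, y1 + dy * i) for i in range(1, k + 1)]


def hotspot_expert(all_requests, k=2):
    """Recommend the tiles requested most often in past logs."""
    return [tile for tile, _ in Counter(all_requests).most_common(k)]


path = [(0, 0), (1, 0)]                                   # user moved right
past = [(1, 1), (2, 2), (1, 1), (3, 3), (1, 1), (2, 2)]   # pooled past requests
momentum_guess = momentum_expert(path)
hotspot_guess = hotspot_expert(past)
```

Because the two experts rely on different signals (recent direction versus long-term popularity), they tend to fail in different situations, which is the reason for wanting many different types.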
Preliminary Results
• Using a simple Google-maps like interface
• 18 users explored the NASA MODIS dataset
• Tasks include “find 4 areas in Europe that have a snow coverage index above 0.5”
Worst Case Scenario: Cache Miss
13 48 11 3 99
2 13 99 67 45
82 7 22 42 31
User Requests Data Block 52
Cache Miss
• How to guarantee response time when there’s a cache miss?
• Trick: the ‘EXPLAIN’ command
• Usage:
explain select * from myTable;
• Not standard SQL, but implemented in most commercial databases
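As a small runnable illustration of asking a database for its plan without executing the query, the sketch below uses SQLite (bundled with Python) and its `EXPLAIN QUERY PLAN` form. SQLite is chosen only because it needs no setup; it is not one of the systems discussed in the talk, and its plan output does not include the size estimates that the later SciDB example relies on.

```python
import sqlite3

# In-memory database with a small table named after the usage line above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany("INSERT INTO myTable (value) VALUES (?)",
                 [(v,) for v in (0.1, 0.7, 0.9)])

# EXPLAIN QUERY PLAN describes how SQLite would run the query
# without returning the table's rows.
plan = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM myTable").fetchall()
for row in plan:
    print(row[-1])  # last column is the human-readable plan step
conn.close()
```

The exact wording of the plan text varies between SQLite versions, so code that consumes it should not depend on a fixed string.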
Example EXPLAIN Output from SciDB
• Example SciDB output of (a query similar to) EXPLAIN SELECT * FROM earthquake
[("[pPlan]:schema earthquake<datetime:datetime NULL DEFAULT null,magnitude:double NULL DEFAULT null,latitude:double NULL DEFAULT null,longitude:double NULL DEFAULT null>[x=1:6381,6381,0,y=1:6543,6543,0] bound start {1, 1} end {6381, 6543} density 1 cells 41750883 chunks 1 est_bytes 7.97442e+09")]
– The schema lists the four attributes in the table ‘earthquake’
– The dimensions of this array (table) are 6381 x 6543
– This query will touch data elements from (1, 1) to (6381, 6543), totaling 41,750,883 cells
– Estimated size of the returned data is 7.97442e+09 bytes (~8 GB)
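The figures in this output can be checked with a few lines of arithmetic, and the same numbers suggest how much a result would need to shrink to fit a size budget. The dimensions and `est_bytes` below are copied from the output; the 50 MB budget and the idea of applying one scale factor to both axes are assumptions made for illustration.

```python
import math

# Figures copied from the EXPLAIN output.
x_size, y_size = 6381, 6543
est_bytes = 7.97442e9

cells = x_size * y_size               # matches the "cells 41750883" field
bytes_per_cell = est_bytes / cells    # roughly 191 bytes per cell

# Hypothetical budget: how many bytes the client is willing to receive.
budget_bytes = 50e6

# Shrink factor per axis so that the aggregated array fits the budget.
reduction = est_bytes / budget_bytes
scale = math.ceil(math.sqrt(reduction))
reduced_cells = math.ceil(x_size / scale) * math.ceil(y_size / scale)
```

With these numbers the per-axis factor comes out at 13, so each returned cell would summarize a 13 x 13 block of the original array.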
Other Examples
• Oracle 11g Release 1 (11.1)
• MySQL 5.0
• PostgreSQL 7.3.4
Query Modification
• Based on the resulting query plan, our system chooses one of three strategies to reduce results from the query
– Can be based on the literal resolution of the visualization (number of pixels)
– Or desired data size
Reduction Strategies
• Aggregation:
– In SciDB, this operation is carried out as regrid (scale_factorX, scale_factorY)
• Sampling
– In SciDB, uniform sampling is carried out as bernoulli (query, percentage, randseed)
• Filtering
– Currently, the filtering criteria are user specified: where (clause)
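What the three strategies do to the data can be shown with plain Python lists. This sketch does not call SciDB and does not reproduce its operators; the function names are hypothetical, and averaging is assumed as the aggregate.

```python
import random


def aggregate(grid, fx, fy):
    """Average each fx-by-fy block of a 2D list (edge blocks may be smaller)."""
    out = []
    for r in range(0, len(grid), fy):
        row = []
        for c in range(0, len(grid[0]), fx):
            block = [v for line in grid[r:r + fy] for v in line[c:c + fx]]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def sample(cells, probability, seed):
    """Keep each cell independently with the given probability."""
    rng = random.Random(seed)  # seeded so the sample is repeatable
    return [c for c in cells if rng.random() < probability]


def keep_where(cells, predicate):
    """Keep only the cells that satisfy a user-supplied predicate."""
    return [c for c in cells if predicate(c)]


grid = [[1, 2, 3, 4],
        [5, 6, 7, 8]]
coarse = aggregate(grid, fx=2, fy=2)
snowy = keep_where([0.2, 0.6, 0.9], lambda v: v > 0.5)
```

Aggregation and sampling shrink the result by a predictable factor, which makes them suitable for meeting a size target; filtering shrinks it by an amount that depends on the data.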
Quick Summary
• Key Components:
1. Pre-computation and pre-fetching
2. Three-tiered system
3. Pre-fetching based on “expert-manager” approach
4. Use the “explain” trick to handle cache-miss
5. Guarantees response time, but not data quality
Future Work: Streaming
• Integrate Streaming [Fisher et al. CHI 2012]
[Figure: two panels labeled t = 1 second and t = 5 minutes]
Fisher et al., Trust Me, I'm Partially Right: Incremental Visualization Lets Analysts Explore Large Datasets Faster. CHI 2012
Designing “Experts”
• How much can a user’s past interactions tell us about:
– The user’s future analysis behaviors?
– The user’s analysis style?
– The user’s analysis intent?
– The user’s mental model of the data and problem?
• Fundamental question in Visualization and HCI…
Project Outline
“Reverse engineer” the human cognitive black box (by analyzing user interactions)
A. Data Modeling
– Interactive Metric Learning
B. User Modeling
– Predict Analysis Behavior
C. Interactive Big Data Databases
– Adaptive Pre-fetching and computation
R. Chang et al., Science of Interaction, Information Visualization, 2009.
Data Modeling
1. Interactive Metric Learning
Quantifying a User’s Knowledge about Data
1. Richard Heuer. Psychology of Intelligence Analysis, 1999. (pp 53-57)
Exploring High-Dimensional Space: iPCA
Jeong et al., iPCA: An Interactive System for PCA-based Visual Analytics . Eurovis 2009.
Metric Learning
• Finding the weights to a linear distance function
• Instead of having the user manually specify the weights, can we learn them implicitly through their interactions?
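A distance function with one weight per dimension might look like the sketch below. The slides do not give the exact form, so a weighted Euclidean distance is assumed here; the function name and example values are illustrative.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance: each squared difference is scaled by its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


p, q = [0.0, 0.0], [3.0, 4.0]
equal = weighted_distance(p, q, weights=[1.0, 1.0])       # both dimensions count
first_only = weighted_distance(p, q, weights=[1.0, 0.0])  # second dimension ignored
```

Setting a weight to zero removes that dimension from the comparison, so learning the weights amounts to learning which dimensions matter to the user.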
Metric Learning
• In a projection space (e.g., MDS), the user directly moves points on the 2D plane that don’t “look right”…
• Until the expert is happy (or the visualization can not be improved further)
• The system learns the weights (importance) of each of the original k dimensions
• Short Video (play)
Dis-Function
Brown et al., Find Distance Function, Hide Model Inference. IEEE VAST Poster 2011.
Brown et al., Dis-function: Learning Distance Functions Interactively. IEEE VAST 2012.
Optimization:
Results
• Used the “Wine” dataset (13 dimensions, 3 clusters)
• Added 10 extra dimensions, and filled them with random values
• Blue: original data dimension
• Red: randomly added dimensions
• X-axis: dimension number
• Y-axis: final weights of the distance function