Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf ·...
Transcript of Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf ·...
![Page 1: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/1.jpg)
Fathoming the (Deep) Real-Time Web
Defrag Conference November 12 2009
Mike Franklin Founder and CTO, Truviso
![Page 2: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/2.jpg)
Lots of Buzz about the Real-time Web
![Page 3: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/3.jpg)
But, where is the information of value?
3
![Page 4: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/4.jpg)
Here’s the Story for the Static Web
4
Think: “databases”, “log files”, “spreadsheets”, …
![Page 5: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/5.jpg)
The “Real-Time” Web is Deep Too!
5
Every: • Ad impression, click • Tweet, re-tweet, link • Wall post, friending, … • Billing event • Fast Forward, pause,… • Server request • Transaction • Network message • Fault • … Basically, all the stuff that ends up in:
“databases”, “log files”, “spreadsheets”, …
![Page 6: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/6.jpg)
Deep Real-Time Web Example: Video/TV Distribution
• Today Nielsen panels 5,000 households nationally and 20,000 in local TV markets for current television viewership.
• Today the U.S. Online Video audience is currently over 150 million viewers.
.
![Page 7: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/7.jpg)
On-Line Data is More Detailed and Relevant
From playout to reporting.
• Viewer • Program/Content • Technology platform • Minutes viewed • Program segments viewed • Ads viewed • Viewing behaviors (FF, RW)
times 10’s or 100’s M of users,
every 5 minutes… = TBs/day
![Page 8: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/8.jpg)
Giving advertisers the analytic tools to maximize their investment
8
![Page 9: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/9.jpg)
Identify and leverage trends immediately
9
![Page 10: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/10.jpg)
Identify operational and delivery problems faster
![Page 11: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/11.jpg)
Challenges of the Deep Real-Time Web
11
• It’s BIG • Companies are dealing with many TB per day • Used to be just Physicists, Spooks, and Telcos
• It’s Growing • Machine-generated vs. hand-typed @ 140 char • e.g., a video player beacon – 4KB per min per viewer
• It Gets Stale • Decisions have deadlines; things change
• It Requires Context • Demographics and history matter
• Sometimes it Shows Up Late • Seconds, minutes, days, ….
![Page 12: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/12.jpg)
Transform
Extract
Index, Build Cubes,
Update Queries
Display
Time Slice Analysis
Batch Source Data
Load
DELAY
DELAY
DELAY
DELAY
Batch Data Window
Batch Processing: A Poor Fit for the Real-Time Web
12
The basic approach is the same for a
legacy data warehouse cluster or a
brand new Hadoop cluster.
![Page 13: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/13.jpg)
Data Records
Update Queries
Source Data
Data Warehouse
Update Display
Continuous Analysis
Store
Pre-Aggregation
Continuous Analytics
13
![Page 14: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/14.jpg)
The Latency/Performance Tradeoff: Gone 8-node cluster 4-node cluster Continuous 16-node cluster
With Batch: reducing batch size reduces throughput. With Continuous Analytics this tradeoff goes away.
High Through
put
Low Latency
![Page 15: Fathoming the (Deep) Real-Time Web - EECS at UC Berkeleyfranklin/Talks/FranklinDefrag09.pdf · Fathoming the (Deep) Real-Time Web Defrag Conference November 12 2009 Mike Franklin](https://reader034.fdocuments.us/reader034/viewer/2022042200/5e9f5a9eb5921b0eac667026/html5/thumbnails/15.jpg)
15
Taming the Deep Real-Time Web Key to:
• Monetization • Personalization • Behavioral Targeting • Service Optimization • New/Improved Products • New/Improved Business
Models
Continuous Analytics provides the required
scalability and immediacy.