Creating Added Value with Big Data
![Page 1: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/1.jpg)
CREATING ADDED VALUE WITH BIG DATA
by KLAAS BOSTEELS (@klbostee)
![Page 2: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/2.jpg)
MY CAREER PATH SO FAR
2007: Began working with big data as PhD student
2009: Embarked on a data science career at Last.fm
2011: Joined Massive Media as Lead Data Scientist
Last.fm: a data company at heart; one of the earliest Hadoop adopters world-wide; inventors of Ketama; organised the first “NoSQL” meetup in SF.
Massive Media: huge audience and tremendous potential, but a data science newcomer at the time.
![Page 3: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/3.jpg)
Second big product of Massive Media, after Netlog
2011: Initial launch of Twoo.com
2012: Biggest dating site world-wide on comScore
2013: Massive Media acquired by InterActiveCorp
![Page 4: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/4.jpg)
IT’S A BIG FAMILY
IAC’s main personals brands:
Some other well-known IAC brands:
![Page 5: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/5.jpg)
STEP 1
FOLLOW THE MONEY
photo by Chris Isherwood
![Page 6: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/6.jpg)
BOOTSTRAP BY SAVING OR GAINING MONEY
You need to get some capital to get started
Saving money tends to be easier in practice
Real-world example:
• Analyzing CDN logs unveiled abuse
• Stopping the abuse greatly reduced the bills
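A minimal sketch of what such a CDN log analysis could look like. The slides do not describe the actual log format or the kind of abuse that was found, so the three-field log format, the sample data, and the 50% traffic-share threshold below are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical log format (an assumption): "<client_ip> <bytes_sent> <url>"
SAMPLE_LOG = """\
10.0.0.1 500 /img/a.jpg
10.0.0.2 700 /img/b.jpg
10.0.0.9 90000 /video/x.mp4
10.0.0.9 95000 /video/x.mp4
10.0.0.1 300 /img/c.jpg
"""

def bytes_per_client(log_text):
    """Sum the bytes served to each client IP, skipping malformed lines."""
    totals = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        ip, sent, _url = parts
        totals[ip] += int(sent)
    return totals

def heavy_clients(totals, share=0.5):
    """Return clients responsible for more than `share` of all traffic."""
    grand_total = sum(totals.values())
    return sorted(ip for ip, sent in totals.items() if sent > share * grand_total)

totals = bytes_per_client(SAMPLE_LOG)
suspects = heavy_clients(totals)
print(suspects)
```

On real logs the same aggregation would typically run as a distributed job rather than in a single process, but the idea is the same: attribute cost to its source, then look at the outliers.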
![Page 7: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/7.jpg)
STEP 2
EMBRACE HADOOP
photo by Doug Kukurudza
![Page 8: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/8.jpg)
HADOOP
Not the holy grail, but deserves a central role
It has a vibrant community and is proven to be:
ECONOMICAL runs on commodity hardware
SCALABLE smart distributed processing
MAINTAINABLE very robust and fault-tolerant
FLEXIBLE predefined schemas not required
![Page 9: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/9.jpg)
STEP 3
BUILD DASHBOARDS
photo by Dawn Hopkins
![Page 10: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/10.jpg)
STATS PIPELINE BASED ON HADOOP
[Diagram: Log collector, HDFS, MapReduce, HBase, Dashboards; flow labels: “continuous”, “in batches”]
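The batch part of such a pipeline can be illustrated with a small in-process imitation of MapReduce. This is only a sketch: the "<date> <event>" record format is an assumption, a plain dict stands in for the HBase table that the dashboards would read, and no Hadoop API is used.

```python
from collections import defaultdict

# Hypothetical collected log records (an assumption): "<date> <event>"
LOG_LINES = [
    "2013-05-01 signup",
    "2013-05-01 login",
    "2013-05-01 login",
    "2013-05-02 login",
]

def map_phase(line):
    """Emit one ((date, event), 1) pair per log record."""
    date, event = line.split()
    yield (date, event), 1

def reduce_phase(key, values):
    """Sum all counts that share the same (date, event) key."""
    return key, sum(values)

def run_batch(lines):
    """Group mapper output by key (the 'shuffle'), then reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_phase(line):
            groups[key].append(value)
    return dict(reduce_phase(key, values) for key, values in groups.items())

# The resulting dict plays the role of the table the dashboards query.
batch_view = run_batch(LOG_LINES)
print(batch_view[("2013-05-01", "login")])
```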
![Page 11: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/11.jpg)
STATS PIPELINE BASED ON HADOOP
Cfr. “lambda architecture”, coined by @nathanmarz
[Diagram: Log collector, HDFS, MapReduce, HBase, Realtime processing, Dashboards; flow labels: “continuous”, “in batches”]
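The core idea of the lambda architecture is that a query combines a periodically recomputed batch view with a realtime view covering events that arrived since the last batch run. The sketch below illustrates that idea only; the `StatsView` class and its method names are hypothetical, and it makes the simplifying assumption that every finished batch run covers all events seen so far.

```python
from collections import Counter

class StatsView:
    """Toy serving layer that merges a batch view with realtime increments."""

    def __init__(self):
        self.batch = Counter()     # recomputed from scratch by batch jobs
        self.realtime = Counter()  # incremented as events stream in

    def record_event(self, key):
        """Realtime path: count the event immediately."""
        self.realtime[key] += 1

    def finish_batch_run(self, recomputed_counts):
        """Batch path: replace the batch view and drop the realtime
        increments, assuming the batch run has now absorbed them."""
        self.batch = Counter(recomputed_counts)
        self.realtime.clear()

    def query(self, key):
        """Dashboards see the sum of both views."""
        return self.batch[key] + self.realtime[key]

view = StatsView()
view.finish_batch_run({"login": 10})
view.record_event("login")
view.record_event("login")
print(view.query("login"))
```

A real system has to track exactly which events each batch run covered before discarding realtime state; that bookkeeping is left out here.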
![Page 12: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/12.jpg)
STATS PIPELINE BASED ON HADOOP
Cfr. “lambda architecture”, coined by @nathanmarz
[Diagram: Log collector, HDFS, MapReduce, HBase, Realtime processing, Ad-hoc results, Dashboards; flow labels: “continuous”, “in batches”]
![Page 13: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/13.jpg)
CUSTOM-TAILORED WEB INTERFACE
Annotation & exporting functionality
Supports A/B testing and cohort analysis
Various other nifty extras
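The slides do not say how the interface evaluates A/B tests. One common way to compare conversion rates between two variants is a two-proportion z-test, sketched below with only the standard library; the function name and the example numbers are made up for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion
    rate between variant A (conv_a of n_a) and variant B (conv_b of n_b)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both variants converted 0% or 100%: no measurable difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: 100/1000 conversions for A, 150/1000 for B.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(round(z, 2), p_value < 0.01)
```

The normal approximation behind this test is only reasonable for fairly large samples, so a dashboard would usually also show the sample sizes alongside the result.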
![Page 14: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/14.jpg)
STEP 4
ASSEMBLE A TEAM
photo by Jean-François Schmitz
![Page 15: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/15.jpg)
THE SECRET IS IN THE MIX
Hadoop’s tricks also apply to data science teams
• Avoid specialisation to allow easy distribution and scaling
• Exploit data locality by hiring people with a wide skill set
Great Data Scientists have the right mix of skills
• Hackers with solid technical background
• Analytical mind that knows statistics and machine learning
• Clever and creative in everything they do
![Page 16: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/16.jpg)
CHEAPER TECH MAKES PEOPLE MORE EXPENSIVE
Graph by Trifacta. Source: John C. McCallum, Wikipedia and Federal Reserve Bank of St Louis. Inflation adjusted to 2011 dollars.
![Page 17: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/17.jpg)
STEP 5
EXPLORE & INNOVATE
photo by NASAr
![Page 18: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/18.jpg)
SOME TIPS AND TRICKS
Dare to fail and/or start from estimates
Introduce data exploration/innovation days
• Basically 20% time devoted to playing with data
• Incorporate collaborative brainstorming
• Goal is to find promising new projects to work on
Communicate findings to the rest of the company
• Fun and silliness are allowed
• Prototype early and often
![Page 19: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/19.jpg)
PRODUCT INSIGHTS & EXTENSIONS
E.g. recommendations and activity patterns analysis
![Page 20: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/20.jpg)
CUTE OBSERVATIONS FOR PR
http://www.twoo.com/blog/2012/04/twoos-great-global-vocabulary-experiment
![Page 21: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/21.jpg)
FIVE SIMPLE STEPS IS ALL IT TAKES
1. FOLLOW THE MONEY
2. EMBRACE HADOOP
3. BUILD DASHBOARDS
4. ASSEMBLE A TEAM
5. EXPLORE & INNOVATE
![Page 22: Creating Added Value with Big Data](https://reader033.fdocuments.us/reader033/viewer/2022052619/5560cd34d8b42a0d088b4cf4/html5/thumbnails/22.jpg)
Thanks! Questions?