The Future Of Big Data
![Page 1: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/1.jpg)
Cassandra 1.0
The Future Of Big Data
Matthew F. Dennis // @mdennis
7th Advanced Computing Conference
Seoul, South Korea
February 15th, 2012
![Page 2: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/2.jpg)
Cassandra Job Trends (indeed.com)
![Page 3: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/3.jpg)
Cassandra Job Trends (indeed.com)
![Page 4: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/4.jpg)
“Big Data” Job Trends (indeed.com)
![Page 5: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/5.jpg)
Big Data
![Page 6: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/6.jpg)
Why People Choose Cassandra
True Multi-DC Support
Linearly scalable
Larger-than-memory datasets
Best-in-class performance (not just for writes!)
Fully durable
Integrated caching
Tuneable consistency
No single point of failure (SPOF)
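The tunable-consistency item in the list above can be illustrated with the usual replica-overlap arithmetic: with N replicas, a read of R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > N. The sketch below is plain Python, not Cassandra's API; the function names `quorum` and `reads_see_latest_write` are hypothetical and exist only for this illustration.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that forms a majority of replication_factor copies."""
    return replication_factor // 2 + 1


def reads_see_latest_write(n: int, w: int, r: int) -> bool:
    """True when every set of r replicas overlaps every set of w replicas (r + w > n)."""
    return r + w > n


if __name__ == "__main__":
    n = 3  # assumed replication factor, chosen only for the example
    q = quorum(n)
    # Majority writes plus majority reads always overlap in at least one replica.
    print(q, reads_see_latest_write(n, w=q, r=q))
    # Single-replica writes and reads trade that guarantee for lower latency.
    print(reads_see_latest_write(n, w=1, r=1))
```

Choosing W and R per operation is the "tuning": stronger guarantees cost more replica round trips, weaker ones cost fewer.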
![Page 7: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/7.jpg)
Common Cassandra Use Cases
Time Series
Sensor Data
Messaging
Ad Tracking
Financial Market Data
User Activity Streams
Fraud Detection / Risk Analysis
Anything Requiring: linear scale + high performance + global availability
![Page 8: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/8.jpg)
“With Cassandra, we get better business agility, and we don’t have to plan capacity in advance, we don’t need to ask permission of other people to build things for us, and we don’t worry about running out of space or power.”
Adrian Cockcroft, Cloud Architect
![Page 9: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/9.jpg)
Netflix’s problems
Could not build datacenters fast enough
Made decision to go to cloud (AWS)
Cassandra on AWS is a key infrastructure component of its globally distributed streaming product.
Applications include Netflix’s subscriber system, A/B testing, and viewing history service (including pause/resume).
![Page 10: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/10.jpg)
Netflix on Cassandra
Fast
Cheap
Scalable
Flexible
No SPOF
![Page 11: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/11.jpg)
Scale Horizontally
http://www.datastax.com/1-million-writes
[Chart: client writes per second vs. number of nodes]
![Page 12: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/12.jpg)
“Without Cassandra, our engineers would’ve had to create something that could scale to our needs, that would’ve prevented us from focusing on building product and solving problems for Backupify’s users, which are far more important tasks.”
Matt Conway, VP Engineering
![Page 13: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/13.jpg)
Backupify’s problem
Cloud-based utility that enables businesses and consumers to back up, search and restore the content of popular online applications such as Google Apps, Gmail, Facebook, Twitter, and Blogger
![Page 14: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/14.jpg)
Backupify on Cassandra
Ease of scale enabled engineers to focus on building great applications
DataStax OpsCenter made it easy to monitor the health and performance of their cluster
Reliable, redundant, scalable and cheap data storage helped eliminate downtime
Ability to offer both backup and storage, but also analysis of data in the future
![Page 15: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/15.jpg)
“You can seamlessly add new nodes and expand your total capacity without deteriorating the performance of the data store. Cassandra has allowed us to scale very effectively.”
Harry Robertson, Tech Lead
![Page 16: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/16.jpg)
Ooyala’s problem
Ooyala provides a suite of technologies and services that support content owners in managing, analyzing and monetizing the digital video they publish online
![Page 17: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/17.jpg)
Ooyala on Cassandra
Classic “Big Data” problem did not require re-architecting
Enabled application agility: developers spend time building cool apps, not figuring out how to scale
Enabled more powerful and granular analytics for their customers
![Page 18: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/18.jpg)
Financial
Social Media
Advertising
Entertainment
Energy
E-Tail
Health Care
Infrastructure
Government
Some More Cassandra Users http://www.datastax.com/cassandrausers
![Page 19: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/19.jpg)
Big Data
![Page 20: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/20.jpg)
The evolution of Analytics
Analytics + Realtime
![Page 21: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/21.jpg)
The evolution of Analytics
Analytics Realtime
replication
![Page 22: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/22.jpg)
The evolution of Analytics
ETL
Realtime Analytics
![Page 23: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/23.jpg)
DataStax Enterprise re-unifies realtime and analytics
![Page 24: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/24.jpg)
realtime and analytics
![Page 25: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/25.jpg)
Portfolio Demo dataflow
Portfolios
Historical Prices
Intermediate Results
Largest loss
Portfolios
Live Prices for today
Largest loss
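The dataflow labels above (portfolios and historical prices feeding intermediate results, which feed a largest-loss figure) can be sketched as a small batch computation. This is only an illustration under stated assumptions: the slide gives no schema, formula, or values, so the sample data, the day-over-day definition of "loss", and the helper names `daily_values` and `largest_loss` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical sample data; tickers, share counts, and prices are invented.
portfolios: Dict[str, Dict[str, int]] = {
    "p1": {"AAA": 10, "BBB": 5},
    "p2": {"AAA": 2},
}
historical_prices: Dict[str, List[float]] = {
    "AAA": [10.0, 9.0, 11.0],
    "BBB": [20.0, 21.0, 18.0],
}


def daily_values(holdings: Dict[str, int], prices: Dict[str, List[float]]) -> List[float]:
    """Intermediate result: total portfolio value for each historical day."""
    days = min(len(prices[ticker]) for ticker in holdings)
    return [
        sum(shares * prices[ticker][day] for ticker, shares in holdings.items())
        for day in range(days)
    ]


def largest_loss(values: List[float]) -> float:
    """Most negative day-over-day change, or 0.0 if the value never fell."""
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    return min([0.0] + changes)


if __name__ == "__main__":
    for name, holdings in portfolios.items():
        print(name, largest_loss(daily_values(holdings, historical_prices)))
```

The same function applied to live prices for today would give the realtime half of the dataflow; only the input source changes.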
![Page 26: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/26.jpg)
Operations
“Vanilla” Hadoop
Many pieces to set up, monitor, back up, and maintain (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker, Zookeeper, Region Server, ...)
Single points of failure

DataStax Enterprise
Single simplified system
Self-organizes based on workload
Peer to peer
JobTracker failover
No additional Cassandra config
![Page 27: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/27.jpg)
Monitoring Cassandra (OpsCenter)
![Page 28: The Future Of Big Data](https://reader034.fdocuments.us/reader034/viewer/2022052620/557a35c9d8b42a32248b4901/html5/thumbnails/28.jpg)
Q?
Matthew F. Dennis // @mdennis
http://slideshare.net/mattdennis