How To Tell if Your Business Needs NoSQL
Robin Schumacher, VP Products
Overview of DataStax
• Founded in April 2010
• The Apache Cassandra™ company
• Home to Apache Cassandra Chair & most committers
• Cassandra is a massively scalable NoSQL database
• Provide enterprise-class big data platform based on Cassandra
• 270+ customers
• Headquartered in San Francisco Bay area
• Funded by prominent venture firms
Serving Every Industry
Leading in Performance
Netflix Cloud Benchmark…
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
“In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput.”
Solving Big Data Challenges for Enterprise Application Performance Management, Tilmann Rabl, et al., August 2013, p. 10. Benchmark paper presented at the Very Large Database Conference, 2013. http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2013.pdf
End Point Independent NoSQL Benchmark
Highest in throughput…
Lowest in latency…
NoSQL Momentum
“According to analysis by Wikibon’s David Floyer (and highlighted in the Wall Street Journal), the NoSQL database market is expected to grow at a compound annual growth rate of nearly 60% between 2011 and 2017. The SQL slice of the Big Data market, in contrast, will grow at just a 26% CAGR during that same time period.”
NoSQL Momentum
“NoSQL is the stuff of the Internet Age.”
- Andrew Oliver, InfoWorld
But Does My Business Need NoSQL…?
Just because a technology is seeing strong adoption in the market doesn’t mean it’s right for your business…
What is NoSQL…?
• Progressive data management engines
• Go beyond legacy relational databases
• Flexible data model
• Horizontal scalability
• Distributed architectures
• Use of languages and interfaces that are “not only” SQL
NoSQL Example – Apache Cassandra
Apache Cassandra is a massively scalable NoSQL database that offers continuous availability and easy data distribution.
NoSQL Example – Apache Cassandra
“Cassandra stands at the front of the NoSQL pack when it comes to supporting real-time, big data applications.”
– Wikibon
How Can I Tell if NoSQL Can Help Me Run My Business and Reduce Costs?
©2013 DataStax Confidential. Do not distribute without consent.
NoSQL Business Considerations
• Need scale-out (vs. scale-up)?
• Manage different types of data like social media?
• Lots of data coming in (and fast)?
• Have non-RDBMS, non-ACID transactions?
• Must keep large data volumes online?
• Continuous uptime necessary?
• Wide-scale data distribution needed?
• Need to integrate different systems?
• Cost a factor?
Need Scale-Out (vs. Scale-Up)?
No
• Application does not require multiple machines
• Can scale-up and meet the application’s current and future needs
Yes
• Application demands divide-and-conquer
• Capacity expansion is best/can only be handled via new machines
Key takeaway: If your applications can easily run on one machine, fit all your data in RAM or can easily expand via new cores/more drives to fulfill current and future requirements, you may not need NoSQL…
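One way to picture the divide-and-conquer approach in the "Yes" case above is hash partitioning: each row key is hashed and assigned to one of several machines, so data and load spread out as machines are added. The Python sketch below is illustrative only. The node names and keys are hypothetical, and the simple modulo placement is an assumption made for brevity; it is not Cassandra's actual token-ring partitioner.

```python
import hashlib
from collections import Counter


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick a node for a key by hashing it (simple modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


# Hypothetical cluster of four machines and 10,000 hypothetical row keys.
nodes = ["node-a", "node-b", "node-c", "node-d"]
keys = [f"viewer-{i}" for i in range(10_000)]

# Count how many keys land on each machine.
load = Counter(node_for_key(k, nodes) for k in keys)
for name in nodes:
    print(name, load[name])
```

Each machine ends up holding roughly a quarter of the keys, which is the property that lets capacity grow by adding machines rather than by buying a bigger one.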
NoSQL Case Study
Ooyala distributes and analyzes video content for companies like ESPN, Rolling Stone and others. They track about one quarter of all online video viewers each day and generate 1-2 billion events that are streaming in real-time through their system.
Manage Different Types of Data?
No
• No non-structured data (all or mostly rigid formats)
• E.g., no social media data
Yes
• All types of data (structured, semi-structured, and unstructured)
• Social media data
Key takeaway: If all your data systems deal with standard RDBMS structured data and that won’t be changing, then you may not need NoSQL…
NoSQL Case Study
HealthCare Anytime needs to analyze doctors’ notes and other types of difficult data to properly bill back Medicare / Medicaid.
NoSQL Case Study
“Cassandra’s NoSQL data model allows us to insert and query data much more naturally than what we had previously. The analysts who routinely use this data were impressed with the flexibility and speed at which the queries came back.”
– CSC/NASA
Lots of Data Coming In (and Fast)?
No
• No high velocity data (e.g. device, sensors, web streaming, etc.)
• No multiple locations
• Little/no concern about write speed
Yes
• High velocity, write intensive
• Multiple locations sending data
• Must consume data as quickly as possible
Key takeaway: Business applications involving rapid time series data, device ‘exhaust’, web or financial streaming data make good use cases for NoSQL…
NoSQL Case Study
Gnip takes in huge volumes of social media data at high rates of speed (e.g. 20,000 Tweets per second).
Non-RDBMS, Non-ACID Transactions?
No
• Standard RDBMS, nested, ACID transactions required
• Complex transactions (rollbacks, savepoints, etc.) needed
Yes
• “Big Data” transactions OK or are necessary
• Atomic, Isolated, Durable (AID), but eventual or tunable consistency allowed
Key takeaway: NoSQL databases do support transactions, but because they don’t support joins or foreign keys, their consistency follows the trade-offs of the CAP theorem rather than RDBMS ACID-style consistency…
NoSQL Case Study
eBay does transactions, but does not want the overhead of RDBMS ACID-type transactions.
Cassandra and Transactions
Individual or batch transactions with AID and tunable consistency.
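Tunable consistency can be illustrated with a little arithmetic. In Cassandra, each read or write names a consistency level such as ONE, QUORUM, or ALL, and a read is guaranteed to overlap the latest acknowledged write when the number of replicas written plus the number of replicas read exceeds the replication factor. The following is a minimal Python sketch of that rule. The function names are hypothetical, and only three of the available levels are modeled.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"level not modeled in this sketch: {level}")


def reads_see_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when the read set and the write set must share a replica."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return w + r > replication_factor


# With three replicas, QUORUM writes plus QUORUM reads always overlap,
# while ONE plus ONE trades that guarantee for lower latency.
print(reads_see_latest_write("QUORUM", "QUORUM", 3))
print(reads_see_latest_write("ONE", "ONE", 3))
```

Choosing lighter levels favors speed and availability, and choosing heavier ones favors consistency. That per-request choice is what "tunable" refers to.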
Must Keep Large Data Volumes Online?
No
• No application requirement to keep large volumes of data
• System typically purges data older than certain time period
Yes
• Must keep large volumes of data online and available to customers
• Retain both hot and cold data
Key takeaway: Some NoSQL databases like Cassandra can excel over typical RDBMSs when it comes to maintaining large volumes of data online and meeting stringent performance SLAs…
NoSQL Case Study
Easou is the #1 mobile search firm in China. One of their Cassandra applications stores online video images for retrieval / viewing and is 300TB in size.
Continuous Uptime Necessary?
No
• Applications have no need for constant uptime
• Unplanned downtime can be handled via traditional failover
Yes
• Applications cannot tolerate any downtime
• Standard log shipping, failover, hot backups, won’t do
Key takeaway: Some NoSQL databases like Cassandra are able to guarantee no downtime because of their architectures…
NoSQL Case Study
Netflix systems are run in the cloud across multiple availability zones with Cassandra and sport constant uptime.
NoSQL Case Study
Commenting on the Amazon outage in Oct 2012: “We configure all our clusters to use a replication factor of three, with each replica located in a different Availability Zone. This allowed Cassandra to handle the outage remarkably well. When a single zone became unavailable, we didn't need to do anything. Cassandra routed requests around the unavailable zone and when it recovered, the ring was repaired.”
- Netflix Tech Blog
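The setup Netflix describes can be sketched in a few lines of Python: three replicas, one per Availability Zone, and a check of whether a majority of them is still reachable when zones fail. The zone and replica names are hypothetical, and the majority (quorum) requirement is an assumption for illustration, since the quote does not say which consistency level Netflix used.

```python
# Hypothetical placement: replication factor of three, one replica per zone.
PLACEMENT = {
    "zone-a": "replica-1",
    "zone-b": "replica-2",
    "zone-c": "replica-3",
}


def live_replicas(placement: dict[str, str], down_zones: set[str]) -> list[str]:
    """Replicas whose zone is still reachable."""
    return [rep for zone, rep in placement.items() if zone not in down_zones]


def quorum_available(placement: dict[str, str], down_zones: set[str]) -> bool:
    """True when a strict majority of replicas can still answer requests."""
    needed = len(placement) // 2 + 1
    return len(live_replicas(placement, down_zones)) >= needed


# Losing one zone leaves two of three replicas, which is still a majority.
print(quorum_available(PLACEMENT, {"zone-a"}))
# Losing two zones leaves a single replica, which is not.
print(quorum_available(PLACEMENT, {"zone-a", "zone-b"}))
```

Under these assumptions a single-zone outage needs no operator action, which matches the experience described in the quote.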
Wide-Scale Data Distribution Needed?
No
• Application’s data needs are single site only
• No need to distribute data in other locales for any reason
Yes
• Application serves customers in multiple locations
• Data is distributed across multiple data centers / cloud zones for latency/performance or disaster recovery reasons
Key takeaway: Cassandra is the gold standard among NoSQL databases for multi-data center, data distribution use cases…
NoSQL Case Study
RightScale keeps its customers in contact with each other all over the world via Cassandra clusters in 5+ global data centers.
Need to Integrate Different Systems?
No
• Applications use siloed databases
• No need for different data systems to interact with each other
Yes
• Application has different database workloads
• Multiple data domains serve single application
Key takeaway: ETL and simple connectors oftentimes do not do the job. Instead, what’s needed is something like DataStax Enterprise, which provides one database that serves multiple database workloads…
NoSQL Case Study
Datafiniti, which is a search engine for data, needs to consume lots of data in real time and provide fast search on top of the same data.
Cost a Factor?
No
• Application is small and not cost intensive to operate
• Software license costs not a factor
Yes
• Large scale business applications
• Traditional RDBMS software costs a significant concern
Key takeaway: NoSQL database costs can often be 70-80% lower than those of legacy RDBMS software. Furthermore, large operations staffs are not required to manage NoSQL systems.
NoSQL Case Study
Constant Contact found that scaling out with NoSQL vs. an RDBMS saved them 90% in software costs, and was implemented in 1/3 the time...
What Strategies Can I Use To Implement NoSQL in my Business?
NoSQL Implementation Strategies
New
• New big data applications
• Legacy systems keep old databases
Hybrid
• NoSQL database used for heavy lifting / big data management
• Legacy RDBMS maintains smaller parts of database
Replacement
• Legacy RDBMS cannot meet demands of new or evolving big data system
• Data models and data are migrated
DataStax Enterprise – NoSQL for the Enterprise
DataStax Enterprise is a complete big data platform, built on Cassandra, that is architected to manage real-time, analytic, and enterprise search data all in the same database cluster.
What You Get With DataStax Enterprise
1. DataStax Enterprise Database Server
2. OpsCenter Enterprise Management solution
3. Expert 24x7 support
Use Cases Handled By DataStax Enterprise
Managed by Cassandra
• Time series data
• Device/Sensor/Data “exhaust” systems
• Distributed applications
• Media streaming
• Online Web retail (transactional, shopping carts, etc.)
• Real-time data analytics
• Social media capture and analysis
• Web click-stream analysis
• Write-intensive transactional systems
Managed by Hadoop
• Buyer behavior analytics
• Compliance/regulatory analysis
• Customer recommendation output
• Fraud detection
• Risk analysis
• Sales program campaign analysis
• Supply chain analytics
• Batch Web clickstream analysis
Managed by Solr
• General Web search
• Web retail faceted (categorization) search
• Search/hit prioritization and highlighting
• Application log search and analysis
• Document (PDF, MS Word, etc.) search and analysis
• Geospatial search
• Real estate location and property search
• Social media match ups
Next Steps
Download DataStax Enterprise and try it in your own environment.
• Go to www.datastax.com/download
• Download a copy of DataStax Enterprise
• Installs and configures in minutes
• Completely free for development use; subscription required for production deployments
For More Information
Thank You
We power the big data applications that transform business.