The Perfect Storm: The Impact of Analytics, Big Data and Cloud
Twitter Tag: #briefr
The Briefing Room
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
November: Cloud
December: Innovators
January: Big Data
February: Performance
March: Integration
• Historically, databases have been built around SQL, a declarative query language targeted at organizing data in two-dimensional tables
• The ever-increasing variety, volume and velocity of data have taxed traditional relational databases and created performance bottlenecks, particularly around CPU, memory, disk I/O and network saturation
• Alternatives like NoSQL and NewSQL have emerged to better support extreme and diverse workloads without suffering hits in performance
Dr. Barry Devlin is a founder of the data warehousing industry and among the foremost authorities worldwide on business intelligence (BI). He is a widely respected consultant, lecturer and author of “Data Warehouse: From Architecture to Implementation.” Barry has 30 years of experience in the IT industry, previously with IBM, as an architect, consultant, manager and software evangelist. As founder and principal of 9sight Consulting (www.9sight.com), Barry provides strategic consulting and thought leadership to buyers and vendors of BI solutions. He is currently developing Business Integrated Insight (BI2), a new architectural model for fully consistent business support, from informational to operational and collaborative. He is based in Cape Town, South Africa.
• NuoDB is an ACID-compliant NewSQL relational database management system
• It is architected to scale elastically on the cloud
• It leverages a peer-to-peer, distributed architecture
• NuoDB currently has 1000+ users in beta
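To make the "ACID-compliant" claim concrete, here is a minimal sketch of atomicity, the "A" in ACID: a multi-statement transfer either applies in full or not at all. It uses Python's built-in sqlite3 module purely as a stand-in for any ACID SQL database; it is not NuoDB, and the accounts table, the names and the transfer helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for an ACID SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically (hypothetical helper)."""
    try:
        # The connection context manager commits on success and
        # rolls back every statement in the block on an exception.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit to dst
        # made earlier in the same transaction is undone as well.
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer credits bob before the debit fails, so a database without atomic transactions would leave bob with money that never left alice's account.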
Barry is an accomplished software CEO with over 25 years of industry experience in running private and public companies around industry-changing paradigm shifts in technology. He had leadership roles at IONA Technologies, which helped lay the groundwork for modern SOA-based systems, and StreamBase Systems, a pioneer of complex event processing. Barry’s early career included technical, management and business development roles. Barry does a great deal of consulting and has served on a variety of boards for startup companies in Boston, Ireland and South Africa. He earned his degree in Engineering from New College, Oxford University, and holds an Honorary Doctorate in Business Administration from the IMCA.
Copyright © NuoDB 2012
The Elastically Scalable Database™
NuoDB
The Database for the 21st Century
NuoDB is a revolutionary database system based on a patented Emergent Architecture.
NuoDB is designed for modern datacenters, workloads and business models.
NuoDB delivers all of the capabilities and services of the 20th Century RDBMS.
NuoDB has a SQL personality but it could just as easily be a Document Database, an Object Database, a Graph Database or something else.
NuoDB Inc is building next generation capabilities that will redefine the role of databases in next generation applications.
20th Century Database
Powerful Query Language
Industry Standards
Data Guarantees
Employee Skills
Tools
Existing Data
Pie chart: Oracle 44%, IBM 21%, Microsoft 19%, Sybase 4%, Teradata 3%, Others 9%
21st Century Problem
Commodity Datacenters ✗
Big Data ✗
Modern Workloads ✗
24x7 Operation ✗
Geo-distribution ✗
Developer Empowerment ✗
Database Crisis
Wikipedia Flickr Facebook
Amazon Google
Source: Marc Bojoly
Jim Starkey
‣ DEC RDB/ELN
‣ InterBase
‣ Firebird
‣ Falcon
‣ BLOBS
‣ MVCC
‣ etc
“Elastically Scalable Transactions represent the biggest breakthrough in database technology in 25 years”
Emergent Database Architecture
“An emergent behavior can appear when a number of simple entities operate in an environment, forming more complex behaviors as a collective.” - Wikipedia
Notes
MySQL 5.1
NuoDB Beta 3 - Single Node
http://www.polepos.org
‣ Time taken for given benchmark, normalized to NuoDB = 1
‣ Less is Better
In early tests NuoDB on a single node was 2x to 20x faster than MySQL 5.1 running the industry standard Poleposition benchmarks.
Your mileage may vary.
Poleposition - Single Node
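The normalization the slide describes (time taken, scaled so that NuoDB = 1, less is better) can be sketched as below. The timings are made-up placeholders, not measurements from the slide or from Poleposition, and normalize_to_baseline is a hypothetical helper name.

```python
def normalize_to_baseline(times_s, baseline):
    """Divide every elapsed time by the baseline system's time.

    The baseline therefore scores exactly 1.0, and a score of 4.0 means
    "took four times as long as the baseline" (less is better).
    """
    base = times_s[baseline]
    if base <= 0:
        raise ValueError("baseline time must be positive")
    return {name: t / base for name, t in times_s.items()}


# Hypothetical elapsed times in seconds for one benchmark circuit.
# These are illustrative values only, NOT results from the slide.
example_times = {"NuoDB": 2.0, "MySQL 5.1": 8.0}

normalized = normalize_to_baseline(example_times, "NuoDB")
print(normalized)
```

With this scaling, the slide's "2x to 20x faster" would correspond to normalized MySQL scores between 2 and 20 across the individual benchmarks.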
Second Machine Instant Performance Increase
• Second machine typically doubles TPS
• Second machine is added to live database while it is running at 1,000’s of TPS
• Performance increase is immediate
• BTW - you can take either machine away and the database keeps running without data loss
Adding a Second Machine
Second & Third Machine Instant Performance Increase
• Third machine typically triples single machine TPS
• Third machine is added to live database while it is running at 1,000’s of TPS
• Performance increase is immediate
• BTW - you can take any machine away and the database keeps running without data loss
Adding a Third Machine
Chart: TPS vs. Number of Nodes
Technical Details:
‣ 2-9 Tx engines
‣ 1 storage manager
‣ Best sustained TPS and # clients combination
‣ 50% updates
System | Nodes | TPS
MySQL | 1 | 3,000
NuoDB | 1 | 4,500
NuoDB | 9 | 27,000
NuoDB running on 9 nodes was approx. 9x faster than MySQL running on 1 node.
More Machines? Bring ‘em On
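The arithmetic behind the "approx. 9x" figure can be checked directly from the three TPS values on the slide. The sketch below also computes a scaling efficiency relative to single-node NuoDB; that metric and the function names are additions for illustration and do not appear in the deck.

```python
# TPS figures as stated on the slide.
MYSQL_1_NODE = 3_000
NUODB_1_NODE = 4_500
NUODB_9_NODES = 27_000


def speedup(tps, baseline_tps):
    """How many times more transactions per second than the baseline."""
    return tps / baseline_tps


def scaling_efficiency(tps_n, nodes, tps_1):
    """Fraction of ideal linear scaling achieved (1.0 = perfectly linear).

    This metric is an illustrative addition, not something the slide reports.
    """
    return tps_n / (nodes * tps_1)


vs_mysql = speedup(NUODB_9_NODES, MYSQL_1_NODE)         # slide's comparison
vs_single_nuodb = speedup(NUODB_9_NODES, NUODB_1_NODE)  # NuoDB against itself
efficiency = scaling_efficiency(NUODB_9_NODES, 9, NUODB_1_NODE)

print(f"9-node NuoDB vs 1-node MySQL: {vs_mysql:.1f}x")
print(f"9-node NuoDB vs 1-node NuoDB: {vs_single_nuodb:.1f}x")
print(f"Scaling efficiency over 9 nodes: {efficiency:.0%}")
```

On the slide's own numbers, the 9x figure compares against MySQL; measured against single-node NuoDB the same configuration is 6x, which may reflect the single storage manager noted in the technical details, although the deck does not say so.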
Chart: TPS vs. Number of EC2 Nodes
‣ NuoDB scales linearly on EC2
‣ Per-node performance on m1.large nodes approx. 50% of our commodity servers
‣ Just started on optimizing
‣ RDS runs on 1 node, and gets overloaded with 10+ connections
Or Scale-out on IAAS
Squirrel SQL
MS Excel (and other MS tools)
DBVisualizer
You already know how to use NuoDB
Standard SQL - Favorite Tools
NuoDB: The Elastically Scalable Database™
Applications, Brokers, Transaction Engines, Storage Managers
NuoDB Architecture
Feature | OldSQL | NoSQL | NuoDB
20th C. Database
Powerful Query Language (SQL) | ✓ | ✗ | ✓
Industry Standards (SQL, JDBC, ODBC etc) | ✓ | ✗ | ✓
Data Guarantees (ACID Transactions) | ✓ | ✗ | ✓
Employee Skills | ✓ | ✗ | ✓
Existing Data | ✓ | ✗ | ✓
21st C. Database
On-demand Capacity | ✗ | ✓ | ✓
Commodity Datacenters / Virtualization / Cloud | ✗ | ✓ | ✓
Modern Workloads (Concurrency, TPS, Latency) | ✗ | ½ | ✓
Big Data | ½ | ✓ | ✓
100% Uptime | ✗ | ✓ | ✓
Online Maintenance, Admin and Schema Evolution | ✗ | ✓ | ✓
Geo-distribution | ✗ | ✓ | ✓
Developer Empowerment | ✗ | ✓ | ✓
Zero Touch Backup | ✗ | ✓ | ✓
“Zero” Admin | ✗ | ✓ | ✓
The 21st Century Database
Copyright © 2012 9sight Consulting, All Rights Reserved
Dr Barry Devlin, Founder & Principal
9sight Consulting
The Perfect Storm: The Impact of Analytics, Big Data and Cloud
The Briefing Room, 23 October 2012
Three key trends in business are driving rapid change.
1. Closed-loop business – strategy to execution
 – Merge operational, informational & collaborative
 – Extreme flexibility in adapting to change
2. Massive information volumes for use
 – Volumes, sources, types
3. Collaborate to innovate
 – Millennials move into power
 – Mobile users and applications
Faster, Bigger, Distributed, More flexible, More personal
Recent technology advances offer new ways to address emerging business needs.
1. Closed-loop business – strategy to execution
2. Massive information volumes for use
3. Collaborate to innovate
4. SOA, Mobile Apps and Analytics
 – Adaptive IT and design flexibility
5. Advances in “Data Processing”
 – RDBMS advances, Big Data and Cloud
6. Web / Enterprise 2.0 and beyond
 – Collaborative tools, semantic web and more
Big data is really all data
Diagram labels: Human-sourced information, (Traditional) Business Processes, Machine-generated data, Process-mediated data, Business Analytics, Flexibility, Timeliness
Three domains
§ Process-mediated data
 – “Traditional” operational & informational data
 – Via data entry & cleansing
§ Machine-generated data
 – Output of machines & sensors
 – High-speed, high-volume
 – The Internet of Things
§ Human-sourced information
 – Subjectively interpreted record of personal experiences
 – Model unknown before usage
 – From Tweets to Videos
§ See: bit.ly/Big_Data_Zoo
[In the context of these domains, “data” signifies well-structured and/or modeled and “information” is more loosely structured and human-centric.]
Technology drives and dictates progress
§ Vast improvements in price-performance for memory
 – Critical data for most businesses can fit in main memory
 – Traditional database design is disk-centric
   – Commit means on disk
   – Disk I/O bottleneck is a key design point
§ Single processors cannot go any faster; the move to multi-core / multi-processing has been ongoing for over 5 years
 – Traditional programming is single-CPU-centric
 – MPP – from specialized / high-cost to wide-spread / low-cost
§ Physical data representation back at the forefront
 – Row store vs column store vs key-value store
 – Compression ratios
 – Are column stores slow for update?
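The three physical representations named above can be sketched with plain Python containers. This is a toy illustration under simplifying assumptions (in-memory lists and dicts, a made-up three-record data set), not how any particular product lays out data.

```python
# A tiny made-up data set; field names and values are illustrative only.
records = [
    {"id": 1, "region": "EU", "amount": 10},
    {"id": 2, "region": "US", "amount": 25},
    {"id": 3, "region": "EU", "amount": 5},
]

# Row store: all fields of one record sit together.
row_store = [tuple(r.values()) for r in records]

# Column store: all values of one field sit together.
column_store = {col: [r[col] for r in records] for col in records[0]}

# Key-value store: a key maps to an opaque value (here, the whole record).
kv_store = {r["id"]: r for r in records}

# An aggregate over one field reads a single contiguous column...
total_amount = sum(column_store["amount"])


def insert_record(store, record):
    """Append one record to a column store.

    Every column must be touched, which is one reason the slide asks
    whether column stores are slow for update.
    """
    for column, value in record.items():
        store[column].append(value)


# ...whereas inserting one record writes to every column.
insert_record(column_store, {"id": 4, "region": "US", "amount": 7})
print(total_amount, row_store[1], kv_store[3]["region"])
```

The same four-field insert is a single append in the row store and a single key assignment in the key-value store, which is the trade-off the slide's question points at.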
Database – Innovation and evolution
§ Hierarchical & Network
 – Speed of record update and access
 – Physical storage optimization
§ Relational
 – A logical model of data’s relationships to “reality”
 – Predefined model
§ “Post-relational”
 – Flexibility
 – Scalability
Chart: Features / Performance over time (1960 to 2020) for Hierarchical & Network, Relational, “Post-relational”, Next wave? and Niche?, annotated with Cumulative progress, Sustaining change and Disruptive change
Clayton M. Christensen, “The Innovator’s Dilemma”, 1997
The emerging biz-tech ecosystem
§ Fully symbiotic existence of business and IT
1. Interdependence
 – New technology enables new business possibilities; new business opportunities drive technology advances
2. Reintegration
 – Silos in business and IT deter Web-savvy customers; coherence becomes mandatory
3. Cross-over
 – Business people need IT skills to see how to recreate the business with new technology; IT people need business acumen to see how to satisfy business needs in new ways with emerging technology
Questions (1)
1. You emphasize the object-oriented / distributed / message-oriented nature of NuoDB as well as in-memory operation. With improving memory price-performance and the possibility that many businesses will be able to fit all business-critical data in memory, why do you need both?
2. It seems that disk storage is replaced first by distributed computer storage, and then “failback” to disk. Are you replacing disk I/O latencies with network latencies? How is this an advantage?
3. As an in-memory database, how do you position NuoDB vs. SAP HANA?
4. With advances in memory, MPP, columnar stores, etc., I see the possible end of the old operational vs informational split. What is your view? Where does NuoDB fit in that scenario?
5. Big data – what do you mean by the term? On which aspects of big data does NuoDB focus?
Questions (2)
6. “NoSQL” databases emphasize flexibility to changing data structures mainly by exposing a key-value store to applications. Is that why you use a KV store? How do you benefit from the KV store as it is “locked behind” the relational model?
7. The query optimizer is perhaps the key to database performance. For most new DBs, it has proven to be a long road to build an optimized optimizer – how will NuoDB address this?
8. In your white paper you say “database designers don’t need to compromise on schema design by de-normalizing tables, removing joins” for performance… sounds like magic. Why not?
9. You support indexing. Why do you need it / use it in an in-memory database?
10. You put Multiversion Concurrency Control (MVCC) forward as the solution to ACID requirements. Do you always insert rather than update?
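For readers unfamiliar with the idea behind question 10, here is a minimal sketch of multiversion concurrency control: a write appends a new version instead of overwriting, and a reader sees the newest version no later than its snapshot. It is a teaching toy under strong simplifications (every write commits immediately, no conflict detection, no garbage collection) and makes no claim about how NuoDB implements MVCC; VersionedStore and its methods are hypothetical names.

```python
class VersionedStore:
    """Toy multiversion key-value store (illustrative, not NuoDB's design)."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical commit timestamp

    def write(self, key, value):
        """Append a new version; earlier versions are never modified."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Return a timestamp that fixes what a reader is allowed to see."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None


store = VersionedStore()
store.write("balance", 100)
reader_snapshot = store.snapshot()  # a reader begins here
store.write("balance", 80)          # a later writer appends a new version

old_view = store.read("balance", reader_snapshot)
new_view = store.read("balance", store.snapshot())
print(old_view, new_view)
```

Because the earlier version is still present, the reader that started before the second write keeps a consistent view without blocking the writer, which is the property the question is probing.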
Barry Devlin
Founder and Principal 9sight Consulting, www.9sight.com
Dr. Barry Devlin is a founder of the data warehousing industry and among the foremost authorities worldwide on business intelligence (BI) and beyond. He is a widely respected consultant, lecturer and author of “Data Warehouse—from Architecture to Implementation”. Barry has 30 years of experience in the IT industry, previously with IBM, as an architect, consultant, manager and software evangelist. As founder and principal of 9sight Consulting (www.9sight.com), Barry provides strategic consulting and thought-leadership to buyers and vendors of BI solutions. He is currently developing a new architectural model for fully consistent business support—from informational to operational and collaborative—Business Integrated Insight (BI2). Based in Cape Town, South Africa, Barry’s knowledge and expertise are in demand both locally and internationally.
Email: [email protected] Twitter: @BarryDevlin
This Month: Database
November: Cloud
December: Innovators
January: Big Data
2013 Editorial Calendar (www.insideanalysis.com)