The Perfect Storm: The Impact of Analytics, Big Data and Cloud


The Briefing Room with Barry Devlin and NuoDB
Live Webcast on Oct. 23, 2012

Three major factors in enterprise computing are combining to rewrite how data is stored, accessed and managed: 1) the demand of analytics that now spreads across hundreds, even thousands of users; 2) the pervasiveness of Big Data in all its forms and sizes; and 3) the rise of the commodity data center, aka Cloud computing. The convergence of these forces calls for a new data foundation, one that can handle the scalability and workload issues that face today's information managers.

Check out this episode of The Briefing Room to learn from veteran Analyst Barry Devlin, one of the very first architects of data warehousing, who will explain how today's information architectures require a radically different approach. He'll be briefed by Barry Morris, Founder and CEO of NuoDB, who will tout his company's product, described as a peer-to-peer messaging system that acts as a database. It behaves just like a traditional relational database, but was designed with a completely distributed and scalable architecture.

http://www.insideanalysis.com

Transcript of The Perfect Storm: The Impact of Analytics, Big Data and Cloud


Twitter Tag: #briefr

The Briefing Room



• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!


November: Cloud

December: Innovators

January: Big Data

February: Performance

March: Integration


• Historically, databases have been built around SQL, a declarative query language targeted at organizing data in two-dimensional tables
• The ever increasing variety, volume and velocity of data has taxed traditional relational databases and created performance bottlenecks, particularly around CPU, memory, disk I/O and network saturation
• Alternatives like NoSQL and NewSQL have emerged to better support extreme and diverse workloads without suffering hits in performance


Dr. Barry Devlin is a founder of the data warehousing industry and among the foremost authorities worldwide on business intelligence (BI). He is a widely respected consultant, lecturer and author of “Data Warehouse: From Architecture to Implementation.” Barry has 30 years of experience in the IT industry, previously with IBM, as an architect, consultant, manager and software evangelist. As founder and principal of 9sight Consulting (www.9sight.com), Barry provides strategic consulting and thought leadership to buyers and vendors of BI solutions. He is currently developing Business Integrated Insight (BI2), a new architectural model for fully consistent business support, from informational to operational and collaborative. He is based in Cape Town, South Africa.


• NuoDB is an ACID-compliant NewSQL relational database management system
    • It is architected to scale elastically on the cloud
    • It leverages a peer-to-peer, distributed architecture
• NuoDB currently has 1000+ users in beta


Barry Morris is an accomplished software CEO with over 25 years of industry experience in running private and public companies around industry-changing paradigm shifts in technology. He had leadership roles at IONA Technologies, which helped lay the groundwork for modern SOA-based systems, and StreamBase Systems, a pioneer of complex event processing. Barry’s early career included technical, management and business development roles. Barry does a great deal of consulting and has served on a variety of boards for startup companies in Boston, Ireland and South Africa. He earned his Degree in Engineering from New College, Oxford University and holds an Honorary Doctorate in Business Administration from the IMCA.

Copyright © NuoDB 2012

The Elastically Scalable Database™

NuoDB

The Database for the 21st Century

NuoDB is a revolutionary database system based on a patented Emergent Architecture.

NuoDB is designed for modern datacenters, workloads and business models.

NuoDB delivers all of the capabilities and services of the 20th Century RDBMS.

NuoDB has a SQL personality but it could just as easily be a Document Database, an Object Database, a Graph Database or something else.

NuoDB Inc is building next generation capabilities that will redefine the role of databases in next generation applications.

20th Century Database

Powerful Query Language

Industry Standards

Data Guarantees

Employee Skills

Tools

Existing Data

[Chart: Oracle 44%, IBM 21%, Microsoft 19%, Sybase 4%, Teradata 3%, Others 9%]

21st Century Problem

Powerful Query Language

Industry Standards

Data Guarantees

Employee Skills

Tools

Existing Data

[Chart: Oracle 44%, IBM 21%, Microsoft 19%, Sybase 4%, Teradata 3%, Others 9%]

Commodity Datacenters ✗

Big Data ✗

Modern Workloads ✗

24x7 Operation ✗

Geo-distribution ✗

Developer Empowerment ✗

Database Crisis

Wikipedia Flickr Facebook

Amazon Google

Source: Marc Bojoly

Jim Starkey

‣ DEC RDB/ELN

‣ InterBase

‣ Firebird

‣ Falcon

‣ BLOBS

‣ MVCC

‣ etc

“Elastically Scalable Transactions represent the biggest breakthrough in database technology in 25 years”

Emergent Database Architecture

“An emergent behavior can appear when a number of simple entities operate in an environment, forming more complex behaviors as a collective.”

- Wikipedia

Poleposition - Single Node

[Chart series: MySQL 5.1; NuoDB Beta 3 - Single Node]

Notes

‣ Time taken for given benchmark, normalized to NuoDB = 1

‣ Less is Better

In early tests NuoDB on a single node was 2x to 20x faster than MySQL 5.1 running the industry standard Poleposition Benchmarks.

Your mileage may vary.

http://www.polepos.org

Adding a Second Machine

Second Machine Instant Performance Increase

• Second machine typically doubles TPS

• Second machine is added to live database while it is running at 1,000’s of TPS

• Performance increase is immediate

• BTW - you can take either machine away and the database keeps running without data loss

Adding a Third Machine

Second & Third Machine Instant Performance Increase

• Third machine typically triples single machine TPS

• Third machine is added to live database while it is running at 1,000’s of TPS

• Performance increase is immediate

• BTW - you can take any machine away and the database keeps running without data loss

More Machines? Bring ‘em On

[Chart: TPS (0 to 30,000) vs. Number of Nodes (1 to 9)]

Technical Details:

‣ 2-9 Tx engines

‣ 1 storage manager

‣ Best sustained TPS and # clients combination

‣ 50% updates

Database   Nodes   TPS
MySQL      1       3,000
NuoDB      1       4,500
NuoDB      9       27,000

NuoDB running on 9 nodes was approx. 9x faster than MySQL running on 1 node.
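The TPS figures quoted on this slide support two different ratios, and it can help to keep them apart. The Python sketch below is illustrative arithmetic only: the names are made up for this example, and the slide does not say exactly how a "node" is counted (the test ran 2-9 Tx engines plus 1 storage manager), so the scaling-efficiency figure assumes nine comparable nodes and is a reading of the slide, not a published result.

```python
# TPS figures as quoted on the slide.
tps = {
    ("MySQL", 1): 3_000,
    ("NuoDB", 1): 4_500,
    ("NuoDB", 9): 27_000,
}

def speedup(target, baseline):
    """Ratio of two throughput figures from the slide's table."""
    return tps[target] / tps[baseline]

# The slide's headline comparison: 9-node NuoDB against 1-node MySQL.
vs_mysql = speedup(("NuoDB", 9), ("MySQL", 1))

# Like-for-like comparison: 9-node NuoDB against 1-node NuoDB.
vs_single_nuodb = speedup(("NuoDB", 9), ("NuoDB", 1))

# Fraction of ideal linear scaling, ASSUMING nine comparable nodes.
efficiency = vs_single_nuodb / 9

print(f"9-node NuoDB vs 1-node MySQL: {vs_mysql:.1f}x")
print(f"9-node NuoDB vs 1-node NuoDB: {vs_single_nuodb:.1f}x")
print(f"Scaling efficiency over 9 nodes: {efficiency:.0%}")
```

The "approx. 9x" headline is measured against MySQL; measured against single-node NuoDB, the same numbers give a smaller ratio.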

Or Scale-out on IAAS

[Chart: TPS (0 to 12,000) vs. Number of EC2 Nodes (1 to 9)]

‣ NuoDB scales linearly on EC2

‣ Per-node performance on m1.large nodes approx 50% of our commodity servers

‣ Just started on optimizing

‣ RDS runs on 1 node, and gets overloaded with 10+ connections

Standard SQL - Favorite Tools

You already know how to use NuoDB

Squirrel SQL

MS Excel (and other MS tools)

DBVisualizer

NuoDB
The Elastically Scalable Database™

[Diagram labels: Applications, Brokers, Transaction Engines, Storage Managers]

NuoDB Architecture

The 21st Century Database

                                                    OldSQL   NoSQL   NuoDB
20th C. Database
  Powerful Query Language (SQL)                     ✓        ✗       ✓
  Industry Standards (SQL, JDBC, ODBC etc)          ✓        ✗       ✓
  Data Guarantees (ACID Transactions)               ✓        ✗       ✓
  Employee Skills                                   ✓        ✗       ✓
  Existing Data                                     ✓        ✗       ✓
21st C. Database
  On-demand Capacity                                ✗        ✓       ✓
  Commodity Datacenters / Virtualization / Cloud    ✗        ✓       ✓
  Modern Workloads (Concurrency, TPS, Latency)      ✗        ½       ✓
  Big Data                                          ½        ✓       ✓
  100% Uptime                                       ✗        ✓       ✓
  Online Maintenance, Admin and Schema Evolution    ✗        ✓       ✓
  Geo-distribution                                  ✗        ✓       ✓
  Developer Empowerment                             ✗        ✓       ✓
  Zero Touch Backup                                 ✗        ✓       ✓
  “Zero” Admin                                      ✗        ✓       ✓




Copyright © 2012 9sight Consulting, All Rights Reserved

Dr Barry Devlin Founder & Principal

9sight Consulting

The Perfect Storm: The Impact of Analytics, Big Data and Cloud
The Briefing Room, 23 October 2012

Three key trends in business are driving rapid change.

1.  Closed-loop business – strategy to execution
    –  Merge operational, informational & collaborative
    –  Extreme flexibility in adapting to change

2.  Massive information volumes for use
    –  Volumes, sources, types

3.  Collaborate to innovate
    –  Millennials move into power
    –  Mobile users and applications

[Callouts: Faster; Bigger; Distributed; More flexible; More personal]

Copyright © 2010-12 9sight Consulting

Recent technology advances offer new ways to address emerging business needs.

1.  Closed-loop business – strategy to execution

2.  Massive information volumes for use

3.  Collaborate to innovate

4.  SOA, Mobile Apps and Analytics
    –  Adaptive IT and design flexibility

5.  Advances in “Data Processing”
    –  RDBMS advances, Big Data and Cloud

6.  Web / Enterprise 2.0 and beyond
    –  Collaborative tools, semantic web and more

Big data is really all data

[Figure labels: Human-sourced information; (Traditional) Business Processes; Machine-generated data; Process-mediated data; Business Analytics. Axes: Flexibility, Timeliness]

Three domains

§  Process-mediated data
    –  “Traditional” operational & informational data
    –  Via data entry & cleansing

§  Machine-generated data
    –  Output of machines & sensors
    –  High-speed, high-volume
    –  The Internet of Things

§  Human-sourced information
    –  Subjectively interpreted record of personal experiences
    –  Model unknown before usage
    –  From Tweets to Videos

§  See: bit.ly/Big_Data_Zoo

[In the context of these domains, “data” signifies well-structured and/or modeled and “information” is more loosely structured and human-centric.]

Technology drives and dictates progress

§  Vast improvements in price-performance for memory
    –  Critical data for most businesses can fit in main memory
    –  Traditional database design is disk-centric
        –  Commit means on disk
        –  Disk I/O bottleneck is a key design point

§  Single processors cannot go any faster; the move to multi-core / multi-processing has been ongoing for over 5 years
    –  Traditional programming is single-CPU-centric
    –  MPP – from specialized / high-cost to wide-spread / low-cost

§  Physical data representation back at the forefront
    –  Row store vs column store vs key-value store
    –  Compression ratios
    –  Are column stores slow for update?
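The row store vs column store contrast on this slide can be made concrete with a toy example. The Python sketch below is a generic illustration, not any product's storage format; the field names and helper functions are hypothetical. It lays out the same three records both ways and shows why a single-field scan favors columns while a single-record insert touches every column.

```python
# The same three records, laid out two ways (field names are made up).
row_store = [
    {"id": 1, "region": "EU", "amount": 120},
    {"id": 2, "region": "US", "amount": 80},
    {"id": 3, "region": "EU", "amount": 200},
]

column_store = {
    "id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [120, 80, 200],
}

def total_amount_rows(rows):
    # Analytic scan over rows: every whole record is visited to read one field.
    return sum(record["amount"] for record in rows)

def total_amount_columns(cols):
    # Analytic scan over columns: only the one relevant column is read.
    return sum(cols["amount"])

def insert_row(rows, record):
    # Row layout: a new record is a single append.
    rows.append(record)

def insert_column(cols, record):
    # Column layout: a new record means one append per column.
    for name, values in cols.items():
        values.append(record[name])

new_record = {"id": 4, "region": "US", "amount": 50}
insert_row(row_store, new_record)
insert_column(column_store, new_record)

print(total_amount_rows(row_store), total_amount_columns(column_store))
```

Real column stores typically batch or buffer writes to soften the per-column cost, which is one reason the slide poses "slow for update?" as a question rather than a conclusion.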

Database – Innovation and evolution

§  Hierarchical & Network
    –  Speed of record update and access
    –  Physical storage optimization

§  Relational
    –  A logical model of data’s relationships to “reality”
    –  Predefined model

§  “Post-relational”
    –  Flexibility
    –  Scalability

[Figure: Features / Performance vs. year (1960 to 2020). Labels: Hierarchical & Network; Relational; “Post-relational”; Next wave?; Niche?; Cumulative progress; Sustaining change; Disruptive change]

Clayton M. Christensen, “The Innovator’s Dilemma”, 1997

The emerging biz-tech ecosystem

§  Fully symbiotic existence of business and IT

1.  Interdependence
    –  New technology enables new business possibilities; new business opportunities drive technology advances

2.  Reintegration
    –  Silos in business and IT deter Web-savvy customers; coherence becomes mandatory

3.  Cross-over
    –  Business people need IT skills to see how to recreate the business with new technology; IT people need business acumen to see how to satisfy business needs in new ways with emerging technology


Questions (1)

1.  You emphasize the object-oriented / distributed / message-oriented nature of NuoDB as well as in-memory operation. With improving memory price-performance and the possibility that many businesses will be able to fit all business-critical data in memory, why do you need both?

2.  It seems that disk storage is replaced first by distributed computer storage, and then “failback” to disk. Are you replacing disk I/O latencies with network latencies? How is this an advantage?

3.  As an in-memory database, how do you position NuoDB vs. SAP HANA?

4.  With advances in memory, MPP, columnar stores, etc., I see the possible end of the old operational vs informational split. What is your view? Where does NuoDB fit in that scenario?

5.  Big data – what do you mean by the term? On which aspects of big data does NuoDB focus?



Questions (2)

6.  “NoSQL” databases emphasize flexibility to changing data structures mainly by exposing a key-value store to applications. Is that why you use a KV store? How do you benefit from the KV store as it is “locked behind” the relational model?

7.  The query optimizer is perhaps the key to database performance. For most new DBs, it has proven to be a long road to build an optimized optimizer – how will NuoDB address this?

8.  In your white paper you say “database designers don’t need to compromise on schema design by de-normalizing tables, removing joins” for performance… sounds like magic. Why not?

9.  You support indexing. Why do you need it / use it in an in-memory database?

10.  You put Multiversion Concurrency Control (MVCC) forward as the solution to ACID requirements. Do you always insert rather than update?
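Question 10 turns on how MVCC treats an update. As general background, and not as a description of NuoDB's internals (which these slides do not detail), the Python sketch below shows the textbook idea: an update appends a new version instead of overwriting, and a reader sees the newest version committed at or before its snapshot. The class and method names are hypothetical, and the sketch omits write conflicts, aborts and garbage collection of old versions.

```python
import itertools

class VersionedStore:
    """Toy multiversion store: an update appends a version, never overwrites."""

    def __init__(self):
        self._versions = {}                # key -> list of (commit_ts, value), oldest first
        self._clock = itertools.count(1)   # monotonically increasing commit timestamps
        self._latest_ts = 0

    def write(self, key, value):
        # Every write, including an "update", is stored as a new version.
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._latest_ts = ts
        return ts

    def snapshot(self):
        # Timestamp a reader holds on to for a stable view of the data.
        return self._latest_ts

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None

store = VersionedStore()
store.write("balance", 100)
old_snapshot = store.snapshot()
store.write("balance", 70)   # appended as a second version

print(store.read("balance", old_snapshot))      # reader with the old snapshot
print(store.read("balance", store.snapshot()))  # reader with a fresh snapshot
```

In this toy model the answer to "do you always insert rather than update?" is yes, which is why readers never block writers; what a real engine does with old versions is a separate design decision.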



Barry Devlin

Founder and Principal, 9sight Consulting, www.9sight.com

Dr. Barry Devlin is a founder of the data warehousing industry and among the foremost authorities worldwide on business intelligence (BI) and beyond. He is a widely respected consultant, lecturer and author of “Data Warehouse—from Architecture to Implementation”. Barry has 30 years of experience in the IT industry, previously with IBM, as an architect, consultant, manager and software evangelist. As founder and principal of 9sight Consulting (www.9sight.com), Barry provides strategic consulting and thought-leadership to buyers and vendors of BI solutions. He is currently developing a new architectural model for fully consistent business support—from informational to operational and collaborative—Business Integrated Insight (BI2). Based in Cape Town, South Africa, Barry’s knowledge and expertise are in demand both locally and internationally.

Twitter: @BarryDevlin


This Month: Database

November: Cloud

December: Innovators

January: Big Data

2013 Editorial Calendar (www.insideanalysis.com)
