Understanding Data Skew in Business Applications with Oracle's Flash-Optimized SAN Storage Chris...


Understanding Data Skew in Business Applications with Oracle's Flash-Optimized SAN Storage

Chris Wood
Director of product management
Flash Storage LOB
October 2014

Oracle Confidential – Internal/Restricted/Highly Restricted
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.


Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Program Agenda

1. Data access patterns in common business applications
2. Lessons learned
3. Data Skew and the Law of Diminishing Returns
4. Dealing with Auto-tiering vs. capacity, IOPS & deduplication claims
5. Business Cycle Curves: Autotiering Ramifications
6. Business Summary


Data Access Patterns

The real challenge in sizing storage today


Data Skew

• Most data is not hot. In fact, about 95% is cold and does not need a lot of IOPS.

• Now, with flash storage, we can put hot data on expensive flash and cold data on inexpensive capacity media.

• The problem is that the hot data is constantly changing.
  – As hot data ages it typically cools, but in some cases it heats up again.
  – The trick is to size the total amount of flash you will need, then figure out which data is hot and place it on flash, and remove it when it cools.

• Then factor all decisions by the business value of the data

The new important metric in storage


Application Workload Skew Example

SPC-1 Workload (OLTP Benchmark)

Figure: cumulative percent of accesses in flash versus percent of total capacity for the application, comparing the SPC-1 workload with a pure random workload (skew = 0%). Annotation: 5.5% of data gets 90.6% of IOs.


OLTP Transactional Data (E-Business Suite)

• Data is hot during the transaction,
• warm during the billing month (shipping, restocking, store dailies, etc.), and
• cold (read only) until monthly/quarterly/annual reports are run.
  – After this, the data is stone cold, but you can’t just delete it, so we need to store it on the least expensive media possible.

• Data is hot or warm for about 30 days (the monthly billing period), so over a year only 8.3% of your data is hot. This is probably an overestimate.
  – Given that the average billing latency is 15 days, a better number might be 4.15%, or something in between these two numbers.

The most common use case.
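The 8.3% and 4.15% figures above follow from simple proportions. The sketch below reproduces them under two assumptions that are not stated on the slide: data arrives at a constant rate, and a year is treated as twelve 30-day months. The function name `hot_fraction` and its parameters are illustrative only.

```python
def hot_fraction(hot_days: float, retained_months: int = 12, days_per_month: int = 30) -> float:
    """Share of retained data that is still hot.

    Assumes data arrives at a constant rate and every record stays hot for
    `hot_days` after creation (a simplification, not a measured profile).
    """
    return hot_days / (retained_months * days_per_month)


upper = hot_fraction(30)  # hot for the whole 30-day billing period
lower = hot_fraction(15)  # hot for the average 15-day billing latency
print(f"upper bound: {upper:.2%}, lower bound: {lower:.2%}")
```

This prints 8.33% and 4.17%. The slide's 4.15% is simply half of the rounded 8.3%. Passing a larger `retained_months` shows how the hot share shrinks as more history is kept.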


Financial Services: Demand Deposit Accounting (FlexCube)

• Data is hot from transaction initiation through daily posting and balancing, and warm through monthly statement and billing runs and payment processing.
  – 15-30 days hot to very warm, 30-45 days warm, >45 days cold.
  – Somewhat similar to pure OLTP-type transactions.

• We’re stuck, once again, with an inability to just delete this data. 7-year retention is a regulatory minimum.

• Summary: 4.15% to 8.3% of the data is read/write hot, while about 2-3X that is sequential-read hot to pick up quarterly reporting.
  – We may want to consider different flash media types here…

Checks, ATM transactions, PayPal etc.


Telco

• Make a call, generate a CDR, and store it for an average of 15 days until the next billing cycle.

• Remediate the CDRs so all parties to the call get paid their percentage.
• Sort the CDRs by phone number for billing and archival purposes.
• Occasionally extract CDRs for some government agencies. Then archive them.
• Again, with a typical 30-day (average 15-day latency) billing and remediation payment cycle, 4.15% to 8.3% of the data is hot in a read/write mode. Almost all access is read thereafter.

• There is a pattern developing here.

Call Detail Records (CDR’s) remediation and billing


SPC-1 Benchmark

• If our prior examples are correct, then the SPC-1 Data Skew (Ratio of hot data to cold data) should be about the same.

• Remember?

Designed to benchmark a generic OLTP transactional workload

The SPC-1 skew ratio is: 91.6% of the IOs are accessing only 5.5% of the data.

So, I think we’re pretty close here!


Lessons learned


• Hot data tends to be in the 4-6% range based on a year’s worth of data.
  – If longer periods are modeled, the hot percentage will go down. A little flash goes a long way!

• Hot data is generally read/write, and becomes sequential read only as it cools.

• When thinking about auto-tiering with flash:
  – eMLC flash is great at reading and not so good at writing, but it costs a lot less than SLC flash.
  – SLC flash is good at both, but costs more.
  – eMLC flash also has less write endurance than SLC, so it should not be used in write-heavy environments.

There’s more to storage than IOPS and capacity

We probably should be thinking about multiple flash tiers


Data Skew and Diminishing Returns


Data Skew and the Law of Diminishing Returns

• I need 25,000 IOPS and I have 50 TB of data.
  – How much flash should I buy?
    • Coming up in the next few slides.
  – Is there a “Flash Analysis Utility”?
    • No, excepting hocus-pocus marketing web tools. Useless.
  – When does my data transition from read/write to read?
    • Generally after 30 days. Please see the detailed white paper at http://TBD.location.com
  – Perhaps I should put all my data on flash, as the All Flash Array vendors suggest?
    • With 95% of your data generally cold, why would you do this?
    • And they claim super-duper effective deduplication, so the $/TB is equal to HDDs! Really? Stay tuned.


When to stop buying flash! A Rule of Thumb: Sum of the % of IOPS and % of capacity = 1

Diminishing Returns

Figure: IOPS (%) plotted against capacity (%), both from 0 to 100, split at the diminishing return point into a region of high marginal value and a region of diminishing marginal value. Annotations: 0% + 0% = 0; 85% of IOPS + 15% of capacity = 1; 100% + 100% = 2.

A Little Flash Goes a Long Way!
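The rule of thumb can be applied to any measured skew curve by finding where the capacity fraction and the cumulative IOPS fraction sum to 1. The following is a minimal sketch under stated assumptions: the curve points are hypothetical, not taken from the chart, and the function name `diminishing_return_point` is illustrative. It interpolates linearly between points.

```python
from typing import List, Tuple


def diminishing_return_point(curve: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (capacity_fraction, iops_fraction) where the two sum to 1.

    `curve` holds (capacity_fraction, cumulative_iops_fraction) points with
    strictly increasing capacity, running from (0, 0) to (1, 1).
    """
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        s0, s1 = x0 + y0, x1 + y1
        if s0 <= 1.0 <= s1:
            # Linear interpolation inside the segment that crosses a sum of 1.
            t = (1.0 - s0) / (s1 - s0)
            return x0 + t * (x1 - x0), y0 + t * (y1 - y0)
    raise ValueError("curve never reaches a sum of 1")


# Hypothetical skew curve for illustration only.
sample_curve = [(0.0, 0.0), (0.05, 0.65), (0.10, 0.78), (0.20, 0.88), (0.50, 0.97), (1.0, 1.0)]
capacity, iops = diminishing_return_point(sample_curve)
print(f"stop at about {capacity:.0%} of capacity, serving {iops:.0%} of IOPS")
```

For this sample curve the crossing lies in the segment between 10% and 20% of capacity. A steeper (more skewed) curve moves the point toward less flash.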


Lessons learned 2

• Don’t overbuy flash.
  – Start with 4-6% of your application data’s total capacity.
  – Add some margin for expected data growth.
  – You can always add more later. Overbuying follows the law of diminishing returns.

• The Oracle FS1 can model uninstalled flash to tell you how it would use it if it were installed.

• Look for special cases:
  – VDI boot storms: pin golden images to flash. Just add them up, and then you know.
  – Data warehouse/business intelligence analytics
    • Consider Exadata + Exalytics. Otherwise consider server-side flash with Oracle 12c & ADO.
  – Certain simulation data are all “hot” (Fluent, Crash, NASTRAN): pin data in flash.
    • Input and output are sequential and do not require flash. Flash for scratch only.
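The starting-point rule in the first bullet (4-6% of capacity plus a growth margin) can be sketched for the 50 TB example used earlier in the deck. The 20% growth margin is an assumption chosen for illustration, since the slides do not give a number, and `starting_flash_tb` is a hypothetical helper name.

```python
def starting_flash_tb(total_tb: float, hot_fraction: float, growth_margin: float) -> float:
    """Flash to buy up front: hot share of capacity, padded for expected growth."""
    return total_tb * hot_fraction * (1.0 + growth_margin)


TOTAL_TB = 50          # from the deck's "50 TB of data" example
GROWTH_MARGIN = 0.20   # assumed value, not from the slides

low = starting_flash_tb(TOTAL_TB, 0.04, GROWTH_MARGIN)
high = starting_flash_tb(TOTAL_TB, 0.06, GROWTH_MARGIN)
print(f"initial flash purchase: {low:.1f} TB to {high:.1f} TB")
```

Pinned special cases such as golden images or simulation scratch space would be added on top of this range.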


Dealing with Auto-tiering vs. capacity, IOPS & deduplication claims


All Flash Arrays (AFAs) vs. Hybrid Arrays

Deduplication to the rescue, or not. First a word about media costs

Figure: $/IOP versus $/GB for four media types (Cap HDD, Perf HDD, Cap SSD, Perf SSD). $/IOP data labels: 10.00, 3.00, 0.31, 0.13. Callout: 30X $/GB, 27X $/IOP. As of January 2014, list prices, approximate net values.

You cannot find a better technology than Flash if you need performance.

You cannot afford Flash if you don’t need the performance.


Now to higher Math…

• Facts:
  – Capacity storage is 30X less expensive than SLC flash, and 16.5X less expensive than eMLC flash(1).

• 5X deduplication leads to 5X lower storage costs.
  – But it’s still 6X less cost effective with SLC flash and 3X+ less cost effective compared to eMLC read-optimized flash. And, hopefully, you really will get 5X deduplication.

• The math looks pretty simple: Would you prefer 100% of your data on 5X deduped flash, or 95% of your data on 30X more cost effective media and 5% at “list price”?

• Let’s take a look at common applications. Is there a lot of duplicated data?

(1) Current commercial OEM vendor prices as of 10/1/2014
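The 6X and 3X+ figures are the media price ratios divided by the dedupe ratio. The sketch below reproduces them and also compares the two layouts from the third bullet, with flash list price normalized to 1.0 per unit of capacity. The function names are illustrative, and the comparison deliberately ignores everything except media cost.

```python
def residual_cost_multiple(price_ratio: float, dedupe_ratio: float) -> float:
    """How many times more deduped flash still costs than capacity HDD."""
    return price_ratio / dedupe_ratio


def hybrid_relative_cost(hot_fraction: float, price_ratio: float) -> float:
    """Hybrid media cost relative to all data on flash at list price (= 1.0)."""
    return hot_fraction * 1.0 + (1.0 - hot_fraction) / price_ratio


slc_multiple = residual_cost_multiple(30.0, 5.0)    # SLC vs. capacity HDD
emlc_multiple = residual_cost_multiple(16.5, 5.0)   # eMLC vs. capacity HDD

all_flash_deduped = 1.0 / 5.0                       # 100% on flash with 5X dedupe
hybrid = hybrid_relative_cost(0.05, 30.0)           # 5% at list price, 95% on HDD

print(f"SLC: {slc_multiple:.1f}X, eMLC: {emlc_multiple:.1f}X")
print(f"all-flash deduped: {all_flash_deduped:.3f}, hybrid: {hybrid:.3f}")
```

Even granting the full 5X dedupe claim, the hybrid layout comes out at well under half the media cost in this simplified model.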


Now, Let’s look at 5X Deduplication: Real or not?

• All Flash Arrays typically use 4K fixed-window deduplication.
  – Each 4K data block is hashed and duplicates are replaced with a (smaller) pointer to one master copy of the data block.
  – A few questions:
    • Telco: How many Call Detail Records are identical? Answer: None
    • eStore: How many transactions are identical, including time and date stamps? Answer: None
    • Banking: How many ATM transactions are identical? Answer: None
    • How many backups are almost identical? Answer: A lot. But then we don’t store backups on All Flash Arrays and we don’t use fixed-window dedupe, do we?
    • Email: How many emails are identical when huge alias distributions are abused? Answer: A lot
      – But most modern email servers already dedupe attachments. They just substitute pointers to one copy.
    • My statement: You are probably not going to get 5X deduplication; maybe 2-3X, maybe none.

Final thought: Oracle databases do not dedupe. Sorry, guys.
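The mechanism described above can be illustrated with a small model of fixed-window deduplication. This is a sketch, not any vendor's implementation: the SHA-256 hash, the 64-byte record layout, and the helper names are all assumptions made for the example. It shows that records carrying a unique sequence or timestamp field do not dedupe, and that a second copy of the same data stops matching once it is shifted off the 4K boundary.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K fixed window


def dedupe_ratio(data: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Logical block count divided by unique block count."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return len(blocks) / len(unique)


def make_records(count: int) -> bytes:
    """Hypothetical 64-byte transaction rows that differ only in a sequence field."""
    rows = (f"{i:020d}|ACCT-0001|DEBIT|100.00".ljust(64) for i in range(count))
    return "".join(rows).encode("ascii")


transactions = make_records(64 * 100)           # exactly 100 blocks of unique rows
identical_blocks = (b"A" * BLOCK_SIZE) * 100    # 100 copies of one block

print(dedupe_ratio(transactions))                          # unique rows
print(dedupe_ratio(identical_blocks))                      # fully redundant data
print(dedupe_ratio(transactions * 2))                      # aligned second copy
print(dedupe_ratio(transactions + b"x" + transactions))    # second copy shifted by 1 byte
```

The four cases give ratios of 1.0, 100.0, 2.0, and 1.0: unique transactional rows gain nothing, and a one-byte shift is enough to defeat matching between two otherwise identical copies.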


The numbers don’t lie. (List prices, Oracle Media)

Media Type           Vendor Price     Claimed Capacity & Dedupe ratio   Effective $/GB
400 GB SLC Flash     $7.50/GB(1)      5X                                $1.50/GB
1.6 TB eMLC flash    $4.12/GB(1)      5X                                $0.824/GB
4 TB HDD             $0.25/GB(1)      N/A                               $0.25/GB

(1) Prices represent list prices for media and do not include any other array components or S/W license fees.
(2) Allowing a 3X deduplication ratio for pricing calculations, as it’s much more probable than 5X.

All Flash Array costs are simple: #TB * Effective $/GB
5 TB example: 400 GB SLC = $37,500.00. With 3X(2) deduplication: $12,500.00

Hybrid Array with 5% hot data on flash (no dedupe) and 95% on 4 TB HDD
5 TB example: 4.75 TB @ $0.25/GB = $1,187.50 + 0.25 TB @ $7.50/GB = $1,875.00. Total = $3,062.50

The conclusion is obvious: Don’t put cold data on flash media. 4X more expensive.
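The 5 TB comparison can be reproduced directly from the list prices in the table. The sketch below assumes decimal terabytes (1 TB = 1000 GB), which is what the slide's totals imply. The function names are illustrative, and only media cost is modeled.

```python
GB_PER_TB = 1000  # decimal terabytes; this reproduces the slide's totals


def all_flash_cost(total_tb: float, price_per_gb: float, dedupe_ratio: float = 1.0) -> float:
    """Media cost of storing everything on flash, after the given dedupe ratio."""
    return total_tb * GB_PER_TB * price_per_gb / dedupe_ratio


def hybrid_cost(total_tb: float, hot_fraction: float,
                flash_price_per_gb: float, hdd_price_per_gb: float) -> float:
    """Media cost with hot data on flash (no dedupe) and the rest on HDD."""
    total_gb = total_tb * GB_PER_TB
    hot_gb = total_gb * hot_fraction
    cold_gb = total_gb - hot_gb
    return hot_gb * flash_price_per_gb + cold_gb * hdd_price_per_gb


slc_raw = all_flash_cost(5, 7.50)
slc_3x = all_flash_cost(5, 7.50, dedupe_ratio=3.0)
hybrid = hybrid_cost(5, 0.05, flash_price_per_gb=7.50, hdd_price_per_gb=0.25)

print(f"all-flash SLC, no dedupe: ${slc_raw:,.2f}")
print(f"all-flash SLC, 3X dedupe: ${slc_3x:,.2f}")
print(f"hybrid, 5% on SLC:        ${hybrid:,.2f}")
print(f"all-flash / hybrid:       {slc_3x / hybrid:.2f}X")
```

The ratio comes out at about 4.08, matching the "4X more expensive" conclusion.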


Dedupe or QoS+?: List Prices Other Vendor’s media

Figure: bar chart of net cost for a 5 TB solution, comparing Pure/Solid/Violin (no dedupe vs. 3X dedupe) with the FS1 (no QoS+ vs. QoS+). Bar labels: $125,000, $37,500, $42,000, $12,500.

Oracle’s FS1: 1/3rd the Price


Ok, Oracle, so what data reduction does work on my data?

HCC & ACO
• No concept of dedupe windows or data misalignment
• Works with Oracle Storage, but not third party storage
• Host I/Os reduced by compression factors from 3X to >30X
• Queries run faster, no re-hydration required, less data transferred.
• ADO for automatic invocation


Business Cycle Curves: Autotiering ramifications


Basic 30 Day Business Cycle Curve

Does this actually reflect reality?

Typically represented as a smooth curve. This may not be representative…


If all your data were quickly demoted to 4 TB capacity drives, you might not get your invoices out

The Bump in the Road, so to speak

This shows a billing and 90 day reporting reheat. This complicates auto-tiering!

Figure: Business Cycle Chart with Re-Heat times. IOPS plotted against time, with reheat bumps labeled Billing Reheat and Quarterly Reporting.


How to address the “Bumps in the Road”

QoS Plus: QoS fused with Auto-Tiering

Figure: normalized access frequency profiles (0% to 100%) across media tiers (Perf SSD, Cap SSD, Perf HDD, Cap HDD) for each QoS level by business priority: Archive, Low, Medium, High, Premium.


The need for multiple tiers of Flash and Quality of Service auto-tiering bias

• Remember: Data transitions from read/write to primarily read only over time.
  – Most “times” are on 30-day cycles.
  – Data reheats at each cycle end and again at quarterly periods.

• Tiering to capacity HDDs too soon will have very negative impacts on your business process.

• The Oracle way:
  – Multiple flash tiers: SLC performance flash and eMLC read-optimized flash
  – Unique Quality of Service fused with autotiering to align data migration with business value and process.

Frequency of reference is not enough: Ignores business value and real life workloads


Business Summary


The Final Frontier: Let’s add it all up

• FS1 auto-tiering: 30X cost reduction (flash vs. capacity HDDs)

• HCC + ADO: data reduction without the dedupe penalty
  – Bonus round: Queries run faster, as no re-hydration is required for I/Os
  – 10X compression for warm data, 15-50X for archive
  – Compression algorithms change as your usage patterns change.

• Realized data cost reduction: It’s not additive, it’s multiplicative!
  – OLTP (Oracle Advanced Data Compression), hot data: 3X
  – Reporting (warm data): 30X * 10X = 300X
  – Archive (cold data): 30X * 20X = 600X

• Only from Oracle
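The multiplicative claim follows because tiering lowers the price per stored gigabyte while compression lowers the number of gigabytes stored. A minimal sketch using the factors quoted above is shown below. The $7.50/GB SLC list price is taken from the earlier pricing slide, and the dictionary and function names are illustrative.

```python
MEDIA_FACTOR = 30.0                                  # flash vs. capacity HDD, per the slide
COMPRESSION_FACTOR = {"warm": 10.0, "cold": 20.0}    # compression factors quoted above
SLC_LIST_PRICE_PER_GB = 7.50                         # from the earlier pricing slide


def realized_reduction(tier: str) -> float:
    """Combined cost reduction: cheaper media times fewer stored bytes."""
    return MEDIA_FACTOR * COMPRESSION_FACTOR[tier]


for tier in ("warm", "cold"):
    factor = realized_reduction(tier)
    effective = SLC_LIST_PRICE_PER_GB / factor
    print(f"{tier}: {factor:.0f}X reduction, about ${effective:.4f} per logical GB")
```

Adding the factors instead (30 + 10 or 30 + 20) would understate the combined effect by roughly an order of magnitude.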


Business at Warp 9.9 With the FS1 Flash Storage System

• 300-600X storage cost reduction for 90%+ of your data
• Queries can run up to 10X faster due to reduced I/O transfer load
• Compression not affected by 4K block misalignment
• Flash performance without all-flash price
• Fire Phasers when ready Tasha!
• Oracle: Engineered to work together


Oracle Open World 2014 – FS1 Sessions

Session ID: CON7789 Optimizing Oracle Data Stores in Virtualized Environments
Date and Time: 9/30/14, 10:45 - 11:30
Venue / Room: Intercontinental - Intercontinental C

Session ID: CON7830 Solving Data Skew in Oracle Business Applications with Oracle’s Flash-Optimized SAN Storage
Date and Time: 9/30/14, 15:45 - 16:30
Venue / Room: Intercontinental - Intercontinental C

Session ID: CON7792 Optimizing Oracle Data Stores with Oracle Flash-Optimized SAN Storage
Date and Time: 9/30/14, 17:00 - 17:45
Venue / Room: Intercontinental - Intercontinental C

Session ID: CON7832 Leveraging Oracle’s Flash-Optimized SAN Storage in a Cloud Deployment
Date and Time: 10/1/14, 12:45 - 13:30
Venue / Room: Intercontinental - Intercontinental C

Session ID: CON7841 Maximizing Oracle Database 12c with Oracle's Flash-Optimized SAN Storage
Date and Time: 10/2/14, 12:00 - 12:45
Venue / Room: Intercontinental - Union Square

Session ID: CON7831 Optimizing Storage for Oracle ASM with an Oracle Flash-Optimized SAN
Date and Time: 10/2/14, 2:30 - 3:15
Venue / Room: Intercontinental - Union Square


Oracle Open World 2014 – FS1 DemoPods and HOL

• DemoPods:

DemoID:3691 Leveraging Flash to Improve Latency of Multiple Database Instances, Location: SC-117

DemoID:3713 Quality of Service-Driven Autotiering, Location: SC-132

DemoID:3711 Maximizing Database Performance: Data Tiering vs Oracle HCC vs Deduplication, Location: SC-161

DemoID:3695 Simplifying storage management with Oracle Enterprise Manager, Location: SC-162

DemoID:4766 Hardware Showcase : Oracle FS1 Flash Storage System, Location: SC-133

• Hands On Lab (HOL) :

Session ID: HOL8687 Oracle Storage System GUI: Faster Database Performance with QoS Enhancements
Date and Time: 9/30/14, 18:45 - 19:45
Venue / Room: Hotel Nikko - Nikko Ballroom I
