Understanding storage performance for hyperconverged infrastructure
Luke Pruen – Technical Services Director
“Virtual SANs made simple”
Monthly Webinar Series:
Enabling HCI
Pre-configured, certified and supported by major vendors
Large and Small
Large and small deployments, from enterprises with 1000s of sites to SMEs with a single site
Global footprint
In 72 countries, customers depend on StorMagic for server and storage infrastructure
Partner Network
Wherever you are, StorMagic has resellers, integrators, and server partners to meet your needs
30+ verticals
Including retail, financial services, healthcare, government, education, energy, professional services, pharma, and manufacturing
Introducing StorMagic
What is Hyperconverged Infrastructure?
“Tightly coupled compute, network and storage hardware that dispenses with the need for a regular storage area network (SAN).”
Magic Quadrant for Integrated Systems: Published October 10th 2016
There’s a lot of choice out there
The hyperconverged market
• The market is maturing, with many options now available
• Be careful of pursuing a “one size fits all” approach

Customers
• Few customers understand their requirements
• Often blindly deploy over-spec’d solutions

Our take
• Customers need to be able to measure their needs more accurately
• Real-world data often provides surprising insight
Hyperconverged Architectures: Kernel based
• Storage “software” is within the hypervisor
• Pools local server storage where the hypervisor is installed
• Presents storage over a proprietary mechanism
• Claims to be more efficient and able to deliver higher performance

More efficient because it’s where the hypervisor runs – fewer “hops” to the storage
[Diagram: three servers, each with SSDs and storage software embedded in the hypervisor, pooled as shared storage]
Hyperconverged Architectures: VSA based
• A virtual storage appliance (VSA) resides on the hypervisor
• Host storage is assigned to the local VSA
• Storage is generally presented as iSCSI or NFS
• Claims to be more flexible than kernel-based models

Hypervisor agnostic • More storage flexibility • Easier to troubleshoot storage issues vs. hypervisor issues

[Diagram: three servers, each with SSDs and a VSA on the hypervisor, pooled as shared storage]
StorMagic SvSAN: Overview
“SvSAN turns the internal disk, SSD and memory of 2 or more servers into highly available shared storage”
StorMagic SvSAN: Benefits
Availability – data & operations protected
• No single point of failure
• Local and stretched cluster capable
• Split-brain risk eliminated

Robust – any site, any network
• Proven at the IT edge and the datacenter, from the harshest to the most controlled environments
• Supports mission critical applications
• Tolerates poor, unreliable networks

Enterprise-class management
• Integrates with standard tools
• Automated deployment and recovery scripts
• Designed for use by any IT professional

Cost-effective – lightest footprint, lowest cost
• No more physical SANs: converge compute and storage, utilize the power of commodity servers, eliminate storage networking components
• Lowest CAPEX: start with only 2 servers (existing or new), significantly less CPU and memory, one lightweight quorum for all clusters
• Lowest OPEX: reduced power, cooling and spares; lower costs with centralized management; eliminate planned and unplanned downtime

Flexible – today’s needs, future proofed
• Performance and scale: leverage any CPU and storage type, active/active synchronous mirroring, scale-up performance with a 2-node cluster
• Build-your-own hyperconverged: eliminate appliance overprovisioning, configure to precise IOPS & capacity, auto-tier disk, SSD and memory
• Flexibility and growth: hyperconverged or storage-only, hypervisor and server agnostic, non-disruptive upgrades
Optimizing Storage: All storage is not equal
Magnetic drives provide poor random performance
• SATA 7.2k rpm: 75–100 IOPS
• SAS 10k/15k rpm: 140–210 IOPS
• Lower cost per GB
• Higher cost per IOPS

Flash and SSDs have good random performance
• SSD/Flash: 8.6K to 10 million IOPS
• Lower cost per IOPS
• High cost per GB compared to magnetic

Memory has even better performance
• Orders of magnitude faster than Flash/SSD
• Much higher cost per GB compared to SSD/Flash
• Memory is volatile and typically low in capacity
*https://en.wikipedia.org/wiki/IOPS
*https://en.wikipedia.org/wiki/RAM_drive
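To make the cost-per-GB versus cost-per-IOPS trade-off concrete, here is a minimal Python sketch; the capacities, prices and IOPS figures below are illustrative assumptions for the exercise, not quotes from the slides or from any vendor:

    # Compare media on cost per GB vs. cost per IOPS.
    # All capacity, price and IOPS figures are illustrative assumptions.
    media = {
        "SATA 7.2k HDD": {"capacity_gb": 4000, "price_usd": 100, "iops": 90},
        "SAS 15k HDD":   {"capacity_gb": 600,  "price_usd": 150, "iops": 200},
        "SATA SSD":      {"capacity_gb": 960,  "price_usd": 250, "iops": 80_000},
    }

    for name, m in media.items():
        print(f"{name:14s}  "
              f"${m['price_usd'] / m['capacity_gb']:.3f}/GB  "
              f"${m['price_usd'] / m['iops']:.4f}/IOPS")

Run with any figures of your own: the HDDs win on $/GB and the SSD wins on $/IOPS by orders of magnitude, which is exactly why tiering and caching pay off.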
Optimizing Storage: The importance of caching
Virtualized environments suffer from the ‘I/O blender’ effect
• Multiple virtual machines share a set of disks
• The result is predominantly random I/O
• Magnetic drives provide poor random performance
• SSD & Flash storage is ideal for these workloads, but expensive

Working sets of data
• Driven by workloads, which are ever changing
• Refers to the amount of data most frequently accessed
• Always related to a time period
• Working set sizes evolve as workloads change

Caching
• Combats the I/O blender effect without the expense of all Flash or SSD
• Working sets of data can be identified and elevated to cache (a measurement sketch follows below)
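As a rough illustration of measuring a working set over a time period, the sketch below counts unique blocks touched per window in an I/O trace; the (timestamp, block) trace format is a hypothetical simplification of what a real collector would record:

    from collections import defaultdict

    def working_set_mb(trace, window_s=3600, block_kb=4):
        """Estimate working-set size per time window from an I/O trace.
        trace: iterable of (timestamp_seconds, block_number) pairs."""
        blocks = defaultdict(set)
        for ts, block in trace:
            blocks[int(ts // window_s)].add(block)
        return {w * window_s: len(b) * block_kb / 1024  # MB per window
                for w, b in sorted(blocks.items())}

    # Two unique blocks in hour 0, one in hour 1 -> the working set shrinks
    print(working_set_mb([(10, 100), (20, 100), (30, 205), (4000, 9)]))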
Optimizing Storage: SSD/Flash caching
SSD/Flash caching
• Significantly improves overall I/O performance
• Reduces the number of I/Os going directly to disk
• Dynamic cache sizing based on the read/write ratio

Write operations
• Data is written as variable-sized extents
• Extents are merged and coalesced in the background
• Data in cache is flushed to hard disk regularly in small bursts

Read operations
• The SvSAN algorithm identifies and promotes data based on access patterns
• Frequently accessed data blocks are elevated onto SSD/Flash
• Least frequently accessed blocks are aged out (a toy model of this idea is sketched below)
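SvSAN's actual promotion algorithm is proprietary; the toy model below only illustrates the general promote/age-out pattern, using access-frequency counts with periodic decay:

    from collections import Counter

    class ToyReadCache:
        """Frequency-based read cache with aging. Illustrative only --
        not SvSAN's actual (proprietary) algorithm."""
        def __init__(self, capacity_blocks, decay=0.5):
            self.capacity, self.decay = capacity_blocks, decay
            self.counts = Counter()

        def access(self, block):
            """Record an access; return True if it would be a cache hit."""
            hit = block in self.hottest()
            self.counts[block] += 1
            return hit

        def hottest(self):
            # The cache holds the most frequently accessed blocks.
            return {b for b, _ in self.counts.most_common(self.capacity)}

        def age(self):
            # Periodically decay counts so stale blocks fall out of the cache.
            self.counts = Counter({b: c * self.decay
                                   for b, c in self.counts.items()})

    cache = ToyReadCache(capacity_blocks=2)
    for b in [1, 1, 2, 3, 1, 2]:
        print(b, cache.access(b))   # repeat visitors become hits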
Optimizing Storage: Cold & hot data
Intelligent read caching algorithm
• All read I/Os are monitored and analyzed
• Most frequently used data – “hot” data
• Cache tiers are populated based on access frequency

Tiering
• RAM: most frequently accessed data
• SSD/Flash: next most frequently accessed data
• HDD: infrequently accessed data – “cold” data

Sizing
• Assign cache sizes to meet requirements
• Grow caches as working sets change
• Use any combination of memory, SSD/Flash and disk

Play to the strengths of all mediums
• Memory: highest IOPS
• SSD/Flash: strong random performance at moderate cost
• Magnetic drives: low price per GB
(see the tier-assignment sketch below)
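A minimal sketch of the rank-and-fill idea behind such tiering; the block names, counts and tier capacities here are hypothetical:

    def assign_tiers(access_counts, ram_blocks, ssd_blocks):
        """Rank blocks by access frequency, fill RAM first, then SSD/Flash;
        everything else stays on HDD. A sketch of the tiering idea only."""
        ranked = sorted(access_counts, key=access_counts.get, reverse=True)
        return {"ram": ranked[:ram_blocks],
                "ssd": ranked[ram_blocks:ram_blocks + ssd_blocks],
                "hdd": ranked[ram_blocks + ssd_blocks:]}

    print(assign_tiers({"a": 90, "b": 40, "c": 5, "d": 1},
                       ram_blocks=1, ssd_blocks=2))
    # {'ram': ['a'], 'ssd': ['b', 'c'], 'hdd': ['d']}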
Industry performance numbers
Lab produced
• Numbers are produced under strict conditions representing peak IOPS
• Random workloads focus on small block sizes to produce BIG numbers
• Sequential workloads focus on large block sizes to show BIG throughput
• These set unrealistic expectations

Example
• All read: 4 KiB, 100% random read
• Mixed read/write: 4 KiB, 70/30 random read/write
• Sequential read: 256 KiB
• Sequential write: 256 KiB

The real world
• Multiple VMs running numerous mixed workloads
• AD, DNS, DHCP: low IOPS requirement
• Database, email and application servers: higher IOPS requirement
• Generally sharing the same storage subsystem
* SQL Server I/O block size reference table

Operation                   I/O Block Size
Transaction log write       512 bytes – 60 KB
Checkpoint/Lazywrite        8 KB – 1 MB
Read-Ahead Scans            128 KB – 512 KB
Bulk Loads                  256 KB
Backup/Restore              1 MB
ColumnStore Read-Ahead      8 MB
File Initialization         8 MB
In-Memory OLTP Checkpoint   1 MB
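The arithmetic behind "small blocks give big IOPS numbers, large blocks give big throughput numbers" is simply throughput = IOPS × block size. A quick sketch (the device figures are made up for illustration):

    def throughput_mib_s(iops, block_kib):
        # Throughput (MiB/s) = IOPS x block size (KiB) / 1024
        return iops * block_kib / 1024

    # A hypothetical device quoted at 100,000 IOPS of 4 KiB random I/O...
    print(f"{throughput_mib_s(100_000, 4):.0f} MiB/s")   # ~391 MiB/s
    # ...while 2,000 IOPS of 256 KiB sequential I/O is already 500 MiB/s:
    print(f"{throughput_mib_s(2_000, 256):.0f} MiB/s")   # 500 MiB/s

The same device can therefore be marketed with a huge IOPS figure or a huge throughput figure, which is why neither number alone says much about a mixed real-world workload like the SQL Server operations above.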
How much performance is enough?
What do you need?
• Understand and document your storage requirements
• Know the IOPS and latency requirements of the current environment
• What is the lifecycle of the solution?

How do you choose?
• Don’t base your decision on a 4 KiB 100% random read workload
• Use realistic workloads when evaluating
• Does the management and functionality meet your needs?

What matters
• Meets your current performance & capacity requirements
• Meets your future performance & capacity requirements
• Meets your deployment, management and availability requirements
Customer data analysis
Real life data
• Real customer data collected and analyzed
• Exact data patterns simulated and replayed
• Accurate performance expectations under their workloads

Results
• The average customer would benefit from caching/tiering
• Up to 70% of I/O satisfied from read cache
• A small amount of cache makes a big difference

Conclusion
• Few customers have performed this exercise
• Over-provisioned hardware is common
• Significant cost savings were identified

Customer examples
• UK – Oil & Gas
• US – On-demand consumer service
• US – National retailer
Customer data analysis: Oil and Gas (UK)
                     Read     Write
Read/Write Ratio     53%      47%
Average Per Day      93 GB    84 GB
Average Block Size   61 KB    24 KB
Average IOPS         18       41
Workloads
• Back office apps
• Backup service

Challenge
• Customer was looking to understand current workloads
• No definitive indication of current storage performance requirements
• Concerned about a growing number of disk failures

StorMagic analysis
• Enabled I/O meta-data collection over a period of time
Charts: distribution of I/O sizes, throughput and IOPS, locality of access

[Chart: Locality of access – number of accesses (logarithmic scale) across the volume address range (21 MB to 5.7 TB), read vs. write]

[Chart: Throughput (IOPS) by time of day (UTC), 14:16 through 13:28, read vs. write]
Customer data analysis: Oil and Gas (UK)
Estimates
• SSD & memory: 70% of I/O satisfied from read cache when using 2GB memory and 200GB SSD
• Memory only: 56% of I/O satisfied from read cache when using 2GB of memory

Testing
• Replay the exact workload collected from the live environment
• No best-guess synthetic workload, but the exact patterns from data collection (a schematic replay is sketched below)

Conclusion
• Current environment is sufficient for today’s workloads
• Allocate a small amount of memory and SSD for optimal caching
• Using caching would increase disk MTBF
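StorMagic's testing replays the customer's exact collected I/O against real cache tiers; as a schematic stand-in only, the sketch below replays a block trace through a simple LRU cache to estimate a read hit ratio (the trace and cache size are hypothetical):

    from collections import OrderedDict

    def replay_hit_ratio(trace, cache_mb, block_kb=4):
        """Replay a read trace through an LRU cache; return the hit ratio.
        Schematic only -- real tiered caches behave differently."""
        capacity = cache_mb * 1024 // block_kb
        cache, hits = OrderedDict(), 0
        for block in trace:
            if block in cache:
                hits += 1
                cache.move_to_end(block)       # mark as most recently used
            else:
                cache[block] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
        return hits / len(trace) if trace else 0.0

    trace = [0, 1, 0, 2, 0, 1, 3, 0, 1, 0]    # skewed: a few hot blocks
    print(f"{replay_hit_ratio(trace, cache_mb=1):.0%}")   # 60%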
[Chart: Read Data Serviced by Tiers (Memory / SSD / Disk) – 17%, 23%, 60%]

[Chart: Estimate % Read Serviced by Tiers – Memory 30%, SSD 14%, Disk 56%]

[Chart: Actual Data Read From Tiers – Memory 28.79 GB, SSD 13.52 GB, Disk 53.94 GB]
Customer data analysis: On-demand consumer service (US)
                     Read    Write
Read/Write Ratio     40%     60%
Average Per Day      9 GB    13 GB
Average Block Size   30 KB   11 KB
Average IOPS         5       15
Workloads
• Network monitoring for an on-demand service
• Back office apps

Challenge
• Customer was evaluating hyperconverged solutions
• Was considering all-flash

StorMagic analysis
• Enabled I/O meta-data collection over a period of time in a live POC
Charts: distribution of I/O sizes, throughput and IOPS, locality of access

[Chart: Throughput (IOPS) by time of day (UTC), 16:13 through 09:33, read vs. write]

[Chart: Locality of access – number of accesses (thousands) across the volume address range (0 to 986 GB), read vs. write]
Customer data analysis: On-demand consumer service (US)

Estimates
• SSD & memory: 94% of I/O satisfied from read cache when using 2GB memory and 120GB SSD
• Memory only: 94% of I/O satisfied from read cache when using 2GB of memory

Testing
• Replay the exact workload collected from the live environment
• No best-guess synthetic workload, but the exact patterns from data collection

Conclusion
• Environment is sufficient for its workloads
• Allocate a small amount of memory to satisfy almost all reads
[Chart: Read Data Serviced by Tiers (Memory / SSD / Disk) – 6%, 94%]

[Chart: % of Read Data Serviced by Tiers – 0.37 GB vs. 67.85 GB]
Customer data analysis: National retailer (US)
                     Read     Write
Read/Write %         77%      23%
Average Per Day      991 GB   294 GB
Average Block Size   58 KB    54 KB
Average IOPS         212      138
Workloads
• Point of sale
• 78 in-store applications
• Backup service

Challenge
• Customer is looking at a hardware refresh across store locations
• How to size for the current environment and future growth?

StorMagic analysis
• Enabled I/O meta-data collection over a period of time
Charts: distribution of I/O sizes, throughput and IOPS, locality of access

[Chart: Throughput (IOPS) by time of day (UTC), 18:39 through 18:15 across several days, read vs. write]

[Chart: Locality of access – number of accesses (thousands, logarithmic scale) across the volume address range (21 MB to 2.6 TB), read vs. write]
Customer data analysis: National retailer (US)
Estimates
• SSD and memory: as high as 83% of I/O satisfied from read cache when using 8GB memory and 200GB SSD
• Memory only: as high as 60% of I/O satisfied from read cache when using 8GB of memory

Testing
• No SSD used in testing; focused only on memory read caching
• Replay the exact workload collected from the live environment
• No best-guess synthetic workload, but the exact patterns from data collection

Conclusion
• Use SATA disks plus memory caching instead of SAS disks
• Reduced drive count, but doubled performance and capacity
• Reduced power and cooling
• Fewer disks means an increase in Mean Time Between Failures (MTBF)
[Chart: Read Data Serviced by Tiers (Memory / SSD / Disk) – 17%, 23%, 60%]
What did these customers learn?
• Clarity into their environments’ workloads
• That they could spend less on hardware
• How caching technologies would benefit their workloads
• Their immediate requirements
• The performance limits of their environments
• How the environment can scale to meet performance growth
Summary
• Understand your requirements
• Create viable success criteria
• Only compare solutions that make sense
• Be wary of lab produced numbers!
• Storage performance can be affected by other factors
• Simulate workloads you plan to run
Q&A and Next Steps
SvSAN Product Information
Product Options

SvSAN license             2, 6, 12 and unlimited TBs
License entitlement       2 mirrored servers
Maintenance and support   Platinum – 24x7 / Gold – 9x5
For further information, please contact: [email protected]
Further Reading:
An overview of SvSAN - http://stormagic.com/svsan/
SvSAN v6 Data Sheet - http://stormagic.com/svsan-data-sheet/
SvSAN v6 White Paper - http://stormagic.com/svsan-6/
Download your free trial of SvSAN
stormagic.com/trial