Understanding storage performance for hyperconverged infrastructure
Luke Pruen – Technical Services Director
“Virtual SANs made simple”
Monthly Webinar Series:
Enabling HCI
Pre-configured, certified and supported by major vendors
Large and Small
Large and small deployments, from enterprises with 1000s of sites to SMEs with a single site
Global footprint
In 72 countries, customers depend on StorMagic for server and storage infrastructure
Partner Network
Wherever you are, StorMagic has resellers, integrators, and server partners to meet your needs
30+ verticals
Including retail, financial services, healthcare, government, education, energy, professional services, pharma, and manufacturing
Introducing StorMagic
What is Hyperconverged Infrastructure?
“Tightly coupled compute, network and storage hardware that dispenses with the need for a regular storage area network (SAN).”
Magic Quadrant for Integrated Systems: Published October 10th 2016
There’s a lot of choice out there
The hyperconverged market
• The market is maturing, with many options now available
• Be careful of pursuing a “one size fits all” approach

Customers
• Few customers understand their requirements
• Often blindly deploy over-spec’d solutions

Our take
• Customers need to be able to measure their needs more accurately
• Real-world data often provides surprising insight
Hyperconverged Architectures: Kernel based
• Storage “software” is within the hypervisor
• Pools local server storage where the hypervisor is installed
• Presents storage over a proprietary mechanism
• Claims to be more efficient and able to deliver higher performance

More efficient because it’s where the hypervisor runs – fewer “hops” to the storage
[Diagram: three servers, each with SSDs and storage software embedded in the hypervisor, pooled as shared storage]
Hyperconverged Architectures: VSA based
• A virtual storage appliance (VSA) resides on the hypervisor
• Host storage is assigned to the local VSA
• Storage is generally presented as iSCSI or NFS
• Claims to be more flexible than kernel-based models

Hypervisor agnostic • More storage flexibility • Easier to troubleshoot storage issues vs. hypervisor issues

[Diagram: three servers, each with SSDs and a VSA on the hypervisor, pooled as shared storage]
StorMagic SvSAN: Overview
“SvSAN turns the internal disk, SSD and memory of 2 or more servers into highly available shared storage”
StorMagic SvSAN: Benefits
Availability – data & operations protected
• No single point of failure
• Local and stretched cluster capable
• Split-brain risk eliminated

Robust – any site, any network
• Proven at the IT edge and the datacenter, from the harshest to the most controlled environments
• Supports mission critical applications
• Tolerates poor, unreliable networks

Enterprise-class management
• Integrates with standard tools
• Automated deployment and recovery scripts
• Designed for use by any IT professional

Cost-effective – lightest footprint, lowest cost
• No more physical SANs: converge compute and storage, utilize the power of commodity servers, eliminate storage networking components
• Lowest CAPEX: start with only 2 servers (existing or new), significantly less CPU and memory, one lightweight quorum for all clusters
• Lowest OPEX: reduced power, cooling and spares; lower costs with centralized management; eliminate planned and unplanned downtime

Flexible – today’s needs, future proofed
• Performance and scale: leverage any CPU and storage type, active/active synchronous mirroring, scale-up performance with a 2-node cluster
• Build-your-own hyperconverged: eliminate appliance overprovisioning, configure to precise IOPS & capacity, auto-tier disk, SSD and memory
• Flexibility and growth: hyperconverged or storage-only, hypervisor and server agnostic, non-disruptive upgrades
Optimizing Storage: All storage is not equal
Magnetic drives provide poor random performance
• SATA 7.2k rpm: 75–100 IOPS
• SAS 10k/15k rpm: 140–210 IOPS
• Lower cost per GB
• Higher cost per IOPS

Flash and SSDs have good random performance
• SSD/Flash: 8.6K to 10 million IOPS
• Lower cost per IOPS
• High cost per GB compared to magnetic

Memory has even better performance
• Orders of magnitude faster than Flash/SSD
• Much higher cost per GB compared to SSD/Flash
• Memory is volatile and typically low in capacity
*https://en.wikipedia.org/wiki/IOPS
*https://en.wikipedia.org/wiki/RAM_drive
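To make the cost-per-GB versus cost-per-IOPS trade-off concrete, here is a minimal Python sketch; the capacities, prices and IOPS figures below are illustrative assumptions for the exercise, not quotes from the slides or from any vendor:

    # Compare media on cost per GB vs. cost per IOPS.
    # All capacity, price and IOPS figures are illustrative assumptions.
    media = {
        "SATA 7.2k HDD": {"capacity_gb": 4000, "price_usd": 100, "iops": 90},
        "SAS 15k HDD":   {"capacity_gb": 600,  "price_usd": 150, "iops": 200},
        "SATA SSD":      {"capacity_gb": 960,  "price_usd": 250, "iops": 80_000},
    }

    for name, m in media.items():
        print(f"{name:14s}  "
              f"${m['price_usd'] / m['capacity_gb']:.3f}/GB  "
              f"${m['price_usd'] / m['iops']:.4f}/IOPS")

Run with any figures of your own: the HDDs win on $/GB and the SSD wins on $/IOPS by orders of magnitude, which is exactly why tiering and caching pay off.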
Optimizing Storage: The importance of caching
Virtualized environments suffer from the ‘I/O blender’ effect
• Multiple virtual machines share a set of disks
• The result is predominantly random I/O
• Magnetic drives provide poor random performance
• SSD & Flash storage is ideal for these workloads, but expensive

Working sets of data
• Driven by workloads, which are ever changing
• Refers to the amount of data most frequently accessed
• Always related to a time period
• Working set sizes evolve as workloads change

Caching
• Combats the I/O blender effect without the expense of all Flash or SSD
• Working sets of data can be identified and elevated to cache (a measurement sketch follows below)
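As a rough illustration of measuring a working set over a time period, the sketch below counts unique blocks touched per window in an I/O trace; the (timestamp, block) trace format is a hypothetical simplification of what a real collector would record:

    from collections import defaultdict

    def working_set_mb(trace, window_s=3600, block_kb=4):
        """Estimate working-set size per time window from an I/O trace.
        trace: iterable of (timestamp_seconds, block_number) pairs."""
        blocks = defaultdict(set)
        for ts, block in trace:
            blocks[int(ts // window_s)].add(block)
        return {w * window_s: len(b) * block_kb / 1024  # MB per window
                for w, b in sorted(blocks.items())}

    # Two unique blocks in hour 0, one in hour 1 -> the working set shrinks
    print(working_set_mb([(10, 100), (20, 100), (30, 205), (4000, 9)]))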
Optimizing Storage: SSD/Flash caching
SSD/Flash caching
• Significantly improves overall I/O performance
• Reduces the number of I/Os going directly to disk
• Dynamic cache sizing based on the read/write ratio

Write operations
• Data is written as variable-sized extents
• Extents are merged and coalesced in the background
• Data in cache is flushed to hard disk regularly in small bursts

Read operations
• The SvSAN algorithm identifies and promotes data based on access patterns
• Frequently accessed data blocks are elevated onto SSD/Flash
• Least frequently accessed blocks are aged out (a toy model of this idea is sketched below)
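SvSAN's actual promotion algorithm is proprietary; the toy model below only illustrates the general promote/age-out pattern, using access-frequency counts with periodic decay:

    from collections import Counter

    class ToyReadCache:
        """Frequency-based read cache with aging. Illustrative only --
        not SvSAN's actual (proprietary) algorithm."""
        def __init__(self, capacity_blocks, decay=0.5):
            self.capacity, self.decay = capacity_blocks, decay
            self.counts = Counter()

        def access(self, block):
            """Record an access; return True if it would be a cache hit."""
            hit = block in self.hottest()
            self.counts[block] += 1
            return hit

        def hottest(self):
            # The cache holds the most frequently accessed blocks.
            return {b for b, _ in self.counts.most_common(self.capacity)}

        def age(self):
            # Periodically decay counts so stale blocks fall out of the cache.
            self.counts = Counter({b: c * self.decay
                                   for b, c in self.counts.items()})

    cache = ToyReadCache(capacity_blocks=2)
    for b in [1, 1, 2, 3, 1, 2]:
        print(b, cache.access(b))   # repeat visitors become hits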
Optimizing Storage: Cold & hot data
Intelligent read caching algorithm
• All read I/Os are monitored and analyzed
• Most frequently used data – “hot” data
• Cache tiers are populated based on access frequency

Tiering
• RAM: most frequently accessed data
• SSD/Flash: next most frequently accessed data
• HDD: infrequently accessed data – “cold” data

Sizing
• Assign cache sizes to meet requirements
• Grow caches as working sets change
• Use any combination of memory, SSD/Flash and disk

Play to the strengths of all mediums
• Memory: highest IOPS
• SSD/Flash: strong random performance at moderate cost
• Magnetic drives: low price per GB
(see the tier-assignment sketch below)
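A minimal sketch of the rank-and-fill idea behind such tiering; the block names, counts and tier capacities here are hypothetical:

    def assign_tiers(access_counts, ram_blocks, ssd_blocks):
        """Rank blocks by access frequency, fill RAM first, then SSD/Flash;
        everything else stays on HDD. A sketch of the tiering idea only."""
        ranked = sorted(access_counts, key=access_counts.get, reverse=True)
        return {"ram": ranked[:ram_blocks],
                "ssd": ranked[ram_blocks:ram_blocks + ssd_blocks],
                "hdd": ranked[ram_blocks + ssd_blocks:]}

    print(assign_tiers({"a": 90, "b": 40, "c": 5, "d": 1},
                       ram_blocks=1, ssd_blocks=2))
    # {'ram': ['a'], 'ssd': ['b', 'c'], 'hdd': ['d']}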
Industry performance numbers
Lab produced
• Numbers are produced under strict conditions representing peak IOPS
• Random workloads focus on small block sizes to produce BIG numbers
• Sequential workloads focus on large block sizes to show BIG throughput
• These set unrealistic expectations

Example
• All read: 4 KiB, 100% random read
• Mixed read/write: 4 KiB, 70/30 random read/write
• Sequential read: 256 KiB
• Sequential write: 256 KiB

The real world
• Multiple VMs running numerous mixed workloads
• AD, DNS, DHCP: low IOPS requirement
• Database, email and application servers: higher IOPS requirement
• Generally sharing the same storage subsystem
* SQL Server I/O block size reference table

Operation                   I/O Block Size
Transaction log write       512 bytes – 60 KB
Checkpoint/Lazywrite        8 KB – 1 MB
Read-Ahead Scans            128 KB – 512 KB
Bulk Loads                  256 KB
Backup/Restore              1 MB
ColumnStore Read-Ahead      8 MB
File Initialization         8 MB
In-Memory OLTP Checkpoint   1 MB
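The arithmetic behind "small blocks give big IOPS numbers, large blocks give big throughput numbers" is simply throughput = IOPS × block size. A quick sketch (the device figures are made up for illustration):

    def throughput_mib_s(iops, block_kib):
        # Throughput (MiB/s) = IOPS x block size (KiB) / 1024
        return iops * block_kib / 1024

    # A hypothetical device quoted at 100,000 IOPS of 4 KiB random I/O...
    print(f"{throughput_mib_s(100_000, 4):.0f} MiB/s")   # ~391 MiB/s
    # ...while 2,000 IOPS of 256 KiB sequential I/O is already 500 MiB/s:
    print(f"{throughput_mib_s(2_000, 256):.0f} MiB/s")   # 500 MiB/s

The same device can therefore be marketed with a huge IOPS figure or a huge throughput figure, which is why neither number alone says much about a mixed real-world workload like the SQL Server operations above.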
How much performance is enough?
What do you need?
• Understand and document your storage requirements
• Know the IOPS and latency requirements of the current environment
• What is the lifecycle of the solution?

How do you choose?
• Don’t base your decision on a 4 KiB 100% random read workload
• Use realistic workloads when evaluating
• Does the management and functionality meet your needs?

What matters
• Meets your current performance & capacity requirements
• Meets your future performance & capacity requirements
• Meets your deployment, management and availability requirements
Customer data analysis
Real life data
• Real customer data collected and analyzed
• Exact data patterns simulated and replayed
• Accurate performance expectations under their workloads

Results
• The average customer would benefit from caching/tiering
• Up to 70% of I/O satisfied from read cache
• A small amount of cache makes a big difference

Conclusion
• Few customers have performed this exercise
• Over-provisioned hardware is common
• Significant cost savings were identified

Customer examples
• UK – Oil & Gas
• US – On-demand consumer service
• US – National retailer
Customer data analysis: Oil and Gas (UK)
                     Read     Write
Read/Write Ratio     53%      47%
Average Per Day      93 GB    84 GB
Average Block Size   61 KB    24 KB
Average IOPS         18       41
Workloads
• Back office apps
• Backup service

Challenge
• Customer was looking to understand current workloads
• No definitive indication of current storage performance requirements
• Concerned about a growing number of disk failures

StorMagic analysis
• Enabled I/O meta-data collection over a period of time
Charts: distribution of I/O sizes, throughput and IOPS, locality of access

[Chart: Locality of access – number of accesses (logarithmic scale) across the volume address range (21 MB to 5.7 TB), read vs. write]

[Chart: Throughput (IOPS) by time of day (UTC), 14:16 through 13:28, read vs. write]
Customer data analysis: Oil and Gas (UK)
Estimates
• SSD & memory: 70% of I/O satisfied from read cache when using 2GB memory and 200GB SSD
• Memory only: 56% of I/O satisfied from read cache when using 2GB of memory

Testing
• Replay the exact workload collected from the live environment
• No best-guess synthetic workload, but the exact patterns from data collection (a schematic replay is sketched below)

Conclusion
• Current environment is sufficient for today’s workloads
• Allocate a small amount of memory and SSD for optimal caching
• Using caching would increase disk MTBF
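StorMagic's testing replays the customer's exact collected I/O against real cache tiers; as a schematic stand-in only, the sketch below replays a block trace through a simple LRU cache to estimate a read hit ratio (the trace and cache size are hypothetical):

    from collections import OrderedDict

    def replay_hit_ratio(trace, cache_mb, block_kb=4):
        """Replay a read trace through an LRU cache; return the hit ratio.
        Schematic only -- real tiered caches behave differently."""
        capacity = cache_mb * 1024 // block_kb
        cache, hits = OrderedDict(), 0
        for block in trace:
            if block in cache:
                hits += 1
                cache.move_to_end(block)       # mark as most recently used
            else:
                cache[block] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
        return hits / len(trace) if trace else 0.0

    trace = [0, 1, 0, 2, 0, 1, 3, 0, 1, 0]    # skewed: a few hot blocks
    print(f"{replay_hit_ratio(trace, cache_mb=1):.0%}")   # 60%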
[Chart: Read Data Serviced by Tiers (Memory / SSD / Disk) – 17%, 23%, 60%]

[Chart: Estimate % Read Serviced by Tiers – Memory 30%, SSD 14%, Disk 56%]

[Chart: Actual Data Read From Tiers – Memory 28.79 GB, SSD 13.52 GB, Disk 53.94 GB]
Customer data analysis: On-demand consumer service (US)
                     Read    Write
Read/Write Ratio     40%     60%
Average Per Day      9 GB    13 GB
Average Block Size   30 KB   11 KB
Average IOPS         5       15
Workloads
• Network monitoring for an on-demand service
• Back office apps

Challenge
• Customer was evaluating hyperconverged solutions
• Was considering all-flash

StorMagic analysis
• Enabled I/O meta-data collection over a period of time in a live POC
Charts: distribution of I/O sizes, throughput and IOPS, locality of access

[Chart: Throughput (IOPS) by time of day (UTC), 16:13 through 09:33, read vs. write]

[Chart: Locality of access – number of accesses (thousands) across the volume address range (0 to 986 GB), read vs. write]
Customer data analysis: On-demand consumer service (US)

Estimates
• SSD & memory: 94% of I/O satisfied from read cache when using 2GB memory and 120GB SSD
• Memory only: 94% of I/O satisfied from read cache when using 2GB of memory

Testing
• Replay the exact workload collected from the live environment
• No best-guess synthetic workload, but the exact patterns from data collection

Conclusion
• Environment is sufficient for its workloads
• Allocate a small amount of memory to satisfy almost all reads
[Chart: Read Data Serviced by Tiers (Memory / SSD / Disk) – 6%, 94%]

[Chart: % of Read Data Serviced by Tiers – 0.37 GB vs. 67.85 GB]
Customer data analysis: National retailer (US)
                     Read     Write
Read/Write %         77%      23%
Average Per Day      991 GB   294 GB
Average Block Size   58 KB    54 KB
Average IOPS         212      138
Workloads
• Point of sale
• 78 in-store applications
• Backup service

Challenge
• Customer is looking at a hardware refresh across store locations
• How to size for the current environment and future growth?

StorMagic analysis
• Enabled I/O meta-data collection over a period of time
Charts: distribution of I/O sizes, throughput and IOPS, locality of access

[Chart: Throughput (IOPS) by time of day (UTC), 18:39 through 18:15 across several days, read vs. write]

[Chart: Locality of access – number of accesses (thousands, logarithmic scale) across the volume address range (21 MB to 2.6 TB), read vs. write]
Customer data analysis: National retailer (US)
Estimates
• SSD and memory: as high as 83% of I/O satisfied from read cache when using 8GB memory and 200GB SSD
• Memory only: as high as 60% of I/O satisfied from read cache when using 8GB of memory

Testing
• No SSD used in testing; focused only on memory read caching
• Replay the exact workload collected from the live environment
• No best-guess synthetic workload, but the exact patterns from data collection

Conclusion
• Use SATA disks plus memory caching instead of SAS disks
• Reduced drive count, but doubled performance and capacity
• Reduced power and cooling
• Fewer disks means an increase in Mean Time Between Failures (MTBF)
[Chart: Read Data Serviced by Tiers (Memory / SSD / Disk) – 17%, 23%, 60%]
What did these customers learn?
• Clarity into their environments’ workloads
• That they could spend less on hardware
• How caching technologies would benefit their workloads
• Their immediate requirements
• The performance limits of their environments
• How the environment can scale to meet performance growth
Summary
• Understand your requirements
• Create viable success criteria
• Only compare solutions that make sense
• Be wary of lab produced numbers!
• Storage performance can be affected by other factors
• Simulate workloads you plan to run
Q&A and Next Steps
SvSAN Product Information
Product Options

SvSAN license             2, 6, 12 and unlimited TBs
License entitlement       2 mirrored servers
Maintenance and support   Platinum – 24x7 / Gold – 9x5
For further information, please contact: [email protected]
Further Reading:
An overview of SvSAN - http://stormagic.com/svsan/
SvSAN v6 Data Sheet - http://stormagic.com/svsan-data-sheet/
SvSAN v6 White Paper - http://stormagic.com/svsan-6/
Download your free trial of SvSAN
stormagic.com/trial