
AWS Summit 2013: Navigating the Cloud

Understanding Amazon EBS Availability and Performance

Eric Anderson

CopperEgg

April 18, 2013

CopperEgg: EBS Use Case

• How CopperEgg uses EBS
• EBS vs Provisioned IOPS EBS
• EBS and RAID
• Backup/Snapshot best practices
• Filesystem selection and tuning
• Monitoring/Migrations/Planning

How CopperEgg uses EBS

• Real-time monitoring (every 5s)
  – System information
  – Processes
  – Synthetic HTTP/TCP/etc.
  – Application metrics
  – Tons more
• Requirements:
  – Store many terabytes of data
  – Persist the data over long periods of time
  – Backups (use snapshots)
  – High IO: 50-60k+ ops/s per node
• SSD + Provisioned IOPS EBS
  – Consistent IO behavior (non-spiky)

EBS vs Provisioned IOPS EBS

• Standard EBS
  – Good for low IO volume
  – Bursty workloads may be a good fit: do the math
• Provisioned IOPS EBS
  – Great for steady IO patterns that need consistency
  – Not always more expensive than standard!
  – Be sure to use the IOPS you provision!
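One way to "do the math" is a small cost comparison. The sketch below assumes standard volumes are billed per GB-month plus per million I/O requests, and Provisioned IOPS volumes per GB-month plus per provisioned IOPS-month. The function names are hypothetical and every price is a caller-supplied argument; the numbers in the usage note are placeholders, not actual AWS pricing.

```shell
# Monthly cost sketch. All prices are arguments supplied by the caller;
# nothing here is real AWS pricing.

# standard_cost GB PRICE_PER_GB AVG_IOPS PRICE_PER_MILLION_IO
standard_cost() {
    awk -v gb="$1" -v pgb="$2" -v iops="$3" -v pmil="$4" 'BEGIN {
        seconds  = 30 * 24 * 3600             # seconds in a 30-day month
        millions = iops * seconds / 1000000   # I/O requests, in millions
        printf "%.2f\n", gb * pgb + millions * pmil
    }'
}

# piops_cost GB PRICE_PER_GB PROVISIONED_IOPS PRICE_PER_IOPS
piops_cost() {
    awk -v gb="$1" -v pgb="$2" -v iops="$3" -v piop="$4" 'BEGIN {
        # You pay for the provisioned rate whether or not you use it.
        printf "%.2f\n", gb * pgb + iops * piop
    }'
}
```

For example, `standard_cost 100 0.10 100 0.10` models a 100 GB volume averaging 100 IOPS all month, and `piops_cost 100 0.125 1000 0.10` models 1000 provisioned IOPS on the same size. Plugging in your own sustained average IOPS shows where the two lines cross.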

EBS and RAID

• Which RAID?
  – Depends on your use case, but:
    • We use stripes (RAID 0) for most things
      – Good performance; we build our fault tolerance at a different level
    • RAID 10 (stripe of mirrors)
      – RAID 0-class performance, with better fault tolerance from the mirrors
      – Twice the cost of RAID 0
    • RAID 0+1 (mirror of stripes)
      – Don’t do this: same performance as RAID 10, worse fault tolerance
    • RAID 5 (stripe with parity)
      – Could be dangerous: software RAID 5 can be bad if you have any write caching enabled
      – Maybe RAID 6 (dual parity) is an option
• Block size
  – Use an appropriate stripe size for best results
    • We use 64 KB, but you need to test various configs to get the best fit for your application
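As a rough illustration of the RAID 0 stripe described above, the sketch below prints (but does not run) an `mdadm` command for a given chunk size and set of disks. The function name and the device names in the usage note are hypothetical; `--chunk` is given in KiB.

```shell
# raid0_create_cmd MD_DEVICE CHUNK_KB DISK...
# Prints the mdadm command for a RAID 0 stripe so it can be reviewed
# before anyone runs it as root.
raid0_create_cmd() {
    md="$1"; chunk="$2"; shift 2
    # After the shift, $# is the number of member disks and $* lists them.
    echo "mdadm --create $md --level=0 --chunk=$chunk --raid-devices=$# $*"
}
```

Calling `raid0_create_cmd /dev/md0 64 /dev/xvdf /dev/xvdg` prints a two-disk stripe with a 64 KB chunk. Printing rather than executing makes it easy to generate several chunk-size variants for benchmarking.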

Backup/Snapshot best practices

• Snapshot regularly
  – At least once per day, more if you can
  – The first snapshot takes a while; subsequent ones are faster
  – Schedule for when your IO load is lowest to reduce impact
    • We do it at around 9pm CST
• Use consistent naming for snapshots
  – {hostname}-{raid device}-{device}-{timestamp}
• Use the API for creation
  – Faster kickoff, more likely to be consistent (script it!)
  – ec2-create-snapshot -d "{hostname}-{raid device}-{device}-{timestamp}" vol-d726382
• Move older snapshots to S3/Glacier for long-term storage
• RAID makes this a bit more complex:
  – Make sure you unmount/snapshot/remount your file system, or use fsfreeze to keep snapshots consistent!
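The naming scheme and the freeze/snapshot/thaw sequence above could be scripted along these lines. This is a dry-run sketch: it only prints commands. The function names, the `SNAP_TS` override, and the `DEVICE:VOLUME_ID` argument format are assumptions made for illustration; `ec2-create-snapshot` is the tool named on the slide.

```shell
# snapshot_name HOSTNAME RAID_DEVICE DEVICE TIMESTAMP
snapshot_name() {
    echo "$1-$2-$3-$4"
}

# snapshot_plan MOUNTPOINT HOSTNAME RAID_DEVICE DEVICE:VOLUME_ID...
# Prints one consistent snapshot pass over every member of a RAID set.
snapshot_plan() {
    mnt="$1"; host="$2"; raid="$3"; shift 3
    # SNAP_TS lets a caller pin the timestamp; otherwise use UTC now.
    ts="${SNAP_TS:-$(date -u +%Y%m%dT%H%M%SZ)}"
    echo "fsfreeze -f $mnt"                  # block writes to the filesystem
    for pair in "$@"; do
        dev="${pair%%:*}"                    # text before the first colon
        vol="${pair#*:}"                     # text after the first colon
        echo "ec2-create-snapshot -d \"$(snapshot_name "$host" "$raid" "$dev" "$ts")\" $vol"
    done
    echo "fsfreeze -u $mnt"                  # thaw once all snapshots are started
}
```

Freezing once around the whole loop means every member volume is captured at the same filesystem state, which is the point of the RAID caveat.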

Choosing a good file system

• We like ext3/4, but we love XFS
  – High performance, consistent
  – Robust, with lots of options for tweaking/adjusting as needed
• Our favorite mount options (your mileage may vary):
  – inode64, noatime, nodiratime, attr2, nobarrier, logbufs=8, logbsize=256k, osyncisdsync, nobootwait, noauto
  – Yields great performance, reduces unnecessary writes, stable
• We like ZFS a lot too, but we want to see more runtime on Linux first
  – But FreeBSD/ZFS would be a fine choice
• However: test your workload!
  – File systems behave differently under different workloads
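The mount options listed above would typically end up in `/etc/fstab`. The sketch below only formats such a line; the function name and the device and mount point in the usage note are hypothetical. Several of these options have been deprecated or removed in later kernels, so check the XFS documentation for the kernel you run before copying the list.

```shell
# XFS mount options from the slide, joined for /etc/fstab.
XFS_OPTS="inode64,noatime,nodiratime,attr2,nobarrier,logbufs=8,logbsize=256k,osyncisdsync,nobootwait,noauto"

# fstab_line DEVICE MOUNTPOINT
# Prints a tab-separated fstab entry; it does not edit /etc/fstab.
fstab_line() {
    printf '%s\t%s\txfs\t%s\t0\t0\n' "$1" "$2" "$XFS_OPTS"
}
```

For instance, `fstab_line /dev/md0 /data` prints an entry that can be reviewed and appended by hand.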

EBS/File system performance tuning

• Tuning file systems:
  – Set the scheduler to ‘deadline’ (for each disk in the RAID array/EBS):
    • [as root] echo deadline > /sys/block/[disk device]/queue/scheduler
  – Adjust how aggressively the cache is written to disk. Tune these back if you are bursty in write IO:
    • vm.dirty_ratio=30
    • vm.dirty_background_ratio=20
• Track what you change!
  – Before changing anything, monitor it
  – After you make the change, monitor it
  – Then: KEEP monitoring it. Things can change over time in unexpected ways
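Applying the scheduler setting to every disk in an array is easy to loop. The sketch below takes the sysfs root as an argument (normally `/sys`) so it can be tried against a scratch directory first; the function names are hypothetical, and the default values are the ones from the slide. On newer multi-queue kernels the equivalent scheduler is named `mq-deadline`, so treat the literal `deadline` as an assumption about your kernel.

```shell
# set_deadline SYSFS_ROOT DISK...
# Writes "deadline" to each disk's scheduler file under SYSFS_ROOT.
set_deadline() {
    sysfs="$1"; shift
    for disk in "$@"; do
        f="$sysfs/block/$disk/queue/scheduler"
        if [ -w "$f" ]; then
            echo deadline > "$f"
        else
            echo "skip $disk: $f not writable" >&2
        fi
    done
}

# dirty_sysctl_conf [DIRTY_RATIO] [BACKGROUND_RATIO]
# Prints sysctl.conf lines; defaults are the values from the slide.
dirty_sysctl_conf() {
    echo "vm.dirty_ratio=${1:-30}"
    echo "vm.dirty_background_ratio=${2:-20}"
}
```

As root, `set_deadline /sys xvdf xvdg` would cover a two-disk array, and the output of `dirty_sysctl_conf` can be reviewed before it goes into `/etc/sysctl.conf`.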

Monitoring

• Observing:
  – iostat -xcd -t 1
    • Watch the sum of r/s and w/s: this is your IOPS metric. For PIOPS, you want it close to the provisioned amount. We monitor this using CopperEgg custom metrics, and alert if it goes too low or too high.
  – grep -A 1 dirty /proc/vmstat
    • If nr_dirty approaches nr_dirty_threshold, you need to tune down the vm.dirty_* settings to flush writes more often.
    • Reference: http://docs.neo4j.org/chunked/stable/linux-performance-guide.html
• Useful stats to capture:
  – In /proc/fs/xfs/stat
    • xs_trans* -> transactions
    • xs_read/write* -> read/write operations stats
    • xb_* -> buffer stats
• Ignore SMART: it does not work for EBS
• Watch the console log
  – Use the AWS API to look for warning signs of EBS issues
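The two observations above (summing r/s and w/s, and comparing nr_dirty to nr_dirty_threshold) can be turned into numbers a monitoring agent could ship. This is a minimal sketch with hypothetical function names. Because the `iostat -x` column order differs between sysstat versions, it locates the columns by their header names instead of assuming positions.

```shell
# iops_from_iostat: reads `iostat -x` output on stdin and prints
# "DEVICE IOPS" (r/s + w/s) for every device row.
iops_from_iostat() {
    awk '
        NF == 0 { r = 0; w = 0; next }       # blank line ends a device block
        /^Device/ {                          # header row: find the columns
            for (i = 1; i <= NF; i++) {
                if ($i == "r/s") r = i
                if ($i == "w/s") w = i
            }
            next
        }
        r && w { printf "%s %.2f\n", $1, $r + $w }
    '
}

# dirty_pct [VMSTAT_FILE]
# Prints nr_dirty as a percentage of nr_dirty_threshold.
dirty_pct() {
    awk '
        $1 == "nr_dirty"           { d = $2 }
        $1 == "nr_dirty_threshold" { t = $2 }
        END { if (t > 0) printf "%.1f\n", 100 * d / t }
    ' "${1:-/proc/vmstat}"
}
```

Typical use would be `iostat -xcd -t 1 | iops_from_iostat` for a live per-device IOPS stream, and `dirty_pct` on a timer; what counts as "approaching" the threshold is left to your alerting rules.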


Migrations and Capacity Planning

• Using PIOPS?
  – Plan on a data migration path if you need to increase PIOPS
    • You can’t (yet) increase IOPS on the fly
• Migration steps from an EBS-backed RAID:

1. Snapshot 1 hr before, then again, and again; each pass takes less time

2. Stop all services

3. Unmount the filesystem

4. Stop the RAID (mdadm --stop /dev/md0)

5. Take final snapshot

6. Create new volumes based on last snapshot

7. Attach the new volumes to the RAID: mdadm should detect the array and magically make it work.

8. Mount the filesystem

9. Restart services
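The cutover part of the steps above (steps 2 through 9) can be written down as a checklist generator. The sketch below prints commands instead of running them; the function name is hypothetical, service handling and volume creation are left as comments because they are site-specific, and `mdadm --assemble --scan` is one assumed way to let mdadm detect the array on the new volumes. The reassembled device may come up under a different name than the original, so verify it before mounting.

```shell
# migration_plan MD_DEVICE MOUNTPOINT
# Prints the cutover steps for review; nothing is executed.
migration_plan() {
    md="$1"; mnt="$2"
    cat <<EOF
# 2. stop all services (site-specific)
umount $mnt
mdadm --stop $md
# 5. take the final snapshot of each member volume
# 6. create new volumes from those snapshots and attach them
mdadm --assemble --scan
mount $md $mnt
# 9. restart services
EOF
}
```

Running `migration_plan /dev/md0 /data` gives a short runbook that can be pasted into a change ticket and executed step by step.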