Gluster for Geeks: Performance Tuning Tips & Tricks

Jacob Shucart, August 25th, 2011

description

In this Gluster for Geeks technical webinar, Jacob Shucart, Senior Systems Engineer, will provide useful tips and tricks to make a Gluster cluster meet your performance requirements. He will review considerations for all phases, including planning, configuration, implementation, tuning, and benchmarking.

Topics covered will include:
• Protocols (CIFS, NFS, GlusterFS)
• Hardware configuration
• Tuning parameters
• Performance benchmarks

Transcript of Gluster for Geeks: Performance Tuning Tips & Tricks

Page 1: Gluster for Geeks: Performance Tuning Tips & Tricks

Jacob Shucart

August 25th, 2011

Gluster for Geeks: Performance Tips & Tricks

Page 2: Gluster for Geeks: Performance Tuning Tips & Tricks

A Better Way To Do Storage 2

How To Ask a Question?

Some Housekeeping Items…

Ask a question at any time

Questions will be answered at the end of the webinar

Slides will be available after the webinar

The webinar is being recorded

Page 3: Gluster for Geeks: Performance Tuning Tips & Tricks

Gluster for Geeks

The Gluster for Geeks webinar series is designed for technical audiences who are familiar with GlusterFS.

In this edition, “Performance tuning tips and tricks”, we will discuss in detail the performance-related considerations for running a GlusterFS deployment.

Page 4: Gluster for Geeks: Performance Tuning Tips & Tricks

Topics

Planning

Configuration

Implementing

Tuning

Benchmarking

Top 5 Issues

Page 5: Gluster for Geeks: Performance Tuning Tips & Tricks

Planning – Key Considerations

Performance requirements

– What performance do you need to hit, and how do you plan to get to it?
• Read
• Write
• Throughput
• Availability

For a given performance level what type is required?

– E.g. for a throughput of X and capacity of Y what is needed?

Workloads

– What is the workload in the environment?
– Small files?
– Large files?
– Is throughput your only consideration?
– What is the application?
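The "throughput of X and capacity of Y" question above can be roughed out with simple arithmetic. The following is a minimal sketch, not a sizing tool: it assumes aggregate throughput scales linearly with node count, and every per-node figure (400 MB/s, 24 TB) is a hypothetical placeholder to be replaced with numbers measured on your own hardware.

```python
import math


def nodes_needed(target_mb_s, target_tb, node_mb_s, node_tb, replica_count=1):
    """Rough node count that satisfies both a throughput and a capacity target.

    Assumption: throughput adds up linearly across nodes, which real
    clusters only approximate.
    """
    # Nodes required to reach the aggregate throughput target.
    by_throughput = math.ceil(target_mb_s / node_mb_s)
    # Replication stores every file replica_count times, so the raw
    # capacity that must be provisioned grows by the same factor.
    by_capacity = math.ceil(target_tb * replica_count / node_tb)
    nodes = max(by_throughput, by_capacity)
    # Round up so that nodes can be grouped into complete replica sets.
    if nodes % replica_count:
        nodes += replica_count - nodes % replica_count
    return nodes


# Hypothetical example: 2000 MB/s and 100 TB usable, 2-way replication,
# nodes that each deliver 400 MB/s and hold 24 TB.
print(nodes_needed(2000, 100, 400, 24, replica_count=2))
```

In this hypothetical case capacity, not throughput, drives the node count, which is the kind of result the planning questions are meant to surface.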

Page 6: Gluster for Geeks: Performance Tuning Tips & Tricks

Planning - Sizing and Architecture

Gluster performance relies on hardware/underlying infrastructure

– CPU, memory, disks, network

– Virtual machine & cloud infrastructure

– Number of systems in the cluster depends on performance and capacity requirements

– There are many ways to meet organizational needs

– For on-prem deployments, 2U & 4U DAS systems and JBODs are great building blocks

Examples: 3 common deployment scenarios

– Capacity-centric environments
• 2U & 4U DAS systems with multiple JBODs
• Lower RAM and CPU requirements
• Lower network requirements

– Mixed capacity and performance environments
• 2U & 4U DAS systems with 1-2 JBODs max
• Higher RAM and CPU requirements
• Low to high network requirements

– High performance environments
• 1U or 2U systems with no JBODs
• Highest RAM and CPU requirements
• Fast disks and fast network

Page 7: Gluster for Geeks: Performance Tuning Tips & Tricks

Configuration

Choosing the correct volume type for a workload

Volume options include
– Distribute – higher performance, no redundancy
– Replicate (or distribute+replicate) – general purpose, HA, faster reads
– Stripe (or distribute+stripe) – high concurrent reads, low writes, no redundancy

Protocols & performance
– GlusterFS gives the best overall performance (pNFS-like functionality)
– NFS gives excellent performance given the right workload
– CIFS should only be used for Windows systems

Data flow
– How do supported protocols differ?

Page 8: Gluster for Geeks: Performance Tuning Tips & Tricks

Implementing – Cluster Hardware Configuration

Node and cluster configurations

– More CPU means more parallel threads on servers

– More RAM means more cached operations

– More network means more throughput

Dedicated backend network for node communication

– Dedicated backend network should be used for NFS and CIFS

– Recommend 10GbE minimum

GlusterFS native only uses inter-node communication for management calls
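To put "more network means more throughput" into numbers, the sketch below converts a nominal link speed into an approximate ceiling in MB/s. The `efficiency` factor of 0.9 is an assumed allowance for protocol overhead, not a measured value; benchmark your own network to replace it.

```python
def link_ceiling_mb_s(link_gbit_s, efficiency=0.9):
    """Approximate usable throughput of a network link in MB/s.

    1 Gbit/s is 1000 Mbit/s, or 125 MB/s before any protocol overhead.
    The efficiency factor is an assumption, not a measurement.
    """
    return link_gbit_s * 1000 / 8 * efficiency


for gbit in (1, 10):
    print(f"{gbit} GbE: roughly {link_ceiling_mb_s(gbit):.1f} MB/s usable")
```

No single client or server can exceed this ceiling on one link, however fast its disks are.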

Page 9: Gluster for Geeks: Performance Tuning Tips & Tricks

Implementing Gluster - Fundamentals

Distribute only

• Non-redundant at the brick level

• Cuts hardware, software costs in half.

• Failure of a brick or node results in loss of access to the data on those bricks.

• Writes destined to the failed brick will fail.

• Redundant hardware (for example, RAID) is strongly recommended.

Page 10: Gluster for Geeks: Performance Tuning Tips & Tricks

Implementing Gluster - Fundamentals

Distribute with replica

• Redundant at the brick level

• Failure of a brick or node does not affect I/O.

• Writes are written simultaneously to each replica.

• Any number of replicas is supported.

• Gluster Native, CIFS, and NFS support stateful failover. (Gluster Native only in AWS)

• Redundant hardware (for example, RAID) is strongly recommended.
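Because writes go to every replica at once, replication costs both capacity and write bandwidth. The sketch below estimates usable capacity and a per-client write ceiling. It assumes the Gluster Native client, where the client itself sends each write to all replicas, and it assumes bricks are grouped into replica sets in the order listed; the brick sizes and link speed are hypothetical.

```python
def replicated_volume_estimate(brick_sizes_tb, replica_count, client_link_mb_s):
    """Estimate usable capacity (TB) and per-client write ceiling (MB/s).

    Assumptions: consecutive bricks form a replica set, and the client
    sends every write to each replica over its own network link.
    """
    if len(brick_sizes_tb) % replica_count != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    usable_tb = 0.0
    for start in range(0, len(brick_sizes_tb), replica_count):
        # A replica set can hold only as much as its smallest brick.
        usable_tb += min(brick_sizes_tb[start:start + replica_count])
    # The client's link carries replica_count copies of every write.
    write_ceiling_mb_s = client_link_mb_s / replica_count
    return usable_tb, write_ceiling_mb_s


# Hypothetical example: four 10 TB bricks, 2-way replication, 1000 MB/s client link.
usable, write_ceiling = replicated_volume_estimate([10, 10, 10, 10], 2, 1000)
print(f"usable: {usable} TB, write ceiling per client: {write_ceiling} MB/s")
```

Reads are not divided in the same way, which is consistent with replicated volumes being described as offering faster reads.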

Page 11: Gluster for Geeks: Performance Tuning Tips & Tricks

Implementing Gluster - Fundamentals

Gluster Native client data flow

Page 12: Gluster for Geeks: Performance Tuning Tips & Tricks

Implementing Gluster - Fundamentals

NFS, CIFS dataflow

Page 13: Gluster for Geeks: Performance Tuning Tips & Tricks

Tuning

Key tuning parameters

– performance.write-behind-window-size 65535 (in bytes)

– performance.cache-refresh-timeout 1 (in seconds)

– performance.cache-size 1073741824 (in bytes)

– performance.read-ahead off (only for 1GbE)

– Default settings are suitable for mixed workloads
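The parameters above are applied per volume with `gluster volume set`. The sketch below only builds the command strings so they can be reviewed before anything is run; the volume name `testvol` is hypothetical, and the values are the ones listed on this slide rather than recommendations for every workload.

```python
# Option names and values as listed on the slide.
TUNING_OPTIONS = {
    "performance.write-behind-window-size": "65535",   # bytes
    "performance.cache-refresh-timeout": "1",          # seconds
    "performance.cache-size": "1073741824",            # bytes (1 GiB)
    "performance.read-ahead": "off",                   # slide: only for 1GbE
}


def volume_set_commands(volume, options):
    """Build one 'gluster volume set' command string per option."""
    return [
        f"gluster volume set {volume} {key} {value}"
        for key, value in options.items()
    ]


# "testvol" is a hypothetical volume name used for illustration.
for command in volume_set_commands("testvol", TUNING_OPTIONS):
    print(command)
```

Changing one option at a time and re-running the same benchmark makes it easier to attribute any difference to that option.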

Tuning for different environments

– For Amazon, m1.xlarge or greater

– Understand hardware/firmware settings and their impact on performance (for example, CPU frequency scaling, IB, 10GbE, and the TCP Offload Engine)

Page 14: Gluster for Geeks: Performance Tuning Tips & Tricks

Benchmarking

From the Gluster Performance white paper

– iozone -R -l 3 -u 5 -r 512k -s 256m -F /mnt/1 /mnt/2 /mnt/3 /mnt/4 /mnt/5

– dd if=/dev/zero of=/mnt/test bs=1M count=1

Performance expectations

– Get a baseline benchmark of disks on systems
– What can you expect from your network?

IOPS vs. throughput

– Is your workload better measured in throughput?
– Certain operations have a different impact (e.g. directory creation)
– If IOPS is your measurement, remember latency
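The relationship between IOPS, throughput, and latency is plain arithmetic, sketched below. The function names and sample figures are illustrative only; the latency bound assumes a single thread with one operation outstanding at a time.

```python
def throughput_mb_s(iops, block_size_kb):
    """Throughput implied by an IOPS figure at a given block size."""
    return iops * block_size_kb / 1024


def max_iops_per_thread(latency_ms):
    """Upper bound on IOPS for one thread issuing one operation at a time.

    Each operation must finish before the next one starts, so the thread
    cannot complete more than 1 / latency operations per second.
    """
    return 1000 / latency_ms


# Hypothetical figures: the same 2000 IOPS is a very different load
# at 512 KB blocks than at 4 KB blocks.
print(throughput_mb_s(2000, 512))   # large-block streaming
print(throughput_mb_s(2000, 4))     # small-block access
print(max_iops_per_thread(2))       # 2 ms round trip per operation
```

This is why small-file workloads tend to be bounded by latency and large-file workloads by bandwidth.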

Page 15: Gluster for Geeks: Performance Tuning Tips & Tricks

Top 5 Causes for Performance Issues

Straight from our professional services performance team

1. Underpowered/mis-configured disks

2. Underpowered/mis-configured network

3. Faulty hardware (broken, bad blocks, etc.)

4. Too few servers

5. Wrong protocol for the job

Page 16: Gluster for Geeks: Performance Tuning Tips & Tricks

Conclusion

GlusterFS performance depends heavily on the underlying hardware

You should understand your workloads to guide your hardware configuration

The default parameters work well for general workloads

Several tuning parameters are available

When experiencing performance issues, check the disks and network first

Page 17: Gluster for Geeks: Performance Tuning Tips & Tricks

Polling Question

What should we talk about in next month's Gluster Geeks Only webinar?

A. Setting up a basic Gluster cluster

B. Gluster Geo-Replication

C. Frequently Asked Questions

D. Gluster Translators

E. Other technical topics

Page 18: Gluster for Geeks: Performance Tuning Tips & Tricks

Questions & Resources

What are your performance questions?
– Ask now using the GoToWebinar questions panel

Helpful resources
– Performance white paper posted here: http://www.gluster.com/products/resources/
– Documentation: http://gluster.com/community/documentation
– Questions?: http://community.gluster.org/