Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Post on 17-May-2015

5.626 views 0 download

Tags:

description

Performance Tuning: Pulling a Rabbit From a Hat George Barnett of Atlassian

Transcript of Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

1

George Barnett, Performance Engineer, Atlassian

How to make your apps run faster

Pulling a rabbit from a hat

22

Why are we here?• How to make your Atlassian software run faster

• What doesnʼt work

• What Atlassian does to improve performance

• No development HOWTOs or code!

33

Who can improve performance?• Atlassian

• YOU! (Thanks for watching!)

44

Performance Engineering• Isolate performance issues before they become headaches.

• Make sure these issues get fixed.

55

Simple code lifecycle

6

Developer writes code

Unit & Functional tests

Software artefact

Performance Testing

Reporting

Reporting

commit

tests pass

6

Performance Testing Scripts• Apache JMeter

• In beta for JIRA• Ask support@atlassian.com

• Available for Confluence• http://confluence.atlassian.com/display/DOC/Performance+Testing+Scripts

77

Performance Testing• Real world data and load

• Public Atlassian instances

• Performance test suite• JMeter & scripts

• Automation• Maven & Bamboo

88

Performance Testing• Daily performance tests

• Most products & libraries

• Weekly soak tests• Run for much longer than daily tests

• Reporting for developers• Build screens, dashboards, notifications

99

Reporting

1010

Does it work?

11

0

23

45

68

90

Browse Issue (ms)

Jira 3.13 Jira 4.0

0

150

300

450

600

Browse Page (ms)

Confluence 2.10 Confluence 3.0

11

Upgrade your Atlassian Software!• Dozens of Performance issues fixed

• Release criteria includes performance

1212

You are here• How I did the testing

• Hardware & Software

• Must have improvements• Upgrade Java, Hardware, Remove VMWare and tune Garbage Collection

• Worth doing• Gigabit vs 100mbit, Heap size tuning, use an SSD

• Sadly, not useful• Tune MySQL, Replace MySQL

1313

The contenders• Dell PE 850 - “WALL-E”

• Xeon X3220 2.4Ghz Quad Core• Quad Core, 2 x 4Mb L3

• 8Gb DDR2 RAM

• 2 x 7.2K SAS Disks

• 1 x Intel X25-E 32GB SSD

• ~$1500

1414

The contenders• Dell PE 2950 - “Johnny 5”

• 2 x Xeon E5405 2.0Ghz• Quad Core, 2 x 6M L3

• 32Gb RAM

• 2 x 72Gb 15K Drives

• ~$4000

1515

The contenders• Dell PE R610 - “EVE”

• 2 x Xeon E5520 2.2Ghz• Quad Core, 8M “SmartCache”

• 32Gb Ram

• 2 x 146G 15K drives

• ~$4000

1616

The Workload• JIRA 4, MySQL 5, RHEL 5

• ~70,000 Issues, ~80 projects

• ~30 reqs/sec, JMeter 2.3.4 (patched, JSR-223)

• Traffic modelled on http://jira.atlassian.com with higher writes

1717

About the graphs• Most show Average Response Time in Milliseconds

• Includes time to generate AND deliver the response• Does not include any browser time

• Arrows show if lower or higher is better!

1818

You are here• How I did the testing

• Hardware & Software

• Must have improvements• Upgrade Java, Hardware, Remove VMWare and tune Garbage Collection

• Worth doing• Gigabit vs 100mbit, Heap size tuning, use an SSD

• Sadly, not useful• Tune MySQL, Replace MySQL

1919

Must. Have.

2020

Upgrade Java 1.5 to 1.6• Advances in compiler / VM technology

• Produces better runtime code• Able to use modern instructions, eg SSE• Able to use modern hardware, eg NUMA

2121

Java 1.5 vs 1.6 (Johnny 5)

0

375

750

1125

1500

Average Response Time (ms)

Java 1.5.0_18 Java 1.6.0_16

6.3X22

22

Java 1.5 vs 1.6 (Johnny 5)

0

95

190

285

380

Average CPU (%) 26%23

Java 1.5.0_18 Java 1.6.0_16

23

Upgrade Java to 1.6• 3 x HotSpot VM updates in 1.6.0_XX

• Advancing technology• Compressed OOPS on 64bit• Escape Analysis• G1 Garbage Collector• NUMA

2424

Java 1.6 vs 1.6 (Eve)

138

141

143

146

148

Average Response Time (ms)

Java 1.6.0_7 Java 1.6.0_16

7%25

25

Hardware Upgrades• Hyper-Threading

• QuickPath Interconnect

• Nehalem E5520 (Eve) have 3X on die DDR3 Memory Controllers

• Memory throughput improvements - good for Java!

Technology Speed

DDR2-800 (dual channel) 6.4 GB/sec

DDR3-1333 (dual channel) 21.3 GB/sec

2626

FSB architecture• Shared bus

• 1333 MT/s = 1333M Transfers / sec

• 4 transfers/clock = 266Mhz.

CPUNorthBridge

RAM

I/O: SATA, USB, etc

FSB

CPU South Bridge

2727

Quick Path Interconnect• 3.2Ghz, 6.4 GT/sec

• 25.6 GB/sec

CPU

QPI

CPU

RAM

RAM

IOH

2828

Hyper Threading

CPUThread 1

Thread 2

Thread 1 runs until it hits a cache miss.

Thread 2 runs on the same CPU while T1 is awaiting data

2929

Hardware Upgrades

0

500

1000

1500

2000

Average Response Time (ms)

2.4Ghz Xeon (Wall-E) 2.0Ghz Xeon (Johnny 5) 2.2Ghz Xeon (Eve)

30

8-12X

30

Improved GC Throughput

0

7500

15000

22500

30000

GC Throughput (MB/sec)

2.4Ghz Xeon (Wall-E) 2.0Ghz Xeon (Johnny 5) 2.2Ghz Xeon (Eve)

31

2-3X

31

• Javaʼs Virtual Machine manages memory

• New objects are continually created during runtime

• GC is the process of collecting dead objects to reclaim memory

Garbage Collection

3232

Tune Garbage Collection• Getting your GC tuning wrong is bad

• How bad?

• Are the defaults sane?

3333

Tune Garbage Collection

0

1000

2000

3000

4000

Wall-E (ms) Johnny 5 (ms) Eve (ms)

CMS ParNew ParOld

34Average Response Time

2X

34

Tune Garbage Collection

0

125

250

375

500

Johnny 5 (ms) Eve (ms)

CMS ParNew ParOld

35Average Response Time

2X

35

Tune Garbage Collection• Defaults are sane

• Parallel New + Parallel Old is fastest

• CMS prioritises low pauses over CPU usage

• CMS does not compact - can lead to heap fragmentation

3636

VMWare• Does being in a VM hurt?

• What if the risks are managed?

3737

Remove VMWare

0

43

85

128

170

Average Response Time (ms)

Real ESX 3.5 ESX 4i

38

43%

38

Remove VMWare• On average ~30% slower

• Under extreme resource starvation, up to 100% slower

• Avoid VMWare for high traffic instances

• See the Atlassian docs for recommendations• http://confluence.atlassian.com/display/DOC/Running+Confluence+in+a+Virtualised+Environment• http://confluence.atlassian.com/display/JIRA/Running+JIRA+in+a+Virtualised+Environment

3939

Upgrade your browsers!• Use a modern browser

• IE8, Safari 4 or Firefox 3.6

• Pick the browser thats right for you

• Avoid IE6 (unsupported and slow)• http://sixrevisions.com/infographics/performance-comparison-of-major-web-browsers/

4040

You are here• How I did the testing

• Hardware & Software

• Must have improvements• Upgrade Java, Hardware, Remove VMWare and tune Garbage Collection

• Worth doing• Gigabit vs 100mbit, Heap size tuning, use an SSD

• Sadly, not useful• Tune MySQL, Replace MySQL

4141

Maybe.

4242

Gigabit Network vs 100Mbit• Gigabit is faster, sure

• But 100Mb is OK if youʼre not doing big transfers, RIGHT?!

4343

Use Gigabit

1600

1675

1750

1825

1900

Average Response Time (ms)

100mb/sec 1000mb/sec

44

10%

44

Use Gigabit• Lower RTT on Gigabit

• Unlikely to hit throughput limit

• If you have enough CPU power, keep the DB on localhost

4545

Use Gigabit

150

163

175

188

200

ReIndex Time (seconds)

100Mbit 1000Mbit

46

26%

46

Heap size tuning• Memory starvation is bad, but can we drown the system with too much

memory

4747

Heap size - Wall-E

0

500

1000

1500

2000

Average Response Time

1.5G 3G 6G

48

23%

48

Heap size - Johnny 5

0

73

145

218

290

Average Response Time

1.5G 3G 6G 15G

49

49%

49

Heap size - Eve

0

43

85

128

170

Average Response Time

1.5G 3G 6G 15G

50

28%

50

Heap size tuning• Heap too small - bad

• Heap too big - not bad!

• Bigger heaps give the JVM more room to work

• Diminishing returns from really big heaps

5151

Buy an SSD• SSDs are FAST!

• SSDs are EXPENSIVE!

• SSDs are all the rage!

• DBAs are finally happy (almost!)

5252

Buy an SSD

0

200

400

600

800

Browse Issue Create Issue

SATA SSD

53

33%

53

Buy an SSD• Fast writes

• 0.1ms Avg Service Time

• Most improvement comes from SSD on the DB

• Writing to the Lucene index means re-opening IndexReaders• Much faster on SSD

• Lucene Search Term Cache fits in OS buffers

5454

• How I did the testing• Hardware & Software

• Must have improvements• Upgrade Java, Hardware, Remove VMWare and tune Garbage Collection

• Worth doing• Gigabit vs 100mbit, Heap size tuning, use an SSD

• Sadly, not useful• Tune MySQL, Replace MySQL

You are here

5555

Meh.

5656

Replace MySQL with MySQL• MySQL 5.0.45 - RHEL 5

• MySQL 5.0.84 Percona B18.

• Percona HighPerf patches fix contention points

• Improved I/O

• Better scaling

5757

Replace MySQL with MySQL

1000

1172

1344

1515

1687

Average Response Time (ms)

MySQL 5.0.45 Percona MySQL 5.0.85

58

0.4%

58

Replace MySQL with MySQL

0

6

13

19

25

Average CPU(%)

MySQL 5.0.45 Percona MySQL 5.0.84

59

0%

59

Replace MySQL with MySQL• JIRA doesnʼt do enough DB queries

• JIRA uses Lucene for many queries

• Percona Highperf MySQL really shines at high query rates• Try this if you share your DB server with other apps!

6060

Tune MySQL• InnoDB-Heavy-4G.conf

sort_buffer_size = 8Mjoin_buffer_size = 8Mread_buffer_size = 2Mread_rnd_buffer_size = 16Mthread_concurrecy = 16

6161

Tune MySQL

1000

1178

1355

1533

1710

Average Response Time (ms)

Defaults Tuned

62

1%

62

Tune MySQL

MyISAM ALL Engines

key_cache_age_threshold, key_cache_block_size, key_cache_division_limit, key_buffer_sizeread_buffer_size, read_rnd_buffer_sizebulk_insert_buffer_size

join_buffer_size - Large joins without Indexes

sort_buffer_size - Used by ORDER BY / GROUP BY

query_cache_sizequery_cache_limitquery_cache_min_res_unit

6363

Tune MySQL• Be sure to set innodb_buffer_pool_size

• Note! innodb_log_file_size will affect shutdown speed

6464

Remember!• Upgrade!

• Atlassian software• Java to 1.6• Your server

• Use Parallel Garbage Collection with a BIG heap

• Remove VMWare if you can

• Use 1000mbit network

• Buy an SSD for write workloads

6565

A final word• Donʼt resist change but TEST EVERYTHING!

6666

Thanks! Questions?• Come chat to me in the hallway!

• Charlie Lounge, 10 June, 3pm - 4pm

• http://blogs.atlassian.com/developer

Flickr Attributions:Siesta in Denmark - http://www.flickr.com/photos/jakecaptive/27123533/Day 11 - http://www.flickr.com/photos/ivanomak/4057632458/Blue Streak - http://www.flickr.com/photos/jettajet/2061715292/

6767