Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

67
1

description

Performance Tuning: Pulling a Rabbit From a Hat George Barnett of Atlassian

Transcript of Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Page 1: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

1

Page 2: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

George Barnett, Performance Engineer, Atlassian

How to make your apps run faster

Pulling a rabbit from a hat

22

Page 3: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Why are we here?• How to make your Atlassian software run faster

• What doesnʼt work

• What Atlassian does to improve performance

• No development HOWTOs or code!

33

Page 4: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Who can improve performance?• Atlassian

• YOU! (Thanks for watching!)

44

Page 5: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Performance Engineering• Isolate performance issues before they become headaches.

• Make sure these issues get fixed.

55

Page 6: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Simple code lifecycle

6

Developer writes code

Unit & Functional tests

Software artefact

Performance Testing

Reporting

Reporting

commit

tests pass

6

Page 7: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Performance Testing Scripts• Apache JMeter

• In beta for JIRA• Ask [email protected]

• Available for Confluence• http://confluence.atlassian.com/display/DOC/Performance+Testing+Scripts

77

Page 8: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Performance Testing• Real world data and load

• Public Atlassian instances

• Performance test suite• JMeter & scripts

• Automation• Maven & Bamboo

88

Page 9: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Performance Testing• Daily performance tests

• Most products & libraries

• Weekly soak tests• Run for much longer than daily tests

• Reporting for developers• Build screens, dashboards, notifications

99

Page 10: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Reporting

1010

Page 11: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Does it work?

11

0

23

45

68

90

Browse Issue (ms)

Jira 3.13 Jira 4.0

0

150

300

450

600

Browse Page (ms)

Confluence 2.10 Confluence 3.0

11

Page 12: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Upgrade your Atlassian Software!• Dozens of Performance issues fixed

• Release criteria includes performance

1212

Page 13: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

You are here• How I did the testing

• Hardware & Software

• Must have improvements• Upgrade Java, Hardware, Remove VMWare and tune Garbage Collection

• Worth doing• Gigabit vs 100mbit, Heap size tuning, use an SSD

• Sadly, not useful• Tune MySQL, Replace MySQL

1313

Page 14: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

The contenders• Dell PE 850 - “WALL-E”

• Xeon X3220 2.4Ghz Quad Core• Quad Core, 2 x 4Mb L3

• 8Gb DDR2 RAM

• 2 x 7.2K SAS Disks

• 1 x Intel X25-E 32GB SSD

• ~$1500

1414

Page 15: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

The contenders• Dell PE 2950 - “Johnny 5”

• 2 x Xeon E5405 2.0Ghz• Quad Core, 2 x 6M L3

• 32Gb RAM

• 2 x 72Gb 15K Drives

• ~$4000

1515

Page 16: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

The contenders• Dell PE R610 - “EVE”

• 2 x Xeon E5520 2.2Ghz• Quad Core, 8M “SmartCache”

• 32Gb Ram

• 2 x 146G 15K drives

• ~$4000

1616

Page 17: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

The Workload• JIRA 4, MySQL 5, RHEL 5

• ~70,000 Issues, ~80 projects

• ~30 reqs/sec, JMeter 2.3.4 (patched, JSR-223)

• Traffic modelled on http://jira.atlassian.com with higher writes

1717

Page 18: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

About the graphs• Most show Average Response Time in Milliseconds

• Includes time to generate AND deliver the response• Does not include any browser time

• Arrows show if lower or higher is better!

1818

Page 19: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

You are here• How I did the testing

• Hardware & Software

• Must have improvements• Upgrade Java, Hardware, Remove VMWare and tune Garbage Collection

• Worth doing• Gigabit vs 100mbit, Heap size tuning, use an SSD

• Sadly, not useful• Tune MySQL, Replace MySQL

1919

Page 20: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Must. Have.

2020

Page 21: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Upgrade Java 1.5 to 1.6• Advances in compiler / VM technology

• Produces better runtime code• Able to use modern instructions, eg SSE• Able to use modern hardware, eg NUMA

2121

Page 22: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Java 1.5 vs 1.6 (Johnny 5)

0

375

750

1125

1500

Average Response Time (ms)

Java 1.5.0_18 Java 1.6.0_16

6.3X22

22

Page 23: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Java 1.5 vs 1.6 (Johnny 5)

0

95

190

285

380

Average CPU (%) 26%23

Java 1.5.0_18 Java 1.6.0_16

23

Page 24: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Upgrade Java to 1.6• 3 x HotSpot VM updates in 1.6.0_XX

• Advancing technology• Compressed OOPS on 64bit• Escape Analysis• G1 Garbage Collector• NUMA

2424

Page 25: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Java 1.6 vs 1.6 (Eve)

138

141

143

146

148

Average Response Time (ms)

Java 1.6.0_7 Java 1.6.0_16

7%25

25

Page 26: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Hardware Upgrades• Hyper-Threading

• QuickPath Interconnect

• Nehalem E5520 (Eve) have 3X on die DDR3 Memory Controllers

• Memory throughput improvements - good for Java!

Technology Speed

DDR2-800 (dual channel) 6.4 GB/sec

DDR3-1333 (dual channel) 21.3 GB/sec

2626

Page 27: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

FSB architecture• Shared bus

• 1333 MT/s = 1333M Transfers / sec

• 4 transfers/clock = 266Mhz.

CPUNorthBridge

RAM

I/O: SATA, USB, etc

FSB

CPU South Bridge

2727

Page 28: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Quick Path Interconnect• 3.2Ghz, 6.4 GT/sec

• 25.6 GB/sec

CPU

QPI

CPU

RAM

RAM

IOH

2828

Page 29: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Hyper Threading

CPUThread 1

Thread 2

Thread 1 runs until it hits a cache miss.

Thread 2 runs on the same CPU while T1 is awaiting data

2929

Page 30: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Hardware Upgrades

0

500

1000

1500

2000

Average Response Time (ms)

2.4Ghz Xeon (Wall-E) 2.0Ghz Xeon (Johnny 5) 2.2Ghz Xeon (Eve)

30

8-12X

30

Page 31: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Improved GC Throughput

0

7500

15000

22500

30000

GC Throughput (MB/sec)

2.4Ghz Xeon (Wall-E) 2.0Ghz Xeon (Johnny 5) 2.2Ghz Xeon (Eve)

31

2-3X

31

Page 32: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

• Javaʼs Virtual Machine manages memory

• New objects are continually created during runtime

• GC is the process of collecting dead objects to reclaim memory

Garbage Collection

3232

Page 33: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Tune Garbage Collection• Getting your GC tuning wrong is bad

• How bad?

• Are the defaults sane?

3333

Page 34: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Tune Garbage Collection

0

1000

2000

3000

4000

Wall-E (ms) Johnny 5 (ms) Eve (ms)

CMS ParNew ParOld

34Average Response Time

2X

34

Page 35: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Tune Garbage Collection

0

125

250

375

500

Johnny 5 (ms) Eve (ms)

CMS ParNew ParOld

35Average Response Time

2X

35

Page 36: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Tune Garbage Collection• Defaults are sane

• Parallel New + Parallel Old is fastest

• CMS prioritises low pauses over CPU usage

• CMS does not compact - can lead to heap fragmentation

3636

Page 37: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

VMWare• Does being in a VM hurt?

• What if the risks are managed?

3737

Page 38: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Remove VMWare

0

43

85

128

170

Average Response Time (ms)

Real ESX 3.5 ESX 4i

38

43%

38

Page 39: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Remove VMWare• On average ~30% slower

• Under extreme resource starvation, up to 100% slower

• Avoid VMWare for high traffic instances

• See the Atlassian docs for recommendations• http://confluence.atlassian.com/display/DOC/Running+Confluence+in+a+Virtualised+Environment• http://confluence.atlassian.com/display/JIRA/Running+JIRA+in+a+Virtualised+Environment

3939

Page 40: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Upgrade your browsers!• Use a modern browser

• IE8, Safari 4 or Firefox 3.6

• Pick the browser thats right for you

• Avoid IE6 (unsupported and slow)• http://sixrevisions.com/infographics/performance-comparison-of-major-web-browsers/

4040

Page 41: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

You are here• How I did the testing

• Hardware & Software

• Must have improvements• Upgrade Java, Hardware, Remove VMWare and tune Garbage Collection

• Worth doing• Gigabit vs 100mbit, Heap size tuning, use an SSD

• Sadly, not useful• Tune MySQL, Replace MySQL

4141

Page 42: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Maybe.

4242

Page 43: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Gigabit Network vs 100Mbit• Gigabit is faster, sure

• But 100Mb is OK if youʼre not doing big transfers, RIGHT?!

4343

Page 44: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Use Gigabit

1600

1675

1750

1825

1900

Average Response Time (ms)

100mb/sec 1000mb/sec

44

10%

44

Page 45: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Use Gigabit• Lower RTT on Gigabit

• Unlikely to hit throughput limit

• If you have enough CPU power, keep the DB on localhost

4545

Page 46: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Use Gigabit

150

163

175

188

200

ReIndex Time (seconds)

100Mbit 1000Mbit

46

26%

46

Page 47: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Heap size tuning• Memory starvation is bad, but can we drown the system with too much

memory

4747

Page 48: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Heap size - Wall-E

0

500

1000

1500

2000

Average Response Time

1.5G 3G 6G

48

23%

48

Page 49: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Heap size - Johnny 5

0

73

145

218

290

Average Response Time

1.5G 3G 6G 15G

49

49%

49

Page 50: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Heap size - Eve

0

43

85

128

170

Average Response Time

1.5G 3G 6G 15G

50

28%

50

Page 51: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Heap size tuning• Heap too small - bad

• Heap too big - not bad!

• Bigger heaps give the JVM more room to work

• Diminishing returns from really big heaps

5151

Page 52: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Buy an SSD• SSDs are FAST!

• SSDs are EXPENSIVE!

• SSDs are all the rage!

• DBAs are finally happy (almost!)

5252

Page 53: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Buy an SSD

0

200

400

600

800

Browse Issue Create Issue

SATA SSD

53

33%

53

Page 54: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Buy an SSD• Fast writes

• 0.1ms Avg Service Time

• Most improvement comes from SSD on the DB

• Writing to the Lucene index means re-opening IndexReaders• Much faster on SSD

• Lucene Search Term Cache fits in OS buffers

5454

Page 55: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

• How I did the testing• Hardware & Software

• Must have improvements• Upgrade Java, Hardware, Remove VMWare and tune Garbage Collection

• Worth doing• Gigabit vs 100mbit, Heap size tuning, use an SSD

• Sadly, not useful• Tune MySQL, Replace MySQL

You are here

5555

Page 56: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Meh.

5656

Page 57: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Replace MySQL with MySQL• MySQL 5.0.45 - RHEL 5

• MySQL 5.0.84 Percona B18.

• Percona HighPerf patches fix contention points

• Improved I/O

• Better scaling

5757

Page 58: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Replace MySQL with MySQL

1000

1172

1344

1515

1687

Average Response Time (ms)

MySQL 5.0.45 Percona MySQL 5.0.85

58

0.4%

58

Page 59: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Replace MySQL with MySQL

0

6

13

19

25

Average CPU(%)

MySQL 5.0.45 Percona MySQL 5.0.84

59

0%

59

Page 60: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Replace MySQL with MySQL• JIRA doesnʼt do enough DB queries

• JIRA uses Lucene for many queries

• Percona Highperf MySQL really shines at high query rates• Try this if you share your DB server with other apps!

6060

Page 61: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Tune MySQL• InnoDB-Heavy-4G.conf

sort_buffer_size = 8Mjoin_buffer_size = 8Mread_buffer_size = 2Mread_rnd_buffer_size = 16Mthread_concurrecy = 16

6161

Page 62: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Tune MySQL

1000

1178

1355

1533

1710

Average Response Time (ms)

Defaults Tuned

62

1%

62

Page 63: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Tune MySQL

MyISAM ALL Engines

key_cache_age_threshold, key_cache_block_size, key_cache_division_limit, key_buffer_sizeread_buffer_size, read_rnd_buffer_sizebulk_insert_buffer_size

join_buffer_size - Large joins without Indexes

sort_buffer_size - Used by ORDER BY / GROUP BY

query_cache_sizequery_cache_limitquery_cache_min_res_unit

6363

Page 64: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Tune MySQL• Be sure to set innodb_buffer_pool_size

• Note! innodb_log_file_size will affect shutdown speed

6464

Page 65: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Remember!• Upgrade!

• Atlassian software• Java to 1.6• Your server

• Use Parallel Garbage Collection with a BIG heap

• Remove VMWare if you can

• Use 1000mbit network

• Buy an SSD for write workloads

6565

Page 66: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

A final word• Donʼt resist change but TEST EVERYTHING!

6666

Page 67: Performance Tuning: Pulling a Rabbit From a Hat - Atlassian Summit 2010

Thanks! Questions?• Come chat to me in the hallway!

• Charlie Lounge, 10 June, 3pm - 4pm

• http://blogs.atlassian.com/developer

Flickr Attributions:Siesta in Denmark - http://www.flickr.com/photos/jakecaptive/27123533/Day 11 - http://www.flickr.com/photos/ivanomak/4057632458/Blue Streak - http://www.flickr.com/photos/jettajet/2061715292/

6767