G1 collector and tuning and Cassandra

Chris Lohfink G1 Garbage Collector Tuning for Cassandra


Who Am I

● Co-Organizer (Jeff is the real deal)
● 2014 - 2015 Apache Cassandra MVP
● DataStax Software Engineer
  ○ OpsCenter

© 2015 DataStax, All Rights Reserved. 2

1 About G1
2 G1 Monitoring
3 Cassandra G1 Tuning
4 Questions

G1 Collector: Overview

Garbage collection

● Allow the JVM to reclaim space used by objects that are no longer referenced
● Generational heap
  ○ Young
    ■ function locals
    ■ loops
  ○ Old
    ■ Things that have been around a long time
    ■ SingletonProxyFactoryBeanImpl

source: http://deepakmodi2006.blogspot.com/

Garbage Collection: Historically

source: http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html

About G1 Collector

• Garbage First Collector
• A low-pause, parallel & concurrent collector
• Easy to use/tune, “just works”
• Targeted for larger heaps and multiple processors (server-style collector)

When to use the G1 Collector

• When low/predictable pauses are valued over maximum throughput
• Predictable GC pause durations
• Larger heaps
• Tuning GCs doesn’t sound fun

How it Works

Take the heap and break it into many fixed-size regions (1-32 MB each), with a goal of ~2048 regions.
-XX:G1HeapRegionSize=n

A region can be one of Eden, Survivor, Old, or Humongous (for objects >50% of the region size)

[Figure: heap divided into regions; legend: Eden, Survivor, Old, Humongous]
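The region-size rule above can be worked through as arithmetic. This is a minimal sketch, assuming the size is the largest power of two not exceeding heap/2048, clamped to 1-32 MB; the real JVM heuristic may differ in detail, and `region_size` and `is_humongous` are illustrative names, not JVM APIs.

```python
MB = 1024 * 1024
TARGET_REGIONS = 2048                     # "goal ~2048" from the slide
MIN_REGION, MAX_REGION = 1 * MB, 32 * MB  # 1-32 MB bounds from the slide


def region_size(heap_bytes: int) -> int:
    """Largest power of two <= heap/2048, clamped to the 1-32 MB range."""
    raw = max(heap_bytes // TARGET_REGIONS, 1)
    size = 1 << (raw.bit_length() - 1)    # round down to a power of two
    return min(max(size, MIN_REGION), MAX_REGION)


def is_humongous(object_bytes: int, region_bytes: int) -> bool:
    """True when the object is larger than half a region (">50%" on the slide)."""
    return object_bytes * 2 > region_bytes


if __name__ == "__main__":
    for gb in (2, 8, 16, 64):
        size = region_size(gb * 1024 * MB)
        print(f"{gb} GB heap -> {size // MB} MB regions")
```

Under these assumptions a 16 GB heap ends up with 8 MB regions, so any single allocation above 4 MB would be treated as humongous.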

How it Works

There is no fixed number of regions of each type. The mix changes adaptively to meet the pause goal set in -XX:MaxGCPauseMillis=200

Unless the young generation size is overridden with -Xmn or -XX:NewRatio

[Figure: heap region grid; legend: Eden, Survivor, Old, Humongous]

Young Generation GCs

• Stop the world
• Triggered when the eden regions are filled

Young Generation Collection

• Stops the world
• Builds the collection set (CSet)
  • Regions that are involved in the collection
  • Consists of Eden and Survivor regions *

Young Generation Collection

• “Ext Root Scanning”: the time spent by each worker thread scanning, starting from the external roots (globals, registers, thread stacks and VM data structures)
• E.g. the internal JVM System dictionary, which holds all the classes that are loaded by the system

Young Generation Collection

● Update Remembered Sets
  ○ pauses the refinement threads
  ○ finishes processing the dirty card table queue

Young Generation Collection

[Figure: O1, O2, O3 (old) and E1 (eden), with the Card Table and the Remembered Set (RSet)]

Program executes: O1.attribute = E1

● The card table is updated, saying that something in that part of the region contains a reference to the young gen
● The remembered set includes a reference to the card
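The card-marking step above can be sketched as a toy model. It assumes 512-byte cards and represents addresses as plain integers; `Region`, `Heap`, `write_ref` and `update_rs` are hypothetical names for illustration and do not mirror HotSpot internals. A real collector finds the target by scanning the dirty card, whereas this sketch simply queues the target region alongside the card.

```python
from collections import deque

CARD_SIZE = 512  # bytes covered by one card (assumed for illustration)


class Region:
    def __init__(self, name, start, size):
        self.name, self.start, self.size = name, start, size
        self.rset = set()  # cards in OTHER regions that point into this one


class Heap:
    def __init__(self, regions):
        self.regions = regions
        self.card_table = set()     # dirty cards as (region name, card index)
        self.dirty_queue = deque()  # cards still waiting to be processed

    def region_of(self, addr):
        for region in self.regions:
            if region.start <= addr < region.start + region.size:
                return region
        raise ValueError(f"address {addr} is outside the heap")

    def write_ref(self, field_addr, target_addr):
        """Write barrier: runs on every `obj.field = target` store."""
        src, dst = self.region_of(field_addr), self.region_of(target_addr)
        if src is dst:
            return  # references within one region are not tracked
        card = (src.name, (field_addr - src.start) // CARD_SIZE)
        self.card_table.add(card)            # mark the card dirty
        self.dirty_queue.append((card, dst))

    def update_rs(self):
        """Drain the dirty card queue into the remembered sets."""
        while self.dirty_queue:
            card, dst = self.dirty_queue.popleft()
            dst.rset.add(card)


if __name__ == "__main__":
    old, eden = Region("O1", 0, 4096), Region("E1", 4096, 4096)
    heap = Heap([old, eden])
    heap.write_ref(field_addr=1030, target_addr=5000)  # O1.attribute = E1
    heap.update_rs()
    print(eden.rset)
```

With the remembered set in place, a young collection only has to scan the listed cards of O1 instead of the whole old generation.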

Young Generation Collection

● “Scan RS”
● Walk the elements in old regions that are marked by cards referenced in the remembered set

Young Generation Collection

• During the Object Copy phase, the live objects are copied to the destination regions

[Figure: heap region grid before and after the copy; legend: Eden, Survivor, Old, Humongous]

Young Generation Collection

• Weak/Soft/Phantom Reference processing
• Always enable -XX:+ParallelRefProcEnabled

Old Generation GCs

• Partially concurrent
• Triggered when the heap occupancy percent is reached
  -XX:InitiatingHeapOccupancyPercent=n [45%]
• Uses young gc for initial mark and evacuation
• Compacts when copying to new regions

Old Generation Collection: initial mark

● Runs with a young gc
● Begins marking roots in survivor regions that reference the old generation, during the STW pause
● Snapshot-At-The-Beginning (SATB)
● Tri-color marking

Old Generation Collection: root region scan

● Scans the initially marked references into the old generation
● This phase runs concurrently (not STW) with the application
● Cannot be interrupted by young generation gcs

Old Generation Collection: concurrent mark

● Tri-color marking of the heap
● Concurrent with the running application
● Young generation GCs can run during this
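The tri-color marking named on these slides can be sketched on a toy object graph. This is a single-threaded illustration only: it leaves out the concurrency and the SATB write barrier that G1 needs while the application keeps running, and `tri_color_mark` with its dict-based graph is a hypothetical representation.

```python
from collections import deque


def tri_color_mark(roots, references):
    """Return (black, white): reachable objects and leftover garbage.

    roots: iterable of object ids
    references: dict mapping an object id to the ids it points at
    """
    roots = list(roots)
    white = set(references)              # white: not visited yet
    for targets in references.values():
        white.update(targets)
    white.update(roots)
    grey = deque()                       # grey: visited, children pending
    black = set()                        # black: visited, children scanned

    for root in roots:
        if root in white:
            white.discard(root)
            grey.append(root)

    while grey:
        obj = grey.popleft()
        for child in references.get(obj, ()):
            if child in white:           # first visit: white -> grey
                white.discard(child)
                grey.append(child)
        black.add(obj)                   # all children queued: grey -> black

    return black, white


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["c"], "c": [], "d": ["c"]}
    live, garbage = tri_color_mark(["a"], graph)
    print(sorted(live), sorted(garbage))
```

Anything still white when the grey queue drains is unreachable from the roots and can be reclaimed.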

Old Generation Collection: remark

● Stop the world
● Process the SATB buffers filled by write barriers since the initial mark
● Prevents objects from being lost because of concurrent changes

Old Generation Collection: cleanup

● STW and concurrent
● If a region is completely garbage, free it
● Find the live/garbage ratio to decide which regions to clean first (garbage-heavy ones first*)
● Application threads run

Old Generation Collection

● Mixed young generation gc to evacuate old regions
  ○ STW
  ○ Adds some old regions to the CSet

Old Generation Collection

● Take [candidate regions/8] of the garbagiest regions and add them to the young gen CSet
  ○ -XX:G1MixedGCCountTarget=8
  ○ can tune region selection with
    ■ -XX:G1MixedGCLiveThresholdPercent=65
      ● Only consider a region if its live data is below this
    ■ -XX:G1HeapWastePercent=10
      ● Only keep running mixed GCs while reclaimable space is above this
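The selection rule above can be sketched as follows. This is a simplified reading of the slide, assuming every region has the same size and that selection depends only on the three flags shown; real G1 also weighs the pause-time goal when sizing the CSet. `plan_mixed_gcs` and its parameters are hypothetical names.

```python
import math


def plan_mixed_gcs(old_regions, heap_bytes, region_bytes,
                   live_threshold_pct=65, heap_waste_pct=10, count_target=8):
    """Return a list of CSet batches, one per planned mixed GC.

    old_regions: dict mapping a region name to its live bytes
    """
    # Candidates: regions whose live data is below the live threshold.
    candidates = {name: live for name, live in old_regions.items()
                  if live * 100 < live_threshold_pct * region_bytes}
    if not candidates:
        return []

    # Garbagiest (least live data) first.
    ordered = sorted(candidates, key=lambda name: candidates[name])
    per_gc = math.ceil(len(ordered) / count_target)   # candidates / 8
    reclaimable = sum(region_bytes - candidates[n] for n in ordered)

    batches = []
    # Stop once what is left to reclaim falls under the heap waste threshold.
    while ordered and reclaimable * 100 > heap_waste_pct * heap_bytes:
        batch, ordered = ordered[:per_gc], ordered[per_gc:]
        reclaimable -= sum(region_bytes - candidates[n] for n in batch)
        batches.append(batch)
    return batches


if __name__ == "__main__":
    regions = {f"r{live}": live for live in range(10, 90, 10)}
    print(plan_mixed_gcs(regions, heap_bytes=1600, region_bytes=100))
```

In the toy run, regions with 70 and 80 bytes live are never candidates, and the cycle stops after three mixed GCs because the remaining reclaimable space drops below 10% of the heap.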

G1 Collector: Monitoring

Monitoring G1

• Always enable garbage collection logging, yes in production
  – Very little overhead, massive visibility difference
• -XX:+PrintGCDateStamps
• -XX:+PrintGCApplicationStoppedTime
• -XX:+PrintGCDetails
  – print phases
• -XX:+PrintAdaptiveSizePolicy
  – Y U Full GC?
• -XX:+PrintTenuringDistribution
  – aging information of survivor regions
• -XX:+PrintReferenceGC
  – soft/weak/phantom/finalizers/jni reference information
• -XX:+UnlockExperimentalVMOptions -XX:G1LogLevel=finest
  – individual thread timings included in logs

Monitoring JMX

Domains
• java.lang:type=GarbageCollector,name=G1 Young Generation
• java.lang:type=GarbageCollector,name=G1 Old Generation
• java.lang:type=GarbageCollector,name=G1 Mixed Generation

Attributes
• CollectionCount
• CollectionTime
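Both attributes are cumulative, so a monitor has to diff two polls to get a rate. This is a minimal sketch of that arithmetic, assuming the values have already been read over JMX by some other means; `GcSample` and `summarize` are hypothetical names, and treating CollectionTime as pure pause time is an approximation.

```python
from dataclasses import dataclass


@dataclass
class GcSample:
    timestamp_ms: int        # wall-clock time of the poll
    collection_count: int    # CollectionCount attribute (cumulative)
    collection_time_ms: int  # CollectionTime attribute (cumulative ms)


def summarize(prev: GcSample, curr: GcSample) -> dict:
    """Turn two cumulative samples into per-interval GC figures."""
    elapsed = curr.timestamp_ms - prev.timestamp_ms
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    count = curr.collection_count - prev.collection_count
    time_ms = curr.collection_time_ms - prev.collection_time_ms
    return {
        "collections": count,
        "avg_pause_ms": time_ms / count if count else 0.0,
        "gc_time_pct": 100.0 * time_ms / elapsed,
    }


if __name__ == "__main__":
    before = GcSample(timestamp_ms=0, collection_count=46, collection_time_ms=400)
    after = GcSample(timestamp_ms=10_000, collection_count=48, collection_time_ms=459)
    print(summarize(before, after))
```

The sample numbers are made up for the example; only the attribute names come from the slide.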

Swiss Java Knife

• SJK is a command line tool for JVM diagnostics, troubleshooting and profiling.
• SJK exploits the standard diagnostic interfaces of the JVM (such as JMX, JVM attach and perf counters) and adds some more logic on top to be useful for common troubleshooting cases.
• https://github.com/aragozin/jvm-tools

Swiss Java Knife

ubuntu@ip-10-95-215-157:~/cassandra$ java -jar sjk.jar gc -p 63563
MBean server connected
Collecting GC stats ...
[GC: G1 Young Generation#46 time: 8ms mem: G1 Survivor Space: 8192k+0k->8192k Compressed Class Space: 3958k+0k->3958k[max:1048576k] Metaspace: 34491k+0k->34491k G1 Old Gen: 78m+2m->81m[max:8192m] G1 Eden Space: 5021696k-5021696k->0k]
[GC: G1 Young Generation#47 time: 8ms interval: 15611ms mem: G1 Survivor Space: 8192k+4096k->12288k[rate:262.38kb/s] Compressed Class Space: 3993k+0k->3993k[max:1048576k,rate:0.00kb/s] Metaspace: 34770k+0k->34770k[rate:0.00kb/s] G1 Old Gen: 81m-1m->79m[max:8192m,rate:-126.01kb/s] G1 Eden Space: 4034560k-4034560k->0k[rate:-258443.41kb/s]]
[GC: G1 Young Generation#48 time: 51ms interval: 10079ms mem: G1 Survivor Space: 12288k+139264k->151552k[rate:13817.24kb/s] Compressed Class Space: 3989k+0k->3989k[max:1048576k,rate:0.00kb/s] Metaspace: 35363k+0k->35363k[rate:0.00kb/s] G1 Old Gen: 61m+4m->66m[max:8192m,rate:505.74kb/s] G1 Eden Space: 5017600k-5017600k->0k[rate:-497827.17kb/s]]

JStat

in cassandra-env.sh comment out
JVM_OPTS="$JVM_OPTS -XX:+PerfDisableSharedMem"

jstat -gcutil <pid> 1s 30

Visualize GCs

● GCViewer
● JClarity Censum
● IBM PMAT/GCMV
● verbosegcanalyzer

JClarity Censum


Reading Logs

2015-12-07T04:55:28.273+0000: 7417.822: [GC pause (G1 Evacuation Pause) (young)
   [Parallel Time: 49.4 ms, GC Workers: 13]
      [GC Worker Start (ms): Min: 7417822.3, Avg: 7417822.5, Max: 7417822.6, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 2.2, Avg: 2.3, Max: 2.4, Diff: 0.2, Sum: 29.8]
      [Update RS (ms): Min: 1.8, Avg: 1.9, Max: 1.9, Diff: 0.1, Sum: 24.4]
         [Processed Buffers: Min: 4, Avg: 9.7, Max: 16, Diff: 12, Sum: 126]
      [Scan RS (ms): Min: 0.1, Avg: 0.2, Max: 0.2, Diff: 0.1, Sum: 2.1]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [Object Copy (ms): Min: 44.4, Avg: 44.5, Max: 44.5, Diff: 0.1, Sum: 578.0]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
         [Termination Attempts: Min: 1, Avg: 3.6, Max: 10, Diff: 9, Sum: 47]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.6]
      [GC Worker Total (ms): Min: 48.8, Avg: 48.9, Max: 49.2, Diff: 0.4, Sum: 636.2]
      [GC Worker End (ms): Min: 7417871.3, Avg: 7417871.4, Max: 7417871.5, Diff: 0.2]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 1.0 ms]
   [Other: 2.4 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.6 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.4 ms]
      [Humongous Register: 0.1 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 1.0 ms]
   [Eden: 6912.0M(6912.0M)->0.0B(7008.0M) Survivors: 232.0M->184.0M Heap: 14.2G(16.0G)->7533.7M(16.0G)]
 [Times: user=0.73 sys=0.00, real=0.05 secs]
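A pause entry like the one above can be picked apart with a couple of regular expressions to see which phase dominates. This is a rough sketch that assumes the JDK 8 style layout shown on the slide; `phase_averages` and `wall_clock_secs` are hypothetical helpers, and the sample string is an excerpt of that same entry.

```python
import re

# Matches per-worker phase lines such as "[Object Copy (ms): Min: ..., Avg: ...".
PHASE_RE = re.compile(
    r"\[(?P<name>[A-Za-z ]+) \(ms\): Min: (?P<min>[\d.]+), Avg: (?P<avg>[\d.]+)"
)
REAL_RE = re.compile(r"real=(?P<real>[\d.]+) secs")

SAMPLE = (
    "[Parallel Time: 49.4 ms, GC Workers: 13] "
    "[GC Worker Start (ms): Min: 7417822.3, Avg: 7417822.5, Max: 7417822.6, Diff: 0.2] "
    "[Ext Root Scanning (ms): Min: 2.2, Avg: 2.3, Max: 2.4, Diff: 0.2, Sum: 29.8] "
    "[Update RS (ms): Min: 1.8, Avg: 1.9, Max: 1.9, Diff: 0.1, Sum: 24.4] "
    "[Scan RS (ms): Min: 0.1, Avg: 0.2, Max: 0.2, Diff: 0.1, Sum: 2.1] "
    "[Object Copy (ms): Min: 44.4, Avg: 44.5, Max: 44.5, Diff: 0.1, Sum: 578.0] "
    "[GC Worker Total (ms): Min: 48.8, Avg: 48.9, Max: 49.2, Diff: 0.4, Sum: 636.2] "
    "[Times: user=0.73 sys=0.00, real=0.05 secs]"
)


def phase_averages(entry: str) -> dict:
    """Average milliseconds per phase, skipping the 'GC Worker ...' bookkeeping lines."""
    return {
        m["name"]: float(m["avg"])
        for m in PHASE_RE.finditer(entry)
        if not m["name"].startswith("GC Worker")
    }


def wall_clock_secs(entry: str):
    """The 'real' time of the pause, or None when the entry has no Times line."""
    m = REAL_RE.search(entry)
    return float(m["real"]) if m else None


if __name__ == "__main__":
    phases = phase_averages(SAMPLE)
    slowest = max(phases, key=phases.get)
    print(slowest, phases[slowest], wall_clock_secs(SAMPLE))
```

For this entry the copy phase accounts for almost all of the parallel time, which is the usual first thing to check when a young pause looks long.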

Reading Logs

https://blogs.oracle.com/poonam/entry/understanding_g1_gc_logs

Examples with C*

Fabricated - not a real life example

A Scenario to GC tune

● Harder than it sounds
  ○ wide rows
  ○ TTL'd wide rows (queue)
  ○ massive sstables
  ○ huge data
  ○ move things off heap
  ○ row cache

When you want something slow: Repairs to the rescue

● ~1 TB in ~600 sstables
● Add a node, increase RF and repair

.99, .999, max, time, stderr, errors, gc: #, max ms, sum ms, sdv ms, mb
5.8, 151.5, 154.4, 194.3, 0.00393, 0, 1, 147, 147, 0, 4124
6.8, 289.4, 292.0, 195.3, 0.00391, 0, 1, 285, 285, 0, 2112
8.6, 251.6, 257.7, 196.3, 0.00412, 0, 2, 458, 458, 13, 10350
9.8, 232.8, 235.2, 197.3, 0.00432, 0, 2, 198, 375, 11, 2112
11.2, 188.2, 201.3, 198.3, 0.00453, 0, 2, 163, 302, 12, 1170
100.5, 737.4, 755.6, 199.4, 0.00477, 0, 4, 853, 1028, 273, 8514
138.7, 386.8, 392.4, 200.4, 0.00522, 0, 3, 552, 683, 111, 2574
11.3, 203.5, 207.5, 201.4, 0.00564, 0, 3, 479, 651, 59, 3000
103.7, 304.0, 306.2, 202.7, 0.00590, 0, 4, 260, 479, 25, 2230
8.5, 175.7, 183.1, 203.8, 0.00607, 0, 3, 217, 347, 28, 3696
21.4, 153.9, 159.4, 204.9, 0.00626, 0, 2, 105, 209, 1, 1466
6.7, 108.2, 115.0, 205.9, 0.00623, 0, 1, 102, 102, 0, 868
12.1, 173.2, 192.2, 206.9, 0.00637, 0, 3, 170, 297, 40, 4273
9.2, 156.3, 160.1, 207.9, 0.00640, 0, 1, 100, 100, 0, 798
11.7, 157.7, 162.3, 209.0, 0.00644, 0, 2, 117, 189, 23, 1125
8.2, 121.2, 145.7, 210.0, 0.00642, 0, 1, 114, 114, 0, 717
63.6, 109.9, 116.7, 211.0, 0.00660, 0, 3, 169, 250, 10, 5952
11.7, 53.5, 65.3, 212.0, 0.00662, 0, 3, 48, 113, 7, 1888

JClarity

Close… but why?

Logs

2015-12-07T07:27:31.326+0000: 3306.992: [GC pause (G1 Evacuation Pause) (young)... (to-space exhausted), 1.8667062 secs]
   [Parallel Time: 1694.1 ms, GC Workers: 13]...
      [Object Copy (ms): Min: 1672.4, Avg: 1672.6, Max: 1672.8, Diff: 0.4, Sum: 21744.0]...
   [Evacuation Failure: 167.8 ms]

Evacuation Failure

Not enough memory for survivors, promoted objects or both

[Figure: heap region grid filled with Eden, Survivor, Old and Humongous regions]

Resolution

● There is an easy fix
  ○ Increase Heap
● Not always possible
  ○ off heap memory requirements
  ○ limited hardware

Workaround

● Increase the amount of memory reserved as empty (to-space)
● -XX:G1ReservePercent=10 (the default)
● -XX:G1ReservePercent=25 (the raised value)
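The effect of the two settings can be worked out as plain arithmetic. This is a small sketch with a hypothetical 8 GB heap, since the heap size used in the repair scenario is not stated on the slide; `reserve_bytes` is an illustrative helper, not a JVM API.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def reserve_bytes(heap_bytes: int, reserve_pct: int) -> int:
    """Heap space kept free as to-space headroom for a given G1ReservePercent."""
    if not 0 <= reserve_pct <= 100:
        raise ValueError("reserve_pct must be a percentage")
    return heap_bytes * reserve_pct // 100


if __name__ == "__main__":
    heap = 8 * GB  # hypothetical heap size, chosen only for the example
    for pct in (10, 25):
        print(f"G1ReservePercent={pct}: {reserve_bytes(heap, pct) // MB} MB held back")
```

The trade-off is that the larger reserve leaves less of the heap available for ordinary allocation.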

Anticlimactic ending

● Throughput similar, no discernible difference in “normal running”
● Survived repair with higher allocation rate

Questions? Complaints?