JVM Performance Tuning

JavaStudy Network Daehyub Cho JVM [Java Virtual Machine] Performance Tuning

description

Java Virtual Machine tuning guide. It describes the JVM memory model and how to tune it.

Transcript of Jvm Performance Tunning

Page 1: Jvm Performance Tunning

JavaStudy Network, Daehyub Cho

JVM [Java Virtual Machine] Performance Tuning

Page 2: Jvm Performance Tunning

AGENDA

1. Basic concept of JVM Tuning
2. Hotspot compiler
3. Threading Model
4. Memory Model

Page 3: Jvm Performance Tunning

Basic Concept of JVM Tuning

Page 4: Jvm Performance Tunning

Basic of performance tuning

1. Decide what performance level is “good enough”
2. Test & measurement
   • Scenario based
   • Stress tool (LoadRunner)
   • Profiling tool (JProbe, etc.)
3. Profile the application to find bottlenecks
4. Tuning
   • Application *
   • Middleware [WAS]
   • OS
   • JVM
5. Return to Step 2 [feedback]

Page 5: Jvm Performance Tunning

JVM Tuning

• JVM tuning improves performance by about 10~20%
• Find the appropriate parameters for your application
  – Hotspot compile options
  – Thread model options *
  – GC and memory related options **
• Changing parameters is a very dangerous action
  – Needs more testing and feedback
  – Ref: spec.org

Page 6: Jvm Performance Tunning

Hotspot Compiler

Page 7: Jvm Performance Tunning

JVM Layout

• Hotspot from JDK 1.3

[Diagram: the JVM is made up of the VM (Runtime, GC, Interpreter, Threading & Locking, ...) and the Hotspot Compiler (Client Compiler or Server Compiler)]

Page 8: Jvm Performance Tunning

Hotspot compiler

• JIT (Just-In-Time Compiler)
  – Compiles byte code to native code
  – Compiles according to fixed optimization rules (not ‘thinking’)
  – Compiles at execution/installation
• Hotspot
  – Compiles byte code to native code
  – ‘Thinking’: tries to find where optimization can take place
  – Adaptive optimization at runtime

Page 9: Jvm Performance Tunning

Hotspot Detection

• Hotspot detection
• Method Inlining
• Dynamic Deoptimization

Page 10: Jvm Performance Tunning

Hotspot Detection and Method Inlining

• Literal constants are folded

    int foo = 9 * 10;                    →  int foo = 90;

• String concatenation is sometimes folded

    String foo = "Hello " + (9 * 10);    →  String foo = "Hello 90";

• Constant fields are inlined

    public class A { public static final int VALUE = 99; }
    public class B { static int VALUE2 = A.VALUE; }

  After compiling class B:

    public class B { static int VALUE2 = 99; }
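The folding of constant string concatenation can be observed from a running program. The sketch below is illustrative only (the class and method names are hypothetical). It relies on the language rule that compile-time constant strings are interned, so a folded concatenation and the equivalent literal are the same object, while a concatenation involving a non-final variable is built at run time. In current JDKs this particular folding is done by the source compiler (javac) rather than at run time.

```java
// Hypothetical demo class; not part of the original slides.
class ConstantFoldingDemo {

    // "Hello " + (9 * 10) is a compile-time constant expression, so the
    // compiler replaces it with the single interned literal "Hello 90".
    static boolean constantConcatenationIsFolded() {
        String foo = "Hello " + (9 * 10);
        return foo == "Hello 90";   // identity comparison on purpose
    }

    // n is not final, so the concatenation happens at run time and
    // produces a new String object that is not the interned literal.
    static boolean runtimeConcatenationIsFolded() {
        int n = 9 * 10;
        String foo = "Hello " + n;
        return foo == "Hello 90";   // identity comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println("constant expression folded: " + constantConcatenationIsFolded());
        System.out.println("runtime expression folded:  " + runtimeConcatenationIsFolded());
    }
}
```

The identity comparison (==) is used here only to expose the folding; ordinary code should compare strings with equals().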

Page 11: Jvm Performance Tunning

Hotspot detection / Method Inlining

• Dead code branches are eliminated

    public class A {
        static final boolean DEBUG = false;
        public void methodA() {
            if (DEBUG) System.out.println("DEBUG MODE");
            System.out.println("Say Hello");
        } // methodA
    } // class A

    ↓

    public class A {
        static final boolean DEBUG = false;
        public void methodA() {
            System.out.println("Say Hello");
        } // methodA
    } // class A

Page 12: Jvm Performance Tunning

Hotspot Client compiler

• Java option: -client
• Focused on simple & fast start up
• 3-phase compiler
  – HIR (High Level Intermediate Representation)
  – LIR (Low Level Intermediate Representation)
  – Machine code
• It focuses on local code quality and does very few global optimizations, since those are often the most expensive in terms of compile time
• It inlines any function that has no exception handlers or synchronization, and also supports deoptimization for debugging and inlining

Page 13: Jvm Performance Tunning

Hotspot Server compiler

• Java option: -server
• Focused on optimization
• SSA (Static Single Assignment)-based IR

Page 14: Jvm Performance Tunning

Hotspot compiler Option

• Hotspot compile options
  – -XX:MaxInlineSize=<size>
    • Integer specifying the maximum number of bytecode instructions in a method which gets inlined.
  – -XX:FreqInlineSize=<size>
    • Integer specifying the maximum number of bytecode instructions in a frequently executed method which gets inlined.
  – -Xint
    • Interpreter only (no JIT compilation)
  – -XX:+PrintCompilation
    • Prints a line for each method as it is compiled
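To see these options at work, it helps to have a small program with an obvious hot spot. The following is a minimal sketch (class and method names are hypothetical): a tiny method called from a tight loop, which is the kind of code the compiler is expected to detect as hot and consider for inlining. Whether and when it is actually compiled or inlined depends on the JVM version and its thresholds.

```java
// Hypothetical demo class for experimenting with the compile options above.
class HotLoopDemo {

    // Very small method body: a typical inlining candidate.
    static int square(int x) {
        return x * x;
    }

    // Calls square() n times so that the loop becomes a hot spot.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i % 100);
        }
        return sum;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        long result = sumOfSquares(50000000);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("result=" + result + " elapsed(ms)=" + elapsed);
    }
}
```

Running it once as `java -XX:+PrintCompilation HotLoopDemo` and once as `java -Xint HotLoopDemo` gives a rough feel for what compilation contributes; a single timing like this is only indicative, not a benchmark.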

Page 15: Jvm Performance Tunning

Threading Model

Page 16: Jvm Performance Tunning

Threading Model

• Thread Model
  – Java is a multi-threaded programming language
  – Native thread model from JDK 1.2
    • Thread mapping (M:N and 1:1)
    • Thread synchronization

[Diagram: Java Application → Java Thread → JVM → Operating System. Labels: Thread Handling, Thread Scheduling, Lock Mgmt (synchronization)]

Page 17: Jvm Performance Tunning

Solaris M:N Thread Model

[Diagram: Java Application → Java Thread (JVM) → Solaris Thread (Solaris OS) → LWP → Kernel Thread (OS Kernel)]

Page 18: Jvm Performance Tunning

Solaris M:N Thread Model

• Solaris M:N Thread Model
  – Thread based synchronization
  – LWP based synchronization

           Thread based sync             LWP based sync
  JDK 1.2  N/A                           Default
  JDK 1.3  Default                       -XX:+UseLWPSynchronization
  JDK 1.4  -XX:-UseLWPSynchronization    Default

Page 19: Jvm Performance Tunning

Solaris 1:1 Thread Model

[Diagram: Java Application → Java Thread (JVM) → Solaris Thread (Solaris OS) → LWP → Kernel Thread (OS Kernel)]

Page 20: Jvm Performance Tunning

Solaris 1:1 Thread Model

• Solaris 1:1 Thread Model
  – Bound thread
  – Alternate libthread

           Bound Thread            Alternate libthread*
  JDK 1.2  N/A                     export LD_LIBRARY_PATH=/usr/lib/lwp
  JDK 1.3  -XX:+UseBoundThreads    export LD_LIBRARY_PATH=/usr/lib/lwp
  JDK 1.4  -XX:+UseBoundThreads    export LD_LIBRARY_PATH=/usr/lib/lwp

※ In Solaris 9 the alternate libthread is the default; do not add /usr/lib/lwp to LD_LIBRARY_PATH

Page 21: Jvm Performance Tunning

JVM Performance Test on Solaris

Architecture Cpus Threads Model %diff in throughput (against Standard Model)

Sparc 30 400/2000 Standard ---

Sparc 30 400/2000 LWP Synchronization 215%/800%

Sparc 30 400/2000 Bound Threads -10%/-80%

Sparc 30 400/2000 Alternate One-to-one 275%/900%

Sparc 4 400/2000 Standard ---

Sparc 4 400/2000 LWP Synchronization 30%/60%

Sparc 4 400/2000 Bound Threads -5%/-45%

Sparc 4 400/2000 Alternate One-to-one 30%/50%

Sparc 2 400/2000 Standard ---

Sparc 2 400/2000 LWP Synchronization 0%/25%

Sparc 2 400/2000 Bound Threads -30%/-40%

Sparc 2 400/2000 Alternate One-to-one -10%/0%

Intel 4 400/2000 Standard ---

Intel 4 400/2000 LWP Synchronization 25%/60%

Intel 4 400/2000 Bound Threads 0%/-10%

Intel 4 400/2000 Alternate One-to-one 20%/60%

Intel 2 400/2000 Standard ---

Intel 2 400/2000 LWP Synchronization 15%/45%

Intel 2 400/2000 Bound Threads -10%/-15%

Intel 2 400/2000 Alternate One-to-one 15%/35%

< Solaris 8 with JVM 1.3 > See the graph on the next page.

Page 22: Jvm Performance Tunning

JVM Performance Test on Solaris

• Performance Test Result Graph

Page 23: Jvm Performance Tunning

Memory Tuning / Memory Model

Page 24: Jvm Performance Tunning

Memory Tuning

• Garbage Collection
• JVM Memory Layout
• Garbage Collection Model
• Server VM and Client VM
• Garbage Collection Measurement & Analysis
• Tuning Garbage Collection

Page 25: Jvm Performance Tunning

Generational Garbage Collection

Page 26: Jvm Performance Tunning

JVM Memory Layout

• New/Young – Recently created objects
• Old – Long-lived objects
• Perm – JVM classes and methods

[Diagram: Total Heap Size = New/Young (Eden, SS1, SS2) + Old, used by the application; Perm is used by the JVM]
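On newer JVMs the regions described above can be listed from inside a running program. This is a minimal sketch under one clear assumption: it uses the java.lang.management API, which was added in Java 5 and is therefore not available on the JDK 1.2 to 1.4 versions these slides cover. Pool names differ by JVM version and collector (for example, newer JVMs report a Metaspace pool instead of Perm), so the output should be read as whatever the current JVM reports, not as a fixed list.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.List;

// Hypothetical demo class: prints the memory pools the running JVM exposes.
class MemoryPoolsDemo {

    static List<MemoryPoolMXBean> pools() {
        return ManagementFactory.getMemoryPoolMXBeans();
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : pools()) {
            MemoryUsage usage = pool.getUsage();   // may be null for an invalid pool
            if (usage == null) {
                continue;
            }
            // getMax() returns -1 when the maximum is undefined.
            System.out.println(pool.getName()
                    + " [" + pool.getType() + "]"
                    + " used=" + (usage.getUsed() / 1024) + "K"
                    + " committed=" + (usage.getCommitted() / 1024) + "K"
                    + " max=" + usage.getMax());
        }
    }
}
```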

Page 27: Jvm Performance Tunning

Garbage Collection

• Garbage Collection
  – Collecting unused Java objects
  – Cleaning memory
  – Minor GC
    • Collects memory in the New/Young generation
  – Major GC (Full GC)
    • Collects memory in the Old generation

Page 28: Jvm Performance Tunning

Minor GC

• Minor Collection
  – New/Young Generation
  – Copy and Scavenge
  – Very Fast
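A quick way to watch collections happen is to create a lot of short-lived objects and ask the JVM how many collections it has run. The sketch below is an illustration only: the class name is hypothetical, and it assumes the java.lang.management API (Java 5 and later, newer than the JDKs discussed in these slides). The collector names reported depend on which GC the JVM is using, and the counts will vary from run to run.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class: allocates short-lived garbage and reports GC counts.
class MinorGcDemo {

    // Sum of collection counts over all collectors the JVM reports.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 if undefined for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Allocates one 1 KB array per iteration; each array becomes garbage immediately.
    static long churn(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] garbage = new byte[1024];
            garbage[0] = (byte) i;
            checksum += garbage[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        churn(5000000);
        long after = totalCollections();
        System.out.println("collections during churn: " + (after - before));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + " time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running it with a small heap and GC logging switched on makes the minor collections easy to spot in the log.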

Page 29: Jvm Performance Tunning

Minor GC

[Diagram: 1st Minor GC. Live objects in Eden are copied to a Survivor area. Regions: Eden, SS1, SS2, Old. Legend: New Object, Garbage, Live Object]

Page 30: Jvm Performance Tunning

Minor GC

[Diagram: 2nd Minor GC. Regions: Eden, SS1, SS2, Old. Legend: New Object, Garbage, Live Object]

Page 31: Jvm Performance Tunning

Minor GC

[Diagram: 3rd Minor GC. Objects are moved to the Old space when they become tenured. Legend: New Object, Garbage, Live Object]

Page 32: Jvm Performance Tunning

Major GC

• Major Collection
  – Old Generation
  – Mark and compact
  – Slow
    • 1st – goes through the entire heap and marks the reachable objects; whatever is left unmarked is unreachable
    • 2nd – the unreachable objects are removed and the surviving objects are compacted

Page 33: Jvm Performance Tunning

Major GC

[Diagram: Major GC across Eden, SS1, SS2. Step 1: mark the objects to be removed. Step 2: remove them and compact the remaining objects]

Page 34: Jvm Performance Tunning

Server option versus Client option

• -XX:NewRatio=2 (1.3), -Xmn128m (1.4), -XX:NewSize=<size> -XX:MaxNewSize=<size>

Page 35: Jvm Performance Tunning

GC Tuning Parameter

• Memory Tuning Parameters
  – Perm Size: -XX:MaxPermSize=64m
  – Total Heap Size: -ms512m -mx512m (also written -Xms512m -Xmx512m)
  – New Size
    • -XX:NewRatio=2 (old/new size ratio)
    • -XX:NewSize=128m
    • -Xmn128m (JDK 1.4)
  – Survivor Size: -XX:SurvivorRatio=64 (eden/survivor)
  – Heap Ratio
    • -XX:MaxHeapFreeRatio=70
    • -XX:MinHeapFreeRatio=40
  – Survivor Ratio
    • -XX:TargetSurvivorRatio=50
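The two ratio parameters are easier to reason about with numbers. The sketch below works through the arithmetic for the values on this page, assuming the young generation is made up of Eden plus two survivor spaces (as in the memory layout slide) and ignoring the rounding and alignment a real JVM applies. The class and method names are hypothetical.

```java
// Hypothetical helper: approximate generation sizes implied by the ratio flags.
class GenerationSizing {

    // NewRatio = old / young, so young = heap / (NewRatio + 1).
    static double youngMb(double heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    // SurvivorRatio = eden / survivor and young = eden + 2 * survivor,
    // so survivor = young / (SurvivorRatio + 2).
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        double heap = 512;                       // -ms512m -mx512m
        double young = youngMb(heap, 2);         // -XX:NewRatio=2
        double old = heap - young;
        double survivor = survivorMb(young, 64); // -XX:SurvivorRatio=64
        double eden = young - 2 * survivor;

        System.out.println(String.format("young=%.1fM old=%.1fM", young, old));
        System.out.println(String.format("eden=%.1fM survivor=%.1fM (x2)", eden, survivor));
    }
}
```

With a 512 MB heap and NewRatio=2 the young generation comes to roughly a third of the heap (about 170 MB), and SurvivorRatio=64 leaves each survivor space at only a few megabytes, which is worth keeping in mind when objects need to survive more than one minor GC.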

Page 36: Jvm Performance Tunning

Support for -XX Options

• Options that begin with -X are nonstandard (not guaranteed to be supported on all VM implementations), and are subject to change without notice in subsequent releases of the Java 2 SDK.

• Because the -XX options have specific system requirements for correct operation and may require privileged access to system configuration parameters, they are not recommended for casual use. These options are also subject to change without notice.

Page 37: Jvm Performance Tunning

Garbage Collection Model

• New types of GC
  – Default Collector
  – Parallel GC for young generation - JDK 1.4
  – Concurrent GC for old generation - JDK 1.4
  – Incremental Low Pause Collector (Train GC)

Page 38: Jvm Performance Tunning

Parallel GC

• Parallel GC
  – Improves GC performance
  – For the young generation (Minor GC)
  – More than 4 CPUs and 256MB of physical memory required

[Diagram: threads vs. time for the Young Generation, comparing Default GC with Parallel GC]

Page 39: Jvm Performance Tunning

Parallel GC

• Two Parallel Collectors
  – Low-pause: -XX:+UseParNewGC
    • Near real-time or pause-dependent applications
    • Works with
      – Mark and compact collector
      – Concurrent old area collector
  – Throughput: -XX:+UseParallelGC
    • Enterprise or throughput-oriented applications
    • Works only with the mark and compact collector

Page 40: Jvm Performance Tunning

Parallel GC

• Throughput Collector
  – -XX:+UseParallelGC
  – -XX:ParallelGCThreads=<desired number>
  – -XX:+UseAdaptiveSizePolicy
    • Adaptive resizing of the young generation

Page 41: Jvm Performance Tunning

Parallel GC

• Throughput Collector
  – AggressiveHeap
    • Enabled by -XX:+AggressiveHeap
    • Inspects machine resources and attempts to set various parameters to be optimal for long-running, memory-intensive jobs
      – Useful on machines with more than 4 CPUs and more than 256MB of memory
      – Useful in server applications
      – Do not use with -ms and -mx
    • Example) HP Itanium 1.4.2: java -XX:+ServerApp -XX:+AggressiveHeap -Xmn3400m spec.jbb.JBBmain -propfile Test1

Page 42: Jvm Performance Tunning

Concurrent GC

• Concurrent GC
  – Reduces the pause time needed to collect the Old Generation
  – For the old generation (Full GC)
  – Enabled by -XX:+UseConcMarkSweepGC

[Diagram: threads vs. time for the Old Generation, comparing Default GC with Concurrent GC]

Page 43: Jvm Performance Tunning

Incremental GC

• Incremental GC
  – Enabled by -Xincgc (from JDK 1.3)
  – Collects the old generation whenever the young generation is collected
  – Reduces the pause time for collecting the old generation
  – Disadvantages
    • Young generation GC occurs more frequently
    • More resources are needed
    • Do not use with -XX:+UseParallelGC or -XX:+UseParNewGC

Page 44: Jvm Performance Tunning

Incremental GC

• Incremental GC

[Diagram: Default GC vs. Incremental GC across the Young and Old Generations. Default GC: a Full GC follows after many Minor GCs. Incremental GC: the Old Generation is collected during each Minor GC]

Page 45: Jvm Performance Tunning

Incremental GC

• Incremental GC: -client -XX:+PrintGCDetails -Xincgc -ms32m -mx32m (Sun JVM 1.4.1 on Windows)

[GC [DefNew: 540K->35K(576K), 0.0053557 secs][Train: 3495K->3493K(32128K), 0.0043531 secs] 4036K->3529K(32704K), 0.0099856 secs]
[GC [DefNew: 547K->64K(576K), 0.0048216 secs][Train: 3529K->3540K(32128K), 0.0058683 secs] 4041K->3604K(32704K), 0.0109779 secs]
[GC [DefNew: 575K->64K(576K), 0.0164904 secs] 4116K->3670K(32704K), 0.0169019 secs]
[GC [DefNew: 576K->64K(576K), 0.0057541 secs][Train: 3671K->3651K(32128K), 0.0051286 secs] 4182K->3715K(32704K), 0.0113042 secs]
[GC [DefNew: 575K->56K(576K), 0.0114559 secs] 4227K->3745K(32704K), 0.0191390 secs]
[Full GC [Train MSC: 3689K->3280K(32128K), 0.0909523 secs] 4038K->3378K(32704K), 0.0910213 secs]
[GC [DefNew: 502K->64K(576K), 0.0173220 secs][Train: 3329K->3329K(32128K), 0.0066279 secs] 3782K->3393K(32704K), 0.0325125 secs

Legend: [DefNew: ...] is the young generation GC; [Train: ...] is the old generation GC performed during a Minor GC; the last figure on each line is the Minor GC time; [Full GC ...] is a full collection.

Page 46: Jvm Performance Tunning

Mark-compact              Better throughput
Incremental GC (Train)    Better pause
Parallel GC               Best throughput
Concurrent GC             Best pause

Page 47: Jvm Performance Tunning

Garbage Collection Measurement

• -verbosegc (all platforms)
• -XX:+PrintGCDetails (JDK 1.4)
• -Xverbosegc (HP)

Page 48: Jvm Performance Tunning

Garbage Collection Measurement

• -verbosegc

[GC 40549K->20909K(64768K), 0.0484179 secs]
[GC 41197K->21405K(64768K), 0.0411095 secs]
[GC 41693K->22995K(64768K), 0.0846190 secs]
[GC 43283K->23672K(64768K), 0.0492838 secs]
[Full GC 43960K->1749K(64768K), 0.1452965 secs]
[GC 22037K->2810K(64768K), 0.0310949 secs]
[GC 23098K->3657K(64768K), 0.0469624 secs]
[GC 23945K->4847K(64768K), 0.0580108 secs]

Reading an entry such as [GC 40549K->20909K(64768K), 0.0484179 secs]: 40549K is the heap size before GC, 20909K is the heap size after GC, (64768K) is the total heap size, and 0.0484179 secs is the GC time. Entries that start with [Full GC are full collections.

Page 49: Jvm Performance Tunning

GC Log analysis using AWK script

• Awk script (gc.awk)

BEGIN {
    printf("Minor\tMajor\tAlive\tFreed\n");
}
{
    # Minor collection: [GC <before>K-><after>K(<total>K), <secs> secs]
    if (substr($0, 1, 4) == "[GC ") {
        split($0, array, " ");
        printf("%s\t0.0\t", array[3]);          # GC time goes in the Minor column

        split(array[2], barray, "K");
        before = barray[1];
        after = substr(barray[2], 3);           # strip the leading "->"
        reclaim = before - after;
        printf("%s\t%s\n", after, reclaim);
    }

    # Full collection: [Full GC <before>K-><after>K(<total>K), <secs> secs]
    if (substr($0, 1, 9) == "[Full GC ") {
        split($0, array, " ");
        printf("0.0\t%s\t", array[4]);          # GC time goes in the Major column

        split(array[3], barray, "K");
        before = barray[1];
        after = substr(barray[2], 3);
        reclaim = before - after;
        printf("%s\t%s\n", after, reclaim);
    }
    next;
}

※ Usage

% awk -f gc.awk gc.log

Output for the -verbosegc log (gc.log) shown on the previous page:

Minor       Major       Alive       Freed
0.0484179   0.0         20909       19640
0.0411095   0.0         21405       19792
0.0846190   0.0         22995       18698
0.0492838   0.0         23672       19611
0.0         0.1452965   1749        42211
0.0310949   0.0         2810        19227
0.0469624   0.0         3657        19441
0.0580108   0.0         4847        19098
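The same extraction can be done in Java when awk is not at hand. This is a minimal sketch, not a replacement for the script above: the class name and its fields are hypothetical, and the regular expression only covers the simple -verbosegc line format shown in this deck (not the -XX:+PrintGCDetails format).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses one -verbosegc line into the columns gc.awk prints.
class GcLogEntry {

    // Matches "[GC 40549K->20909K(64768K), 0.0484179 secs]" and the "[Full GC ..." variant.
    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean full;     // true for a Full GC entry
    final long beforeK;     // heap size before GC
    final long afterK;      // heap size after GC ("Alive")
    final long totalK;      // total heap size
    final double seconds;   // GC time

    private GcLogEntry(boolean full, long beforeK, long afterK, long totalK, double seconds) {
        this.full = full;
        this.beforeK = beforeK;
        this.afterK = afterK;
        this.totalK = totalK;
        this.seconds = seconds;
    }

    // Returns null when the line is not in the expected format.
    static GcLogEntry parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new GcLogEntry("Full GC".equals(m.group(1)),
                Long.parseLong(m.group(2)), Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)), Double.parseDouble(m.group(5)));
    }

    // Memory reclaimed by this collection ("Freed").
    long freedK() {
        return beforeK - afterK;
    }

    public static void main(String[] args) {
        String[] sample = {
            "[GC 40549K->20909K(64768K), 0.0484179 secs]",
            "[Full GC 43960K->1749K(64768K), 0.1452965 secs]"
        };
        System.out.println("Minor\tMajor\tAlive\tFreed");
        for (String line : sample) {
            GcLogEntry e = parse(line);
            if (e == null) {
                continue;
            }
            String minor = e.full ? "0.0" : String.valueOf(e.seconds);
            String major = e.full ? String.valueOf(e.seconds) : "0.0";
            System.out.println(minor + "\t" + major + "\t" + e.afterK + "\t" + e.freedK());
        }
    }
}
```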

Page 50: Jvm Performance Tunning

GC Log analysis using AWK script

< GC Time >

Page 51: Jvm Performance Tunning

GC Log analysis using HPJtune

※ http://www.hp.com/products1/unix/java/java2/hpjtune/index.html

Page 52: Jvm Performance Tunning

GC Log analysis using AWK script

< GC Amount >

Page 53: Jvm Performance Tunning

Garbage Collection Tuning

• GC Tuning
  – Find the most important factor
    • Low pause? Or high performance?
    • Select an appropriate GC model (a new model carries risk!!)
  – Select “server” or “client”
  – Find the appropriate heap size by reviewing the GC log
  – Find the ratio of young and old generation

Page 54: Jvm Performance Tunning

Garbage Collection Tuning

• GC Tuning
  – Full GC is the most important factor in GC tuning
    • How frequently? How long?
    • Short and frequent: decrease old space
    • Long and occasional: increase old space
    • Short and occasional: decrease throughput by load balancing
  – Fix the heap size
    • Set “ms” and “mx” to the same value
    • Removes shrinking and growing overhead
  – Don’t
    • Don’t make the heap bigger than physical memory (it will SWAP)
    • Don’t make the new generation bigger than half the heap
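Whether the heap is really fixed can be checked from inside the application. The sketch below (class name hypothetical) prints what the running JVM reports through java.lang.Runtime. When “ms” and “mx” are set to the same value, the total and maximum figures should be close to each other from startup; the reported numbers may differ slightly from the flag values depending on the JVM and collector.

```java
// Hypothetical demo class: reports the heap figures the running JVM sees.
class HeapSettingsCheck {

    static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();      // upper bound the heap may grow to
        long total = rt.totalMemory();  // heap currently reserved by the JVM
        long free = rt.freeMemory();    // unused part of the current heap

        System.out.println("max   = " + toMb(max) + "M");
        System.out.println("total = " + toMb(total) + "M");
        System.out.println("used  = " + toMb(total - free) + "M");
        if (total < max) {
            System.out.println("heap can still grow: initial and maximum sizes differ");
        }
    }
}
```

Running it as `java -Xms512m -Xmx512m HeapSettingsCheck` and again with only -Xmx512m shows the difference between a fixed and a growing heap.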

Page 55: Jvm Performance Tunning

Jmeter / Threads Histogram

Page 56: Jvm Performance Tunning

Jmeter /Threads Group Histogram

Page 57: Jvm Performance Tunning

Example

Page 58: Jvm Performance Tunning

Example

[Chart: Old area before tuning. Time axis labels: 2004-01-08 7:14 PM; 2004-01-09 around 8 AM; 2004-01-09 around 7 PM; 2004-01-10 around 10 AM; 2004-01-10 around 6 PM. Annotations: Friday business hours; PEAK TIME 52000~56000 sec, from 9 o'clock for about 1 hour]

Page 59: Jvm Performance Tunning

Example

At peak time, Old GC takes 4~8 sec, which can cause the application to hang.

[Chart: GC time before tuning]

Page 60: Jvm Performance Tunning

Example

[Chart: GC time after application (AP) tuning. Time axis labels: 12th 03:38A, 12th 05:58P, 13th 07:18A, 13th 09:38P, 14th 11:58A, 15th 01:18A, 15th 03:38P, 16th 05:58A, 16th 07:18P, 17th 08:38A, 17th 10:58P. Annotations: Weekend; Mon office hours; Tue office hours; Thu office hours; Fri office hours]

Page 61: Jvm Performance Tunning

Example

[Chart. Time axis labels: 12th 03:38A, 12th 05:58P, 13th 07:18A, 13th 09:38P, 14th 11:58A, 15th 01:18A, 15th 03:38P, 16th 05:58A, 16th 07:18P, 17th 08:38A, 17th 10:58P. Annotations: Weekend; Mon office hours; Tue office hours; Thu office hours; Fri office hours]

Page 62: Jvm Performance Tunning

Summary

Page 63: Jvm Performance Tunning

JVM Tuning Summary

• Determine the JVM performance goal
• Gather statistics on your application
• Select the hotspot compiler
• Tune the heap
• Check the threading model
• Feedback

Page 64: Jvm Performance Tunning

More Tips

Page 65: Jvm Performance Tunning

Thread dump

• Thread dump
  – Enabled by
    • Unix: “kill -3 [JAVA PID]”
    • Windows: “Ctrl+Break”
  – Snapshot of a Java application
  – Can be used to analyze “hang-up” and “slow-down”
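A similar snapshot can also be taken from inside the application, which is handy when sending a signal to the process is not practical. The sketch below is an illustration with a hypothetical class name. It assumes Thread.getAllStackTraces(), which exists from Java 5 onward (newer than the JDKs these slides target), and its output only resembles the JVM's own dump: it has no tid/nid values and no lock details.

```java
import java.util.Map;

// Hypothetical demo class: builds a simple thread dump as a String.
class ThreadDumpDemo {

    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            // Header line loosely modelled on the JVM's own dump format.
            sb.append('"').append(t.getName()).append("\" ")
              .append(t.isDaemon() ? "daemon " : "")
              .append("prio=").append(t.getPriority())
              .append(" state=").append(t.getState())
              .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Threads reported with state=BLOCKED correspond to the “waiting for monitor entry” lines in a JVM thread dump.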

Page 66: Jvm Performance Tunning

Thread dump example

• Thread dump when slowdown in WAS

"ExecuteThread: '232' for queue: 'default'" daemon prio=5 tid=0x573ca630 nid=0xd2c waiting for monitor entry [0x5cebf000..0x5cebfdb8]
    at java.util.Hashtable.get(Hashtable.java:314)
    at java.util.ListResourceBundle.handleGetObject(ListResourceBundle.java:122)
    at java.util.ResourceBundle.getObject(ResourceBundle.java:371)
    at java.util.ResourceBundle.getObject(ResourceBundle.java:374)
    at java.text.DateFormatSymbols.initializeData(DateFormatSymbols.java:483)
    at java.text.DateFormatSymbols.<init>(DateFormatSymbols.java:99)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:275)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:264)
    at XXX.uv.com.cm.CmDateTimeUtil.getCurrentTime(CmDateTimeUtil.java:88)
    at XXX.uv.com.util.CmLog.setFileLog(CmLog.java:171)
    at XXX.uv.com.jsp.EjbJspBase.service(EjbJspBase.java:371)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:265)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:200)
    at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:2546)
    at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2260)
    at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:139)
    at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:120)

"ExecuteThread: '231' for queue: 'default'" daemon prio=5 tid=0x573f9a60 nid=0x13a8 waiting for monitor entry [0x5ce7f000..0x5ce7fdb8]
    at java.util.Hashtable.get(Hashtable.java:314)
    at java.text.DecimalFormatSymbols.initialize(DecimalFormatSymbols.java:333)
    at java.text.DecimalFormatSymbols.<init>(DecimalFormatSymbols.java:55)
    at java.text.NumberFormat.getInstance(NumberFormat.java:565)
    at java.text.NumberFormat.getInstance(NumberFormat.java:324)
    at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:327)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:276)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:264)
    at XXX.uv.com.cm.CmDateTimeUtil.getCurrentTime(CmDateTimeUtil.java:88)
    at XXX.uv.com.cm.CmDateTimeUtil.getCurrentTime(CmDateTimeUtil.java:67)
    at XXX.uv.com.datastu.DateTime.setCurrentTime(DateTime.java:190)
    at XXX.uv.com.jsp.EjbJspBase.service(EjbJspBase.java:239)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:265)
    at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:200)
    at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:2546)
    at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2260)
    at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:139)
    at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:120)

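Both threads in this dump are waiting for a monitor inside the SimpleDateFormat constructor, reached from CmDateTimeUtil.getCurrentTime. One common way to reduce this kind of contention is to create the formatter once per thread and reuse it, instead of constructing a new one on every call. The sketch below is an assumption about how that could look: the real CmDateTimeUtil source is not shown in the slides, so the class name, the date pattern and the method names here are all hypothetical. A per-thread instance is used because SimpleDateFormat is not thread-safe. The generic ThreadLocal syntax requires Java 5 or later.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical utility: reuses one SimpleDateFormat per thread.
class CachedDateFormat {

    // initialValue() runs once per thread, on that thread's first get().
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            new ThreadLocal<SimpleDateFormat>() {
                @Override
                protected SimpleDateFormat initialValue() {
                    // The pattern is an assumption made for this example.
                    return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
                }
            };

    // Returns the calling thread's formatter, creating it on first use.
    static SimpleDateFormat formatter() {
        return FORMAT.get();
    }

    // Formats the current time without constructing a new formatter.
    static String currentTime() {
        return formatter().format(new Date());
    }

    public static void main(String[] args) {
        System.out.println(currentTime());
        System.out.println("same instance reused: " + (formatter() == formatter()));
    }
}
```

Whether this helps in a given application should be confirmed by taking another thread dump under load after the change.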
Page 67: Jvm Performance Tunning

• Profiling CPU usage / HP-UX
  – HP-UX: Glance + Thread Dump

[Screenshot: HP Glance; press “G” for thread monitoring]

Page 68: Jvm Performance Tunning

• Profiling CPU usage / HP-UX

Glance thread monitoring: CPU load of thread 15999 is 17.7%

Java thread dump:

"Application Manager Thread" prio=8 tid=0x002a6c00 nid=62 lwp_id=15999 waiting on monitor [0x64bce000..0x64bce4b8]
    at java.lang.Thread.sleep(Native Method)
    at weblogic.management.mbeans.custom.ApplicationManager$ApplicationPoller.run(ApplicationManager.java:1137)

Matching the Glance thread ID against lwp_id in the dump: thread 15999 is working on weblogic.management.mbeans.custom.ApplicationManager (ApplicationManager.java:1137)

Page 69: Jvm Performance Tunning

• Other tools
  – Profile with Java option
  – Analyze using HP Jmeter
  – JProbe
  – Stress Test
    • LoadRunner
    • MS Stress (Free)

Page 70: Jvm Performance Tunning

• Related URL
  – Java Thread: http://java.sun.com/docs/hotspot/threads/threads.htm
  – Java Performance: http://java.sun.com/docs/hotspot/PerformanceFAQ.html
  – Java Thread: http://www.javaworld.com/javaworld/jw-09-1998/jw-09-threads.html
  – Pick up performance with generational gc: http://www.javaworld.com/javaworld/jw-01-2002/jw-0111-hotspotgc.html
  – JVM 1.4 GC Tuning: http://java.sun.com/docs/hotspot/gc1.4.2/index.html
  – HP Jmeter, Jtune, Jconfig: http://www.hp.com/products1/unix/java/developers/index.html
  – SPECjvm98
  – SPECjAppServer2001/2002

Page 71: Jvm Performance Tunning

Thank you