Diagnose apps back to basics

Diagnose Java apps on JVM - Back to Basics Imran Bohoran @imranbohoran


Page 1: Diagnose apps   back to basics

Diagnose Java apps on JVM - Back to Basics

Imran Bohoran @imranbohoran

Page 2: Diagnose apps   back to basics

Formalities

● Who am I
  ○ Developer at Unruly

● What do I like
  ○ Coding (obviously)
  ○ Diagnosing application failures
  ○ Discovering new tools and tricks I never knew of

Page 3: Diagnose apps   back to basics

What basics?

● Various tools at our disposal
● Do the tools tell us what we need to find out?
● Tools are not a bad thing - they are great.
● But going back to some basics can help more
  ○ What do we have available to us from the JVM?

Page 4: Diagnose apps   back to basics

The application is down

● Who has experienced it?
● Is it fun?

Page 5: Diagnose apps   back to basics

The application is down

● The load is high
● The JVM is not responsive
● Everyone’s breathing down your neck

Page 6: Diagnose apps   back to basics

What do you do?

● Do you panic?
● Do you run around telling everyone?
● Do you try to work out what’s going on using your favourite tool?
● Do you restart the app?
● Do you inspect the logs?
● Do you take a thread dump(s)?

Page 7: Diagnose apps   back to basics

Thread dumps

● Who has taken a thread dump or stared at one?
● What are they?
● How do you take thread dumps?
  ○ jstack <pid>
  ○ kill -SIGQUIT <pid> or kill -3 <pid>
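
A thread dump can also be taken from inside the JVM. The following is a minimal sketch, assuming the standard java.lang.management API is acceptable for your app; the class name ThreadDumper is made up for illustration, and the output is simpler than what jstack prints (no tid or nid).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: dumps all live threads of the current JVM as text.
final class ThreadDumper {

    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Only ask for lock details where the JVM supports reporting them.
        boolean monitors = threads.isObjectMonitorUsageSupported();
        boolean synchronizers = threads.isSynchronizerUsageSupported();

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append("\" id=")
               .append(info.getThreadId()).append(' ')
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("\tat ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

This only works while the JVM can still run application code; when it is fully unresponsive, jstack or kill -3 from outside remain the tools to reach for.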

Page 8: Diagnose apps   back to basics

Thread dumps

● Application Threads
● VM Threads

Page 9: Diagnose apps   back to basics

A Thread dump - Application threads

"http-nio-8080-ClientPoller-1" #23 daemon prio=5 os_prio=31 tid=0x00007fbf1c7e0000 nid=0x6b03 runnable [0x000000012aa2f000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
        at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
        at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x0000000782cb26f0> (a sun.nio.ch.Util$2)
        - locked <0x0000000782cb26e0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000782938878> (a sun.nio.ch.KQueueSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1179)
        at java.lang.Thread.run(Thread.java:745)

Page 10: Diagnose apps   back to basics

A Thread dump - VM threads

"C2 CompilerThread0" #5 daemon prio=9 os_prio=31 tid=0x00007fbf19025800 nid=0x4903 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007fbf1c011800 nid=0x471f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=31 tid=0x00007fbf19024800 nid=0x3503 in Object.wait() [0x0000000123cec000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)

"Reference Handler" #2 daemon prio=10 os_prio=31 tid=0x00007fbf19024000 nid=0x3303 in Object.wait() [0x0000000123be9000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)

"VM Thread" os_prio=31 tid=0x00007fbf1901f000 nid=0x3103 runnable

"GC task thread#0 (ParallelGC)" os_prio=31 tid=0x00007fbf1901d800 nid=0x2103 runnable

Page 11: Diagnose apps   back to basics

Thread dump - Important elements

"C2 CompilerThread0" #5 daemon prio=9 os_prio=31 tid=0x00007fbf19025800 nid=0x4903 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

● Thread state
  ○ RUNNABLE
  ○ WAITING
  ○ TIMED_WAITING
  ○ BLOCKED
● Thread ID - tid
● Native ID - nid
● Thread name
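
To make those elements concrete, here is a rough sketch that pulls the thread name, tid and nid out of a dump header line. The class ThreadHeader and its regular expression are illustrative assumptions based on the HotSpot header format shown on this slide; other JVMs or versions may print the line differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for a HotSpot thread dump header line.
final class ThreadHeader {
    // Quoted thread name first, then tid=0x... and nid=0x... later on the same line.
    private static final Pattern HEADER = Pattern.compile(
            "^\"(.*)\".*?\\btid=(0x[0-9a-fA-F]+).*?\\bnid=(0x[0-9a-fA-F]+)");

    final String name;
    final String tid;
    final String nid;

    private ThreadHeader(String name, String tid, String nid) {
        this.name = name;
        this.tid = tid;
        this.nid = nid;
    }

    /** Returns the parsed header, or null if the line is not a thread header. */
    static ThreadHeader parse(String line) {
        Matcher m = HEADER.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new ThreadHeader(m.group(1), m.group(2), m.group(3));
    }

    public static void main(String[] args) {
        ThreadHeader h = parse("\"C2 CompilerThread0\" #5 daemon prio=9 os_prio=31 "
                + "tid=0x00007fbf19025800 nid=0x4903 runnable [0x0000000000000000]");
        System.out.println(h.name + " tid=" + h.tid + " nid=" + h.nid);
    }
}
```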

Page 12: Diagnose apps   back to basics

How does this help?

● They can tell us all threads that are running, blocking and waiting

● They can point to code lines that are blocking threads

● They can tell us if we are running out of memory
  ○ VM threads busy
  ○ With heap utilisation summary (kill -3)
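
One quick way to get the "running, blocking and waiting" picture from a long dump is to count threads per state. This is a small sketch, assuming the dump is HotSpot text output with "java.lang.Thread.State:" lines; ThreadStateCounter is a made-up name.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts threads per state in a text thread dump.
final class ThreadStateCounter {
    private static final String MARKER = "java.lang.Thread.State: ";

    static Map<String, Integer> count(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : dump.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith(MARKER)) {
                // Keep only the state keyword, dropping detail such as "(on object monitor)".
                String state = trimmed.substring(MARKER.length()).split("\\s+")[0];
                counts.merge(state, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String dump = "\"a\" #1\n   java.lang.Thread.State: RUNNABLE\n"
                + "\"b\" #2\n   java.lang.Thread.State: WAITING (on object monitor)\n";
        System.out.println(count(dump));
    }
}
```

A sudden jump in BLOCKED threads between two dumps taken a few seconds apart is usually a good place to start reading stack traces.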

Page 13: Diagnose apps   back to basics

How does this help?

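● Find out what thread is eating up the most CPU
  ○ Each JVM thread is mapped to its own native (OS) thread, which Linux shows as a lightweight process
  ○ top -H -p <pid> (thread ids are shown in decimal; nid in the dump is hex)
  ○ Other ways
    ■ topthreads plugin for jconsole
    ■ JMC JMX view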
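
Matching a busy thread from top -H to the dump means converting between decimal and hex. A tiny sketch of that conversion follows; NativeThreadId is a hypothetical helper, and the mapping assumes Linux, where the nid is the native thread id that top -H lists.

```java
// Hypothetical helper for matching "top -H" thread ids with "nid" values in a dump.
final class NativeThreadId {

    /** Converts a dump nid such as "0x6b03" to the decimal id shown by top -H. */
    static long toDecimal(String nid) {
        String hex = (nid.startsWith("0x") || nid.startsWith("0X")) ? nid.substring(2) : nid;
        return Long.parseLong(hex, 16);
    }

    /** Converts a decimal thread id from top -H to the nid form used in dumps. */
    static String toNid(long decimalId) {
        return "0x" + Long.toHexString(decimalId);
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("0x6b03")); // prints 27395
        System.out.println(toNid(27395));        // prints 0x6b03
    }
}
```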

Page 14: Diagnose apps   back to basics

What’s hard about thread dumps

● Reading them and understanding them
● They can be loooong
● What can we do
  ○ spend time understanding how to interpret them
  ○ use tools

Page 15: Diagnose apps   back to basics

Tools that are helpful

● IBM Thread and Monitor Dump Analyzer for Java

(only if you want to get a thread dump from Visual VM - not sure why though)

Page 16: Diagnose apps   back to basics

Closing thoughts on Thread dumps

● Application logs
  ○ Log thread id/name on log statements
  ○ Separate them from stdout

● Look at them more to understand your app and dig deeper on problems

● Automate them
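
As a sketch of the logging point: if every log line carries the thread name, it can be matched against the quoted name in a thread dump. ThreadAwareLog below is a made-up helper; real logging frameworks normally offer this through their pattern layout (for example %thread in Logback or %t in Log4j 2).

```java
// Hypothetical helper: prefixes log lines with the calling thread's name.
final class ThreadAwareLog {

    static String format(String level, String message) {
        return String.format("%s [%s] %s", level, Thread.currentThread().getName(), message);
    }

    public static void main(String[] args) {
        // Writing to stderr keeps log lines apart from stdout,
        // which is where HotSpot prints a kill -3 thread dump.
        System.err.println(format("INFO", "request handled"));
    }
}
```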

Page 17: Diagnose apps   back to basics

Show of hands...

What’s the most common problem you have hit on a JVM app?

Page 18: Diagnose apps   back to basics

OOME

● Not enough heap size
● PermGen/Metaspace size
● GC Overhead limit

Page 19: Diagnose apps   back to basics

Let’s talk about GC

● Best source of information for GC
  ○ GC Logs
● How do you get them
  ○ -verbose:gc, -XX:+PrintGCDateStamps, -XX:+PrintGCTimeStamps, -Xloggc:<file>
● Can I set them dynamically
  ○ jinfo -flag [+|-]<flag> <pid> (manageable flags only)

Page 20: Diagnose apps   back to basics

What can a GC log tell you

● How long young and old GCs take
● Work out allocation rates
● Premature promotions
● Memory leaks
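
Working out an allocation rate from a GC log is simple arithmetic: take the young generation occupancy right after one young GC and right before the next, and divide the growth by the time between the two collections. A minimal sketch follows; AllocationRate is a hypothetical helper and the numbers in main are made up, not taken from a real log.

```java
// Hypothetical helper: allocation rate between two consecutive young GCs.
final class AllocationRate {

    /**
     * @param youngAfterPrevKb  young gen occupancy right after the previous young GC, in KB
     * @param youngBeforeNextKb young gen occupancy right before the next young GC, in KB
     * @param secondsBetween    time between the two collections, in seconds
     * @return allocation rate in KB per second
     */
    static double kbPerSecond(long youngAfterPrevKb, long youngBeforeNextKb, double secondsBetween) {
        if (secondsBetween <= 0) {
            throw new IllegalArgumentException("secondsBetween must be positive");
        }
        return (youngBeforeNextKb - youngAfterPrevKb) / secondsBetween;
    }

    public static void main(String[] args) {
        // Made-up figures: 10 MB left after one GC, 90 MB just before the next, 4 s apart.
        System.out.println(kbPerSecond(10_240, 92_160, 4.0) + " KB/s"); // prints 20480.0 KB/s
    }
}
```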

Page 21: Diagnose apps   back to basics

It’s all text - show me tools

● GC Viewer - https://github.com/chewiebug/GCViewer
● HPjmeter - https://h20392.www2.hp.com/portal/swdepot/displayProductInfo.do?productNumber=HPJMETER
● IBM Monitoring and Diagnostics Tools - http://www.ibm.com/developerworks/java/jdk/tools/gcmv
● JClarity Censum - http://www.jclarity.com/censum/

Page 22: Diagnose apps   back to basics

What’s the next best thing

● Heap dumps
● Automate them
  ○ -XX:+HeapDumpOnOutOfMemoryError
  ○ -XX:HeapDumpPath=<path>
● To manually get heap dumps
  ○ jmap -dump:format=b,file=<file> <pid>
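
Besides the flags and jmap, a heap dump can be triggered from code or over JMX. The sketch below assumes a HotSpot-based JVM, since it relies on com.sun.management.HotSpotDiagnosticMXBean; HeapDumper is a made-up class name and the temp-directory target is only for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: writes a heap dump of the current JVM (HotSpot only).
final class HeapDumper {

    /** The target file must not exist yet; recent JDKs also expect a .hprof suffix. */
    static void dump(Path target, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        diagnostics.dumpHeap(target.toString(), liveObjectsOnly);
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempDirectory("heapdump").resolve("app.hprof");
        dump(target, true);
        System.out.println("Wrote " + target + " (" + Files.size(target) + " bytes)");
    }
}
```

Dumping a large heap pauses the application and can produce a file as big as the heap itself, so check disk space before trying this in production.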

Page 23: Diagnose apps   back to basics

I have a heap dump - now what

local profiling (development only - production needs $$$£££)

Page 24: Diagnose apps   back to basics

When looking at GC issues

● Is your heap sizing sensible?
● Your code is broken
● Frameworks are not perfect
● You have no reason to use finalizers

Page 25: Diagnose apps   back to basics

To wrap things up...

● Know what your JVM has to offer
● Third party tools are great, but fundamental JVM tools can be more efficient and beneficial (and cheaper)
● Know your tools (get to know them, practice)
● Add metrics to your application
● Monitor the JVM
  ○ Standard and custom JMX
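
As a starting point for the "standard JMX" part, the platform MXBeans already expose heap, GC and thread figures without any extra tooling. A minimal sketch, with JvmSnapshot as a hypothetical class name; a real setup would more likely read the same beans remotely or feed them into a metrics system.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: one-off snapshot of standard JVM figures via platform MXBeans.
final class JvmSnapshot {

    static String describe() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        StringBuilder out = new StringBuilder();
        out.append("heap used=").append(heap.getUsed())
           .append(" committed=").append(heap.getCommitted())
           .append(" max=").append(heap.getMax()).append('\n');
        // One line per collector, e.g. the young and old generation collectors.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append("gc ").append(gc.getName())
               .append(" count=").append(gc.getCollectionCount())
               .append(" timeMs=").append(gc.getCollectionTime()).append('\n');
        }
        out.append("threads=")
           .append(ManagementFactory.getThreadMXBean().getThreadCount()).append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```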

Page 26: Diagnose apps   back to basics

Questions?