Java on Linux for devs and ops

What Dev and Ops should know about Java on Linux? Alexey Ragozin [email protected]

Page 1: Java on Linux for devs and ops

What Dev and Ops should know about

Java on Linux?

Alexey Ragozin

[email protected]

Page 2: Java on Linux for devs and ops

Java Memory

Java process memory, by area, with the JVM option that sizes each one:

- JVM Memory
  - Java Heap: -Xms / -Xmx
    - Young Gen: -Xmn
    - Old Gen
  - Perm Gen (Java 7): -XX:PermSize
  - Non-Heap
    - Thread Stacks: -XX:ThreadStackSize (per thread)
    - NIO Direct Buffers: -XX:MaxDirectMemorySize
    - Metaspace (Java 8): -XX:MaxMetaspaceSize
    - Compressed Class Space (Java 8): -XX:CompressedClassSpaceSize
    - Code Cache: -XX:ReservedCodeCacheSize
    - Native JVM Memory
- Non-JVM Memory (native libraries)

Page 3: Java on Linux for devs and ops

Linux memory

Memory is managed in pages (4k) on x86 / AMD64

(Huge page support is mostly defunct in Linux)

Pages from process point of view

- Virtual address reservation

- Committed memory page

- File mapped memory page

Page 5: Java on Linux for devs and ops

Understanding memory metrics

Page 6: Java on Linux for devs and ops

Understanding memory metrics

OS Memory

Memory Used/Free – misleading metric

Swap used – should be zero

Buffers/Cached – essentially this is free memory*

Process

VIRT – address space reservation, not memory!

RES – resident size - key memory footprint

SHR – shared size

Page 7: Java on Linux for devs and ops

Understanding memory metrics

Buffers – pages used for file system metadata

Cached – pages mapped to file data

Non-dirty pages used for buffers/cache can be reclaimed immediately to fulfill a memory allocation request.

Dirty pages – writable file mapped pages which have modifications not yet synchronized to disk.
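
To make the "Buffers/Cached is essentially free memory" point concrete, here is a minimal sketch that parses text in the /proc/meminfo format and adds Buffers and Cached to MemFree. The class and method names are invented for this example, and the sum does not subtract dirty pages, so it should be read as an optimistic estimate rather than an exact figure.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the JDK or any Linux tool.
final class MemInfoSketch {

    // Parses lines such as "MemFree:  123456 kB" into a name -> kB map.
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> values = new HashMap<>();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String[] parts = line.substring(colon + 1).trim().split("\\s+");
            try {
                values.put(line.substring(0, colon).trim(), Long.parseLong(parts[0]));
            } catch (NumberFormatException e) {
                // Not a numeric entry; ignored in this sketch.
            }
        }
        return values;
    }

    // MemFree + Buffers + Cached. Assumption: dirty pages are not subtracted.
    static long reclaimableKb(Map<String, Long> meminfo) {
        return meminfo.getOrDefault("MemFree", 0L)
                + meminfo.getOrDefault("Buffers", 0L)
                + meminfo.getOrDefault("Cached", 0L);
    }

    public static void main(String[] args) throws IOException {
        Path source = Paths.get("/proc/meminfo");
        if (!Files.isReadable(source)) {
            System.out.println("/proc/meminfo is not available on this system");
            return;
        }
        Map<String, Long> meminfo = parse(Files.readAllLines(source, StandardCharsets.UTF_8));
        System.out.println("MemFree (kB):                    " + meminfo.get("MemFree"));
        System.out.println("MemFree + Buffers + Cached (kB): " + reclaimableKb(meminfo));
    }
}
```

On a typical server the second number is much larger than the first, which is why the plain "free" figure is a misleading metric.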

Page 8: Java on Linux for devs and ops

Linux Process Memory Summary

Diagram labels: Virtual, Committed, Resident, Zeroed pages, Swapped pages

Page 9: Java on Linux for devs and ops

Java Memory Facts

Swapping intolerance
- GC does heap-wide scans
- STW pauses prolonged by swapping affect all application threads

Java never gives memory back to the OS
- Strictly speaking, serial GC and G1 do
- Practically, you should assume it does not

JVM Process footprint > JVM Heap size
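
A rough way to see that the process footprint exceeds the heap is to ask the JVM itself. The sketch below uses the standard java.lang.management beans to report committed heap, committed non-heap, and NIO buffer pool usage. The class name is invented for this example, and thread stacks and other native allocations are not visible through these beans, so the real resident size will be higher still.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper for illustration only.
final class JvmFootprintSketch {

    // Heap memory currently committed by the JVM, in bytes.
    static long heapCommitted() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getCommitted();
    }

    // Non-heap pools tracked by the JVM (for example code cache, metaspace), in bytes.
    static long nonHeapCommitted() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getCommitted();
    }

    // Memory used by NIO buffer pools (direct and mapped), in bytes.
    static long bufferPoolsUsed() {
        long total = 0;
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            // getMemoryUsed() may return -1 when the value is unknown.
            total += Math.max(0, pool.getMemoryUsed());
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("heap committed (bytes):     " + heapCommitted());
        System.out.println("non-heap committed (bytes): " + nonHeapCommitted());
        System.out.println("NIO buffer pools (bytes):   " + bufferPoolsUsed());
    }
}
```

Comparing the sum of these numbers with RES from top gives a first idea of how much memory sits outside the areas the JVM reports.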

Page 10: Java on Linux for devs and ops

JVM Out of Memory

JVM heap is full and at the -Xmx limit
- Full GC, then OOM error if not enough memory is reclaimed
- OOM error is not recoverable; useful to shut down gracefully
- -XX:OnOutOfMemoryError="kill -9 %p"

JVM heap is full but below the -Xmx limit
- Heap is extended by requesting more memory from the OS
- If the OS rejects the memory request, the JVM crashes (no OOM error)

NIO direct buffers capacity is capped by the JVM
- -XX:MaxDirectMemorySize=16g
- Cap is enforced by the JVM
- OOM error if the limit has been reached; this one is recoverable
- If a memory request from the JVM is rejected by the OS, the JVM crashes
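
The recoverable case for direct buffers can be sketched as follows: the allocation is attempted, and an OutOfMemoryError raised when the JVM-enforced cap is reached is caught so the caller can fall back or retry later. The class and method names are invented for this example, and returning null is just one possible fallback policy.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
final class DirectBufferSketch {

    // Returns a direct buffer, or null when the allocation fails with an
    // OutOfMemoryError (for example because -XX:MaxDirectMemorySize is reached).
    static ByteBuffer tryAllocateDirect(int bytes) {
        try {
            return ByteBuffer.allocateDirect(bytes);
        } catch (OutOfMemoryError e) {
            // The Java heap itself is not exhausted here, so the
            // application can keep running and degrade gracefully.
            return null;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buffer = tryAllocateDirect(1024 * 1024);
        if (buffer == null) {
            System.out.println("direct memory cap reached, falling back");
        } else {
            System.out.println("allocated direct buffer, capacity=" + buffer.capacity());
        }
    }
}
```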

Page 11: Java on Linux for devs and ops

Low memory conditions

Low memory condition on server
- Swapping / paging
- Dramatic application performance degradation
- Application freezes
- JVM crashes

Always plan server memory capacity
- You should always have a physical memory reserve.

Page 12: Java on Linux for devs and ops

Linux paravirtualization

In a Docker container
- Guest resources are capped via Linux cgroups (https://en.wikipedia.org/wiki/Cgroups)
- Kernel memory pools can be limited: resident / swap / memory mapped
- Limits are global for the container
- Resource restriction violations are remediated by kill -9

Plan your container size carefully
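
When sizing a container, it can help to compare the cgroup memory limit with the heap the JVM believes it may use. The sketch below is written under assumptions: the two file paths are the common mount points for cgroup v2 and v1 and may differ on a given host, and the class and method names are invented for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only.
final class CgroupLimitSketch {

    // Assumed locations: cgroup v2 first, then cgroup v1.
    static final String[] CANDIDATE_FILES = {
        "/sys/fs/cgroup/memory.max",
        "/sys/fs/cgroup/memory/memory.limit_in_bytes"
    };

    // Returns the limit in bytes, or -1 for "max" (no limit) or unparseable content.
    static long parseLimit(String raw) {
        String value = raw.trim();
        if (value.isEmpty() || value.equals("max")) {
            return -1;
        }
        try {
            return Long.parseLong(value);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Reads the first candidate file that yields a numeric limit; -1 if none does.
    // Note: cgroup v1 reports "unlimited" as a very large number, returned as-is.
    static long containerMemoryLimit() {
        for (String name : CANDIDATE_FILES) {
            Path file = Paths.get(name);
            if (!Files.isReadable(file)) {
                continue;
            }
            try {
                long limit = parseLimit(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
                if (limit >= 0) {
                    return limit;
                }
            } catch (IOException e) {
                // Try the next candidate.
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("cgroup memory limit (bytes): " + containerMemoryLimit());
        System.out.println("JVM max heap (bytes):        " + Runtime.getRuntime().maxMemory());
    }
}
```

Because the JVM process footprint is larger than the heap, the max heap should sit comfortably below the container limit, otherwise the container is a candidate for kill -9.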

Page 13: Java on Linux for devs and ops

ulimits

> ulimit -a
core file size          (blocks, -c) 1
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 4134823
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) 449880520
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4134823
virtual memory          (kbytes, -v) 425094640
file locks                      (-x) unlimited

Callouts on the listing:
- Memory limits may prevent you from starting a large JVM
- Core dump disabled

Page 14: Java on Linux for devs and ops

Setting up JVM

-Xms = -Xmx – reserve memory on start

GC logging options (-XX:+PrintGCDetails, etc.)
- http://blog.ragozin.info/2016/10/hotspot-jvm-garbage-collection-options.html
- GC logging is synchronous; avoid network / slow mounts

Do a JVM sizing exercise

Choose the right number of parallel GC threads (-XX:ParallelGCThreads)
- Sometimes less is better

Getting a dump on crash
- -XX:+HeapDumpOnOutOfMemoryError (may "crash" Linux)
- Java heap dump can be produced from a Linux core dump
- https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/bugreports004.html#CHDHDCJD
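
A quick way to audit a running JVM against the first recommendation is to look at its own startup arguments. The sketch below reads them through RuntimeMXBean and checks whether -Xms and -Xmx were given the same value. The class and method names are invented for this example, and the comparison is purely textual, so "-Xms4g" and "-Xmx4096m" would be reported as different.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper for illustration only.
final class JvmArgsSketch {

    // Returns the text after the prefix for the last matching argument,
    // e.g. "4g" for prefix "-Xmx" and argument "-Xmx4g"; null if absent.
    static String optionValue(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    // True when both options are present and textually equal.
    static boolean heapPreSized(List<String> jvmArgs) {
        String xms = optionValue(jvmArgs, "-Xms");
        String xmx = optionValue(jvmArgs, "-Xmx");
        return xms != null && xms.equalsIgnoreCase(xmx);
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("-Xms equals -Xmx: " + heapPreSized(jvmArgs));
    }
}
```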

Page 15: Java on Linux for devs and ops

Network tuning

Cross region data transfers (client or server)
- Tune options at socket level
- Tune Linux network caps (sysctl)
  - net.ipv4.tcp_rmem
  - net.ipv4.tcp_wmem

UDP based communications
- net.core.wmem_max
- net.core.rmem_max
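
Socket-level tuning on the Java side usually means requesting larger send and receive buffers before the connection is established. The sketch below does that on an unconnected socket and reports the sizes the OS actually granted; the class and method names are invented for this example, and the 1 MiB request is an arbitrary illustration, not a recommended value.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

// Hypothetical helper for illustration only.
final class SocketBufferSketch {

    // Requests the given buffer size on a not-yet-connected socket and
    // returns the {receive, send} sizes reported back by the socket.
    static int[] effectiveBufferSizes(int requestedBytes) {
        try (Socket socket = new Socket()) {
            // Buffer sizes are only a hint; the OS may grant a different value.
            socket.setReceiveBufferSize(requestedBytes);
            socket.setSendBufferSize(requestedBytes);
            // A real client would call socket.connect(...) at this point.
            return new int[] {socket.getReceiveBufferSize(), socket.getSendBufferSize()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = effectiveBufferSizes(1024 * 1024);
        System.out.println("receive buffer granted: " + sizes[0]);
        System.out.println("send buffer granted:    " + sizes[1]);
    }
}
```

If the granted sizes are much smaller than requested, the Linux caps listed above are the next place to look.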

Page 16: Java on Linux for devs and ops

Other OS related tuning

NUMA
- numactl --cpunodebind=xxx
- ignore JVM Numa* options
- KVM hypervisor does not support NUMA for guests

Assigning threads to cores
- taskset

Exploiting CPU isolation
- Kernel level configuration
- Threads should be taskset explicitly

Page 17: Java on Linux for devs and ops

Troubleshooting Diagnostics

Page 19: Java on Linux for devs and ops

Troubleshooting / Diagnostics

Native Linux tools

ps / top / vmstat / pmap / etc

JDK tools

PID based

JVM Attach based tools

Perf counter based tools

JMX based tools (JVisualVM / JConsole)

JVM Flight Recorder – post analysis

GC / JVM logs

Affected by JVM freezes

Page 20: Java on Linux for devs and ops

Thread CPU usage

ragoale@axcord02:~> ps -T -p 6857 -o pid,tid,%cpu,time,comm
 PID   TID %CPU     TIME COMMAND
6857  6857  0.0 00:00:00 java
6857  6858  0.0 00:00:00 java
6857  6859  0.0 00:00:16 java
6857  6860  0.0 00:00:16 java
6857  6861  0.0 00:00:18 java
6857  6862  0.1 00:13:05 java
6857  6863  0.0 00:00:00 java
6857  6864  0.0 00:00:00 java
6857  6877  0.0 00:00:00 java
6857  6878  0.0 00:00:00 java
6857  6880  0.0 00:00:20 java
6857  6881  0.0 00:00:04 java
6857  6886  0.0 00:00:00 java
6857  6887  0.0 00:03:07 java
...

Callouts on the listing: VM Thread, GC Threads, other application and JVM threads.

This thread mapping is "typical" and not accurate; use jstack to get Java thread information for a thread ID.

Page 21: Java on Linux for devs and ops

Thread CPU usage

jstack (JDK tool)

Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode):

"Attach Listener" #65 daemon prio=9 os_prio=0 tid=0x0000000000cbc800 nid=0x1f0 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"pool-1-thread-20" #64 prio=5 os_prio=0 tid=0x00000000009d5000 nid=0x1c04 waiting on condition [0x00007fa109e55000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000d3ab9e50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-19" #63 prio=5 os_prio=0 tid=0x0000000000a1e800 nid=0x1bff waiting on condition [0x00007fa109f56000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000d3ab9e50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
...

nid is the Linux thread ID in hex

jstack forces STW pause in target JVM!
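
Matching a busy thread from the ps listing to its jstack entry comes down to a decimal-to-hex conversion: ps prints the TID in decimal, jstack prints it as nid in hex. A minimal sketch of that conversion follows; the class and method names are invented for this example.

```java
// Hypothetical helper for illustration only.
final class ThreadIdSketch {

    // Converts a decimal Linux TID (as printed by ps -T) to jstack's nid format.
    static String toNid(int linuxTid) {
        return "0x" + Integer.toHexString(linuxTid);
    }

    // Converts a jstack nid such as "0x1c04" back to a decimal Linux TID.
    static int fromNid(String nid) {
        return Integer.decode(nid);
    }

    public static void main(String[] args) {
        // TID 6862 is the busiest thread in the ps listing above.
        System.out.println(toNid(6862));       // prints 0x1ace
        System.out.println(fromNid("0x1c04")); // prints 7172
    }
}
```

Searching the thread dump for nid=0x1ace would then locate the Java thread behind TID 6862, if it is a Java-visible thread.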

Page 22: Java on Linux for devs and ops

Thread CPU usage

sjk ttop command - https://github.com/aragozin/jvm-tools

2016-07-27T07:47:20.674-0400 Process summary
  process cpu=8.11%
  application cpu=2.17% (user=1.52% sys=0.65%)
  other: cpu=5.95%
  GC cpu=0.00% (young=0.00%, old=0.00%)
  heap allocation rate 1842kb/s
  safe point rate: 1.1 (events/s) avg. safe point pause: 0.43ms
  safe point sync time: 0.01% processing time: 0.04% (wallclock time)
[003120] user= 1.12% sys= 0.24% alloc=  983kb/s - RMI TCP Connection(1)-172.17.168.11
[000039] user= 0.30% sys= 0.26% alloc=  701kb/s - DB feed - UserPermission.DBWatcher
[000053] user= 0.00% sys= 0.05% alloc=   50kb/s - Statistics
[000038] user= 0.00% sys= 0.05% alloc=  4584b/s - Reactor-0
[000049] user= 0.00% sys= 0.03% alloc=   38kb/s - DB feed - UserInfo.DBWatcher
[000036] user= 0.00% sys= 0.03% alloc=     0b/s - Abandoned connection cleanup thread
[003122] user= 0.00% sys= 0.03% alloc=  4915b/s - JMX server connection timeout 3122
[000040] user= 0.10% sys=-0.09% alloc=  8321b/s - DB feed - Report.DBWatcher
[000050] user= 0.00% sys= 0.01% alloc=   24kb/s - DB feed - Rule.DBWatcher
[000051] user= 0.00% sys= 0.01% alloc=  9034b/s - DB feed - EmailAccount.DBWatcher
[000044] user= 0.00% sys= 0.01% alloc=  4840b/s - DB feed - Analytics.DBWatcher
[000041] user= 0.00% sys= 0.01% alloc=  9999b/s - DB feed - Contact.DBWatcher
[000054] user= 0.00% sys= 0.01% alloc=  3481b/s - Statistics
[000001] user= 0.00% sys= 0.00% alloc=     0b/s - main
[000002] user= 0.00% sys= 0.00% alloc=     0b/s - Reference Handler
[000003] user= 0.00% sys= 0.00% alloc=     0b/s - Finalizer
[000005] user= 0.00% sys= 0.00% alloc=     0b/s - Signal Dispatcher
[000008] user= 0.00% sys= 0.00% alloc=     0b/s - JFR request timer
[000010] user= 0.00% sys= 0.00% alloc=     0b/s - VM JFR Buffer Thread

Does not incur STW pauses in the target process
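
Per-thread CPU figures like these can also be read from inside a JVM through the standard ThreadMXBean. The sketch below is not SJK's implementation, only an illustration of the management API: it collects the cumulative CPU time of each live thread in the current JVM. The class and method names are invented for this example, and a monitoring tool would sample twice and report the difference over the interval.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class ThreadCpuSketch {

    // Maps "[id] name" to cumulative CPU time in nanoseconds for live threads.
    // Returns an empty map when the JVM does not support or enable the counter.
    static Map<String, Long> cpuTimeByThread() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new LinkedHashMap<>();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return result;
        }
        for (long id : threads.getAllThreadIds()) {
            ThreadInfo info = threads.getThreadInfo(id); // name only, no stack trace
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if the thread has died
            if (info != null && cpuNanos >= 0) {
                result.put("[" + id + "] " + info.getThreadName(), cpuNanos);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> entry : cpuTimeByThread().entrySet()) {
            System.out.println(entry.getKey() + " cpu=" + entry.getValue() / 1_000_000 + "ms");
        }
    }
}
```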

Page 23: Java on Linux for devs and ops

Leaking OS resources

Linux OS has a cap on the number of file handles; if exceeded …
- Cannot open new files
- Cannot connect / accept socket connections

Garbage collector closes handles automatically
- Files and sockets
- Eventually …
- Always close your files and sockets

Resources which cannot be explicitly disposed
- File memory mappings
- NIO direct buffers

Unfinalized objects can be inspected in heap dump
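
The "always close your files and sockets" advice maps directly onto try-with-resources, which releases the file handle when the block exits rather than whenever the garbage collector gets to it. A minimal sketch follows; the class and method names are invented for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class CloseResourcesSketch {

    // Reads the first line; the reader (and its file handle) is closed
    // as soon as the try block exits, even if readLine() throws.
    static String firstLine(Path file) {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the text to a temporary file, reads the first line back, and cleans up.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("close-sketch", ".txt");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                return firstLine(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello\nworld"));
    }
}
```

The same pattern applies to sockets, since java.net.Socket also implements Closeable.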

Page 24: Java on Linux for devs and ops

Other useful JDK tools

jinfo
- query / update XX JVM options (e.g. enable/adjust GC logging)
- query system properties (including dynamically updated)

jstat
- can be used for monitoring heap dynamics

jcmd
- universal tool for JVM Attach interface
- jcmd PID PerfCounter.print – dumps all JVM perfcounters, useful for monitoring

Page 25: Java on Linux for devs and ops

THANK YOU

Alexey Ragozin [email protected]

http://blog.ragozin.info