Java Memory Model

JMM Java Memory Model Łukasz Koniecki 24/10/2016

Transcript of Java Memory Model

Page 1: Java Memory Model

JMM: Java Memory Model

Łukasz Koniecki, 24/10/2016

Page 2: Java Memory Model

About me

Java Universe

Spring, MyFaces, JSF, Play, Spark, GWT, Vaadin, Tapestry, Wicket, Spring MVC, Struts, Grails, REST API, JPA, GC, JVM, Java EE, Tomcat

Page 3: Java Memory Model

Goal

• Get familiar with the JMM,

• How does a processor work?

• Recall how the Java compiler and the JVM work,

• JIT in action,

• Explain what a data race and a correctly synchronized program are,

• Talk about synchronization and atomicity,

• Based on examples...

• Next-gen JMM...

Page 4: Java Memory Model

§17.4 Memory Model

Page 5: Java Memory Model

John von Neumann

Page 6: Java Memory Model

Wikipedia: http://bit.ly/2cMU0GB

Von Neumann Architecture

Page 7: Java Memory Model

Dummy program

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

Page 8: Java Memory Model

RAM

i = 0

j = 0

Cache

Program execution

System Bus

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

The Java Memory Model for Practitioners: http://bit.ly/2cMXklJ

Page 9: Java Memory Model

RAM

i = 0

j = 0

Cache

Program execution

System Bus

i = 0

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

Page 10: Java Memory Model

RAM

i = 0

j = 0

Cache

Program execution

System Bus

i = 1

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

Page 11: Java Memory Model

RAM

i = 0

j = 0

Cache

Program execution

System Bus

i = 1

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

Page 12: Java Memory Model

RAM

i = 1

j = 0

Cache

Program execution

System Bus

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

Page 13: Java Memory Model

RAM

i = 1

j = 0

Cache

Program execution

System Bus

j = 0

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

Page 14: Java Memory Model

RAM

i = 1

j = 0

Cache

Program execution

System Bus

j = 1

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

Page 15: Java Memory Model

RAM

i = 1

j = 1

Cache

Program execution

System Bus

j = 1

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

Page 16: Java Memory Model

RAM

i = 1

j = 1

Cache

Program execution

System Bus

Sequentially consistent execution

public class Example {

int i, j;

public void myDummyMethod() {

i+=1;

j+=1;

i+=1;

...

}

}

Page 17: Java Memory Model

PC World: http://bit.ly/2cE9f7q

Haswell-E processor

Page 18: Java Memory Model

Our world in data: http://bit.ly/1NLxNcH

Moore’s Law

Page 19: Java Memory Model

Moore’s Law

Our world in data: http://bit.ly/1NLxNcH

2006

Page 20: Java Memory Model

Processor technology

• ...

• 22 nm – 2012

• 14 nm – 2014

• 10 nm – 2017

• 7 nm – ~2019

• 5 nm – ~2021

Wikipedia: http://bit.ly/2cMWoNg

Page 21: Java Memory Model

Processor vs. Memory Performance

How L1 and L2 CPU caches work, and why they’re an essential part of modern chips: http://bit.ly/2cpHu1x

Page 22: Java Memory Model

Wikipedia: http://bit.ly/2cm33me

Cache hierarchy in a modern processor

Page 24: Java Memory Model

Important latency numbers

Page 25: Java Memory Model

Core i7 Xeon 5500 Series Data Source Latency (approximate)

• local L1 CACHE hit: ~4 cycles (2.1 - 1.2 ns)

• local L2 CACHE hit: ~10 cycles (5.3 - 3.0 ns)

• local L3 CACHE hit, line unshared: ~40 cycles (21.4 - 12.0 ns)

• local L3 CACHE hit, shared line in another core: ~65 cycles (34.8 - 19.5 ns)

• local L3 CACHE hit, modified in another core: ~75 cycles (40.2 - 22.5 ns)

• remote L3 CACHE (Ref: Fig.1 [Pg. 5]): ~100-300 cycles (160.7 - 30.0 ns)

• local DRAM: ~60 ns

• remote DRAM: ~100 ns

Performance Analysis Guide for Intel® Core™ i7 Processor and Intel® Xeon™ 5500 processors: http://intel.ly/2cV1ZFZ

Cache latency

Page 26: Java Memory Model

Weak vs. Strong hardware Memory Models

Weak vs. Strong Memory Models: http://bit.ly/2cC4avk

Page 27: Java Memory Model

x86/x64 processor memory model

R-R R-W

W-R W-W

Intel® 64 and IA-32 Architectures Software Developer’s Manual: http://intel.ly/2csMyB2

Processor P can read B before its write to A is seen by all processors (a processor can move its own reads in front of its own writes).

Page 29: Java Memory Model

How does the Java compiler work?

[Diagram: Source code → javac → Bytecode → JVM (Class loader, Bytecode verifier, JIT) → Native code → OS]

Page 30: Java Memory Model

JIT

• Profile guided,

• Speculatively optimizing,

• Backup strategies,

• Optimizes code for us,

• We don’t have to care so much about cache-wise operations

Page 31: Java Memory Model

Tiered compilation

[Figure: throughput vs. time across startup, interpreted, C1 and C2 phases; labels: sampling, full speed, deoptimize, bail to interpreter]

Page 32: Java Memory Model

Tiered compilation (interpreter)

[Figure: throughput vs. time across startup, interpreted, C1 and C2 phases; labels: sampling, full speed, deoptimize, bail to interpreter]

Interpreter

• extremely slow,

• not profiling

Page 33: Java Memory Model

Tiered compilation (C1 compiler)

[Figure: throughput vs. time across startup, interpreted, C1 and C2 phases; labels: sampling, full speed, deoptimize, bail to interpreter]

C1

• client,

• fast but dumb,

• does the profiling,

• e.g. branches, type checks

Page 34: Java Memory Model

Tiered compilation (C2 compiler)

[Figure: throughput vs. time across startup, interpreted, C1 and C2 phases; labels: sampling, full speed, deoptimize, bail to interpreter]

C2

• server,

• slow but clever,

• aggressively optimizing,

• based on profile,

• e.g. loop optimizations (unswitching, unrolling), implicit null checking
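As a rough source-level picture of what loop unswitching means, the sketch below shows the same computation before and after the transformation. This is only an illustration: the JIT works on its internal representation, not on Java source, and the class and method names are hypothetical.

```java
// Hypothetical illustration of loop unswitching; both methods compute the same result.
final class LoopUnswitchingSketch {

    // Original shape: the loop-invariant flag is tested on every iteration.
    static long sumOriginal(int[] data, boolean doubled) {
        long sum = 0;
        for (int value : data) {
            if (doubled) {
                sum += 2L * value;
            } else {
                sum += value;
            }
        }
        return sum;
    }

    // Unswitched shape: the invariant test is hoisted out of the loop,
    // leaving two loops whose bodies contain no branch on the flag.
    static long sumUnswitched(int[] data, boolean doubled) {
        long sum = 0;
        if (doubled) {
            for (int value : data) {
                sum += 2L * value;
            }
        } else {
            for (int value : data) {
                sum += value;
            }
        }
        return sum;
    }
}
```

The point is that programmers can write the first form and leave the second to C2, once the profile shows the loop is hot.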

Page 35: Java Memory Model

Why do we need a JMM?

• Different platform memory models (none of them match the JMM!!!)

• Many JVM implementations,

• People don’t know how to program concurrently,

• Programmers: write reliable and multithreaded code,

• Compiler writers: implement optimizations that are legal according to the JLS,

• Compiler: produce fast and optimal native code,

Page 36: Java Memory Model

JMM

• Actions: reads and writes of variables, locks and unlocks of monitors, starting and joining threads,

• Happens-before partial order,

• For a thread executing action B to be guaranteed to see the results of action A (in any thread), there must be a happens-before relationship between A and B,

• Otherwise the JVM is free to reorder the actions,

Page 37: Java Memory Model

Happens-before orderings

• Unlock of a monitor / lock of that monitor,

• Write to a volatile variable / read of that variable,

• Call to start() / any action in the started thread,

• All actions in a thread / any other thread successfully returns from join() on that thread,

• Setting default values for variables, and setting the value of a final field in the constructor / the constructor finishing,

• Write to an Atomic variable / read from that variable,

• Many java.util.concurrent methods,
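The start() and join() orderings above can be sketched with plain (non-volatile) fields. The class and field names are hypothetical; the example only relies on Thread.start() and Thread.join().

```java
// Hypothetical sketch: visibility through start()/join() alone, no volatile or locks.
final class StartJoinVisibility {

    // Deliberately plain fields.
    private int input;
    private int result;

    int compute(int value) {
        input = value;                          // written before start()
        Thread worker = new Thread(() -> {
            result = input * 2;                 // start() happens-before this read of input
        });
        worker.start();
        try {
            worker.join();                      // the worker's actions happen-before join() returning
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;                          // guaranteed to see the worker's write
    }
}
```

Without the join(), the read of result would race with the worker's write.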

Page 38: Java Memory Model

JMM

• A promise for programmers: sequential consistency must be sacrificed to allow optimizations, but it will still hold for data race free programs. This is the data race free (DRF) guarantee.

• A promise for security: even for programs with data races, values should not appear “out of thin air”, preventing unintended information leakage.

• A promise for compilers: common hardware and software optimizations should be allowed as far as possible without violating the first two requirements.

Java Memory Model Examples: Good, Bad and Ugly: http://bit.ly/2cZfF1I

Page 39: Java Memory Model

Example

@NotThreadSafe
class DataRace {

    int a, b;
    int x, y;

    void thread1() {
        y = a;
        b = 1;
    }

    void thread2() {
        x = b;
        a = 2;
    }
}

y == 2, x == 1 ???

Page 40: Java Memory Model

How can this happen?

• The processor can reorder instructions (out-of-order execution, HT),

• Lazy synchronization between caches and main memory,

• The compiler can reorder statements (or keep values in registers),

• Aggressive optimizations in JIT,
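To see that y == 2, x == 1 really needs one of these reorderings, one can enumerate every sequentially consistent interleaving of the two threads from the DataRace example. The helper below is a hypothetical sketch (its class and method names are not from the slides); it simulates the four statements in every order that respects each thread's program order.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: enumerates all sequentially consistent interleavings of
//   thread1: y = a; b = 1;      thread2: x = b; a = 2;
final class ScOutcomes {

    // Returns the reachable outcomes formatted as "y, x".
    static Set<String> enumerate() {
        Set<String> outcomes = new TreeSet<>();
        explore(0, 0, new int[4], outcomes);
        return outcomes;
    }

    // state = {a, b, x, y}; pc1 and pc2 count the statements each thread has executed.
    private static void explore(int pc1, int pc2, int[] state, Set<String> outcomes) {
        if (pc1 == 2 && pc2 == 2) {
            outcomes.add(state[3] + ", " + state[2]);
            return;
        }
        if (pc1 < 2) {
            int[] next = state.clone();
            if (pc1 == 0) {
                next[3] = next[0];   // y = a
            } else {
                next[1] = 1;         // b = 1
            }
            explore(pc1 + 1, pc2, next, outcomes);
        }
        if (pc2 < 2) {
            int[] next = state.clone();
            if (pc2 == 0) {
                next[2] = next[1];   // x = b
            } else {
                next[0] = 2;         // a = 2
            }
            explore(pc1, pc2 + 1, next, outcomes);
        }
    }
}
```

The enumeration reaches only the outcomes "0, 0", "0, 1" and "2, 0"; "2, 1" never appears, so observing it means at least one thread's statements took effect out of program order.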

Page 41: Java Memory Model

Example

@NotThreadSafe
class DataRace {

    int a, b;
    int x, y;

    void thread1() {
        y = a;
        b = 1;
    }

    void thread2() {
        x = b;
        a = 2;
    }
}

time

Thread 1 Thread 2

y = a;

b = 1;

x = b;

a = 2;

Page 42: Java Memory Model

Example

@NotThreadSafe
class DataRace {

    int a, b;
    int x, y;

    void thread1() {
        y = a;
        b = 1;
    }

    void thread2() {
        x = b;
        a = 2;
    }
}

time

Thread 1 Thread 2

b = 1;

y = a;

x = b;

a = 2;

Page 43: Java Memory Model

Example

@NotThreadSafe
class DataRace {

    int a, b;
    int x, y;

    void thread1() {
        y = a;
        b = 1;
    }

    void thread2() {
        x = b;
        a = 2;
    }
}

time

Thread 1 Thread 2

b = 1;

y = a;

a = 2;

x = b;

Page 44: Java Memory Model

Example

@NotThreadSafe
class DataRace {

    int a, b;
    int x, y;

    void thread1() {
        y = a;
        b = 1;
    }

    void thread2() {
        x = b;
        a = 2;
    }
}

time

Thread 1 Thread 2

b = 1;

a = 2;

x = b;

y = a;

y == 2, x == 1

Page 45: Java Memory Model

Example of x86/x64 test results

Page 46: Java Memory Model

Test using jcstress

@JCStressTest

@Description("Data race")

@Outcome(id = {"0, 0", "0, 1", "2, 0"}, expect = ACCEPTABLE,

desc = "Trivial under sequential consistency")

@Outcome(id = {"2, 1"}, expect = ACCEPTABLE, desc = "Racy read of x")

@State

public class DataRace {

int a, b;

int x, y;

@Actor

void thread1(IntResult2 r) {

y = a;

b = 1;

r.r1 = y;

}

@Actor

void thread2(IntResult2 r) {

x = b;

a = 2;

r.r2 = x;

}

}

jcstress: http://bit.ly/2daSL5Q

Page 47: Java Memory Model

Example of x86/x64 test results

R-R R-W

W-R W-W

Page 48: Java Memory Model

Test results interpretation

y==0, x==0

y==0, x==1

y==2, x==0

[Figure: timeline of Thread 1 (y = a; b = 1;) and Thread 2 (x = b; a = 2;)]

Page 51: Java Memory Model

Visibility between threads

@ThreadSafe
public class DataRace {

    int a, b;
    int x, y;

    void thread1() {
        synchronized (this) {
            y = a;
            b = 1;
        }
    }

    void thread2() {
        synchronized (this) {
            x = b;
            a = 2;
        }
    }
}

Page 52: Java Memory Model

Visibility between threads

[Figure: timeline of Thread 1 and Thread 2 (Th2 starts after Th1), showing program order within each thread, synchronization order between the unlock and the later lock, and the resulting happens-before order]

Every operation that happens before an unlock (release) is visible to an operation that happens after a later lock (acquire).

@ThreadSafe
public class DataRace {

    int a, b;
    int x, y;

    void thread1() {
        synchronized (this) {
            y = a;
            b = 1;
        }
    }

    void thread2() {
        synchronized (this) {
            x = b;
            a = 2;
        }
    }
}

Thread 1: <enter this> y = a; b = 1; <exit this>

Thread 2: <enter this> x = b; a = 2; <exit this>

Possible results:

y == 0, x == 1

y == 2, x == 0
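A small harness can check that the synchronized version only ever produces these two results. This is a sketch, not a replacement for jcstress: the class name and the runOnce() helper are hypothetical, and a passing run shows the absence of a counterexample rather than proving correctness.

```java
// Hypothetical harness around the synchronized DataRace variant.
final class SynchronizedDataRace {

    private int a, b;
    private int x, y;

    void thread1() {
        synchronized (this) {
            y = a;
            b = 1;
        }
    }

    void thread2() {
        synchronized (this) {
            x = b;
            a = 2;
        }
    }

    // Runs both methods concurrently once and returns the outcome as "y, x".
    static String runOnce() {
        SynchronizedDataRace state = new SynchronizedDataRace();
        Thread t1 = new Thread(state::thread1);
        Thread t2 = new Thread(state::thread2);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // join() makes the final values of x and y visible to the calling thread.
        return state.y + ", " + state.x;
    }
}
```

Whichever thread takes the monitor first runs its block completely before the other starts, so the outcome is either "0, 1" (thread 1 first) or "2, 0" (thread 2 first).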

Page 53: Java Memory Model

Synchronization

High level

• java.util.concurrent

Low level

• synchronized blocks and methods,

• java.util.concurrent.locks

Low level primitives

• volatile variables

• java.util.concurrent.atomic
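As an example of the high-level end of this list, the sketch below splits a sum across an ExecutorService and collects the partial results through Future.get(), with no explicit locks or volatile fields. The class name and the chunking scheme are hypothetical choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: sums 1..n by splitting the range into up to `parts` tasks.
final class ParallelSum {

    static long sum(int n, int parts) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            int chunk = (n + parts - 1) / parts;   // ceiling division
            for (int start = 1; start <= n; start += chunk) {
                final int from = start;
                final int to = Math.min(n, start + chunk - 1);
                Callable<Long> task = () -> {
                    long partial = 0;
                    for (int i = from; i <= to; i++) {
                        partial += i;
                    }
                    return partial;
                };
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get();   // waits for the task and returns its result
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The memory-visibility work is done by the library: each task's result is handed back through its Future, so the calling thread never touches shared mutable state directly.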

Page 54: Java Memory Model

Volatile

@ThreadUnsafe
public class Looper {

    static boolean done;

    public static void main(String[] args) throws InterruptedException {
        new Thread(new Runnable() {
            @Override
            public void run() {
                int count = 0;
                while (!done) {
                    count++;
                }
                System.out.println("Ending this task");
            }
        }).start();

        Thread.sleep(1000);
        System.out.println("Waiting done");
        done = true;
    }
}

Page 55: Java Memory Model

Volatile

@ThreadSafe
public class Looper {

    volatile static boolean done;

    public static void main(String[] args) throws InterruptedException {
        new Thread(new Runnable() {
            @Override
            public void run() {
                int count = 0;
                while (!done) {
                    count++;
                }
                System.out.println("Ending this task");
            }
        }).start();

        Thread.sleep(1000);
        System.out.println("Waiting done");
        done = true;
    }
}

[Figure: timeline of Thread 1 and Thread 2 with the volatile write (done = true;) and the volatile read (while (!done) ...), showing program order, synchronization order, and happens-before order]

Page 56: Java Memory Model

More about volatile

• Volatile reads are very cheap (no locks, unlike synchronized)

• Volatile increment is not atomic (!!!)

• Elements of a volatile array are not volatile (e.g. volatile int[])

• Consider using java.util.concurrent
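The "volatile increment is not atomic" point can be sketched by incrementing a volatile int and an AtomicInteger side by side from several threads. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch comparing a volatile counter with an AtomicInteger.
final class CounterComparison {

    private volatile int volatileCounter;
    private final AtomicInteger atomicCounter = new AtomicInteger();

    // Returns {final volatile count, final atomic count}.
    int[] run(int threads, int incrementsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    volatileCounter++;                 // read, add, write: three steps, not atomic
                    atomicCounter.incrementAndGet();   // single atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] {volatileCounter, atomicCounter.get()};
    }
}
```

The atomic count always equals threads × incrementsPerThread. The volatile count can come out lower, because two threads may read the same value and both write back value + 1; how often that happens depends on the machine and the run.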

Page 57: Java Memory Model

What operations in Java are atomic?

• Read/write on variables of primitive types (except long and double: a non-volatile 64-bit read or write may be split into two 32-bit halves, see JLS §17.7),

• Read/write on volatile variables of primitive type (including long and double),

• All read/writes to references are always atomic (http://bit.ly/2c8kn8i),

• All operations on java.util.concurrent.atomic types,
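Compound updates can be built from the atomic types with a compare-and-set retry loop. The sketch below tracks a running maximum with AtomicLong; the class name and the maxOfConcurrentOffers() driver are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: lock-free running maximum using compareAndSet.
final class AtomicMax {

    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    void offer(long candidate) {
        long current;
        do {
            current = max.get();
            if (candidate <= current) {
                return;                                     // nothing to update
            }
        } while (!max.compareAndSet(current, candidate));   // retry if another thread won the race
    }

    long get() {
        return max.get();
    }

    // Driver: thread t offers the values t * perThread .. t * perThread + perThread - 1.
    static long maxOfConcurrentOffers(int threads, int perThread) {
        AtomicMax tracker = new AtomicMax();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final long base = (long) t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    tracker.offer(base + i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return tracker.get();
    }
}
```

Each individual get() and compareAndSet() is atomic; the loop turns them into an atomic "update if larger" without taking a lock.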

Page 58: Java Memory Model

Examples

Be careful what you’re doing...

Page 59: Java Memory Model

Double-checked locking

@ThreadSafe

public class DoubleCheckedLocking {

private volatile Helper helper = null;

public Helper getHelper() {

if (helper == null) {

synchronized (this) {

if (helper == null)

helper = new Helper();

}

}

return helper;

}

}

The "Double-Checked Locking is Broken" Declaration: http://bit.ly/2cIDBnA
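A different technique that is often used instead of double-checked locking for static singletons is the initialization-on-demand holder idiom, which relies on the JVM's class initialization guarantees rather than on volatile. The sketch below is not from the slides, and the nested Helper class is a hypothetical stand-in.

```java
// Hypothetical sketch of the initialization-on-demand holder idiom.
final class LazyHelperHolder {

    // Stand-in for the Helper type used on the slide.
    static final class Helper {
    }

    private LazyHelperHolder() {
    }

    // Holder is initialized only on the first call to getHelper();
    // class initialization is performed at most once and is thread safe.
    private static final class Holder {
        static final Helper INSTANCE = new Helper();
    }

    static Helper getHelper() {
        return Holder.INSTANCE;
    }
}
```

This only works for static, per-class lazy values; per-instance lazy fields still need the volatile double-checked form shown above or plain synchronization.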

Page 60: Java Memory Model

Final

@ThreadUnsafe

class UnsafePublication {

private int a;

private static UnsafePublication instance;

private UnsafePublication() {

a = 1;

}

void thread1() throws InterruptedException {

instance = new UnsafePublication();

}

void thread2() {

if (instance != null) {

System.out.println(instance.a);

}

}

}

What state can thread 2 see?

null, 0, 1

Page 61: Java Memory Model

Final

@ThreadSafe

class SafePublication {

private final int a;

private static SafePublication instance;

private SafePublication() {

a = 1;

}

void thread1() throws InterruptedException {

instance = new SafePublication();

}

void thread2() {

if (instance != null) {

System.out.println(instance.a);

}

}

}

Page 62: Java Memory Model

Next-JMM

• JEP 188,

• Improve formalization,

• JVM coverage,

• Extend scope,

• Testing support,

• Tool support,

• Enh: atomic r/w for long and double,

Page 63: Java Memory Model

To sum up...

• Concurrent programming isn’t easy,

• Design your code for concurrency (make it right before you make it fast),

• Do not code against the implementation. Code against the specification,

• Use high level synchronization wherever possible,

• Watch out for useless synchronization,

• Use Thread Safe Immutable objects,
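As a sketch of the last point, an immutable class typically has only final fields, copies any mutable input defensively, and returns new instances instead of mutating. The class below is a hypothetical example, not taken from the slides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable value object: all fields final, no setters, defensive copies.
final class ImmutableRoute {

    private final String name;
    private final List<String> stops;

    ImmutableRoute(String name, List<String> stops) {
        this.name = name;
        // Copy first, then wrap: later changes to the caller's list cannot leak in.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
    }

    String name() {
        return name;
    }

    List<String> stops() {
        return stops;   // unmodifiable view, safe to share between threads
    }

    // "Modification" returns a new object and leaves this one untouched.
    ImmutableRoute withStop(String stop) {
        List<String> extended = new ArrayList<>(stops);
        extended.add(stop);
        return new ImmutableRoute(name, extended);
    }
}
```

Because every field is final and nothing changes after construction, such objects can be read by any number of threads without synchronization.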

Page 64: Java Memory Model

Further reading

• Aleksey Shipilëv: One Stop Page (http://bit.ly/2cqBt4x),

• Rafael Winterhalter: The Java Memory Model for Practitioners (http://bit.ly/2cMXklJ),

• Brian Goetz: Java Concurrency in Practice (http://amzn.to/2cloe76)

Page 65: Java Memory Model

Thank you!