Lifting The Veil - Reading Java Bytecode

Lifting The Veil – Reading Java Byte Code Alexander Shopov


Java Bytecode Explained

Transcript of Lifting The Veil - Reading Java Bytecode

Page 1: Lifting The Veil - Reading Java Bytecode

Lifting The Veil – Reading Java Byte Code

Alexander Shopov

Page 2: Lifting The Veil - Reading Java Bytecode

Alexander Shopov

By day: Software Engineer at Cisco
By night: OSS contributor
Coordinator of Bulgarian Gnome TP

Contacts:
E-mail: [email protected]
[email protected]
LinkedIn: http://www.linkedin.com/in/alshopov
Google: Just search “al_shopov”

Page 3: Lifting The Veil - Reading Java Bytecode

Please Learn And Share

License: CC-BY v3.0
Creative Commons Attribution v3.0

Page 4: Lifting The Veil - Reading Java Bytecode

Disclaimer

My opinions, knowledge and experience!

Not my employer's.

Page 5: Lifting The Veil - Reading Java Bytecode

Contents

● Why read?
● How to read?
  ● JVM Internals;
  ● JVM Data Types;
  ● JVM Opcodes.
● Let's read some code.
● What next?

Page 6: Lifting The Veil - Reading Java Bytecode

Why Read Byte code?

● Understand your platform
● It is interesting and not too hard
● How does Java function? How does X function?
● Job interviews
● Catch compiler bugs/optimizations
● Learn to read before you write
● Source may not correspond to binary
● C/C++ people know their assembler
● Java language evolution vs. Java platform evolution

Page 7: Lifting The Veil - Reading Java Bytecode

Bad News And Good News

Bad: We will be reading assembler

Good: The easiest assembler in the world

Page 8: Lifting The Veil - Reading Java Bytecode

What Is The JVM?

● Stack based, byte oriented virtual machine without registers easily implementable on 32 bit hardware.

● 206 (<256) instructions that are easy to group and there is no need to remember them all

● Some leeway in implementations (even with Oracle)

Page 9: Lifting The Veil - Reading Java Bytecode

Dramatis Personæ

● The JVM
● The threads
● The frames
● The stacks – LIFO
● The local variables – array of slots
● The runtime constant pool – array of values
● The bytecode – the instructions
● Class files – serialized form of constants and bytecode

Page 10: Lifting The Veil - Reading Java Bytecode

Enter JVM

JVM OS process

Page 11: Lifting The Veil - Reading Java Bytecode

Enter Threads

Thread A   Thread B   Thread C   Thread D

Page 12: Lifting The Veil - Reading Java Bytecode

Enter Frames

Thread A: F0 F1 F2 F3 F4
Thread B: F0 F1 F2
Thread C: F0 F1
Thread D: F0 F1 F2 F3

Page 13: Lifting The Veil - Reading Java Bytecode

Enter Frames, Really!

F0 F1 F2 F3 F4
F0 F1 F2
F0 F1
F0 F1 F2 F3

Page 14: Lifting The Veil - Reading Java Bytecode

What Is A Frame Actually?

F0

Page 15: Lifting The Veil - Reading Java Bytecode

Let's Peek Inside A Frame

F0

Page 16: Lifting The Veil - Reading Java Bytecode

F0

Enter Local Variables

0 1 2 3 4 5 6 …

Local variables

Page 17: Lifting The Veil - Reading Java Bytecode

F0

Enter Stack

0 1 2 3 4 5 6 …

Local variables

Stack

Page 18: Lifting The Veil - Reading Java Bytecode

F0

Enter Pool Of Constants

Pool of constants

0 1 2 3 4 5 6 …

Local variables

Stack

Page 19: Lifting The Veil - Reading Java Bytecode

F0

Where Is The Code?

Pool of constants

0 1 2 3 4 5 6 …

Local variables

Stack

Page 20: Lifting The Veil - Reading Java Bytecode

JVM (heap)

F0

Where Is The Code?

Pool of constants

0 1 2 3 4 5 6 …

Local variables

Stack

Page 21: Lifting The Veil - Reading Java Bytecode

JVM (heap)

F0

Where Is The Code?

Class

Pool of constants

0 1 2 3 4 5 6 …

Local variables

Stack

Method code, PC

Class

Page 22: Lifting The Veil - Reading Java Bytecode

JVM (heap)

F0

Where is the code?

6

Class

Pool of constants

0 1 2 3 4 5 6 …

Local variables

Stack

Method code, PC

Class

Page 23: Lifting The Veil - Reading Java Bytecode

JVM (heap)

F0

Load

6

Class

Pool of constants

0 1 2 3 4 5 6 …

Local variables

Stack

6

Method code, PC

Class

Page 24: Lifting The Veil - Reading Java Bytecode

JVM (heap)

F0

And…

6

Class

Pool of constants

0 1 2 3 4 5 6 …

Local variables

8

Stack

6

Method code, PC

Class

Page 25: Lifting The Veil - Reading Java Bytecode

JVM (heap)

F0

Store

6 8

Class

Pool of constants

0 1 2 3 4 5 6 …

Local variables

8

Stack

6

Method code, PC

Class
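The load, operate, store cycle pictured on the preceding slides can be imitated in a few lines of Java. This is only a rough sketch: the class `TinyFrame` and its methods are invented for illustration, it handles `int` values only, and it says nothing about how a real JVM lays out its frames.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy model of one frame: int-only local variable slots
// plus an operand stack. Not how a real JVM is implemented.
final class TinyFrame {
    private final int[] locals;
    private final Deque<Integer> stack = new ArrayDeque<>();

    TinyFrame(int... initialLocals) {
        this.locals = initialLocals.clone();
    }

    // iload: copy a local variable slot onto the operand stack
    void iload(int slot) {
        stack.push(locals[slot]);
    }

    // iadd: pop two ints and push their sum
    void iadd() {
        stack.push(stack.pop() + stack.pop());
    }

    // istore: pop the top of the stack into a local variable slot
    void istore(int slot) {
        locals[slot] = stack.pop();
    }

    int local(int slot) {
        return locals[slot];
    }

    int stackDepth() {
        return stack.size();
    }
}

final class TinyFrameDemo {
    public static void main(String[] args) {
        TinyFrame frame = new TinyFrame(6, 2, 0); // slots 0, 1, 2
        frame.iload(0);
        frame.iload(1);
        frame.iadd();
        frame.istore(2);
        System.out.println(frame.local(2)); // prints 8
    }
}
```

Every instruction either moves a value between the slots and the stack or works on the top of the stack, which is why the machine needs no registers.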

Page 26: Lifting The Veil - Reading Java Bytecode

JVM Datatypes

● Primitive types
  ● Java: numeric – integral: byte (±8), short (±16), int (±32), long (±64), char (+16); floating point: float (±32), double (±64); boolean (int or byte)
  ● returnAddress – pointers to the opcodes of JVM (jumps - loops)
● Reference types
  ● class, array, interface
  ● null

Page 27: Lifting The Veil - Reading Java Bytecode

JVM Datatypes Descriptors

Java type Type descriptor

boolean Z

char C

byte B

short S

int I

float F

long J

double D

Object Ljava/lang/Object;

byte[] [B

String[][] [[Ljava/lang/String;

void V
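The mapping in the table is mechanical enough to write down as code. The sketch below rebuilds a descriptor from a `Class` object; the helper name `descriptorOf` and the class `Descriptors` are made up for this example.

```java
// Hypothetical helper that rebuilds a JVM type descriptor from a Class object.
final class Descriptors {
    static String descriptorOf(Class<?> type) {
        if (type == void.class)    return "V";
        if (type == boolean.class) return "Z";
        if (type == char.class)    return "C";
        if (type == byte.class)    return "B";
        if (type == short.class)   return "S";
        if (type == int.class)     return "I";
        if (type == float.class)   return "F";
        if (type == long.class)    return "J";
        if (type == double.class)  return "D";
        if (type.isArray()) {
            // One '[' per array dimension, then the element type
            return "[" + descriptorOf(type.getComponentType());
        }
        // Reference type: L + binary name with '/' separators + ';'
        return "L" + type.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(descriptorOf(Object.class));
        System.out.println(descriptorOf(byte[].class));
        System.out.println(descriptorOf(String[][].class));
    }
}
```

Note the two letters that are not initials: `Z` for boolean (because `B` is taken by byte) and `J` for long (because `L` introduces a class name).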

Page 28: Lifting The Veil - Reading Java Bytecode

JVM Method Descriptors

Source code method declaration       Method descriptor
void m1(int i, double d, float f)    (IDF)V
byte[] m2(String s)                  (Ljava/lang/String;)[B
Object m3(int[][][] i)               ([[[I)Ljava/lang/Object;
boolean[] m4()

Page 29: Lifting The Veil - Reading Java Bytecode

JVM Method Descriptors

Source code method declaration       Method descriptor
void m1(int i, double d, float f)    (IDF)V
byte[] m2(String s)                  (Ljava/lang/String;)[B
Object m3(int[][][] i)               ([[[I)Ljava/lang/Object;
boolean[] m4()                       ()[Z
                                     (Ljava/lang/Object;Ljava/lang/Long;)J

Page 30: Lifting The Veil - Reading Java Bytecode

JVM Method Descriptors

Source code method declaration       Method descriptor
void m1(int i, double d, float f)    (IDF)V
byte[] m2(String s)                  (Ljava/lang/String;)[B
Object m3(int[][][] i)               ([[[I)Ljava/lang/Object;
boolean[] m4()                       ()[Z
long m5(Object, Long)                (Ljava/lang/Object;Ljava/lang/Long;)J
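Method descriptors can also be checked against the JDK itself. A small sketch using `java.lang.invoke.MethodType`; the wrapper class `MethodDescriptorDemo` and its `descriptor` helper are invented names for this example.

```java
import java.lang.invoke.MethodType;

final class MethodDescriptorDemo {
    // Builds the descriptor for a method with the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes)
                         .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void m1(int i, double d, float f)
        System.out.println(descriptor(void.class, int.class, double.class, float.class));
        // byte[] m2(String s)
        System.out.println(descriptor(byte[].class, String.class));
        // boolean[] m4(): a boolean array is [Z, while [B is a byte array
        System.out.println(descriptor(boolean[].class));
        // long m5(Object, Long)
        System.out.println(descriptor(long.class, Object.class, Long.class));
    }
}
```

Parameter names never appear in a descriptor, only the types in declaration order followed by the return type.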

Page 31: Lifting The Veil - Reading Java Bytecode

206 instructions

DON'T PANIC!

Page 32: Lifting The Veil - Reading Java Bytecode

Level 1 – Do Nothing/1

● nop

Page 33: Lifting The Veil - Reading Java Bytecode

Level 2 – Load Constants/20

● aconst_null
● iconst_m1, iconst_0, iconst_1, iconst_2, iconst_3, iconst_4, iconst_5
● lconst_0, lconst_1
● fconst_0, fconst_1, fconst_2
● dconst_0, dconst_1
● bipush, sipush – 1, 2 bytes
● ldc, ldc_w, ldc2_w – load from index in constant pool; 1, 2, 2 bytes for index
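Which of these opcodes appears depends on how large the constant is. The sketch below is ordinary Java; the opcode named in each comment is what javac typically emits for that literal, which is worth confirming with javap on your own compiler. The class name `ConstantLoads` is invented for the example.

```java
final class ConstantLoads {
    static long sum() {
        int small = 5;                // -1..5 have dedicated opcodes: iconst_5
        int oneByte = 100;            // fits in a signed byte: bipush 100
        int twoBytes = 1000;          // fits in a signed short: sipush 1000
        int big = 100_000;            // too big for sipush, comes from the pool: ldc
        long wide = 10_000_000_000L;  // long constant from the pool: ldc2_w
        return small + oneByte + twoBytes + big + wide;
    }

    public static void main(String[] args) {
        System.out.println(sum());
    }
}
```

The smaller the constant, the shorter its encoding: one byte for `iconst_5`, two for `bipush`, three for `sipush`.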

Page 34: Lifting The Veil - Reading Java Bytecode

Level 3 – Load Variables/33

● iload, lload, fload, dload, aload
● iload_0, iload_1, iload_2, iload_3, lload_0, lload_1, lload_2, lload_3, fload_0, fload_1, fload_2, fload_3, dload_0, dload_1, dload_2, dload_3, aload_0, aload_1, aload_2, aload_3
● iaload, laload, faload, daload, aaload, baload, caload, saload – consume reference to array and int index in it

Page 35: Lifting The Veil - Reading Java Bytecode

Level 4 – Conversions/15

● i2l, i2f, i2d, l2i, l2f, l2d, f2i, f2l, f2d, d2i, d2l, d2f, i2b, i2c, i2s

Page 36: Lifting The Veil - Reading Java Bytecode

Level 6 – Maths/37

● iadd, ladd, fadd, dadd, isub, lsub, fsub, dsub, imul, lmul, fmul, dmul, idiv, ldiv, fdiv, ddiv, irem, lrem, frem, drem, ineg, lneg, fneg, dneg, ishl, lshl, ishr, lshr, iushr, lushr, iand, land, ior, lor, ixor, lxor

● iinc – increment local variable #index by signed byte const

Page 37: Lifting The Veil - Reading Java Bytecode

Level 7 – Stores/33

● istore, lstore, fstore, dstore, astore, istore_0, istore_1, istore_2, istore_3, lstore_0, lstore_1, lstore_2, lstore_3, fstore_0, fstore_1, fstore_2, fstore_3, dstore_0, dstore_1, dstore_2, dstore_3, astore_0, astore_1, astore_2, astore_3, iastore, lastore, fastore, dastore, aastore, bastore, castore, sastore

Page 38: Lifting The Veil - Reading Java Bytecode

Level 8 – No-branch Comparisons/5

● lcmp, fcmpl, fcmpg, dcmpl, dcmpg (beware NaN)
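The "beware NaN" remark is about the only difference between the `l` and `g` variants: what they push when an operand is NaN. A minimal Java emulation of the two float instructions follows; the class and method names are made up and simply mirror the opcode names.

```java
final class FloatCompare {
    // Emulates fcmpl: pushes 1, 0 or -1, and -1 when either operand is NaN.
    static int fcmpl(float a, float b) {
        if (a > b) return 1;
        if (a == b) return 0;
        return -1; // a < b, or at least one operand is NaN
    }

    // Emulates fcmpg: pushes 1, 0 or -1, and 1 when either operand is NaN.
    static int fcmpg(float a, float b) {
        if (a < b) return -1;
        if (a == b) return 0;
        return 1; // a > b, or at least one operand is NaN
    }

    public static void main(String[] args) {
        System.out.println(fcmpl(Float.NaN, 1f)); // prints -1
        System.out.println(fcmpg(Float.NaN, 1f)); // prints 1
    }
}
```

As far as I know, a compiler picks whichever variant makes the following branch treat a comparison with NaN as false, so both `x < y` and `x > y` fail when either side is NaN.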

Page 39: Lifting The Veil - Reading Java Bytecode

Level 9 – Objects/15

● getstatic, putstatic
● getfield, putfield
● invokevirtual, invokespecial, invokestatic, invokeinterface
● new, newarray, anewarray
● arraylength
● athrow
● checkcast, instanceof (difference is treatment of null)
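The different treatment of null by `checkcast` and `instanceof` is visible from plain Java. In this small sketch the class `NullCasts` and its methods are invented; a cast in source compiles to `checkcast`, and the `instanceof` operator to the `instanceof` instruction.

```java
final class NullCasts {
    // instanceof: null is never an instance of anything, so the result is false
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // checkcast: null passes any cast unchanged; a wrong type throws ClassCastException
    static String asString(Object o) {
        return (String) o;
    }

    public static void main(String[] args) {
        System.out.println(isString(null)); // prints false
        System.out.println(asString(null)); // prints null
    }
}
```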

Page 40: Lifting The Veil - Reading Java Bytecode

Level 10 – Return/6

● ireturn, lreturn, freturn, dreturn, areturn, return

Page 41: Lifting The Veil - Reading Java Bytecode

165 of 206

80%

Page 42: Lifting The Veil - Reading Java Bytecode

We Have Enough Mana/Resources!

Let's dive in bytecode!

Page 43: Lifting The Veil - Reading Java Bytecode

Enter Bytecode

javap – your only true friend now

javap -classpath PATH -p -c -l -s CLASS

Page 44: Lifting The Veil - Reading Java Bytecode

Example 1

public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: istore_3
     4: iload_3
     5: iload_2
     6: iadd
     7: istore_3
     8: iload_3
     9: ireturn

Page 45: Lifting The Veil - Reading Java Bytecode

Example 1

public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: istore_3
     4: iload_3
     5: iload_2
     6: iadd
     7: istore_3
     8: iload_3
     9: ireturn

public static int whatIsThis(int a, int b, int c) {
    int result = a + b;
    result += c;
    return result;
}

Page 46: Lifting The Veil - Reading Java Bytecode

Example 2

public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: iload_2
     4: iadd
     5: ireturn

Page 47: Lifting The Veil - Reading Java Bytecode

Example 2

public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: iload_2
     4: iadd
     5: ireturn

public static int whatIsThis(int a, int b, int c) {
    return a + b + c;
}

Page 48: Lifting The Veil - Reading Java Bytecode

Example 3

public static int whatIsThis(int, float, double);
  Signature: (IFD)I
  Code:
     0: iload_0
     1: i2f
     2: fload_1
     3: fadd
     4: f2d
     5: dload_2
     6: dadd
     7: d2i
     8: ireturn
  LineNumberTable:
    line 6: 0
  LocalVariableTable:
    Start  Length  Slot  Name   Signature
        0       9     0     a   I
        0       9     1     b   F
        0       9     2     c   D

Page 49: Lifting The Veil - Reading Java Bytecode

Example 3

public static int whatIsThis(int, float, double);
  Signature: (IFD)I
  Code:
     0: iload_0
     1: i2f
     2: fload_1
     3: fadd
     4: f2d
     5: dload_2
     6: dadd
     7: d2i
     8: ireturn
  LineNumberTable:
    line 6: 0
  LocalVariableTable:
    Start  Length  Slot  Name   Signature
        0       9     0     a   I
        0       9     1     b   F
        0       9     2     c   D

public static int whatIsThis(int a, float b, double c) {
    return (int) (a + b + c);
}
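The conversions the compiler slipped in (`i2f`, `f2d`, `d2i`) can be spelled out by hand. The sketch below puts the one-line version next to an explicit version that follows the bytecode step by step; the class and method names are invented for this example.

```java
final class Conversions {
    // The source as written on the slide
    static int implicit(int a, float b, double c) {
        return (int) (a + b + c);
    }

    // The same computation with every conversion made visible
    static int explicit(int a, float b, double c) {
        float ab = (float) a + b;      // i2f, fadd
        double abc = (double) ab + c;  // f2d, dadd
        return (int) abc;              // d2i
    }

    public static void main(String[] args) {
        System.out.println(implicit(1, 2.5f, 3.25));      // prints 6
        // i2f can lose precision: 16777217 has no exact float representation
        System.out.println(implicit(16777217, 0f, 0.0));  // prints 16777216
    }
}
```

Reading the bytecode makes the hidden `i2f` obvious, which is exactly where the second call loses its last digit.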

Page 50: Lifting The Veil - Reading Java Bytecode

Example 4

public static void main(java.lang.String[]);
  Signature: ([Ljava/lang/String;)V
  Code:
     0: getstatic     #16  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #22  // String BGOUG
     5: invokevirtual #24  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
     8: return

Page 51: Lifting The Veil - Reading Java Bytecode

More verbosity

javap -v -classpath PATH -p -c -l -s CLASS

Page 52: Lifting The Veil - Reading Java Bytecode

Example 4

Constant pool:
   #1 = Class      #2       // org/kambanaria/readbytecode/bgoug/Example4
   #2 = Utf8       org/kambanaria/readbytecode/bgoug/Example4
  …
  #16 = Fieldref   #17.#19  // java/lang/System.out:Ljava/io/PrintStream;
  …
  #22 = String     #23      // BGOUG
  #23 = Utf8       BGOUG
  #24 = Methodref  #25.#27  // java/io/PrintStream.println:(Ljava/lang/String;)V
  …

Page 53: Lifting The Veil - Reading Java Bytecode

Example 4

public static void main(java.lang.String[]);
  Signature: ([Ljava/lang/String;)V
  Code:
     0: getstatic     #16  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #22  // String BGOUG
     5: invokevirtual #24  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
     8: return

public static void main(String[] args) {
    System.out.println("BGOUG");
}

// Hello, BGOUG!

Page 54: Lifting The Veil - Reading Java Bytecode

Example 5

public char[] whatIsThis();
  Code:
     0: aload_0
     1: getfield      #12  // Field content:[C
     4: areturn

public static void main(java.lang.String[]);
  Code:
     0: getstatic     #22  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: new           #1   // class org/kambanaria/readbytecode/bgoug/Example5
     6: dup
     7: invokespecial #28  // Method "<init>":()V
    10: invokevirtual #29  // Method whatIsThis:()[C
    13: invokestatic  #31  // Method java/util/Arrays.toString:([C)Ljava/lang/String;
    16: invokevirtual #37  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    19: return

Page 55: Lifting The Veil - Reading Java Bytecode

Example 5

public char[] whatIsThis();
  Code:
     0: aload_0
     1: getfield      #12  // Field content:[C
     4: areturn

public static void main(java.lang.String[]);
  Code:
     0: getstatic     #22  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: new           #1   // class org/kambanaria/readbytecode/bgoug/Example5
     6: dup
     7: invokespecial #28  // Method "<init>":()V
    10: invokevirtual #29  // Method whatIsThis:()[C
    13: invokestatic  #31  // Method java/util/Arrays.toString:([C)Ljava/lang/String;
    16: invokevirtual #37  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    19: return

public char[] whatIsThis() {
    return content;
}

Page 56: Lifting The Veil - Reading Java Bytecode

Example 5

public char[] whatIsThis();
  Code:
     0: aload_0
     1: getfield      #12  // Field content:[C
     4: areturn

public static void main(java.lang.String[]);
  Code:
     0: getstatic     #22  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: new           #1   // class org/kambanaria/readbytecode/bgoug/Example5
     6: dup
     7: invokespecial #28  // Method "<init>":()V
    10: invokevirtual #29  // Method whatIsThis:()[C
    13: invokestatic  #31  // Method java/util/Arrays.toString:([C)Ljava/lang/String;
    16: invokevirtual #37  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    19: return

public static void main(String[] args) {
    System.out.println(
        Arrays.toString(new Example5().whatIsThis()));
}

Page 57: Lifting The Veil - Reading Java Bytecode

Level 11 – Stack/9

● pop: a ➔ (empty)
● pop2: ba ➔ (empty)
● dup: a ➔ aa
● dup_x1: ba ➔ aba
● dup_x2: cba ➔ acba
● dup2: ba ➔ baba
● dup2_x1: cba ➔ bacba
● dup2_x2: dcba ➔ badcba
● swap: ba ➔ ab
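The stack pictures on this slide can be reproduced with string manipulation, reading each string as a stack whose rightmost letter is the top. This is a sketch under a simplifying assumption: every letter is a one-slot value, so the special handling of long and double is ignored. The class `StackOps` and the helper `copyBelow` are invented names.

```java
final class StackOps {
    // Copies the top `count` values and inserts the copy `depth` values below the top.
    private static String copyBelow(String stack, int count, int depth) {
        int n = stack.length();
        String copied = stack.substring(n - count);
        return stack.substring(0, n - depth) + copied + stack.substring(n - depth);
    }

    static String pop(String s)     { return s.substring(0, s.length() - 1); }
    static String pop2(String s)    { return s.substring(0, s.length() - 2); }
    static String dup(String s)     { return copyBelow(s, 1, 1); }
    static String dupX1(String s)   { return copyBelow(s, 1, 2); }
    static String dupX2(String s)   { return copyBelow(s, 1, 3); }
    static String dup2(String s)    { return copyBelow(s, 2, 2); }
    static String dup2X1(String s)  { return copyBelow(s, 2, 3); }
    static String dup2X2(String s)  { return copyBelow(s, 2, 4); }

    // Exchanges the two topmost values
    static String swap(String s) {
        int n = s.length();
        return s.substring(0, n - 2) + s.charAt(n - 1) + s.charAt(n - 2);
    }

    public static void main(String[] args) {
        System.out.println(dupX1("ba"));    // prints aba
        System.out.println(dup2X2("dcba")); // prints badcba
        System.out.println(swap("ba"));     // prints ab
    }
}
```

Seen this way, all six dup variants are one operation with two parameters: how many values to copy and how deep to bury the copy.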

Page 58: Lifting The Veil - Reading Java Bytecode

Example 6

public void whatIsThis(java.lang.String);
  Code:
     0: aload_1
     1: ifnonnull     12
     4: new           #18  // class java/lang/NullPointerException
     7: dup
     8: invokespecial #20  // Method java/lang/NullPointerException."<init>":()V
    11: athrow
    12: aload_0
    13: aload_1
    14: putfield      #21  // Field s:Ljava/lang/String;
    17: return

Page 59: Lifting The Veil - Reading Java Bytecode

Example 6

public void whatIsThis(java.lang.String);
  Code:
     0: aload_1
     1: ifnonnull     12
     4: new           #18  // class java/lang/NullPointerException
     7: dup
     8: invokespecial #20  // Method java/lang/NullPointerException."<init>":()V
    11: athrow
    12: aload_0
    13: aload_1
    14: putfield      #21  // Field s:Ljava/lang/String;
    17: return

public void whatIsThis(String s) {
    if (null == s) {
        throw new NullPointerException();
    }
    this.s = s;
}

Page 60: Lifting The Veil - Reading Java Bytecode

Level 12 – conditions, branches, loops/19

● ifeq, ifne, iflt, ifge, ifgt, ifle
● if_icmpeq, if_icmpne, if_icmplt, if_icmpge, if_icmpgt, if_icmple
● if_acmpeq, if_acmpne
● ifnull, ifnonnull
● goto, jsr, ret

Page 61: Lifting The Veil - Reading Java Bytecode

193 of 206

94%

Page 62: Lifting The Veil - Reading Java Bytecode

Example 7

public static int parse(java.lang.String);
  Code:
     0: aload_0
     1: invokestatic  #16  // Method java/lang/Integer.parseInt:(Ljava/lang/String;)I
     4: ireturn
     5: astore_1
     6: iconst_0
     7: ireturn
  Exception table:
     from    to  target type
         0     4     5   Class java/lang/NumberFormatException

public static int parse(String s) {
    try {
        return Integer.parseInt(s);
    } catch (NumberFormatException e) {
        return 0;
    }
}
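The exception table is what replaces try/catch at the bytecode level: there is no "try" instruction, only a lookup performed when something is thrown. The sketch below models that lookup for the table shown on this slide. The class names and `findHandler` are invented, and the model assumes the usual reading of the columns, with `from` inclusive and `to` exclusive.

```java
import java.util.List;

// Hypothetical model of one row of javap's "Exception table".
final class ExceptionTableEntry {
    final int from;    // first covered offset (inclusive)
    final int to;      // end of the covered range (exclusive)
    final int target;  // offset of the handler code
    final Class<? extends Throwable> type;

    ExceptionTableEntry(int from, int to, int target, Class<? extends Throwable> type) {
        this.from = from;
        this.to = to;
        this.target = target;
        this.type = type;
    }
}

final class ExceptionTableLookup {
    // Returns the handler offset for an exception thrown at `pc`,
    // or -1 when no row matches and the exception leaves the method.
    static int findHandler(List<ExceptionTableEntry> table, int pc, Throwable thrown) {
        for (ExceptionTableEntry entry : table) {
            boolean inRange = pc >= entry.from && pc < entry.to;
            if (inRange && entry.type.isInstance(thrown)) {
                return entry.target;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<ExceptionTableEntry> table = List.of(
            new ExceptionTableEntry(0, 4, 5, NumberFormatException.class));
        // parseInt is invoked at offset 1, inside the covered range
        System.out.println(findHandler(table, 1, new NumberFormatException())); // prints 5
    }
}
```

The handler at offset 5 starts with `astore_1` because the thrown exception is the only value on the stack when control arrives there.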

Page 63: Lifting The Veil - Reading Java Bytecode

Example 8

public class org.kambanaria.readbytecode.bgoug.Example8 {
  static final boolean $assertionsDisabled;

  static {};
    Code:
       0: ldc           #1   // class org/kambanaria/readbytecode/bgoug/Example8
       2: invokevirtual #10  // Method java/lang/Class.desiredAssertionStatus:()Z
       5: ifne          12
       8: iconst_1
       9: goto          13
      12: iconst_0
      13: putstatic     #16  // Field $assertionsDisabled:Z
      16: return

public class Example8 {
    private static String repeat(String s) {
        assert s != null;
        return s + s;
    }
}

Page 64: Lifting The Veil - Reading Java Bytecode

Example 8

private static java.lang.String repeat(java.lang.String);
  Code:
     0: getstatic     #16  // Field $assertionsDisabled:Z
     3: ifne          18
     6: aload_0
     7: ifnonnull     18
    10: new           #28  // class java/lang/AssertionError
    13: dup
    14: invokespecial #30  // Method java/lang/AssertionError."<init>":()V
    17: athrow
    18: new           #31  // class java/lang/StringBuilder
    21: dup
    22: aload_0
    23: invokestatic  #33  // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
    26: invokespecial #39  // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
    29: aload_0
    30: invokevirtual #42  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    33: invokevirtual #46  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    36: areturn

Page 65: Lifting The Veil - Reading Java Bytecode

Now You Know

Beware Asserts In Public Methods!
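The warning follows directly from the bytecode of Example 8: the check is skipped whenever `$assertionsDisabled` is true, and assertions are off unless the JVM is started with `-ea`. Below is a hand-written sketch of the same shape. The boolean parameter stands in for the `$assertionsDisabled` field so the behaviour can be shown either way, and `repeatChecked` is an invented alternative with an explicit check, not something from the slides.

```java
final class AssertShape {
    // Hand-written equivalent of the compiled repeat(); the flag plays the
    // role of the synthetic $assertionsDisabled field.
    static String repeat(String s, boolean assertionsDisabled) {
        if (!assertionsDisabled && s == null) {
            throw new AssertionError();
        }
        return s + s;
    }

    // A public method should validate its arguments unconditionally.
    static String repeatChecked(String s) {
        if (s == null) {
            throw new IllegalArgumentException("s must not be null");
        }
        return s + s;
    }

    public static void main(String[] args) {
        // With assertions disabled the null slips through unnoticed
        System.out.println(repeat(null, true)); // prints nullnull
    }
}
```

An assert documents what the author believes; it does not protect a public method from its callers.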

Page 66: Lifting The Veil - Reading Java Bytecode

Further resources

● Oracle: The JVM Specification, Java SE 7 Edition
● A. Arhipov: Java Bytecode For Discriminating Developers
● Wikipedia: Java Bytecode Instruction Listings
● S. H. Park: Understanding JVM Internals
● C. McGlone: Looking "Under the Hood" with javap
● P. Haggar: Java bytecode

Page 67: Lifting The Veil - Reading Java Bytecode

Presentation background

● Alexander Wilms: Hexagons