Lifting The Veil - Reading Java Bytecode During Lunchtime

Post on 10-May-2015


Lifting The Veil – Reading Java Byte Code During Lunchtime

Alexander Shopov
Cisco Lunch&Learn

Alexander Shopov

By day: Software Engineer at Cisco
By night: OSS contributor
Coordinator of Bulgarian Gnome TP

Contacts:
E-mail: ash@kambanaria.org
Jabber: al_shopov@jabber.minus273.org
LinkedIn: http://www.linkedin.com/in/alshopov
Google: Just search “al_shopov”

Please Learn And Share

License: CC-BY v3.0
Creative Commons Attribution v3.0

Disclaimer

My opinions, knowledge and experience!

Not my employer's.

Contents

● Why read?
● How to read?
  ● JVM Internals;
  ● JVM Data Types;
  ● JVM Opcodes.
● Let's read some code.
● What next?

Why Read Byte code?

● Understand your platform
● It is interesting and not too hard
● How does Java function? How does X function?
● Job interviews
● Catch compiler bugs/optimizations
● Learn to read before you write
● Source may not correspond to binary
● C/C++ people know their assembler
● Java language evolution vs. Java platform evolution

Bad News And Good News

Bad: We will be reading assembler
Good: Easiest assembler in the world

What Is The JVM?

● Stack-based, byte-oriented virtual machine without registers, easily implementable on 32-bit hardware.

● 206 (<256) instructions that are easy to group; there is no need to remember them all

● Some leeway in implementations (even with Oracle)

Dramatis Personæ

● The JVM
● The threads
● The frames
● The stacks – LIFO
● The local variables – array of slots
● The runtime constant pool – array of values
● The bytecode – the instructions
● Class files – serialized form of constants and bytecode

Enter JVM

JVM OS process

Enter Threads

[Diagram: the JVM process runs four threads: Thread A, Thread B, Thread C and Thread D]

Enter Frames

[Diagram: threads A to D, each with its own stack of frames; the four stacks hold frames F0 to F4, F0 to F2, F0 to F1 and F0 to F3]

Enter Frames, Really!

[Diagram: the same four stacks of frames, shown without the threads: F0 to F4, F0 to F2, F0 to F1 and F0 to F3]

What Is A Frame Actually?

[Diagram: a single frame, F0]

Let's Peek Inside A Frame

[Diagram: frame F0]

Enter Local Variables

[Diagram: frame F0 with its local variables, an array of slots numbered 0, 1, 2, 3, 4, 5, 6, …]

Enter Stack

[Diagram: frame F0 with its local variables (slots 0, 1, 2, …) and its stack]

Enter Pool Of Constants

[Diagram: frame F0 with the pool of constants, the local variables (slots 0, 1, 2, …) and the stack]

Where Is The Code?

[Diagram, built up over three slides: frame F0 with its pool of constants, local variables (slots 0, 1, 2, …) and stack, placed inside the JVM (heap); the last slide adds the class with its method code and the PC]

Load

[Diagram: same layout as before (class, method code, PC, pool of constants, local variables, stack). A local variable holds 6, and Load puts a copy of it on the stack.]

And…

[Diagram: the stack now shows 8 above 6.]

Store

[Diagram: the local variables show 6 and 8.]

JVM Datatypes

● Primitive types
  ● Java { numeric – integral: byte (±8), short (±16), int (±32), long (±64), char (+16); floating point: float (±32), double (±64); boolean (int or byte) }
  ● returnAddress – pointers to the opcodes of JVM (jumps - loops)
● Reference types
  ● class, array, interface
  ● null

JVM Datatypes Descriptors

Java type    Type descriptor
boolean      Z
char         C
byte         B
short        S
int          I
float        F
long         J
double       D
Object       Ljava/lang/Object;
byte[]       [B
String[][]   [[Ljava/lang/String;
void         V

JVM Method Descriptors

Source code method declaration       Method descriptor
void m1(int i, double d, float f)    (IDF)V
byte[] m2(String s)                  (Ljava/lang/String;)[B
Object m3(int[][][] i)               ([[[I)Ljava/lang/Object;
boolean[] m4()                       ()[Z
long m5(Object, Long)                (Ljava/lang/Object;Ljava/lang/Long;)J
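The descriptors for m1 to m5 can be cross-checked from Java itself. The sketch below is not part of the original slides; it uses the standard `java.lang.invoke.MethodType` API, and the class name `DescriptorDemo` and the helper `descriptor` are made up for illustration. Note that `boolean[]` comes out as `[Z`, since `B` is reserved for `byte`.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper class, not from the slides.
class DescriptorDemo {
    // Builds the JVM method descriptor for a return type and its parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void m1(int i, double d, float f)
        System.out.println(descriptor(void.class, int.class, double.class, float.class));
        // byte[] m2(String s)
        System.out.println(descriptor(byte[].class, String.class));
        // Object m3(int[][][] i)
        System.out.println(descriptor(Object.class, int[][][].class));
        // boolean[] m4()
        System.out.println(descriptor(boolean[].class));
        // long m5(Object, Long)
        System.out.println(descriptor(long.class, Object.class, Long.class));
    }
}
```

Running `main` prints one descriptor per line, in the order m1 to m5.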

206 instructions

DON'T PANIC!

Level 1 – Do Nothing/1

● nop

Level 2 – Load Constants/20

● aconst_null
● iconst_m1, iconst_0, iconst_1, iconst_2, iconst_3, iconst_4, iconst_5
● lconst_0, lconst_1
● fconst_0, fconst_1, fconst_2
● dconst_0, dconst_1
● bipush, sipush – 1, 2 bytes
● ldc, ldc_w, ldc2_w – load from index in constant pool, 1, 2, 2 bytes for index

Level 3 – Load Variables/33

● iload, lload, fload, dload, aload
● iload_0, iload_1, iload_2, iload_3, lload_0, lload_1, lload_2, lload_3, fload_0, fload_1, fload_2, fload_3, dload_0, dload_1, dload_2, dload_3, aload_0, aload_1, aload_2, aload_3
● iaload, laload, faload, daload, aaload, baload, caload, saload – consume reference to array and int index in it

Level 4 – Conversions/15

● i2l, i2f, i2d, l2i, l2f, l2d, f2i, f2l, f2d, d2i, d2l, d2f, i2b, i2c, i2s
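Each Java cast between primitive types maps onto one of these conversion instructions. The following is a minimal sketch, not from the slides, with a made-up class name `ConversionDemo`; each helper is named after the instruction its cast is expected to compile to. The values are passed as parameters so that the compiler cannot fold the casts away as constants.

```java
// Hypothetical demo class; the method names mirror the opcodes they illustrate.
class ConversionDemo {
    static byte i2b(int i) { return (byte) i; }      // keeps the low 8 bits, sign-extended
    static char i2c(int i) { return (char) i; }      // keeps the low 16 bits, zero-extended
    static short i2s(int i) { return (short) i; }    // keeps the low 16 bits, sign-extended
    static float i2f(int i) { return (float) i; }    // may lose precision above 2^24
    static int d2i(double d) { return (int) d; }     // NaN becomes 0, out-of-range values saturate

    public static void main(String[] args) {
        System.out.println(i2b(200));            // -56
        System.out.println((int) i2c(-1));       // 65535
        System.out.println(i2s(70000));          // 4464
        System.out.println((int) i2f(16777217)); // 16777216
        System.out.println(d2i(Double.NaN));     // 0
        System.out.println(d2i(1e10));           // 2147483647
    }
}
```

Running `javap -c` on the compiled class is a quick way to confirm which instruction each cast actually produced on a given compiler.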

Level 6 – Maths/37

● iadd, ladd, fadd, dadd, isub, lsub, fsub, dsub, imul, lmul, fmul, dmul, idiv, ldiv, fdiv, ddiv, irem, lrem, frem, drem, ineg, lneg, fneg, dneg, ishl, lshl, ishr, lshr, iushr, lushr, iand, land, ior, lor, ixor, lxor

● iinc - increment local variable #index by signed byte const

Level 7 – Stores/33

● istore, lstore, fstore, dstore, astore, istore_0, istore_1, istore_2, istore_3, lstore_0, lstore_1, lstore_2, lstore_3, fstore_0, fstore_1, fstore_2, fstore_3, dstore_0, dstore_1, dstore_2, dstore_3, astore_0, astore_1, astore_2, astore_3, iastore, lastore, fastore, dastore, aastore, bastore, castore, sastore

Level 8 – No-branch Comparisons/5

● lcmp, fcmpl, fcmpg, dcmpl, dcmpg (beware NaN)
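The NaN warning comes from the only difference between the `l` and `g` variants: what they push when an operand is NaN. The sketch below is a hand-written Java model of the two float instructions, based on their description in the JVM specification. The class `CompareDemo` and its methods are illustrative names, not JVM APIs. Compilers are generally expected to pick the variant that makes a comparison with NaN come out false.

```java
// Hypothetical model of the two float comparison opcodes.
class CompareDemo {
    // Model of fcmpl: pushes -1, 0 or 1, and -1 when either operand is NaN.
    static int fcmpl(float a, float b) {
        if (Float.isNaN(a) || Float.isNaN(b)) return -1;
        return a > b ? 1 : (a == b ? 0 : -1);
    }

    // Model of fcmpg: same results, except 1 when either operand is NaN.
    static int fcmpg(float a, float b) {
        if (Float.isNaN(a) || Float.isNaN(b)) return 1;
        return a > b ? 1 : (a == b ? 0 : -1);
    }

    public static void main(String[] args) {
        System.out.println(fcmpl(1f, 2f));          // -1
        System.out.println(fcmpg(1f, 2f));          // -1
        System.out.println(fcmpl(Float.NaN, 2f));   // -1
        System.out.println(fcmpg(Float.NaN, 2f));   // 1
        System.out.println(Float.NaN < 2f);         // false
        System.out.println(Float.NaN > 2f);         // false
    }
}
```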

Level 9 – Objects/15

● getstatic, putstatic
● getfield, putfield
● invokevirtual, invokespecial, invokestatic, invokeinterface
● new, newarray, anewarray
● arraylength
● athrow
● checkcast, instanceof (difference is treatment of null)
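The different treatment of null by checkcast and instanceof is visible from plain Java. A small sketch, with the illustrative class name `NullCheckDemo` (not from the slides): a cast lets null through, while `instanceof` answers false for it.

```java
// Hypothetical demo class for the null behaviour of casts and instanceof.
class NullCheckDemo {
    // Compiles to an instanceof instruction: null is never an instance of anything.
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // Compiles to a checkcast instruction: null passes the check unchanged.
    static String asString(Object o) {
        return (String) o;
    }

    public static void main(String[] args) {
        System.out.println(isString(null));   // false
        System.out.println(asString(null));   // null
        System.out.println(isString("text")); // true
        try {
            asString(Integer.valueOf(42));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException for a non-null Integer");
        }
    }
}
```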

Level 10 – Return/6

● ireturn, lreturn, freturn, dreturn, areturn, return

165 of 206

80%

We Have Enough Mana/Resources!

Let's dive in bytecode!

Enter Bytecode

javap – your only true friend now

javap -classpath PATH -p -c -l -s CLASS

Example 1

public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: istore_3
     4: iload_3
     5: iload_2
     6: iadd
     7: istore_3
     8: iload_3
     9: ireturn

[Animation: frame F0 steps through the code with arguments 3, 7 and 4 in local variables 0, 1 and 2. State after each instruction, stack shown top first:]

PC  Instruction  Local variables 0..3  Stack
 0  iload_0      3 7 4 -               3
 1  iload_1      3 7 4 -               7 3
 2  iadd         3 7 4 -               10
 3  istore_3     3 7 4 10              (empty)
 4  iload_3      3 7 4 10              10
 5  iload_2      3 7 4 10              4 10
 6  iadd         3 7 4 10              14
 7  istore_3     3 7 4 14              (empty)
 8  iload_3      3 7 4 14              14
 9  ireturn      returns 14

Example 1

public static int whatIsThis(int a, int b, int c) {
  int result = a + b;
  result += c;
  return result;
}
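To make the interplay of local variables and stack concrete, here is a minimal sketch of an interpreter for just the ten instructions of Example 1. It is an illustration written for this transcript, not how a real JVM is implemented; the class name `Example1Trace` and the string-based instruction encoding are assumptions made for readability.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical toy interpreter covering only the opcodes used in Example 1.
class Example1Trace {
    static int run(int a, int b, int c) {
        int[] locals = {a, b, c, 0};               // slots 0..3; static method, so no "this"
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack, head is the top
        String[] code = {"iload_0", "iload_1", "iadd", "istore_3", "iload_3",
                         "iload_2", "iadd", "istore_3", "iload_3", "ireturn"};
        // Every instruction here is one byte long, so the array index equals the PC.
        for (int pc = 0; pc < code.length; pc++) {
            String op = code[pc];
            if (op.startsWith("iload_")) {
                stack.push(locals[op.charAt(6) - '0']);     // copy a slot onto the stack
            } else if (op.startsWith("istore_")) {
                locals[op.charAt(7) - '0'] = stack.pop();   // move the top into a slot
            } else if (op.equals("iadd")) {
                stack.push(stack.pop() + stack.pop());      // replace two ints by their sum
            } else if (op.equals("ireturn")) {
                return stack.pop();                         // hand the top back to the caller
            }
            System.out.println(pc + ": " + op + "  locals=" + Arrays.toString(locals)
                    + "  stack=" + stack);
        }
        throw new IllegalStateException("fell off the end of the code");
    }

    public static void main(String[] args) {
        System.out.println("result = " + run(3, 7, 4)); // result = 14
    }
}
```

With the arguments 3, 7 and 4 the printed states follow the same sequence as the slides, ending with 14.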

Example 2

public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: iload_2
     4: iadd
     5: ireturn

public static int whatIsThis(int a, int b, int c) {
  return a + b + c;
}

Example 3

public static int whatIsThis(int, float, double);
  Signature: (IFD)I
  Code:
     0: iload_0
     1: i2f
     2: fload_1
     3: fadd
     4: f2d
     5: dload_2
     6: dadd
     7: d2i
     8: ireturn
  LineNumberTable:
    line 6: 0
  LocalVariableTable:
    Start  Length  Slot  Name  Signature
        0       9     0     a  I
        0       9     1     b  F
        0       9     2     c  D

public static int whatIsThis(int a, float b, double c) {
  return (int) (a + b + c);
}

Example 4

public static void main(java.lang.String[]);
  Code:
     0: getstatic     #16  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #22  // String There
     5: invokevirtual #24  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
     8: return

More verbosity

javap -v -classpath PATH -p -c -l -s CLASS

Example 4

Constant pool:
   #1 = Class        #2       // org/kambanaria/readbytecode/bgoug/Example4
   #2 = Utf8         org/kambanaria/readbytecode/bgoug/Example4
  #16 = Fieldref     #17.#19  // java/lang/System.out:Ljava/io/PrintStream;
  #17 = Class        #18      // java/lang/System
  #18 = Utf8         java/lang/System
  #19 = NameAndType  #20:#21  // out:Ljava/io/PrintStream;
  #20 = Utf8         out
  #21 = Utf8         Ljava/io/PrintStream;
  #22 = String       #23      // There
  #23 = Utf8         There
  #24 = Methodref    #25.#27  // java/io/PrintStream.println:(Ljava/lang/String;)V

Example 4

public static void main(String[] args) {
  System.out.println("There");
}

// Hello There!

Example 4

   0: getstatic #16        getstatic = 0xb2, 16 = 0x00 10
   3: ldc #22              ldc = 0x12, 22 = 0x16
   5: invokevirtual #24    invokevirtual = 0xb6, 24 = 0x00 18
   8: return               return = 0xb1

b2 00 10 12 16 b6 00 18 b1

od -t x1 Example4.class | tail -6
0001000 00 0e 00 0f 00 01 00 07 00 00 00 37 00 02 00 01
0001020 00 00 00 09 b2 00 10 12 16 b6 00 18 b1 00 00 00
0001040 02 00 0a 00 00 00 0a 00 02 00 00 00 07 00 08 00
0001060 08 00 0b 00 00 00 0c 00 01 00 00 00 09 00 1e 00
0001100 1f 00 00 00 01 00 20 00 00 00 02 00 21
0001115
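Going from those nine bytes back to mnemonics is mechanical. The sketch below, a toy written for this transcript, decodes only the four opcodes that occur in Example 4, using the opcode values quoted on the slide. The class name `TinyDisassembler` is made up, and a real tool such as javap also handles the other opcodes, wide forms and the constant pool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical four-opcode disassembler; anything else is rejected.
class TinyDisassembler {
    static List<String> disassemble(int[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc];
            switch (opcode) {
                case 0xb2: // getstatic, followed by a two-byte constant pool index
                    lines.add(pc + ": getstatic #" + ((code[pc + 1] << 8) | code[pc + 2]));
                    pc += 3;
                    break;
                case 0x12: // ldc, followed by a one-byte constant pool index
                    lines.add(pc + ": ldc #" + code[pc + 1]);
                    pc += 2;
                    break;
                case 0xb6: // invokevirtual, followed by a two-byte constant pool index
                    lines.add(pc + ": invokevirtual #" + ((code[pc + 1] << 8) | code[pc + 2]));
                    pc += 3;
                    break;
                case 0xb1: // return, no operands
                    lines.add(pc + ": return");
                    pc += 1;
                    break;
                default:
                    throw new IllegalArgumentException(
                            "opcode not handled in this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        int[] code = {0xb2, 0x00, 0x10, 0x12, 0x16, 0xb6, 0x00, 0x18, 0xb1};
        for (String line : disassemble(code)) {
            System.out.println(line);
        }
    }
}
```

For the bytes above this prints the four instructions at offsets 0, 3, 5 and 8 with the indexes #16, #22 and #24.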

Example 5

public char[] whatIsThis();
  Code:
     0: aload_0
     1: getfield      #12  // Field content:[C
     4: areturn

public static void main(java.lang.String[]);
  Code:
     0: getstatic     #22  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: new           #1   // class org/kambanaria/readbytecode/bgoug/Example5
     6: dup
     7: invokespecial #28  // Method "<init>":()V
    10: invokevirtual #29  // Method whatIsThis:()[C
    13: invokestatic  #31  // Method java/util/Arrays.toString:([C)Ljava/lang/String;
    16: invokevirtual #37  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    19: return

public char[] whatIsThis() {
  return this.content;
}

public static void main(String[] args) {
  System.out.println(Arrays.toString(new Example5().whatIsThis()));
}

Level 11 – Stack/9

● pop: a ➔ (empty)
● pop2: ba ➔ (empty)
● dup: a ➔ aa
● dup_x1: ba ➔ aba
● dup_x2: cba ➔ acba
● dup2: ba ➔ baba
● dup2_x1: cba ➔ bacba
● dup2_x2: dcba ➔ badcba
● swap: ba ➔ ab
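The before/after notation can be replayed with a list. The sketch below, a toy with the invented class name `StackOpsDemo`, models a few of the stack instructions on a string whose rightmost character is the top of the stack, as in the notation above. It treats every letter as a one-slot value and so ignores the special handling of long and double.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of some stack opcodes; top of the stack is the rightmost char.
class StackOpsDemo {
    static String apply(String op, String stack) {
        List<Character> s = new ArrayList<>();
        for (char ch : stack.toCharArray()) {
            s.add(ch);
        }
        int top = s.size() - 1;
        switch (op) {
            case "pop":    s.remove(top); break;                    // drop the top
            case "dup":    s.add(s.get(top)); break;                // copy the top
            case "dup_x1": s.add(top - 1, s.get(top)); break;       // copy the top below the second value
            case "dup_x2": s.add(top - 2, s.get(top)); break;       // copy the top below the third value
            case "swap":   s.add(top - 1, s.remove(top)); break;    // exchange the two top values
            default:
                throw new IllegalArgumentException("not modelled in this sketch: " + op);
        }
        StringBuilder result = new StringBuilder();
        for (char ch : s) {
            result.append(ch);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(apply("dup", "a"));      // aa
        System.out.println(apply("dup_x1", "ba"));  // aba
        System.out.println(apply("dup_x2", "cba")); // acba
        System.out.println(apply("swap", "ba"));    // ab
    }
}
```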

Example 6

public void whatIsThis(java.lang.String);
  Code:
     0: aload_1
     1: ifnonnull     12
     4: new           #18  // class java/lang/NullPointerException
     7: dup
     8: invokespecial #20  // Method java/lang/NullPointerException."<init>":()V
    11: athrow
    12: aload_0
    13: aload_1
    14: putfield      #21  // Field s:Ljava/lang/String;
    17: return

public void whatIsThis(String s) {
  if (null == s) {
    throw new NullPointerException();
  }
  this.s = s;
}

Level 12 – conditions, branches, loops/19

● ifeq, ifne, iflt, ifge, ifgt, ifle
● if_icmpeq, if_icmpne, if_icmplt, if_icmpge, if_icmpgt, if_icmple
● if_acmpeq, if_acmpne
● ifnull, ifnonnull
● goto, jsr, ret

193 of 206

94%

Example 7

public static int parse(java.lang.String);
  Code:
     0: aload_0
     1: invokestatic  #16  // Method java/lang/Integer.parseInt:(Ljava/lang/String;)I
     4: ireturn
     5: astore_1
     6: iconst_0
     7: ireturn
  Exception table:
     from    to  target  type
        0     4       5  Class java/lang/NumberFormatException

public static int parse(String s) {
  try {
    return Integer.parseInt(s);
  } catch (NumberFormatException e) {
    return 0;
  }
}

Example 8

public class org.kambanaria.readbytecode.bgoug.Example8 {
  static final boolean $assertionsDisabled;

  static {};
    Code:
       0: ldc           #1   // class org/kambanaria/readbytecode/bgoug/Example8
       2: invokevirtual #10  // Method java/lang/Class.desiredAssertionStatus:()Z
       5: ifne          12
       8: iconst_1
       9: goto          13
      12: iconst_0
      13: putstatic     #16  // Field $assertionsDisabled:Z
      16: return

public class Example8 {
  private static String repeat(String s) {
    assert s != null;
    return s + s;
  }
}

Example 8

  private static java.lang.String repeat(java.lang.String);
    Code:
       0: getstatic     #16  // Field $assertionsDisabled:Z
       3: ifne          18
       6: aload_0
       7: ifnonnull     18
      10: new           #28  // class java/lang/AssertionError
      13: dup
      14: invokespecial #30  // Method java/lang/AssertionError."<init>":()V
      17: athrow
      18: new           #31  // class java/lang/StringBuilder
      21: dup
      22: aload_0
      23: invokestatic  #33  // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
      26: invokespecial #39  // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
      29: aload_0
      30: invokevirtual #42  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      33: invokevirtual #46  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      36: areturn
}

Now You Know

Beware Asserts In Public Methods!
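The bytecode of Example 8 shows why: the whole check is skipped whenever `$assertionsDisabled` is true, which is the default unless the JVM is started with `-ea`. A small sketch to observe this, using the made-up class name `AssertDemo` and a public variant of `repeat` (the original method is private):

```java
// Hypothetical demo class; repeat mirrors the method from Example 8.
class AssertDemo {
    static String repeat(String s) {
        assert s != null; // only checked when assertions are enabled for this class
        return s + s;
    }

    // Reports what happens for a given argument instead of letting the error escape.
    static String tryRepeat(String s) {
        try {
            return repeat(s);
        } catch (AssertionError e) {
            return "AssertionError";
        }
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + AssertDemo.class.desiredAssertionStatus());
        System.out.println(tryRepeat(null));
    }
}
```

Run with plain `java`, the second line is expected to be `nullnull`, because the null argument slips through; run with `java -ea`, it is expected to be `AssertionError`. A public method that relies on an assert for argument validation therefore validates nothing in a default production setup.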

Example 9

package org.kambanaria.readbytecode.bgoug;

public class Example9 {
  public class Inner {}

  public static void main(String[] args) throws Exception {
    Example9 exmpl = Example9.class.newInstance();
    Inner innr = Inner.class.newInstance();
  }
}

java -cp bin/ org.kambanaria.readbytecode.bgoug.Example9
Exception in thread "main" java.lang.InstantiationException: org.kambanaria.readbytecode.bgoug.Example9$Inner
    at java.lang.Class.newInstance0(Class.java:357)
    at java.lang.Class.newInstance(Class.java:325)
    at org.kambanaria.readbytecode.bgoug.Example9.main(Example9.java:9)

Example 9

public class org.kambanaria.readbytecode.bgoug.Example9 {
  public OKRB.Example9();
    Code:
       0: aload_0
       1: invokespecial #8   // Method java/lang/Object."<init>":()V
       4: return
  …
}

public class org.kambanaria.readbytecode.bgoug.Example9$Inner {
  final OKRB.Example9 this$0;
  public OKRB.Example9$Inner(OKRB.Example9);
    Code:
       0: aload_0
       1: aload_1
       2: putfield      #10  // Field this$0:Lorg/kambanaria/readbytecode/bgoug/Example9;
       5: aload_0
       6: invokespecial #12  // Method java/lang/Object."<init>":()V
       9: return
}

Example 9

package org.kambanaria.readbytecode.bgoug;

public class Example9 {
  public class Inner {}

  public static void main(String[] args) throws Exception {
    Example9 exmpl = new Example9();
    Inner innr = exmpl.new Inner();
  }
}
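The bytecode explains the failure: the compiled constructor of the inner class takes the enclosing instance as a hidden parameter, so there is no no-argument constructor for `Class.newInstance()` to call. If reflection is really needed, the constructor can be looked up with that parameter. A minimal sketch, with the illustrative names `Outer` and `Inner` standing in for Example9 and its inner class:

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for Example9 with its non-static inner class.
class Outer {
    public class Inner {}

    // Creates an Inner reflectively by passing the enclosing instance explicitly.
    static Inner newInnerReflectively(Outer outer) throws ReflectiveOperationException {
        Constructor<Inner> ctor = Inner.class.getDeclaredConstructor(Outer.class);
        return ctor.newInstance(outer);
    }

    public static void main(String[] args) throws Exception {
        Outer outer = new Outer();
        Inner inner = newInnerReflectively(outer);
        System.out.println("created: " + (inner != null));
        System.out.println("constructor parameters: "
                + Inner.class.getDeclaredConstructor(Outer.class).getParameterCount());
    }
}
```

In ordinary code the `exmpl.new Inner()` form from the slide remains the simpler choice; the reflective lookup only makes the hidden parameter visible.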

Further resources

● Oracle: The JVM Specification, Java SE 7 Edition
● A. Arhipov: Java Bytecode For Discriminating Developers
● Wikipedia: Java Bytecode Instruction Listings
● S. H. Park: Understanding JVM Internals
● C. McGlone: Looking "Under the Hood" with javap
● P. Haggar: Java bytecode
● C. Nutter: JVM Bytecode for Dummies

Presentation background

● Alexander Wilms: Hexagons