Lifting The Veil - Reading Java Bytecode

Post on 10-May-2015

Java Bytecode Explained

Lifting The Veil – Reading Java Byte Code

Alexander Shopov

Alexander Shopov

By day: Software Engineer at Cisco
By night: OSS contributor
Coordinator of Bulgarian Gnome TP

Contacts:
E-mail: ash@kambanaria.org
Jabber: al_shopov@jabber.minus273.org
LinkedIn: http://www.linkedin.com/in/alshopov
Google: Just search “al_shopov”

Please Learn And Share

License: CC-BY v3.0
Creative Commons Attribution v3.0

Disclaimer

My opinions, knowledge and experience!

Not my employer's.

Contents

● Why read?
● How to read?
  ● JVM Internals;
  ● JVM Data Types;
  ● JVM Opcodes.
● Let's read some code.
● What next?

Why Read Byte code?

● Understand your platform
● It is interesting and not too hard
● How does Java function? How does X function?
● Job interviews
● Catch compiler bugs/optimizations
● Learn to read before you write
● Source may not correspond to binary
● C/C++ people know their assembler
● Java language evolution vs. Java platform evolution

Bad News And Good News

Bad: We will be reading assembler

Good: Easiest assembler in the world

What Is The JVM?

● Stack-based, byte-oriented virtual machine without registers, easily implementable on 32-bit hardware.

● 206 (<256) instructions that are easy to group; there is no need to remember them all

● Some leeway in implementations (even with Oracle)

Dramatis Personæ

● The JVM
● The threads
● The frames
● The stacks – LIFO
● The local variables – array of slots
● The runtime constant pool – array of values
● The bytecode – the instructions
● Class files – serialized form of constants and bytecode

Enter JVM

JVM OS process

Enter Threads

[Diagram: Thread A, Thread B, Thread C and Thread D]

Enter Frames

[Diagram: Thread A, Thread B, Thread C and Thread D, with four stacks of frames: F0 to F4, F0 to F2, F0 to F1, F0 to F3]

Enter Frames, Really!

[Diagram: four stacks of frames: F0 to F4, F0 to F2, F0 to F1, F0 to F3]

What Is A Frame Actually?

[Diagram: a single frame, F0]

Let's Peek Inside A Frame

[Diagram: frame F0, enlarged]

Enter Local Variables

[Diagram: frame F0 with the local variables, slots 0 1 2 3 4 5 6 …]

Enter Stack

[Diagram: frame F0 with the local variables (slots 0 1 2 3 4 5 6 …) and the stack]

Enter Pool Of Constants

[Diagram: frame F0 with the pool of constants, the local variables (slots 0 1 2 3 4 5 6 …) and the stack]

Where Is The Code?

[Diagram: frame F0 with the pool of constants, the local variables (slots 0 1 2 3 4 5 6 …) and the stack, next to the JVM (heap)]

Where Is The Code?

[Diagram: frame F0 with the pool of constants, the local variables (slots 0 1 2 3 4 5 6 …), the stack and the PC; the method code belongs to the class, which lives in the JVM (heap)]

Where is the code?

[Diagram: frame F0, class and method code with the PC; a local variable holds the value 6]

Load

[Diagram: frame F0, class and method code with the PC; a local variable holds 6 and the value 6 is now on the stack]

And…

[Diagram: frame F0, class and method code with the PC; a local variable holds 6 and the stack shows the values 6 and 8]

Store

[Diagram: frame F0, class and method code with the PC; the local variables now hold 6 and 8, and the stack shows the values 6 and 8]
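The load and store steps pictured above can be mimicked in a few lines of Java. This is only a toy model under stated assumptions, not how any real JVM is built: the class `ToyFrame` and its method names are invented for illustration, it handles only int values, and the constant 2 is chosen here simply so that the numbers 6 and 8 from the slides appear (the slides do not say which instruction produces the 8).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a frame: an array of local variable slots plus an operand stack. */
public class ToyFrame {
    private final int[] locals;                               // local variables, indexed by slot
    private final Deque<Integer> stack = new ArrayDeque<>();  // operand stack (LIFO)

    public ToyFrame(int maxLocals) {
        this.locals = new int[maxLocals];
    }

    public void setLocal(int slot, int value) { locals[slot] = value; }
    public int getLocal(int slot) { return locals[slot]; }
    public int stackDepth() { return stack.size(); }

    /** Like iload: copy a local variable onto the stack. */
    public void iload(int slot) { stack.push(locals[slot]); }

    /** Like bipush: push a small constant onto the stack. */
    public void bipush(int value) { stack.push(value); }

    /** Like iadd: pop two ints and push their sum. */
    public void iadd() { stack.push(stack.pop() + stack.pop()); }

    /** Like istore: pop the top of the stack into a local variable. */
    public void istore(int slot) { locals[slot] = stack.pop(); }

    public static void main(String[] args) {
        ToyFrame f = new ToyFrame(4);
        f.setLocal(0, 6);   // slot 0 holds 6
        f.iload(0);         // stack: 6
        f.bipush(2);        // stack: 6 2
        f.iadd();           // stack: 8
        f.istore(1);        // slot 1 holds 8, the stack is empty again
        System.out.println(f.getLocal(0) + " " + f.getLocal(1)); // prints "6 8"
    }
}
```

Every instruction either moves a value between the local variables and the stack or works purely on the stack, which is why there is no need for registers.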

JVM Datatypes

● Primitive types
  ● Java { numeric – integral: byte (±8), short (±16), int (±32), long (±64), char (+16); floating point: float (±32), double (±64); boolean (int or byte) }
  ● returnAddress – pointers to the opcodes of JVM instructions (used by jsr and ret)
● Reference types
  ● class, array, interface
  ● null

JVM Datatypes Descriptors

Java type     Type descriptor
boolean       Z
char          C
byte          B
short         S
int           I
float         F
long          J
double        D
Object        Ljava/lang/Object;
byte[]        [B
String[][]    [[Ljava/lang/String;
void          V
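The table follows three simple rules: one letter per primitive, a `[` prefix per array dimension, and `L…;` around the slash-separated class name. A minimal sketch of those rules in Java follows; the class `TypeDescriptors` and the method `descriptorOf` are hypothetical names, and `Map.of` assumes Java 9 or later.

```java
import java.util.Map;

public class TypeDescriptors {
    // One-letter descriptors for the primitive types and void, as in the table.
    private static final Map<Class<?>, String> PRIMITIVES = Map.of(
            boolean.class, "Z", char.class, "C", byte.class, "B",
            short.class, "S", int.class, "I", float.class, "F",
            long.class, "J", double.class, "D", void.class, "V");

    /** Builds the descriptor of a type by applying the three rules recursively. */
    public static String descriptorOf(Class<?> type) {
        if (type.isPrimitive()) {
            return PRIMITIVES.get(type);                           // e.g. int -> I
        }
        if (type.isArray()) {
            return "[" + descriptorOf(type.getComponentType());    // one [ per dimension
        }
        return "L" + type.getName().replace('.', '/') + ";";      // class or interface
    }

    public static void main(String[] args) {
        System.out.println(descriptorOf(long.class));        // J
        System.out.println(descriptorOf(Object.class));      // Ljava/lang/Object;
        System.out.println(descriptorOf(byte[].class));      // [B
        System.out.println(descriptorOf(String[][].class));  // [[Ljava/lang/String;
    }
}
```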

JVM Method Descriptors

Source code method declaration       Method descriptor
void m1(int i, double d, float f)    (IDF)V
byte[] m2(String s)                  (Ljava/lang/String;)[B
Object m3(int[][][] i)               ([[[I)Ljava/lang/Object;
boolean[] m4()

JVM Method Descriptors

Source code method declaration       Method descriptor
void m1(int i, double d, float f)    (IDF)V
byte[] m2(String s)                  (Ljava/lang/String;)[B
Object m3(int[][][] i)               ([[[I)Ljava/lang/Object;
boolean[] m4()                       ()[Z
                                     (Ljava/lang/Object;Ljava/lang/Long;)J

JVM Method Descriptors

Source code method declaration       Method descriptor
void m1(int i, double d, float f)    (IDF)V
byte[] m2(String s)                  (Ljava/lang/String;)[B
Object m3(int[][][] i)               ([[[I)Ljava/lang/Object;
boolean[] m4()                       ()[Z
long m5(Object, Long)                (Ljava/lang/Object;Ljava/lang/Long;)J
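Method descriptors can be checked without writing them by hand: the standard `java.lang.invoke.MethodType` class (Java 7 and later) can render one from a return type and parameter types. The sketch below uses it for the five methods in the table; the wrapper class `MethodDescriptors` and its `describe` helper are names made up for this example.

```java
import java.lang.invoke.MethodType;

public class MethodDescriptors {
    /** Renders the JVM method descriptor for the given return and parameter types. */
    public static String describe(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(describe(void.class, int.class, double.class, float.class)); // (IDF)V
        System.out.println(describe(byte[].class, String.class));     // (Ljava/lang/String;)[B
        System.out.println(describe(Object.class, int[][][].class));  // ([[[I)Ljava/lang/Object;
        System.out.println(describe(boolean[].class));                // ()[Z
        System.out.println(describe(long.class, Object.class, Long.class));
        // (Ljava/lang/Object;Ljava/lang/Long;)J
    }
}
```

Parameter names never appear in a descriptor, only the types, in declaration order, followed by the return type.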

206 instructions

DON'T PANIC!

Level 1 – Do Nothing/1

● nop

Level 2 – Load Constants/20

● aconst_null
● iconst_m1, iconst_0, iconst_1, iconst_2, iconst_3, iconst_4, iconst_5
● lconst_0, lconst_1
● fconst_0, fconst_1, fconst_2
● dconst_0, dconst_1
● bipush, sipush – 1, 2 bytes
● ldc, ldc_w, ldc2_w – load from index in constant pool 1,2,2 bytes for index

Level 3 – Load Variables/33

● iload, lload, fload, dload, aload
● iload_0, iload_1, iload_2, iload_3, lload_0, lload_1, lload_2, lload_3, fload_0, fload_1, fload_2, fload_3, dload_0, dload_1, dload_2, dload_3, aload_0, aload_1, aload_2, aload_3
● iaload, laload, faload, daload, aaload, baload, caload, saload – consume reference to array and int index in it

Level 4 – Conversions/15

● i2l, i2f, i2d, l2i, l2f, l2d, f2i, f2l, f2d, d2i, d2l, d2f, i2b, i2c, i2s
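The conversion opcodes correspond to Java casts, and the narrowing ones can silently change a value. The sketch below shows a few of them; the class and method names are invented for this example, and the opcode named in each comment is what javac would typically emit for that cast on a method parameter.

```java
public class Conversions {
    public static byte intToByte(int i)     { return (byte) i; }  // i2b
    public static char intToChar(int i)     { return (char) i; }  // i2c
    public static int longToInt(long l)     { return (int) l; }   // l2i
    public static int doubleToInt(double d) { return (int) d; }   // d2i

    public static void main(String[] args) {
        System.out.println(intToByte(200));             // -56: only the low 8 bits survive
        System.out.println(intToChar(65));              // A
        System.out.println(longToInt(4_294_967_297L));  // 1: only the low 32 bits survive
        System.out.println(doubleToInt(3.99));          // 3: truncates toward zero
        System.out.println(doubleToInt(Double.NaN));    // 0: NaN becomes 0
        System.out.println(doubleToInt(1e20));          // 2147483647: clamps to Integer.MAX_VALUE
    }
}
```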

Level 6 – Maths/37

● iadd, ladd, fadd, dadd, isub, lsub, fsub, dsub, imul, lmul, fmul, dmul, idiv, ldiv, fdiv, ddiv, irem, lrem, frem, drem, ineg, lneg, fneg, dneg, ishl, lshl, ishr, lshr, iushr, lushr, iand, land, ior, lor, ixor, lxor

● iinc - increment local variable #index by signed byte const

Level 7 – Stores/33

● istore, lstore, fstore, dstore, astore, istore_0, istore_1, istore_2, istore_3, lstore_0, lstore_1, lstore_2, lstore_3, fstore_0, fstore_1, fstore_2, fstore_3, dstore_0, dstore_1, dstore_2, dstore_3, astore_0, astore_1, astore_2, astore_3, iastore, lastore, fastore, dastore, aastore, bastore, castore, sastore

Level 8 – No-branch Comparisons/5

● lcmp, fcmpl, fcmpg, dcmpl, dcmpg (beware NaN)
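The "beware NaN" remark is about the only difference between the `l` and `g` variants: when either operand is NaN, dcmpl and fcmpl push -1 while dcmpg and fcmpg push 1. A minimal Java sketch of that behaviour for doubles follows; the class `NanCompare` and its two methods are hypothetical stand-ins written for illustration, not JDK APIs.

```java
public class NanCompare {
    /** Mirrors dcmpg: 1, 0 or -1 for ordered operands, 1 if either operand is NaN. */
    public static int dcmpg(double a, double b) {
        if (a > b) return 1;
        if (a == b) return 0;
        if (a < b) return -1;
        return 1;   // unordered: at least one operand is NaN
    }

    /** Mirrors dcmpl: same as dcmpg, except -1 if either operand is NaN. */
    public static int dcmpl(double a, double b) {
        if (a > b) return 1;
        if (a == b) return 0;
        if (a < b) return -1;
        return -1;  // unordered: at least one operand is NaN
    }

    public static void main(String[] args) {
        double nan = Double.NaN;
        System.out.println(dcmpg(nan, 1.0));  // 1
        System.out.println(dcmpl(nan, 1.0));  // -1
        System.out.println(nan < 1.0);        // false
        System.out.println(nan > 1.0);        // false
        System.out.println(nan == nan);       // false
    }
}
```

Having both variants lets a compiler pick the one that makes a comparison involving NaN come out false, whichever direction the source comparison goes.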

Level 9 – Objects/15

● getstatic, putstatic
● getfield, putfield
● invokevirtual, invokespecial, invokestatic, invokeinterface
● new, newarray, anewarray
● arraylength
● athrow
● checkcast, instanceof (difference is treatment of null)
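The different treatment of null by checkcast and instanceof is visible from plain Java: a cast lets null through, while `instanceof` answers false for it. A small sketch, with class and method names chosen only for this example:

```java
public class NullChecks {
    /** Compiles to instanceof: false for null, whatever the type. */
    public static boolean isString(Object o) {
        return o instanceof String;
    }

    /** Compiles to checkcast: null passes, a wrong type throws ClassCastException. */
    public static String asString(Object o) {
        return (String) o;
    }

    public static void main(String[] args) {
        System.out.println(isString(null));   // false
        System.out.println(asString(null));   // null
        System.out.println(isString("abc"));  // true
        try {
            asString(Integer.valueOf(42));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException for an Integer");
        }
    }
}
```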

Level 10 – Return/6

● ireturn, lreturn, freturn, dreturn, areturn, return

165 of 206

80%

We Have Enough Mana/Resources!

Let's dive into bytecode!

Enter Bytecode

javap – your only true friend now

javap -classpath PATH -p -c -l -s CLASS

Example 1

public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: istore_3
     4: iload_3
     5: iload_2
     6: iadd
     7: istore_3
     8: iload_3
     9: ireturn

public static int whatIsThis(int a, int b, int c) {
    int result = a + b;
    result += c;
    return result;
}

Example 2

public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: iload_2
     4: iadd
     5: ireturn

public static int whatIsThis(int a, int b, int c) {
    return a + b + c;
}

Example 3

public static int whatIsThis(int, float, double);
  Signature: (IFD)I
  Code:
     0: iload_0
     1: i2f
     2: fload_1
     3: fadd
     4: f2d
     5: dload_2
     6: dadd
     7: d2i
     8: ireturn
  LineNumberTable:
    line 6: 0
  LocalVariableTable:
    Start  Length  Slot  Name   Signature
        0       9     0     a   I
        0       9     1     b   F
        0       9     2     c   D

public static int whatIsThis(int a, float b, double c) {
    return (int) (a + b + c);
}

Example 4

public static void main(java.lang.String[]);
  Signature: ([Ljava/lang/String;)V
  Code:
     0: getstatic     #16   // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #22   // String BGOUG
     5: invokevirtual #24   // Method java/io/PrintStream.println:(Ljava/lang/String;)V
     8: return

More verbosity

javap -v -classpath PATH -p -c -l -s CLASS

Example 4

Constant pool:
   #1 = Class      #2       // org/kambanaria/readbytecode/bgoug/Example4
   #2 = Utf8       org/kambanaria/readbytecode/bgoug/Example4
   …
  #16 = Fieldref   #17.#19  // java/lang/System.out:Ljava/io/PrintStream;
   …
  #22 = String     #23      // BGOUG
  #23 = Utf8       BGOUG
  #24 = Methodref  #25.#27  // java/io/PrintStream.println:(Ljava/lang/String;)V
   …

Example 4

public static void main(String[] args) {
    System.out.println("BGOUG");
}

// Hello, BGOUG!

Example 5

public char[] whatIsThis();
  Code:
     0: aload_0
     1: getfield      #12   // Field content:[C
     4: areturn

public static void main(java.lang.String[]);
  Code:
     0: getstatic     #22   // Field java/lang/System.out:Ljava/io/PrintStream;
     3: new           #1    // class org/kambanaria/readbytecode/bgoug/Example5
     6: dup
     7: invokespecial #28   // Method "<init>":()V
    10: invokevirtual #29   // Method whatIsThis:()[C
    13: invokestatic  #31   // Method java/util/Arrays.toString:([C)Ljava/lang/String;
    16: invokevirtual #37   // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    19: return

public char[] whatIsThis() { return content; }

public static void main(String[] args) {
    System.out.println(
        Arrays.toString(
            new Example5().whatIsThis()));
}

Level 11 – Stack/9

● pop: a ➔ (empty)
● pop2: ba ➔ (empty)
● dup: a ➔ aa
● dup_x1: ba ➔ aba
● dup_x2: cba ➔ acba
● dup2: ba ➔ baba
● dup2_x1: cba ➔ bacba
● dup2_x2: dcba ➔ badcba
● swap: ba ➔ ab
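The before and after pictures for the stack instructions can be reproduced with string manipulation. The sketch below is a toy under stated assumptions: the stack is a string whose last character is the top of the stack, every value is treated as a single-slot value (so long and double are left out), and the class `StackOps` with its method names is invented for this example.

```java
public class StackOps {
    /** Copies the top `count` values and inserts the copy `skip` values below them. */
    private static String dupGeneric(String stack, int count, int skip) {
        int n = stack.length();
        String top = stack.substring(n - count);   // the values being duplicated
        int pos = n - count - skip;                // where the copy is inserted
        return stack.substring(0, pos) + top + stack.substring(pos);
    }

    public static String pop(String s)    { return s.substring(0, s.length() - 1); }
    public static String pop2(String s)   { return s.substring(0, s.length() - 2); }
    public static String dup(String s)    { return dupGeneric(s, 1, 0); }
    public static String dupX1(String s)  { return dupGeneric(s, 1, 1); }
    public static String dupX2(String s)  { return dupGeneric(s, 1, 2); }
    public static String dup2(String s)   { return dupGeneric(s, 2, 0); }
    public static String dup2X1(String s) { return dupGeneric(s, 2, 1); }
    public static String dup2X2(String s) { return dupGeneric(s, 2, 2); }

    /** Exchanges the two topmost values. */
    public static String swap(String s) {
        int n = s.length();
        return s.substring(0, n - 2) + s.charAt(n - 1) + s.charAt(n - 2);
    }

    public static void main(String[] args) {
        System.out.println(dup("a"));         // aa
        System.out.println(dupX1("ba"));      // aba
        System.out.println(dupX2("cba"));     // acba
        System.out.println(dup2("ba"));       // baba
        System.out.println(dup2X1("cba"));    // bacba
        System.out.println(dup2X2("dcba"));   // badcba
        System.out.println(swap("ba"));       // ab
    }
}
```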

Example 6

public void whatIsThis(java.lang.String);
  Code:
     0: aload_1
     1: ifnonnull     12
     4: new           #18   // class java/lang/NullPointerException
     7: dup
     8: invokespecial #20   // Method java/lang/NullPointerException."<init>":()V
    11: athrow
    12: aload_0
    13: aload_1
    14: putfield      #21   // Field s:Ljava/lang/String;
    17: return

public void whatIsThis(String s) {
    if (null == s) {
        throw new NullPointerException();
    }
    this.s = s;
}

Level 12 – conditions, branches, loops/19

● ifeq, ifne, iflt, ifge, ifgt, ifle
● if_icmpeq, if_icmpne, if_icmplt, if_icmpge, if_icmpgt, if_icmple
● if_acmpeq, if_acmpne
● ifnull, ifnonnull
● goto, jsr, ret

193 of 206

94%

Example 7

public static int parse(java.lang.String);
  Code:
     0: aload_0
     1: invokestatic  #16   // Method java/lang/Integer.parseInt:(Ljava/lang/String;)I
     4: ireturn
     5: astore_1
     6: iconst_0
     7: ireturn
  Exception table:
     from    to  target type
         0     4     5   Class java/lang/NumberFormatException

public static int parse(String s) {
    try {
        return Integer.parseInt(s);
    } catch (NumberFormatException e) {
        return 0;
    }
}

Example 8

public class org.kambanaria.readbytecode.bgoug.Example8 {
  static final boolean $assertionsDisabled;

  static {};
    Code:
       0: ldc           #1    // class org/kambanaria/readbytecode/bgoug/Example8
       2: invokevirtual #10   // Method java/lang/Class.desiredAssertionStatus:()Z
       5: ifne          12
       8: iconst_1
       9: goto          13
      12: iconst_0
      13: putstatic     #16   // Field $assertionsDisabled:Z
      16: return

public class Example8 {
    private static String repeat(String s) {
        assert s != null;
        return s + s;
    }
}

Example 8

private static java.lang.String repeat(java.lang.String);
  Code:
     0: getstatic     #16   // Field $assertionsDisabled:Z
     3: ifne          18
     6: aload_0
     7: ifnonnull     18
    10: new           #28   // class java/lang/AssertionError
    13: dup
    14: invokespecial #30   // Method java/lang/AssertionError."<init>":()V
    17: athrow
    18: new           #31   // class java/lang/StringBuilder
    21: dup
    22: aload_0
    23: invokestatic  #33   // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
    26: invokespecial #39   // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
    29: aload_0
    30: invokevirtual #42   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    33: invokevirtual #46   // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    36: areturn

Now You Know

Beware Asserts In Public Methods!
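The bytecode of Example 8 shows why: the null check sits behind the `$assertionsDisabled` flag, so it is skipped entirely unless the JVM is started with assertions enabled (`-ea`), which is not the default. One common alternative for public methods is an explicit check that always runs. The sketch below contrasts the two; the class `Repeat` and the choice of `Objects.requireNonNull` are illustrative assumptions, not something the slides prescribe.

```java
import java.util.Objects;

public class Repeat {
    /** Public entry point: the argument check is always executed. */
    public static String repeat(String s) {
        Objects.requireNonNull(s, "s must not be null");
        return s + s;
    }

    /** Guarded only by assert: the check runs only when assertions are enabled. */
    static String repeatWithAssert(String s) {
        assert s != null;
        return s + s;
    }

    public static void main(String[] args) {
        System.out.println(repeat("ab"));   // abab
        try {
            repeat(null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        // Without -ea the next call does not fail and yields "nullnull";
        // with -ea it throws AssertionError instead.
        try {
            System.out.println(repeatWithAssert(null));
        } catch (AssertionError e) {
            System.out.println("assertions are enabled");
        }
    }
}
```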

Further resources

● Oracle: The JVM Specification, Java SE 7 Edition

● A. Arhipov: Java Bytecode For Discriminating Developers

● Wikipedia: Java Bytecode Instruction Listings

● S. H. Park: Understanding JVM Internals

● C. McGlone: Looking "Under the Hood" with javap

● P. Haggar: Java bytecode

Presentation background

● Alexander Wilms: Hexagons