Generics in .NET, C++ and Java

GENERICS Sasha Goldshtein blog.sashag.net | @goldshtn


Implementation details and performance traits of generics in .NET, Java and C++. Presentation for the Jerusalem .NET/C++ User Group by Sasha Goldshtein.

Transcript of Generics in .NET, C++ and Java

Page 1: Generics in .NET, C++ and Java

GENERICS

Sasha Goldshtein

blog.sashag.net | @goldshtn

Page 2

What Is This Talk?

• Understanding how generics are implemented in C++, Java and .NET at the runtime and machine-code level

• Understanding the performance implications and other pros/cons of each mechanism

• We will not learn how to use generics

Page 3

Why Do We Want Them?

• “Pure” object-oriented programming does not always provide a clean and type-safe solution with good performance

• In other words, what’s wrong here?

    public class ArrayList {
        object[] items;
        public void Add(object item) { ... }
        public object ElementAt(int index) { ... }
    }
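One way to see the problem, sketched here in Java rather than C#: an Object-based list accepts any element, every read needs a cast, and primitives are boxed on the way in. The names ObjectListDemo, ObjectList and castFailsAtRuntime are illustrative assumptions, not part of the talk.

```java
import java.util.Arrays;

public class ObjectListDemo {
    // Hypothetical pre-generics container: everything is stored as Object.
    static class ObjectList {
        private Object[] items = new Object[4];
        private int size;

        void add(Object item) {
            if (size == items.length) {
                items = Arrays.copyOf(items, size * 2);
            }
            items[size++] = item;
        }

        Object elementAt(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return items[index];
        }
    }

    // Returns true if reading element 1 as an Integer fails at runtime.
    static boolean castFailsAtRuntime() {
        ObjectList numbers = new ObjectList();
        numbers.add(42);          // the int is boxed into an Integer
        numbers.add("forty-two"); // compiles: nothing restricts the element type
        int first = (Integer) numbers.elementAt(0); // the caller must cast
        try {
            Integer second = (Integer) numbers.elementAt(1);
            return false;
        } catch (ClassCastException e) {
            return first == 42; // the mistake is detected only here
        }
    }

    public static void main(String[] args) {
        System.out.println("cast failed at runtime: " + castFailsAtRuntime());
    }
}
```

The compiler accepts both add calls, so the type error surfaces only when the second element is read back.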

Page 4

The C++ Approach

“Templates, the smart macros from hell.”

• Use parameterized template as a sketch

• No constraints on the original template code

• Everything happens at compile-time

    template <typename RanIt>
    void sort(RanIt begin, RanIt end) {
        …
        if (*begin < *(begin+1)) …
    }

Page 5

C++ Template Definition

    template <typename T>
    class vector {
        T* data;
        int size;
        int cap;
    public:
        vector(int capacity) { ... }
        void push_back(const T& datum) { ... }
        T operator[](int index) const { ... }
    };

Page 6

C++ Template Instantiation

You say:

    vector<int> v(2);

Compiler says:

    class __vector__int__ {
        int* data;
        int size;
        int cap;
    public:
        __vector__int__(int capacity) { ... }
    };

Page 7

C++ Template Instantiation

You say:

    vector<int> v(2);
    v.push_back(42);

Compiler says:

    class __vector__int__ {
        int* data;
        int size;
        int cap;
    public:
        __vector__int__(int capacity) { ... }
        void push_back(const int& datum) { ... }
    };

Page 8

C++ Template Instantiation

You say:

    vector<EmptyClass> v(2);
    sort(v.begin(), v.end());

Compiler says:

    error C2784: 'bool std::operator <(const std::vector<_Ty,_Alloc> &,const std::vector<_Ty,_Alloc> &)' : could not deduce template argument for 'const std::vector<_Ty,_Alloc> &' from 'EmptyClass'
    vector(1724) : see declaration of 'std::operator <'
    templatesstuff.cpp(20) : see reference to function template instantiation 'void sort<std::_Vector_iterator<_Myvec>>(RanIt,RanIt)' being compiled
    with
    [
        _Myvec=std::_Vector_val<std::_Simple_types<EmptyClass>>,
        RanIt=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<EmptyClass>>>
    ]

Page 9

The C++ Approach: Pros and Cons

Pros

• No performance cost

• Very flexible

• Full compile-time type safety

Cons

• Can’t share templates between translation units

• Can’t share templates between libraries (code bloat)

• Can’t reliably export templates from libraries

• No constraints = no readable compiler errors

Page 10

The Java Approach

• Use parameterized template as a compiler aid

• Constraints used to prove things to the compiler

• Erase type arguments at compile time, so they are unavailable at runtime

    public class LinkedList<E> {
        private LinkedList<E> head;
        private E value;
        public void add(E element) { ... }
        public E getAt(int index) { ... }
    }

Page 11

Java Generic Type Erasure

There is just one type (raw type) at runtime:

    public class LinkedList {
        private LinkedList head;
        private Object value;
        public void add(Object element) { ... }
        public Object getAt(int index) { ... }
    }
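Erasure can be observed directly from a running program. The sketch below is a minimal illustration using only standard library calls; the class names ErasureDemo and Box are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    // Hypothetical generic class, reduced to a single field of type E.
    static class Box<E> {
        E value;
    }

    // Two different instantiations report the same runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // Reflection reports the erased type of a field declared as E.
    static Class<?> erasedFieldType() {
        try {
            return Box.class.getDeclaredField("value").getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());  // true
        System.out.println(erasedFieldType());   // class java.lang.Object
    }
}
```

Because the type parameter E has no bound, its erasure is Object, which matches the raw LinkedList shown on the slide.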

Page 12

Java Generic Type Constraints

Cannot use anything but java.lang.Object methods without specifying a constraint (a bounded type parameter):

    public class SortedList<E extends Comparable<E>> {
        ...
        public void add(E element) {
            ...
            if (element.compareTo(other) < 0) ...
        }
    }
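A runnable version of the same idea might look like the sketch below. It assumes an ArrayList as backing storage and a simple linear insertion, neither of which comes from the slide; insertAll is a hypothetical helper added for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;

public class SortedListDemo {
    // The bound lets the compiler accept compareTo on values of type E.
    static class SortedList<E extends Comparable<E>> {
        private final List<E> items = new ArrayList<>();

        void add(E element) {
            int i = 0;
            // Skip every stored element that is less than or equal to the new one.
            while (i < items.size() && items.get(i).compareTo(element) <= 0) {
                i++;
            }
            items.add(i, element);
        }

        E getAt(int index) {
            return items.get(index);
        }

        @Override
        public String toString() {
            return items.toString();
        }
    }

    // Hypothetical helper: inserts the words one by one and renders the list.
    static String insertAll(String... words) {
        SortedList<String> list = new SortedList<>();
        for (String word : words) {
            list.add(word);
        }
        return list.toString();
    }

    public static void main(String[] args) {
        System.out.println(insertAll("pear", "apple", "fig")); // [apple, fig, pear]
        // new SortedList<Object>() would be rejected at compile time,
        // because Object does not implement Comparable<Object>.
    }
}
```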

Page 13

The Java Approach: Pros and Cons

Pros

• Backwards compatible with non-generic Java versions

• Constraint violation results in clear compiler error

• Can share generic types and objects between packages/applications

Cons

• Can’t use generics with primitive types

• Can’t distinguish between generic class instantiations

• Can’t instantiate generic type parameters (“new E”)

• Can’t use the class’s type parameters in static methods or fields
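Two of these limitations have common workarounds, sketched below. Passing a factory in place of “new E” and boxing primitives are general Java idioms rather than anything the talk prescribes, and the names ErasureWorkarounds, Pool and sumBoxed are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class ErasureWorkarounds {
    // "new E()" does not compile, so the caller passes in a factory instead.
    static class Pool<E> {
        private final Supplier<E> factory;

        Pool(Supplier<E> factory) {
            this.factory = factory;
        }

        E create() {
            return factory.get();
        }
    }

    // List<int> does not compile; each element is boxed into an Integer object.
    static int sumBoxed(int count) {
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            values.add(i); // autoboxing: int -> Integer
        }
        int sum = 0;
        for (int v : values) { // auto-unboxing: Integer -> int
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        Pool<StringBuilder> pool = new Pool<>(StringBuilder::new);
        StringBuilder sb = pool.create();
        System.out.println(sb.length());  // 0
        System.out.println(sumBoxed(4));  // 10
    }
}
```

The boxing in sumBoxed is the cost behind the first item in the list: every element is a separate heap object rather than a raw int.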

Page 14

The .NET Approach

• Use parameterized template as a compiler aid and a runtime code generation sketch for the JIT

• Constraints used to prove things to the compiler

    public class List<T> {
        T[] items;
        int size;
        int cap;
        public void Add(T item) { ... }
        public T this[int index] { get { ... } set { ... } }
    }

Page 15

Digression: .NET Object Layout

Page 16

.NET Generic Types at Runtime

• There is a separate type at runtime for each generic instantiation, but not necessarily a separate copy of the methods’ code

• Does this method’s machine code depend on T?

    public void Add(T item) {
        if (size < items.Length - 1) {
            items[size] = item;
            ++size;
        } else
            AllocateAndAddSlow(item);
    }

Page 17

.NET Generic Code Sharing

Page 18

Concrete Example: Stack Push

BasicStack`1[[System.__Canon, mscorlib]].Push(System.__Canon)

    00260360 57          push    edi
    00260361 56          push    esi
    00260362 8b7104      mov     esi,dword ptr [ecx+4]
    00260365 8b7908      mov     edi,dword ptr [ecx+8]
    00260368 8d4701      lea     eax,[edi+1]
    0026036b 894108      mov     dword ptr [ecx+8],eax
    0026036e 52          push    edx
    0026036f 8bce        mov     ecx,esi
    00260371 8bd7        mov     edx,edi
    00260373 e8f4cb3870  call    clr!JIT_Stelem_Ref (705ecf6c)
    00260378 5e          pop     esi
    00260379 5f          pop     edi
    0026037a c3          ret

Page 19

Concrete Example: Stack Push

BasicStack`1[[System.Int32, mscorlib]].Push(Int32)

    002603c0 57          push    edi
    002603c1 56          push    esi
    002603c2 8b7104      mov     esi,dword ptr [ecx+4]
    002603c5 8b7908      mov     edi,dword ptr [ecx+8]
    002603c8 8d4701      lea     eax,[edi+1]
    002603cb 894108      mov     dword ptr [ecx+8],eax
    002603ce 3b7e04      cmp     edi,dword ptr [esi+4]
    002603d1 7307        jae     002603da
    002603d3 8954be08    mov     dword ptr [esi+edi*4+8],edx
    002603d7 5e          pop     esi
    002603d8 5f          pop     edi
    002603d9 c3          ret
    002603da e877446170  call    clr!JIT_RngChkFail (70874856)
    002603df cc          int     3

Page 20

Concrete Example: Stack Push

BasicStack`1[[System.Double, mscorlib]].Push(Double)

    00260420 56          push    esi
    00260421 8b5104      mov     edx,dword ptr [ecx+4]
    00260424 8b7108      mov     esi,dword ptr [ecx+8]
    00260427 8d4601      lea     eax,[esi+1]
    0026042a 894108      mov     dword ptr [ecx+8],eax
    0026042d 3b7204      cmp     esi,dword ptr [edx+4]
    00260430 730c        jae     0026043e
    00260432 dd442408    fld     qword ptr [esp+8]
    00260436 dd5cf208    fstp    qword ptr [edx+esi*8+8]
    0026043a 5e          pop     esi
    0026043b c20800      ret     8
    0026043e e813446170  call    clr!JIT_RngChkFail (70874856)
    00260443 cc          int     3

Page 21

Type-Specific Code

• What about new T[12] or typeof(T).FullName?

• When .NET generic methods need access to T, they get it from the method table (this or hidden parameter)

• …Unless the type parameters are value types, in which case the method table (MT) is hard-coded into the method:

C# (with T=int):

    Foo<T>() { … typeof(T) … }

Machine code:

    mov ecx,offset 798b6844 (MT: System.Int32)
    call clr!JIT_GetRuntimeType (6ca40aa8)

Page 22

Generics and Reflection

• Because generic types are first-class citizens, they are accessible to Reflection at runtime

    Type to = typeof(Dictionary<,>);
    Type tc = to.MakeGenericType(typeof(string), typeof(int)); //Dictionary<string, int>

    to = typeof(List<double>).GetGenericTypeDefinition();
    tc = to.MakeGenericType(typeof(int)); //List<int>

Page 23

Generic Constraints

• .NET constraints restrict type parameters at compile-time, very similar to Java’s

• Only a limited set of constraints available:

    • Interface constraint: where T : IComparable<T>

    • Base constraint: where T : UserControl

    • Category constraint: where T : class or where T : struct

    • Constructor constraint: where T : new()

Note that constraints don’t break the machine code equivalence for reference types. Why?

Page 24

Case Study: IEquatable<T>

    public static void CallEquals<T>(T inst) {
        inst.Equals(inst);
    }

    public struct Point {
        public int X, Y;
        public override bool Equals(object o) {
            if (o is Point) return Equals((Point)o);
            return false;
        }
        public bool Equals(Point pt) { ... }
    }

Page 25

Case Study: IEquatable<T>

• CallEquals has no constraints, so the C# compiler chooses the Object.Equals(Object) virtual method, which boxes the argument when T is a value type

• We can add an interface constraint with a strongly-typed Equals method; now the compiler prefers it

• Note: the interface call has no virtual cost on value types

    public static void CallEquals<T>(T inst) where T : IEquatable<T>
    {
        inst.Equals(inst);
    }

Page 26

Sorting “If Possible”, a la C++

    public class List<T> {
        T[] items;
        ...
        public void Add(T item) { ... }
        public void Sort(SortProvider<T> sorter = null) {
            sorter = sorter ?? SortProvider<T>.GetDefault();
            if (sorter == null)
                throw new NotImplementedException();
            sorter.Sort(items);
        }
    }

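Page 27

Sorting “If Possible”, a la C++

    // Requires: using System;
    public abstract class SortProvider<T> {
        public abstract void Sort(T[] items);
        public static SortProvider<T> GetDefault() {
            // "T is IComparable<T>" is not valid C#; test the type with Reflection instead.
            if (typeof(IComparable<T>).IsAssignableFrom(typeof(T)))
                return Create(typeof(DefaultSortProvider<>));
            if (typeof(IGreaterthanable<T>).IsAssignableFrom(typeof(T)))
                return Create(typeof(GreaterThanSortProvider<>));
            return null;
        }
        // The compiler cannot prove that T satisfies the provider's constraint,
        // so the closed generic type is built and instantiated at runtime.
        static SortProvider<T> Create(Type openType) {
            return (SortProvider<T>)Activator.CreateInstance(
                openType.MakeGenericType(typeof(T)));
        }
    }
    internal class DefaultSortProvider<T> : SortProvider<T> where T : IComparable<T> {
        public override void Sort(T[] items) { /* Use T.CompareTo for sorting */ }
    }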

Page 28

Getting Generic Math Right in .NET

• Pretty nasty:

• Consider Complex<T>: you can’t implement operators…

• Solution sketch:

    • Define ICalculator<T> with methods instead of operators

    • Implement ICalculator<T> for each T

    • Choose between ICalculator<T>’s implementations at runtime, and use them in your generic math code

• For more: http://www.codeproject.com/Articles/8531/Using-generics-for-calculations
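The calculator pattern from the solution sketch can be illustrated as follows. This is a Java rendering rather than the C# of the talk, and every name in it (Calculator, IntCalculator, DoubleCalculator, Complex, times) is an assumption made for the example. For brevity the calculator is passed in explicitly instead of being looked up at runtime.

```java
public class GenericMathDemo {
    // Hypothetical counterpart of the slide's ICalculator<T>:
    // named methods stand in for the operators that T cannot promise.
    interface Calculator<T> {
        T add(T a, T b);
        T subtract(T a, T b);
        T multiply(T a, T b);
    }

    // One implementation per numeric type.
    static final class IntCalculator implements Calculator<Integer> {
        public Integer add(Integer a, Integer b) { return a + b; }
        public Integer subtract(Integer a, Integer b) { return a - b; }
        public Integer multiply(Integer a, Integer b) { return a * b; }
    }

    static final class DoubleCalculator implements Calculator<Double> {
        public Double add(Double a, Double b) { return a + b; }
        public Double subtract(Double a, Double b) { return a - b; }
        public Double multiply(Double a, Double b) { return a * b; }
    }

    // Generic math code talks only to the calculator, never to operators on T.
    static final class Complex<T> {
        final T re;
        final T im;
        private final Calculator<T> calc;

        Complex(T re, T im, Calculator<T> calc) {
            this.re = re;
            this.im = im;
            this.calc = calc;
        }

        // (a + bi)(c + di) = (ac - bd) + (ad + bc)i
        Complex<T> times(Complex<T> other) {
            T real = calc.subtract(calc.multiply(re, other.re),
                                   calc.multiply(im, other.im));
            T imag = calc.add(calc.multiply(re, other.im),
                              calc.multiply(im, other.re));
            return new Complex<>(real, imag, calc);
        }
    }

    public static void main(String[] args) {
        Calculator<Integer> calc = new IntCalculator();
        Complex<Integer> product =
            new Complex<>(1, 2, calc).times(new Complex<>(3, 4, calc));
        System.out.println(product.re + " + " + product.im + "i"); // -5 + 10i
    }
}
```

Every arithmetic step goes through an interface call, which is the price of expressing math without operator constraints.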

Page 29

The .NET Approach: Pros and Cons

Pros

• Constraint violation results in clear compiler error

• Can share generic types and objects between assemblies/applications

• Can use generics efficiently with value types

• Can use Reflection to query over generic types

Cons

• Constraints are not enough for everything (e.g., generic math)

• No meta-programming abilities (advantage?)

Page 30

QUESTIONS?

Sasha Goldshtein

blog.sashag.net | @goldshtn