# Saturday, 31 May 2008
Introducing CallerID Part 1

In the Java API there are a number of APIs that rely on being able to walk up the call stack to determine certain properties of the immediate caller of the API. The most common scenario is determining the caller's class or class loader. This information is used to implement security checks or to use the right class loader.

Up until now IKVM used the System.Diagnostics.StackFrame .NET API to walk the stack. There are however two major issues with this API. The first issue is that it is unreliable and IKVM jumps through various hoops to make sure that the CLR doesn't inline methods it shouldn't inline or use tail call optimizations it shouldn't use. The second issue is that it is relatively slow.

I finally got around to implementing an alternative scheme that eliminates the usage of System.Diagnostics.StackFrame in most cases. In part 2 I'll describe the implementation, but for now let's just look at some performance numbers.

I've cooked up two microbenchmarks that clearly demonstrate the difference between the current approach and the new approach.


In this microbenchmark we look at the Class.forName() API. It walks the call stack to determine the class loader of the immediate caller, because that is the class loader that it needs to use to load the class.

class ForName {
  public static void main(String[] args) throws Exception {
    long start = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++)
    long end = System.currentTimeMillis();
    System.out.println(end - start);

ForName results:

  x86 x64
IKVM 0.37.3063             4350     3959
IKVM 0.37.3073 45 49
JDK 1.6 138 97

(The difference between JDK 1.6 x86 and x64 is mostly due to x86 defaulting to HotSpot Client and x64 to HotSpot Server.)


The Method.invoke() API uses the caller class to perform the required access checks when you are calling a non-public method or a method in non-public class.

class Invoke {
  public static void main(String[] args) throws Exception {
    java.lang.reflect.Method m = Invoke.class.getDeclaredMethod("foo");
    long start = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++)
    long end = System.currentTimeMillis();
    System.out.println(end - start);

  private static void foo() { }

Invoke results:

  x86 x64
IKVM 0.37.3063             6083     5572
IKVM 0.37.3073 14 16
JDK 1.6 82 39

To be continued in part 2...

Saturday, 31 May 2008 17:20:40 (W. Europe Daylight Time, UTC+02:00)  #    Comments [2]
# Friday, 23 May 2008
Invokedynamic Proof of Concept Part 2

Rémi Forax inspired me to make my invokedynamic proof of concept a little more real, so I implemented it (still only as a proof of concept, not production quality code) in an IKVM 0.37 fork.

I used the following Java test case:

interface java_dyn_Dynamic
  void printSubstring(int startIndex, int length);
  int length();

class InvokeDynamic
      java.lang.reflect.Method m = InvokeDynamic.class.getDeclaredMethod("bootstrapInvokeDynamic",
                                     java.dyn.CallSite.class, Object.class, Object[].class);
      java.dyn.MethodHandle mh = java.dyn.MethodHandles.unreflect(m);
      java.dyn.Linkage.registerBootstrapMethod(InvokeDynamic.class, mh);
    catch (Exception x)

  public static void main(String[] args)
    java_dyn_Dynamic obj = (java_dyn_Dynamic)(Object)"invokedynamic";
    for (int i = 0; i < 3; i++)
      System.out.println("len = " + obj.length());
      obj.printSubstring(3, 2);

  public static void printSubstring(String str, int startIndex, int length)
    System.out.println(str.substring(startIndex, startIndex + length));

  private static Object bootstrapInvokeDynamic(java.dyn.CallSite cs, Object receiver, Object[] args)
    throws Exception
    if (cs.getStaticContext().getName().equals("length"))
      java.lang.reflect.Method m = String.class.getMethod("length");
      return m.invoke(receiver);
      Class c = cs.getStaticContext().getCallerClass();
      java.lang.reflect.Method m = c.getMethod("printSubstring", String.class, int.class, int.class);
      return m.invoke(null, receiver, args[0], args[1]);

After compiling this with javac I used a hex editor to patch the file (changed java_dyn_Dynamic to java/dyn/Dynamic and C0 00 03 to 00 00 00 to remove the checkcast instruction).

Current limitations:

  • java.dyn.MethodType is not implemented.
  • MethodHandles can only be created on methods with a couple of arguments and from the primitive types only int can be used as an argument or return type.
  • Not all of the IKVM specific corner cases are supported (e.g. calling .NET methods or methods on remapped types).
  • The CallSite and MethodHandle implementation objects do not have a proper class (i.e. CallSite.getClass() doesn't work).
  • Only a small subset of the MethodHandle functionality is implemented (e.g. no bound or Java method handles).

None of these limitations pose any interesting challenges, so implementing them does not add much value to this proof of concept.


Friday, 23 May 2008 15:17:04 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
# Wednesday, 21 May 2008
Invokedynamic Proof of Concept

A few days ago the JSR 292 expert group released an early draft review for the invokedynamic instruction.

To investigate how this would work out for IKVM, I cooked up a proof of concept in C#.

class InvokeDynamicPoC
  private static InvokeDynamic<object, CallSite, object, object[]> __bootstrapMethod =
  private static CallSiteImpl<InvokeDynamicVoid<string, int, int>> __site1 =
    new CallSiteImpl<InvokeDynamicVoid<string, int, int>>(
      new StaticContextImpl(typeof(InvokeDynamicPoC), "printSubstring"));

  private static void Main()
    for (int j = 0; j < 2; j++)
      int start = Environment.TickCount;
      for (int i = 0; i < 10/*0000000/**/; i++)
        string obj = "invokedynamic";
        int arg1 = 3;
        int arg2 = 2;
        MethodHandleImpl<InvokeDynamicVoid<string, int, int>> mh = __site1.mh;
        if (mh == null)
          __bootstrapMethod(__site1, obj, new object[] { java.lang.Integer.valueOf(arg1),
                                                         java.lang.Integer.valueOf(arg2) });
          mh.d(obj, arg1, arg2);
      int end = Environment.TickCount;
      Console.WriteLine(end - start);

  private static void printSubstring(string s, int startIndex, int length)
    Console.WriteLine(s.Substring(startIndex, length));

  private static object bootstrapInvokeDynamic(CallSite cs, object receiver, object[] arguments)
    java.lang.Class c = typeof(InvokeDynamicPoC);
    java.lang.reflect.Method m = c.getDeclaredMethod(cs.getStaticContext().getName(),
                                                     typeof(string), typeof(int), typeof(int));
    MethodHandle mh = MethodHandles.unreflect(m);
    return m.invoke(null, receiver, arguments[0], arguments[1]);

The full (compilable and working) source is available here.

Wednesday, 21 May 2008 09:56:49 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
New Development Snapshot

More .NET 2.0 work and bug fixing, but the biggest change is that class library initialization has been refactored (building on the property support added in the previous snaphot which allows java.lang.System to lazily initialize the I/O streams). Instead of it being an all or nothing approach (i.e. initialize all core classes at once, like in Java) the initialization dependencies have been mostly removed and each part can now be initialized on demand. This makes the code easier to follow, startup more efficient and removed the need for some ugly hacks to make initialization work correctly.

One thing isn't yet working, if you specify a Java security manager in your app.config it isn't guaranteed that the security manager will be installed immediately (currently it's only installed the first time ClassLoader.getSystemClassLoader() is called.) I'm not sure if it is worth fixing this, because using the Java security manager in this way doesn't seem to make much sense (i.e. if you are running your application as a .NET application, why would you use the Java security manager to restrict what it can do, it won't be able do a good job anyway, since it doesn't know about your non-Java .NET code that directly calls .NET Framework APIs.)

Changes since previous development snapshot:

  • Call suppressFillInStackTrace before instantiating a remapped exception in the remap implementation method.
  • Added assembly location to verbose class cast exception if the assembly fullnames matches but the locations don't.
  • Create the generic delegate type before compiling the rest of the core class library, to allow the core class library to use delegates.
  • Fixed name mangling bug. Dots in nested type names should be mangled, because they shouldn't affect the package name.
  • Include exception message in ClassCastException.
  • Added hack to support instantiating fake enums for types loaded in ReflectionOnly (to support custom attribute annotations that have enum values in ikvmstub).
  • Added -reference option to ikvmstub to load referenced assemblies from a specific location.
  • Refactored class library initialization.
  • Implemented Runtime.availableProcessor() using .NET 2.0 API.
  • Moved most java.lang.System "native" methods to the Java side.
  • Moved java.lang.Thread "native" methods to Java.
  • Implemented support for specifying Thread stack size.
  • Fix deserialization of double arrays.
  • Added more efficient float/double to/from int/long bits converters.
  • Made Double.doubleToRawLongBits/longBitsToDouble and Float.floatToRawIntBits/intBitsToFloat intrinsics.
  • Generalized the support infrastructure for adding compiler intrinsics.
  • Fixed Graphics2D.rotate() to convert rotation angle from radians (Java) to degrees (.NET).
  • Fixed libikvm-native.so build to include reference to gmodule-2.0 library.
  • Fixed build issue on Mono (gmcs implicitly references System.Core.dll and that caused a conflict with our locally defined ExtensionAttribute).
  • Fixed ikvmc not to open the key file for write access.
  • Added workarounds for mcs compiler bug (related to the mutual dependencies of IKVM assemblies).
  • Restructured code to remove (mcs) compiler warnings.


Development snapshots are intended for evaluating and keeping track of where the project is going, not for production usage. The binaries have not been extensively tested and are not strong named.

If you want to run this version on Mono, you'll need a Mono version built from recent svn, it does not work on Mono 1.9.

Binaries available here: ikvmbin-0.37.3063.zip

Wednesday, 21 May 2008 07:16:08 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
# Friday, 09 May 2008
Compiler Intrinsics

Most compilers have some (or in some cases many) intrinsic functions. HotSpot has a number of them (see here, search for "intrinsics known to the runtime") as does the CLR JIT. IKVM has had a couple as well (System.arraycopy(), AtomicReferenceFieldUpdater.newUpdater(), String.toCharArray()). These were sort of hacked into the compiler and I finally decided to clean that up a little and add more scalable support for adding intrinsincs. The trigger to do this was that I added four more intrinsics: Float.floatToRawIntBits(), Float.intBitsToFloat(), Double.doubleToRawLongBits() and Double.longBitsToDouble().


Here's a micro benchmark:

public class test {
  public static void main(String[] args) {
    long sum = 1;
    long start = System.currentTimeMillis();
    for (int i = 0; i < 10000000; i++) {
      sum += Double.doubleToRawLongBits(sum);
    long end = System.currentTimeMillis();
    System.out.println(end - start);

Here are the results:

         x86 (aligned)     x86 (unaligned)                      x64
JDK 1.6 HotSpot Server VM    287   109
JDK 1.6 HotSpot Client VM 335    
IKVM 0.36 .NET 1.1 479 565  
IKVM 0.36 .NET 2.0 570 704 124
IKVM 0.37 338 468 101

Since the x86 .NET results are highly sensitive as to whether the double on the stack happens to be aligned or not, I included both results.


Here's the MSIL that IKVM generates for the loop:

IL_000b:   ldloc.2
IL_000c:   ldc.i4    0x989680
IL_0011:   bge       IL_0028
IL_0016:   ldloc.0
IL_0017:   ldloc.0
IL_0018:   conv.r8
IL_0019:   ldloca.s  V_3
IL_001b:   call      int64 [IKVM.Runtime]IKVM.Runtime.DoubleConverter::ToLong(float64,
                     valuetype [IKVM.Runtime]IKVM.Runtime.DoubleConverter&)
IL_0020:   add
IL_0021:   stloc.0
IL_0022:   ldloc.2
IL_0023:   ldc.i4.1
IL_0024:   add
IL_0025:   stloc.2
IL_0026:   br.s      IL_000b

The conversion isn't actually inlined, but instead a local variable of value type IKVM.Runtime.DoubleConverter is added to the method and a static method on that type that takes the value to be converted and a reference to the local variable is called. Here's the code for IKVM.Runtime.DoubleConverter:

public struct DoubleConverter
  private double d;
  private long l;

  public static long ToLong(double value, ref DoubleConverter converter)
    converter.d = value;
    return converter.l;

  public static double ToDouble(long value, ref DoubleConverter converter)
    converter.l = value;
    return converter.d;

It uses the .NET feature that allows you to explicitly control the layout of a struct  to overlay the double and long fields. Note that this construct is fully verifiable.

For comparison, the standard System.BitConverter.DoubleToInt64Bits() uses unsafe code and looks something like this:

public static unsafe long DoubleToInt64Bits(double value)
  return *((long*)&value);

For some reason (probably because it isn't verifiable) the JIT doesn't like this so much and doesn't inline this method.

JIT Code

Here's the x86 code generated by the .NET 2.0 SP1 JIT:

049E15CE  cmp    ebx,989680h
049E15D4  jge    049E1600
049E15D6  lea    ecx,[esp+8]
049E15DA  mov    dword ptr [esp+10h],esi
049E15DE  mov    dword ptr [esp+14h],edi
049E15E2  fild   qword ptr [esp+10h]
049E15E6  fstp   qword ptr [esp+10h]
049E15EA  fld    qword ptr [esp+10h]
049E15EE  fstp   qword ptr [ecx]
049E15F0  mov    eax,dword ptr [ecx]
049E15F2  mov    edx,dword ptr [ecx+4]
049E15F5  add    eax,esi
049E15F7  adc    edx,edi
049E15F9  mov    esi,eax
049E15FB  mov    edi,edx
049E15FD  inc    ebx
049E15FE  jmp    049E15CE

Here's the x64 code generated by the .NET 2.0 SP1 JIT:

00000642805B8A90  cmp        ecx,989680h
00000642805B8A96  jge        00000642805B8AB1
00000642805B8A98  cvtsi2sd   xmm0,rdi
00000642805B8A9D  lea        rax,[rsp+20h]
00000642805B8AA2  movsd      mmword ptr [rax],xmm0
00000642805B8AA6  mov        rax,qword ptr [rax]
00000642805B8AA9  add        rdi,rax
00000642805B8AAC  add        ecx,1
00000642805B8AAF  jmp        00000642805B8A90

In both cases the construct is inlined properly. It is also obvious why the x64 code is so much faster, it uses SSE (as we've seen before) and only uses one memory store/load combination.


For completeness, here's the code generated by HotSpot x64:

0000000002772EA0  cvtsi2sd   xmm0,r11
0000000002772EA5  add        ebp,10h
0000000002772EA8  movsd      mmword ptr [rsp+20h],xmm0
0000000002772EAE  mov        r10,qword ptr [rsp+20h]
0000000002772EB3  add        r10,r11
0000000002772EB6  cvtsi2sd   xmm0,r10
0000000002772EBB  movsd      mmword ptr [rsp+20h],xmm0
0000000002772EC1  mov        r11,qword ptr [rsp+20h]
0000000002772EC6  add        r11,r10
0000000002772EC9  cvtsi2sd   xmm0,r11
0000000002772ECE  movsd      mmword ptr [rsp+20h],xmm0
0000000002772ED4  mov        r10,qword ptr [rsp+20h]
0000000002772ED9  add        r10,r11
0000000002772FC0  cvtsi2sd   xmm0,r10
0000000002772FC5  movsd      mmword ptr [rsp+20h],xmm0
0000000002772FCB  mov        r11,qword ptr [rsp+20h]
0000000002772FD0  add        r11,r10
0000000002772FD3  cmp        ebp,r9d
0000000002772FD6  jl         0000000002772EA0

It actually unrolled the loop 16 times (which appears not be helping in the case), but otherwise the code generated is pretty similar to what we saw on the CLR. Of course, in HotSpot Double.doubleToRawIntBits() is also an intrinsic because in Java the only alternative would be to write it in native code and the JNI transition would add significant overhead in this case.

Friday, 09 May 2008 11:27:51 (W. Europe Daylight Time, UTC+02:00)  #    Comments [3]
# Monday, 05 May 2008
IKVM 0.36 Update 2 Release Candidate 1

A couple of fixes.


  • Remapped exceptions with explicit remapping code now call suppressFillInStackTrace (to make sure the proper stack trace is captured).
  • Fixed memory mapped file bug (mapping at a non-zero file offset would fail).
  • Fixed .NET type name mangling for nested types that contain a dot in their name (which the C# 3.0 compiler generates for some private helper types).
  • Fixed java.io.File.getCanonicalPath() to swallow System.NotSupportedException (thrown when the path contains a colon, other than the one following the drive letter).
  • Fixed bug in deserialization of double arrays.

Binaries available here: ikvmbin-
Sources (+ binaries):ikvm-

Monday, 05 May 2008 06:23:35 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
# Thursday, 17 April 2008
Invalid Casting Goodness

Yesterday Miguel blogged about a nice new feature in Mono. I added the IKVM_VERBOSE_CAST environment variable to IKVM to do something similar a while ago.

public class test {
  public static void main(String[] args) {

C:\j>\ikvm-\bin\ikvm test
Exception in thread "main" java.lang.ClassCastException
        at test.main(test.java)


C:\j>\ikvm-\bin\ikvm test
Exception in thread "main" java.lang.ClassCastException: Object of type "System.String[], mscorlib, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" cannot be cast to "System.String, mscorlib, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
        at test.main(test.java)

Note that the assembly qualified type names are displayed, as I believe this feature is particularly useful when trying to debug issues that arise from having loaded multiple assemblies that contain the "same" types.

While writing this I discovered that both JDK 1.6 and .NET 2.0 always generate descriptive exception messages for invalid casts:

C:\j>\jdk1.6\bin\java test
Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.String; can not be cast to java.lang.String
        at test.main(test.java:5)

C:\j>\ikvm\bin\ikvm test
Exception in thread "main" java.lang.ClassCastException: Unable to cast object of type 'System.String[]' to type 'System.String'.
        at test.main(test.java:5)

This last result is my local ikvm development version running on .NET 2.0 with a patch to enable taking the exception message from the .NET InvalidCastException, which I didn't previously do because on .NET 1.1 this message didn't contain any useful information.

Thursday, 17 April 2008 08:51:38 (W. Europe Daylight Time, UTC+02:00)  #    Comments [2]
# Monday, 14 April 2008
New Development Snapshot

It's been quite a while since the last development snapshot. I'm still working on integrating .NET 2.0 features, but I'm also doing random fixes/improvements here and there as I come across them. The biggest visible change in this snapshot is the support for defining .NET properties as Java fields. The main motivation was that I wanted to handle the System.in/out/err fields more cleanly. They are now implemented like this:

public final static InputStream in;

static { in = null; }

private static InputStream get_in()
    return StdIO.in;

This defines a .NET property called "in" and associates the getter method of the property with the get_in() method. Note that we're only specifying the getter here in the Property annotation because the field is final, but you can also specify a setter method. The static initializer that initializes the property to null is necessary to keep javac happy, but it doesn't actually do anything. The ikvm bytecode compiler will ignore any assignments to read-only properties. Another thing to note is that the get_in() will automatically be made public (because the field is public), but from Java it will still appear private.

Changes since previous development snapshot:

  • Fixed regression in return value annotation value.
  • Forked Class, Constructor and Field.
  • Made class annotation handling lazy and bypass encode/decode.
  • Fixed ReflectionOnly referenced assembly loading order (ikvmstub).
  • Initialize class library in JVM_CreateJavaVM.
  • Reintroduced guard against recursive FinishCore invocations.
  • Implemented support for annotations on .NET fields/methods/parameters.
  • Hide ikvmc generated GetEnumerator() method from Java.
  • Simplified annotation handling.
  • Added support to Class.forName() for assembly qualified Java type names.
  • Replaced notion of DynamicOnly types with Fake types. Fake types are implemented as generic type instances and can have DynamicOnly methods.
  • Changed System.nanoTime() implementation to use Stopwatch.GetTimestamp().
  • Ripped out annotation/constant pool support that is no longer needed.
  • Added support for defining unloadable (i.e. missing) types to use as custom modifiers in signatures.
  • Use custom modifiers to make sure constructor signature is unique (if necessary).
  • Restructured code to remove compiler warnings.
  • Updated FlushFileBuffers p/invoke to use SafeFileHandle.
  • Forked OpenJDK sources that are going to be modified to refactor the library initialization.
  • Made __Fields nested class abstract (it was already sealed) and removed the constructor.
  • Restored the special case for interface .cctor methods to fix bug #1930303
  • Added ikvm/internal/MonoUtils.java. A new helper class to contain Mono specific methods.
  • Added Mac OS X platform detection.
  • Fixed System.mapLibraryName() to use platform detection instead of os.name property.
  • Improved java.library.path for Windows, Linux and Mac OS X.
  • Added support for setting os.name and os.ver on Mac OS X.
  • Try to guess os.arch based on IntPtr.Size.
  • Set sun.nio.MaxDirectMemorySize to -1 to allow "unlimited" direct byte buffers.
  • Fixed memory mapped file bug that caused mapping at non-zero file position to fail.
  • Close mapping handle using the Close() method on SafeFileHanlde instead of p/invoking the Win32 API directly.
  • Added support for filenames/paths with colons in them to Win32FileSystem.CanonicalizePath().
  • Added support for turning Java fields into .NET properties with an annotation.
  • Implemented System.in/out/err as .NET properties (explicitly).


Development snapshots are intended for evaluating and keeping track of where the project is going, not for production usage. The binaries have not been extensively tested and are not strong named.

If you want to run this version on Mono, you'll need a Mono version built from recent svn, it does not work on Mono 1.9.

Binaries available here: ikvmbin-0.37.3026.zip

Monday, 14 April 2008 07:40:06 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
# Monday, 07 April 2008
IKVM 0.36 Update 1 Released

When I released IKVM 0.36 I said I intended to support 0.36 for a longer period since it is the last version that runs on .NET 1.1. Today I've released the first update release of 0.36 to SourceForge. For those who have been tracking the release candidates, this release is identical to release candidate 5.

Changes since IKVM

  • Changed version to IKVM
  • Fix for reflection bug on .NET generic nested types that caused this.
  • Fix for bug #1865922.
  • java.awt.image.Raster fix.
  • Fix bug in DynamicMethod based serialization for fields typed as ghost interfaces.
  • Fixed ikvmc to support referencing assemblies that contain .NET type named java.lang.Object.
  • Improved error handling for ikvmc -reference option.
  • Optimized codegen for lcmp, fcmp, dcmp and shift opcodes.
  • Added support to Class.forName() for loading Java types with assembly qualified type names.
  • Implemented field/method/parameter annotation support for .NET types.
  • Added workaround for .NET 1.1 bug in Directory.CreateDirectory(). (bug #1902154)
  • Added -removeassertions optimization option to ikvmc.
  • Added -removeassertions to IKVM.OpenJDK.ClassLibrary.dll build.
  • Fixed JVM_CreateJavaVM to initialize the class library.
  • Fixed ikvmc to include zero length resource files.
  • Implemented SocketOptions.IP_MULTICAST_IF and SocketOptions.IP_MULTICAST_IF2.
  • Fixed assembly class loader to ignore codebase for dynamic assemblies (previously it would throw an exception).
  • Fixed exception stack trace code to return the .NET name of a type when a method in a primitive type is on the stack.
  • Fixed JNI reflection to filter out HideFromReflection members.
  • Fixed java.net.NetworkInterface to work on pre-Win2K3 systems.
  • Fixed java.lang.Thread to set context class loader for threads started from .NET.

If you want to build from source, you need ikvm-0.36.11.zip, classpath-0.95-stripped.zip, openjdk-b13-stripped.zip and the java.awt.image.Raster fix.

Monday, 07 April 2008 06:15:18 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
# Monday, 24 March 2008
Generic Algorithms Revisited (Again)

Back in 2003 I described how you could theoretically use value types to efficiently inject behavior into a generic type. Recently I was thinking about this again and decided to look at the code generated by the JIT.

Let's look at a very simple case:

interface IOperation
  void DoIt();

struct Dummy : IOperation
  public void DoIt() { }

public class Program
  static void Gen<T>() where T : IOperation

  static void Main()
    for (; ; )

The .NET 2.0 SP1 x86 JIT generates the following code for the Gen<Dummy>() method:

sub    esp,8
xor    eax,eax
mov    dword ptr [esp],eax
mov    dword ptr [esp+4],eax
mov    byte ptr [esp],0
movsx  eax,byte ptr [esp]
mov    byte ptr [esp+4],al
add    esp,8

The good news is that the Dummy.DoIt() method has been inlined. The bad news is that every single instruction apart from the ret is unnecessary.

In order to understand what is going on, we need to look at the IL generated by the C# compiler for the Gen<T>() method and we also need to know that the CLR treats zero length types as having a length of one byte.

Here's the IL for Gen<T>():

.maxstack 1
.locals init ([0] !!T CS$0$0000, [1] !!T CS$0$0001)
ldloca.s CS$0$0000
initobj !!T
ldloca.s CS$0$0001
constrained. !!T
callvirt instance void IOperation::DoIt()

For some reason the default(T).Doit() construct results in two local variables being created. Now the JIT code above is starting to make a little sense (although it still isn't an excuse why it isn't optimized away):

sub    esp,8
xor    eax,eax
mov    dword ptr [esp],eax
mov    dword ptr [esp+4],eax
.locals init ([0] !!T CS$0$0000, [1] !!T CS$0$0001)
Reserve and initialize space on the stack for the two local variables. Note that they are both unnecessarily padded, so we need 8 bytes intsead of 4.
mov    byte ptr [esp],0 ldloca.s CS$0$0000
initobj !!T
Initialize the first local variable.
movsx  eax,byte ptr [esp] ldloc.0
Read the value of the first local variable.
mov    byte ptr [esp+4],al stloc.1
Store the value we just read in the second local variable.
  ldloca.s CS$0$0001
constrained. !!T
callvirt instance void IOperation::DoIt()
Inlined (empty) DoIt method.
add    esp,8
Unreserve stack space and return to caller.

We can improve the code a little by changing the Gen<T>() method body as follows:

T t = default(T);

Now the C# compiler won't generate the extra local variable and the x86 code looks a little better:

push   eax
xor    eax,eax
mov    dword ptr [esp],eax
mov    byte ptr [esp],0
pop    ecx

For completeness here's the x64 code for the original Gen<Dummy>() method:

sub    rsp,38h
xor    eax,eax
mov    byte ptr [rsp+21h],al
xor    eax,eax
mov    byte ptr [rsp+20h],al
xor    eax,eax
mov    byte ptr [rsp+21h],al
movzx  eax,byte ptr [rsp+21h]
mov    byte ptr [rsp+20h],al
lea    rcx,[rsp+20h]
mov    rax,6428001C078h
call   rax
add    rsp,38h
rep ret

This is even worse than the x86 code, because the Dummy.DoIt() method isn't even inlined.


The standard excuse of why a JIT doesn't do some "obvious" optimizations is of course that it runs while your program is supposed to be running, so it can't take a lot of time to analyze and optimize the code. This is true to some extent (and JIT performance is one reason why running your managed code on x64 takes so much longer to start up), but it doesn't apply to NGEN and from what I understand today's NGEN doesn't do any more optimizations than the JIT compiler. I tested this specific example (for x64) and the NGEN code is indeed identical to the JIT code. This is unfortunate and hopefully, someday we'll see more sophisticated optimizations specific to NGEN as well.


This trick of using value types in generic algorithms appears to work reasonably well on x86, but there is still room for improvement in the JIT. Rumour has it that in the next .NET 2.0 service pack (due out this summer?) there will be an update of the JIT that finally supports inlining of methods that have value type arguments, let's hope that this JIT update will also include other optimizations as well. Let's also hope that this update includes a better version of the x64 JIT, because as this example shows yet again the x64 JIT still considerably lags behind the x86 JIT.

P.S. Here's the Mono 1.9 x86 JIT code for the original Gen<Dummy>():

push   ebp
mov    ebp,esp
sub    esp,8
mov    byte ptr [ebp-8],0
mov    byte ptr [ebp-4],0
mov    byte ptr [ebp-8],0
movsx  eax,byte ptr [ebp-8]
mov    byte ptr [ebp-4],al
lea    eax,[ebp-4]
push   eax
call   02A20278
pop    ecx

Monday, 24 March 2008 09:23:29 (W. Europe Standard Time, UTC+01:00)  #    Comments [2]