# Monday, 04 June 2007
« First Hybrid OpenJDK / GNU Classpath Sna... | Main | Array Initialization »
OpenJDK Library/VM Integration

In Merging the First OpenJDK Code and Another x64 CLR Bug I briefly mentioned the three different approaches that are available for integrating the OpenJDK classes with IKVM:

  1. Use existing native interface
  2. Use map.xml tricks
  3. Change the code (i.e. fork a specific source file)

This triggered a couple of questions in the comments.

Morten asked:

about your work of [i]ntegrating the whole of OpenJDK and [y]our 3 options. Could you provide more details? From what you write, it sounds like using aspectj to inject annotations could be useful. Codegeneration might also be useful?

Yes, that's essentially what option 2 is. The map.xml infrastructure allows me to add methods and fields to existing classes, or allows me to replace method bodies.

Albert Strasheim asked:

It's too bad that the Sun and Classpath code aren't compatible down to the native methods called by the pure Java code.

As Andrew pointed out, it would not have been possible or practical to reverse engineer the Sun native interface. Besides that, the Sun and GNU Classpath libraries have very different goals, so it actually makes a lot of sense for the interfaces to differ.

I figured that you'd go the "native methods in C#" route, so I'd be interested to know why you think that the "IKVM specific modifications" route is cleaner and more efficient (efficient in terms of performance or time to implement?).

It would seem that implementing the native methods in C# would be the best solution, but that's not always the case. Hopefully that will become clear below.

Examples

Let's look at some examples and the downsides and cost associated with each of the three options:

1. Use existing native interface

In terms of long term development cost, this is easily the best option. Once you understand the ikvmc native method mapping that is used by the class library, it is very easy to find to corresponding native method. Sun is not very likely to change the native method semantics without also modifying the method name or signature. When a new native method is added, the build process will fail due to the fact that the new native method isn't implemented (and hence generates an unverifiable JNI method stub).

In terms of runtime performance, this method can be very efficient, but only if the method parameters contains all the required data to work on, or if the data can be easily obtained in another efficient way. This is often not the case, because many of the Sun native methods use JNI reflection to access private fields in the object. Here are two real world examples, one where the native methods works really well and another one where it's not as efficient as I'd like:

namespace IKVM.NativeCode.java.lang.reflect
{
  public sealed class Array
  {
    public static bool getBoolean(object arrayObj, int index)
    {
      if (arrayObj == null)
      {
        throw new jlNullPointerException();
      }
      bool[] arr = arrayObj as bool[];
      if (arr != null)
      {
        if (index < 0 || index >= arr.Length)
        {
          throw new jlArrayIndexOutOfBoundsException();
        }
        return arr[index];
      }
      throw new jlIllegalArgumentException("argument type mismatch");
    }
  }
}

Clearly this is very efficient (and it makes you wonder why it's a native method at all). Here's one that's less efficient:

namespace IKVM.NativeCode.java.lang
{
  public sealed class Class
  {
    public static bool isInterface(object thisClass)
    {
      return TypeWrapper.FromClass(thisClass).IsInterface;
    }
  }
}

The thisClass parameter is a reference to the java.lang.Class object, but the VM wants to use its internal representation of a class (TypeWrapper), so here I need to use TypeWrapper.FromClass() to get the corresponding TypeWrapper object. Currently that's implemented by using reflection to access a private field in java.lang.Class (that was added through map.xml).

To contrast, here's the equivalent method for use with the GNU Classpath library:

namespace IKVM.NativeCode.java.lang
{
  public sealed class VMClass
  {
    public static bool IsInterface(object wrapper)
    {
      return ((TypeWrapper)wrapper).IsInterface;
    }
  }
}

Here only a downcast is required, because Class.isInterface() calls VMClass.isInterface(), which reads the vmdata field from Class and passes that to the native IsInterface method.

2. Use map.xml tricks

This is very powerful, but also much more expensive because it is much less obvious what's going on. In the case of a native method that is implemented, if you can't find it in the C# code, you'll probably eventually find it in map.xml, but when replacing an existing method it's very easy to miss the fact that the method is replaced. I've used method replacement in java.lang.ClassLoader to make bootstrap resource loading work. ClassLoader has two private methods getBootstrapResource() and getBootstrapResources() that are used to load resources from the boot class path. These methods are implemented by parsing the sun.boot.class.path system property and then using internal classes to load from these jars or directories. Obviously that won't do for IKVM, because the boot classes and resources are contained in the core library .NET assembly (currently named IKVM.Hybrid.GNU.Classpath.OpenJDK.dll). So I've had to replace these two methods:

<class name="java.lang.ClassLoader">
  <field name="wrapper" sig="Ljava.lang.Object;" modifiers="private" />
  <method name="getBootstrapResource" sig="(Ljava.lang.String;)Ljava.net.URL;">
    <body>
      <ldarg_0 />
      <call class="java.lang.LangHelper"
            name="getBootstrapResource" sig="(Ljava.lang.String;)Ljava.net.URL;" />
      <ret />
    </body>
  </method>
  <method name="getBootstrapResources" sig="(Ljava.lang.String;)Ljava.util.Enumeration;">
    <body>
      <ldarg_0 />
      <call class="java.lang.LangHelper"
            name="getBootstrapResources" sig="(Ljava.lang.String;)Ljava.util.Enumeration;" />
      <ret />
    </body>
  </method>
</class>

These method bodies simply call the real implementations in LangHelper, because it's obviously easier to implement them in Java than in CIL.

3. Change the code (i.e. fork a specific source file)

In some cases, the changes required are so substantial that the map.xml trickery doesn't work or isn't practical. For an example of that we need to look at how reflection is implemented in OpenJDK.

Here's a fragment of java.lang.reflect.Method (not the actual code, but it gets the idea across):

package java.lang.reflect;

public final class Method
{
  private static final ReflectionFactory reflectionFactory = (ReflectionFactory)
    AccessController.doPrivileged(new sun.reflect.ReflectionFactory.GetReflectionFactoryAction());
  private MethodAccessor methodAccessor;

  public Object invoke(Object obj, Object... args)
  {
    // ... access checks omitted ...
    if (methodAccessor == null)
    {
      methodAccessor = reflectionFactory.newMethodAccessor(this);
    }
    return methodAccessor.invoke(obj, args);
  }
}

So the actual method invocation is delegated to an object that implements the MethodAccessor interface. By default Sun's ReflectionFactory returns an object that counts the number of invocations and delegates to another MethodAccessor object that uses JNI to use the VM reflection infrastructure, if the invocation counter crosses a certain value the inner object is replaced by an instance of a dynamically generated class that contains custom generated byte codes to efficiently call the target method (and this generated bytecode is not verifiable, so it won't work as-is on IKVM). In this case I've decided to fork ReflectionFactory and implement my own version that simply returns MethodAccessor instances that reuse the IKVM reflection infrastructure.

The cost of forking a source file should be obvious. When improvements are made to the original, you don't automatically inherit these. In this particular case ReflectionFactory is a fairly small and simple class (all of the complexity is in the classes it calls), so the risk seems low.

Structural Issues

Besides the issues already pointed out, there are also things of a more structural nature. A good example can again be found in java.lang.reflect.Method. Here's its constructor:

Method(Class declaringClass,
       String name,
       Class[] parameterTypes,
       Class returnType,
       Class[] checkedExceptions,
       int modifiers,
       int slot,
       String signature,
       byte[] annotations,
       byte[] parameterAnnotations,
       byte[] annotationDefault)
{
  // ...
}

The constructor expects the generics signature and annotations to be passed in. This poses a dillema. On IKVM looking up the custom attributes that contain this data is relatively expensive and in many cases a method object may be constructed while the code is not interested in generics information or annotations at all, so it might be more desireable to lazily get this data. Another problem is that the annotations are passed in as byte arrays that contain the annotation data as it exists in the class file. Code compiled with ikvmc obviously doesn't have a class file. Despite these two issues, I've chosen to not fork Method. At the moment I think that the performance cost of eagerly getting the generics signature and the annotations is an acceptable trade-off for not having to fork Method. Currently, the way annotations are handled is pretty horrible, because I've left the existing annotation mechanisms in place and reuse the stub class generator code to recreate the annotation byte arrays (and the corresponding constant pool entries) on the fly to pass them into the Method constructor. I've got some ideas to change annotation storage in ikvmc compiled code to use a mechanism that's more compatible with this approach and that may ultimately turn out to be more space efficient as well.

Compatibility

One aspect I haven't talked about yet is compatibility. For example, there's code out there that directly uses ReflectionFactory. This is evil, but it's also reality. So I try to weigh the compability impact as well when I'm trying to find the right approach.

Monday, 04 June 2007 10:46:34 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
Name
E-mail
Home page

I apologize for the lameness of this, but the comment spam was driving me nuts. In order to be able to post a comment, you need to answer a simple question. Hopefully this question is easy enough not to annoy serious commenters, but hard enough to keep the spammers away.

Anti-Spam Question: What method on java.lang.System returns an object's original hashcode (i.e. the one that would be returned by java.lang.Object.hashCode() if it wasn't overridden)? (case is significant)

Answer:  
Comment (HTML not allowed)  

Live Comment Preview