# Thursday, 14 August 2003
« PDC 2003 | Main | More on interaction with .NET types »
Workarounds

This week I finished the support for missing classes. This allows Eclipse to run without the -Xbootclasspath workaround.

To run Eclipse (on Windows), you can now simply go to the Eclipse directory and type:

   eclipse -vm \ikvm\bin\ikvm.exe

The verifier and compiler had to be changed to support, what I call in the code, unloadable classes (probably should have called it missing classes). Whenever the verifier encounters a reference of a type that cannot be loaded, it treats it as a special type (kind of like the null-type) and allows (almost) any operation to succeed. When the compiler encounters this type, it instead of generating CIL instruction to implement a particular instruction, it generates a call to a helper method in ByteCodeHelper that implements the instruction using reflection. At the moment the reflection results are not cached, so it could be made more efficient by adding caching, but since this shouldn't happen that often, this is not a high priority.

java.lang.CharSequence

I also finally implemented support for (what I've termed) ghost interfaces. Ghost interfaces are interfaces that are implemented by remapped types. So, for example, java.lang.String (really System.String) appears to implement java.lang.CharSequence, thus java.lang.CharSequence is a ghost interface. When a reference of a ghost interface type is passed around, it really is an object reference (so that it can contain references to types that actually implement the interface, as well as types that only appear to implement the type).

Here is an example:

CharSequence toUpperCase(CharSequence seq) {
  StringBuffer buf = new StringBuffer();
  int len = seq.length();
  for(int i = 0; i < len; i++) {
    char c = seq.charAt(i);
    c = Character.toUpperCase(c);
    buf.append(c);
  }
  return buf;
}

This is compiled as:

Object toUpperCase(Object seq) {
  StringBuffer buf = new StringBuffer();
  int len;
  if(seq instanceof String) {
    len = ((String)seq).length();
  } else {
    len = ((CharSequence)seq).length();
  }
  for
(int i = 0; i < len; i++) {
    char c;
    if(seq instanceof String) {
      c = ((String)seq).charAt(i);
    } else {
      c = ((CharSequence)seq).charAt(i);
    }
    c = Character.toUpperCase(c);
    buf.append(c);
  }
  return buf;
}

There are some downsides to this approach:

  • Performance cost of the type check and the cast. BTW, since System.String is a sealed type, the type check can (theoretically) be very efficient.
  • Type is erased in the method signature, causing problems if an identifical signature already exists (the CLR supports a nice mechanism to workaround this, but unfortunately Reflection.Emit currently doesn't expose this functionality).
  • When this method is statically compiled and called from, for example, C#, the signature is confusing.

An alternative approach would be to wrap each String object when it needs to be treated as a CharSequence, but is harder to implement (object identity has to be preserved) and it isn't clear to me that it would be more efficient or elegant.

On the upside, this is a totally generic solution, there is no special support for String, any remapped type (i.e. type in map.xml) can declare to support any interface, and the compiler will automatically do the right thing. It is also used to make java.lang.Throwable (i.e. System.Exception) appear to implement java.io.Serializable. At the moment it isn't used for arrays (which should appear to implement java.io.Serializable and java.lang.Cloneable), but it would be easy to add this.

Oh, and since java.lang.String now implements CharSequence, I can now use the StringBuffer implementation from GNU Classpath instead of remapping it to System.Text.StringBuilder.

Reflecting on .NET types

Another thing I've been working on is integration with .NET types. When you want to use a .NET type from within Java you have two options: 1) use netexp to generate a jar containing stubs for the .NET classes, so you can statically compile against the .NET types, or 2) use reflection against the .NET type.

Something I dislike about netexp is that it includes remapping logic to map .NET types and signatures to Java compatible stuff. For example, when it encounters an enum, it generates a final Java class with public static final members or when it encounters a signature with a byref argument, it turns that into an array argument. Now this is all very nice, because it allows Java code to use most of the .NET features that Java doesn't really support, but the part I don't like is that this remapping is also done in the IKVM.NET runtime, because when it compiles Java code that was compiled against a netexp generated jar, it needs to do some of the same translations. This duplication of the remapping logic is obviously not a good thing. So I want to rewrite netexp in Java and make it use Java reflection to interrogate the .NET types, that way all the remapping is done in one place, the IKVM.NET runtime.

However, while thinking about this I realized that there is a problem with respect to type identity. There are two ways a .NET type can become visible in Java, netexp and Class.forName. Both have problems with the class name. Netexp converts to namespaces to lowercase, because Java compiler don't like the System namespace (System binds to the java.lang.System class and is not considered as a package name) and Class.forName requires the assembly qualified type name (e.g. "System.String, mscorlib, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"). There are several ways to handle this:

  1. Only allow .NET types to be visible through netexp generated classes. Obviously, I don't like this, because it would be very limiting and my plan to rewrite netexp in Java wouldn't work.
  2. Allow .NET types to be visible through netexp with one name (e.g. system.String) and through Class.forName with another name (i.e. the assembly qualified name). The problem this causes is that once you create an instance of a class and call getClass on that instance, which of the two Class objects should it then return?
  3. Use a universal name mangling scheme. Both netexp and Class.forName would represent System.String as, for example, "System_String__mscorlib__Version_1_0_5000_0__ Culture_neutral__PublicKeyToken_b77a5c561934e089". I think this would work pretty well, but the obvious downside is that it makes the Java source code totally unreadable. Something that still would be an issue is how Java code reacts when it tries to load one class and then gets another (in the face of .NET binding policy). For example, it would be possible to load the above String, but then get a Class object that returns "..._Version_2_0_..." as its name, because the app is running on some future version of the CLR.

Maybe there are other solutions I haven't thought of, but at the moment I'm thinking of going with number 2.

I updated the snapshots on Tuesday. Binaries and source.

Thursday, 14 August 2003 12:06:29 (W. Europe Daylight Time, UTC+02:00)  #    Comments [5]