# Thursday, 14 August 2003
« PDC 2003 | Main | More on interaction with .NET types »
Workarounds

This week I finished the support for missing classes. This allows Eclipse to run without the -Xbootclasspath workaround.

To run Eclipse (on Windows), you can now simply go to the Eclipse directory and type:

   eclipse -vm \ikvm\bin\ikvm.exe

The verifier and compiler had to be changed to support, what I call in the code, unloadable classes (probably should have called it missing classes). Whenever the verifier encounters a reference of a type that cannot be loaded, it treats it as a special type (kind of like the null-type) and allows (almost) any operation to succeed. When the compiler encounters this type, it instead of generating CIL instruction to implement a particular instruction, it generates a call to a helper method in ByteCodeHelper that implements the instruction using reflection. At the moment the reflection results are not cached, so it could be made more efficient by adding caching, but since this shouldn't happen that often, this is not a high priority.

java.lang.CharSequence

I also finally implemented support for (what I've termed) ghost interfaces. Ghost interfaces are interfaces that are implemented by remapped types. So, for example, java.lang.String (really System.String) appears to implement java.lang.CharSequence, thus java.lang.CharSequence is a ghost interface. When a reference of a ghost interface type is passed around, it really is an object reference (so that it can contain references to types that actually implement the interface, as well as types that only appear to implement the type).

Here is an example:

CharSequence toUpperCase(CharSequence seq) {
  StringBuffer buf = new StringBuffer();
  int len = seq.length();
  for(int i = 0; i < len; i++) {
    char c = seq.charAt(i);
    c = Character.toUpperCase(c);
    buf.append(c);
  }
  return buf;
}

This is compiled as:

Object toUpperCase(Object seq) {
  StringBuffer buf = new StringBuffer();
  int len;
  if(seq instanceof String) {
    len = ((String)seq).length();
  } else {
    len = ((CharSequence)seq).length();
  }
  for
(int i = 0; i < len; i++) {
    char c;
    if(seq instanceof String) {
      c = ((String)seq).charAt(i);
    } else {
      c = ((CharSequence)seq).charAt(i);
    }
    c = Character.toUpperCase(c);
    buf.append(c);
  }
  return buf;
}

There are some downsides to this approach:

  • Performance cost of the type check and the cast. BTW, since System.String is a sealed type, the type check can (theoretically) be very efficient.
  • Type is erased in the method signature, causing problems if an identifical signature already exists (the CLR supports a nice mechanism to workaround this, but unfortunately Reflection.Emit currently doesn't expose this functionality).
  • When this method is statically compiled and called from, for example, C#, the signature is confusing.

An alternative approach would be to wrap each String object when it needs to be treated as a CharSequence, but is harder to implement (object identity has to be preserved) and it isn't clear to me that it would be more efficient or elegant.

On the upside, this is a totally generic solution, there is no special support for String, any remapped type (i.e. type in map.xml) can declare to support any interface, and the compiler will automatically do the right thing. It is also used to make java.lang.Throwable (i.e. System.Exception) appear to implement java.io.Serializable. At the moment it isn't used for arrays (which should appear to implement java.io.Serializable and java.lang.Cloneable), but it would be easy to add this.

Oh, and since java.lang.String now implements CharSequence, I can now use the StringBuffer implementation from GNU Classpath instead of remapping it to System.Text.StringBuilder.

Reflecting on .NET types

Another thing I've been working on is integration with .NET types. When you want to use a .NET type from within Java you have two options: 1) use netexp to generate a jar containing stubs for the .NET classes, so you can statically compile against the .NET types, or 2) use reflection against the .NET type.

Something I dislike about netexp is that it includes remapping logic to map .NET types and signatures to Java compatible stuff. For example, when it encounters an enum, it generates a final Java class with public static final members or when it encounters a signature with a byref argument, it turns that into an array argument. Now this is all very nice, because it allows Java code to use most of the .NET features that Java doesn't really support, but the part I don't like is that this remapping is also done in the IKVM.NET runtime, because when it compiles Java code that was compiled against a netexp generated jar, it needs to do some of the same translations. This duplication of the remapping logic is obviously not a good thing. So I want to rewrite netexp in Java and make it use Java reflection to interrogate the .NET types, that way all the remapping is done in one place, the IKVM.NET runtime.

However, while thinking about this I realized that there is a problem with respect to type identity. There are two ways a .NET type can become visible in Java, netexp and Class.forName. Both have problems with the class name. Netexp converts to namespaces to lowercase, because Java compiler don't like the System namespace (System binds to the java.lang.System class and is not considered as a package name) and Class.forName requires the assembly qualified type name (e.g. "System.String, mscorlib, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"). There are several ways to handle this:

  1. Only allow .NET types to be visible through netexp generated classes. Obviously, I don't like this, because it would be very limiting and my plan to rewrite netexp in Java wouldn't work.
  2. Allow .NET types to be visible through netexp with one name (e.g. system.String) and through Class.forName with another name (i.e. the assembly qualified name). The problem this causes is that once you create an instance of a class and call getClass on that instance, which of the two Class objects should it then return?
  3. Use a universal name mangling scheme. Both netexp and Class.forName would represent System.String as, for example, "System_String__mscorlib__Version_1_0_5000_0__ Culture_neutral__PublicKeyToken_b77a5c561934e089". I think this would work pretty well, but the obvious downside is that it makes the Java source code totally unreadable. Something that still would be an issue is how Java code reacts when it tries to load one class and then gets another (in the face of .NET binding policy). For example, it would be possible to load the above String, but then get a Class object that returns "..._Version_2_0_..." as its name, because the app is running on some future version of the CLR.

Maybe there are other solutions I haven't thought of, but at the moment I'm thinking of going with number 2.

I updated the snapshots on Tuesday. Binaries and source.

Thursday, 14 August 2003 12:06:29 (W. Europe Daylight Time, UTC+02:00)  #    Comments [5]
Thursday, 14 August 2003 15:49:22 (W. Europe Daylight Time, UTC+02:00)
Have you considered only lowercasing the "system" namespace and no others? Or perhaps (to be really safe) only lowercasing top-level namespaces that match the name of a class in java.lang?

Also, would it be possible for the implementation of Class.forName to do remapping of classnames from normal java ones to assembly qualified ones? If so, would that solve the problem? (I'm not entirely sure I understand what the problem is, so I'm not sure)
Stuart
Thursday, 14 August 2003 17:12:10 (W. Europe Daylight Time, UTC+02:00)
One of the problems is that all .NET types are loaded by the bootstrap class loader, so they have to have distinct names.

I just had an another idea. It might be a good idea to encode the assembly qualified name in the package. For example for the System.String type:

assembly.mscorlib.1.0.5000.0.neutral.b77a5c561934e089.System.String

This has the advantage that you can use "import" in most cases (although not for String) to use the short name.

I have to think about it, but I think I like it.
Thursday, 14 August 2003 20:15:53 (W. Europe Daylight Time, UTC+02:00)
Hmmm... how fundamental a rearchitecting would it take to have a classloader per assembly? Then there would be a trivial isomorphism between Java's "names are unique within a classloader" and .NET's "names are unique within a strongly-named assembly", if I'm understanding correctly.
Stuart
Thursday, 14 August 2003 21:38:37 (W. Europe Daylight Time, UTC+02:00)
I don't care for the occasional lowercasing ideas.

It could be that all .NET classes would begin with NET. from the Java perspective.

NET.System.String
NET.Some.Other.Class

import NET.*; // import all of .NET

This makes it clear where the class is defined and clearly separates all of .NET into its own package.

I'm just trying Eclipse under IKVM for the first time as I write this. Will this resolve itself without netexp if compiling using a Java compiler running under IKVM? I recall that someone had even made a command-line version of Eclipse's Java compiler. If I can use Eclipse under IKVM to compile Java source against .NET types without netexp, that would be perfect.

Also, I thought that java.lang would be searched later if specifically listed later, though I could be wrong there.
Brian Sullivan
Thursday, 14 August 2003 21:53:49 (W. Europe Daylight Time, UTC+02:00)
As I then answer my own question after getting Eclipse to work. Of course, it sees just its .jar files in the classpath to compile, not the .NET classes that it's actually running with.

The Eclipse compiler could conceivably be updated to find .NET classes and compile against those as well. Currently beyond my scope from the .NET perspective, but interesting possibility.

As a side note, I just got my customized Eclipse IDE with Cobol plugin working under IKVM. I had to copy ikvm.exe to java.exe and add an rt.jar under jre\lib for Eclipse to find IKVM properly as a JRE, but then it all worked fine.
Brian Sullivan
Name
E-mail
Home page

I apologize for the lameness of this, but the comment spam was driving me nuts. In order to be able to post a comment, you need to answer a simple question. Hopefully this question is easy enough not to annoy serious commenters, but hard enough to keep the spammers away.

Anti-Spam Question: What method on java.lang.System returns an object's original hashcode (i.e. the one that would be returned by java.lang.Object.hashCode() if it wasn't overridden)? (case is significant)

Answer:  
Comment (HTML not allowed)  

Live Comment Preview