# Saturday, March 20, 2004

Miguel posted a nice example on his blog of how to use Gtk# from Java using IKVM/Mono. In response, Pablo posted a question to the Mono list, and Jonathan Pryor replied to the IKVM and Mono lists with a nice explanation of how delegates are handled (quoted with permission, slightly edited):

From: Jonathan Pryor
Sent: Friday, March 19, 2004 02:35
To: Pablo Baena
Cc: Miguel de Icaza; mono-list@lists.ximian.com; ikvm-developers@lists.sourceforge.net
Subject: Re: [Mono-list] Java and C#


On Thu, 2004-03-18 at 16:07, Pablo Baena wrote:
> Miguel: I saw your blog about IKVM. One thing I haven't been able to 
> investigate is, how useful can be Gtk# with Java. Because, for example, I 
> couldn't find a clue on how to attach a Java 'listener' to a C# event, or any 
> way to use attributes in Java.

They really need to document this better...

However, grepping through the ikvm.zip file (from their website), we

// file: classpath/java/lang/VMRuntime.java
cli.System.AppDomain.get_CurrentDomain().add_ProcessExit (
  new cli.System.EventHandler (
    new cli.System.EventHandler.Method () {
      public void Invoke (Object sender, cli.System.EventArgs e) {

From this (and prior knowledge), we can draw the following statements:

1. Properties are actually functions with `get_' and `set_' prefixed to
them. Thus C# property System.AppDomain.CurrentDomain is the static
Java function cli.System.AppDomain.get_CurrentDomain().

2. Events are actually functions with `add_' and `remove_' prefixed to
their name. Thus C# event System.AppDomain.ProcessExit is the static
Java function cli.System.AppDomain.add_ProcessExit().

3. There is no equivalent to C# delegates in Java, so these are
translated into a class + interface pair. The EventHandler class is the
standard C# type name (cli.System.EventHandler), which takes as an
argument an interface to invoke, named "cli." + C# delegate type name +
".Method", hence cli.System.EventHandler.Method. The EventHandler.Method
interface has a function Invoke() which must be implemented, and this
method will be invoked when the event is signaled.

I suspect that there is no way to add attributes in Java. Microsoft's
Visual J# permits the use of Attributes (IIRC), but it's through their
Visual J++ syntax -- through a specially formed JavaDoc comment. 
Something like (from memory):

* @attribute-name (args...)
public void myMethod () {/* ... */}

Of course, that's compiler specific, and no standard Java compiler will
support that. So when it comes to attributes, you're probably up the

- Jon
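
The three points in Jonathan's mail can be tried out without IKVM installed. What follows is a minimal plain-Java sketch of the same shape: a property exposed as a get_ method, an event exposed as an add_/remove_ pair, and a delegate exposed as a class plus a nested Method interface. The types EventArgs, EventHandler and Domain (and the raiseProcessExit method) are hypothetical stand-ins written for this illustration; they are not the real cli.System classes.

```java
// Hypothetical stand-ins that mimic the shape described in the mail; these
// are NOT the real cli.System.* classes that IKVM exposes.
class EventArgs {}

// A delegate becomes a class plus a nested interface named Method.
final class EventHandler {
    interface Method {
        void Invoke(Object sender, EventArgs e);
    }

    private final Method target;

    EventHandler(Method target) { this.target = target; }

    void Invoke(Object sender, EventArgs e) { target.Invoke(sender, e); }
}

// A property becomes a get_ method, an event an add_/remove_ method pair.
final class Domain {
    private static final Domain CURRENT = new Domain();
    private final java.util.List<EventHandler> processExit =
        new java.util.ArrayList<EventHandler>();

    static Domain get_CurrentDomain() { return CURRENT; }

    void add_ProcessExit(EventHandler h) { processExit.add(h); }
    void remove_ProcessExit(EventHandler h) { processExit.remove(h); }

    // Signals the event; returns the number of handlers that were invoked.
    int raiseProcessExit() {
        java.util.List<EventHandler> snapshot =
            new java.util.ArrayList<EventHandler>(processExit);
        for (EventHandler h : snapshot) {
            h.Invoke(this, new EventArgs());
        }
        return snapshot.size();
    }
}

class DelegateSketch {
    public static void main(String[] args) {
        final StringBuilder log = new StringBuilder();
        EventHandler handler = new EventHandler(new EventHandler.Method() {
            public void Invoke(Object sender, EventArgs e) {
                log.append("process exit signaled");
            }
        });
        Domain.get_CurrentDomain().add_ProcessExit(handler);
        Domain.get_CurrentDomain().raiseProcessExit();
        System.out.println(log);
    }
}
```

The anonymous inner class plays the role that a method reference plays in a C# delegate, which is why the listener code ends up one level more nested than its C# counterpart.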

I replied saying that I believe the attribute construct in JDK 1.5 can probably be used to expose .NET attributes to Java (and to use them in Java code that is targeted to run on IKVM).
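
For reference, this is roughly what the JDK 1.5 construct looks like on the Java side. It is only a sketch of the language feature: the Marker annotation and the AnnotationSketch class are made-up names, and whether or how IKVM would map such an annotation to a .NET attribute is exactly the open question above.

```java
// JDK 1.5 annotation syntax. The Marker type is hypothetical; nothing here
// shows how (or whether) IKVM maps an annotation to a .NET attribute.
@java.lang.annotation.Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
@java.lang.annotation.Target(java.lang.annotation.ElementType.METHOD)
@interface Marker {
    String value();
}

class AnnotationSketch {
    @Marker("hello")
    public void myMethod() { /* ... */ }

    // Reads the annotation value back through reflection, or returns null
    // if the method or the annotation is missing.
    static String markerOf(String methodName) {
        try {
            java.lang.reflect.Method m = AnnotationSketch.class.getMethod(methodName);
            Marker marker = m.getAnnotation(Marker.class);
            return marker == null ? null : marker.value();
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(markerOf("myMethod"));
    }
}
```

Unlike the J# approach Jonathan mentions, this is part of the language rather than a specially formed comment, so any JDK 1.5 compiler accepts it.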

Saturday, March 20, 2004 2:51:08 PM (W. Europe Standard Time, UTC+01:00)
A Less Broken Snapshot?
In Tuesday's snapshot, ikvmc was completely broken. Sorry about that. The CoreClasses cache introduced an incorrect dependency between Object, Throwable and String. This caused Throwable or String to be loaded while it was being loaded and that resulted in an exception: System.ArgumentException: Item has already been added. Key in dictionary: "java.lang.Throwable" Key being added: "java.lang.Throwable"

Hopefully this snapshot will be a little better quality, but don't hold your breath, because the main change in this version is the addition of local variable liveness analysis to the verifier. This required some tricky code and made it clear to me (again) that the verifier desperately needs to be rewritten.

The trigger for the local variable liveness analysis was to be able to emit debugging information for local variables, but it also has the nice side effect of allowing slightly better code generation. Previously, if a local variable slot was shared between two different reference types, the .NET local would have the type of the common base type, even if the uses were in fact totally distinct. The compiler had to emit downcasts whenever it emitted a load from one of those locals. In classpath.dll there were 1288 such downcasts. With the new liveness information, it is now possible to split those Java locals into multiple .NET locals, so these downcasts are now gone.

Another optimization, which doesn't seem all that exciting, is the elimination of dead stores to local variables. In itself this is a fairly pointless optimization, because the CLR/Mono JIT will probably do it anyway. However, dead store elimination enables one very important optimization in exception handling. Whenever an exception handler discards the exception object and the IKVM bytecode compiler can detect this, it can skip the (expensive) stack trace capturing that is normally required. I had already hacked in some support to recognize these exception handlers (in classpath.dll there were 313 optimized exception handlers), but now it works much better (there are now 444 optimized exception handlers).
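
To make the two cases concrete, here is a small Java sketch of the source patterns involved. The class and method names are invented for the illustration, and which slots a particular Java compiler assigns is an assumption; the point is only the shape of the code.

```java
class LivenessSketch {
    // The caught exception is stored in a local but never read, so the store
    // is dead. This is the kind of handler that does not need a stack trace.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException ignored) {
            return fallback;
        }
    }

    // Here the exception object is actually used, so it cannot be discarded.
    static String parseOrMessage(String text) {
        try {
            return String.valueOf(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    // "s" and "b" have disjoint scopes, so a Java compiler may give them the
    // same local variable slot even though their types are unrelated.
    static int sharedSlot(boolean flag) {
        if (flag) {
            String s = "abc";
            return s.length();
        } else {
            StringBuffer b = new StringBuffer("abcd");
            return b.length();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("not a number", -1));
        System.out.println(parseOrMessage("not a number"));
        System.out.println(sharedSlot(true) + sharedSlot(false));
    }
}
```

Without liveness information, a shared slot like the one in sharedSlot would be typed as the common base type (here Object), which is what forced the downcasts on every load.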

What's new?

  • Decorated the various ByteCodeHelper methods with the [DebuggerStepThroughAttribute] attribute to make stepping through the source code in the debugger less disruptive.
  • Restored the signature decoding methods in ClassFile.cs that I removed in the previous version. I had failed to realise that they're different from the ones in ClassLoaderWrapper, because they deal with unloadable classes.
  • Fixed CoreClasses to decouple the different classes (accessing one no longer triggers loading the others).
  • Changed handling of package accessible final fields (they're no longer turned into a property).
  • Fixed System.setOut (copy & paste mistake, it tried to set "in").
  • The debugging information is now classified as Java/Text. Not sure if this affects anything, but it seemed like the right thing to do.
  • Fixed debugging line number information to make sure the first CIL instruction has a corresponding line number. Previously, Visual Studio .NET refused to step into an ikvmc compiled method.
  • Optimized dead stores to local variables and use new dead store information to optimize exception handling.
  • Local variables are now properly typed and have their names attached in debugging information (when ikvmc with the -debug option is used). NOTE: if the same variable name is reused in a method, the debugging information for those variables is not yet emitted correctly.
  • Fixed mapping of System.IntPtr to gnu.classpath.RawData. The mapping is now private to classpath.dll.
  • Changed JVM.CriticalFailure to always write to stderr on Unix instead of trying to display a message box.
  • Fixed race condition between returning from Thread.join and the thread being removed from the thread pool / marked as dead.
  • Fixed Thread.yield to not consume thread interrupted status. (Note that Thread.sleep(0) behaves as Thread.yield() and also does not consume the interrupted status).
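
The behaviour that the Thread.yield fix restores can be checked from Java with a few lines. This is a small sketch with an invented class name; it only exercises Thread.yield, since on a standard JVM Thread.sleep throws InterruptedException when the flag is already set.

```java
class YieldInterruptSketch {
    // Returns true if the interrupted status is still set after Thread.yield().
    static boolean interruptSurvivesYield() {
        Thread.currentThread().interrupt();   // set the flag on ourselves
        Thread.yield();                       // must not consume the flag
        // Thread.interrupted() reports the flag and clears it again, so the
        // calling thread is left in a clean state afterwards.
        return Thread.interrupted();
    }

    public static void main(String[] args) {
        System.out.println(interruptSurvivesYield());
    }
}
```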

New snapshots: just the binaries and source plus binaries.

Saturday, March 20, 2004 2:26:49 PM (W. Europe Standard Time, UTC+01:00)
# Tuesday, March 16, 2004
New Snapshot

Last week I said I'd go through a stabilization phase, but I couldn't resist the urge to implement some more stuff and fix various things. So this snapshot is a fairly big change again, but no major architectural overhaul like the previous one.

What's new?

  • Merged with current Classpath cvs.
  • Support for JDK 1.5 style class literals (only for class files with version 49 or greater).
  • Removed signature decoding from ClassFile.cs (I once thought that it should live there, instead of in ClassLoaderWrapper, but that turned out not to be a good idea).
  • Added CoreClasses.cs to cache a few of the frequently used TypeWrappers (Object, Class, String and Throwable).
  • Fixed volatile long/double handling to use the (new in .NET 1.1) Thread.VolatileRead/VolatileWrite methods.
  • Changed type used in ImplementsAttribute to the ghost wrapper for ghosts.
  • Changed method name mangling for interface implementation stubs (shorter name and now uses a slash to make sure it doesn't clash with any Java method names).
  • Added support for Finalize/finalize method overriding when mixing Java and non-Java classes in the class hierarchy. I don't like this solution very much. The code is ugly and complicated.
  • Added special support for finalize method for .NET types that extend Java types.
  • Fixed handling of synchronized static methods. Previously, .NET MethodImplOptions.Synchronized flag was simply set, but this was incorrect because that causes the method to synchronize on the .NET Type object, instead of the Java Class object.
  • Fixed handling of instance calls on value types.
  • Fixed System.currentTimeMillis implementation to use DateTime.UtcNow instead of Environment.TickCount, to prevent overflow.
  • Changed System.setErr/setIn/setOut to use TypeWrapper based reflection instead of .NET reflection.
  • Changed handling of resources to use .NET resources instead of global fields, this allows resources to work in multi-module assemblies.
  • Changed URL format for assembly embedded resources from opaque to parseable, to facilitate parsing them as a URI.
  • Added support for passing ghost references to methods in map.xml instructions.
  • Fixed a regression introduced in the previous snapshot, that caused exception mapping not to be invoked for catch(Throwable).
  • Limited fixes to get AWT working again (after Classpath AWT changes).
  • Declared String.equals and String.compareTo(Object) in map.xml to make reflection appearance identical to JDK.
  • Implemented JDK 1.4 String methods that rely on regular expressions (Classpath now has java.util.regex.* support, although not 100% compatible with the JDK).
  • Minor performance improvement in String.hashCode implementation. Oddly enough, this was achieved by doing the length check in the for condition, instead of manually hoisting it out of the loop. Apparently the CLR JIT recognizes this pattern and optimizes it better.
  • Fixed Thread.join to work with non-Java created threads as well.
  • Fixed removal of non-Java created threads from ThreadGroup.
  • Fixed ServerSocket.accept() timeout support.
  • Fixed ikvmc handling of -reference assemblies (to handle the Load vs LoadFrom context issues).
  • Various comment fixes.
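
The synchronized static fix in the list above is about which object gets locked. On the Java side the rule is that a synchronized static method locks the Class object of its declaring class, which the following sketch (with an invented class name) makes visible using Thread.holdsLock:

```java
class SyncSketch {
    // A synchronized static method holds the monitor of SyncSketch.class
    // for the duration of the call.
    static synchronized boolean locksClassObject() {
        return Thread.holdsLock(SyncSketch.class);
    }

    // The equivalent explicit form.
    static boolean explicitForm() {
        synchronized (SyncSketch.class) {
            return Thread.holdsLock(SyncSketch.class);
        }
    }

    // Outside of any synchronized region the monitor is not held.
    static boolean heldOutside() {
        return Thread.holdsLock(SyncSketch.class);
    }

    public static void main(String[] args) {
        System.out.println(locksClassObject());
        System.out.println(explicitForm());
        System.out.println(heldOutside());
    }
}
```

Code that mixes the two forms relies on them locking the same object, which is why synchronizing on the .NET Type object instead was incorrect.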

New snapshots: just the binaries and source plus binaries.

Tuesday, March 16, 2004 5:50:55 PM (W. Europe Standard Time, UTC+01:00)
# Saturday, March 13, 2004
JDK 1.5 beta

Yesterday I looked at the JDK 1.5 beta that Sun released recently. There appears not to be a complete list of changes to the VM yet and the only things I found were a few new modifier bits (that haven't yet stabilized) and the fact that class literals are finally supported in the VM. This is important for IKVM.NET, because it makes class literals in statically compiled code work better and more efficient.

For a quick refresher of how class literals are currently compiled, let's look at how the following class is compiled:

class ClassLiteral
{
  public static void main(String[] args) { System.out.println(String.class); }
}

Compiling this with Jikes 1.19 and then disassembling it (I've left out the default constructor that the compiler generated):

class ClassLiteral extends java/lang/Object

static java/lang/Class class$java$lang$String
// Unknown attribute : Synthetic//

public static main([Ljava/lang/String;)V
// attrib length: 53
// max stacks: 3
// max locals: 1
// code length: 25
0 getstatic <Field java/lang/System java/io/PrintStream out>
3 getstatic <Field ClassLiteral java/lang/Class class$java$lang$String>
6 dup
7 ifnonnull 21
10 pop
11 ldc "[Ljava.lang.String;"
13 iconst_0
14 invokestatic <Method ClassLiteral class$(Ljava/lang/String;Z)Ljava/lang/Class;>
17 dup
18 putstatic <Field ClassLiteral java/lang/Class class$java$lang$String>
21 invokevirtual <Method java/io/PrintStream println(Ljava/lang/Object;)V>
24 return

static class$(Ljava/lang/String;Z)Ljava/lang/Class;
// Unknown attribute : Synthetic
// attrib length: 55
// max stacks: 3
// max locals: 4
// code length: 23
0 aload_0
1 invokestatic <Method java/lang/Class forName(Ljava/lang/String;)Ljava/lang
4 iload_1
5 ifne 11
8 invokevirtual <Method java/lang/Class getComponentType()Ljava/lang/Class;>

11 areturn
12 new java/lang/NoClassDefFoundError
15 dup_x1
16 invokespecial <Method java/lang/NoClassDefFoundError <init>()V>
19 invokevirtual <Method java/lang/Throwable initCause(Ljava/lang/Throwable;)
22 athrow
Exception table:
start_pc = 0
end_pc = 12
handler_pc = 12
catch_type = java/lang/ClassNotFoundException

The amount of code generated is pretty bizarre. Note that this isn't Jikes' fault, there just isn't a way to do it better. Now, here is what it looks like compiled with javac from the 1.5 beta (specifying the -target 1.5 option):

class ClassLiteral extends java/lang/Object

public static main([Ljava/lang/String;)V
// attrib length: 38
// max stacks: 2
// max locals: 1
// code length: 10
0 getstatic <Field java/lang/System java/io/PrintStream out>
3 ldc_w java/lang/String
6 invokevirtual <Method java/io/PrintStream println(Ljava/lang/Object;)V>
9 return

This looks a lot better! No new bytecode instruction was added; instead, the ldc instruction was modified to allow referencing a CONSTANT_Class_info. When the VM encounters this, it loads the class and pushes the class object on the stack. I added support for this to IKVM.NET (not in cvs yet) in about 15 minutes. When JDK 1.1 was released (the first version to support class literals in the source), I wondered why they didn't add VM support at the same time, but fortunately they finally got around to it.


If you looked closely at the Jikes generated code, you may have noticed that Jikes actually loads the string array class ("[Ljava.lang.String;") instead of java.lang.String. Why? Because it correctly implements the JLS, which says that class literals should not cause a class to be initialized. Doing a Class.forName() initializes the class, but when you initialize an array class you don't initialize the component type class. So this is a clever trick. Javac doesn't do this, so it (incorrectly) causes the class to be initialized.
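
The difference is easy to observe with a class that records its own initialization. The sketch below uses invented names (InitLog, Lazy, ClassLiteralInitSketch) and assumes a compiler that emits the new ldc form for class literals, i.e. one targeting class file version 49 or later.

```java
// Records whether Lazy's static initializer has run.
class InitLog {
    static boolean lazyInitialized = false;
}

class Lazy {
    static { InitLog.lazyInitialized = true; }
}

class ClassLiteralInitSketch {
    // Computed once. Each element says whether Lazy was initialized
    // {after the class literal, after the array trick, after Class.forName}.
    static final boolean[] RESULTS = probe();

    private static boolean[] probe() {
        boolean[] r = new boolean[3];
        try {
            Class<?> literal = Lazy.class;          // must not initialize Lazy
            r[0] = InitLog.lazyInitialized;

            // The Jikes trick: load the array class, then ask for its
            // component type. The component class is not initialized.
            String arrayName = "[L" + literal.getName() + ";";
            Class<?> viaArray = Class.forName(arrayName).getComponentType();
            if (viaArray != literal) {
                throw new IllegalStateException("unexpected component type");
            }
            r[1] = InitLog.lazyInitialized;

            Class.forName(literal.getName());       // this does initialize Lazy
            r[2] = InitLog.lazyInitialized;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println("after class literal: " + RESULTS[0]);
        System.out.println("after array trick:   " + RESULTS[1]);
        System.out.println("after Class.forName: " + RESULTS[2]);
    }
}
```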


Why does this change help statically compiled code in IKVM.NET? Performance is a bit better, but that's not the most important difference. The real benefit shows up when you statically compile code into multiple assemblies. If one assembly references a class in another assembly via a class literal, you'd better be sure that the referenced assembly is already loaded in the AppDomain, otherwise the IKVM.NET runtime is unable to find the class. In the new (JDK 1.5) way of referencing class literals, the reference is no longer opaque to ikvmc, so it can now compile the construct in such a way that the class literal causes the appropriate assembly to be loaded by the .NET runtime when it is executed.


Something that struck me as funny is the new StringBuilder class that JDK 1.5 includes. It's almost identical to StringBuffer, except that it is not thread safe. If you look at the Rotor source code, you can see that the .NET StringBuilder also started life as StringBuffer. Now if the next version of .NET includes a thread safe version of StringBuilder and names it StringBuffer, we've come full circle ;-)

Saturday, March 13, 2004 6:38:45 PM (W. Europe Standard Time, UTC+01:00)
# Wednesday, March 10, 2004
To Invert Or Not To Invert

Stuart commented:
I'm not convinced that cli.System.Object should be visible to Java at all. AIUI, Java code will never see instances of cli.System.Object, because all such objects appear to inherit from java.lang.Object instead.

If cli.System.Object *is* visible to Java code, it introduces a paradox: java.lang.Object inherits from cli.System.Object (per the way it's actually implemented) but cli.System.Object should appear to inherit from java.lang.Object (per Java's rule that *everything* inherits from java.lang.Object). Now, it may be possible to create magic glue code that inverts the apparent inheritance relationship like that, but do you really want to go there? :)

The inversion is exactly what I was thinking about. Stuart's analysis above contains a crucial mistake: java.lang.Object does not inherit from cli.System.Object. However, it is virtually impossible not to get confused about this stuff, so let's try to make the discussion a little easier by defining a naming convention:

  • java.lang.Object
    This is the base class of all Java classes (as seen from the Java side of the world).
  • [classpath]java.lang.Object
    This is an implementation artifact of IKVM, it is a .NET type that is used as the base type for all non-remapped Java classes.
  • System.Object
    This is the base class of all .NET classes (as seen from the .NET side of the world).
  • cli.System.Object
    This is the IKVM manifestation of the System.Object type on the Java side of the world.

The paradox is that [classpath]java.lang.Object inherits from System.Object and cli.System.Object inherits from java.lang.Object, but hopefully it is now clear that this isn't a problem. (BTW, one of the definitions of a paradox is "A seemingly contradictory statement that may nonetheless be true").

There are actually two reasons why I would want to do this:

  1. If a Java class extends a .NET type (that was exported using netexp) you see both the virtual methods in java.lang.Object as well as the ones in System.Object that the class in question happens to override (a fairly arbitrary set). By introducing cli.System.Object as the penultimate base class for all .NET types, this can be made much more consistent. cli.System.Object would have final implementations for all the virtual methods in java.lang.Object (to make sure that the essentially non-existing methods don't get overridden) and it would introduce the real virtual methods of System.Object.

  2. If you want to define your own "first class" .NET exception class in Java, you need to extend cli.System.Exception. In other words, it makes for a more powerful programming model to expose the remapped types in this way.

Wednesday, March 10, 2004 10:15:16 AM (W. Europe Standard Time, UTC+01:00)
# Tuesday, March 9, 2004
Object Model Mapping Part 4

Yesterday I checked in a major change set that implements the new object model mapping infrastructure. Today I put the new snapshots online as well. The new implementation is about a thousand lines less code than the previous.

What's new

  • Many code changes to implement the new model.
  • When compiling classpath.dll, ikvmc now requires the -remap:map.xml option. This is the only time the mapping information is read from the XML. When code actually runs, or when other classes are compiled, the remapping information is read from custom attributes in classpath.dll.
  • Tracing infrastructure. Interesting points in the runtime now contain trace calls that can be enabled with a command line switch (or app.config setting). In addition, when Java code is compiled it can optionally be instrumented so that each method called writes its name and signature to the tracer. This has a big performance impact (it will be optimized a little bit in the future, but don't expect too much), so it is not enabled for classpath.dll, by default.
  • classpath.dll now contains the remapped types (java.lang.Object, java.lang.Throwable, java.lang.String and java.lang.Comparable). This means that if you want to create a Java like class in C# you can now extend java.lang.Object. Note however that you should never define your references as java.lang.Object, use System.Object instead. If you want to call a java.lang.Object instance method on a System.Object reference, use the corresponding static instancehelper method on java.lang.Object.


From the Java side of the fence, finalization continues to work as it always has, but when C# code is subclassing Java code, you should use the C# destructor if you need finalization. If you override the finalize method, you run the risk that it isn't called (it only gets called if one of your Java base classes actually overrides it). The C# destructor does the right thing. If you use another .NET language, you have to override Finalize and make sure that you call the base class Finalize. More complicated mixed scenarios (e.g. Java code subclassing C# code that subclasses Java code) are not supported at the moment (wrt finalization, other aspects should work fine).

What's next?

It's not quite done yet, but I'll be going through a stabilization phase before making any more changes. I have some ideas for changes to the way the remapped .NET types appear on the Java side (e.g. should it be possible to extend cli.System.Object in Java?). There are also some optimizations that can be done and there still remains some restructuring to be done.


I've tested this snapshot pretty well, but considering the scale of the changes, I expect some regressions. Bug reports are appreciated (as always).

New snapshots: just the binaries and source plus binaries.


Next month I'm speaking again at the rOOts conference in Bergen, Norway, where I had a very good time last year. Come and say hi if you're there. Also, I'm happy to be speaking again at the excellent (and fun) Colorado Software Summit in Keystone, Colorado in October.

Tuesday, March 9, 2004 10:42:12 AM (W. Europe Standard Time, UTC+01:00)
# Wednesday, March 3, 2004
Object Model Mapping Part 3

A brief update on the progress and some random thoughts on remapping. Today I got classpath.dll to build (and verify) for the first time using the new remapping infrastructure. Finally some progress. However, there is still some more work to do before it runs again.

Part of the new model is that there are now two different type wrappers (the internal class that contains the metadata of an IKVM type) for the remapped types. There is one type wrapper that is used during static compilation (RemapperTypeWrapper) and another one that is used at runtime (CompiledTypeWrapper, also used for normal (i.e. non-remapped) statically compiled classes). The advantage of this is that there is less overhead at runtime and the code is also a bit less complex. This is also the final step in an invisible process that has been going on for a long time. It is now no longer possible to run IKVM without a statically compiled classpath.dll (it hasn't been possible for a while, but theoretically it could have been made to work). When I just got started, there was no static compiler yet and the only way for it to work was to load all classes dynamically. After the static compiler started to work, support for dynamically loading the core classes began to degenerate. That degeneration is now final and there is no way back ;-)

What's next?

More metadata needs to be emitted on the remapped types and CompiledTypeWrapper needs to be changed to understand it. The code needs to be cleaned up. I'm not sure yet, but I think a lot of complexity can be removed now. Virtual method invocation needs to be optimized. At the moment all virtual method calls to remapped types go through the helper methods, but this is only needed if the reference type is not known to be a Java subclass. For example, calling java.lang.Object.toString() on a java.lang.Object reference requires the call to go through the helper method, but calling java.lang.Object.toString() on java.lang.Math (this is just a random example) doesn't need to go through the helper method.

Of course, there are also various loose ends that need to be tied up, but I think I'm on track to have a working version sometime next week.


Patrik posted an overview of the FOSDEM Java talks. It also includes links to the slides for some of the talks.

Wednesday, March 3, 2004 8:59:25 PM (W. Europe Standard Time, UTC+01:00)
# Monday, March 1, 2004
Object Model Mapping Part 2

In yesterday's entry I didn't get to the stuff that kept me going in circles last week. I had decided on the mixed model a few months ago, but as usual the devil is in the details.

Initially I wanted to keep the map.xml format more or less the same and I think that put me on the wrong track. Let's start by taking a look at some of the current map.xml features. Within the <class> tag there are <constructor> and <method> tags. These can contain several different child tags:

  • Empty (i.e. no child tags)
    The method is identical to the corresponding method in the underlying type. Example: The constructor of java.lang.Object is identical to the constructor of System.Object, so the tag looks like this:
    <constructor sig="()V" modifiers="public" />
  • <redirect>
    The method is redirected to another method. This can be a static method in a helper class or a static or instance method in the underlying type. Example: java.lang.Object.notifyAll() is redirected to System.Threading.Monitor.PulseAll():
    <method name="notifyAll" sig="()V" modifiers="public final">
        <redirect class="System.Threading.Monitor, mscorlib" name="PulseAll" sig="(Ljava.lang.Object;)V" type="static" />
    </method>
  • <invokespecial>
    If the method is invoked using the invokespecial bytecode, this CIL sequence is emitted. Example: java.lang.Object.wait() is implemented by calling System.Threading.Monitor.Wait(), but Monitor.Wait returns a boolean that has to be discarded:
    <method name="wait" sig="()V" modifiers="public final">
        <invokespecial>
            <call type="System.Threading.Monitor, mscorlib" name="Wait" sig="(Ljava.lang.Object;)Z" />
            <pop />
        </invokespecial>
    </method>
  • <invokevirtual>
    Similar to <invokespecial>, but this defines the CIL that is emitted when the invokevirtual bytecode is used to call the method.
  • <override>
    Specifies that this method conceptually overrides the method named in the <override> tag. I say "conceptually" because in the equivalence model there is no real class. However, if a real subclass would override this method, it would actually be overriding the method named in the override tag. Example: java.lang.Object.hashCode() overrides System.Object.GetHashCode:
    <method name="hashCode" sig="()I" modifiers="public">
        <override name="GetHashCode" />
        <invokevirtual>
            <dup />
            <isinst type="System.String, mscorlib" />
            <brfalse name="skip" />
            <castclass type="System.String, mscorlib" />
            <call class="java.lang.StringHelper" name="hashCode" sig="(Ljava.lang.String;)I" />
            <br name="end" />
            <label name="skip" />
            <callvirt type="System.Object, mscorlib" name="GetHashCode" />
            <label name="end" />
        </invokevirtual>
    </method>
  • <newobj>
    Used in constructors to define the CIL that is emitted when a new instance is created. Example: java.lang.String has a default constructor, but System.String doesn't:
    <constructor sig="()V" modifiers="public">
        <newobj>
            <ldstr value="" />
            <call type="System.String, mscorlib" name="Copy" />
        </newobj>
    </constructor>

The thing to note is that some of the remapping implications are still handled manually in this scheme. For example, the <invokevirtual> of Object.hashCode has to check for string instances. This information can be derived from the remapping file and it shouldn't be necessary to do this explicitly.

I didn't really like the <invokespecial> and <invokevirtual> constructs and I explored the idea of only having a <body> tag that contains the method body. However, it soon became clear that that wouldn't be enough. For example, the implementation of java.lang.Object.clone needs to call the protected method System.Object.MemberwiseClone and this is only allowed in subclasses. So it wouldn't be possible to generate a verifiable helper method for that.

The obvious solution (in hindsight) came to me when I realised that there are actually two types of "subclasses" of java.lang.Object, the ones that really extend java.lang.Object (our generated .NET type) and the ones that don't (e.g. arrays, System.String and all other .NET types). I knew this before, of course, but I was trying to make the model too general. After this realisation, it became obvious that every method should have a <body> and an <alternateBody> (tentative names).

After I've modified the remapping engine to automatically handle all the overridden method implications, the <alternateBody> construct will not be needed for many methods. I think only for Object.clone and Object.finalize and both will be trivial. The <alternateBody> for clone will throw a CloneNotSupportedException (it could also check if the object implements ICloneable and if so, invoke that Clone method, but does this really help?) and the <alternateBody> for finalize will simply be empty, since there is no good reason to ever explicitly invoke the Finalize method of a .NET type.
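
The split between the two bodies can be sketched in plain Java. This is only an illustration of the dispatch idea under simplifying assumptions: JavaObject stands in for the generated .NET base type, cloneBody for the body, and cloneHelper for the static helper that contains the alternate body. All of these names, and canClone, are hypothetical.

```java
class CloneRemapSketch {
    // Stand-in for the generated base type that "real" Java classes extend.
    static class JavaObject {
        // Plays the role of <body>: the real implementation, which may call
        // the protected clone method because it lives in a subclass.
        protected Object cloneBody() throws CloneNotSupportedException {
            return super.clone();
        }
    }

    static final class Point extends JavaObject implements Cloneable {
        final int x;
        Point(int x) { this.x = x; }
    }

    // Static helper used when the reference is not known to be a JavaObject.
    static Object cloneHelper(Object o) throws CloneNotSupportedException {
        if (o instanceof JavaObject) {
            return ((JavaObject) o).cloneBody();
        }
        // Plays the role of <alternateBody>: the object does not really
        // extend the generated base type, so cloning is refused.
        throw new CloneNotSupportedException(o.getClass().getName());
    }

    // Convenience wrapper without a checked exception.
    static boolean canClone(Object o) {
        try {
            cloneHelper(o);
            return true;
        } catch (CloneNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Point copy = (Point) cloneHelper(new Point(5));
        System.out.println(copy.x);
        System.out.println(canClone("a string"));
    }
}
```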

As an aside, I'm also going to remove the <redirect> construct, because it doesn't really add any value. It's just as easy to have a <body> with a <call>.

I'm not clear on the performance implications of these changes. In the existing model, many of the remapping constructs are inlined, but in the new model they won't be: invokespecial will end up calling the real method in the new classes and invokevirtual will call the static helper method. This will probably be slightly slower, but I think the other advantages easily outweigh this.

Another advantage of this scheme that I didn't yet mention is that reflection on remapped methods is now trivial. Currently, the following program doesn't work on IKVM. In the new model the call would simply end up at the static helper method for clone:

class ReflectClone
{
  public static void main(String[] args) throws Exception
  {
    java.lang.reflect.Method m;
    m = Object.class.getDeclaredMethod("clone", new Class[0]);
    System.out.println(m.invoke(args, new Object[0]));
  }
}

BTW, I originally tried this by getting the public clone method on the array class, but oddly enough on the Sun JVM array types don't appear to have a public clone method (even though you can call it just fine!).
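
The oddity with array types is easy to reproduce. The sketch below (class and method names invented) asks reflection for a public clone method on an array class and also calls clone directly. The Class.getMethod documentation in later JDK releases states explicitly that clone() is not found for array types.

```java
class ArrayCloneSketch {
    // True if reflection reports a public clone() method on the given class.
    static boolean hasPublicClone(Class<?> type) {
        try {
            type.getMethod("clone");
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] copy = args.clone();   // compiles and runs fine
        System.out.println(copy.length == args.length);
        System.out.println(hasPublicClone(String[].class));
    }
}
```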

Monday, March 1, 2004 10:59:33 AM (W. Europe Standard Time, UTC+01:00)
# Sunday, February 29, 2004
Object Model Mapping

After suffering from coder's block (if that's the programmer's equivalent of writer's block) for weeks, I finally got started the past week on the new object model remapping infrastructure. I spent most of the week going in circles. Writing code, deleting it, writing more code, deleting it again. However, I think I figured it out now. To test that theory I decided to write a blog entry about it. Explaining something is usually a good way to find holes in your own understanding.

I'm going to start out by describing the problem. Then I'll look at the existing solution and its limitations. Finally I'll explain the new approach I came up with. If everything goes well, by the end of the entry you'll be wondering "why did it take him so long to come up with this, it's obvious!". That means I came up with the right solution and I explained it well.

Note that I'm ignoring ghost interfaces in this entry. The current model will stay the same. For details, see the previous entries on ghost interfaces.

What are the goals?

  • We need a way to have the Java object model fit into the .NET model
  • We would like to enable full interoperability between the two models
  • Performance should be acceptable
  • Implementation shouldn't be overly complex

The .NET model (relevant types only):

For comparison, here is the Java model that we want to map to the .NET model:

There are several possible ways to go (I made up some random names):

  • Equivalence
    This is the current model. The Java classes are mapped to the equivalent .NET types. This works because System.Object has mostly the same virtual methods as java.lang.Object. For the mapping of java.lang.Throwable to System.Exception, a little more work is needed, and that is where the java.lang.Throwable$VirtualMethods interface comes in. When a virtual method on Throwable is called through a System.Exception reference, the compiler calls a static helper method in Throwable$VirtualMethodsHelper that checks if the passed object implements the Throwable$VirtualMethods interface. If so, it calls the interface method; if not, it calls the Throwable implementation of the method (i.e. it considers the method not overridden by a subclass). A downside of using an interface for this is that all interface methods must be public. At the moment this isn't a problem because all virtual methods in Throwable (except for clone and finalize, derived from Object) are public, but it could become a problem later on.
  • Extension
    This is a fairly straightforward approach where java.lang.Object extends System.Object and the Java array, String and Throwable classes are simply subclasses of java.lang.Object. It is easy to implement. The obvious downsides are that arrays will be slow (extra indirection), Strings need to be wrapped/unwrapped when doing interop with .NET code and Throwable is not a subclass of System.Exception (the CLI supports this, but once again not a good idea for interop).
  • Wrapping
    I apologise in advance, because I probably can't explain this one very well (it doesn't make any sense to me), even though many people have suggested it. In this model java.lang.Object extends System.Object, but arrays, String and Throwable do not extend java.lang.Object. Instead, whenever an instance of one of those types is assigned to a java.lang.Object reference, it is wrapped in an instance of a special java.lang.Object wrapper class. The downside of this model is that wrapping and unwrapping are expensive and, which is why I don't like this approach at all, the expense is paid in ways that are very unexpected to the Java programmer (who expects simple assignment to be expensive?).
  • Mixed
    This is the new model. Explanation follows below.
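
To make the helper dispatch of the equivalence model a bit more concrete, here is a minimal sketch in plain Java. It is not IKVM's actual code: DotNetException stands in for System.Exception, and ThrowableVirtualMethods and ThrowableVirtualMethodsHelper are hypothetical stand-ins for Throwable$VirtualMethods and Throwable$VirtualMethodsHelper. The default message string is likewise made up for illustration.

```java
// Stand-in for System.Exception: it knows nothing about Java's virtual methods.
class DotNetException { }

// Stand-in for Throwable$VirtualMethods. Interface methods are implicitly public,
// which is exactly the limitation mentioned above.
interface ThrowableVirtualMethods {
    String getMessage();
}

// A "Java" exception class that overrides the virtual method.
class JavaThrowable extends DotNetException implements ThrowableVirtualMethods {
    @Override
    public String getMessage() {
        return "message from Java subclass";
    }
}

// Stand-in for Throwable$VirtualMethodsHelper.
final class ThrowableVirtualMethodsHelper {
    static String getMessage(DotNetException e) {
        if (e instanceof ThrowableVirtualMethods) {
            // The object is a Java class, so dispatch through the interface.
            return ((ThrowableVirtualMethods) e).getMessage();
        }
        // Not a Java class: treat the method as not overridden.
        return "default Throwable.getMessage()";
    }
}

class EquivalenceDemo {
    public static void main(String[] args) {
        // prints "message from Java subclass"
        System.out.println(ThrowableVirtualMethodsHelper.getMessage(new JavaThrowable()));
        // prints "default Throwable.getMessage()"
        System.out.println(ThrowableVirtualMethodsHelper.getMessage(new DotNetException()));
    }
}
```

Every call through a System.Exception reference pays for the type test, and every overridable method needs an entry in the interface.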

What's wrong with equivalence?

Both J# and the current version of IKVM use equivalence (although many of the details differ and J# doesn't consider Throwable and System.Exception to be equivalent) and it works well. So why change it? There are four advantages to the mixed model:

  • Interop works better
    In the current model, if you subclass a Java class from C# and want to override Object.equals, you need to override either Equals or equals, depending on whether any of the base classes overrides Object.equals. And if you want to call Throwable.getStackTrace() from C# on a reference of type System.Exception, there is no obvious way to do that.
  • More efficient representation of remapping model
    Currently every subclass of java.lang.Object overrides toString() to call ObjectHelper.toStringSpecial, which is needless code bloat. More importantly, before any classes can be resolved, the map.xml remapping information needs to be parsed and interpreted (to know about java.lang.Object, java.lang.String and java.lang.Throwable). In the new model, java.lang.Object, java.lang.String and java.lang.Throwable will be in classpath.dll, so they can be resolved on demand. The classes will be decorated with custom attributes to explain to the runtime that they are special, but no other metadata will need to be parsed or interpreted.
  • Cleaner model
    java.lang.ObjectHelper and java.lang.StringHelper no longer need to be public and the various $VirtualMethods helper types aren't needed anymore.
  • Easier to get right
    There are a few subtle bugs in the current implementation. Try the following for example:
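    class ThrowableToString {
      public static void main(String[] args) throws Exception {
        String s = "cli.System.NotImplementedException";
        Object o = Class.forName(s).newInstance();
        Throwable t = (Throwable)o;
        System.out.println(o);  // toString() through an Object reference
        System.out.println(t);  // toString() through a Throwable reference
      }
    }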
    It prints out:
    System.NotImplementedException: The method or operation is not implemented.
    cli.System.NotImplementedException: The method or operation is not implemented.

    Obviously, both lines should be the same. Another (at the moment theoretical) problem is that it is legal for code in the java.lang package to call Object.clone or Object.finalize (both methods are protected, but in Java protected also implies package access). Currently that wouldn't work.

Here is the mixed model I ended up with:

I called it mixed because it combines some features of equivalence and extension. For example, references of type java.lang.Object are still compiled as System.Object (like in the equivalence model), but the non-remapped Java classes extend java.lang.Object (like in the extension model).

java.lang.Object will contain all methods of the real java.lang.Object and, in addition, a set of static helper methods that allow you to call java.lang.Object instance methods on System.Object references. The helper methods will test whether the passed object is either a java.lang.Object or a java.lang.Throwable (for virtual methods). If so, they will downcast and call the appropriate method on that class; if not, they will perform an alternative action (specified in map.xml when classpath.dll was compiled).
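
A rough sketch of what such a static helper could look like, written in plain Java. All names here are assumptions for illustration: Java's own Object plays the role of System.Object, JavaObject plays the role of the new java.lang.Object, JavaThrowable plays the role of java.lang.Throwable (which extends the .NET exception type rather than java.lang.Object), and toStringHelper is a made-up helper name. The fallback used for foreign objects is also an assumption; in the real thing it would come from map.xml.

```java
// Stand-in for the new java.lang.Object. Java's built-in Object models System.Object.
class JavaObject {
    // Instance (virtual) method, as declared on the real java.lang.Object.
    String javaToString() {
        return "JavaObject";
    }

    // Static helper: lets callers invoke the instance method on a reference
    // that is only known to be a System.Object.
    static String toStringHelper(Object o) {
        if (o instanceof JavaObject) {
            return ((JavaObject) o).javaToString();     // downcast, virtual call
        }
        if (o instanceof JavaThrowable) {
            return ((JavaThrowable) o).javaToString();  // Throwable is tested separately
        }
        // Alternative action for objects that are not Java objects (assumed here
        // to be the host toString(); the real action is specified in map.xml).
        return String.valueOf(o);
    }
}

// Stand-in for java.lang.Throwable: extends the host exception type, not JavaObject.
class JavaThrowable extends Exception {
    String javaToString() {
        return "JavaThrowable";
    }
}

// An ordinary, non-remapped Java class that extends the new java.lang.Object.
class Point extends JavaObject {
    @Override
    String javaToString() {
        return "Point(1, 2)";
    }
}

class MixedModelDemo {
    public static void main(String[] args) {
        System.out.println(JavaObject.toStringHelper(new Point()));         // prints "Point(1, 2)"
        System.out.println(JavaObject.toStringHelper(new JavaThrowable())); // prints "JavaThrowable"
        System.out.println(JavaObject.toStringHelper("plain"));             // prints "plain"
    }
}
```

The point of the sketch is that the type tests live in one place, so individual Java classes no longer need per-class forwarding overrides.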

Object.finalize requires some special treatment. We don't want java.lang.Object.finalize to override System.Object.Finalize, because that would cause all Java objects to end up on the finalizer queue, which is very inefficient. So the compiler will contain a rule to override System.Object.Finalize only when a Java class overrides java.lang.Object.finalize.
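
The decision behind that rule can be illustrated with a small reflection-based check in Java. This is only a sketch of the idea (does this particular class declare its own finalize()?), not how the IKVM compiler implements it; FinalizeRule, declaresFinalizer and the two sample classes are hypothetical names.

```java
final class FinalizeRule {
    // Returns true if cls itself declares a no-argument void finalize() method,
    // i.e. the case where a Finalize override would have to be emitted.
    static boolean declaresFinalizer(Class<?> cls) {
        try {
            return cls.getDeclaredMethod("finalize").getReturnType() == void.class;
        } catch (NoSuchMethodException e) {
            // No finalize() declared here: the object can stay off the finalizer queue.
            return false;
        }
    }
}

// A class without a finalizer.
class Plain { }

// A class that overrides finalize().
class WithCleanup {
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        // cleanup would go here
    }
}

class FinalizeRuleDemo {
    public static void main(String[] args) {
        System.out.println(FinalizeRule.declaresFinalizer(Plain.class));       // prints "false"
        System.out.println(FinalizeRule.declaresFinalizer(WithCleanup.class)); // prints "true"
    }
}
```

Subclasses of WithCleanup would inherit the emitted override, so the check only needs to look at the class being compiled.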

I glossed over a lot of details, but those will have to wait for next time.


Finally, a short note on FOSDEM (Free and Open source Software Developers' European Meeting). Last weekend I visited FOSDEM in Brussels. I enjoyed seeing Dalibor, Chris, Mark, Sascha and Patrik again, and I also enjoyed meeting gcj hackers Tom Tromey and Andrew Haley for the first time. Mark wrote up a nice report about it. If you haven't read it yet, go read it now. All in all a very good and productive get-together.

Sunday, February 29, 2004 3:43:01 PM (W. Europe Standard Time, UTC+01:00)
# Wednesday, February 18, 2004

Stuart pointed out the F.A.Q. was out of date, so I updated it a little bit. He also asked:

Speaking of which, I noticed while perusing the FAQ that the JIT compiler is included in IK.VM.NET.dll which means it's required for running even statically-compiled code. For apps that don't use clever classloading tricks, the JIT isn't needed at all when everything's been statically compiled. Would it be possible to separate the JIT out into a different DLL to reduce the necessary dependencies for a statically-compiled Java app?

Sure, the 275K of IK.VM.NET.dll is miniscule compared to the 3Mb of classpath.dll, but it's the principle of the thing ;)

This is definitely something I want to do. In fact, I would also like to have the option to only include parts of the Classpath code when statically compiling a library. So instead of having a dependency on classpath.dll, you'd suck in only the required classes.

Wednesday, February 18, 2004 9:36:11 AM (W. Europe Standard Time, UTC+01:00)