# Monday, 29 March 2004
Backward Branch Constraints

The last snapshot was totally broken on Mono (*blush*). My apologies to the Mono users. I should test my snapshots on Mono before releasing them, but I'm lazy so this sometimes slips through the cracks.

The breakage was kind of interesting though. As I wrote last time, I rewrote the bytecode compiler to emit CIL in the same order as the Java bytecode. This caused invalid (per the ECMA spec) CIL to be generated in some cases. The interesting thing is that the code isn't really invalid, the only reason it is invalid is because the spec says so:

Partition III -- 1.7.5 Backward Branch Constraints

It must be possible, with a single forward-pass through the CIL instruction stream for any method, to infer the exact state of the evaluation stack at every instruction (where by “state” we mean the number and type of each item on the evaluation stack).

In particular, if that single-pass analysis arrives at an instruction, call it location X, that immediately follows an unconditional branch, and where X is not the target of an earlier branch instruction, then the state of the evaluation stack at X, clearly, cannot be derived from existing information. In this case, the CLI demands that the evaluation stack at X be empty.

(Note that the section numbering seems to change with each version; this is from the working draft of June 2003.)

Java bytecode has no such requirement, so my straightforward translation caused Java bytecode that is only reachable through a backward branch to be translated into invalid CIL.
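To make the constraint concrete, here is a toy sketch of a single forward pass over a made-up instruction set. Everything in it (the `ForwardPassCheck` class, the four opcodes, and the fact that it tracks only stack depth rather than types) is an assumption for illustration; it is not the ECMA verification algorithm and not IKVM's code. It only shows why code that follows an unconditional branch, and is reached solely by a later backward branch, has to start with an empty stack.

```java
import java.util.Arrays;

public class ForwardPassCheck {
    enum Op { PUSH, POP, GOTO, RET }

    static final class Insn {
        final Op op;
        final int target; // only meaningful for GOTO
        Insn(Op op, int target) { this.op = op; this.target = target; }
    }

    static Insn push() { return new Insn(Op.PUSH, -1); }
    static Insn pop() { return new Insn(Op.POP, -1); }
    static Insn jump(int target) { return new Insn(Op.GOTO, target); }
    static Insn ret() { return new Insn(Op.RET, -1); }

    // Returns true if the stack depth at every instruction can be inferred
    // in one forward pass, using the "assume empty" rule for instructions
    // that follow an unconditional branch and have no earlier branch to them.
    static boolean isInferable(Insn[] code) {
        int[] entry = new int[code.length]; // depth on entry, -1 = not known yet
        Arrays.fill(entry, -1);
        int depth = 0;
        boolean fallsThrough = true; // does the previous instruction fall into this one?
        for (int i = 0; i < code.length; i++) {
            if (fallsThrough) {
                // An earlier forward branch to i must agree with the fall-through state.
                if (entry[i] != -1 && entry[i] != depth) return false;
            } else if (entry[i] != -1) {
                depth = entry[i]; // state recorded by an earlier forward branch
            } else {
                depth = 0;        // nothing known: the stack is required to be empty
            }
            entry[i] = depth;
            fallsThrough = true;
            Insn insn = code[i];
            switch (insn.op) {
                case PUSH:
                    depth++;
                    break;
                case POP:
                    if (depth == 0) return false; // underflow
                    depth--;
                    break;
                case GOTO:
                    if (insn.target < 0 || insn.target >= code.length) return false;
                    if (entry[insn.target] == -1) {
                        entry[insn.target] = depth; // forward branch: record the state
                    } else if (entry[insn.target] != depth) {
                        return false;               // backward branch disagrees with the pass
                    }
                    fallsThrough = false;
                    break;
                case RET:
                    fallsThrough = false;
                    break;
            }
        }
        return true;
    }

    // Instruction 2 is only reachable from the backward branch at 4,
    // which arrives with one item on the stack.
    static Insn[] reachedWithItemOnStack() {
        return new Insn[] { push(), jump(4), pop(), ret(), jump(2) };
    }

    // Same shape, but the backward branch arrives with an empty stack.
    static Insn[] reachedWithEmptyStack() {
        return new Insn[] { jump(4), push(), pop(), ret(), jump(1) };
    }
}
```

In the first sample the pass reaches instruction 2 with nothing known about it, assumes an empty stack, and then fails on the `POP`; the second sample has the same control flow but passes because the assumption happens to be true.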

The Microsoft verifier and JIT don't require this constraint to be met and they will happily verify and JIT code that violates this constraint. However, the Mono JIT relies on it and so it was unable to handle some of the CIL that IKVM generated.

I fixed the IKVM bytecode compiler to emit ECMA compliant code (at least for this particular issue, who knows what else is wrong). I also fixed exception mapping, which didn't work on Mono as it relied on a currently unimplemented feature (I think, I didn't really investigate).

I think that, realistically, the Mono JIT will also have to be "fixed" to support the broken code, as you can be sure that there will be compilers that emit broken code, because it works on the Microsoft runtime.

New snapshots: just the binaries and source plus binaries.

Monday, 29 March 2004 12:56:28 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
# Friday, 26 March 2004
Debugging Support

I finished and fixed the local variable analysis and made various improvements to the debugging experience (in Visual Studio .NET). In the process I discovered some quirks in jikes' debugging tables. It associates two line numbers with the same bytecode address when that bytecode address is also the start of an exception block. It also (I think incorrectly) starts a local variable scope before the store instruction that first initializes that local variable. Javac starts the local variable scope immediately after the first store to that local, this makes more sense to me, but also required a hack in the local variable debugging support to make sure that the IL variable scope starts at the right position (i.e. before the first store to that local variable).

To make local variable scopes work, I had to change the bytecode compiler to emit IL in the same order as the original Java bytecode. For some reason I originally wrote the compiler to do the same code flow analysis as the verifier and this resulted in a somewhat arbitrary order of the generated IL. I finally fixed that and I also removed the recursion from the compiler (it now uses an explicit stack to keep track of exception blocks).

Note that unreachable code is still not compiled (it can't be verified, so it can't be compiled), but there will be nop IL instructions for each unreachable bytecode instruction, so you can set breakpoints and move the instruction pointer there. This could be a bit confusing and I probably should fix it at some point (by not emitting the nops).

I also had to reintroduce the automatic downcasting of locals, because I realised that it is possible to write bytecode by hand that depends on this (Java source will never require it). In non-debug builds of classpath.dll this introduces 12 (unnecessary) downcasts, so that's not really worth optimising.

What's new?

  • When compiling with -debug option, dead stores are no longer optimized out.
  • When compiling with -debug option, local variables are merged based on LocalVariableTable information.
  • When compiling with -debug option, the first instruction of a method will always (except when compiling with -Xmethodtrace) have an associated line number now (to enable stepping into the method).
  • When compiling with -debug option, local variables are now scoped based on LocalVariableTable information. Different locals with the same name are now handled correctly in the debugger.
  • Made helper method MethodInfo caching more consistent.
  • Method arguments that are value types are now boxed on method entry and not on each individual load; this makes them behave consistently with local variables.
  • Reusing method argument local variable slots (with different types) is now fully supported. I've never seen a Java compiler do this, but it is legal.
  • lookupswitch/tableswitch branches into exception blocks are now handled correctly (exception block is split around branch target). This completes branch/exception block handling. I originally thought that this would never happen, but apparently it does.
  • Added a helper method to emit Ldc_I4 that always uses the optimal encoding.
  • Added a class loader check to the System.arraycopy optimization check, to make sure that we're in fact calling System.arraycopy on the bootstrap version of java.lang.System.
  • Many fixes to the local variable analysis.
  • Added -srcpath option to ikvmc. If specified, this prepends the specified path and the package name to the source file name in the debugging information. In most cases this will be the correct location for the source file and will allow the debugger to automatically load the correct source.
  • Added AWT peer for Window.
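
The path construction described for the -srcpath option in the list above can be sketched roughly as follows. This is only an illustration of the described behaviour, not ikvmc's code: the `SrcPathSketch` class and `sourcePath` method are made-up names, and using `/` as the separator is an assumption.

```java
public class SrcPathSketch {
    // Prepends the -srcpath value and the package directories to the
    // source file name recorded in the class file.
    static String sourcePath(String srcPath, String className, String sourceFile) {
        int lastDot = className.lastIndexOf('.');
        // Classes in the default package have no package directory.
        String packageDir = lastDot < 0
                ? ""
                : className.substring(0, lastDot).replace('.', '/') + "/";
        String root = srcPath.endsWith("/") ? srcPath : srcPath + "/";
        return root + packageDir + sourceFile;
    }
}
```

For example, `sourcePath("/src", "java.lang.String", "String.java")` yields `/src/java/lang/String.java`.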

New snapshots: just the binaries and source plus binaries.

Friday, 26 March 2004 11:30:36 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
# Thursday, 25 March 2004
Q & A

In the comments on the previous entry, Stuart asks:

Given an instance of System.Type, how do I determine what the typename is in Java?

This is tricky and my (lame) answer is that you shouldn't have to. Do you have a specific scenario in mind? Having said that, in your comment you're on the right track, there is no one-to-one mapping between System.Type instances and java.lang.Class instances. What I think you're after is the fact that you'll never encounter cli.System.Object or cli.System.Exception as field types or in method signatures. The only way to get the class object for these types, is by calling getClass() on an instance or by calling Class.getSuperclass() on a subclass of them.

Note that the above behaviour isn't actually implemented yet. At the moment cli.System.Object and cli.System.Exception are simply helper classes with static methods and you'll never encounter them as field types, in method signatures or as instance types.

As to which one to use from Java, you're right when you say java.lang.* is the answer in most cases. The only time you want to use cli.System.Object or cli.System.Exception is when you subclass them from Java to make your Java class look more .NET-like to .NET consumers.

What would be really nice would be if there were something in the IK.VM.NET.dll that would let me answer this question authoritatively with a simple call...

There is the NativeCode.java.lang.VMClass.getClassFromType() method that is used by the runtime internally (it's public, so you could call it), but I can't really guarantee that it'll stay around or behave consistently over time. At some point in the future there'll probably be a utility class in the ikvm.lang package that will contain conversion methods to go from System.Type to java.lang.Class and vice versa (but it probably will require you to specify the context for the mapping).

Oh, one other thing that occurred to me would be a nice feature: if a class implemented in Java could put itself into the cli.* package and thereby make itself look to other Java code as if it were a .NET type, without actually putting it in a cli.* namespace in .NET. In other words, a Java class cli.Foo.Bar would be compiled as namespace Foo.Bar *without* the attribute that preserves its name in Java, so that Foo.Bar then gets translated back to cli.Foo.Bar when Java code sees it.

Sort of the inverse of the attribute for turning name mangling off on the .NET side.

Can you explain why and when you'd want to use this?

Mike asks:

Are there any tasks for a CS scrub like myself to work on with IKVM?

Here is a partial list of things that need to be done:

  • Implement the AWT peers
  • Implement the missing JNI methods
  • Build a framework to test the verifier / bytecode compiler
  • Write a Java implementation of String.valueOf(float) and String.valueOf(double)
  • Search the source for // TODO for things that seem doable (or write test cases that show the current code is broken)
  • Write documentation
  • Design a logo
  • Design a website

If you (or anyone) decides to work on something, please send e-mail to the ikvm-developers list, so we can coordinate.

Thursday, 25 March 2004 10:28:51 (W. Europe Standard Time, UTC+01:00)  #    Comments [2]
# Saturday, 20 March 2004
Documentation

Miguel posted a nice example of how to use Gtk# from Java using IKVM/Mono on his blog. In response Pablo posted a question to the Mono list and Jonathan Pryor replied with a nice explanation of how delegates are handled to the IKVM and Mono lists (quoted with permission, slightly edited):

From: Jonathan Pryor
Sent: Friday, March 19, 2004 02:35
To: Pablo Baena
Cc: Miguel de Icaza; mono-list@lists.ximian.com; ikvm-developers@lists.sourceforge.net
Subject: Re: [Mono-list] Java and C#

Below...

On Thu, 2004-03-18 at 16:07, Pablo Baena wrote:
> Miguel: I saw your blog about IKVM. One thing I haven't been able to 
> investigate is, how useful can be Gtk# with Java. Because, for example, I 
> couldn't find a clue on how to attach a Java 'listener' to a C# event, or any 
> way to use attributes in Java.

They really need to document this better...

However, grepping through the ikvm.zip file (from their website), we
see:

// file: classpath/java/lang/VMRuntime.java
cli.System.AppDomain.get_CurrentDomain().add_ProcessExit (
  new cli.System.EventHandler (
    new cli.System.EventHandler.Method () {
      public void Invoke (Object sender, cli.System.EventArgs e) {
        Runtime.getRuntime().runShutdownHooks();
      }
    }
  )
);

From this (and prior knowledge), we can draw the following statements:

1. Properties are actually functions with `get_' and `set_' prefixed to
them. Thus C# property System.AppDomain.CurrentDomain is the static
Java function cli.System.AppDomain.get_CurrentDomain().

2. Events are actually functions with `add_' and `remove_' prefixed to
their name. Thus C# event System.AppDomain.ProcessExit is the static
Java function cli.System.AppDomain.add_ProcessExit().

3. There is no equivalent to C# delegates in Java, so these are
translated into a class + interface pair. The EventHandler class is the
standard C# type name (cli.System.EventHandler), which takes as an
argument an interface to invoke, named "cli." + C# delegate type name +
".Method", hence cli.System.EventHandler.Method. The EventHandler.Method
interface has a function Invoke() which must be implemented, and this
method will be invoked when the event is signaled.

I suspect that there is no way to add attributes in Java. Microsoft's
Visual J# permits the use of Attributes (IIRC), but it's through their
Visual J++ syntax -- through a specially formed JavaDoc comment. 
Something like (from memory):

/**
* @attribute-name (args...)
*/
public void myMethod () {/* ... */}

Of course, that's compiler specific, and no standard Java compiler will
support that. So when it comes to attributes, you're probably up the
creek.

- Jon

I replied saying that I believe that the attribute construct in JDK 1.5 can probably be used to expose .NET attributes to Java (and use them in Java code that is targeted to run on IKVM).
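
The class + interface pair from point 3 of Jonathan's explanation can be mimicked in plain Java. The sketch below uses stand-in types only: `EventHandler`, `EventHandler.Method` and `Publisher` are hypothetical local classes that follow the naming pattern he describes, not the real cli.System types, so it runs without IKVM.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegateSketch {
    // Stand-in for a delegate type such as cli.System.EventHandler:
    // a class that wraps an implementation of its nested Method interface.
    static final class EventHandler {
        interface Method {
            void Invoke(Object sender, Object e);
        }

        private final Method method;

        EventHandler(Method method) { this.method = method; }

        void Invoke(Object sender, Object e) { method.Invoke(sender, e); }
    }

    // Stand-in for a type that exposes an event through add_/remove_ methods.
    static final class Publisher {
        private final List<EventHandler> handlers = new ArrayList<EventHandler>();

        void add_ProcessExit(EventHandler h) { handlers.add(h); }

        void remove_ProcessExit(EventHandler h) { handlers.remove(h); }

        void raise() {
            // Iterate over a copy so handlers may unsubscribe while being invoked.
            for (EventHandler h : new ArrayList<EventHandler>(handlers)) {
                h.Invoke(this, null);
            }
        }
    }

    // Subscribes, raises the event, unsubscribes, raises again;
    // returns how often the listener ran.
    static int demo() {
        final int[] calls = { 0 };
        Publisher publisher = new Publisher();
        EventHandler handler = new EventHandler(new EventHandler.Method() {
            public void Invoke(Object sender, Object e) {
                calls[0]++;
            }
        });
        publisher.add_ProcessExit(handler);
        publisher.raise();
        publisher.remove_ProcessExit(handler);
        publisher.raise();
        return calls[0];
    }
}
```

The anonymous inner class plays the role of the Java "listener", and the wrapper object is what gets passed to the add_ and remove_ methods.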

Saturday, 20 March 2004 14:51:08 (W. Europe Standard Time, UTC+01:00)  #    Comments [3]
A Less Broken Snapshot?
In Tuesday's snapshot, ikvmc was completely broken. Sorry about that. The CoreClasses cache introduced an incorrect dependency between Object, Throwable and String. This caused Throwable or String to be loaded while it was being loaded and that resulted in an exception: System.ArgumentException: Item has already been added. Key in dictionary: "java.lang.Throwable" Key being added: "java.lang.Throwable"

Hopefully this snapshot will be a little better quality, but don't hold your breath, because the main change in this version is the addition of local variable liveness analysis to the verifier. This required some tricky code and made it clear to me (again) that the verifier desperately needs to be rewritten.

The trigger for the local variable liveness analysis was to be able to emit debugging information for local variables, but it also has the nice side effect of allowing a little better code generation. Previously, if a local variable slot was shared between two different reference types, the .NET local would have the type of the common base type, even if the uses were in fact totally distinct. The compiler had to emit downcasts whenever it emitted a load from one of those locals. In classpath.dll there were 1288 such downcasts. With the new liveness information, it is now possible to split those Java locals in multiple .NET locals, so these downcasts are now gone. Another optimization, which doesn't seem all that exciting, is the elimination of dead stores to local variables. In itself this is a fairly pointless optimization, because the CLR/Mono JIT will probably do it anyway. However, there is one very important optimization that can be done because of dead store elimination, in exception handling. Whenever an exception handler discards the exception object and the IKVM bytecode compiler can detect this, it can skip the (expensive) stack trace capturing that is normally required. I had already hacked some support to recognize these exception handlers (in classpath.dll there were 313 optimized exception handlers), but now it works much better (there are now 444 optimized exception handlers).
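
As a hedged illustration of the two kinds of exception handler mentioned above, consider the following Java source. The class and method names are made up; the point is only that the first handler never reads the caught exception, which is the pattern a compiler could recognize, while the second one does use the exception object.

```java
public class DiscardedExceptionDemo {
    // The handler never looks at 'e', so storing it is a dead store.
    static int parseOrDefault(String s, int fallback) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // The handler reads the exception object, so it cannot be treated as discarded.
    static String parseOrExplain(String s) {
        try {
            return String.valueOf(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return "bad input: " + e.getMessage();
        }
    }
}
```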

What's new?

  • Decorated the various ByteCodeHelper methods with the [DebuggerStepThroughAttribute] attribute to make stepping through the source code in the debugger less disruptive.
  • Restored the signature decoding methods in ClassFile.cs that I removed in the previous version. I had failed to realise that they're different from the ones in ClassLoaderWrapper, because they deal with unloadable classes.
  • Fixed CoreClasses to decouple the different classes (accessing one no longer triggers loading the others).
  • Changed handling of package accessible final fields (they're no longer turned into a property).
  • Fixed System.setOut (copy & paste mistake, it tried to set "in").
  • The debugging information is now classified as Java/Text. Not sure if this affects anything, but it seemed like the right thing to do.
  • Fixed debugging line number information to make sure the first CIL instruction has a corresponding line number. Previously, Visual Studio .NET refused to step into an ikvmc compiled method.
  • Optimized dead stores to local variables and use new dead store information to optimize exception handling.
  • Local variables are now properly typed and have their names attached in debugging information (when ikvmc with the -debug option is used). NOTE: if the same variable name is reused in a method, the debugging information for those variables is not yet emitted correctly.
  • Fixed mapping of System.IntPtr to gnu.classpath.RawData. The mapping is now private to classpath.dll.
  • Changed JVM.CriticalFailure to always write to stderr on Unix instead of trying to display a message box.
  • Fixed race condition between returning from Thread.join and the thread being removed from the thread pool / marked as dead.
  • Fixed Thread.yield to not consume thread interrupted status. (Note that Thread.sleep(0) behaves as Thread.yield() and also does not consume the interrupted status).
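
The Java-side behaviour that the Thread.yield fix in the list above is meant to preserve can be checked with a few lines. This is a small sketch with a made-up class name; it only exercises Thread.yield on a standard JVM and says nothing about how IKVM implements it.

```java
public class YieldInterruptDemo {
    // Sets the interrupted status, yields, and reports whether the status survived.
    static boolean interruptSurvivesYield() {
        Thread.currentThread().interrupt();
        Thread.yield();
        // Thread.interrupted() returns the status and clears it again,
        // so the calling thread is left uninterrupted afterwards.
        return Thread.interrupted();
    }
}
```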

New snapshots: just the binaries and source plus binaries.

Saturday, 20 March 2004 14:26:49 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Tuesday, 16 March 2004
New Snapshot

Last week I said I'd go through a stabilization phase, but I couldn't resist the urge to implement some more stuff and fix various things. So this snapshot is a fairly big change again, but no major architectural overhaul like the previous one.

What's new?

  • Merged with current Classpath cvs.
  • Support for JDK 1.5 style class literals (only for class files with version 49 or greater).
  • Removed signature decoding from ClassFile.cs (I once thought that it should live there, instead of in ClassLoaderWrapper, but that turned out not to be a good idea).
  • Added CoreClasses.cs to cache a few of the frequently used TypeWrappers (Object, Class, String and Throwable).
  • Fixed volatile long/double handling to use the (new in .NET 1.1) Thread.VolatileRead/VolatileWrite methods.
  • Changed type used in ImplementsAttribute to the ghost wrapper for ghosts.
  • Changed method name mangling for interface implementation stubs (shorter name and now uses a slash to make sure it doesn't clash with any Java method names).
  • Added support for Finalize/finalize method overriding when mixing Java and non-Java classes in the class hierarchy. I don't like this solution very much. The code is ugly and complicated.
  • Added special support for finalize method for .NET types that extend Java types.
  • Fixed handling of synchronized static methods. Previously, the .NET MethodImplOptions.Synchronized flag was simply set, but this was incorrect because that causes the method to synchronize on the .NET Type object, instead of the Java Class object.
  • Fixed handling of instance calls on value types.
  • Fixed System.currentTimeMillis implementation to use DateTime.UtcNow instead of Environment.TickCount, to prevent overflow.
  • Changed System.setErr/setIn/setOut to use TypeWrapper based reflection instead of .NET reflection.
  • Changed handling of resources to use .NET resources instead of global fields, this allows resources to work in multi-module assemblies.
  • Changed URL format for assembly embedded resources from opaque to parseable, to facilitate parsing them as a URI.
  • Added support for passing ghost references to methods in map.xml instructions.
  • Fixed a regression introduced in the previous snapshot, that caused exception mapping not to be invoked for catch(Throwable).
  • Limited fixes to get AWT working again (after Classpath AWT changes).
  • Declared String.equals and String.compareTo(Object) in map.xml to make reflection appearance identical to JDK.
  • Implemented JDK 1.4 String methods that rely on regular expressions (Classpath now has java.util.regex.* support, although not 100% compatible with the JDK).
  • Minor performance improvement in String.hashCode implementation. Oddly enough, this came from doing the length check in the for condition instead of manually hoisting it out of the loop. Apparently the CLR JIT recognizes this pattern and optimizes it better.
  • Fixed Thread.join to work with non-Java created threads as well.
  • Fixed removal of non-Java created threads from ThreadGroup.
  • Fixed ServerSocket.accept() timeout support.
  • Fixed ikvmc handling of -reference assemblies (to handle the Load vs LoadFrom context issues).
  • Various comment fixes.
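
The Java semantics behind the synchronized static method fix in the list above can be demonstrated directly: a static synchronized method holds the monitor of the Class object while it runs. The class name below is made up for the example.

```java
public class StaticSyncDemo {
    // Inside a static synchronized method the current thread owns
    // the monitor of StaticSyncDemo.class.
    static synchronized boolean lockHeldInside() {
        return Thread.holdsLock(StaticSyncDemo.class);
    }

    // Without 'synchronized' the monitor is not held.
    static boolean lockHeldOutside() {
        return Thread.holdsLock(StaticSyncDemo.class);
    }
}
```

Code that does `synchronized (StaticSyncDemo.class) { ... }` elsewhere therefore excludes the static synchronized method, which is why locking a different object (such as the .NET Type) would be observable from Java.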

New snapshots: just the binaries and source plus binaries.

Tuesday, 16 March 2004 17:50:55 (W. Europe Standard Time, UTC+01:00)  #    Comments [3]
# Saturday, 13 March 2004
JDK 1.5 beta

Yesterday I looked at the JDK 1.5 beta that Sun released recently. There appears not to be a complete list of changes to the VM yet and the only things I found were a few new modifier bits (that haven't yet stabilized) and the fact that class literals are finally supported in the VM. This is important for IKVM.NET, because it makes class literals in statically compiled code work better and more efficiently.

For a quick refresher of how class literals are currently compiled, let's look at how the following class is compiled:

class ClassLiteral
{
  public static void main(String[] args)
  {
    System.out.println(String.class);
  }
}

Compiling this with Jikes 1.19 and then disassembling it (I've left out the default constructor that the compiler generated):

class ClassLiteral extends java/lang/Object

static java/lang/Class class$java$lang$String
// Unknown attribute : Synthetic
//

public static main([Ljava/lang/String;)V
// attrib length: 53
// max stacks: 3
// max locals: 1
// code length: 25
0 getstatic <Field java/lang/System java/io/PrintStream out>
3 getstatic <Field ClassLiteral java/lang/Class class$java$lang$String>
6 dup
7 ifnonnull 21
10 pop
11 ldc "[Ljava.lang.String;"
13 iconst_0
14 invokestatic <Method ClassLiteral class$(Ljava/lang/String;Z)Ljava/lang/Class;>
17 dup
18 putstatic <Field ClassLiteral java/lang/Class class$java$lang$String>
21 invokevirtual <Method java/io/PrintStream println(Ljava/lang/Object;)V>
24 return

static class$(Ljava/lang/String;Z)Ljava/lang/Class;
// Unknown attribute : Synthetic
//
// attrib length: 55
// max stacks: 3
// max locals: 4
// code length: 23
0 aload_0
1 invokestatic <Method java/lang/Class forName(Ljava/lang/String;)Ljava/lang/Class;>
4 iload_1
5 ifne 11
8 invokevirtual <Method java/lang/Class getComponentType()Ljava/lang/Class;>
11 areturn
12 new java/lang/NoClassDefFoundError
15 dup_x1
16 invokespecial <Method java/lang/NoClassDefFoundError <init>()V>
19 invokevirtual <Method java/lang/Throwable initCause(Ljava/lang/Throwable;)Ljava/lang/Throwable;>
22 athrow
Exception table:
start_pc = 0
end_pc = 12
handler_pc = 12
catch_type = java/lang/ClassNotFoundException

The amount of code generated is pretty bizarre. Note that this isn't Jikes' fault, there just isn't a way to do it better. Now, here is what it looks like compiled with javac from the 1.5 beta (specifying the -target 1.5 option):

class ClassLiteral extends java/lang/Object

public static main([Ljava/lang/String;)V
// attrib length: 38
// max stacks: 2
// max locals: 1
// code length: 10
0 getstatic <Field java/lang/System java/io/PrintStream out>
3 ldc_w java/lang/String
6 invokevirtual <Method java/io/PrintStream println(Ljava/lang/Object;)V>
9 return

This looks a lot better! No new bytecode instruction was added; instead the ldc instruction was modified to allow referencing a CONSTANT_Class_info. When the VM encounters this it loads the class and pushes the class object on the stack. I added support for this to IKVM.NET (not in cvs yet) in about 15 minutes. When JDK 1.1 was released (the first version to support class literals in the source), I wondered why they didn't add VM support at the same time, but fortunately they finally got around to it.

Trivia

If you looked closely at the Jikes generated code, you may have noticed that Jikes actually loads the string array class ("[Ljava.lang.String;") instead of java.lang.String. Why does it do this? It does this, because it correctly implements the JLS. The JLS says that class literals should not cause a class to be initialized. Doing a Class.forName() initializes the class, but when you initialize an array class you don't initialize the component type class. So this is a clever trick. Javac doesn't do this, so it (incorrectly) causes the class to be initialized.
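
The initialization rule discussed above can be observed from Java itself. The following sketch assumes a compiler that emits the ldc form for class literals (class file version 49 or later); the class names are made up. It records whether a static initializer has run after evaluating a class literal and again after calling Class.forName.

```java
public class ClassLiteralDemo {
    static boolean initialized = false;

    static class Lazy {
        static {
            // Runs when Lazy is initialized, not when it is merely loaded.
            ClassLiteralDemo.initialized = true;
        }
    }

    // Returns { initialized after the class literal, initialized after Class.forName }.
    // Meant to be called once, since initialization cannot be undone.
    static boolean[] run() {
        Class<?> c = Lazy.class;              // class literal: no initialization
        boolean afterLiteral = initialized;
        try {
            Class.forName(c.getName());       // one-argument forName initializes the class
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        boolean afterForName = initialized;
        return new boolean[] { afterLiteral, afterForName };
    }
}
```

On such a compiler and a standard JVM the first element is false and the second is true.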

IKVM.NET

Why does this change help statically compiled code in IKVM.NET? Performance is a bit better, but that's not the most important difference. The real benefit shows up when you statically compile code into multiple assemblies. If one assembly references a class in another assembly via a class literal, you'd better be sure that the referenced assembly is already loaded in the AppDomain, otherwise the IKVM.NET runtime is unable to find the class. With the new (JDK 1.5) way of referencing class literals, the reference is no longer opaque to ikvmc, so it can now compile the construct in such a way that the class literal causes the appropriate assembly to be loaded by the .NET runtime when it is executed.

StringBuilder

Something that struck me as funny is the new StringBuilder class that JDK 1.5 includes. It's almost identical to StringBuffer, except that it is not thread safe. If you look at the Rotor source code, you can see that the .NET StringBuilder also started life as StringBuffer. Now if the next version of .NET includes a thread safe version of StringBuilder and names it StringBuffer, we've come full circle ;-)

Saturday, 13 March 2004 18:38:45 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
# Wednesday, 10 March 2004
To Invert Or Not To Invert

Stuart commented:
I'm not convinced that cli.System.Object should be visible to Java at all. AIUI, Java code will never see instances of cli.System.Object, because all such objects appear to inherit from java.lang.Object instead.

If cli.System.Object *is* visible to Java code, it introduces a paradox: java.lang.Object inherits from cli.System.Object (per the way it's actually implemented) but cli.System.Object should appear to inherit from java.lang.Object (per Java's rule that *everything* inherits from java.lang.Object). Now, it may be possible to create magic glue code that inverts the apparent inheritance relationship like that, but do you really want to go there? :)

The inversion is exactly what I was thinking about. Stuart's analysis above contains a crucial mistake, java.lang.Object does not inherit from cli.System.Object. However, it is virtually impossible not to get confused about this stuff, so let's try to make the discussion a little easier by defining a naming convention:

  • java.lang.Object
    This is the base class of all Java classes (as seen from the Java side of the world).
  • [classpath]java.lang.Object
    This is an implementation artifact of IKVM, it is a .NET type that is used as the base type for all non-remapped Java classes.
  • System.Object
    This is the base class of all .NET classes (as seen from the .NET side of the world).
  • cli.System.Object
    This is the IKVM manifestation of the System.Object type on the Java side of the world.

The paradox is that [classpath]java.lang.Object inherits from System.Object and cli.System.Object inherits from java.lang.Object, but hopefully it is now clear that this isn't a problem. (BTW, one of the definitions of a paradox is "A seemingly contradictory statement that may nonetheless be true").

There are actually two reasons why I would want to do this:

  1. If a Java class extends a .NET type (that was exported using netexp) you see both the virtual methods in java.lang.Object as well as the ones in System.Object that the class in question happens to override (a fairly arbitrary set). By introducing cli.System.Object as the penultimate base class for all .NET types, this can be made much more consistent. cli.System.Object would have final implementations for all the virtual methods in java.lang.Object (to make sure that the essentially non-existing methods don't get overridden) and it would introduce the real virtual methods of System.Object.

  2. If you want to define your own "first class" .NET exception class in Java, you need to extend cli.System.Exception. In other words, it makes for a more powerful programming model to expose the remapped types in this way.

Wednesday, 10 March 2004 10:15:16 (W. Europe Standard Time, UTC+01:00)  #    Comments [6]
# Tuesday, 09 March 2004
Object Model Mapping Part 4

Yesterday I checked in a major change set that implements the new object model mapping infrastructure. Today I put the new snapshots online as well. The new implementation is about a thousand lines less code than the previous.

What's new

  • Many code changes to implement the new model.
  • When compiling classpath.dll, ikvmc now requires the -remap:map.xml option. This is the only time the mapping information is read from the XML. When code actually runs, or when other classes are compiled, the remapping information is read from custom attributes in classpath.dll.
  • Tracing infrastructure. Interesting points in the runtime now contain trace calls that can be enabled with a command line switch (or app.config setting). In addition, when Java code is compiled it can optionally be instrumented so that each method called writes its name and signature to the tracer. This has a big performance impact (it will be optimized a little bit in the future, but don't expect too much), so it is not enabled for classpath.dll, by default.
  • classpath.dll now contains the remapped types (java.lang.Object, java.lang.Throwable, java.lang.String and java.lang.Comparable). This means that if you want to create a Java like class in C# you can now extend java.lang.Object. Note however that you should never define your references as java.lang.Object, use System.Object instead. If you want to call a java.lang.Object instance method on a System.Object reference, use the corresponding static instancehelper method on java.lang.Object.

Finalization

From the Java side of the fence, finalization continues to work as it always has, but when C# code is subclassing Java code, you should use the C# destructor if you need finalization. If you override the finalize method, you run the risk that it isn't called (it only gets called if one of your Java base classes actually overrides it). The C# destructor does the right thing. If you use another .NET language, you have to override Finalize and make sure that you call the base class Finalize. More complicated mixed scenarios (e.g. Java code subclassing C# code that subclasses Java code) are not supported at the moment (wrt finalization, other aspects should work fine).

What's next?

It's not quite done yet, but I'll be going through a stabilization phase before making any more changes. I have some ideas for changes to the way the remapped .NET types appear on the Java side (e.g. should it be possible to extend cli.System.Object in Java?). There are also some optimizations that can be done and there still remains some restructuring to be done.

Snapshots

I've tested this snapshot pretty well, but considering the scale of the changes, I expect some regressions. Bug reports are appreciated (as always).

New snapshots: just the binaries and source plus binaries.

Conferences

Next month I'm speaking again at the rOOts conference in Bergen, Norway, where I had a very good time last year. Come and say hi if you're there. Also, I'm happy to be speaking again at the excellent (and fun) Colorado Software Summit in Keystone, Colorado in October.

Tuesday, 09 March 2004 10:42:12 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
# Wednesday, 03 March 2004
Object Model Mapping Part 3

A brief update on the progress and some random thoughts on remapping. Today I got classpath.dll to build (and verify) for the first time using the new remapping infrastructure. Finally some progress. However, there is still some more work to do before it runs again.

Part of the new model is that there are now two different type wrappers (the internal class that contains the metadata of an IKVM type) for the remapped types. There is one type wrapper that is used during static compilation (RemapperTypeWrapper) and another one that is used at runtime (CompiledTypeWrapper, also used for normal (i.e. non-remapped) statically compiled classes). The advantage of this is that there is less overhead at runtime and the code is also a bit less complex. This is also the final step in an invisible process that has been going on for a long time. It is now no longer possible to run IKVM without a statically compiled classpath.dll (it hasn't been possible for a while, but theoretically it could have been made to work). When I just got started, there was no static compiler yet and the only way for it to work was to load all classes dynamically. After the static compiler started to work, support for dynamically loading the core classes began to degenerate. That degeneration is now final and there is no way back ;-)

What's next?

More metadata needs to be emitted on the remapped types and CompiledTypeWrapper needs to be changed to understand it. The code needs to be cleaned up. I'm not sure yet, but I think a lot of complexity can be removed now. Virtual method invocation needs to be optimized. At the moment all virtual method calls to remapped types go through the helper methods, but this is only needed if the reference type is not known to be a Java subclass. For example, calling java.lang.Object.toString() on a java.lang.Object reference requires the call to go through the helper method, but calling java.lang.Object.toString() on a java.lang.Math reference (this is just a random example) doesn't need to go through the helper method.
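
To make the planned optimization a bit more concrete, here is a minimal Java sketch of the decision the compiler would have to make for each virtual call. Everything in it is hypothetical: the class and method names are made up for illustration, and the set of remapped types only lists the two that come up in these entries (java.lang.Object and java.lang.String), so the real list may well be longer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class HelperCallDecision {
    // Assumption: remapped types by Java name. Only the two types named in
    // these entries are listed; this is not the actual IKVM list.
    static final Set<String> REMAPPED =
        new HashSet<>(Arrays.asList("java.lang.Object", "java.lang.String"));

    // Hypothetical compile-time check: if the static type of the reference is
    // itself a remapped type, the object could be a .NET type that does not
    // extend the generated Java class, so the call has to go through the
    // static helper. Otherwise a direct virtual call is enough.
    static boolean needsHelper(Class<?> staticType) {
        return REMAPPED.contains(staticType.getName());
    }

    public static void main(String[] args) {
        // Reference typed as java.lang.Object: helper required.
        System.out.println(needsHelper(Object.class));
        // Reference typed as java.lang.Math: known to be a real Java class.
        System.out.println(needsHelper(Math.class));
    }
}
```

The sketch ignores interface references and other corner cases; it only captures the Object versus Math example above.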

Of course, there are also various loose ends that need to be tied up, but I think I'm on track to have a working version sometime next week.

More FOSDEM

Patrik posted an overview of the FOSDEM Java talks. It also includes links to the slides for some of the talks.

Wednesday, 03 March 2004 20:59:25 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Monday, 01 March 2004
Object Model Mapping Part 2

In yesterday's entry I didn't get to the stuff that kept me going in circles last week. I had decided on the mixed model a few months ago, but as usual the devil is in the details.

Initially I wanted to keep the map.xml format more or less the same and I think that put me on the wrong track. Let's start by taking a look at some of the current map.xml features. Within the <class> tag there are <constructor> and <method> tags. These can contain several different child tags:

  • Empty (i.e. no child tags)
    The method is identical to the corresponding method in the underlying type. Example: The constructor of java.lang.Object is identical to the constructor of System.Object, so the tag looks like this:
    <constructor sig="()V" modifiers="public" />
  • <redirect>
    The method is redirected to another method. This can be a static method in a helper class or a static or instance method in the underlying type. Example: java.lang.Object.notifyAll() is redirected to System.Threading.Monitor.PulseAll():
    <method name="notifyAll" sig="()V" modifiers="public final">
        <redirect class="System.Threading.Monitor, mscorlib" name="PulseAll" sig="(Ljava.lang.Object;)V" type="static" />
    </method>
  • <invokespecial>
    If the method is invoked using the invokespecial bytecode, this CIL sequence is emitted. Example: java.lang.Object.wait() is implemented by calling System.Threading.Monitor.Wait(), but Monitor.Wait returns a boolean that has to be discarded:
    <method name="wait" sig="()V" modifiers="public final">
        <invokespecial>
            <call type="System.Threading.Monitor, mscorlib" name="Wait" sig="(Ljava.lang.Object;)Z" />
            <pop />
        </invokespecial>
    </method>
  • <invokevirtual>
    Similar to <invokespecial>, but this defines the CIL that is emitted when the invokevirtual bytecode is used to call the method.
  • <override>
    Specifies that this method conceptually overrides the method named in the <override> tag. I say "conceptually" because in the equivalence model there is no real class. However, if a real subclass would override this method, it would actually be overriding the method named in the override tag. Example: java.lang.Object.hashCode() overrides System.Object.GetHashCode:
    <method name="hashCode" sig="()I" modifiers="public">
        <override name="GetHashCode" />
        <invokevirtual>
            <dup />
            <isinst type="System.String, mscorlib" />
            <brfalse name="skip" />
            <castclass type="System.String, mscorlib" />
            <call class="java.lang.StringHelper" name="hashCode" sig="(Ljava.lang.String;)I" />
            <br name="end" />
            <label name="skip" />
            <callvirt type="System.Object, mscorlib" name="GetHashCode" />
            <label name="end" />
        </invokevirtual>
    </method>
  • <newobj>
    Used in constructors to define the CIL that is emitted when a new instance is created. Example: java.lang.String has a default constructor, but System.String doesn't:
    <constructor sig="()V" modifiers="public">
        <newobj>
            <ldstr value="" />
            <call type="System.String, mscorlib" name="Copy" />
        </newobj>
    </constructor>

The thing to note is that some of the remapping implications are still handled manually in this scheme. For example, the <invokevirtual> of Object.hashCode has to check for string instances. This information can be derived from the remapping file and it shouldn't be necessary to do this explicitly.
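
To show what that manual check amounts to, here is roughly what the <invokevirtual> sequence for hashCode does, written as Java instead of CIL. This is only a sketch: stringHelperHashCode is a stand-in for java.lang.StringHelper.hashCode (its body assumes the standard Java string hash), and the plain o.hashCode() call stands in for the callvirt to System.Object.GetHashCode.

```java
class HashCodeRemapSketch {
    // Stand-in for java.lang.StringHelper.hashCode; assumed to compute the
    // usual Java string hash, s[0]*31^(n-1) + ... + s[n-1].
    static int stringHelperHashCode(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Mirrors the CIL: dup / isinst String / brfalse skip / castclass / call,
    // otherwise fall through to the virtual call on the underlying object.
    static int invokevirtualHashCode(Object o) {
        if (o instanceof String) {
            return stringHelperHashCode((String) o);
        }
        return o.hashCode(); // stand-in for callvirt System.Object.GetHashCode
    }

    public static void main(String[] args) {
        System.out.println(invokevirtualHashCode("abc"));
        System.out.println(invokevirtualHashCode(Integer.valueOf(7)));
    }
}
```

The instanceof test is exactly the part that could be derived from the remapping file instead of being written out by hand in map.xml.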

I didn't really like the <invokespecial> and <invokevirtual> constructs and I explored the idea of only having a <body> tag that contains the method body. However, it soon became clear that that wouldn't be enough. For example, the implementation of java.lang.Object.clone needs to call the protected method System.Object.MemberwiseClone and this is only allowed in subclasses. So it wouldn't be possible to generate a verifiable helper method for that.

The obvious solution (in hindsight) came to me when I realised that there are actually two types of "subclasses" of java.lang.Object, the ones that really extend java.lang.Object (our generated .NET type) and the ones that don't (e.g. arrays, System.String and all other .NET types). I knew this before, of course, but I was trying to make the model too general. After this realisation, it became obvious that every method should have a <body> and an <alternateBody> (tentative names).

After I've modified the remapping engine to automatically handle all the overridden method implications, the <alternateBody> construct will not be needed for many methods. I think it will only be needed for Object.clone and Object.finalize, and both will be trivial. The <alternateBody> for clone will throw a CloneNotSupportedException (it could also check if the object implements ICloneable and if so, invoke that Clone method, but does this really help?) and the <alternateBody> for finalize will simply be empty, since there is no good reason to ever explicitly invoke the Finalize method of a .NET type.
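
In Java terms, the split between <body> and <alternateBody> could look something like the sketch below. The names are all made up: JavaObject stands in for the generated .NET type that real Java classes extend, and cloneHelper and finalizeHelper stand in for the static helper methods. The sketch does not special-case arrays or any other type.

```java
class AlternateBodySketch {
    // Hypothetical stand-in for the generated java.lang.Object .NET type.
    static class JavaObject implements Cloneable {
        // Plays the role of <body>: runs inside a real subclass, so it is
        // allowed to call the protected clone of its base class.
        protected Object cloneBody() throws CloneNotSupportedException {
            return super.clone();
        }
    }

    // Hypothetical static helper for clone.
    static Object cloneHelper(Object o) throws CloneNotSupportedException {
        if (o instanceof JavaObject) {
            return ((JavaObject) o).cloneBody();
        }
        // Plays the role of <alternateBody>: the object does not extend the
        // generated type, so the helper cannot reach the protected method.
        throw new CloneNotSupportedException(o.getClass().getName());
    }

    // Hypothetical static helper for finalize: for objects that do not extend
    // the generated type the <alternateBody> is empty, and this sketch leaves
    // the other case out entirely.
    static void finalizeHelper(Object o) {
    }

    public static void main(String[] args) throws Exception {
        JavaObject original = new JavaObject();
        System.out.println(cloneHelper(original) != original);
        try {
            cloneHelper("not a JavaObject");
        } catch (CloneNotSupportedException e) {
            System.out.println("alternateBody: " + e.getMessage());
        }
    }
}
```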

As an aside, I'm also going to remove the <redirect> construct, because it doesn't really add any value. It's just as easy to have a <body> with a <call>.

I'm not clear on the performance implications of these changes. In the existing model, many of the remapping constructs are inlined, but in the new model they won't be: invokespecial will end up calling the real method in the new classes and invokevirtual will call the static helper method. This will probably be slightly slower, but I think the other advantages easily outweigh this.

Another advantage of this scheme that I didn't yet mention is that reflection on remapped methods is now trivial. Currently, the following program doesn't work on IKVM. In the new model the call would simply end up at the static helper method for clone:

class ReflectClone
{
  public static void main(String[] args) throws Exception
  {
    java.lang.reflect.Method m;
    m = Object.class.getDeclaredMethod("clone", new Class[0]);
    m.setAccessible(true);
    System.out.println(m.invoke(args, new Object[0]));
  }
}

BTW, I originally tried this by getting the public clone method on the array class, but oddly enough on the Sun JVM array types don't appear to have a public clone method (even though you can call it just fine!).
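
Here is a small sketch of that oddity, using only standard reflection calls. The class and method names are my own; the behaviour it relies on is documented for Class.getMethod, which does not find clone() on array types.

```java
class ArrayCloneLookup {
    // Returns true if reflection reports a public no-argument clone() method
    // on the given class.
    static boolean hasPublicClone(Class<?> c) {
        try {
            c.getMethod("clone");
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Calling clone() directly on an array compiles and runs fine...
        String[] copy = args.clone();
        System.out.println(copy != args);
        // ...but reflection does not report a public clone method on the
        // array class.
        System.out.println(hasPublicClone(String[].class));
        // A class that declares a public clone() is reported as expected.
        System.out.println(hasPublicClone(java.util.ArrayList.class));
    }
}
```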

Monday, 01 March 2004 10:59:33 (W. Europe Standard Time, UTC+01:00)  #    Comments [4]