# Wednesday, 18 February 2004
F.A.Q.

Stuart pointed out the F.A.Q. was out of date, so I updated it a little bit. He also asked:

Speaking of which, I noticed while perusing the FAQ that the JIT compiler is included in IK.VM.NET.dll which means it's required for running even statically-compiled code. For apps that don't use clever classloading tricks, the JIT isn't needed at all when everything's been statically compiled. Would it be possible to separate the JIT out into a different DLL to reduce the necessary dependencies for a statically-compiled Java app?

Sure, the 275K of IK.VM.NET.dll is miniscule compared to the 3Mb of classpath.dll, but it's the principle of the thing ;)

This is definitely something I want to do. In fact, I would also like to have the option to only include parts of the Classpath code when statically compiling a library. So instead of having a dependency on classpath.dll, you'd suck in only the required classes.

Wednesday, 18 February 2004 09:36:11 (W. Europe Standard Time, UTC+01:00)  #    Comments [4]
# Monday, 16 February 2004
Jikes 1.19, Bytecode Bug, Serialization and a New Snapshot
 

Jikes

I upgraded to Jikes 1.19 that was released recently. It didn't like the netexp generated stub jars (which is good, because it turns out they were invalid), so I fixed netexp to be better behaved in what it emits. Jikes didn't like the fact that the inner interfaces that I created had the ACC_STATIC modifier set at the class level, rightly so, but the error message it came up with was very confusing. Along the way I also discovered that it is illegal for constructors to be marked native (kind of odd, I don't really see why you couldn't have a native constructor). So I made them non-native and have a simple method body that only contains a return. That isn't correct either (from the verifier's point of view) and I guess I should change it to throw an UnsatifiedLinkError. That would also be more clear in case anyone tries to run the stubs on a real JVM.

Jikes 1.19 has a bunch of new pedantic warnings (some enabled by default). I don't think this is a helpful feature at  the moment. Warnings are only useful if you can make sure you don't get any (by keeping your code clean), but when you already have an existing codebase, this is very hard and in the case of Classpath, where you have to implement a spec, you often don't have the option to do the right thing. So I would like to have to option to have lint like comment switches to disable specific warnings in a specific part of the code.

Bytecode Bug

I also did some work to reduce the number of failing Mauve testcases on IKVM and that caused me to discover that the bit shifting instructions were broken (oops!). On the JVM the shift count is always masked by the number of bits (-1) in the integral type you're shifting. So for example:

int i = 3;
System.out.println(i << 33);

This prints out 6 ( 3 << (33 & 31)). On the CLI, if the shift count is greater than the number of bits in the integral type, the result is undefined. I had to fix the bytecode compiler to explicitly do the mask operation.

Serialization

Brian J. Sletten reported on the mailing list that deserialization was extremely slow. That was caused by the fact that reflection didn't cache the member information for statically compiled Java classes or .NET types. I fixed that and after that I also made some improvements to GNU Classpath's ObjectInputStream to speed it up even more. It's still marginally slower than the Sun JRE, but the difference shouldn't cause any problems.

Snapshot

I made a new snapshot. Here's what's new:

  • Changed classpath.build to disable jikes warnings (I know it's lame, but I grew tired of the useless warnings). I also added the -noTypeInitWarning option to ikvmc, to get rid of all the warning about running type initializers.
  • Implemented accessibility checks for Java reflection.
  • Cleaned up socket code and implemented all of the socket options (well, except for IP_MULTICAST_IF2).
  • Implemented Get###ArrayRegion/Set###ArrayRegion and GetPrimitiveArrayCritical/SetPrimitiveArray JNI functions.
  • Added all the 1.4 functions to the JNIEnv vtable.
  • Implemented support for field name overloading (a single class can have several different fields with the same name, if the types are different).
  • Changed the class format errors thrown by ClassFile.cs to .NET exception, instead of Java exception, to have better error handling in ikvmc.
  • Changed VMClass.getWrapperFromClass to use a delegate instead of reflection, to speed up reflection.
  • Fixed the compiler to mask shift counts for ishl, lshl, iushr, lushr, ishr, lshr bytecodes.
  • Fixed a bug in ghost handling (the bug could cause a "System.NotSupportedException: The invoked member is not supported in a dynamic module." exception).
  • Added EmitCheckcast and EmitInstanceOf virtual functions to TypeWrapper.
  • Added a LazyTypeWrapper base class for DotNetTypeWrapper and CompiledTypeWrapper, to cache the member information to speed up reflection.
  • Improved error handling in ikvmc.
  • Fixed netexp to generate valid (or less invalid) classes.
  • Regenerated (and checked in) mscorlib.jar, System.jar and System.Xml.jar with the new (fixed) version of netexp.

I didn't get around yet to removing the "virtual helpers" and introducing base classes for non-final remapped types (java.lang.Object and java.lang.Throwable).

New snapshots: just the binaries and source plus binaries.

Monday, 16 February 2004 16:33:14 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
# Sunday, 11 January 2004
New Snapshot
I made a new snapshot. Here's what's new:
  • Renamed Type property of TypeWrapper to TypeAsTBD. This is to make clear that eventually all references to TypeAsTBD should be removed (a more specific TypeAsXXX should be used).
  • Changed several InvalidOperationException tests to assertions.
  • Changed JavaAssemblyAttribute to JavaModuleAttribute. In the future this attribute will go away and be replaced by an attribute that is attached to each type. This is to support linking Java and non-Java types into the same (single file) assembly with the Whidbey linker.
  • Removed getTypeFromWrapper, getWrapperFromType, getType and getName from VMClass and added getWrapperFromClass.
  • Added TypeWrapper.RunClassInit.
  • Fixed serialization and reflection to be able to call constructor on already allocated objects.
  • Added class accessibility checks in various places.
  • Added EmitCheckCast and EmitInstanceOf to TypeWrapper.
  • Changed all .NET type names in map.xml to partially qualified names.
  • Added -target:module support to ikvmc.
  • Added assembly signing support to ikvmc.
  • Added option to set assembly version to ikvmc.
  • Fixed bug in ikvmc that caused crash if jar contained a manifest.
  • Inner class support for native methods that are mapped to NativeCode.* classes.
  • Miscellaneous clean up and restructuring.

Most of this is clean up and restructuring to facilitate the next major change, removing the "virtual helpers" and introducing base classes for non-final remapped types (java.lang.Object and java.lang.Throwable).

New snapshots: just the binaries and source plus binaries.

Sunday, 11 January 2004 14:23:27 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Saturday, 27 December 2003
dasBlog

I switched from blogX to dasBlog. Mainly because Chris is no longer maintaining blogX, but also because I wanted some new functionality.

Hopefully the transition will be smooth, but if something is not working please let me know.

Saturday, 27 December 2003 19:24:44 (W. Europe Standard Time, UTC+01:00)  #    Comments [2]
C# Shortcoming?
Tomas Restrepo posted some weird C# code on his weblog. It inspired me to test a few more corner cases:

namespace System {
   class Object { }
class ValueType { }


struct Int32 { }
class String { }
class A : string { public static void Main() { string s = new A(); } } }

If you compile this and then run peverify on the resulting executable, it will complain about every single type. Here is what's going on:

  • Object doesn't have a base class
  • ValueType extends our Object not the one in mscorlib
  • Int32 extends our ValueType
  • String extends our Object
  • A extends our String, but the local variable s is of type [mscorlib]System.String, so the assignment is not legal.

An obvious explanation for this (broken) behavior of the C# compiler is that Microsoft use the C# compiler to build mscorlib.dll. When compiling mscorlib.dll, the above behavior makes perfect sense. Of course, when you're not compiling mscorlib.dll, it doesn't make much sense and, in fact, it violates the language specification. For example, section 4.2.3 says:

       The keyword string is simply an alias for the predefined class System.String.

"predefined" being the important word here.

I guess it would be relatively easy to fix the compiler, but I think that there is actually an underlying problem: The C# language doesn't have support for assembly identities. Besides the above corner cases, this also causes problems in more real world situations.

Saturday, 27 December 2003 13:06:10 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Wednesday, 24 December 2003
Oops!

Last week's snapshot had a nasty bug. The code to copy the stack contents out of an exception block reversed the stack order. Not a good thing. A pretty stupid bug, but in my defense, this problem didn't show up with my Mauve tests compiled with Jikes. It only showed up on Eclipse (which I didn't have installed at the time). Thanks to Zoltan Varga for bringing this bug to my attention.

I've updated the snapshots. I also made some additional changes:

  • ikvmc now sniffs the name of the main class from the Jar manifest, if it exists.
  • Made loading of core Java classes used by the runtime more consistent. Introduced a new method ClassLoaderWrapper.LoadClassCritical and removed ClassLoaderWrapper.GetType (which had a similar but less well defined role).
  • Fixed some bugs related to -Xsave.
  • Introduced TypeWrapper.MakeArrayType to construct array types.
  • Added CodeEmitter.InternalError for emitters that should never be called (it throws an InvalidOperationException when its Emit method is called).
  • Cleaned up the warning message that shows up in netexp (and now also in ikvmc) that tells you that type initializers are being run when they shouldn't (due to a .NET runtime bug). The warning now only shows up if your .NET runtime has the bug.
  • Removed ThrowHack and changed the compiler to now always emit verifiably code when injecting exception throwing code (e.g. for illegal access errors).
  • Cleaned up JavaException.cs.
  • Changed MethodWrapper.GetExceptions to return string array and resolve these to classes on the Java side.
  • Fixed various bugs related to unloadable classes. The code generated to start up Eclipse is now totally verifiable (except for the JNI calls, of course).
  • Fixed reflection code to report assembly scoped methods and fields to be private when reporting on .NET assembly that was not generated by IKVM.NET.
  • Fixed TypeWrapper.IsInSamePackage to handle array types correctly.
  • Added a JVM.CriticalFailure method that is called when something is really wrong in the runtime (e.g. LoadClassCritical failed or an unexpected exception occurred in the bytecode compiler).
  • Added a JniProxyBuilder (for development and testing only) that splits out the JNI methods into a separate module, so that the peverify errors that are produced can be easily filtered out.
  • Reimplemented class accessibility checks in bytecode compiler (they've been broken for a long time, since I started adding support for unloadable classes).

New snapshots: just the binaries, source plus binaries and GNU Classpath.

Wednesday, 24 December 2003 12:53:05 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Friday, 19 December 2003
Building IKVM.NET on Mono/Linux

I installed Debian 3.0r1 in VMware to work on getting IKVM.NET to build on Mono.

I put together a new snapshot, this time including a GNU Classpath snapshot, because of the compromise of the FSF CVS server my Classpath sources differ from what's available in CVS.

Here are the steps required to build IKVM.NET on Mono on Debian:

  • Download and extract Mono 0.29 runtime and C# compiler
  • Comment out the line
    [assembly: AssemblyKeyFile("..\\..\\bytefx.snk")]
    in mcs-0.29/class/ByteFX.Data/AssemblyInfo.cs.
  • Make the HasShutdownStarted property in mcs-0.29/class/corlib/System/Environment.cs static and change it to return false instead of throw a NotImplementedException.
  • cd mono-0.29
  • ./configure && make && make install
  • cd ../mcs-0.29
  • make && make install
  • Download and extract Jikes 1.18
  • cd jikes-1.18
  • ./configure && make && make install
  • Download NAnt 0.84-rc1
  • mkdir nant-0.84
  • cd nant-0.84
  • unzip ../nant-0.84.zip
  • Comment out the class constructor (static CompilerBase() { ... }) in src/NAnt.DotNet/Tasks/CompilerBase.cs.
  • make clean && make
  • Create a nant shell script in /usr/local/bin that contains:
    #!/bin/sh
    /usr/local/bin/mono /usr/local/src/nant-0.84/bin/NAnt.exe "$@"
  • Create a dummy peverify shell script, that contains:
    #!/bin/sh
  • Download and unzip classpath.zip (don't run any of the scripts)
  • Download and unzip ikvm.zip
  • cd ikvm
  • nant clean
  • nant

Note: I have not yet integrated Zoltan Varga's JNI provider for Mono and the (broken) Windows Forms based AWT is not built on Mono.

Here is what's new since the last snapshot:

  • Changed build process to work on Mono/Linux.
  • Added flag to bytecode metadata table to see if an instruction can throw an exception. The compiler can use this optimize away useless exception blocks.
  • Changed constant pool constant handling to stop boxing the values and use type based accessors instead.
  • Fixed handling of ConstantValue attribute (now works for static fields regardless of whether they are final are not).
  • Exception remapping is now defined in map.xml. This allows more efficient exception handlers, because the compiler now understand the exception hierarchy (including the constraints imposed by the remapping).
  • Changed handling of netexp exported classes to be more robust in the face of errors (on Mono some of the mscorlib.jar classes are not (yet) present in mscorlib.dll).
  • Fixed emitting of DebuggableAttribute (the attribute was attached to the module, but it should be attached to the assembly).
  • Moved most of ExceptionHelper.cs to ExceptionHelper.java and changed the runtime to generate the exception mapping method from map.xml.
  • Fixed some ghost related bugs.
  • Added a test to supress the type initializer bug warning (during netexp) on runtimes that are not broken.
  • Moved common conversion emit operations to TypeWrapper (EmitConvStackToParameterType & EmitConvParameterToStackType).
  • Added test in JavaTypeImpl.Finish to make sure that we are not in "phase 1" of static compilation. During phase 1, classes are being loaded and no compilation should occur, if it does get triggered it is because of a bug in the compiler or a compilation error during compilation of the bootstrap classes.
  • Changed loading of java.lang.Throwable$VirtualMethods so that the ClassNotFound warning doesn't occur any more.
  • Added (partial) support for private interface method implementations to reflection. This fixes a bug in netexp, that caused classes that use private interface implementation to be unusable from Java (because they appear abstract, because of the missing method).
  • Removed WeakHashtable.cs. Exception mapping code is now written in Java and uses java.util.WeakHashMap.
  • Removed StackTraceElement class from classpath.cs. Exception mapping code is now written in Java and uses the GNU Classpath StackTraceElement.java.
  • Moved java.lang.Runtime native methods to Java (except for getVersion and nativeLoad). This is based on a new split of java.lang.Runtime and java.lang.VMRuntime that hasn't been checked into Classpath yet.
  • Many changes to the bytecode compiler to emit more efficient (actually less inefficient) code for exception handlers.
  • Added workaround to bytecode compiler to Improve debugging line number information.
  • Various bug fixes and some clean up of bytecode compiler.
  • Made ikvmc more user-friendly. It now guesses all options based on the input. You can now do "ikvmc HelloWorld.class" and it will generate HelloWorld.exe (if HelloWorld.class has a main method, if not it will generate HelloWorld.dll).
  • Fixed DotNetProcess (that implement Runtime.exec) to handle environment variable duplicates properly.
  • Removed support for throwing arbitrary exceptions using Thread.stop(Throwable). You can now only throw arbitrary exceptions on the current thread or ThreadDeath exceptions on other threads.
  • Implemented shutdown hooks.
  • Changed ikvm.exe to use a more compatible way of finding the main method and to always run the static initializer of the main class (even if the main method is missing).
  • The ikvm -Xsave option is now implemented using a shutdown hook. This allows it to work even if the application terminates with System.exit().

New snapshots: just the binaries, source plus binaries and GNU Classpath.

Friday, 19 December 2003 14:56:15 (W. Europe Standard Time, UTC+01:00)  #    Comments [2]
# Thursday, 18 December 2003
Asynchronous Exceptions

I've been working on some low hanging fruit optimizations for the code generator and I came across the following peculiar bytecode pattern used by Jikes to compile synchronized blocks:

   0  goto 6
   3  aload_1
   4  monitorexit
   5  athrow
   6  aload_0
   7  dup
   8  astore_1
   9  monitorenter
  10  aload_1
  11  monitorexit
  12  return
Exception table:
   start_pc = 3
   end_pc = 5
   handler_pc = 3
   catch_type = java/lang/Throwable
   start_pc = 10
   end_pc = 12
   handler_pc = 3
   catch_type = java/lang/Throwable

This code results from this method:

static void main(String[] args) {
  synchronized(args) {
  }
}

I was confused by the first entry in the exception table. It protects part of the exception handler and points to itself. Why would you do this?

Luckily, Jikes is open source, so I went and looked at the source code (see ByteCode::EmitSynchronizedStatement). The comment in the function explains that the additional protection of the exception handler is there to deal with asynchronous exceptions. The Jikes bug database contains a better explanation of the issue.

So, it turns out that this somewhat strange looking construct is actually a perfect way to make sure that locks are always released (and only once) even if an asynchronous exception occurs. (Note that this assumes that monitorexit is an atomic instruction, wrt asynchronous exceptions, this isn't in the JVM specification[1], but it is a reasonable assumption.)

At the moment, IKVM compiles this code as follows (pseudo code):

  object local = args;
  object exception = null;
  Monitor.Enter(local);
  try {
    // this is where the body of the synchronized block would be
    Monitor.Exit(local);
  } catch(System.Exception x) {
    exception = x;
    ExceptionHelper.MapExceptionFast(x);
    goto handler;
  }
  return;
handler:
  try {
    Monitor.Exit(local);
  } catch(System.Exception x) {
    exception = x;
    ExceptionHelper.MapExceptionFast(x);
    goto handler;
  }
  throw x;

This is obviously pretty inefficient, but more importantly, it is incorrect. If an asynchronous exception occurs at the right (or rather, wrong) moment the lock will be released twice.

The right way to compile this would be (pseudo code again):

  object local = args;
  Monitor.Enter(local);
  try {
    // this is where the body of the synchronized block would be
  } finally {
    Monitor.Exit(local);
  }

Of course, in the current version of the CLR, this still wouldn't be safe in the face of asynchronous exceptions, but Chris Brumme assures us that in future versions it will be.

BTW, an alternative way to compile it (which, presumably, would work correctly even in today's CLR), is to move the synchronized block into a new method that is marked with MethodImplOptions.Synchronized.

The tricky part of both of these solutions, is recognizing the code sequences that need to be compiled as a try finally clause. Various Java compilers can use different patterns (although a pretty firm clue is provided by the two exception blocks that must end exactly after the monitorexit instruction).

This is one situation where compiling bytecode instead of Java source, makes it a lot harder to do the right thing.

[1] The JVM specification actually contains an incorrect example of how to compile a synchronized block. Not only does this example not use the above protection against asynchronous exceptions, it also doesn't protect the aload_2 and monitorexit instructions at offset 8 and 9.

Thursday, 18 December 2003 11:33:18 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Friday, 12 December 2003
ThinkPad

(This entry is totally unrelated to IKVM.NET)

Last week I got my new ThinkPad T41p and it has a cool feature, a built-in accelerometer. It is used to park the harddisk when the system detects shocks or falls down ("IBM Hard Drive Active Protection System").

I reverse engineered one of the IOCTLs that can be used to read the accelerometer data and built a simple C# application that displays an artificial horizon. It includes a simple reusable class that encapsulates the communication with the device driver (the standard IBM device driver).

Source can be found here.

Not very useful, but fun stuff anyway. In Longhorn this could be used to keep the desktop level :-)

Friday, 12 December 2003 14:10:43 (W. Europe Standard Time, UTC+01:00)  #    Comments [13]
# Monday, 01 December 2003
Finalization and the JIT

One of the subtle differences between the JVM bytecode instruction set and the CLR instructin set is that the JVM splits object instantiation in two steps: 1) allocation and 2) initialization. The CLR combines both steps in a single instruction.

This usually isn't a big difference. The CLR way makes the verifier a little easier because it doesn't need to keep track of uninitialized references. However, in the face of finalization, the difference is actually detectable.

Here is a Java example:

class foo {
  static int throwException() {
    throw new Error();
  }

  foo(int i) {
  }

  public static void main(String[] args) {
    try {
      new foo(throwException());
    } catch(Error x) {
      System.gc();
      System.runFinalization();
      throw x;
    }
  }

  protected void finalize() {
    System.out.println("finalize");
  }
}

When this class is run, it prints "finalizable" and a stack trace. Looking at the bytecode of the main method, it is obvious why:

   0  new foo
   3  invokestatic <Method foo throwException()I>
   6  invokespecial <Method foo <init>(I)V>
   9  goto 21
  12  astore_1
  13  invokestatic <Method java/lang/System gc()V>
  16  invokestatic <Method java/lang/System runFinalization()V>
  19  aload_1
  20  athrow
  21  return

The first instruction is a new that allocates the foo instance. Even if the following call to throwException throws an exception, the foo instance already exists and will be finalized.

Here is the (approximately) corresponding C# code:

class Foo {
  Foo(int i) {
  }

  ~Foo() {
    System.Console.WriteLine("Finalize");
  }

  static int ThrowException() {
    throw new System.Exception();
  }

  static void Main() {
    try {
      new Foo(ThrowException());
    } catch(System.Exception x) {
      System.Console.WriteLine(x);
    }
  }
}

When this code is run, the output should be just the stack trace, but it turns out that it will also print "Finalize" (on .NET, not on Mono). This is a bug in the .NET JIT. It moves the object allocation up to be able to more efficiently construct the call stack for the constructor invocation, but in doing so it subtly changes the semantics. When the above code is compiled with debugging enabled, it works as expected.

Monday, 01 December 2003 21:17:03 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]