# Sunday, 09 November 2003
Finalize Considered Harmful

Last week at the PDC I talked to Anders Hejlsberg (btw, he's a really nice person). I asked him about having a way to prevent boxing and he told me they had already considered it and found too many problems with it (reflection, generics, more on that in another post). I also told him that I felt that C# had only one design flaw: The destructor syntax. I was reminded of this when I was working on some java.nio classes yesterday.

The Java Community Doesn't Understand Finalization

How's that for a section title? It's obviously a generalization, but I don't think it is too far from the truth. To prepare for writing this entry I looked at finalization in three books:

All three are excellent books. Highly recommended.

Addison-Wesley put online the relevant section of Peter's book here. First, let me say that Peter is a very smart guy (I know him from the Colorado Software Summit) and that this doesn't reflect on the quality of the rest of the book, but I'm picking on him because part of this particular "praxis" demonstrates the, IMO, typical misunderstanding of finalization in the Java community (and because his book is one of the few Java books I own).

So, what is wrong with his code? He uses the finalize method to cleanup managed objects that already have their own finalizer!
This practice is widely used in the Java libraries as well and it is a really bad idea. At best it doesn't help (the ServerSocket and FileInputStream will have already been finalized (see JLS 12.6.2 Finalizer Invocations are Not Ordered), or will be finalized soon anyway) and at worst you create APIs that are very difficult to use (or code) because multiple objects "own" the same resource.

The JLS doesn't say anything useful about finalization, but does mention that you should always call super.finalize() in your finalizer (like Peter does in his code).

THIS IS USELESS!

Why? All finalizable classes should extend java.lang.Object and be final, because there is no reason for them not to be. Here is what a finalizable class should look like.

final class FileHandle {
  private int nativeHandle;

  FileHandle(String filename) {
    nativeHandle = nativeOpen(filename);
  }
  private static int nativeOpen(String filename);

  synchronized void close() {
    if(nativeHandle != 0) {
      nativeClose(nativeHandle);
      nativeHandle = 0;
    }
  }
  private static native void nativeClose(int handle);
  
  protected void finalize() {
    close();
  }

  // example operation, the others are omitted
  synchronized int read(...) {
    return nativeRead(nativeHandle, ...);
  }
  private static native int nativeRead(int handle, ...);
} 

Other than exposing the primitive operations that can be performed on the unmanaged resource (thru the handle), the class should have no functionality. The class is package private and used only by the public classes that actually implement the exposed API. The class should never have any references to other objects.

This pattern of factoring out the finalizable resource into a separate class is discussed in the Jones/Lins book. It seems that not enough people read it.

The above discussion is specific to Java and .NET finalization, there are other implementations of finalization that do finalize in order. In such an implementation it can actually be useful to use references to other objects in your finalize method, because you know that the object you're using hasn't already been finalized.

Back To C#

So what's wrong with the C# destructor syntax? I think it encourages the mistake of thinking of Finalize as a destructor (and thus using it to cleanup other managed objects). Also, a "feature" of the C# destructor is that it always call the base class Finalize method. I hope I've shown that this isn't very useful.

Anders agreed with me. "In retrospect we probably shouldn't have done that." (paraphrased from memory).

Sunday, 09 November 2003 15:53:01 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
# Wednesday, 29 October 2003
More Name Dropping
I finally met Miguel de Icaza at the Unofficial Mono BoF and he gave me a cool Mono T-shirt (Thanks Miguel!). I talked about IKVM with Chris Brumme and discussed reflection with Peter Drayton and Dario Russi. I asked Eric Gunnerson about a custom attribute to prevent boxing, but from his response I take it that it is unlikely that'll ever happen. I met Chris Hollander at the "alternate programming languages BoF" last night (at 11pm, ugh). He told me he got a lot of referrers from my site. Since his blog is named "Objective", this makes me wonder if people are looking for information on the objectives of IKVM. I really should write some more documentation ;-)
Wednesday, 29 October 2003 20:59:44 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Tuesday, 28 October 2003
Whidbey

Whidbey was unveiled this morning. It has lots of nice new features, but I'm particularly interested in some of the improvements to reflection.

  • Finally, there is support for custom modifiers in Reflection.Emit
  • DynamicMethod allows methods to be generated that "implement" a delegate and the code gets garbage collected along with the delegate when it is no longer reachable.
  • There is (some) support for doing more of Reflection.Emit based on tokens (a token is an Int32 that refers to metadata in a module), instead of the more heavy weight MethodInfo/FieldInfo/etc. objects.
  • Reflection objects (e.g. MethodInfo) are now kept in a weak reference cache, so that they can be garbage collected if they aren't needed anymore.

I need to do some rewriting to take advantage of the last two, but once I do, the memory usage of IKVM should (hopefully) go down significantly. I'm not entirely sure that the changes they made are powerful enough, but it is an important step in the right direction.

I know I promised I wasn't going to move to Whidbey anytime soon, but I am going to introduce some conditional compilation for Whidbey specific features, so that I can play around with them. Mainstream development will continue to be on 1.1

Tuesday, 28 October 2003 21:23:21 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Monday, 27 October 2003
Eclipse Reception

Last night I visisted the Eclipse reception at OOPSLA 2003. Chris Laffra from IBM Canada had prepared a poster on Alternative Eclipse Execution Models and he included IKVM on it. He also invited me to the Eclipse Reception. There was a fair amount of interest in IKVM and I talked to a bunch of people, including Erich Gamma and Doug Lea. That was cool.

This morning I atttended the PDC keynote by Bill Gates and Jim Allchin. See the other PDC blogs for details ;-)

If you're at the PDC, I'm wearing a red/white/blue (vertical bars, as in the Dutch flag) polo shirt with a small Java logo in the front. If you see me, say hi.

Monday, 27 October 2003 21:15:03 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Sunday, 26 October 2003
PDC

I arrived in LA last Friday. No internet access in the hotel room, but fortunately the conference center has excellent connectivity.

I'm not going to be PDC blogging, but expect more non-technical entries this week than usual.

Tonight I'm going to the Eclipse reception at OOPSLA 2003. It'll be fun to chat with some of the Eclipse developers. May be I can get them to fix their class loading ;-)

Sunday, 26 October 2003 18:07:35 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
# Wednesday, 22 October 2003
Private Super Classes

While trying to get Stuart Ballard's Japitools to work on a netexp generated export of IKVM's classpath.dll, I encountered java.awt.BufferCapabilities.FlipContents. This public (inner) class has non-public base class (java.awt.AttributeValue). What kind of a bizarre design is that?

I had never realised that the Java language allowed this. Fortunately, the C# designers didn't make the same mistake.

Note that a similar issue exists for fields (and method arguments). In Java it is legal to have a public field of a non-public type, or a public method that takes an argument of a non-public type. C# prohibits both of these.

I fixed netexp to export non-public base classes (and I should do the same for interfaces, fields and method arguments/return type).

Wednesday, 22 October 2003 18:36:30 (W. Europe Daylight Time, UTC+02:00)  #    Comments [4]
# Friday, 17 October 2003
Ghosts

It took a bit longer than I anticipated, but I finally managed to put together a new snapshot and check in the changes I made in last month and a half.

What's new?

  • Fixed bug in class load failure diagnostics code.
  • Instead of having a single Type property on TypeWrapper, we now have different properties for different usages. I already know this isn't the final way I'm going to handle things, but for now it is a nice improvement that makes it easier to treat types differently based on where they are appearing (e.g. field, local, argument, base type).
  • Simple ghost references are translated as value type wrappers. See below for details.
  • Reflection support for ghost types.
  • Fixed a few ILGenerator.Emit() bugs where incorrect argument types were passed (int instead of short or byte).
  • Added a NoPackagePrefixAttribute to allow .NET types to not be prefixed with the "cli" prefix. (Suggested by Stuart Ballard)
  • Fix to prevent non-public delegates from being visible.
  • Several fixes in the handling of unloadable types.
  • Fixed java.lang.Comparable interface and method attributes to be identitical to the real interface.
  • Fixed java.lang.Runtime.exec() support for quoted arguments.

Ghost References

First of all, why did I feel the need to change it? Imagine that java.lang.StringBuffer had the following methods:

public void append(Object o) { ... }
public void append(CharSequence cs) { ... }

Previously, CharSequence would be erased to Object, so you'd have two methods with an identical signature, thus requiring name mangling. Using the method from C# would become very inconvenient, both because of the unexpected name, but also because of the fact that it isn't at all clear what argument type is expected (and passing an incorrect type will give odd results, like ClassCastException or IncompatibleClassChangeError).

Enter the value type wrapper. Here is how the java.lang.CharSequence interface is compiled now (pseudo code):

public struct CharSequence {
  public interface __Interface {
    char charAt(int i);
    // ... other methods ...
  }
  public object __ref;
  public static CharSequence Cast(object o) {
    CharSequence s;
    if(o is string) {
      s.__ref = o;
      return s;
    }
    s.__ref = (__Interface)o;
    return s;
  }
  public static bool IsInstance(object o) {
    return o is string || o is __Interface;
  }
  public object ToObject() {
    return __ref;
  }
  public static
    implicit operator CharSequence(string s) {
    CharSequence seq;
    seq.__ref = s;
    return seq;
  }
  public char charAt(int i) {
    if(__ref is string) {
      return StringHelper.charAt((string)__ref, i);
    }
    return ((__Interface)__ref).charAt(i);
  }
  // ... other methods ...
}


All types that implement CharSequence will be compiled to implement CharSequence.__Interface and will also get an implicit conversion operator to convert to CharSequence. This allows C# code to easily call any methods that take a CharSequence argument, because any valid type will be implicitly convertible to the CharSequence value type.

What Are The Downsides?

There are a few cons:

  • This trick doesn't work for arrays. So a CharSequence[] is still compiled as Object[]. Hopefully arrays are far less common, so this will not be a big issue.
  • C# code implementing CharSequence, needs to implement CharSequence.__Interface and also needs to provide an implicit conversion operator to the CharSequence value type.
  • The static methods Cast and IsInstance need to be used instead of the normal language features.
  • It is very easy to accidentally box the CharSequence value type in C#, when you pass this to Java things will get very confusing. The proper way to convert CharSequence to Object is by calling ToObject() on it.

Update: I forgot to mention that the reason that you have to implement the implicit conversion on all types that implement the interface is because C# doesn't allow implicit conversion from/to interfaces, otherwise the CharSequence value type could just have an implicit conversion from CharSequence.__Interface and we'd be done. Funnily enough, while C# doesn't allow you to define them, it turns out it does consume them, but I don't know if this is guaranteed behavior or a bug. I need to find this out. If it turns out to be correct, then I'll add the implicit conversion operator to the value types, that makes life for C# implementers a tiny bit easier.

All in all, I think this is an improvement, but obviously not perfect. As always, feedback is appreciated.

The new snapshots: binaries and complete.

Friday, 17 October 2003 15:57:05 (W. Europe Daylight Time, UTC+02:00)  #    Comments [2]
# Thursday, 16 October 2003
GNU Classpath Meeting

We had a small GNU Classpath meeting on Tuesday at the Linux Kongress in Saarbr├╝cken, Germany. I drove down there for just that day. There were 6 others there: Dalibor Topic (Kaffe), Chris Gray (Wonka), Mark Wielaard (GNU Classpath), Sascha Brawer (GNU Classpath), Jean Daniel Fekete (Agile2D), Patrik Reali (JAOS).

I enjoyed it very much. The discussions were interesting and it was nice to finally meet some of the people involved with GNU Classpath.

We talked about how to implement Java2D. Jean Daniel has worked on Agile2D a Java2D implementation based on OpenGL. This is a nice starting point and OpenGL is available on virtually every platform. This would work for IKVM.NET as well. Sascha pointed out that I should really implement Java2D on top of the .NET System.Drawing classes (based on GDI+). That would be cool, but I think it would be too much work (for me).

I also had an interesting discussion with Patrik about JAOS, a JVM on top of Oberon (a sort of object oriented OS, if I understand it correctly). He faces some of the same problems as I do, with the integration of the two object models.

One of the conclusions of the meeting was that GNU Classpath is alive and kicking (and doing very well in terms of progress being made), but we need a little more publicity. So consider this my small drop in the bucket. If you are a Java developer and want to improve the state of free JVMs, please visit the GNU Classpath web site and consider helping out. In the near future there will be a new task list detailing some of the work that needs to be done (and it will include estimated required effort and skill set).

Thursday, 16 October 2003 14:36:50 (W. Europe Daylight Time, UTC+02:00)  #    Comments [1]
# Friday, 03 October 2003
Vacation

I'm going on vacation again :-) I'll be back on the 12th.

I just realised I haven't blogged for a whole month. It's not because there hasn't been any progress. I've been experimenting with a way to make ghost interfaces more usable from other languages. More on that when I get back.

Friday, 03 October 2003 21:53:33 (W. Europe Daylight Time, UTC+02:00)  #    Comments [8]
# Monday, 01 September 2003
Constructing Inner Classes

Alan Macek pointed out an interesting bug in the verifier.

First, some background. Let's look at this code fragment:

class Test {
    Test() {
      new Runnable() {
          public void run() {}
      };
    }
    public static void main(String[] args) {
      new Test();
    }
}

Inner classes aren't supported by the VM, it's a compiler fiction, so when you compile the above code you get the equivalent of the following:

class Test {
    Test() {
      new Test$1(this);
    }
    public static void main(String[] args) {
      new Test();
    }
}
 
class Test$1 implements Runnable {
    private Test outer;
    Test$1(Test outer) {
      this.outer = outer;
    }
    public void run() {}
}

The interesting bit here is that the outer field is initialized after the base class constructor runs. In this case the base class is Object so this isn't significant, but when you have a base class that calls virtual methods it is:

abstract class Base {
    Base() {
      run();
    }
    public abstract void run();
}
 
class Test {
    Test() {
      new Base() {
          public void run() {
            System.out.println(Test.this);
          }
      };   
    }
    public static void main(String[] args) {
      new Test();
    }
}

When you compile this and run it, it prints out: null. The outer this reference is not yet initialized when the Base constructor runs.

However, when you compile it with the -target 1.4 option, it prints out: Test@17182c1 (or something similar). So clearly an interesting change was made to the compiler.

Let's take a look at the bytecode of the constructor of the original code fragment:

<init>(LTest;)V
0  aload_0
1  invokespecial java/lang/Object/<init>()V
4  aload_0
5  aload_1
6  putfield Test$1/this$0 Test
9  return

First, the base class constructor is called and then the this$0 field is initialized.

Now take a look at the same method, but now as compiled with the -target 1.4 option:

<init>(LTest;)V
0  aload_0
1  aload_1
2  putfield Test$1/this$0 Test
5  aload_0
6  invokespecial java/lang/Object/<init>()V
9  return

The order of the two steps is reversed. Now it first initializes the this$0 field and then calls the base class constructor.

I had missed the fact that it is legal to initialize fields in an unitialized object. So when the latter code was run, the verifier compiled that an uninitialized object reference was used.

The big question is why the compiler only does this when the -target 1.4 option is specified. This rule of allowing instance fields of unitialized object to be written has always been in the JVM specification, as far as I can tell.

New Snapshots

Friday, I put up new snapshots. Source and binaries and binaries only.

Changes since the last (on the blog announced) snapshot:

  • New version of netexp based on Java reflection (all .NET to Java mapping is now in the IKVM reflection runtime).
  • To support netexp, the .NET type reflection is now much better. This includes support for delegates and byref arguments. Instead of lower casing the .NET namespaces, all .NET types are now prefixed with the "cli." package name. Remapped this are also available and appear as final classes without constructors, but with static methods for all instance methods (taking the remapped type as the first argument). Constructors are available as static __new methods.
  • Bootstrap class loader is now a flat type space. All loaded assemblies are searched for bootstrap types when Class.forName is used.
  • Enums are now treated as other value types (i.e. boxed). This allows better accesses to overloaded .NET methods.
  • Lots of clean up and refactoring to support the new reflection capabilities.
  • Various bug fixes.
  • Updated to work with latest GNU Classpath CVS.
Monday, 01 September 2003 16:32:56 (W. Europe Daylight Time, UTC+02:00)  #    Comments [9]