After suffering from coder's block (if that's the programmer's equivalent of writer's block) for weeks, I finally got started the past week on the new object model remapping infrastructure. I spent most of the week going in circles. Writing code, deleting it, writing more code, deleting it again. However, I think I figured it out now. To test that theory I decided to write a blog entry about it. Explaining something is usually a good way to find holes in your own understanding.
I'm going to start out by describing the problem. Then I'll look at the existing solution and its limitations. Finally I'll explain the new approach I came up with. If everything goes well, by the end of the entry you'll be wondering "why did it take him so long to come up with this, it's obvious!". That means I came up with the right solution and I explained it well.
Note that I'm ignoring ghost interfaces in this entry. The current model will stay the same. For details, see the previous entries on ghost interfaces.
What are the goals?
- We need a way to have the Java object model fit into the .NET model
- We would like to enable full interoperability between the two models
- Performance should be acceptable
- Implementation shouldn't be overly complex
The .NET model (relevant types only):
For comparison, here is the Java model that we want to map to the .NET model:
There are several possible ways to go (I made up some random names):
This is the current model. The Java classes are mapped to the equivalent .NET types. This works because System.Object has mostly the same virtual methods as java.lang.Object. For the java.lang.Throwable to System.Exception mapping, a little more work is needed, and that is where the java.lang.Throwable$VirtualMethods interface comes in. When a virtual method on Throwable is called through a System.Exception reference, the compiler calls a static helper method in Throwable$VirtualMethodsHelper that checks whether the passed object implements the Throwable$VirtualMethods interface. If so, it calls the interface method; if not, it calls the Throwable implementation of the method (i.e. it considers the method not overridden by a subclass). A downside of using an interface for this is that all interface methods must be public. At the moment this isn't a problem because all virtual methods in Throwable (except for clone and finalize, inherited from Object) are public, but it could become a problem later on.
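To make the helper dispatch concrete, here is a minimal sketch in plain Java. Every name in it (ThrowableVirtualMethods, ThrowableVirtualMethodsHelper, PlainDotNetException, JavaStyleException) is a stand-in I made up for illustration; it is not the actual IKVM code, and getMessage is just one example of a virtual method.

```java
// Hypothetical stand-in for the Throwable$VirtualMethods interface.
// Interface methods are implicitly public, which is the downside noted above.
interface ThrowableVirtualMethods {
    String getMessage();
}

// Stand-in for a .NET exception type that knows nothing about Java.
class PlainDotNetException {
}

// Stand-in for a Java Throwable subclass that overrides getMessage().
class JavaStyleException extends PlainDotNetException implements ThrowableVirtualMethods {
    @Override
    public String getMessage() {
        return "overridden in subclass";
    }
}

// Hypothetical stand-in for Throwable$VirtualMethodsHelper.
final class ThrowableVirtualMethodsHelper {
    private ThrowableVirtualMethodsHelper() {
    }

    // The compiler would emit a call to this static method instead of a virtual call.
    static String getMessage(Object exception) {
        if (exception instanceof ThrowableVirtualMethods) {
            // The object opted in, so dispatch to its (possibly overridden) method.
            return ((ThrowableVirtualMethods) exception).getMessage();
        }
        // Otherwise treat the method as not overridden and use the base behaviour.
        return baseGetMessage(exception);
    }

    // Assumed placeholder for the Throwable implementation of the method.
    private static String baseGetMessage(Object exception) {
        return "base implementation for " + exception.getClass().getSimpleName();
    }
}

class VirtualMethodsDemo {
    public static void main(String[] args) {
        System.out.println(ThrowableVirtualMethodsHelper.getMessage(new JavaStyleException()));
        System.out.println(ThrowableVirtualMethodsHelper.getMessage(new PlainDotNetException()));
    }
}
```

The point of the sketch is that the call site never makes a virtual call on the exception type itself; the type test inside the helper decides which implementation runs.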
This is a fairly straightforward approach where java.lang.Object extends System.Object and the Java array, String and Throwable classes are simply subclasses of java.lang.Object. It is easy to implement. The obvious downsides are that arrays will be slow (extra indirection), Strings need to be wrapped/unwrapped when doing interop with .NET code and Throwable is not a subclass of System.Exception (the CLI supports this, but once again not a good idea for interop).
I apologise in advance, because I probably can't explain this one very well (it doesn't make any sense to me). Many people have actually suggested this model. In this model java.lang.Object extends System.Object, but arrays, String and Throwable do not extend java.lang.Object. Instead, whenever an instance of one of those types is assigned to a java.lang.Object reference, it is wrapped in an instance of a special java.lang.Object wrapper class. The downside of this model is that wrapping and unwrapping is expensive and (this is why I don't like this approach at all) the expense is paid in ways that are very unexpected to the Java programmer (who expects simple assignment to be expensive?).
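A rough sketch of what the compiler would have to emit in this model may help show where the hidden cost comes from. JavaObject and ForeignWrapper are hypothetical names for illustration only; nothing like this exists in IKVM.

```java
// Hypothetical stand-in for java.lang.Object in the wrapping model.
class JavaObject {
}

// Hypothetical wrapper for values (arrays, strings, exceptions) that do not extend JavaObject.
final class ForeignWrapper extends JavaObject {
    final Object wrapped;

    private ForeignWrapper(Object wrapped) {
        this.wrapped = wrapped;
    }

    // Emitted for every assignment to a JavaObject reference.
    static JavaObject wrap(Object value) {
        if (value instanceof JavaObject) {
            return (JavaObject) value; // already part of the hierarchy, no wrapper needed
        }
        return new ForeignWrapper(value); // allocation hidden behind a plain assignment
    }

    // Emitted for every cast back from a JavaObject reference.
    static Object unwrap(JavaObject reference) {
        if (reference instanceof ForeignWrapper) {
            return ((ForeignWrapper) reference).wrapped;
        }
        return reference;
    }
}

class WrappingDemo {
    public static void main(String[] args) {
        String s = "hello";
        JavaObject o = ForeignWrapper.wrap(s);          // Java source: Object o = s;
        String back = (String) ForeignWrapper.unwrap(o); // Java source: String back = (String)o;
        System.out.println(back);
    }
}
```

In the Java source both statements look like a free assignment and a cheap cast, but in this sketch the first one allocates a new object each time it runs.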
This is the new model. Explanation follows below.
What's wrong with equivalence?
Both J# and the current version of IKVM use equivalence (although many of the details differ and J# doesn't consider Throwable and System.Exception to be equivalent) and it works well. So why change it? There are four advantages to the mixed model:
- Interop works better
In the current model, if you subclass a Java class from C# and you want to override Object.equals, you need to override either Equals or equals, depending on whether any of the base classes overrides Object.equals. If you want to call Throwable.getStackTrace() from C# on a reference of type System.Exception, there is no obvious way to do that.
- More efficient representation of remapping model
Currently every subclass of java.lang.Object overrides toString() to call ObjectHelper.toStringSpecial, which is needless code bloat. More importantly, before any classes can be resolved, the map.xml remapping information needs to be parsed and interpreted (to know about java.lang.Object, java.lang.String and java.lang.Throwable). In the new model, java.lang.Object, java.lang.String and java.lang.Throwable will be in classpath.dll, so they can be resolved on demand. The classes will be decorated with custom attributes to explain to the runtime that they are special, but no other metadata will need to be parsed or interpreted.
- Cleaner model
java.lang.ObjectHelper and java.lang.StringHelper no longer need to be public and the various $VirtualMethod helper types aren't needed anymore.
- Easier to get right
There are a few subtle bugs in the current implementation. Try the following for example:
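public static void main(String[] args) throws Exception
{
  String s = "cli.System.NotImplementedException";
  Object o = Class.forName(s).newInstance();
  Throwable t = (Throwable)o;
  System.out.println(o); // printed via the Object reference
  System.out.println(t); // printed via the Throwable reference
}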
It prints out:
System.NotImplementedException: The method or operation is not implemented.
cli.System.NotImplementedException: The method or operation is not implemented.
Obviously, both lines should be the same. Another (at the moment theoretical) problem is that it is legal for code in the java.lang package to call Object.clone or Object.finalize (both methods are protected, but in Java, protected also implies package access). Currently that wouldn't work.
Here is the mixed model I ended up with:
I called it mixed because it combines some features of equivalence and extension. For example, references of type java.lang.Object are still compiled as System.Object (like in the equivalence model), but the non-remapped Java classes extend java.lang.Object (like in the extension model).
java.lang.Object will contain all methods of the real java.lang.Object and, in addition to those, a bunch of static helper methods that allow you to call java.lang.Object instance methods on System.Object references. The helper methods will test whether the passed object is either a java.lang.Object or a java.lang.Throwable (for virtual methods). If so, they will downcast and call the appropriate method on those classes; if not, they will perform an alternative action (the one that was specified in map.xml when this classpath.dll was compiled).
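As a sketch of what such a static helper might look like, here is the idea in plain Java. All names (MixedObject, MixedThrowable, DotNetException, describe, describeHelper) are hypothetical, and the "alternative action" is a placeholder of my own; the real one would come from map.xml.

```java
// Stand-in for System.Exception.
class DotNetException {
}

// Stand-in for java.lang.Object in the mixed model.
class MixedObject {
    // An ordinary virtual instance method, standing in for something like toString().
    public String describe() {
        return "MixedObject.describe";
    }

    // Static helper: lets a caller invoke describe() on any reference.
    static String describeHelper(Object obj) {
        if (obj instanceof MixedObject) {
            return ((MixedObject) obj).describe(); // normal virtual dispatch
        }
        if (obj instanceof MixedThrowable) {
            return ((MixedThrowable) obj).describe(); // Throwable lives in the exception hierarchy
        }
        // Assumed placeholder for the alternative action taken for foreign objects.
        return "alternative:" + obj;
    }
}

// Stand-in for java.lang.Throwable, which extends the .NET exception type.
class MixedThrowable extends DotNetException {
    public String describe() {
        return "MixedThrowable.describe";
    }
}

// A non-remapped Java class simply extends MixedObject and may override its methods.
class UserClass extends MixedObject {
    @Override
    public String describe() {
        return "UserClass.describe";
    }
}

class MixedModelDemo {
    public static void main(String[] args) {
        System.out.println(MixedObject.describeHelper(new UserClass()));
        System.out.println(MixedObject.describeHelper(new MixedThrowable()));
        System.out.println(MixedObject.describeHelper("a foreign object"));
    }
}
```

Compared with the interface-based helper of the current model, the downcast targets real classes, so the called methods don't all have to be public.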
Object.finalize requires some special treatment since we don't want java.lang.Object.finalize to override System.Object.Finalize because that would cause all Java objects to end up on the finalizer queue and that's very inefficient. So the compiler will contain a rule to override System.Object.Finalize when a Java class overrides java.lang.Object.finalize.
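The rule itself is simple enough to sketch. The following uses reflection to stand in for the decision the compiler would make per class; FinalizeRule and declaresFinalize are hypothetical names, and the real compiler would inspect the class file rather than a loaded class.

```java
// Hypothetical sketch of the compiler rule: only a class that itself declares
// finalize() gets an override of the .NET finalizer emitted for it.
final class FinalizeRule {
    private FinalizeRule() {
    }

    static boolean declaresFinalize(Class<?> type) {
        try {
            // Looks only at methods declared directly on this class, not inherited ones.
            type.getDeclaredMethod("finalize");
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}

// No finalize(): instances would never be registered for finalization.
class NoFinalizer {
}

// Declares finalize(): the compiler would emit the finalizer override here.
class WithFinalizer {
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
    }
}

// Inherits the override from WithFinalizer, so nothing extra needs to be emitted.
class InheritsFinalizer extends WithFinalizer {
}

class FinalizeRuleDemo {
    public static void main(String[] args) {
        System.out.println(FinalizeRule.declaresFinalize(NoFinalizer.class));
        System.out.println(FinalizeRule.declaresFinalize(WithFinalizer.class));
        System.out.println(FinalizeRule.declaresFinalize(InheritsFinalizer.class));
    }
}
```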
I glossed over a lot of details, but those will have to wait for next time.
Finally, a short note on FOSDEM (Free and Open Source Software Developers' European Meeting). Last weekend I visited FOSDEM in Brussels. I enjoyed seeing Dalibor, Chris, Mark, Sascha and Patrik again and I also enjoyed meeting gcj hackers Tom Tromey and Andrew Haley for the first time. Mark wrote up a nice report about it. If you haven't read it yet, go read it now. All in all a very good and productive get-together.