# Sunday, 17 August 2003
More discussion on netexp and Class.forName

Stuart commented:

Well, I think it's really ugly, but I've run out of better proposals ;) I have lots of questions and few answers - here are some...

Is it possible for the same class to have multiple names? (There must already be *some* concessions to this, because, for example, "System.Exception" is also known as "java.lang.Throwable" - similarly for "Object" and "String").

That's right. In general it is not possible for a class to have multiple names, but the three classes you name are special cases. They can have multiple names because Java code will never encounter instances of them (instances will always appear as java.lang.Object, java.lang.String and java.lang.Throwable). The IKVM reflection code knows this and can make it appear that System.Object, System.String and System.Exception are final classes without constructors and with only static methods. That way Java code will be able to call (almost) all .NET methods on these types.

I wonder if there could be some concessions for "well-known" assemblies, such as corlib, System, and the various System.* and Microsoft.* assemblies that ship with the framework.

How does this whole thing work with Mono which doesn't support strong names? What about unsigned assemblies?

It really isn't about strong names. The real issue is that in Java class identities are resolved based on the class name and class loader hierarchy, while in .NET type identities are resolved based on type name, assembly name and binding policy. Those two models are very different and the trick is to find a way to map one onto the other.

How does this "round-trip"? In other words, if I use ikvmc to compile some Java code into a .NET DLL, use netexp to export that DLL, and try to use its classes, do I now need to fully-qualify them?

No, you wouldn't need to fully qualify them, but you would need to make sure that the first DLL gets loaded into the AppDomain before you do any Class.forName() on it. This is essentially the problem I'm trying to solve for .NET types, but for round-tripped code I don't think it is solvable.

I guess my primary feeling is that all these solutions seem to make things worse than they currently are, where for the *most* part things "just work", and I don't have to understand strong naming or any of the other details of the .NET assembly loading model to be able to seamlessly interoperate between the two languages. I wish there were some way to solve the internal issues while still preserving that niceness-to-use.

I agree. My previous proposal also has a problem: it still doesn't handle assembly versioning. If you run on a later version of the .NET Framework, the system assembly versions no longer match the ones in the Java class name, and thus Class.getName() will return a different name than the one used to load the class.

Obmoloc wrote:

I agree with Stuart. I think that the following code must always work:

assert Class.forName(x).getName() == x.intern();

Assuming you mean Class.forName(x).getName().equals(x), I agree, but that is the easy one. It should also hold that, for a Class c with a no-arg constructor, c.newInstance().getClass() == c.
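As a minimal sketch, the two invariants can be written down as plain Java checks. The class and method names here (ForNameInvariants, nameRoundTrips, instanceRoundTrips) are made up for illustration and are not part of IKVM. The sketch uses getDeclaredConstructor().newInstance(), the current replacement for Class.newInstance(), and it compares names with equals() because nothing guarantees that getName() returns an interned string.

```java
import java.util.ArrayList;

public class ForNameInvariants {
    // Invariant 1: the name used to load a class is the name the class reports.
    public static boolean nameRoundTrips(String name) throws ClassNotFoundException {
        return Class.forName(name).getName().equals(name);
    }

    // Invariant 2: an instance created through c reports c as its class.
    // Assumes c has an accessible no-arg constructor.
    public static boolean instanceRoundTrips(Class<?> c) throws ReflectiveOperationException {
        Object instance = c.getDeclaredConstructor().newInstance();
        return instance.getClass() == c;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(nameRoundTrips("java.util.ArrayList"));
        System.out.println(instanceRoundTrips(ArrayList.class));
    }
}
```

Any naming scheme for .NET types would have to keep both checks true for every class that Class.forName() can return.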

Given that, Class.forName("NET.System.String") should throw a ClassNotFoundException, as should Class.forName("NET.System.Exception") and Class.forName("NET.System.Object").

I believe that there is little or no need to call those methods, and so to import such classes.

I already explained how in response to Stuart's question; let me address the why here. In order to implement java.lang.String I have to write some code (System.String doesn't have equivalents of everything java.lang.String has). It's easiest to write this code in Java, because if I write it in C# I have to manually handle some special cases that the compiler handles for Java code. Hence I need to have access to the System.String methods. See StringHelper.java for how it currently works. That isn't the right way. While writing that, I thought of remapping instance methods (and constructors) to static methods, but I haven't implemented that.

If one needs to know, in a generic way, which Java class represents a .NET class, a new method should be enough for that. For example, Class.NETforName("NET.System.String") should return the same value as Class.forName("java.lang.String"). Another method could be Class.javaNameFromNETName, such that Class.javaNameFromNETName("NET.System.Exception") returns "java.lang.Throwable".

This wasn't the problem I was trying to solve and I don't think there is any need to know this.

So, I agree about using a prefix, but I suggest not using "NET.", and use "org.cli." instead, or something like this. "NET." is just too generic.

Using someone else's domain name isn't that great either. How about simply "cli"?

Jonathan Pierce commented:

I guess I don't fully understand the problem but I really dislike the idea of requiring prefixes when referencing or importing classes.

Do you mean prefixing in general or just the very long assembly name goo?

Why does the netexp implementation require that the class literal (which is compiled using Class.forName()) for statically linked netexp generated classes be in the classpath?

Currently, the class name doesn't contain enough information to resolve to a type. Only when the netexp generated class is loaded does IKVM notice the attribute in it that says which assembly the type lives in. I didn't want to search all loaded assemblies because that makes the behavior non-deterministic (sort of): if you run the class literal before the assembly gets loaded it would fail, but after it gets loaded it would work.

An Alternate Proposal

I've pretty much given up hope that there is a perfect solution to this problem, but I would like to suggest this simplified solution and see how everyone feels about this.

Basic idea: As soon as an assembly gets loaded into the AppDomain, it becomes part of the boot classpath and its types are "loaded" by the bootstrap class loader (to be clear, each assembly does not live in its own class loader).

Pros:

  • Simple model.
  • It mostly just works.
  • Consistent with the way precompiled Java code is treated.
  • It's basically the existing model.

Cons:

  • No way to deal with class name clashes.
  • Non-deterministic. In case of name clashes, the first one encountered is returned. Class.forName() only works if the assembly was already loaded somehow. However, to make it more usable, Class.forName() will also allow the class name to be an assembly qualified type name. In this case the class returned will have a different name from the one requested.
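The cons above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how IKVM is implemented: FlatBootRegistry, loadAssembly, resolve, and the assembly and type names are all hypothetical. It models a single flat namespace in which a type name resolves only after some assembly defining it has been loaded, and in which the first assembly to define a name wins.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FlatBootRegistry {
    // Maps a type name to the first loaded assembly that defined it.
    private final Map<String, String> ownerByTypeName = new HashMap<>();

    // Simulates an assembly being loaded into the AppDomain (hypothetical model).
    public void loadAssembly(String assemblyName, List<String> typeNames) {
        for (String typeName : typeNames) {
            // First one encountered wins; later clashes are silently ignored.
            ownerByTypeName.putIfAbsent(typeName, assemblyName);
        }
    }

    // Simulates Class.forName(): returns the name of the owning assembly.
    public String resolve(String typeName) throws ClassNotFoundException {
        String owner = ownerByTypeName.get(typeName);
        if (owner == null) {
            // No loaded assembly defines this name (yet).
            throw new ClassNotFoundException(typeName);
        }
        return owner;
    }

    public static void main(String[] args) throws Exception {
        FlatBootRegistry registry = new FlatBootRegistry();
        try {
            registry.resolve("Example.Widget");
        } catch (ClassNotFoundException e) {
            System.out.println("not found before load: " + e.getMessage());
        }
        registry.loadAssembly("AssemblyA", List.of("Example.Widget"));
        registry.loadAssembly("AssemblyB", List.of("Example.Widget"));
        System.out.println("resolved from: " + registry.resolve("Example.Widget"));
    }
}
```

The same call to resolve() fails or succeeds depending on what happened to be loaded earlier, and swapping the two loadAssembly() calls changes which assembly answers for the clashing name.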

Why not have a class loader per assembly?

It turns out that having a class loader per assembly doesn't really solve anything. For the system to be usable, all assembly class loaders would have to be linked together linearly. Here is a diagram:

            bootstrap class loader
                      |
                   mscorlib
                      |
                   System
                      |
              (other assemblies)
                      |
             extension class loader
                      |
            application class loader

Each class loader must always ask its parent to load a particular class first, so when a type is defined in mscorlib there can never be another type with that same name, even if it would be loaded by a different class loader. As an aside, as some of you may know, some (most?) J2EE application servers get around this problem by violating the class loader rules (not calling the parent class loader first). In this case, however, that wouldn't work (and arguably it doesn't work in the J2EE case either).
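The effect of parent-first delegation on a linear chain can be sketched with a toy model. This does not use real java.lang.ClassLoader instances; the Loader class, the "OtherAssembly" loader, and the Example.* type names are hypothetical and exist only to show the lookup order from the diagram.

```java
import java.util.Set;

public class LinearLoaderChain {
    // A toy class loader: a name, a parent, and the set of type names it defines.
    public static final class Loader {
        final String name;
        final Loader parent;
        final Set<String> definedTypes;

        Loader(String name, Loader parent, Set<String> definedTypes) {
            this.name = name;
            this.parent = parent;
            this.definedTypes = definedTypes;
        }

        // Parent-first delegation: returns the name of the loader that defines the type.
        public String load(String typeName) throws ClassNotFoundException {
            if (parent != null) {
                try {
                    return parent.load(typeName);
                } catch (ClassNotFoundException ignored) {
                    // No ancestor defines it; fall through and try this loader.
                }
            }
            if (definedTypes.contains(typeName)) {
                return name;
            }
            throw new ClassNotFoundException(typeName);
        }
    }

    // Builds the linear chain from the diagram and returns the application loader.
    public static Loader buildChain() {
        Loader bootstrap = new Loader("bootstrap", null, Set.of());
        Loader mscorlib = new Loader("mscorlib", bootstrap, Set.of("Example.Widget"));
        Loader system = new Loader("System", mscorlib, Set.of());
        Loader other = new Loader("OtherAssembly", system,
                Set.of("Example.Widget", "Example.Gadget"));
        Loader extension = new Loader("extension", other, Set.of());
        return new Loader("application", extension, Set.of());
    }

    public static void main(String[] args) throws Exception {
        Loader application = buildChain();
        System.out.println(application.load("Example.Widget"));
        System.out.println(application.load("Example.Gadget"));
    }
}
```

In this model the Example.Widget defined by OtherAssembly is unreachable from every loader below mscorlib, which is the name clash problem again, only with more machinery.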

Using a tree shaped class loader hierarchy does allow multiple classes with the same name, but it only works when each leaf of the hierarchy is basically a separate application.

Comments?

Sunday, 17 August 2003 13:41:13 (W. Europe Daylight Time, UTC+02:00)