Stuart commented:
Well, I think it's really ugly, but I've run out of better proposals ;) I have
lots of questions and few answers - here are some...
Is it possible for the same class to have multiple names? (There must already
be *some* concessions to this, because, for example, "System.Exception" is also known
as "java.lang.Throwable" - similarly for "Object" and "String").
That's right. In general it is not possible for a class to have multiple names, but
the three classes you name are special cases. They can have multiple names because
Java code will never encounter instances of them (instances will always appear as
java.lang.Object, java.lang.String and java.lang.Throwable). The IKVM reflection code
knows this and can make it appear so that System.Object, System.String and System.Exception
are final classes without constructors and with only static methods. That way Java
code will be able to call (almost) all .NET methods on these types.
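To make the shape of such a class concrete, here is a minimal sketch. It is an illustration only and not IKVM's actual generated code: the class name SystemStringView is hypothetical, and the method bodies simply forward to java.lang.String so the example runs on a plain JVM. It assumes that each .NET instance method is exposed as a static method taking the receiver as its first argument.

```java
// Hypothetical illustration: how System.String might look to Java code when
// exposed as a final class without constructors and with only static methods.
final class SystemStringView {
    // No accessible constructor: Java code never creates instances of this type.
    private SystemStringView() {}

    // Stand-in for the .NET instance method String.Substring(startIndex, length),
    // with the receiver passed explicitly as the first parameter.
    static String Substring(String self, int startIndex, int length) {
        return self.substring(startIndex, startIndex + length);
    }

    // Stand-in for the .NET instance method String.StartsWith(value).
    static boolean StartsWith(String self, String value) {
        return self.startsWith(value);
    }
}
```

Java code would then write SystemStringView.Substring(s, 1, 3) on an ordinary java.lang.String instead of calling an instance method.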
I wonder if there could be some concessions for "well-known" assemblies, such
as corlib, System, and the various System.* and Microsoft.* assemblies that ship with
the framework.
How does this whole thing work with Mono which doesn't support strong names? What
about unsigned assemblies?
It really isn't about strong names. The real issue is that in Java class identities
are resolved based on the class name and class loader hierarchy, while in .NET type
identities are resolved based on type name, assembly name and binding policy.
Those two models are very different and the trick is to find a way to map one onto
the other.
How does this "round-trip"? In other words, if I use ikvmc to compile some Java
code into a .NET DLL, use netexp to export that DLL, and try to use its classes, do
I now need to fully-qualify them?
No, you wouldn't need to fully qualify them, but you would need to make sure that
the first DLL gets loaded into the AppDomain before you do any Class.forName() on
it. This is essentially the problem I'm trying to solve for .NET types, but for round-tripped
code I don't think it is solvable.
I guess my primary feeling is that all these solutions seem to make things worse
than they currently are, where for the *most* part things "just work", and I don't
have to understand strong naming or any of the other details of the .NET assembly
loading model to be able to seamlessly interoperate between the two languages. I wish
there were some way to solve the internal issues while still preserving that niceness-to-use.
I agree. My previous proposal also still has a problem with assembly versioning.
If you run on a later version of the .NET framework, the system assembly versions
no longer match the ones in the Java class name and thus Class.getName() will
return a different name than the one used to load the class.
Obmoloc wrote:
I agree with Stuart. I think that the following code must always work:
assert Class.forName(x).getName() == x.intern();
Assuming you mean Class.forName(x).getName().equals(x), I agree, but that is the
easy one. It should also hold that, for a Class c with a no-arg constructor,
c.newInstance().getClass() == c
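Both invariants can be checked for any ordinary Java class with a small helper. This is a sketch only: the class name ClassIdentityCheck and the method holdsFor are made up for illustration, and it uses getDeclaredConstructor().newInstance() in place of Class.newInstance(), which is deprecated on newer JVMs.

```java
// Hypothetical helper that checks the two round-trip invariants for a class name.
final class ClassIdentityCheck {
    private ClassIdentityCheck() {}

    // Returns true when the name round-trips through Class.forName() and a new
    // instance reports the same Class object. Returns false if the class cannot
    // be found or has no accessible no-arg constructor.
    static boolean holdsFor(String name) {
        try {
            Class<?> c = Class.forName(name);
            boolean nameRoundTrips = c.getName().equals(name);
            Object instance = c.getDeclaredConstructor().newInstance();
            boolean classRoundTrips = instance.getClass() == c;
            return nameRoundTrips && classRoundTrips;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }
}
```

A class that is reachable under two different names would pass the lookup but fail the first comparison for at least one of those names.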
Given that, Class.forName("NET.System.String") should throw a ClassNotFoundException,
as for Class.forName("NET.System.Exception"), and for Class.forName("NET.System.Object").
I believe that there is little or no need to call those methods, and so to import
such classes.
I already explained the how in response to Stuart's question, so let me address the why here.
In order to implement java.lang.String I have to write some code (System.String doesn't
have equivalents for everything java.lang.String has). It's easiest to write
this code in Java, because if I write it in C# I have to manually handle some special
cases that the compiler handles for Java code. Hence I need to have access to the
System.String methods. See StringHelper.java for
how it currently works. This isn't the right way. While writing it I thought of
the remapping of instance methods (and constructors) to static methods, but I haven't
implemented that.
If one needs to know, in a generic way, which Java class represents a .NET class,
a new method should be enough for that. For example, Class.NETforName("NET.System.String")
should return the same value that Class.forName("java.lang.String"). Another method
could be Class.javaNameFromNETName, such as Class.javaNameFromNETName("NET.System.Exception")
returns "java.lang.Throwable".
This wasn't the problem I was trying to solve and I don't think there is any need
to know this.
So, I agree about using a prefix, but I suggest not using "NET.", and use "org.cli."
instead, or something like this. "NET." is just too generic.
Using someone else's domain name isn't that great either. How about simply "cli"?
Jonathan Pierce commented:
I guess I don't fully understand the problem but I really dislike the idea of
requiring prefixes when referencing or importing classes.
Do you mean prefixing in general or just the very long assembly name goo?
Why does the netexp implementation require that the class literal (which is compiled
using Class.forName()) for statically linked netexp generated classes be in the classpath?
Currently, the class name doesn't contain enough information to resolve to a type.
Only when the netexp generated class is loaded does IKVM notice the attribute in
there that tells it what assembly the type lives in. I didn't want to search all loaded
assemblies because that makes the behavior non-deterministic (sort of): if you run
the class literal before the assembly gets loaded it would fail, but after it gets
loaded it would work.
An Alternate Proposal
I've pretty much given up hope that there is a perfect solution to this problem, but
I would like to suggest this simplified solution and see how everyone feels about
this.
Basic idea: As soon as an assembly gets loaded into the AppDomain it becomes part
of the boot classpath and its types are "loaded" by the bootstrap class loader (to be clear,
each assembly does not live in its own class loader).
Pros:
- Simple model.
- It mostly just works.
- Consistent with the way precompiled Java code is treated.
- It's basically the existing model.
Cons:
- No way to deal with class name clashes.
- Non-deterministic. In case of name clashes, the first one encountered is returned.
Class.forName() only works if the assembly was already loaded somehow. However, to
make it more usable, Class.forName() will also allow the class name to be an assembly
qualified type name. In this case the class returned will have a different name from
the one requested.
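The lookup behavior of this proposal can be modeled with a toy registry. This is a simulation under stated assumptions, not IKVM code: BootTypeRegistry, loadAssembly and resolve are hypothetical names, assemblies are represented only by the set of type names they define, and a null return stands in for ClassNotFoundException.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model: every loaded assembly joins a single flat boot classpath.
final class BootTypeRegistry {
    // Assemblies in the order they were loaded, each with the type names it defines.
    private final Map<String, Set<String>> assemblies = new LinkedHashMap<>();

    // Simulates an assembly being loaded into the AppDomain.
    void loadAssembly(String assemblyName, String... typeNames) {
        assemblies.putIfAbsent(assemblyName, new HashSet<>(Arrays.asList(typeNames)));
    }

    // Returns the name of the first loaded assembly that defines the type,
    // or null when no loaded assembly defines it.
    String resolve(String typeName) {
        for (Map.Entry<String, Set<String>> entry : assemblies.entrySet()) {
            if (entry.getValue().contains(typeName)) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```

In this model the result of resolve depends on which assemblies happen to be loaded and in what order, which is the non-determinism listed under the cons.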
Why not have a class loader per assembly?
It turns out that having a class loader per assembly doesn't really solve anything.
For the system to be usable, all assembly class loaders would have to be linked together
linearly. Here is a diagram:
bootstrap class loader
|
mscorlib
|
System
|
(other assemblies)
|
extension class loader
|
application class loader
Each class loader must always ask its parent to load a particular class first, so
when a type is defined in mscorlib there can never be another type with that same
name, even if it would be loaded by a different class loader. As an aside,
as some of you may know, some (most?) J2EE application servers get around this problem
by violating the class loader rules (not calling the parent class loader first), in
this case however that wouldn't work (and arguably in the J2EE case it doesn't work
either).
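The effect of parent-first delegation in a linear chain can be shown with a small simulation. It is a sketch, not a real java.lang.ClassLoader: ChainedLoader and findDefiningLoader are hypothetical names, and each loader is modeled only by the set of type names it could define.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model of a class loader that always asks its parent first.
final class ChainedLoader {
    private final String name;
    private final ChainedLoader parent; // null for the top of the chain
    private final Set<String> ownTypes;

    ChainedLoader(String name, ChainedLoader parent, String... ownTypes) {
        this.name = name;
        this.parent = parent;
        this.ownTypes = new HashSet<>(Arrays.asList(ownTypes));
    }

    // Returns the name of the loader that ends up defining the type, or null
    // when no loader in the chain knows it. The parent is always consulted first.
    String findDefiningLoader(String typeName) {
        if (parent != null) {
            String fromParent = parent.findDefiningLoader(typeName);
            if (fromParent != null) {
                return fromParent;
            }
        }
        return ownTypes.contains(typeName) ? name : null;
    }
}
```

With a chain of mscorlib followed by another assembly that both define the same type name, a lookup started at the end of the chain always resolves to mscorlib, so the second definition is never reachable.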
Using a tree shaped class loader hierarchy does allow multiple classes with the same
name, but it only works when each leaf of the hierarchy is basically a separate application.
Comments?