# Monday, 19 November 2007
« 0.36 Release Plan | Main | IKVM 0.36 Release Candidate 6 »
Splitting the Class Library Assembly

In a comment to the previous post Peter asks:

Do you plan to split the classpath assembly into multiple smaller assemblies at some point? The OpenJDK-based one is getting *huge* and when you only use a small subset of JDK (as I think most people who don't use IKVM as an alternative to JRE, but use it so that they can embed Java code in .NET app, do), it's pretty high overhead...

Cyprien Noel posted a feature request:

Would it be possible to split the dll with the jvm classes in 2 or more
parts ? E.g. a basic dll with java.lang, util, math, io, net, text, and
another dll with the other features. Thanks

Others have asked similar things. I'm very sympathetic to this request, because the bloat of the Java API has annoyed me ever since JDK 1.2.

Unfortunately this is a very hard problem without upstream support. Since Sun has never had any incentive to reduce dependencies inside the class library (because up until now it has always shipped as a single file), there is a pretty tight coupling between many packages in the Java class library. For example, just the static dependencies of java.lang.Object amount to 1893 classes in the IKVM 0.36 release codebase. Note that these are just the static dependencies and this set of classes won't even allow you to run "Hello World". However, adding a few more classes (to a total of 2676) gives a, what appears to be, workable subset and yields an assembly of approx. 5.5MB.

That looks like a great start, doesn't it? However, the problem is that it isn't at all clear what this assembly supports. For example, it supports the http protocol, but not the https protocol. Simply adding the https protocol handler drags in an additional 1895 classes!

Suppose I put the https support in a separate assembly and when you test your application you only test with http URLs so everything works with just the core assembly and you decide to ship only the core assembly. Now a user of your code tries an https URL and it doesn't work. This may seem like an obvious thing, but there are probably other scenarios that aren't as obvious.

The key point here is that I have no idea of most of the inner workings of the class library, so it would be difficult (or extremely time consuming) for me to figure out the best way to split the classes and to give any guidance as to when you'd need to ship what assembly.

Why I'm Probably Still Going To Do It

There is another argument for splitting the class library assembly: Performance. For example, depending on the scenario, having several smaller assemblies that are demand loaded may be more efficient than a single huge assembly. Especially because the assemblies are strong named and the CLR verifies the assembly hash when it loads the assembly (if the assembly isn't in the GAC).

When?

As is usually the answer: I don't know. It'll be done when it's done. In the meantime you can always download the sources and build your own subset class library assembly :-)

Monday, 19 November 2007 08:51:30 (W. Europe Standard Time, UTC+01:00)  #    Comments [4]