# Tuesday, 19 July 2005
Assembly File Size

Yesterday's release candidate has a much smaller IKVM.GNU.Classpath.dll than the previous release candidate. As I noted, this is due to some optimizations in the metadata. Let's look at the file size of IKVM.GNU.Classpath.dll over time:

Date        IKVM Version  File Size (bytes)
2004-06-28  0.8.0.0               3,239,936
2005-01-10  0.10.0.1              7,348,224
2005-03-02  0.12.0.0              7,409,664
2005-05-07  0.14.0.1              7,483,392
2005-07-01  0.16.0.0              7,266,304
2005-07-18  0.18.0.0              6,782,976


For reference, here's a graph that shows the growth of GNU Classpath:

The big jump in size between 0.8 and 0.10 is mostly due to three reasons: 1) Long period between releases, 2) Huge growth in GNU Classpath, 3) 0.10 for the first time includes source file names and line number tables (to be able to show source files and line numbers in stack traces).

The size reduction in 0.16 was due to a more efficient format for the line number tables. After making this optimization in 0.16, I wanted to investigate exactly what makes up the size of IKVM.GNU.Classpath.dll, but I couldn't find any tools to analyse a managed PE file based on this criterion. So I opened the ECMA specification and hacked together some code. Here's what I found:

First, let's start with the size of the Java classes and resources that IKVM.GNU.Classpath.dll consists of (to have some reference):

 

  bytes
Classes 9,694,349
Resources 2,016,851
Total 11,711,200
Zipped 5,843,852


So, compared with the uncompressed size of the classes and resources, the size of IKVM.GNU.Classpath.dll isn't too bad at all.

Here's a breakdown of the parts of the PE file structure:

  bytes
PE Headers/overhead 4,096
.text section 6,770,688
.rsrc section 4,096
.reloc section 4,096


Not very interesting, except maybe that there is a .reloc section that I don't understand the need for, since there's only managed code in this module.
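A minimal sketch of how a per-section breakdown like this could be produced, assuming nothing about the analysis code actually used for the numbers above. It walks the PE/COFF headers of an image held in memory and returns each section's name and raw size; the function name `pe_sections` is made up for illustration.

```python
import struct

def pe_sections(data: bytes):
    """Return [(name, size_of_raw_data), ...] for a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("not a DOS/PE image")
    # e_lfanew (offset of the PE signature) lives at 0x3C in the DOS header.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF file header: 20 bytes right after the 4-byte signature.
    (_machine, n_sections, _time, _symtab, _nsyms,
     opt_size, _flags) = struct.unpack_from("<HHIIIHH", data, pe_off + 4)
    # The section table follows the optional header; each entry is 40 bytes.
    table = pe_off + 4 + 20 + opt_size
    sections = []
    for i in range(n_sections):
        entry = table + 40 * i
        name = data[entry:entry + 8].rstrip(b"\0").decode("ascii")
        # VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData
        _vsize, _rva, raw_size, _raw_ptr = struct.unpack_from("<IIII", data, entry + 8)
        sections.append((name, raw_size))
    return sections
```

Reading the file with `open(path, "rb").read()` and passing the bytes to `pe_sections` gives the raw section sizes; whatever precedes the first section's raw data is the header overhead.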

A little more interesting is a breakdown of the .text section:

  bytes
Unknown 8
CLI Header 72
Code + resources 3,651,236
Managed metadata 3,116,284
Filler (alignment) 3,088


Here's a breakdown of the managed metadata:

  bytes
Metadata Header 32
#~ header 12
#Strings header 20
#US header 12
#GUID header 16
#Blob header 16
#~ stream 1,704,068
String heap 356,048
Userstring heap 327,652
GUID heap 16
Blob heap 728,392


And finally, a breakdown of the #~ stream:

  bytes
Header 112
Module table 12
TypeRef table 2,210
TypeDef table 79,146
Field table 148,620
Method table 669,870
Param table 220,448
InterfaceImpl table 9,472
MemberRef table 5,424
Constant table 47,370
CustomAttribute table 488,676
StandAloneSig table 24,216
PropertyMap table 8
Property table 90
MethodSemantics table 66
MethodImpl table 600
ModuleRef table 8
TypeSpec table 400
ImplMap table 108
Assembly table 28
AssemblyRef table 112
ManifestResource table 3,654
NestedClass table 3,416
Filler (alignment) 2
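
The four breakdowns nest: the rows of each table add up to one entry in the table before it, and the PE parts add up to the 0.18 file size. A quick sanity check with the figures copied from the tables above (the variable names are only for illustration):

```python
# Figures copied from the breakdown tables above (bytes).
pe_parts = [4_096, 6_770_688, 4_096, 4_096]           # headers, .text, .rsrc, .reloc
text_parts = [8, 72, 3_651_236, 3_116_284, 3_088]     # Unknown .. Filler
metadata_parts = [32, 12, 20, 12, 16, 16,             # the six headers
                  1_704_068, 356_048, 327_652, 16, 728_392]
tilde_stream_parts = [112, 12, 2_210, 79_146, 148_620, 669_870, 220_448,
                      9_472, 5_424, 47_370, 488_676, 24_216, 8, 90, 66,
                      600, 8, 400, 108, 28, 112, 3_654, 3_416, 2]

assert sum(tilde_stream_parts) == 1_704_068   # the #~ stream
assert sum(metadata_parts) == 3_116_284       # managed metadata
assert sum(text_parts) == 6_770_688           # .text section
assert sum(pe_parts) == 6_782_976             # file size of the 0.18 DLL

# Share of the whole file taken by the CustomAttribute table rows alone.
custom_attribute_share = 488_676 / sum(pe_parts)
print(f"{custom_attribute_share:.1%}")
```

All four assertions hold, and the last line prints 7.2%.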


For those unfamiliar with the CLI metadata specification, these tables contain fixed length records and the fields in the records typically contain flags, indexes into other tables or pointers into a string, userstring or blob heap, or an offset into the Code + resources part of the .text section.
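
As a rough illustration of how the record length comes about, here is a sketch for the CustomAttribute table, whose rows consist of a HasCustomAttribute coded index, a CustomAttributeType coded index and an index into the blob heap. The helper names are made up, the row counts passed in are hypothetical, and the heap rule is simplified (the real width is signalled by a flag in the #~ header, which is normally set once a heap reaches 64 KB).

```python
def coded_index_size(tag_bits: int, max_rows: int) -> int:
    """Bytes used by a coded index: 2 if the largest target table's row
    numbers fit in the bits left over after the tag, otherwise 4."""
    return 2 if max_rows < (1 << (16 - tag_bits)) else 4

def heap_index_size(heap_bytes: int) -> int:
    """Bytes used by an offset into a heap (simplified rule, see above)."""
    return 2 if heap_bytes < (1 << 16) else 4

def custom_attribute_row_size(max_parent_rows: int, max_ctor_rows: int,
                              blob_heap_bytes: int) -> int:
    parent = coded_index_size(5, max_parent_rows)  # HasCustomAttribute: 5 tag bits
    ctor = coded_index_size(3, max_ctor_rows)      # CustomAttributeType: 3 tag bits
    value = heap_index_size(blob_heap_bytes)       # offset into the blob heap
    return parent + ctor + value

# Hypothetical row counts for a large assembly; the blob heap size is the
# 728,392 bytes listed above.
print(custom_attribute_row_size(10_000, 10_000, 728_392))
```

With tables this large all three columns widen to 4 bytes, so the sketch yields 12 bytes per row; a small assembly would get away with 6.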

From the above table it should be obvious that the custom attributes contribute a significant part of the file size (and remember, this is the 0.18 version of IKVM.GNU.Classpath.dll that has already been optimized quite a bit).

In my analysis tool I built specific code to look at the sizes of the custom attributes and here's the report it generated:

Custom Attributes
-----------------
Total Blob Bytes:  527670
Total Table Bytes: 488676

Type         Constructor
----         -----------
0x0000000B   IKVM.Attributes.SourceFileAttribute::.ctor(20 01 01 0E ( ...))
0x00000013   IKVM.Attributes.JavaModuleAttribute::.ctor(20 00 01 ( ..))
0x0000001B   IKVM.Attributes.RemappedTypeAttribute::.ctor(20 01 01 12 15 ( ....))
0x00000023   IKVM.Attributes.ModifiersAttribute::.ctor(20 01 01 11 1D ( ....))
0x0000002B   IKVM.Attributes.GhostInterfaceAttribute::.ctor(20 00 01 ( ..))
0x00000033   IKVM.Attributes.HideFromJavaAttribute::.ctor(20 00 01 ( ..))
0x0000003B   IKVM.Attributes.ImplementsAttribute::.ctor(20 01 01 1D 0E ( ....))
0x00000043   System.ThreadStaticAttribute::.ctor(20 00 01 ( ..))
0x0000004B   System.ObsoleteAttribute::.ctor(20 00 01 ( ..))
0x00000053   IKVM.Attributes.InnerClassAttribute::.ctor(20 02 01 0E 11 1D ( .....))
0x0000005B   IKVM.Attributes.ExceptionIsUnsafeForMappingAttribute::.ctor(20 00 01 ( ..))
0x00000063   System.ComponentModel.EditorBrowsableAttribute::.ctor(20 01 01 11 80 FD ( .....))
0x0000006B   IKVM.Attributes.NameSigAttribute::.ctor(20 02 01 0E 0E ( ....))
0x00000073   IKVM.Attributes.ThrowsAttribute::.ctor(20 01 01 1D 0E ( ....))
0x0000007B   IKVM.Attributes.RemappedInterfaceMethodAttribute::.ctor(20 02 01 0E 0E ( ....))
0x0000010B   IKVM.Attributes.LineNumberTableAttribute::.ctor(20 01 01 1D 05 ( ....))
0x0000023B   IKVM.Attributes.MirandaMethodAttribute::.ctor(20 00 01 ( ..))
0x000008A3   IKVM.Attributes.ConstantValueAttribute::.ctor(20 01 01 08 ( ...))
0x00000DFB   System.Diagnostics.DebuggableAttribute::.ctor(20 02 01 02 02 ( ....))
0x00000E03   IKVM.Attributes.RemappedClassAttribute::.ctor(20 02 01 0E 12 15 ( .....))
0x00000E0B   System.Reflection.AssemblyCompanyAttribute::.ctor(20 01 01 0E ( ...))
0x00000E13   System.Reflection.AssemblyCopyrightAttribute::.ctor(20 01 01 0E ( ...))
0x00000E1B   System.Reflection.AssemblyTitleAttribute::.ctor(20 01 01 0E ( ...))
0x00000E23   System.Reflection.AssemblyProductAttribute::.ctor(20 01 01 0E ( ...))

Type            Count  Blob Bytes  Total Bytes
----            -----  ----------  -----------
0x0000000B        917        7522        18526
0x00000013          1           5           17
0x0000001B          4         404          452
0x00000023        390         140         4820
0x0000002B          3           0           36
0x00000033        593           0         7116
0x0000003B       1785       33923        55343
0x00000043          6           0           72
0x0000004B        389           0         4668
0x00000053        776        3689        13001
0x0000005B         42           1          505
0x00000063         68           9          825
0x0000006B         89        2982         4050
0x00000073       5657       22069        89953
0x0000007B          1          25           37
0x0000010B      29864      456058       814426
0x0000023B        128           0         1536
0x000008A3          1           9           21
0x00000DFB          1           7           19
0x00000E03          4         479          527
0x00000E0B          1          21           33
0x00000E13          1         272          284
0x00000E1B          1          41           53
0x00000E23          1          14           26
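
The values in the Type column read like raw CustomAttributeType coded indexes: in ECMA-335 the low 3 bits of such an index select the target table (2 = MethodDef, 3 = MemberRef) and the remaining bits give the row number. That reading is an interpretation of the report, not something stated in it. A small sketch of the decoding, with a made-up function name:

```python
# Tag values defined for CustomAttributeType; 0, 1 and 4 are not used.
CUSTOM_ATTRIBUTE_TYPE_TABLES = {2: "MethodDef", 3: "MemberRef"}

def decode_custom_attribute_type(coded: int):
    """Split a CustomAttributeType coded index into (table name, row number)."""
    tag = coded & 0x7   # low 3 bits: which table
    row = coded >> 3    # remaining bits: 1-based row in that table
    return CUSTOM_ATTRIBUTE_TYPE_TABLES.get(tag, f"unused tag {tag}"), row

for coded in (0x0B, 0x13, 0x10B, 0xE23):
    table, row = decode_custom_attribute_type(coded)
    print(f"0x{coded:08X} -> {table} row {row}")
```

Under that reading, the four sample values decode to MemberRef rows 1, 2, 33 and 452, which fits constructors that live in another assembly.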

For comparison, here's the same table for the 0.16 version:

Custom Attributes
-----------------
Total Blob Bytes:  638882
Total Table Bytes: 631896

Type         Constructor
----         -----------
0x0000000B   IKVM.Attributes.JavaModuleAttribute::.ctor(20 00 01 ( ..))
0x00000013   IKVM.Attributes.SourceFileAttribute::.ctor(20 01 01 0E ( ...))
0x0000001B   IKVM.Attributes.RemappedTypeAttribute::.ctor(20 01 01 12 15 ( ....))
0x00000023   IKVM.Attributes.ModifiersAttribute::.ctor(20 01 01 11 1D ( ....))
0x0000002B   IKVM.Attributes.GhostInterfaceAttribute::.ctor(20 00 01 ( ..))
0x00000033   IKVM.Attributes.HideFromJavaAttribute::.ctor(20 00 01 ( ..))
0x0000003B   IKVM.Attributes.ImplementsAttribute::.ctor(20 01 01 1D 0E ( ....))
0x00000043   System.ThreadStaticAttribute::.ctor(20 00 01 ( ..))
0x0000004B   System.ObsoleteAttribute::.ctor(20 00 01 ( ..))
0x00000053   IKVM.Attributes.InnerClassAttribute::.ctor(20 04 01 0E 0E 0E 11 1D ( .......))
0x0000005B   IKVM.Attributes.ExceptionIsUnsafeForMappingAttribute::.ctor(20 00 01 ( ..))
0x00000063   System.ComponentModel.EditorBrowsableAttribute::.ctor(20 01 01 11 80 FD ( .....))
0x0000006B   IKVM.Attributes.NameSigAttribute::.ctor(20 02 01 0E 0E ( ....))
0x00000073   IKVM.Attributes.ThrowsAttribute::.ctor(20 01 01 1D 0E ( ....))
0x0000007B   IKVM.Attributes.RemappedInterfaceMethodAttribute::.ctor(20 02 01 0E 0E ( ....))
0x0000010B   IKVM.Attributes.LineNumberTableAttribute::.ctor(20 01 01 1D 05 ( ....))
0x00000233   IKVM.Attributes.MirandaMethodAttribute::.ctor(20 00 01 ( ..))
0x000008A3   IKVM.Attributes.ConstantValueAttribute::.ctor(20 01 01 08 ( ...))
0x00000DFB   System.Diagnostics.DebuggableAttribute::.ctor(20 02 01 02 02 ( ....))
0x00000E03   IKVM.Attributes.RemappedClassAttribute::.ctor(20 02 01 0E 12 15 ( .....))
0x00000E0B   System.Reflection.AssemblyCompanyAttribute::.ctor(20 01 01 0E ( ...))
0x00000E13   System.Reflection.AssemblyCopyrightAttribute::.ctor(20 01 01 0E ( ...))
0x00000E1B   System.Reflection.AssemblyTitleAttribute::.ctor(20 01 01 0E ( ...))
0x00000E23   System.Reflection.AssemblyProductAttribute::.ctor(20 01 01 0E ( ...))

Type            Count  Blob Bytes  Total Bytes
----            -----  ----------  -----------
0x0000000B          1           5           17
0x00000013       4216       82327       132919
0x0000001B          4         404          452
0x00000023       5829         224        70172
0x0000002B          3           0           36
0x00000033       3209           0        38508
0x0000003B       1775       33852        55152
0x00000043          6           0           72
0x0000004B        389           0         4668
0x00000053        766       78090        87282
0x0000005B         42           1          505
0x00000063         68           9          825
0x0000006B         89        2982         4050
0x00000073       5650       22069        89869
0x0000007B          1          25           37
0x0000010B      30469      418060       783688
0x00000233        131           0         1572
0x000008A3          1           0           12
0x00000DFB          1           7           19
0x00000E03          4         479          527
0x00000E0B          1          21           33
0x00000E13          1         272          284
0x00000E1B          1          41           53
0x00000E23          1          14           26

One notable item is that the line number tables have grown in 0.18. This is because 0.18 was compiled with the Eclipse Java Compiler, whereas 0.16 was compiled with Jikes. For some reason, the Eclipse compiler generates larger line number tables. I haven't investigated this yet.
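
To put a number on that growth, the blob bytes of the LineNumberTableAttribute line can be divided by its count in each report. This is only a back-of-the-envelope sketch, and it assumes the Blob Bytes column counts every attribute's blob rather than only distinct ones.

```python
# (count, blob bytes) for LineNumberTableAttribute, copied from the two reports.
line_tables = {
    "0.16 (Jikes)": (30_469, 418_060),
    "0.18 (Eclipse)": (29_864, 456_058),
}

def average_blob_bytes(count: int, blob_bytes: int) -> float:
    """Mean encoded size of one line number table."""
    return blob_bytes / count

for version, (count, blob_bytes) in line_tables.items():
    print(f"{version}: {average_blob_bytes(count, blob_bytes):.2f} bytes per table")
```

That works out to roughly 13.7 bytes per table for 0.16 against 15.3 for 0.18, even though 0.18 has slightly fewer tables.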

The new -strictfinalfieldsemantics ikvmc option was the direct result of studying the impact of metadata on the file size. Without this option, public and protected final fields are converted into readonly properties and the field requires an additional attribute. With the option, final fields are converted into initonly fields, which have the same semantics under a strict interpretation of the 1.5 VM specification. This option alone saves 155,648 bytes.

Looking at the custom attribute sizes, there appears to be more room for improvement. In particular, the ThrowsAttribute, InnerClassAttribute and ImplementsAttribute can benefit from using tokens instead of encoding the class names in the constructor blob, but the required APIs to resolve tokens are new in Whidbey, so for the time being that isn't an option.

Another long-term improvement would be to include the line number tables in the method IL (or after the IL), to save on the records in the custom attribute table (which contribute a very significant 358,368 bytes). This would probably be possible in Whidbey by using MethodBody.GetILAsByteArray(), but it would be nicer if the ECMA spec were extended to support this directly (it would also remove the need for the ridiculously large PDB files simply to get line numbers in stack traces for other .NET applications).
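
The 358,368 bytes can be recovered from the LineNumberTableAttribute line of the 0.18 report: whatever is not blob is table rows. A short sketch of that arithmetic (the function name is made up for illustration):

```python
def table_row_bytes(blob_bytes: int, total_bytes: int) -> int:
    """Bytes spent on CustomAttribute table rows, i.e. the non-blob part."""
    return total_bytes - blob_bytes

# LineNumberTableAttribute in the 0.18 report: count, blob bytes, total bytes.
count, blob_bytes, total_bytes = 29_864, 456_058, 814_426

row_bytes = table_row_bytes(blob_bytes, total_bytes)
print(row_bytes)                                     # 358368
print(row_bytes // count)                            # 12 bytes per table row
print(f"{row_bytes / 6_782_976:.1%} of the 0.18 file")
```

So each line number table costs 12 bytes of table row on top of its blob, and the rows alone account for about 5.3% of the 0.18 file.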

Tuesday, 19 July 2005 11:06:52 (W. Europe Daylight Time, UTC+02:00)
Sunday, 31 July 2005 18:33:36 (W. Europe Daylight Time, UTC+02:00)
Extract the "Eclipse JDT Compiler" jdtcore.jar from one of these 3 files from http://download.eclipse.org/eclipse/downloads/ :

eclipse-JDT-3.1.zip
eclipse-JDT-SDK-3.0.2.zip
eclipse-JDT-2.1.3.zip

and run the batch compiler:

java -jar jdtcore.jar org.eclipse.jdt.internal.compiler.batch.Main -classpath rt.jar A.java

Reference:
http://dev.eclipse.org/viewcvs/index.cgi/%7Echeckout%7E/jdt-core-home/howto/batch%20compile/batchCompile.html

The latest eclipse-JDT-M20050727-1200.zip is not ready, I'm sorry.

I won't need Sun's JDK-1.4.2, but I may or may not need Sun's JRE-1.4.2 with IBM's "Eclipse JDT Compiler".

My favourite JREs for my free future are ikvm, kaffe, GCJ's gij, jamvm, jcvm, sablevm, cacaojvm, jikesrvm, Apache's Harmony, kissme, ... each with 100% GNU Classpath.

Sincerely yours, 31st of July 2005.
J.C. Pizarro.