# Wednesday, 25 November 2009
Running Eclipse with NGEN

In the weeks before PDC I've been working on compiling Eclipse with ikvmc. This work was triggered by Mainsoft's Eyal Alaluf, who asked me to work on this and also provided a desperately needed starting point. I had wanted to do this for ages, but didn't feel like struggling with the Eclipse build system to figure out how to get started.

A couple of the changes in the most recent development snapshot are specifically related to this. In particular, custom assembly class loaders can now be called when the module initializer runs. This enables the statically compiled Eclipse OSGi bundles to be lazily activated on first use.

Instructions

Here are the steps needed to compile Eclipse 3.4.2 x86 on Windows:

  1. Download eclipse-SDK-3.4.2-win32.zip
  2. Download ikvmbin-0.43.3595.zip
  3. Download ikvm-eclipse-0.1.zip
  4. Unzip eclipse-SDK-3.4.2-win32.zip
  5. Open a Command Prompt in the just unzipped eclipse directory
  6. Unzip ikvmbin-0.43.3595.zip in that directory
  7. Unzip ikvm-eclipse-0.1.zip in that directory
  8. Create a directory for the compiled plugins:
    md plugins-compiled
  9. Run ikvmc to compile the eclipse plugins:
    ikvm\bin\ikvmc @response0.txt
    ikvm\bin\ikvmc @response1.txt
    (Ignore the warnings. Note that this takes a while and requires a lot of memory. I haven't tested this on a 32-bit machine; it may well run out of address space there.)
  10. You can now run "eclipse-clr.exe" to start Eclipse. Note that if you compare startup times, the first time that Eclipse starts it does some initial configuration, so don't compare the first startup with the subsequent ones.
  11. Optionally you can run ngen-all.bat to compile all assemblies to native code. Make sure that you have the x86 version of ngen.exe in your path. Note that this also takes a while.
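
For anyone who wants to script steps 8 through 11, here is a minimal Python sketch. It assumes the three archives have already been unzipped into the Eclipse directory as described above. The function names are made up for illustration and are not part of ikvm-eclipse-0.1.zip.

```python
import os
import subprocess


def ikvmc_commands(eclipse_dir):
    """Return the two ikvmc invocations from step 9, in the required order."""
    # An absolute path to ikvmc.exe avoids depending on how the OS resolves
    # relative executable names when a working directory is supplied.
    ikvmc = os.path.join(eclipse_dir, "ikvm", "bin", "ikvmc.exe")
    return [[ikvmc, "@response0.txt"], [ikvmc, "@response1.txt"]]


def compile_eclipse(eclipse_dir, run_ngen=False):
    """Run steps 8, 9 and (optionally) 11 inside the unzipped eclipse directory."""
    # Step 8: directory for the compiled plugins.
    os.makedirs(os.path.join(eclipse_dir, "plugins-compiled"), exist_ok=True)
    # Step 9: both passes; check=True stops at the first failure.
    for cmd in ikvmc_commands(eclipse_dir):
        subprocess.run(cmd, cwd=eclipse_dir, check=True)
    # Step 11: optional, and needs the x86 ngen.exe on the PATH.
    if run_ngen:
        subprocess.run(["cmd", "/c", "ngen-all.bat"], cwd=eclipse_dir, check=True)
```

Calling compile_eclipse(r"C:\eclipse") would run both ikvmc passes; pass run_ngen=True to also run ngen-all.bat afterwards.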

Source Code

The sources for eclipse-clr.exe are in this Visual Studio 2008 solution. It's pretty small and most of what it does is configure and hook OSGi to change the bundle loading and initialization. If you want to build eclipse-clr.exe, you first have to run ikvmc on response0.txt, then build eclipse-clr.exe (it depends on the OSGi assembly built with response0.txt) and after that you can run ikvmc on response1.txt (it depends on eclipse-clr.exe, because that contains the custom assembly class loader used for the bundles).

The response0.txt and response1.txt files were generated from the OSGi manifests. If there is interest I can publish the source for that as well, but it is pretty hacky.
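
To give an idea of what generating a response file from a manifest involves, here is a rough Python sketch. It is not the generator used for this post. It reads only Bundle-SymbolicName and Require-Bundle, ignores Import-Package entirely, and the response-file layout it emits (one brace-delimited group per bundle with -out, -target and -reference options) is an assumption about how such a file could look, not a copy of response1.txt.

```python
def parse_manifest(text):
    """Parse MANIFEST.MF text into a dict, joining continuation lines."""
    headers = {}
    name = None
    for line in text.splitlines():
        if line.startswith(" ") and name is not None:
            # A leading space marks a continuation of the previous header.
            headers[name] += line[1:]
        elif ":" in line:
            name, _, value = line.partition(":")
            headers[name] = value.lstrip()
    return headers


def symbolic_name(headers):
    """Bundle name without directives such as ';singleton:=true'."""
    return headers.get("Bundle-SymbolicName", "").split(";")[0].strip()


def required_bundles(headers):
    """Names listed in Require-Bundle, ignoring versions and directives."""
    raw = headers.get("Require-Bundle", "")
    parts, current, quoted = [], "", False
    for ch in raw:
        if ch == '"':
            quoted = not quoted
        if ch == "," and not quoted:
            # Commas inside quoted version ranges do not separate bundles.
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    return [p.split(";")[0].strip() for p in parts if p.strip()]


def response_lines(name, jar_path, deps):
    """Hypothetical response-file group for one bundle (layout is assumed)."""
    lines = ["{", f"  -out:plugins-compiled\\{name}.dll", "  -target:library"]
    lines += [f"  -reference:plugins-compiled\\{d}.dll" for d in deps]
    lines += [f"  {jar_path}", "}"]
    return lines
```

A real generator would also have to order the groups so that every referenced assembly is built before it is used, and handle the custom assembly class loader from eclipse-clr.exe, both of which this sketch leaves out.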

Performance

When compiled to native code with ngen, Eclipse starts up faster than with JDK 1.6 on my systems. In theory the private working set should also be significantly smaller, allowing multiple Eclipse instances to use far less memory.

Disclaimer

This is just a technology demonstration, not production code and has not been extensively tested.

Wednesday, 25 November 2009 06:32:30 (W. Europe Standard Time, UTC+01:00)
Comments [20]
Wednesday, 25 November 2009 09:38:30 (W. Europe Standard Time, UTC+01:00)
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at IKVM.Reflection.Emit.Writer.ByteBuffer.Grow(Int32 minGrow)
at IKVM.Reflection.Emit.Writer.ByteBuffer.Write(Stream stream)
at IKVM.Reflection.Emit.ModuleBuilder.DefineManifestResource(String name, Stream stream, ResourceAttributes attribute)
at IKVM.Internal.CompilerClassLoader.AddResources(Dictionary`2 resources, Boolean compressedResources)
at IKVM.Internal.CompilerClassLoader.Compile()
at IKVM.Internal.CompilerClassLoader.Compile(List`1 optionsList)
at IkvmcCompiler.Main(String[] args)
Matzon
Wednesday, 25 November 2009 09:39:35 (W. Europe Standard Time, UTC+01:00)
The above was done on a windows xp 32bit with 3 GB ram.
Matzon
Wednesday, 25 November 2009 18:49:33 (W. Europe Standard Time, UTC+01:00)
Am also getting the same OutOfMemoryException on 32 bit windows xp. Is any workaround possible?
Thursday, 26 November 2009 04:34:28 (W. Europe Standard Time, UTC+01:00)
I got this compiling on 32bit windows after breaking response1.txt into two different files. I am happy to share if others are interested. I then NGen'd and eclipse has never started this fast. Great stuff.

One thing that shocked me though is how little sharable data there was between multiple instances. I was expecting to be able to fire up multiple eclipse-clr's and have them share a LOT of memory. Unfortunately it appears that they share very little. I used VADump and got the following output (apologies for formatting). It appears that 120 ish MBs are in 'Other Data' and hence non-sharable. This seems to be consistent with the amount that my total memory usage jumps by each time I start another eclipse-clr. Is this a problem that you are aware of, do you know what is using all the memory in 'Other Data' and can it be addressed?

Many thanks for an amazing java runtime btw, vadump output below.

Catagory                     Total     Total   Private Shareable    Shared
                             Pages    KBytes    KBytes    KBytes    KBytes
Page Table Pages               146       584       584         0         0
Other System                    52       208       208         0         0
Code/StaticData               8014     32056      9768     18416      3872
Heap                          1303      5212      5212         0         0
Stack                           45       180       180         0         0
Teb                             13        52        52         0         0
Mapped Data                    228       912         0       128       784
Other Data                   31969    127876    127872         4         0

Total Modules                 8014     32056      9768     18416      3872
Total Dynamic Data           33558    134232    133316       132       784
Total System                   198       792       792         0         0
Grand Total Working Set      41770    167080    143876     18548      4656
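
As a sanity check on how to read these vadump rows, here is a small Python sketch that works through the arithmetic for the figures quoted above. It assumes 4 KB pages (the x86 default) and that each row's KBytes total is the sum of its private, shareable and shared columns. The function name is made up for illustration.

```python
PAGE_KB = 4  # assumed x86 page size in KBytes


def check_row(pages, total_kb, private_kb, shareable_kb, shared_kb):
    """Verify a vadump row is self-consistent and return its private share in percent."""
    if pages * PAGE_KB != total_kb:
        raise ValueError("pages do not match total KBytes")
    if private_kb + shareable_kb + shared_kb != total_kb:
        raise ValueError("columns do not add up to the total")
    return 100.0 * private_kb / total_kb


# "Grand Total Working Set" and "Other Data" rows from the output above.
grand_private_pct = check_row(41770, 167080, 143876, 18548, 4656)
other_private_pct = check_row(31969, 127876, 127872, 4, 0)

# Share of all private memory that vadump files under "Other Data".
other_share_of_private = 100.0 * 127872 / 143876
```

With these numbers, well over four fifths of the working set is private, and nearly all of that private memory sits in "Other Data", which is why that row is the one worth explaining.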
Thursday, 26 November 2009 06:32:39 (W. Europe Standard Time, UTC+01:00)
On my system when I run "vadump -sop <pid>" I see that the NGENed images are shared (only a single private page per image), so there's definitely sharing, but still the private working set is around 100 MB. I guess that most of that is just managed heap. My numbers look different from yours (I get about 80MB of heap and 45MB of "other data"), but my guess is that in both cases the majority of this is actually managed heap and that vadump simply doesn't recognize it as "heap".
Thursday, 26 November 2009 14:54:28 (W. Europe Standard Time, UTC+01:00)
I wondered whether it was managed heap so I pointed both CLRProfiler and perfmon at it. Both seem to imply that the managed heap does not exceed 30 MB (although I could be reading this wrong). That still leaves about 90 MB (on my system) that is unaccounted for.

CLR Profiler is also pointing out a fair amount of reading of Zip files. Does it still need to open up all the jars? Could that be where the memory has gone?
Thursday, 26 November 2009 15:15:21 (W. Europe Standard Time, UTC+01:00)
On my system the ".NET CLR Memory / # Bytes in all heaps" counter accounts for three quarters of the private memory, so that looks OK to me.

It's correct that the plugin jars are still being read. It's probably possible to completely remove the need for the jars, but it's currently not my intention to work on this. I'm hoping someone else will pick this up though.
Sunday, 29 November 2009 23:52:58 (W. Europe Standard Time, UTC+01:00)
Rob,

How did you go about splitting up the response1.txt file to get it to compile on a 32 bit machine? I've been trying to figure out how to split the file up with no success.

Thanks,
Andrew
Andrew
Monday, 30 November 2009 05:17:57 (W. Europe Standard Time, UTC+01:00)
@Andrew,

I wrote a little script to split them up. I have uploaded them here http://robubu.com/files/. Copy the response files into the eclipse directory and then instead of running ikvm\bin\ikvmc @response1.txt run ikvm\bin\ikvmc @response2.txt and ikvm\bin\ikvmc @response3.txt. Let me know if that works.

@Jeroen,

I found a bug in icu4j that was causing a LOT of heap allocation (the patched jar can be found here http://robubu.com/files/com.ibm.icu_3.8.1.v20080530.jar). I'll post it to them in the next few days, once I have confirmed that it also happens in Java land. Anyway, with that bug fixed I now get better vadump output (below). However, it is still 125MB total with only 20 MB-ish sharable, compared to Eclipse on Java, which is 69 MB total. As I have dug through the heap I see a lot of static final strings getting allocated on the heap. Are you using StringFreezingAttribute? I know basically nothing about .NET; is it worth turning that on, and if so, how do you do it with ikvm? I also see from the docs that it is deprecated. Do you know if there are plans to replace it with something else?

Thanks,

Rob

Catagory                     Total     Total   Private Shareable    Shared
                             Pages    KBytes    KBytes    KBytes    KBytes
Page Table Pages               121       484       484         0         0
Other System                    53       212       212         0         0
Code/StaticData               8235     32940      9764     17600      5576
Heap                          1164      4656      4656         0         0
Stack                           46       184       184         0         0
Teb                             12        48        48         0         0
Mapped Data                    265      1060         0       128       932
Other Data                   21633     86532     86528         4         0

Total Modules                 8235     32940      9764     17600      5576
Total Dynamic Data           23120     92480     91416       132       932
Total System                   174       696       696         0         0
Grand Total Working Set      31529    126116    101876     17732      6508
Monday, 30 November 2009 08:59:27 (W. Europe Standard Time, UTC+01:00)
Rob,

There is a bug in ikvmc that caused a massive amount of strings to be allocated by the ikvm runtime at initialization. That has been fixed and I will post an updated snapshot soon.

The downside of StringFreezingAttribute is that it would require manually interning the strings (as Java requires) and it isn't clear this is worthwhile (especially given the other downsides of StringFreezingAttribute and the fact that many people don't run with NGEN).

I will try your patched icu4j.

Regards,
Jeroen
Monday, 07 December 2009 16:22:03 (W. Europe Standard Time, UTC+01:00)
Jeroen,

I eagerly downloaded the latest 0.43 release to get the bug fix that you mentioned, but things seem to have got a little worse. It appears to start faster but memory sharing has gone down. Here's the latest vadump,

Thoughts?

Rob

Catagory                     Total     Total   Private Shareable    Shared
                             Pages    KBytes    KBytes    KBytes    KBytes
Page Table Pages               139       556       556         0         0
Other System                    45       180       180         0         0
Code/StaticData               7834     31336     10680     16572      4084
Heap                          1164      4656      4656         0         0
Stack                           51       204       204         0         0
Teb                             15        60        60         0         0
Mapped Data                    290      1160         0       132      1028
Other Data                   25620    102480    102476         4         0

Total Modules                 7834     31336     10680     16572      4084
Total Dynamic Data           27140    108560    107396       136      1028
Total System                   184       736       736         0         0
Grand Total Working Set      35158    140632    118812     16708      5112
Monday, 07 December 2009 16:29:06 (W. Europe Standard Time, UTC+01:00)
Rob,

I've found it very hard to get reproducible numbers. I believe that the .NET GC adapts its memory allocation behavior based on how much physical memory the machine has free at any particular moment. So maybe you now have more physical memory available than with the earlier tests? That would be consistent with the fact that you see a slightly faster startup (which is caused by less time spent doing GC because the heap is larger).

Jeroen
Tuesday, 08 December 2009 04:05:09 (W. Europe Standard Time, UTC+01:00)
Jeroen,

Apologies for all the comments, but I can't get the numbers to properly add up and I have thrown all the tools that google has told me about at it.

I am now consistently getting a working set around 140MB and it is the same no matter how many eclipses I start or how much memory my machine has or does not have. The 'other data' is as above i.e. about 100 MB and represents the bulk of the private memory. What doesn't add up is that when I throw CLR Profiler at it, it shows that the total bytes allocated on the heap is only 88 MB and that the final heap size is only 20 MB. There is a tonne of memory going somewhere that I can't track down. I don't think it is on the heap and I am now well past my expertise on troubleshooting memory allocations on windows .net applications.

Any ideas? I would love to get to the bottom of this as having a java runtime that does a tonne of memory sharing across processes is very very appealing, http://robubu.com/?p=29

Rob
Tuesday, 08 December 2009 09:57:37 (W. Europe Standard Time, UTC+01:00)
Rob,

I also have trouble making sense of all the numbers. I did find that VMMap[1] is pretty good at showing where memory goes (and it even has some understanding of the managed heap).

I did more digging and found that you can save a couple of megs of managed heap by deleting *.class from the plugins jar files. I also deleted the *.source_* files (although I don't know if that had any effect) and the directory entries from the jars.
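
The jar trimming described above can be scripted. The following Python sketch copies a jar while dropping *.class entries and directory entries. It is only an illustration of the idea, not the procedure actually used, and it writes to a new file rather than modifying the original jar. It does not touch the *.source_* files.

```python
import zipfile


def strip_jar(src_path, dst_path):
    """Copy src_path to dst_path without .class files and directory entries.

    Returns the number of entries that were dropped.
    """
    dropped = 0
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.is_dir() or info.filename.endswith(".class"):
                dropped += 1
                continue
            # Passing the original ZipInfo keeps the entry name and timestamp.
            dst.writestr(info, src.read(info.filename))
    return dropped
```

Resources such as plugin.xml and META-INF/MANIFEST.MF are kept, since the statically compiled bundles still read them from the jars.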

Regards,
Jeroen

[1] http://technet.microsoft.com/en-us/sysinternals/dd535533.aspx
Tuesday, 08 December 2009 22:49:14 (W. Europe Standard Time, UTC+01:00)
Jeroen,

so this is starting to make more sense, VMMap is showing where the 100 MB is going. On my system it looks like, at most, 30 MB is going to the heap. The remaining 70 MB seems to be in the images and I think it is caused by them missing their base addresses and then having to get fixed up.

I pointed process explorer at it http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx. This appears to confirm the address misses. None of the eclipse libraries (that are responsible for the bulk of the private WS) are hitting their base address. This is due to them all having the same base address. In addition one IKVM library also stands out as having a large private WS, namely Security. When I look at that DLL in process explorer it is also missing its base address so this seems to confirm that the large private WS could be caused by missing the base address. So, a few more questions.

1) Does missing the base address explain the large private working set?
2) Is there any way that you can automatically generate base addresses for the eclipse dlls? Or should I start the rather tedious process of manually assigning them?
3) Can you update the base address for IKVM security, as it seems to miss and that miss alone seems to account for 2.6 MB?

Thanks,

Rob

Wednesday, 09 December 2009 16:18:20 (W. Europe Standard Time, UTC+01:00)
Rob,

Ah yes! The relocation fixups account for a large chunk of private working set. I must admit that I hadn't yet looked at the ngen-ed version with VMMap. I don't know why IKVM.OpenJDK.Security.dll isn't being loaded at its preferred address (in my case it is, but IKVM.OpenJDK.XML.Parser.dll isn't).

I don't have time to investigate at the moment, but I will get back to this.

The IKVM build process auto generates the base addresses (in a way that I had hoped would make it likely for them to be loaded at their preferred address, but the 32-bit address space is pretty cramped). Maybe you could adapt that code (it's pretty trivial) to do the Eclipse assemblies too. It is available here: http://ikvm.cvs.sourceforge.net/viewvc/*checkout*/ikvm/ikvm/tools/updbaseaddresses.cs?revision=1.1
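
To illustrate the general idea of generating non-overlapping base addresses, here is a small Python sketch. It is not a port of updbaseaddresses.cs. The start address, the growth factor (headroom for the native image being larger than the IL assembly) and the function name are all assumptions chosen for illustration. The only hard fact it relies on is the 64 KB allocation granularity on Windows.

```python
ALIGN = 0x10000  # Windows allocation granularity: 64 KB


def assign_base_addresses(image_sizes, start=0x40000000, growth=3):
    """Lay images out back to back, each starting on a 64 KB boundary.

    image_sizes maps assembly name to size in bytes. start and growth are
    hypothetical defaults, not values taken from the IKVM build.
    """
    addresses = {}
    next_base = start
    for name in sorted(image_sizes):
        addresses[name] = next_base
        reserved = image_sizes[name] * growth
        # Round the reserved range up to the next 64 KB boundary.
        next_base += (reserved + ALIGN - 1) // ALIGN * ALIGN
    return addresses
```

The resulting addresses would then have to be fed back into the build. ikvmc has a -baseaddress option as far as I know, but treat that as an assumption to verify, and check afterwards with VMMap or Process Explorer that the images really load at their preferred addresses.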

Regards,
Jeroen
Saturday, 12 February 2011 18:21:15 (W. Europe Standard Time, UTC+01:00)
I tried the same thing with the new Eclipse Helios but ran into problems.
DLLs which have been (or should have been) created by the script, and which are later referenced as such in the same script, cause these kinds of problems:
"Error: The process cannot access the file 'C:\compare\modelbus\plugins-compiled\org.apache.lucene.dll' because it is being used by another process."
If I reference only the namespace org.apache.lucene there is no problem. However, I am not sure if it is the same.

Thanks,
Michael
Tuesday, 03 January 2012 18:22:50 (W. Europe Standard Time, UTC+01:00)
Hey Jeroen, have you done any updates related to running Eclipse NGEN'd since this blog post? I'm looking to play around with NGEN'ing Eclipse but want to do it automated and on a new version of Eclipse. In the original post you mentioned making available the source for generating the response files. I poked around but did not see that.

In practice we will need a full OSGi resolver as more recent versions of Eclipse use Import-Package more so just looking at the manifests one by one will not be enough. I can whip up something that gens the files from a resolved OSGi state but need to know a bit more about the file structure. That info or the code or other pointers would help.

Jeff
Jeff McAffer
Wednesday, 02 May 2012 10:25:10 (W. Europe Daylight Time, UTC+02:00)
Hello

Does it work with the new Eclipse? Could you please share the source code for the response file generation?

I would like to try it with the newest version of eclipse

Thanks in advance!

Sincerely,
Michael
Michael
Wednesday, 02 May 2012 10:39:28 (W. Europe Daylight Time, UTC+02:00)
I have not tried it with a recent version of Eclipse, but reportedly it won't work with just these changes.