# Tuesday, 07 August 2007
Floating Point "Redundant" Cast Performance on the CLR

Let's start out with a simple micro benchmark:

using System;
using System.Threading;

class Program
{
    public static void Main()
    {
        int start = Environment.TickCount;
        double[] d = new double[1000];
        for (int i = 0; i < 1000000; i++)
        {
            for (int j = 0; j < d.Length; j++)
            {
                d[j] = (double)(3.0 * d[j]);
            }
        }
        int end = Environment.TickCount;
        Console.WriteLine(end - start);
    }
}

On my system this takes about 7 seconds when run in optimized mode (i.e. not in the debugger).

Here's the optimized x86 code generated by the 2.0 CLR JIT for the body of the inner loop:

fld   qword ptr [ecx+edx*8+8]      ; d[j]
fmul  dword ptr ds:[007B1230h]     ; * 3.0 
fstp  qword ptr [esp]              ; (double) 
fld   qword ptr [esp]              ; (double) 
fstp  qword ptr [ecx+edx*8+8]      ; d[j] = 

The first thing that jumps out is that the double cast takes two x87 instructions, a store and a load. Part of the reason the cast is expensive is that the value has to leave the FPU, go to main memory and come back into the FPU. In this particular case it turns out to be very expensive, because esp happens not to be 8 byte aligned.

Making a seemingly unrelated change can make the micro benchmark much faster. Just adding the following two lines at the top of the Main method makes the loop run in about 2.3 seconds on my system:

    double dv = 0.0;
    Interlocked.CompareExchange(ref dv, dv, dv);

The reason for this performance improvement becomes clear when we look at the method prologue in the new situation:

push  ebp 
mov   ebp,esp 
and   esp,0FFFFFFF8h 
push  edi 
push  esi 
push  ebx 
sub   esp,14h

This results in an 8 byte aligned esp pointer. As a result the fstp/fld instructions will run much faster. It looks like a "bug" in the JIT that it doesn't align the stack in the first scenario.
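The effect of the `and esp,0FFFFFFF8h` instruction can be illustrated with a little arithmetic. The following is only a sketch: the class and method names are made up for illustration and the sample address is a hypothetical esp value, not one taken from the actual run.

```csharp
using System;

static class StackAlignmentSketch
{
    // Same mask as "and esp,0FFFFFFF8h": clears the low three bits,
    // which rounds the address down to a multiple of 8.
    public static uint AlignDown8(uint address)
    {
        return address & 0xFFFFFFF8u;
    }

    public static void Main()
    {
        // Hypothetical esp value that is 4 byte aligned but not 8 byte aligned.
        uint misaligned = 0x0012F4ECu;
        uint aligned = AlignDown8(misaligned);

        // Prints "0012F4EC -> 0012F4E8"
        Console.WriteLine("{0:X8} -> {1:X8}", misaligned, aligned);
    }
}
```

Because the stack grows downwards, rounding esp down only wastes a few bytes of stack and never overlaps the caller's frame.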

Of course, the much more obvious question is: Why does the cast generate code at all, isn't a double already a double?

Before answering this question, let's first look at another minor change to the micro benchmark. Let's remove the Interlocked.CompareExchange() again and change the inner loop body to the following:

    double v = 3.0 * d[j];
    d[j] = (double)v;

With this change, the loop now takes just 1 second on my system. When we look at the x86 code generated by the JIT, it becomes obvious why:

fld   qword ptr [ecx+edx*8+8] 
fmul  dword ptr ds:[002A1170h] 
fstp  qword ptr [ecx+edx*8+8]

The redundant fstp/fld instructions are gone.
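To compare the two loop bodies side by side, something like the following harness could be used. It is a sketch, not the code behind the numbers in this post: the class, method and delegate names are invented for illustration, and it uses System.Diagnostics.Stopwatch instead of Environment.TickCount. The timings depend entirely on the JIT and CPU, so no particular output should be expected.

```csharp
using System;
using System.Diagnostics;

static class CastBenchmarkSketch
{
    // Hypothetical delegate type so both loop variants can share one timer.
    public delegate void LoopBody(double[] d, int iterations);

    // The cast is applied directly to the result of the multiplication.
    public static void CastOnExpression(double[] d, int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            for (int j = 0; j < d.Length; j++)
            {
                d[j] = (double)(3.0 * d[j]);
            }
        }
    }

    // The result is first assigned to a local, then cast.
    public static void CastOnLocal(double[] d, int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            for (int j = 0; j < d.Length; j++)
            {
                double v = 3.0 * d[j];
                d[j] = (double)v;
            }
        }
    }

    // Runs the body once on a fresh zero-filled array and returns elapsed milliseconds.
    public static long Time(LoopBody body, int iterations)
    {
        double[] d = new double[1000];
        Stopwatch sw = Stopwatch.StartNew();
        body(d, iterations);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Warm-up calls so that JIT compilation is not included in the timing.
        Time(CastOnExpression, 1);
        Time(CastOnLocal, 1);

        Console.WriteLine("cast on expression: {0} ms", Time(CastOnExpression, 1000000));
        Console.WriteLine("cast on local:      {0} ms", Time(CastOnLocal, 1000000));
    }
}
```

Keeping each variant in its own method matters here, because the stack alignment discussed above is a property of the method prologue, so the variants should not share a method body.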

Back to the question of why the cast isn't always optimized away. The reason for this lies in the fact that the x87 FPU internally uses an extended 80 bit representation for floating point numbers. When you explicitly cast to a double, the ECMA CLI specification requires that this results in a conversion from the internal representation into the IEEE 64 bit representation. Of course, in this scenario we're already storing the value in memory, so this necessarily implies a conversion to the 64 bit representation, making the extra fstp/fld unnecessary.
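The 80 bit x87 representation can't be observed directly from C#, but the same kind of narrowing can be shown one level down, with float standing in for double and double standing in for the extended format. The snippet below is only an analogy under that assumption, with made-up names; it shows that an explicit cast to a narrower type really does change the value, which is why the JIT cannot treat such a conversion as a no-op in general.

```csharp
using System;

static class NarrowingSketch
{
    // Rounds a double to float precision and widens it back to double.
    public static double RoundTripThroughFloat(double value)
    {
        return (double)(float)value;
    }

    public static void Main()
    {
        double wide = 1.0 / 3.0;
        double narrowed = RoundTripThroughFloat(wide);

        // 1/3 is not exactly representable in either format, and the float
        // approximation differs from the double one, so this prints "False".
        Console.WriteLine(wide == narrowed);
    }
}
```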

Finally, in x64 mode all three variations of the benchmark take 1 second on my system. This is because the x64 CLR JIT uses SSE instructions that internally work on the IEEE 64 bit representation of doubles, so the cast is optimized away in all situations here.

For completeness, here's the code generated by the x64 JIT for the inner loop body:

movsd  xmm0,mmword ptr [rcx] 
mulsd  xmm0,mmword ptr [000000C0h] 
movsd  mmword ptr [rcx],xmm0 

Tuesday, 07 August 2007 15:02:08 (W. Europe Daylight Time, UTC+02:00)  #    Comments [2]
# Monday, 06 August 2007
IKVM 0.34 Update

I made another 0.34 update, since 0.36 is probably still a ways off.

Changes:

  • Fixed handling of magic “assembly” type for assembly attribute annotations (bug #1721688).
  • LocalVariableTable robustness fix (bug #1765952).
  • Fixed handling of public interfaces extending non-public interfaces in ikvmc.
  • Fixed parameter annotations on redirected constructors.
  • Fixed casting ghost interface arrays (bug #1757889).
  • Fixed JNI NewObject method.
  • Fix to make sure all implemented interface methods on .NET types are public (so that ikvmstub generates jars that javac is happy with).

Files are available here: ikvm-0.34.0.4.zip (source + binaries) and ikvmbin-0.34.0.4.zip (binaries).

Update: I forgot to update the AWT toolkit property's assembly version. Fixed in current zips. Thanks to Ted O'Conner for pointing this out.

Monday, 06 August 2007 10:41:11 (W. Europe Daylight Time, UTC+02:00)  #    Comments [1]
# Thursday, 02 August 2007
What about AWT / Swing Support?

In the comments to the previous entry Martin asked:

Would you support AWT / Swing as the Sun Version?

I've always said that AWT / Swing are not a priority for me and that won't change. Having said that, my current idea is to have two AWT back-ends, the default one similar to the current situation, i.e. a .NET Windows Forms based partial implementation. The second (optional) one would be the OpenJDK based implementation, using the OpenJDK native libraries. However, as usual, this is all subject to change.

Andrew asked:

Okay, just wanted to clarify what you mean by 'integrating'. My understanding is you actually replace the GNU Classpath code with code copied from OpenJDK and then try and implement the necessary VM wotsits -- is that correct? If that is the case, presumably the end goal is to complete replace all the Classpath code. At the moment, that'd be a real shame because it would make IKVM more broken as lots of stuff like AWT/Swing is broken on OpenJDK.

That's not really true on IKVM. IKVM has never supported GNU Classpath's AWT/Swing implementation (it includes only the GNU Classpath implementation of the public API, the peer implementations are not used).

In case you're wondering why IKVM doesn't support the GNU Classpath peers, that's because they don't support Windows. In addition, IKVM is built on top of the CLI and (almost all) the IKVM binaries are OS and CPU architecture independent; using native code is not compatible with that feature.

Thursday, 02 August 2007 09:18:13 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
# Wednesday, 01 August 2007
New Hybrid Snapshot

Major code bloat. I've now integrated the bulk of the OpenJDK code that isn't AWT or Swing. As a result the IKVM.Hybrid.GNU.Classpath.OpenJDK.dll assembly size has grown to about 26 MB. API coverage compared with JDK 1.6 is now at more than 98%. Note that this doesn't mean that everything will work, because some back-end implementations are stubbed out or not included.

Disclaimers apply. I haven't done a full test pass on this build.

Changes:

  • OpenJDK: Integrated javax.management package (and sub packages).
  • OpenJDK: Integrated java.lang.management package (only a stub back-end implementation though).
  • OpenJDK: Integrated javax.imageio package (excluding the jpeg support, because OpenJDK uses native code for that).
  • OpenJDK: Integrated javax.activation, javax.annotation, javax.jws, javax.lang.model, javax.tools, javax.xml.*, org.jcp.xml.dsig.internal, org.relaxng.datatype, org.w3c.dom.*, org.xml.sax.* packages.
  • OpenJDK: Integrated javax.sql.* packages.
  • OpenJDK: Integrated javax.accessibility, javax.transaction, javax.activity packages.
  • OpenJDK: Integrated javax.print.* packages (no back-end implementation and ServiceUI is stubbed.)
  • OpenJDK: Integrated org.omg.*, javax.rmi.*, javax.sound.*, org.ietf.jgss packages.
  • Fixed JNI NewObject method to actually create an object of the requested class, instead of the class of the constructor.
  • Added method name clash handling for AOT access stub methods.

Binaries available here: ikvmbin-hybrid-0.35.2769.zip.

Wednesday, 01 August 2007 10:16:12 (W. Europe Daylight Time, UTC+02:00)  #    Comments [2]
# Thursday, 26 July 2007
New Hybrid Snapshot

A major step in integrating OpenJDK code. One of the bottlenecks was the fact that currently OpenJDK is missing most of the crypto code, but thanks to the IcedTea project I've been able to integrate the OpenJDK java.security package and packages that depend on the crypto code (like java.net). I've also rewritten all socket implementation classes (both classic and nio) based on the OpenJDK code. FileChannelImpl and the direct and mapped byte buffers still need to be converted.

Some of the integrated packages don't have any back-end implementation (e.g. smartcardio and jgss). I'm not likely to implement smartcard support and will revisit the security and crypto stuff once Sun releases the crypto code.

Disclaimers apply. I haven't done a full test pass on this build.

Changes:

  • OpenJDK: Added support for "loading" fake native libraries from VFS and removed hack to bypass loadLibrary() call in System.initializeSystemClass().
  • OpenJDK: Integrated OpenJDK packages: java.net, java.security, java.util.jar, javax.naming, javax.net, javax.security, javax.smartcardio, java.nio.charset, java.nio.channels, java.nio.channels.spi
  • OpenJDK: Integrated IcedTea crypto/security classes.
  • OpenJDK: Fixed a race condition in Thread.interrupt().
  • Allow Object[] to be cast/assigned to ghost array. Fix for bug 1757889.
  • Fixed assembly annotation support (bug 1721688).
  • Added ikvmc warning when annotation type isn't found.
  • Added WINDOWS constant to ikvm.internal.Util to check if we're running on Windows.

Binaries available here: ikvmbin-hybrid-0.35.2763.zip.

Thursday, 26 July 2007 14:15:37 (W. Europe Daylight Time, UTC+02:00)  #    Comments [1]
# Wednesday, 11 July 2007
.NET Framework Security Update

Yesterday Microsoft released a security update for the .NET Framework that fixes the bug I reported last December. I'll write a more detailed analysis (including a proof-of-concept exploit) in a couple of weeks.

Wednesday, 11 July 2007 07:10:17 (W. Europe Daylight Time, UTC+02:00)  #    Comments [1]
# Wednesday, 04 July 2007
New Hybrid Snapshot

I decided to take a break from integrating new OpenJDK packages and instead focus on running some test cases and work on stabilization. Contrary to the previous hybrid snapshots, this version should be fairly usable. Feedback is appreciated.

Changes:

  • OpenJDK: Implemented java.util.concurrent.locks.LockSupport.
  • OpenJDK: Fixed race condition in Thread.interrupt() that could cause cli.System.Threading.ThreadInterruptedException to be thrown from interruptible waits/sleep.
  • OpenJDK: Imported and modified java.util.concurrent.locks.AbstractQueuedSynchronizer to make it more efficient and to remove the use of ReflectionFactory & Unsafe to reduce initialization order dependencies.
  • OpenJDK: Changed unsafe to use more efficient internal helper class to copy java.lang.reflect.Field and make it accessible (this also reduces initialization order dependencies).
  • OpenJDK: Added lib/logging.properties to VFS and implemented an additional VFS operation required for reading it.
  • OpenJDK: Changed VFS ZipEntryStream.Read() to always try to read the requested number of bytes, instead of returning earlier, for maximum compatibility with real file i/o.
  • OpenJDK: Commented out system property setting in sun.misc.Version that was setting some version properties to bogus values.
  • OpenJDK: Fixed ObjectInputStream.latestUserDefinedLoader() to skip mscorlib stack frames.
  • OpenJDK: Added support to VFS for the VFS root directory.
  • OpenJDK: Fixed FieldAccessorImpl to check the type of the object passed in.
  • OpenJDK: "Implemented" ClassLoader.retrieveDirectives() by returning empty AssertionStatusDirectives object. Enabling assertions on the ikvm.exe command line is still not implemented.
  • OpenJDK: Switched to javac compiler for building OpenJDK sources.
  • Added workaround for .NET 1.1 reflection bug that causes methods that explicitly override a method to show up twice.
  • Changed handling of ldfld & ldsfld opcodes in remapper to bypass "magic".
  • Restructured handling of fields defined in map.xml to enable referencing them from map.xml method bodies.
  • Fixed java.security.VMAccessController to make sure that if there is only system code on the stack, the resulting AccessControlContext isn't empty.
  • Updated to compile with GNU Classpath HEAD.

Binaries available here: ikvmbin-hybrid-0.35.2741.zip.

Wednesday, 04 July 2007 15:04:10 (W. Europe Daylight Time, UTC+02:00)  #    Comments [0]
# Wednesday, 27 June 2007
New Hybrid Snapshot

I'm getting tired of writing the disclaimers and warnings, but they still apply.

The most important change in this snapshot is that there's now a virtual file system for the java.home directory. This has been a long time coming, but the proverbial straw was the fact that the OpenJDK timezone code reads files from the java.home/lib/zi/ directory (IMHO they really should be using resources for these things).

Currently the virtual java.home directory is C:\.virtual-ikvm-home\ on Windows and /.virtual-ikvm-home/ on Unix, but this is subject to change (please let me know if you have thoughts on this). The only content in there so far is the /lib/zi/ directory tree, and only a few file operations are supported (notably the ones required by the timezone code), but expect that eventually all (read-only) file system operations will be supported and that more virtual files will appear in there.

Why a Virtual File System Instead Of a Real Java Home Directory?

The main reason is that I want IKVM to behave like a .NET library as much as possible. That means it should be possible to install it into the GAC and support the versioning and side-by-side capabilities of .NET. That's very hard to do when you have to manage real directories.

Changes:

  • OpenJDK: Integrated java.util.spi, java.util.prefs and java.util.logging packages.
  • OpenJDK: Integrated java.text and java.text.spi packages (except for java.text.Bidi class, for which Sun uses native code, so we'll continue to use GNU Classpath's pure Java version.)
  • OpenJDK: Changed build script to include all resources from OpenJDK generated resources.jar.
  • OpenJDK: Integrated java.rmi package.
  • OpenJDK: Changed system/extension class loader creation to make sure that an extension class loader always exists if there is a non-assembly system class loader.
  • OpenJDK: Improved exception handling in java.io.FileDescriptor.
  • OpenJDK: Removed AccessController.doPrivileged() call in Unsafe.fieldOffset(), to work around Mauve brokenness.
  • OpenJDK: Implemented the beginnings of a virtual file system for the java.home directory.
  • Changed JVM.IsUnix to use Environment.OSVersion.Platform.

Binaries available here: ikvmbin-hybrid-0.35.2734.zip.

Wednesday, 27 June 2007 08:57:35 (W. Europe Daylight Time, UTC+02:00)  #    Comments [3]
# Thursday, 21 June 2007
New Hybrid Snapshot

Another hybrid snapshot update. Just a reminder again: These snapshots have not been tested extensively and are known to be broken (and are much more broken than the non-hybrid snapshots I used to release). They are only intended to be used for testing and getting a feel of how things are going. If you find a bug specific to this snapshot, please don't file a bug on SourceForge, simply send a message to the ikvm-developers list or to me directly.

I'm not going to list everything that's known to be broken, but I will say that Eclipse 3.2 still runs, so at least some parts do work ;-)

This build also includes several GNU Classpath fixes that I haven't yet checked in, so if you're trying to do a hybrid build from cvs you'll end up with something even more broken than this build. [Update: I checked them in.]

Changes:

  • OpenJDK: Upgraded to OpenJDK bundle b13.
  • OpenJDK: Added more resources.
  • OpenJDK: Integrated java.lang.annotation and java.lang.ref packages.
  • OpenJDK: Integrated java.io and java.util packages.
  • Added -serialver option to ikvmstub that forces stub generator to include serialVersionUID field for all serializable classes (to make Japi results more accurate).
  • Fixed GetParameterAnnotations() to return the correct array length for instancehelper__ methods (static methods that represent instance methods on remapped types).
  • Fixed ikvm.io.InputStreamWrapper.available() to return non-zero when more data is available (as suggested by Mark Reinhold). Removed the NormalizerDataReader specific fix from map.xml.
  • Fixed ikvmstub to better handle private interface implementations. (Fixes System.Web.UI.Control subclassing issue).

Binaries available here: ikvmbin-hybrid-0.35.2728.zip.

Thursday, 21 June 2007 07:23:42 (W. Europe Daylight Time, UTC+02:00)  #    Comments [1]
# Tuesday, 19 June 2007
Five Years

Wow. It's five years ago today that I started blogging about IKVM. I'd write more, but I'm having too much fun working on integrating the OpenJDK libraries :-)

Tuesday, 19 June 2007 06:32:18 (W. Europe Daylight Time, UTC+02:00)  #    Comments [3]