# Tuesday, 26 October 2010
How to Hack Your Own JIT Intrinsic

Yesterday I wrote about Thread.MemoryBarrier() and some of its performance characteristics. I wanted to benchmark whether mfence really is faster than a locked memory operation, but instead of writing a microbenchmark (which I had already done) I wanted to run code that was a little bit more "real". So I came up with a hack that allows mfence to be used in managed code. Please note that this is a hack and not something you should use outside of an experimental context. The code is available here.

The code includes a microbenchmark, but not the "real" benchmark (based on LinkedBlockingQueue) that I used.

Besides my Core 2 Duo, I also tested a Pentium 4 class machine and a Core i7. On the Pentium 4 and the Core 2 Duo mfence wins significantly, but on the Core i7 mfence is significantly slower, oddly enough.

How the Hack Works

The MemoryBarrierHack.cs file contains two classes, __Hack__DoNotUse and Program. The __Hack__DoNotUse class contains the MemoryBarrier method and a static constructor that patches the MemoryBarrier method. The patched MemoryBarrier method in turn patches the call site of its caller, replacing the call instruction with an mfence and a mov al,imm8 as filler. This means that when you want a memory barrier, you simply call the static method. The first time that call executes, it acts as a memory barrier (because of the locked memory operation) and also patches the call site, so that from then on the call site is an mfence instruction instead of a call.
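To make the byte-level rewrite concrete, here is a minimal sketch of the call-site patch, simulated on an ordinary byte array. It is not the code from MemoryBarrierHack.cs: it does not locate a real return address or touch JIT-compiled code, and the names CallSitePatchSketch and TryPatch are made up for illustration. It also assumes that the call site is a five-byte call rel32 (E8 followed by a 32-bit displacement), that the mfence comes first with the mov al,imm8 after it, and that the immediate is zero. What is not an assumption is the arithmetic: mfence encodes as three bytes (0F AE F0) and mov al,imm8 as two (B0 ib), so together they exactly fill a five-byte slot.

```csharp
using System;

static class CallSitePatchSketch
{
    // Opcode of a near relative call: E8 followed by a 4-byte displacement.
    const byte CallRel32 = 0xE8;

    // mfence (0F AE F0) followed by mov al, 0 (B0 00) as two-byte filler.
    // The order and the immediate value are assumptions of this sketch.
    static readonly byte[] Replacement = { 0x0F, 0xAE, 0xF0, 0xB0, 0x00 };

    // 'code' stands in for the caller's machine code and 'returnOffset'
    // for the return address, i.e. the offset of the instruction that
    // follows the call. Returns false if no call rel32 ends there.
    public static bool TryPatch(byte[] code, int returnOffset)
    {
        int callStart = returnOffset - Replacement.Length;
        if (callStart < 0 || returnOffset > code.Length)
            return false;
        if (code[callStart] != CallRel32)
            return false;

        // Overwrite the five call bytes in place; nothing after them moves.
        Array.Copy(Replacement, 0, code, callStart, Replacement.Length);
        return true;
    }
}
```

For example, calling TryPatch on the bytes 90 E8 10 20 30 40 C3 with a return offset of 6 leaves 90 0F AE F0 B0 00 C3 in the array: the nop before and the ret after the call are untouched, which is why the filler instruction is needed at all.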

I built a modified IKVM runtime that uses this trick and used it to benchmark the LinkedBlockingQueue. On my system it showed a performance improvement of about 7% with mfence versus .NET 4.0's MemoryBarrier intrinsic.

Tuesday, 26 October 2010 11:50:03 (W. Europe Daylight Time, UTC+02:00)