# Monday, 17 November 2008
« Introducing IKVM.Reflection.Emit | Main | Multi Target Support »
.NET Array Weirdness

One of the lesser known features of the .NET runtime is the support for multidimensional and/or non-zero-based arrays. The typical arrays that you use (and that are the same as the arrays in Java) are called vectors in the CLI specification, but unfortunately the non-vector arrays don't have a specific name (they are simply called arrays in the CLI specification).

In C# you can easily create a non-vector array:

int[,] arr1 = new int[4, 4]

This creates a two dimensional array of integers with 16 elements. This array is zero based. If you want to create a non-zero-based array, you have to use the API, because C# doesn't directly support that:

int[,] arr2 = (int[,])Array.CreateInstance(typeof(int), new[] { 4, 4 }, new[] { 1, 1 });

This also creates a two dimensional array of integers with 16 elements, but in this case the indexes run from 1 through 5.

So far, so good. Now let's look at how things look at the IL level. If you use ildasm to look at the first C# based instantation you'll see:

ldc.i4.1
ldc.i4.1
newobj instance void int32[0...,0...]::.ctor(int32, int32)

Most of this looks straightforward if you're familiar with IL, except for the [0...,0...] part. It looks like the lower bounds are part of the type, but a simple experiment shows that this is not the case:

Console.WriteLine(arr1.GetType() == arr2.GetType());

This prints out True. This implies that the lower bounds are not part of the type (and the CLI specification confirms this). So why are they part of the signature? I don't know for sure, but it does have an interesting consequence. You can overload methods based on this:

.assembly extern mscorlib { }
.assembly Test { }
.module Test.exe

.class public Program
{
  .method public static void Foo(int32[0...,0...] foo)
  {
    ret
  }

  .method public static void Foo(int32[1...,1...] foo)
  {
    ret
  }

  .method private static void Main()
  {
    .entrypoint
    ldc.i4.1
    ldc.i4.1
    newobj instance void int32[0...,0...]::.ctor(int32, int32)
    call void Program::Foo(int32[0...,0...])
    ret
  }
}

This is a valid and verifiable application. Unfortunately, using the  .NET reflection API it is impossible to see the difference in signatures between the two Foo methods.

This limitation in reflection means that code compiled with the new IKVM.Reflection.Emit based ikvmc won't be able to call methods that have non-zero-based array parameters. It is also impossible to override these methods, but that was already the case with the previous System.Reflection.Emit based implementation as well.

Finally, in the above text I talk about the lower bounds, but the same thing applies to the upper bounds of the array.

Monday, 17 November 2008 07:08:46 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
Thursday, 11 December 2008 19:20:44 (W. Europe Standard Time, UTC+01:00)
Weird. To be honest, in code where I want to squueze out performance I avoid use of the rectangular arrays and use jagged arrays (arrays of arrays) instead. The ability to keep a reference to the 'inner' array when in an inner loop helps a lot - even after optimisation.
Name
E-mail
Home page

I apologize for the lameness of this, but the comment spam was driving me nuts. In order to be able to post a comment, you need to answer a simple question. Hopefully this question is easy enough not to annoy serious commenters, but hard enough to keep the spammers away.

Anti-Spam Question: What method on java.lang.System returns an object's original hashcode (i.e. the one that would be returned by java.lang.Object.hashCode() if it wasn't overridden)? (case is significant)

Answer:  
Comment (HTML not allowed)  

Live Comment Preview