Tuesday, November 10, 2009

Going Crazy with Generics, or The Story of ThreadLocal(Of T)

Jon Skeet recently blogged about the performance of [ThreadStatic] versus the new .NET 4.0 ThreadLocal<T>. I was surprised to see that ThreadLocal<T> was faster than [ThreadStatic], because ThreadLocal<T> uses [ThreadStatic] as the underlying primitive.

How do you go from a static field to a per instance field? It's simple, once you think of it. You (ab)use generic types. Here's a simplified ThreadLocal<T>:

public class ThreadLocal<T>
{
  HolderBase holder;
  static int count;
  static Type[] types = new Type[] { typeof(C1), typeof(C2), typeof(C3) };

  abstract class HolderBase
  {
    internal abstract T Value { get; set; }
  }

  class C1 { }
  class C2 { }
  class C3 { }

  class Holder : HolderBase
  {
    [ThreadStatic]
    static T val;

    internal override T Value
    {
      get { return val; }
      set { val = value; }
    }
  }

  public ThreadLocal()
  {
    holder = MakeHolder(Interlocked.Increment(ref count) - 1);
  }

  HolderBase MakeHolder(int index)
  {
    Type t1 = types[index % 3];
    Type t2 = types[(index / 3) % 3];
    Type t3 = types[index / 9];
    Type type = typeof(Holder<,,>).MakeGenericType(typeof(T), t1, t2, t3);
    return (HolderBase)Activator.CreateInstance(type);
  }

  public T Value
  {
    get { return holder.Value; }
    set { holder.Value = value; }
  }
}

The real ThreadLocal<T> type in .NET 4.0 beta 2 is much more complex, because it has to deal with recycling the types and protecting against returning a value from a recycled type. It also uses a higher base counting system  to number the types, the maximum number of types generated (per T) is 4096 in beta 2. After you allocate more than that, it falls back to using a holder type that uses Thread.SetData().

I'm not sure what to make of this. It's a clever trick, but I think it ultimately is too clever. I benchmarked a simpler approach using arrays (where each ThreadLocal<T> simply allocated an index in the [ThreadStatic] array) and it was a little bit faster and doesn't suffer from the downsides of creating a gazillion types (which probably take more memory and those types stay around until the AppDomain is destroyed).

Finally a tip for Microsoft, move the Cn types out of ThreadLocal, because currently they are also generic (due to the fact that C# automatically makes nested types generic based on the outer type's generic type parameters) and that is unnecessarily wasteful.

11/10/2009 7:03:31 AM (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
Name
E-mail
Home page

I apologize for the lameness of this, but the comment spam was driving me nuts. In order to be able to post a comment, you need to answer a simple question. Hopefully this question is easy enough not to annoy serious commenters, but hard enough to keep the spammers away.

Anti-Spam Question: What method on java.lang.System returns an object's original hashcode (i.e. the one that would be returned by java.lang.Object.hashCode() if it wasn't overridden)? (case is significant)

Answer:  
Comment (HTML not allowed)