Tuesday, March 31, 2009

Generic collections of value types

When we ported my company's codebase from .NET 1.1 to 2.0 I was excited about being able to use generic collections. As I was looking for guidance on when and how to use them I remember reading the following in the MSDN documentation for List<T>:
Performance Considerations
In deciding whether to use the List<(Of <(T>)>) or ArrayList class, both of which have similar functionality, remember that the List<(Of <(T>)>) class performs better in most cases and is type safe. If a reference type is used for type T of the List<(Of <(T>)>) class, the behavior of the two classes is identical. However, if a value type is used for type T, you need to consider implementation and boxing issues.
If a value type is used for type T, the compiler generates an implementation of the List<(Of <(T>)>) class specifically for that value type. That means a list element of a List<(Of <(T>)>) object does not have to be boxed before the element can be used, and after about 500 list elements are created the memory saved not boxing list elements is greater than the memory used to generate the class implementation.
So based on that we tried to use List<T> for reference types (objects) but rarely for value types (ints, strings, longs, etc). Now the other day I was reading the excellent book C# in Depth by Jon Skeet (which I highly recommend) and came across the following:
Let’s start with a simple situation first, with a single type parameter—we’ll use List<T> for the sake of convenience. The JIT creates different code for each value type argument—int, long, Guid, and the like—that we use. However, it shares the native code generated for all the closed types that use a reference type as the type argument, such as string, Stream, and StringBuilder. It can do this because all references are the same size (they are 4 bytes on a 32-bit CLR and 8 bytes on a 64-bit CLR, but within any one CLR all references are the same size). An array of references will always be the same size whatever the references happen to be. The space required on the stack for a reference will always be the same. It can use the same register optimizations whatever type is being used—the List<Reason> goes on.
Each of the types still has its own static fields as described in section 3.4.1, but the code that is executed is reused. Of course, the JIT still does all of this lazily—it won’t generate the code for List<int> before it needs to, and it will cache that code for all future uses of List<int>
After reading that I posted a question to the book’s forum and confirmed with the author what should have made sense to me all along. Yes the JIT creates a new class for List<int> but it reuses that class for all other variables of that type. So while the MSDN article is technically correct, for any reasonable size program it’s a red herring. We incur the memory overhead once, not for every invocation of List<int>. For the common value types (int, long, DateTime, double) we should be using generic collections.
Again from the book:
However, the performance benefits of generics can be huge, and again that comes down to having the opportunity to JIT to different code for different types. Consider a List<byte>, for instance. In .NET 1.1, adding individual bytes to an ArrayList would have meant boxing each one of them, and storing a reference to each boxed value. Using List<byte> has no such impact—List<T> has a member of type T[] to replace the object[] within ArrayList, and that array is of the appropriate type, taking the appropriate space. So List<byte> has a straight byte[] within it used to store the elements of the array. (In many ways this makes a List<byte> behave like a MemoryStream.)
Figure 3.3 shows an ArrayList and a List<byte>, each with the same six values. (The arrays themselves have more than six elements, to allow for growth. Both List<T> and ArrayList have a buffer, and they create a larger buffer when they need to.)
The difference in efficiency here is incredible. Let’s look at the ArrayList first, considering a 32-bit CLR.8 Each of the boxed bytes will take up 8 bytes of object overhead, plus 4 bytes (1 byte, rounded up to a word boundary) for the data itself. On top of that, you’ve got all the references themselves, each of which takes up 4 bytes. So for each byte of useful data, we’re paying at least 16 bytes—and then there’s the extra unused space for references in the buffer.
Compare this with the List<byte>. Each byte in the list takes up a single byte within the elements array. There’s still “wasted” space in the buffer, waiting to be used potentially by new items—but at least we’re only wasting a single byte per unused element there.
We don’t just gain space, but execution speed too. We don’t need the time taken to allocate the box, the type checking involved in unboxing the bytes in order to get at them, or the garbage collection of the boxes when they’re no longer referenced.


Final Recommendation

  • Use Generic collections for collections of common value types (int/long/double/DateTime) wherever possible.
  • Evaluate use of generics on a case by case basis for less used value types (uint/byte/bool/etc).
  • Fix up existing collections as you write new code
Also here is a good guide to replacing a Hashtable with a Generic Dictionary. It has a table showing the equivalent members in the Dictionary class from the Hashtable couterpoints.

Generic thread-safe access to Cache

In my last post I showed how we needed to move the log4net logger to Cache and use a lock to avoid multiple loads of the config file. Well we had some other places in the code that also stored objects in Cache so I wanted to repeat the same pattern. Using a generic method and a delegate I was able to create the following function that can safely access the Cache from multiple threads:
/// <summary>
/// Delegate to a method that returns a new instance of the object 
/// if to be used if its not found in the Cache.
/// </summary>
public delegate object CacheLoaderDelegate();

/// <summary>
/// Return a typed object from Cache. This method locks the creation of a 
/// new instance to provide thread safety.
/// </summary>
/// <typeparam name="T">Type of object to return.</typeparam>
/// <param name="key">The key used to reference the object.</param>
/// <param name="expirationMinutes">The interval between the time the
/// inserted object is last accessed and the time at which that object expires. 
/// If this value is the equivalent of 20 minutes, the object will expire and 
/// be removed from the cache 20 minutes after it was last accessed.</param>
/// <param name="loaderDelegate">Delegate to a method that returns a new 
/// instance of the object if to be used if its not found in the Cache.</param>
/// <returns>Instance of T</returns>
public static T GetObject<T>(string key, int expirationMinutes, CacheLoaderDelegate loaderDelegate) {
  Cache webCache = HttpRuntime.Cache;
  object value = webCache[key];

  // If the value is null create a new instance and add it to the cache.
  if (value == null) {

    // The reason for the lock and double null check is explained in this article
    // http://aspalliance.com/1705_A_New_Approach_to_HttpRuntimeCache_Management.all

    // It shows a coding flaw where multiple threads are checking the cache 
    // concurrently. Each of them discovers the object missing from cache and
    // attempts to create and insert it. Because thread 1 has not inserted the 
    // object before thread 2 checks for it, they both believe it’s missing and both 
    // create it. The solution is to use a mutex that prevents multiple threads from 
    // simultaneously entering the section of code that populates the cache.
    lock (typeof(CacheUtils)) {
      value = webCache[key];
      if (value == null) {
        // Call loader delegate to a get a new instance.
        value = loaderDelegate();

        // Insert the new object in the Cache for a sliding number of minutes. 
        // This means if the ConfigurationManager is not accessed in that time period 
        // it will be removed from cache. Also see notes in the class comments.
        webCache.Insert(key, value, null, Cache.NoAbsoluteExpiration, 
        TimeSpan.FromMinutes(expirationMinutes));
      }
    }
  } 

  // Verify that value (whether from the cache or the delegate is of the requested type.
  if (!(value is T)) {
    string message = string.Format("The value in the cache or returned by the loader delegate does not match the requested type: {0}", typeof(T));
    throw new InvalidCastException(message);
  }

  // Return the requested type.
  return (T)value;
}

Friday, March 20, 2009

Log4Net logging from a web farm

We have a asp.net web handler that we use to process web survey submissions. We have been using log4net for debug and error logging for awhile now and recently have been having some severe performance issues with it.

When we profiled the code we found that the initialization of the Logging component was spending a fair amount of time reading in the XML configuration. We were initializing the component on each page request but it didn't need to be. We figured we could put it inside of an object that we get from Cache and share it accross multiple requests. 
public static ConfigurationManager GetInstance(Cache webCache, string configPath) {
    ConfigurationManager configManager = (ConfigurationManager) webCache[CACHE_KEY];
    if (configManager == null) {
        // Create a new manager from the stored config information.
        configManager = new ConfigurationManager(configPath);
    }
    return configManager;
}

Sure enough that worked but there were still times when under heavy load it was taking longer than expected to load. One of the team members found a article which helped identify the next problem, too many requests trying to initialize the ConfigurationManager at the same time. Time to apply some locking:
private static readonly object locker = new object();
public static ConfigurationManager GetInstance(Cache webCache, string configPath) {
    ConfigurationManager configManager = (ConfigurationManager) webCache[CACHE_KEY];
    if (configManager == null) {
        lock(locker) {
            ConfigurationManager configManager = (ConfigurationManager) webCache[CACHE_KEY];
            if (configManager == null) {
                // Create a new manager from the stored config information.
                configManager = new ConfigurationManager(configPath);
            }
        }
    }
    return configManager;
}

That's better but still one more thing we need to ensure. We are using a RollingFileAppender to log errors to a single log file. By default the FileAppender and RollingFileAppender establish exclusive locks on the output file when they are created. They also support another locking model call MinimalLock which only locks the file when its being written to. In a web scenario you should almost never use the default exclusive lock especially if the log compoent is being cached for multiple requests.