Tuesday, December 9, 2008

OnDeserializedAttribute, Dictionary members, and GetHashCode

As it turns out, the Dictionary<TKey, TValue> generic collection in the .NET Framework uses custom serialization and rebuilds itself in its OnDeserialization method (the IDeserializationCallback callback). What this means is that, if you deserialize such a Dictionary (using any of the Formatters), it will proceed to deserialize its keys and values and add them back to itself. No problems there, right?

What's a bit surprising is that the keys and values may not yet have their own custom serialized members (such as, oh, their own generic Dictionary members) rebuilt when that happens. This could prove a real problem for you if you override object.Equals, implement IEquatable<T>, and override GetHashCode. When a key is added to a Dictionary, it is filed under its hashcode (no surprise). The Dictionary stores that hashcode alongside the key and does not recompute it, so as far as the Dictionary is concerned the hashcode is immutable: later changes to the key that alter its hashcode are invisible to the Dictionary. Normally, that's not a problem, but some methods, such as ContainsKey, get their speed from comparing hashcodes. ContainsKey looks for a stored key with the same hashcode as the candidate and only then calls Equals. These methods, therefore, will not behave as expected if keys are modified in ways that affect their hashcode. Ideally, the keys should be immutable.

All discussion of the innards of the Dictionary class aside, the real point here is this: if the keys in a Dictionary are mutable objects (say, instances of a class Entity) whose hashcodes depend on the values of their members, and any of those members are themselves Dictionaries, the hashcodes that the Dictionary<Entity, TValue> object stores will be incorrect after deserialization. At the time the Dictionary of Entities rebuilds itself, the Dictionary members of each Entity instance have not yet been rebuilt, so the hashcode computed for the Entity does not match its actual data. You'll be in the fun position of having keys that are in your Dictionary (according to the Keys property) appear as not being in your Dictionary (according to ContainsKey). This can be a real pain to debug, especially if you're working in ASP.NET and you have relatively limited visibility into the automated (de)serialization process.
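The symptom can be reproduced without any serialization at all. The following is a minimal sketch, not the original code: the Entity class, its Attributes member, and the hashing scheme are all hypothetical, and the "deserialization order" is simulated by hand (the key is hashed while its inner dictionary is empty, and the inner dictionary is filled afterwards).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical key type: equality and hash code depend on a Dictionary member.
sealed class Entity : IEquatable<Entity>
{
    public Dictionary<string, int> Attributes = new Dictionary<string, int>();

    public bool Equals(Entity other)
    {
        if (ReferenceEquals(other, null) || other.Attributes.Count != Attributes.Count)
            return false;
        foreach (var pair in Attributes)
        {
            int value;
            if (!other.Attributes.TryGetValue(pair.Key, out value) || value != pair.Value)
                return false;
        }
        return true;
    }

    public override bool Equals(object obj) { return Equals(obj as Entity); }

    public override int GetHashCode()
    {
        // Order-independent and consistent with Equals. The string keys are
        // left out of the hash so the value is the same on every run.
        int hash = Attributes.Count;
        foreach (var pair in Attributes)
            hash = unchecked(hash + 31 * pair.Value);
        return hash;
    }
}

static class Demo
{
    static void Main()
    {
        var entity = new Entity();
        var lookup = new Dictionary<Entity, string>();

        // Stand-in for the outer Dictionary rebuilding itself: the key is
        // hashed while its inner dictionary is still empty (hash is 0)...
        lookup.Add(entity, "payload");
        // ...and only afterwards does the inner dictionary get its entries,
        // which changes the key's hash to 1303.
        entity.Attributes.Add("id", 42);

        // Enumerating Keys finds the very same instance.
        Console.WriteLine(lookup.Keys.Any(k => ReferenceEquals(k, entity))); // True
        // The hash-based lookup searches under the new hash and misses it.
        Console.WriteLine(lookup.ContainsKey(entity));                       // False
    }
}
```

The Keys check deliberately enumerates with a predicate rather than asking the key collection whether it contains the entity, since a contains-style check may be routed back through the same hash lookup and fail as well.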

In the case above, the solution is to add a method to your Entity class and mark it with the OnDeserialized attribute. In that method, you should call the OnDeserialization method of each of your Dictionary members, which makes them rebuild themselves right away. Currently, the implementation of the Framework is such that your method will be called before the outer Dictionary calls GetHashCode on the instance (thankfully, as it wouldn't make sense any other way), and your hashcodes will be correct.
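A minimal sketch of that workaround might look like the following. The Entity class, the attributes field, and the method name RebuildDictionaries are hypothetical; passing this as the sender argument is also an assumption, since the post only says to call OnDeserialization on each dictionary. Equals and GetHashCode are omitted for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

[Serializable]
public sealed class Entity
{
    // Hypothetical member whose contents would feed Equals and GetHashCode.
    private Dictionary<string, int> attributes = new Dictionary<string, int>();

    // The formatter invokes methods marked [OnDeserialized] once the object's
    // fields have been populated. The required signature is
    // void Method(StreamingContext).
    [OnDeserialized]
    private void RebuildDictionaries(StreamingContext context)
    {
        // Dictionary<TKey, TValue> repopulates itself inside its
        // IDeserializationCallback.OnDeserialization method. Calling it here
        // forces that to happen before anyone asks this Entity for its hash.
        // Repeat for every Dictionary member that Equals/GetHashCode reads.
        attributes.OnDeserialization(this);
    }
}
```

As far as I can tell, calling OnDeserialization on a dictionary that has already been rebuilt (or was never deserialized) does nothing, so the formatter's own later callback should be harmless.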

(See http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=322952 for Microsoft's confirmation of this behavior and solution.)
