June 4, 2010

How proxies may break reflexivity of the equals method

Recently I came across a strange behaviour in one of my applications. Look at the following simplified piece of code.


public bool RemoveFirst(IList<IPerson> persons)
{
    var firstPerson = persons.First();
    return persons.Remove(firstPerson);
}

Although my unit test fixtures showed the implementation to work well, in integration tests the method failed and always returned false. It took me quite a while to figure out that the use of interface proxies actually broke the reflexivity of the Equals method. If you are not familiar with the term reflexivity: basically it means that, given a relation over a set of elements, each element is related to itself, which in our concrete case means that each element should equal itself. Obviously the list needs this to determine the index of the item which should be removed.


So what happened? As I already noted, I use proxies in my application. Proxies offer a great opportunity to apply aspect-oriented programming paradigms without actually using an AOP framework like PostSharp. In particular, I am using the DynamicProxy library of the Castle Project for proxy generation. Although this post is based on DynamicProxy, most of it applies to other proxy generation libraries as well.


Basically there are two types of proxies: class proxies and interface proxies. The idea behind the former is to construct a new class which derives from the class that should be proxied and overrides all virtual members. For interface proxies, on the other hand, an independent class is created which implements the provided interface and delegates calls to an instance of the proxied class. Obviously there are two separate objects involved in the interface proxy approach. Note: I am not going to go into detail about which approach is "better" or best fits particular needs, but I am planning to do so in a future post.
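To make the difference concrete, here is a hand-written sketch of both proxy kinds. It is not what DynamicProxy actually generates; PersonClassProxy, PersonInterfaceProxy and the console logging are made-up stand-ins that only mimic the shape of generated proxies.

```csharp
using System;

public interface IPerson
{
    string Name { get; }
}

public class Person : IPerson
{
    // Must be virtual, otherwise a class proxy could not override it.
    public virtual string Name { get; set; }
}

// Class proxy (hand-written stand-in): derives from the proxied class
// and overrides its virtual members. Only ONE object exists at runtime.
public class PersonClassProxy : Person
{
    public override string Name
    {
        get { Console.WriteLine("get_Name intercepted"); return base.Name; }
        set { base.Name = value; }
    }
}

// Interface proxy (hand-written stand-in): an independent class that
// implements the interface and delegates to a target. TWO objects exist.
public class PersonInterfaceProxy : IPerson
{
    private readonly IPerson target;

    public PersonInterfaceProxy(IPerson target) { this.target = target; }

    public string Name
    {
        get { Console.WriteLine("get_Name intercepted"); return target.Name; }
    }
}

public static class ProxyKindsDemo
{
    public static void Main()
    {
        IPerson classProxy = new PersonClassProxy { Name = "Ada" };
        var target = new Person { Name = "Ada" };
        IPerson interfaceProxy = new PersonInterfaceProxy(target);

        Console.WriteLine(classProxy is Person);                     // True
        Console.WriteLine(interfaceProxy is Person);                 // False
        Console.WriteLine(ReferenceEquals(interfaceProxy, target));  // False
    }
}
```

The last two lines are the important ones: the interface proxy is neither a Person nor the same object as its target.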


Here is a typical Equals implementation of a Person class which has been generated by ReSharper and checks if the Id property is equal. Note: I am assuming that there exists an IPerson interface whose sole purpose is to make interface proxies possible, so there are no implementations of IPerson other than Person and the dynamically created proxies.


public bool Equals(Person other)
{
    if (ReferenceEquals(null, other)) return false;
    if (ReferenceEquals(this, other)) return true;
    return other.Id.Equals(Id);
}

public override bool Equals(object obj)
{
    if (ReferenceEquals(null, obj)) return false;
    if (ReferenceEquals(this, obj)) return true;
    if (obj.GetType() != typeof (Person)) return false;
    return Equals((Person) obj);
}

public override int GetHashCode()
{
    return Id.GetHashCode();
}

In unit tests you typically use only direct instances of Person, because they are easier to create and a proxy is believed to be a transparent layer which has no side effects. If you look closely, you will notice the explicit check of the other object's type in Equals(object) (obj.GetType() != typeof (Person)), which will obviously cause the method to return false when a proxy is compared to itself: the interface proxy delegates the evaluation of Equals to the proxied instance, so this is of type Person, but the other object is still the interface proxy. There are quite a few variations of this particular line which have different impacts on your application.
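The failure can be reproduced without any proxy library. The following is a minimal sketch, assuming that the generated proxy forwards Equals and GetHashCode to its target as described above; ForwardingPersonProxy is a hand-written stand-in invented for this illustration, not DynamicProxy output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IPerson
{
    int Id { get; }
}

public class Person : IPerson
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        // The exact type check that rejects the proxy.
        if (obj.GetType() != typeof(Person)) return false;
        return ((Person)obj).Id.Equals(Id);
    }

    public override int GetHashCode() { return Id.GetHashCode(); }
}

// Hand-written stand-in for an interface proxy (assumption: the real
// proxy forwards Equals/GetHashCode to the target in the same way).
public class ForwardingPersonProxy : IPerson
{
    private readonly Person target;

    public ForwardingPersonProxy(Person target) { this.target = target; }

    public int Id { get { return target.Id; } }

    // "this" inside target.Equals is the Person, obj is still the proxy.
    public override bool Equals(object obj) { return target.Equals(obj); }

    public override int GetHashCode() { return target.GetHashCode(); }
}

public static class ReflexivityDemo
{
    public static bool RemoveFirst(IList<IPerson> persons)
    {
        var firstPerson = persons.First();
        return persons.Remove(firstPerson);
    }

    public static void Main()
    {
        var plain = new List<IPerson> { new Person { Id = 1 } };
        var proxied = new List<IPerson>
        {
            new ForwardingPersonProxy(new Person { Id = 1 })
        };

        Console.WriteLine(RemoveFirst(plain));    // True
        Console.WriteLine(RemoveFirst(proxied));  // False
    }
}
```

With plain Person instances the reference check succeeds immediately, which is why unit tests that never use proxies do not catch the problem.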


// Variation 1
if (obj.GetType() != GetType()) return false;

// Variation 2
var other = obj as Person;
if (ReferenceEquals(null, other)) return false;

// Variation 3
var other = obj as IPerson;
if (ReferenceEquals(null, other)) return false;

Variation 1 would work fine for class proxies (note that with class proxies the ReferenceEquals(this, obj) check would already return true), but the use of interface proxies introduces a second, different type for each proxied object, which causes this variation to fail. The other two variations follow Microsoft's guideline on implementing the Equals method and differ only slightly, by allowing either Person or IPerson. Of course only IPerson will lead to the desired behaviour, but we need to think about the side effects caused by this modification. Now we are not only allowing instances of Person to be compared with each other successfully, but also any implementation of IPerson.
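One such side effect can be sketched as follows. ExternalContact is a hypothetical second implementation of IPerson, invented only for this illustration (the scenario in this post assumes no such class exists). Under variation 3 a Person considers it equal when the Ids match, while the other class, which keeps the default reference equality, does not agree.

```csharp
using System;

public interface IPerson
{
    int Id { get; }
}

public class Person : IPerson
{
    public int Id { get; set; }

    // Variation 3: accept any IPerson implementation.
    public override bool Equals(object obj)
    {
        var other = obj as IPerson;
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return other.Id.Equals(Id);
    }

    public override int GetHashCode() { return Id.GetHashCode(); }
}

// Hypothetical unrelated implementation; it does not override Equals,
// so it falls back to reference equality.
public class ExternalContact : IPerson
{
    public int Id { get; set; }
}

public static class Variation3Demo
{
    public static void Main()
    {
        var person = new Person { Id = 42 };
        var contact = new ExternalContact { Id = 42 };

        Console.WriteLine(person.Equals(contact));  // True
        Console.WriteLine(contact.Equals(person));  // False
    }
}
```

So depending on which side the comparison starts from, the two objects are equal or not.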


For what reason are we introducing (non-abstract) class hierarchies in data models? Usually we want to express that there are entities which differ from others in such a way that they need to be treated differently by our application. I believe that in well-designed systems such a difference between an instance of a super class and a sub class should break equality. For that reason I do not really like the guideline's way and I prefer to stick with variation 1.


As already stated above, variation 1 will not work with interface proxies. I think the best way is to explicitly handle proxies. Although they have a different type, this is just a necessity coming up during implementation and not part of the class design which means that we consider an interface proxy for a direct Person instance to be a Person. One way to do so is to access the proxy target directly for equality comparison:


// insert after the ReferenceEquals(this, obj) check in Equals(object)
var targetAccessor = obj as IProxyTargetAccessor;
if (!ReferenceEquals(null, targetAccessor))
    return Equals(targetAccessor.DynProxyGetTarget());
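
Put together with variation 1, the whole thing could look like the sketch below. To keep it runnable without a reference to Castle, it uses ITargetAccessor, a hypothetical stand-in for IProxyTargetAccessor with a single GetTarget() method in place of DynProxyGetTarget(), and a hand-written PersonProxy in place of a generated proxy.

```csharp
using System;
using System.Collections.Generic;

public interface IPerson
{
    int Id { get; }
}

// Hypothetical stand-in for Castle's IProxyTargetAccessor.
public interface ITargetAccessor
{
    object GetTarget();
}

public class Person : IPerson
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;

        // Explicit proxy handling: compare against the proxied instance.
        var targetAccessor = obj as ITargetAccessor;
        if (!ReferenceEquals(null, targetAccessor))
            return Equals(targetAccessor.GetTarget());

        // Variation 1: sub classes are never equal to super classes.
        if (obj.GetType() != GetType()) return false;
        return ((Person)obj).Id.Equals(Id);
    }

    public override int GetHashCode() { return Id.GetHashCode(); }
}

// Hand-written stand-in for a generated interface proxy.
public class PersonProxy : IPerson, ITargetAccessor
{
    private readonly Person target;

    public PersonProxy(Person target) { this.target = target; }

    public int Id { get { return target.Id; } }

    public object GetTarget() { return target; }

    public override bool Equals(object obj) { return target.Equals(obj); }

    public override int GetHashCode() { return target.GetHashCode(); }
}

public static class ProxyAwareEqualsDemo
{
    public static void Main()
    {
        var proxy = new PersonProxy(new Person { Id = 7 });
        var persons = new List<IPerson> { proxy };

        Console.WriteLine(proxy.Equals(proxy));     // True
        Console.WriteLine(persons.Remove(proxy));   // True
    }
}
```

Comparing the proxy with itself now unwraps the target and ends in the reference check, so reflexivity is restored and Remove finds the item again.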

It is important to think about how introducing proxies impacts your system. You should always be aware of the fact that your application is using proxies, so consider using them in unit tests as well, or at least in integration tests.

June 2, 2010

Hello out there

I am a student and C# developer from Austria. I decided to start a blog about my development experiences. Let's see where this will end...