Software Testing Blog

Delegates and structural identity

Suppose we have a couple of methods:

static void M(Function f) { ... }
static bool P(int x) { return x > 0; }

and a couple of delegate types:

delegate bool Function(int x);
delegate bool Predicate(int x);

And some code which calls the method:

Function function = P;
Predicate predicate = P;
M(P);         // OK
M(function);  // OK
M(predicate); // Compilation error?!

A question I am frequently asked is “why is there a type mismatch error on the last line, but the previous two lines work fine?” They appear to all do the same thing, so why is one of them considered wrong?

This is unfortunate, and if we had to design the whole language and runtime type system over again from scratch, likely this would be made to work.

That, however, does not actually answer the question. So let’s take a closer look at what is going on here.

First, how does the first call M(P) work? When a method group (that is, an expression which gives the name associated with one or more methods, but which does not invoke the method) is used in a context where a delegate value is expected, the compiler does overload resolution as though there had been an argument list there, where the types of the arguments were the types of the delegate’s formal parameters. That is, the compiler determines which method called P would be invoked if the code contained P(some_int). It also verifies that the return type of the chosen method is compatible with the return type of the delegate. The compiler then generates code that supplies a delegate object that refers to the chosen method. This feature was added in C# 2.0; before that you would have had to write

M(new Function(P)); 

Similarly for the assignments of P to local variables function and predicate; overload resolution is used to determine which method named P is the unambiguous match, and then it is converted to the appropriate delegate type.
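To make the method group conversion concrete, here is a minimal sketch. The second overload of P, the body and bool return type of M, and the ChosenParameterType helper are assumptions added for illustration; the M in the text has an elided body.

```csharp
using System;

delegate bool Function(int x);

static class MethodGroupDemo
{
    // Two overloads share the name P; together they form the method group "P".
    static bool P(int x) { return x > 0; }
    static bool P(string s) { return s.Length > 0; } // hypothetical extra overload

    // Hypothetical body: calls the delegate with a fixed argument.
    static bool M(Function f) { return f(42); }

    // C# 2.0 and later: the method group converts implicitly.
    public static bool Implicit() { return M(P); }

    // Before C# 2.0 the delegate had to be created explicitly.
    public static bool Explicit() { return M(new Function(P)); }

    // Reports which overload the conversion selected, by parameter type name.
    public static string ChosenParameterType()
    {
        Function f = P;
        return f.Method.GetParameters()[0].ParameterType.Name;
    }

    static void Main()
    {
        Console.WriteLine(Implicit());            // True
        Console.WriteLine(Explicit());            // True
        Console.WriteLine(ChosenParameterType()); // Int32
    }
}
```

Because Function takes an int, overload resolution behaves as though the call were P(some_int) and binds to P(int), never to P(string).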

It shouldn’t be surprising that M(function) works; the argument is of the necessary type. But why should M(predicate) fail? It seems like everything necessary is supplied.

Well, let’s consider a slightly different case:

class Meters
{
  public double value;
}
class Miles
{
  public double value;
}

These classes are not exactly paragons of good design, but bear with me for a moment here. I hope that you agree that passing an instance of Miles to a method that expects Meters should be very illegal! Just because two types have the same fields with the same names in the same order does not mean that they are semantically identical and can be used interchangeably. Type systems which support this kind of conversion are called structural type systems, because whether two things are identical is determined by their structure, not by their name. Delegates do not have structural identity in the CLR type system. Their identity is determined by their full name and number of generic type parameters.
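One way to see the lack of structural identity is to ask the runtime directly. The sketch below is illustrative only; the class name and helper methods are hypothetical, and it uses reflection merely to compare the two delegate types from the start of the article.

```csharp
using System;

delegate bool Function(int x);
delegate bool Predicate(int x);

static class NominalIdentityDemo
{
    // True when both delegate types have the same Invoke signature: bool (int).
    public static bool SameShape()
    {
        var f = typeof(Function).GetMethod("Invoke");
        var p = typeof(Predicate).GetMethod("Invoke");
        return f.ReturnType == p.ReturnType
            && f.GetParameters()[0].ParameterType == p.GetParameters()[0].ParameterType;
    }

    // Despite the identical shape, these are two distinct types.
    public static bool SameType()
    {
        return typeof(Function) == typeof(Predicate);
    }

    // No reference conversion exists from Predicate to Function.
    public static bool PredicateAssignableToFunction()
    {
        return typeof(Function).IsAssignableFrom(typeof(Predicate));
    }

    static void Main()
    {
        Console.WriteLine(SameShape());                     // True
        Console.WriteLine(SameType());                      // False
        Console.WriteLine(PredicateAssignableToFunction()); // False
    }
}
```

The shapes match, but the names differ, and the name is what counts.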

Now, what are some reasons against having structural identity on delegates? Well, just as meters are not miles, you might have two kinds of delegates that are structurally the same but logically different:

delegate bool TryAction(int x);
delegate bool Predicate(int x);

A TryAction takes an integer, produces a side effect, and returns a Boolean indicating success; a Predicate takes an integer and determines if it does or does not meet some condition but has no side effect. You certainly don’t want to mix them up, particularly if you’re trying to tame some side-effect-infested code.

However, in practice we’ve seen that (1) people do not really use the delegate type system to express these sorts of concerns, and (2) the error is unlikely in the first place; one typically does not accidentally use an event handler that produces a side effect where a LINQ predicate is intended. If we had to do it all over again, I think there would be a strong push towards making delegates use structural typing.

Given all that, what can you do about the problem? Suppose you have a Predicate in hand and you need a Function, what can you do?

The easiest thing to do is to use another method group conversion:

M(predicate.Invoke); // OK!

Every delegate has an Invoke method, which from the type system’s point of view is just another method that can be turned into a delegate. Of course what you end up with is a delegate that invokes another delegate, which invokes a method. This is yet another proof of the famous computer science maxim “every problem but one can be solved by adding another level of indirection”. (The one exception is of course “My program has too much indirection”.)
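The extra level of indirection can be observed on the resulting delegate object. This is a minimal sketch; the body and return type of M and the WrapperPointsAtPredicate helper are assumptions made for illustration.

```csharp
using System;

delegate bool Function(int x);
delegate bool Predicate(int x);

static class InvokeDemo
{
    static bool P(int x) { return x > 0; }

    // Hypothetical body: calls the delegate with a fixed argument.
    static bool M(Function f) { return f(7); }

    // The workaround from the text: convert the method group predicate.Invoke.
    public static bool CallThroughInvoke()
    {
        Predicate predicate = P;
        return M(predicate.Invoke);
    }

    // The new Function targets the Predicate object and its Invoke method,
    // so a call goes Function -> Predicate.Invoke -> P.
    public static bool WrapperPointsAtPredicate()
    {
        Predicate predicate = P;
        Function wrapper = predicate.Invoke;
        return object.ReferenceEquals(wrapper.Target, predicate)
            && wrapper.Method.Name == "Invoke";
    }

    static void Main()
    {
        Console.WriteLine(CallThroughInvoke());        // True
        Console.WriteLine(WrapperPointsAtPredicate()); // True
    }
}
```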

Now, you might wonder, if that’s the case, why the C# compiler doesn’t simply generate M(new Function(predicate.Invoke)) when it sees M(predicate). In fact, that is precisely what Visual Basic does! This illustrates a subtle but very real difference in the design philosophies of VB and C#: VB says “this code looks like it is expressing an intention so I will try to fulfil that intention even if I have to do something a bit goofy”. C# says “this code looks like it might be wrong so I’m going to bring it to the attention of the developer”. Neither is the “right” principle; they’re just different design philosophies. The C# developer also might think it a bit weird if Predicate, a reference type, could be converted to Function, another reference type, and yet referential identity were violated; that’s unusual in C#.
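The point about referential identity can be sketched as follows. The class and method names are hypothetical; the second method spells out by hand the wrapping that an implicit Predicate-to-Function conversion would have to perform.

```csharp
using System;

delegate bool Function(int x);
delegate bool Predicate(int x);

static class IdentityDemo
{
    static bool P(int x) { return x > 0; }

    // An ordinary reference conversion yields the very same object.
    public static bool ReferenceConversionKeepsIdentity()
    {
        Predicate predicate = P;
        Delegate asBase = predicate; // implicit reference conversion to a base type
        return object.ReferenceEquals(asBase, predicate);
    }

    // Wrapping the predicate allocates a new, different delegate object.
    public static bool WrappingKeepsIdentity()
    {
        Predicate predicate = P;
        Function converted = new Function(predicate.Invoke);
        return object.ReferenceEquals(converted, predicate);
    }

    static void Main()
    {
        Console.WriteLine(ReferenceConversionKeepsIdentity()); // True
        Console.WriteLine(WrappingKeepsIdentity());            // False
    }
}
```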

  1. As always a very nice writeup. I was wondering, could you expand a bit on what .NET would have looked like if delegates followed structural typing? Would there even be something like a delegate?

  2. Eric, another way to work around the problem is with the new operator: M(new Function(predicate));.

    Does the compiler handle this differently from M(predicate.Invoke);?

  3. Pieter, I think these days most people do use delegates with structural typing thanks to an addition to the standard library: the handy Func generic delegate. The issue only comes about when you try to convert one delegate type into another, but if everyone can agree on a single delegate type to use (aka Func) you’ll never see a conversion issue.

    I would imagine a do-over of C# would either disallow users from creating their own delegate types (delegate bool Func()) or strongly encourage the use of Func from the beginning. Ever since .NET 3.5 when Func was introduced, I don’t think I’ve had a reason to use a custom delegate type.

  4. Hmm. I would hope that if you were going to do it all over again, you would keep the semantic typing: most of the problems in question would be gone because generics and the Func/Action delegates would already have been defined. Therefore, there would be far less need to create explicit delegate types in the first place, *except* for cases like your Predicate/TryAction. As an example, compare List<T>.Find(Predicate<T>) and Enumerable.Where(Func<T, bool>). See also “PropertyChangedEventHandler” vs “EventHandler<PropertyChangedEventArgs>”

    (I’m sure that if I use the brackets explicitly, this will be one of those blogs that mangles them, and if I use lt/gt, it will be one of the blogs that should have just used the brackets because it won’t escape the HTML entities. Given that, I choose the one that, should I guess wrong, will provide info that readers can figure out, and not the one that would be missing entirely)

  5. Point (1) “people do not really use the delegate type system to express these sorts of concerns” is interesting because a criticism of Java 8, that I have just recently read, is that there is in the framework a proliferation of structurally-identical-yet-differently-named functional interfaces (aka single abstract method interfaces) and that as a result the programmer must remember (and use!) a bunch of different names for the exact same “thing”.

    (See What’s Wrong in Java 8, Part II: Functions & Primitives, which points out that Java 8 has 43 functional interfaces in java.util.function, including 5 for nullary functions.)

    (You might think, with lambdas, that this is just for purposes of documentation and you never actually need to bother with it, but the author claims (without providing a justification) that

    As long as Java can infer the type, we may think we have no problem. However, if you want to manipulate functions in a functional way, you will soon face the problem of Java being unable to infer a type. Worst, Java will sometime infer the type and stay silent while using a type which is no the one you intended.

    So there may be an issue there.)
