Generalize Expression<TDelegate> to just Expression<T>

Topics: C# Language Design, VB Language Design
Sep 22, 2014 at 4:53 PM
Currently C# is able to generate expression trees automatically for lambda expressions that are being assigned to a variable (or parameter) of type Expression<TDelegate>.

Not a lot of languages can do that, and I think C# has scored a hit with this feature, which enables strongly typed APIs and LINQ to SQL.

There are two things that I don't like, however. One is that it is too implicit: developers take a while to realize which queries run in memory and which ones run in SQL. LISP adds a small ' at the beginning, and I think that is a good idea, but it is too late for that anyway.

The second one is that the C# compiler can only generate expression trees for lambdas. Why is that? There's actually no relationship between lambda expressions and expression trees:

Why can I write this:
Expression<Func<string, int>> expression = s => s.Length;
and have it translated to:
ParameterExpression expression2 = Expression.Parameter(typeof(string), "s");
Expression<Func<string, int>> expression = Expression.Lambda<Func<string, int>>(
    Expression.Property(expression2, (MethodInfo)methodof(string.get_Length)),
    new ParameterExpression[] { expression2 });
But I cannot write:
Expression<int> exp = 2 + 3;
and have it translated to something like:
Expression<int> exp = new Expression<int>(
    Expression.Add(Expression.Constant(2), Expression.Constant(3))); 
That would simplify some fluent APIs, like:
p.ValidateProperty(p2 => p2.Name)
// would become
p.ValidateProperty(p.Name)
Maybe this is not allowed because, without some quotation symbol, it would look confusing?
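For comparison, here is a minimal sketch of how the 2 + 3 tree can be built by hand today with the existing factory methods. Since Expression<int> does not exist, the body is wrapped in a parameterless lambda; the class and method names (ManualTreeSketch, BuildSum) are made up for illustration.

```csharp
using System;
using System.Linq.Expressions;

static class ManualTreeSketch
{
    // Builds the tree for "2 + 3" by hand and wraps it in a parameterless
    // lambda, because only lambdas can be compiled and run.
    public static Expression<Func<int>> BuildSum()
    {
        BinaryExpression body = Expression.Add(
            Expression.Constant(2),
            Expression.Constant(3));
        return Expression.Lambda<Func<int>>(body);
    }
}
```

Calling ManualTreeSketch.BuildSum().Compile()() evaluates the tree and returns 5. The proposal would let the compiler emit the Expression.Add part from a plain 2 + 3.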
Sep 23, 2014 at 10:38 AM
Olmo wrote:
The second one is that the C# compiler can only generate expression trees for lambdas. Why is that? There's actually no relationship between lambda expressions and expression trees:
The reason is probably that if you allow arbitrary expressions like
Expression ex = p.SomeProperty;
then the compiler could not universally decide whether it should generate Expression.Property(Expression.Constant(p), "SomeProperty") or get the actual value of the property, i.e. whether it should evaluate the right side or generate an expression from it. Lambda syntax makes it unambiguous (except when deciding between an expression tree and an anonymous delegate).
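To make the two readings concrete, here is a small sketch that builds both by hand. Person and its Name property are hypothetical stand-ins for p and SomeProperty, and AmbiguitySketch is just an illustrative name.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical type standing in for whatever "p" is in the example above.
class Person
{
    public string Name { get; set; } = "";
}

static class AmbiguitySketch
{
    // Reading 1: quote the right side, producing a member-access node over p.
    public static Expression AsTree(Person p)
    {
        return Expression.Property(Expression.Constant(p), "Name");
    }

    // Reading 2: evaluate the right side first and keep only the resulting value.
    public static Expression AsValue(Person p)
    {
        return Expression.Constant(p.Name, typeof(string));
    }
}
```

Both results are assignable to a variable of type Expression, which is why the target type alone cannot tell the compiler which one was meant.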

As for Expression<int>, it would have been somewhat better to have it called LambdaExpression<T> to reflect the role of the class. Considering the above, I don't see much value in the generalization. Expressions already expose their result type through the Type property. Since various expression types expose various expression properties (e.g. you get Left and Right for BinaryExpression), you would need a BinaryExpression<T>, NewExpression<T>, MethodCallExpression<T>, different types for argument type combinations (MethodCallExpression<TRet, TArg1, TArg2...>), etc. And of course the non-generic API would still be needed. This would explode the API surface quite drastically.

I would be extremely glad simply to be able to reuse a ParameterExpression for multiple lambdas, so that I can easily combine the bodies without first using a custom ExpressionVisitor to 'make them compatible' or resorting to the factory methods. That would probably need new syntax though.
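For readers who have not hit this problem, the workaround mentioned above looks roughly like the following. It is only a sketch: ParameterReplacer and PredicateCombiner are illustrative names, not framework types, and only the single-parameter predicate case is covered.

```csharp
using System;
using System.Linq.Expressions;

// Swaps one ParameterExpression for another expression wherever it occurs.
sealed class ParameterReplacer : ExpressionVisitor
{
    private readonly ParameterExpression _source;
    private readonly Expression _target;

    public ParameterReplacer(ParameterExpression source, Expression target)
    {
        _source = source;
        _target = target;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        return node == _source ? _target : base.VisitParameter(node);
    }
}

static class PredicateCombiner
{
    // Combines two predicates with && by rewriting the right body
    // so that it uses the left lambda's parameter.
    public static Expression<Func<T, bool>> AndAlso<T>(
        Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        ParameterExpression parameter = left.Parameters[0];
        Expression rightBody =
            new ParameterReplacer(right.Parameters[0], parameter).Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, rightBody), parameter);
    }
}
```

The rewrite is needed because each lambda the compiler generates gets its own ParameterExpression instance, even when both are spelled s in source, so the two bodies cannot simply be joined.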

Olmo wrote:
That would simplify some fluent APIs, like:
p.ValidateProperty(p2 => p2.Name)
// would become
p.ValidateProperty(p.Name)
This sounds like a great use case for the infoof operator. Today you could use the somewhat longer form:
p.ValidateProperty(() => p.Name);
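On the receiving side, a method taking () => p.Name only has to look at the lambda body. A rough sketch, assuming the API only needs the member name (PropertyName.Of is a made-up helper, not part of any library):

```csharp
using System;
using System.Linq.Expressions;

static class PropertyName
{
    // Returns the name of the member accessed in a lambda such as () => p.Name.
    public static string Of<T>(Expression<Func<T>> accessor)
    {
        var member = accessor.Body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException(
                "Expected a simple member access such as () => p.Name.",
                nameof(accessor));
        }
        return member.Member.Name;
    }
}
```

Anything other than a direct member access (a method call, a constant, arithmetic) is rejected, which is the kind of check the compiler cannot do for you with a plain Expression parameter.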
Sep 23, 2014 at 12:30 PM
Well, in the current design LambdaExpression is abstract and Expression<T> inherits from it, but in my design it would be different:

The hierarchy of expressions would remain intact, and LambdaExpression would be non-abstract, like any other expression class.

Expression<T> would then be an expression container, not inheriting from Expression at all but having a GeneratedExpression property as its only member. The Type of this generated expression should be typeof(T).

Assigning any expression to an Expression<T> would make the C# compiler do its magic, except if the type of the expression is already Expression<T> instead of T, more or less like how params creates an array except if the value is already of a compatible array type.

I have to admit that I don't see many real use cases for this generalization, but it looks better, doesn't it?

About reusing ParameterExpressions: I have an extension method, Evaluate, that is more or less like Invoke for expressions, and then a visitor that does the conversion.
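A sketch of what such a pair could look like follows. This is one possible shape under my own assumptions, not the actual implementation: the names EvaluateSketch and Expand are invented, only single-parameter lambdas are handled, and Evaluate calls nested inside the inlined lambda are not expanded.

```csharp
using System;
using System.Linq.Expressions;

static class EvaluateSketch
{
    // Marker only: it is meant to appear inside expression trees
    // and to be removed by Expand before the tree is used.
    public static TResult Evaluate<T, TResult>(
        this Expression<Func<T, TResult>> expression, T argument)
    {
        throw new InvalidOperationException(
            "Evaluate is a marker; call Expand on the outer tree first.");
    }

    // Replaces every Evaluate call with the inlined body of the inner lambda.
    public static Expression<TDelegate> Expand<TDelegate>(this Expression<TDelegate> tree)
    {
        return (Expression<TDelegate>)new Inliner().Visit(tree);
    }

    private sealed class Inliner : ExpressionVisitor
    {
        protected override Expression VisitMethodCall(MethodCallExpression node)
        {
            if (node.Method.DeclaringType != typeof(EvaluateSketch)
                || node.Method.Name != nameof(Evaluate))
            {
                return base.VisitMethodCall(node);
            }

            // The first argument is the captured inner lambda; run that
            // sub-tree to get hold of the lambda object itself.
            var inner = (LambdaExpression)Expression
                .Lambda(node.Arguments[0]).Compile().DynamicInvoke();
            Expression argument = Visit(node.Arguments[1]);
            return new Substituter(inner.Parameters[0], argument).Visit(inner.Body);
        }
    }

    // Swaps the inner lambda's parameter for the argument expression.
    private sealed class Substituter : ExpressionVisitor
    {
        private readonly ParameterExpression _parameter;
        private readonly Expression _replacement;

        public Substituter(ParameterExpression parameter, Expression replacement)
        {
            _parameter = parameter;
            _replacement = replacement;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == _parameter ? _replacement : base.VisitParameter(node);
        }
    }
}
```

With Expression<Func<int, int>> square = x => x * x; and Expression<Func<int, int>> f = y => square.Evaluate(y) + 1;, calling f.Expand() produces a tree equivalent to y => y * y + 1 that no longer mentions Evaluate.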