C# Language Design Notes for Oct 7, 2013 (Part I)

Topics: C# Language Design
Jul 22, 2014 at 12:01 AM
Edited Jul 22, 2014 at 12:02 AM

Notes are archived here.

This is Part I. Part II is here.

Agenda

We looked at a couple of feature ideas that either came up recently or deserved a second hearing.
  1. Invariant meaning of names <scrap the rule>
  2. Type testing expression <can’t decide on good syntax>
  3. Local functions <not enough scenarios>
  4. nameof operator <yes>

Invariant meaning of names

C# has a somewhat unique and obscure rule called “invariant meaning in blocks” (documented in section 7.6.2.1 of the language specification) which stipulates that if a simple name is used to mean one thing, then nowhere in the immediately enclosing block can the same simple name be used to mean something else.

The idea is to reduce confusion, make cut & paste refactoring a little more safe, and so on.

It is really hard to get data on who has been saved from a mistake by this rule. On the other hand, everyone on the design team has experience being limited by it in scenarios that seemed perfectly legit.

The rule has proven to be surprisingly expensive to implement and uphold incrementally in Roslyn. This has to do with the fact that it cannot be tied to a declaration site: it is a rule about use sites only, and information must therefore be tracked per use – only to establish in the 99.9% case that no, the rule wasn’t violated with this keystroke either.
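
To make the rule concrete, here is a minimal sketch of the kind of code it targets. The names Counter, count and Demo are hypothetical and do not come from the notes. Inside Demo, the simple name count first refers to the field, and a nested block then declares a local with the same name. Under the rule as described above, that second meaning would be rejected; a compiler that has dropped the rule would be expected to accept it.

```csharp
public class Counter
{
    private int count = 10;

    public int Demo(bool flag)
    {
        int result = count;   // here 'count' means the field this.count
        if (flag)
        {
            int count = 5;    // same simple name, now a local in a nested block
            result += count;  // refers to the local, not the field
        }
        return result;
    }
}
```

The two meanings never overlap in scope, which is why this pattern tends to feel legitimate to the person writing it.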

Conclusion

The invariant meaning rule is well intentioned, but causes significant nuisance for what seems to be very little benefit. It is time to let it go.

Type testing expressions

With declaration expressions you can now test the type of a value and assign it to a fresh variable under the more specialized type, all in one expression. For reference types and nullable value types:
if ((var s = e as string) != null) { … s … } // inline type test
For non-nullable value types it is a little more convoluted, but doable:
if ((var i = e as int?) != null) { … i.Value … } // inline type test
One can imagine a slightly nicer syntax using a TryConvert method:
if (MyHelpers.TryConvert(e, out string s)) { … s … }
The signature of the TryConvert method would be something like
public static bool TryConvert<TSource, TResult>(TSource src, out TResult res);
The problem is that you cannot actually implement TryConvert efficiently: It needs different logic depending on whether TResult is a non-nullable value type or a nullable type. But you cannot overload a method on constraints alone, so you need two methods with different names (or in different classes).
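
A minimal sketch of that limitation follows. The method names TryConvertRef and TryConvertValue are hypothetical (the notes only name TryConvert), and the source parameter is simplified to object. Because the class and struct constraints alone cannot distinguish overloads, the two variants need different names.

```csharp
public static class MyHelpers
{
    // Reference-type targets: 'as' yields null when the test fails.
    public static bool TryConvertRef<TResult>(object src, out TResult res)
        where TResult : class
    {
        res = src as TResult;
        return res != null;
    }

    // Non-nullable value-type targets: test first, then unbox.
    public static bool TryConvertValue<TResult>(object src, out TResult res)
        where TResult : struct
    {
        if (src is TResult)
        {
            res = (TResult)src;
            return true;
        }
        res = default(TResult);
        return false;
    }
}
```

A caller has to pick the right variant by hand, which is exactly the friction that a dedicated syntax would remove.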

This leads to the idea of having dedicated syntax for type testing: a syntax that will a) take an expression, a target type and a fresh variable name, b) return a boolean for whether the test succeeds, c) introduce a fresh variable of the target type, and d) assign the converted value to the fresh variable if possible, or the default value otherwise.

What should that syntax be? A previous proposal was an augmented version of the “is” operator, allowing an optional variable name to be tagged onto the type:
if (e is string s) { … s … } // augmented is operator
Opinions on this syntax differ rather wildly. While we agree that some mix of “is” and “as” keywords is probably the way to go, no proposal seems appealing to everyone involved. Here are a few:
e is T x
T x is e
T e as x
(A few sillier proposals were made:
e x is T
T e x as
But this doesn’t feel like the right time to put Easter eggs in the language.)

Conclusion

Probably 90% of cases are with reference (or nullable) types, where the declaration-expression approach is not too horrible. As long as we cannot agree on a killer syntax, we are fine with not doing anything.

Local functions

When we looked at local functions on Apr 15, we lumped them together with local class declarations, and dismissed them as a package. We may have given them somewhat short shrift, and as we have had more calls for them we want to make sure we do the right thing with local functions in their own right.

A certain class of scenarios is where you need to declare a helper function for a function body, but no other function needs it. Why would you need to declare a helper function? Here are a few scenarios:
  • Task-returning functions may be fast-path optimized and not implemented as async functions: instead they delegate to an async function only when they cannot take the fast path.
  • Iterators cannot do eager argument validation so they are almost always wrapped in a non-iterator function which does validation and delegates to a private iterator.
  • Exception filters can only contain expressions – if they need to execute statements, they need to do it in a helper function.
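
As an illustration of the first scenario, here is a minimal sketch of a fast-path Task-returning method with a private async helper. The class CachedReader, its members and the value 42 are hypothetical placeholders, and Task.Yield stands in for real asynchronous work.

```csharp
using System.Threading.Tasks;

public class CachedReader
{
    private int? cached;

    // Not an async method: returns a completed task directly when cached.
    public Task<int> GetValueAsync()
    {
        if (cached.HasValue)
            return Task.FromResult(cached.Value); // fast path, no async state machine
        return GetValueSlowAsync();               // delegate to the async helper
    }

    // Helper that only GetValueAsync needs.
    private async Task<int> GetValueSlowAsync()
    {
        await Task.Yield();   // stand-in for real asynchronous work
        cached = 42;          // placeholder value for illustration
        return 42;
    }
}
```

GetValueSlowAsync has exactly one caller, yet it has to live as a private sibling on the class.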
Allowing these helper functions to be declared inside the enclosing method instead of as private siblings would not only avoid pollution of the class’ namespace, but would also allow them to capture type parameters, parameters and locals from the enclosing method. Instead of writing
public static IEnumerable<T> Filter<T>(IEnumerable<T> s, Func<T, bool> p)
{
    if (s == null) throw new ArgumentNullException("s");
    if (p == null) throw new ArgumentNullException("p");
    return FilterImpl<T>(s, p);
}
private static IEnumerable<T> FilterImpl<T>(IEnumerable<T> s, Func<T, bool> p)
{
    foreach (var e in s)
        if (p(e)) yield return e;
}
You could just write this:
public static IEnumerable<T> Filter<T>(IEnumerable<T> s, Func<T, bool> p)
{
    if (s == null) throw new ArgumentNullException("s");
    if (p == null) throw new ArgumentNullException("p");
    IEnumerable<T> Impl() // Doesn’t need unique name, type params or params
    {
        foreach (var e in s)          // s is in scope
            if (p(e)) yield return e; // p is in scope
    }
    return Impl();
}
The underlying mechanism would be exactly the same as for lambdas: we would generate a display class with a method on it. The only difference is that we would not take a delegate to that method.
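
A hand-written approximation of that lowering might look like the sketch below. The names Enumerables and FilterDisplayClass are hypothetical, and real compiler-generated code would differ in naming and detail; the point is only that the captured s and p become fields and the local function becomes an instance method that is called directly, with no delegate involved.

```csharp
using System;
using System.Collections.Generic;

public static class Enumerables
{
    // Stand-in for a compiler-generated display class holding the captures.
    private sealed class FilterDisplayClass<T>
    {
        public IEnumerable<T> s;
        public Func<T, bool> p;

        // The body of the local function, now an ordinary iterator method.
        public IEnumerable<T> Impl()
        {
            foreach (var e in s)
                if (p(e)) yield return e;
        }
    }

    public static IEnumerable<T> Filter<T>(IEnumerable<T> s, Func<T, bool> p)
    {
        // Eager validation still happens in the non-iterator wrapper.
        if (s == null) throw new ArgumentNullException("s");
        if (p == null) throw new ArgumentNullException("p");
        var closure = new FilterDisplayClass<T> { s = s, p = p };
        return closure.Impl(); // direct call, no delegate allocation
    }
}
```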

While this is nicer, it is reasonable to ask whether it offers that much over private sibling methods. Also, those scenarios probably aren’t super common.

Inferred types for lambdas

This did bring up the discussion about possibly inferring a type for lambda expressions. One of the reasons they are so unfit for use as local functions is that you have to write out their delegate type. This is particularly annoying if you want to immediately invoke the function:
((Func<int,int>)(x => x*x))(3); // What??!?
VB infers a type for lambdas, but a fresh one every time. This is ok only because VB also has more lax conversion rules between delegate types, and the result, if you are not careful, is a chain of costly delegate allocations.

One option would be to infer a type for lambdas only when there happens to be a suitable Func<…> or Action<…> type in scope. This would tie the compiler to the pattern used by the BCL, but not to the BCL itself. It would allow the BCL to add more (longer) overloads in the future, and it would allow others to add different overloads, e.g. with ref and out parameters.
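
The cast in the immediate-invocation example above can be avoided without any language change by routing the lambda through a small generic helper, which hints at what an inferred Func<…> type would buy. This is only a sketch: Lambda and Fn are hypothetical names, and the lambda parameter still needs an explicit type so that type inference has something to work with.

```csharp
using System;

public static class Lambda
{
    // Identity helper: lets the compiler infer Func<T, TResult> from the lambda.
    public static Func<T, TResult> Fn<T, TResult>(Func<T, TResult> f)
    {
        return f;
    }
}
```

With this in place, var square = Lambda.Fn((int x) => x * x); compiles, and Lambda.Fn((int x) => x * x)(3) evaluates to 9. One such helper is needed per delegate shape, so it is a workaround and not a substitute for inference in the compiler.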

Conclusion

At the end of the day we are not ready to add anything here. No local functions and no type inference for lambdas.

Jul 22, 2014 at 1:01 PM
"Probably 90% of cases are with reference (or nullable) types, where the declaration-expression approach is not too horrible. As long as we cannot agree on a killer syntax, we are fine with not doing anything."
How about not changing the name of the variable at all?
object s = ...;
if(s is string)
{
    //s is a string in this scope.
}
It could possibly break some existing code (implicit interfaces), so you could use some other keyword like becomes.
Jul 22, 2014 at 2:13 PM
This is really interesting, as many of these conversations have repeated themselves in threads here on the design forums in the months since this meeting (no doubt part of the impetus for posting these notes).

I think Local Functions is a killer feature that would be really powerful to add to the language. I disagree that the use cases are narrow. In the majority of cases where you currently use an iterator method you want to first do some argument checking. This means that the use case is nearly as large as that for iterators themselves, in which you clearly saw enough value to add them to the language in the first place. In addition, it would no doubt offer other interesting possibilities. Given that, I'd venture to guess that the set of use cases for Local Functions may be larger than that for iterators!

I also strongly support Inferred Types for Lambdas. I realize the possible pitfall of creating a chain of delegate allocations, but this could and should be addressed by adding diagnostics. The inability to use var for lambdas remains one of the ugliest parts of C#, IMO, and it also makes it a lot harder than it needs to be to program in a functional style.

I know others might feel more strongly, but I'm largely indifferent on the Type Testing syntax. While it seems nice to have, I don't think it's anywhere near useful enough to warrant inclusion over more appealing candidates like the two I referenced above.
Jul 22, 2014 at 5:02 PM
It's sad that type inference for lambdas won't be done.

It's really an ugly part of the C# language currently.
Both when using lambdas as local functions and when functional programming is involved.
Aug 11, 2014 at 4:55 PM
Another vote for local functions. Improves encapsulation. I find them very natural to use in other languages.
Oct 3, 2014 at 10:12 PM
Edited Oct 4, 2014 at 3:44 AM
I wrote the same proposal in the semicolon operator discussion, but it is definitely more appropriate here. IMHO, local functions are used only once most of the time, so I suggest the following syntax for local functions:
if( 
    { 
        var x = Foo(); 
        var y = Bar(); 
        Write(x); 
        return x * y > 0; 
    } 
) 
{ 
// 
}

int a = 
    lock(dataInstance) 
    { 
        return dataInstance.Property1; 
    }; 

var x = 
    lock(_dict){ 
        return { 
            foreach(var pair in _dict) 
            { 
                yield return new {
                    A = pair.Key, 
                    B = pair.Value.Property1
                }; 
            } 
        }.ToList(); 
    }; //List<Anonymous type> 

var y = 
    using(var conn = new SqlConnection(...)) 
    { 
        return 
            (await conn.QueryAsync<LongLongLongClassName>( 
                "sql query" 
            )).GroupBy(/* .... */) 
            .Select(/* ... */) 
            .ToList(); // Long long long final typename 
    };
This gives narrow scoping for temporary variables and less redundant explicit typing. It also covers the use cases of the semicolon operator just fine.

It would also be nice to mark variables in such blocks with a new "out" keyword to make them visible in the scope of the next block.
while ( 
    {
        out var buffer = new byte[100]; 
        return await stream.ReadAsync(buffer,0,100);
    } > 0
)
{
  // Work with buffer
}
bool b = false;
string xS = "42";
if(
    {
        // No
        if(b)
        {
            out var x = int.Parse(xS);
        }
        else
        {
            out var x = -int.Parse(xS);
        }
        return true;
    }
)
{
    //Ambiguous x
    Console.WriteLine(x);
}

// =>

if(
    {
        //Restrict to one declaration
        out int x;
        if(b)
        {
            x = int.Parse(xS);
        }
        else {
            x = -int.Parse(xS);
        }
        return true;
    }
)
{
    Console.WriteLine(x);
}
Actually, it replaces some use cases of declaration expressions too, kind of:

int result;
var myInt =
  int.TryParse("42", out result)
    ? result
    : 0;

//=>

int x = 
    try{
        return int.Parse("42");
    }
    catch
    {
        return 0;
    };

// instead of 

return int.TryParse(input, out var result) ? result : 0;
And probably pattern matching too (if I understand it correctly).
public static class Polar {
    public static bool operator is( Cartesian c, out double R, out double Theta)
   {
        R = Math.Sqrt(c.X*c.X + c.Y*c.Y);
        Theta = Math.Atan2(c.Y, c.X);
        return c.X != 0 || c.Y != 0;
   }
}
var c = new Cartesian(3, 4);
if (c is Polar(var R, *))
   Console.WriteLine(R);
   
// =>

object objC = new Cartesian(3, 4);
if (objC == null)
{
}
else if (
    {
        var c = objC as Cartesian;
        if(c == null)
            return false;
        if(c.X != 0 || c.Y != 0)
        {
            out var R = Math.Sqrt(c.X*c.X + c.Y*c.Y);
            out var Theta = Math.Atan2(c.Y, c.X);
            return true;
        }
        return false;
    }
)
{
    /* 
    * R, Theta are visible, compile time check for
    * 'Variable 'R' might not be initialized before accessing'
    * is possible
    */
    Console.WriteLine("{0} {1}", R, Theta);
}
else if(
    {
        var cString = objC as string;
        if(cString == null)
            return false;
        out var length = cString.Length;
        return true;
    }
)
{
    Console.WriteLine("{0}", length);
}
else{
    // R, Theta, length aren't visible, one next block only
}
IMHO this is a much clearer design. Plus, we don't always calculate R and Theta (otherwise, in pattern matching, they might not be initialized when needed; that is obvious overhead).

P.S. Vote here please http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/6504697-block-code-as-value-another-syntax-for-semicolon

P.P.S. Generally it would be nice to allow
var x;
x = 4;

var y;
if(int.TryParse(s, out y))
{
}
bool b = false;
string xS = "42";
if(
    {
        // Best
        out var x;
        if(b)
        {
            x = int.Parse(xS);
        }
        else{
            x = -int.Parse(xS);
        }
        return true;
    }
)
{
    Console.WriteLine(x);
}
things. I don't see much of a problem with resolving var a little later. It would solve a good chunk of scoping issues.