C# Language Design Notes for Apr 21, 2014 (Part II)

Topics: C# Language Design
Coordinator
Apr 30, 2014 at 3:36 PM
Edited Apr 30, 2014 at 5:36 PM
This is Part II. Part I is here and Part III is here.

Primary constructor bodies

By far the most commonly reported reason why people cannot use primary constructors is that they don’t allow for easy argument validation: there is simply no “body” within which to perform checks and throw exceptions.

We could certainly change that. The simplest thing, syntactically, is to just let you write a block directly in the type body, and that block then gets executed when the object is constructed:
public class ConfigurationException(Configuration configuration, string message) 
    : Exception(message)
{
    {
        if (configuration == null) 
        {
            throw new ArgumentNullException(nameof(configuration));
        }
    }
    private Configuration configuration = configuration;
    public override string ToString() => Message + "(" + configuration + ")";
}
This looks nice, but there is a core question we need to answer: when exactly is that block executed? There seem to be two coherent answers to that, and we need to choose:
  1. The block is an initializer body. It runs before the base call, following the same textual order as the surrounding field and property initializers. You could even imagine allowing multiple of them interspersed with field initialization, and they can occur regardless of whether there is a primary constructor.
  2. The block is a constructor body. It is the body of the primary constructor and therefore runs after the base call. You can only have one, and only if there is a primary constructor that it can be part of.
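To make the two candidate timings concrete, here is a minimal sketch using ordinary constructors in C# as it exists today (no primary constructor syntax). The names InitLog, OrderBase and OrderDerived are hypothetical and exist only for this illustration. The sketch records the order in which initializers, the base call and constructor bodies run; an initializer body (option 1) would slot in alongside the derived initializers, while a constructor body (option 2) corresponds to the last step.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that records each initialization step as it happens.
public static class InitLog
{
    public static readonly List<string> Steps = new List<string>();

    public static int Note(string step, int value)
    {
        Steps.Add(step);
        return value;
    }
}

public class OrderBase
{
    public int BaseValue { get; } = InitLog.Note("base field initializer", 1);

    public OrderBase(int argument)
    {
        InitLog.Note("base constructor body", argument);
    }
}

public class OrderDerived : OrderBase
{
    // Runs before the base call, so it cannot rely on anything the base sets up.
    public int DerivedValue { get; } = InitLog.Note("derived field initializer", 2);

    public OrderDerived(int x) : base(InitLog.Note("base argument", x))
    {
        // Runs last, after the base constructor has completed.
        InitLog.Note("derived constructor body", x);
    }
}
```

Constructing an OrderDerived and reading InitLog.Steps shows the derived initializer first, then the base argument, the base initializer, the base constructor body, and finally the derived constructor body.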
Both approaches have pros and cons. The initializer body corresponds to a similar feature in Java, and has the advantage that you can weed out bad parameters before you start digging into them or pass them to the base initializer (though arguments passed to the base initializer should probably be validated by the base initializer rather than in the derived class anyway).

As an example of this issue, our previous example where an initializer digs into the contents of a primary constructor parameter, wouldn’t work if the validation was done in a constructor body, after initialization (here in a simplified version):
    public bool IsRemote { get; } = configuration.Settings["remote"];
If the passed-in configuration is null, this would yield a null reference exception before a constructor body would have a chance to complain (by throwing a better exception). Instead, in a constructor body interpretation, the initialization of IsRemote would either have to happen in the constructor body as well, following the check, or it would have to make copious use of the null propagating operator that we’re also adding:
    public bool IsRemote { get; } = configuration?.Settings?["remote"] ?? false;
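As a hedged illustration of that second workaround, the sketch below evaluates the same expression against a stand-in Configuration type. The shape of Configuration (a Settings dictionary holding bool values) and the RemoteFlag helper are assumptions made for this example; the notes do not define them.

```csharp
using System.Collections.Generic;

// Stand-in for the Configuration type used in the notes (assumed shape).
public sealed class Configuration
{
    public Dictionary<string, bool> Settings { get; set; } = new Dictionary<string, bool>();
}

public static class RemoteFlag
{
    // Mirrors the initializer expression: a null configuration or a null
    // Settings dictionary short-circuits to null, and ?? turns that into false.
    public static bool Read(Configuration configuration)
    {
        return configuration?.Settings?["remote"] ?? false;
    }
}
```

Note that the operator only guards against null references. With a Dictionary, a missing "remote" key would still throw KeyNotFoundException, so the expression hides a null configuration rather than reporting it, which is exactly the trade-off described above.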
On the other hand, the notion of a constructor body is certainly more familiar, and it is easy to understand that the block is stitched together with the parameter list and the base initializer to produce the constructor declaration underlying the primary constructor.

Moreover, a constructor body has access to fields and members, while this access during initialization time is prohibited. Therefore, a constructor body can call helper methods etc. on the instance under construction; also a common pattern.

Conclusion

At the end of the day we have to make a choice. Here, familiarity wins. While the initializer body approach has allure, it is also very much a new thing. Constructor bodies on the other hand work the way they work. The downsides have workarounds. So a constructor body it is.

In a partial type, the constructor body must be in the same part as the primary constructor. Scope-wise, the constructor body is nested within the scope for the primary constructor’s base arguments, which in turn is nested within the scope for the field and property initializers of that part, which in turn is nested within the initialization scope that contains the primary constructor parameters:
partial class C(int x1) : B(int x3 = x1 /* x2 in scope but can’t be used */)
{
    public int X0 { get; } = (int x2 = x1);
    {
        int x4 = X0 + x1 + x2 + x3;
    }
}
Let’s look at the scopes (and corresponding declaration spaces) nested in each other here:
  • The scope S4 spans the primary constructor body. It directly contains the local variable x4, and is nested within S3.
  • The scope S3 spans S4 plus the argument list to the primary constructor’s base initializer. It directly contains the local variable x3, and is nested within S2.
  • The scope S2 spans S3 plus all field and property initializers in this part of the type declaration. It directly contains the local variable x2, and is nested within S1.
  • The scope S1 spans S2 plus similar “S2’s” from other parts of the type declaration, plus the parameter list of the primary constructor. It directly contains the parameter x1, and is nested within S0.
  • The scope S0 spans all parts of the whole type declaration, including S1. It directly contains the property X0.
On top of this, the usual rule applies for local variables, that they cannot be used in a position that textually precedes their declaration.

Assignment to getter-only auto-properties

There are situations where you cannot use a primary constructor. We have to make sure that you do not fall off too steep of a cliff when you are forced to abandon primary constructor syntax and use an ordinary constructor.

One of the main nuisances that has been pointed out is that the only way to initialize a getter-only auto-property is with an initializer. If you want to initialize it from constructor parameters, you therefore need to have a primary constructor, so those parameters can be in scope for initialization. If you cannot have a primary constructor, then the property cannot be a getter-only auto-property: You have to fall back to existing, more lengthy and probably less fitting property syntax.

That’s a shame. The best way to level the playing field here is to allow assignment to getter-only auto-properties from within constructors:
public class ConfigurationException : Exception
{
    private Configuration configuration;
    public bool IsRemote { get; }
    public ConfigurationException(Configuration configuration, string message) 
        : base(message)
    {
        if (configuration == null) 
        {
            throw new ArgumentNullException(nameof(configuration));
        }
        this.configuration = configuration;
        IsRemote = configuration.Settings["remote"];
    }
}
The assignment to IsRemote would go directly to the underlying field (since there is no setter to call). Thus, semantics are a little different from assignment to get/set auto-properties, where the setter is called even if you assign from a constructor. The difference is observable if the property is virtual. We could restore symmetry by changing the meaning of assignment to a get/set auto-property to also go directly to the backing field, but that would be a breaking change.

Conclusion

Let’s allow assignment to getter-only auto-properties from constructor bodies. It translates into assignment directly to the underlying field (which is readonly). We are ok with the slight difference in semantics from get/set auto-property assignment.
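The difference in semantics can be sketched as follows. PropertyBase and PropertyDerived are hypothetical types written only to make the virtual case observable: the get/set property is assigned through its (overridden) setter, while the getter-only property has no setter that a derived class could intercept.

```csharp
public class PropertyBase
{
    // Get/set auto-property: assignment always goes through the setter.
    public virtual int WithSetter { get; set; }

    // Getter-only auto-property: there is no setter to call.
    public virtual int GetterOnly { get; }

    public PropertyBase(int value)
    {
        WithSetter = value;  // virtual call; runs the most derived setter
        GetterOnly = value;  // direct write to the readonly backing field
    }
}

public class PropertyDerived : PropertyBase
{
    // Counts how often the overridden setter runs.
    public int SetterCalls { get; private set; }

    public PropertyDerived(int value) : base(value) { }

    public override int WithSetter
    {
        get { return base.WithSetter; }
        set { SetterCalls++; base.WithSetter = value; }
    }
}
```

After constructing a PropertyDerived, SetterCalls is 1: the base constructor's assignment to WithSetter was dispatched to the override, whereas the assignment to GetterOnly could not be observed by the derived class.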

Separate accessibility on type and primary constructor

There are scenarios where you don’t want the constructors of your type to have the same accessibility as the type. A common case is where the type is public, but the constructor is private or protected, object construction being exposed only through factories.

Should we invent syntax so that a primary constructor can get a different accessibility than its type?

Conclusion

No. There is no elegant way to address this. This is a fine example of a scenario where developers should just fall back to normal constructor syntax. With the previous decisions above, we’ve done our best to make sure that that cliff isn’t too steep.
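For reference, the fallback looks roughly like this with an ordinary constructor. The Connection type and its Create factory are hypothetical names chosen for the sketch; the getter-only property is assigned from the constructor, as decided in the previous section.

```csharp
using System;

public class Connection
{
    public string Host { get; }

    // Private: construction is only exposed through the factory below.
    private Connection(string host)
    {
        Host = host;
    }

    public static Connection Create(string host)
    {
        if (host == null)
        {
            throw new ArgumentNullException(nameof(host));
        }
        return new Connection(host);
    }
}
```

The type stays public while the constructor is private, which is the combination a primary constructor cannot express.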

Separate doc comments for field parameters and their fields

Doc comments for a primary constructor parameter apply to the parameter. If the parameter is a field parameter, there is no way to add a doc comment that goes on the field itself. Should there be?

Conclusion

No. If the field needs separate doc comments, it should just be declared as a normal field. With the introduction of initialization scopes above, this is now not only possible but easy.

Apr 30, 2014 at 6:02 PM
Edited Apr 30, 2014 at 6:04 PM
madst wrote:
The assignment to IsRemote would go directly to the underlying field (since there is no setter to call). Thus, semantics are a little different from assignment to get/set auto-properties, where the setter is called even if you assign from a constructor. The difference is observable if the property is virtual. We could restore symmetry by changing the meaning of assignment to a get/set auto-property to also go directly to the backing field, but that would be a breaking change.
I don't understand your reasoning. There is no setter, hence there can be no virtual setter, hence writing to the backing field is actually the expected behavior. Making it consistent with get/set auto-properties would be extremely strange - and since there is no setter, you wouldn't be able to override it anyway, would you?

Also, there's - in my opinion - still the issue of primary constructors cluttering up the type definition as discussed in this thread: https://roslyn.codeplex.com/discussions/541421. Consider the following contrived example:
class MyType<T>(Dictionary<string, T> stringToObjectMap, ConfigurationFile configurationFile) 
     : MyBase(stringToObjectMap, configurationFile, isReadonly: false)
     where T : IComparable
{
   // ...
}
It is extremely hard to see here what it is you're actually declaring. Instead, the proposed new syntax in the other thread would improve readability, be more in line with how generic constraints are specified and would re-use the usual : base() syntax for calling the base constructor:
class MyType<T> : MyBase
    new(Dictionary<string, T> stringToObjectMap, ConfigurationFile configurationFile)
        : base(stringToObjectMap, configurationFile, isReadonly: false)
    where T : IComparable
{
   // ...
}
May 1, 2014 at 10:21 PM
For me, writing (even from a constructor) to a property that has no set accessor looks ugly. At the same time, if we want to allow writing to a field only from a constructor, we use the readonly keyword. Maybe it is worth adding the readonly keyword to the set accessor for the same result. Something like:
public class ConfigurationException : Exception
{
    private Configuration configuration;
    public bool IsRemote { get; private readonly set; }
    public ConfigurationException(Configuration configuration, string message) 
        : base(message)
    {
        if (configuration == null) 
        {
            throw new ArgumentNullException(nameof(configuration));
        }
        this.configuration = configuration;
        IsRemote = configuration.Settings["remote"];
    }
}
Or even:
public class SomeClass
{
    private readonly int _input;

    public int Input
    {
        get { return _input; }
        private readonly set
        {
            if (value < 0) throw new ArgumentOutOfRangeException();
            _input = value;
        }
    }

    public SomeClass(int input)
    {
        Input = input;
    }
}
In that case, it should be allowed to write to readonly fields from readonly setters. At the same time, "readonly set" also looks ugly.
Coordinator
May 1, 2014 at 10:56 PM
@Expandable: I'm sorry, I did not explain it very well. Let me try again.

Today, writing to a property always calls the setter. That is the case whether you are in a constructor or not, and whether the property is an auto-property or not. This feature breaks with that. That was really all I was trying to say.

@darkman666: that is really what you are taking issue with. And I agree that it is a "wart" - an inconsistency in the language. However, as we considered the alternatives, we really don't think this is very jarring. The backing field of a getter-only auto-property is readonly, and these rules are the same as we have for readonly fields today: you can assign from the constructor only.

There is no reason to add more syntax for this. There is no way we could win back benefits to match the cost of that.
May 1, 2014 at 11:29 PM
madst wrote:
At the end of the day we have to make a choice. Here, familiarity wins. While the initializer body approach has allure, it is also very much a new thing. Constructor bodies on the other hand work the way they work. The downsides have workarounds. So a constructor body it is.
The one thing a primary constructor as earlier conceived would be able to do that a normal constructor cannot is initialize fields with values that depend upon constructor parameters before the base constructor is called. If constructor parameters are only available within a block of code that won't run until after the base constructor, what's the point?
May 1, 2014 at 11:32 PM
madst wrote:
Today, writing to a property always calls the setter. That is the case whether you are in a constructor or not, and whether the property is an auto-property or not. This feature breaks with that. That was really all I was trying to say.
In what cases would code know or care whether code which syntactically wrote to a private-set autoproperty called the setter or simply wrote the field? Even if the Jitter would have a high likelihood of replacing the setter call with an inline write to the field, is anything whatsoever gained by making it expend the effort? There's no way the auto-property is going to be replaced with something else without recompiling every piece of code which could possibly write to it, so I see no reason a non-virtual call to a method that cannot be anything other than a simple field-write should be any different semantically from a simple field write.
May 1, 2014 at 11:34 PM
darkman666 wrote:
For me, writing (even from a constructor) to a property that has no set accessor looks ugly. At the same time, if we want to allow writing to a field only from a constructor, we use the readonly keyword.
public bool IsRemote { get; private readonly set; }
How about eliminating the word set? After all, if "readonly" variables can be written in a constructor, why not readonly properties?
Coordinator
May 1, 2014 at 11:38 PM
Edited May 1, 2014 at 11:38 PM
supercat wrote:
madst wrote:
Today, writing to a property always calls the setter. That is the case whether you are in a constructor or not, and whether the property is an auto-property or not. This feature breaks with that. That was really all I was trying to say.
In what cases would code know or care whether code which syntactically wrote to a private-set autoproperty called the setter or simply wrote the field? Even if the Jitter would have a high likelihood of replacing the setter call with an inline write to the field, is anything whatsoever gained by making it expend the effort? There's no way the auto-property is going to be replaced with something else without recompiling every piece of code which could possibly write to it, so I see no reason a non-virtual call to a method that cannot be anything other than a simple field-write should be any different semantically from a simple field write.
If the auto-property is virtual, then there's a difference. A setter can be overridden in a derived class.
Coordinator
May 1, 2014 at 11:39 PM
supercat wrote:
darkman666 wrote:
For me, writing (even from a constructor) to a property that has no set accessor looks ugly. At the same time, if we want to allow writing to a field only from a constructor, we use the readonly keyword.
public bool IsRemote { get; private readonly set; }
How about eliminating the word set? After all, if "readonly" variables can be written in a constructor, why not readonly properties?
How about eliminating the words private readonly set;? :-)
May 1, 2014 at 11:49 PM
madst wrote:
supercat wrote:
darkman666 wrote:
For me, writing (even from a constructor) to a property that has no set accessor looks ugly. At the same time, if we want to allow writing to a field only from a constructor, we use the readonly keyword.
public bool IsRemote { get; private readonly set; }
How about eliminating the word set? After all, if "readonly" variables can be written in a constructor, why not readonly properties?
How about eliminating the words private readonly set;? :-)
If we have an explicit setter, it will solve all the issues with virtual properties:
public class SomeClass
{
    private readonly int _input;

    public virtual int Input
    {
        get { return _input; }
        protected readonly set
        {
            if (value < 0) throw new ArgumentOutOfRangeException();
            _input = value;
        }
    }

    public SomeClass(int input)
    {
        Input = input;
    }
}
or even
public class SomeClass
{
    public virtual int Input { get; protected readonly set; }

    public SomeClass(int input)
    {
        Input = input;
    }
}
Here we still have a virtual property whose set accessor can be overridden in a derived class.
May 1, 2014 at 11:55 PM
I don't think protected readonly makes any sense. There are cases where private virtual would make sense if the run-time supported such usage (derived classes could specify overrides, but only the base class could call them, and the only way the base-class method could be called would be from the derived-class override of that same method), but since the run-time doesn't support such a concept, I see no use case for protected readonly since the CLR would have no mechanism to distinguish it from protected.
May 2, 2014 at 12:00 AM
An even better idea: we could define it this way:
public class SomeClass
{
    public virtual readonly int Input { get; protected set; }

    public SomeClass(int input)
    {
        Input = input;
    }
}
In this form the property declaration looks the same as a field declaration, and the readonly keyword means exactly the same thing - you can write to this property only in an initializer or constructor (same as for a field).
The property still has a set accessor. The readonly keyword could be mapped to some attribute at compile time, letting assembly users know that this property's setter should be used only in a constructor. The only problem is that other languages would also have to understand this attribute, or they would be able to write to the property later. But, as I remember, it works much the same way with readonly fields - the CLR doesn't restrict writing to readonly fields at any time (but the compiler does).
May 2, 2014 at 7:03 AM
@madst: Thanks for the explanation. I think I understand what you mean, but I don't think it's a problem at all. Auto-implemented readonly properties have no setter that you could possibly call, hence I don't see it as a problem. Others, apparently, do.

@darkman666: I think there are (at least) three main arguments in favor of the proposed syntax vs. what you're trying to do:
1) int MyReadonlyProperty { get; } simply has no setter, hence it maps regularly to a CLR feature that is understood by any language targeting the CLR. No special compiler magic is required here.
2) We already have get-only properties without setters that are therefore readonly. Just like C# 2 (was it 2?) introduced auto-implemented not-readonly properties, we now get auto-implemented read-only properties with the special feature that they can be assigned to in a constructor.
3) You're trying to come up with a feature that is bad practice anyway: Even though it's technically possible, many coding guidelines strongly discourage calling virtual members (methods or properties) in constructors (either directly or indirectly).

I do agree that there might be some benefit in allowing readonly property setters, though, but that is only relevant for non-auto-implemented ones. On the other hand, since readonly properties can only be set in the constructor anyway, why not just do your argument validation in the constructor just like you have to do it for fields anyway?

By the way, can we write int MyProperty { get; private set; } = 5? Or is the initialization syntax only allowed for readonly auto-implemented properties?
May 2, 2014 at 5:24 PM
Expandable wrote:

1) int MyReadonlyProperty { get; } simply has no setter, hence it maps regularly to a CLR feature that is understood by any language targeting the CLR. No special compiler magic is required here.

How about {get; private var;} or {get; private readonly var;} to expressly say that within the class, references to the member name should be interpreted as accesses to the backing field? Such behavior would, in addition to making the role of readonly clear, also allow for:
public struct foo {
  public int X, Y {get; private readonly var; }
  foo(int x, int y) {
    X=x; Y=y;
  }
} 
without the compiler squawking about calling the "set" methods on an incompletely-filled in structure.
May 8, 2014 at 12:06 AM
madst wrote:

Conclusion

No. There is no elegant way to address this. This is a fine example of a scenario where developers should just fall back to normal constructor syntax.
What if normal constructor syntax is semantically inadequate, e.g. because the base constructor calls a virtual method which depends upon constructor or factory parameters?

It's possible to use ThreadStatic variables to make factory parameters accessible to field initializers, or to refrain from using readonly fields and have the base constructor call a PreInit method with its parameters to allow derived classes to do all the things they should have done before chaining to base. I don't think one can reasonably suggest, however, that either of these approaches is anywhere near good enough to obviate the need for something better.

The designers of C# made a deliberate decision to have field initializers run before the base constructor, even though this greatly limited what field initializers could do. If classes are supposed to be able to prepare themselves for virtual method calls before chaining the base constructor, the present constructor syntax is grossly inadequate to achieve that in many common usage scenarios.
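A minimal sketch of the situation described here, using hypothetical types Widget and LabeledWidget: the base constructor calls a virtual method, and the override can rely on state set by a field initializer but not on state assigned in the derived constructor body, because that body has not run yet.

```csharp
public abstract class Widget
{
    // Captures what the virtual call returned while the base constructor ran.
    public string DescriptionSeenByBase { get; }

    protected Widget()
    {
        DescriptionSeenByBase = Describe();  // virtual call during construction
    }

    protected abstract string Describe();
}

public class LabeledWidget : Widget
{
    // Field initializers run before the base constructor.
    private readonly string fromInitializer = "initialized";

    // Assigned in the constructor body, which runs after the base constructor.
    private readonly string fromConstructor;

    public LabeledWidget(string label) : base()
    {
        fromConstructor = label;
    }

    protected override string Describe()
    {
        return fromInitializer + "/" + (fromConstructor ?? "<null>");
    }
}
```

For new LabeledWidget("x"), DescriptionSeenByBase holds "initialized/<null>": the value passed to the constructor was not yet available when the base constructor made the virtual call.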
May 8, 2014 at 1:52 PM
Edited May 8, 2014 at 2:02 PM
Dear Mads,

This whole story about scopes + primary constructor body is ABSOLUTELY crazy.
This design is hard to reason about, hard to support in IDE tooling, and I have no idea why anybody would ever need this.

I understand why you are proposing this, because after some compilation steps things become clear:
class Person(int id, string name) : Entity(id, var normalizedName = name.ToUpper())
{
  public string Name { get; } = (var titleCaseName = name.ToTitleCase());

  {
    if (name == null) throw new ArgumentNullException("name");
    Console.WriteLine("Hello, {0}", titleCaseName);
  }
}
Compiles down to:
class Person : Entity
{
  private readonly string _name;

  public string Name { get { return _name; } }

  public Person(int id, string name)
  {
    _name = (var titleCaseName = name.ToTitleCase());
    base(id, var normalizedName = name.ToUpper());

    if (name == null) throw new ArgumentNullException("name");

    Console.WriteLine("Hello, {0}", titleCaseName); // ok, titleCaseName is in scope
  }
}
But this is absolutely crazy to support.
Who needs this? What for? How do you reason about this?

What's wrong with the current design, which disallows declaration expressions in member/base initializers?
error CS8201: A declaration expression is not permitted in a variable-initializer of a field declaration,
              an attribute application, or in a class-base specification.
Or why not restrict scopes for each initializer?
class Person : Entity
{
  private readonly string _name;

  public string Name { get { return _name; } }

  public Person(int id, string name)
  {
    {
      _name = (var titleCaseName = name.ToTitleCase());
    }
    {
      base(id, var normalizedName = name.ToUpper());
    }
    {
       if (name == null) throw new ArgumentNullException("name");

       Console.WriteLine("Hello, {0}", titleCaseName); // error, no longer in scope
    }
  }
}
Coordinator
May 8, 2014 at 5:16 PM
@ControlFlow:

These rules may or may not be a step too far, but I don't agree that they are crazy. :-)

This starts out with the idea of the initialization scope, which turns out to be an excellent solution to a number of smaller issues around primary constructors. Now we have a scope that exists only at initialization time. Suddenly there's a place where it makes sense to have local variables (introduced by declaration expressions during initialization) live. This is worth exploring!

The next step is to figure out what that should actually look like to make sense. You want it to feel "natural". Interestingly, this tends to lead to a rather high number of nested scopes. This is par for the course with declaration expressions: more places where locals can be introduced means more scopes need to be defined. The rules above for locals in the initialization scope are similar in complexity to those governing the for statement, for instance.

For a language implementer like you this looks like a lot of added complexity. But it is not random complexity: it is the set of rules that makes the scopes "just make sense" to a developer. At least that's the intention. What that means is:
  • Locals cannot be used at a point that's textually before where they are introduced (just like inside statement bodies)
  • Locals cannot be used at a point that's evaluated before where they are introduced (just like with the increment in a for loop, for instance)
These are simple principles - essentially embodiments of the principle of least surprise. What I've done in the design notes is just to spell out what the consequence is of applying those principles.

And to be honest I don't think these rules are hard to implement either. In the Roslyn compiler code base, introducing a nested scope is a simple operation. From an IDE point of view, the principles are super IDE friendly. Making locals only useable after they are introduced is key for completion and so on. Making them not appear across parts means that locals are still limited to one file, and no new questions of evaluation order arise.

Now all that said, we are doing our design process in the open precisely so that people can raise concerns. We have already made many changes because of feedback we get here on CodePlex and elsewhere. We're all better off for it! There may be great end user arguments for why we should pull this particular feature back a little - I look forward to hearing them. Implementers' woes we are probably a little less concerned about. After having built async (and added support for await in catch and finally blocks this time around) we have a very high tolerance for dealing with complexity in the implementation so that the language users get a natural, expressive and smooth experience.
May 8, 2014 at 7:02 PM
I posted a thread to suggest a syntax and focus on what I see as the most fundamental issues which a new syntax should try to fix:
  1. The C# designers thought that the advantages of having field initializers run before the base class constructor were sufficient to justify severe restrictions on what such initializers are allowed to do, but there is no decent mechanism to initialize any field whose value would depend upon constructor or factory-method parameters (I do not consider the use of ThreadStatic variables to be a decent mechanism).
  2. There is substantial value in knowing that any possible way via which a readonly variable might be initialized is the (only) way it will be initialized. Such a guarantee is presently available with readonly fields that are defined using initializer syntax, but not for fields whose values cannot be defined that way. It would be helpful to increase the number of situations in which the guarantee could be applied.
Consider, for example:
public partial class InitTest
{
    readonly int[] array1, array2;
    public InitTest(int size) : base()
    {
        array1 = new int[size];
        array2 = new int[size];
    }
}
How much code would one have to examine to know whether array1 and array2 can ever be observed as anything other than distinct arrays with matching lengths? Is there any way in which the above partial class could be written (using the present compiler) which would reduce the amount of code that would need to be inspected to check the invariant?

If the code could be written as something like:
public partial class InitTest
{
    partial new ConfigArrays(int size)
    {
      private readonly int[] array1 = new int[size];
      private readonly int[] array2 = new int[size];
    }

    public InitTest(int size) : ConfigArrays(size), base()
    {
    }
}
and all constructors were required to chain to ConfigArrays exactly once before chaining to base, how would that affect the amount of code inspection necessary to validate the invariant?
May 9, 2014 at 5:51 PM
madst wrote:

First of all, thank you for your answer!
...
This starts out with the idea of the initialization scope, which turns out to be an excellent solution to a number of smaller issues around primary constructors. Now we have a scope that exists only at initialization time. Suddenly there's a place where it makes sense to have local variables (introduced by declaration expressions during initialization) live. This is worth exploring!
...
I have been exploring the problems of converting type declarations to types with primary constructors for quite some time now.
Most of the problems can be solved with property initializers and an explicit primary constructor body for side effects (especially with the currently proposed design, where the this reference may be used): event subscriptions, Init() calls and other ugly things people love to do in constructors...

I just can't see any problems that declaration expressions in class-level members or the base initializer can solve.
Are you trying to reduce the need for a primary ctor body?
...
For a language implementer like you this looks like a lot of added complexity.
...
It's not that hard to support this, actually.
...
But it is not random complexity: it is the set of rules that makes the scopes "just make sense" to a developer. At least that's the intention. What that means is:
  • Locals cannot be used at a point that's textually before where they are introduced (just like inside statement bodies)
  • Locals cannot be used at a point that's evaluated before where they are introduced (just like with the increment in a for loop, for instance)
These are simple principles - essentially embodiments of the principle of least surprise. What I've done in the design notes is just to spell out what the consequence is of applying those principles.
...
Yep, I know these principles. Everything starts making some sense if the developer knows the C# initialization order well :)

But when I imagine myself explaining this to someone:
class C(int x) : B(int y = x + 1) {
   readonly int _x = x; // why this is OK
   readonly int _y = y; // and why 'y' is not in scope here

   {
     _y = y; // but in scope here
   }
}
...this just scares me.
...
Now all that said, we are doing our design process in the open precisely so that people can raise concerns. We have already made many changes because of feedback we get here on CodePlex and elsewhere. We're all better off for it!
...
This is really nice to hear :)

I like most of the C# 6.0 changes and hope they will change the way users write C# (so that typical class initialization becomes more trivial and declarative, for example).
I like the idea of declaration expressions; they really can make code with out/ref parameters much more usable.
But on the other hand, they can be misused a lot - instead of introducing a variable in a declaration statement, users may start inserting "var name = " inside arbitrarily complex statements to reuse values:
var builder = ArrayBuilder<string>.GetInstance(var count = this.Names.Count);
for (int i = 0; i < count; ++i) {
  builder.Add(Name(i));
}
I just can't easily find what count means here. And you are proposing to move this confusion up a level...
...
There may be great end user arguments for why we should pull this particular feature back a little - I look forward to hearing them.
...
Declaration expressions are ~useless at type declaration level, hard to explain, hard to reason about. That is all :)
May 9, 2014 at 7:13 PM
Edited May 9, 2014 at 7:43 PM
madst wrote:
After having built async (and added support for await in catch and finally blocks this time around) we have a very high tolerance for dealing with complexity in the implementation so that the language users get a natural, expressive and smooth experience.
Neat. Do you know if the same restrictions regarding catch and finally blocks have been removed for yield?

EDIT - oops, I misremembered. The limitation that I was thinking about was not being able to use yield within a try block for which a catch block has been defined, which VB.NET does currently permit.
May 9, 2014 at 7:25 PM
ControlFlow wrote:
I just can't see any problems that declaration expressions in class-level members or the base initializer can solve.
Are you trying to reduce the need for a primary ctor body?
A fundamental issue is that a constructor body cannot execute until after the base constructor has been invoked; even if a base constructor happens not to invoke any virtual methods and almost everything could safely be handled in the constructor body, it would still be impossible for evaluation of base-constructor parameters to be deferred until that time.

There are a variety of ways by which the language could be extended to allow more useful things to be done before the invocation of the base constructor. Allowing the use of declaration expressions within a chained constructor call would be one such way, although I think more generalized approaches should exist, and in most cases would likely be a better fit for requirements.

BTW, you seem to be expecting declaration expressions to have a larger scope than I'm imagining. Given (var foo=whatever; foo.biz(); foo.boz()) I would expect that foo would fall out of scope at the closing parenthesis. It looks as though in your last example you are expecting the count created in the first line to remain in scope? Also, I would expect that for the compiler to accept your last example without squawking, the last subexpression in the first line would have to be rewritten as GetInstance(var count = this.Names.Count; count).
May 9, 2014 at 7:34 PM
Halo_Four wrote:
Neat. Do you know if the same restrictions regarding catch and finally blocks have been removed for yield?
If an exception occurs within an iterator's consumer, the iterator will be notified when its services are no longer required, but it will not be told why. Given:
void EvilIteratorTest<T>(IEnumerable<T> enumerator, bool evilness)
{
  foreach (var foo in enumerator)
    if (evilness) throw new Exception(); else break;
}
there would be no mechanism via which the enumerator could find out whether an exception occurred or the consumer just didn't feel like enumerating anymore. A finally can straddle a yield return, since it should run regardless of why the client no longer required the enumerator, but a catch would have no way of discovering when it should be run.
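A minimal sketch of the finally behaviour described above, in current C#. The names IteratorCleanupDemo, Numbers, Consume and Log are hypothetical and chosen only for illustration. The iterator's finally block runs when the enumerator is disposed, and nothing inside the iterator indicates whether the consumer broke out of the loop or threw:

```csharp
using System;
using System.Collections.Generic;

public static class IteratorCleanupDemo
{
    // Records what happened, so the ordering can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Numbers()
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            // Runs when the enumerator is disposed: after normal completion,
            // after an early break, or after an exception in the consumer.
            Log.Add("finally");
        }
    }

    public static void Consume(bool throwInsteadOfBreak)
    {
        try
        {
            foreach (var n in Numbers())
            {
                if (throwInsteadOfBreak) throw new InvalidOperationException();
                break;
            }
        }
        catch (InvalidOperationException)
        {
            Log.Add("consumer caught");
        }
    }
}
```

In both cases the iterator logs only "finally"; the difference between the break and the exception is visible to the consumer alone.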
May 9, 2014 at 7:47 PM
Of course, I was referring to exceptions thrown during iteration by the iterator code itself or a function it called.

The limitation I was really thinking of is that you cannot yield within a try block for which catch blocks have been defined, which is something that VB.NET iterators do permit. I have edited my original post to reflect my mistake.
May 9, 2014 at 7:53 PM
Halo_Four wrote:
Of course, I was referring to exceptions thrown during iteration by the iterator code itself or a function it called.
Given the code:
try { do stuff } catch (Exception ex) { trap stuff} finally { cleanup stuff }
there is a strong expectation that "trap stuff" will run if "do stuff" doesn't complete or execute a "return". It would seem somewhat odd for an outside exception to cause execution to jump directly from a "yield return" to a "finally block". Perhaps as syntactic sugar a compiler could allow a try/catch block to straddle a yield return if the first catch was catch (AbandonedIteratorException), and that block did not have a parameterless throw. That would make it clear why the code flow was behaving as it was. Maybe suggest that as an idea?
May 10, 2014 at 8:08 AM
I find that I'm not convinced by the primary constructor feature at all. It brings as much trouble as it solves.

I know similar features in other quick and easy languages, which are quite convenient and useful. But in an enterprise-oriented language like C#, people care about too many side aspects like accessibility, validation, API docs, etc., which defeats the purpose of this feature. And I feel the primary constructor design is becoming more and more heavyweight, closer to a normal constructor. IIRC, one of the issues in C++ that C# solves is that there are multiple not-so-different ways to do the same thing.

I especially dislike that the body of the primary constructor is just a brace pair, which can appear anywhere inside the type declaration.
1) on the free order: it adds obfuscation. The free order of member modifiers is already bad, but at least that is within one line. Now we need to search for a potential body in, say, a hundred lines of code?
2) no visual clue except braces: C#, just like all other C-family languages, uses a lot of braces. A bare pair of braces is a nested scope in a function. Now it may also be a constructor body, which looks like the result of an incorrect copy-paste.

I'd prefer not to have the body feature. In case we need to add some argument validation, it can be achieved in the old way:
public class ConfigurationException(Configuration configuration, string message) 
    : Exception(message)
{
    private Configuration configuration = (ValidationHelper.RequireNotNull(configuration); configuration);
    public override string ToString() => Message + "(" + configuration + ")";
}
May 10, 2014 at 3:52 PM
qrli wrote:
I find that I'm not convinced by the primary constructor feature at all. It brings as much trouble as it solves.

I know similar features in other quick and easy languages, which are quite convenient and useful. But in an enterprise-oriented language like C#, people care about too many side aspects like accessibility, validation, API docs, etc., which defeats the purpose of this feature. And I feel the primary constructor design is becoming more and more heavyweight, closer to a normal constructor. IIRC, one of the issues in C++ that C# solves is that there are multiple not-so-different ways to do the same thing.


I especially dislike that the body of the primary constructor is just a brace pair, which can appear anywhere inside the type declaration.
I don't like the use of a "free-floating" brace pair either. If something is going to behave syntactically like a method scope, it should have something resembling a method declaration before it.
I'd prefer not to have the body feature. In case we need to add some argument validation, it can be achieved in the old way:
public class ConfigurationException(Configuration configuration, string message) 
    : Exception(message)
{
    private Configuration configuration = (ValidationHelper.RequireNotNull(configuration); configuration);
    public override string ToString() => Message + "(" + configuration + ")";
}
I dislike the use of the same identifier for the parameter and the field, without a qualifier. I would favor requiring disambiguation with params.configuration.

Have you looked at the thread with my "partial constructor" proposal yet? It is perhaps more complicated to implement than it would need to be, but it would be a major step toward making C# capable of expressing things that can be expressed in the CIL. The most fundamental aspects are the ability to have mostly-free-style code run before chaining to a base constructor, providing that the code does not use or expose references to the object under construction, and a guarantee that all constructors will behave consistently. If code can get by without need for nested subroutine-call semantics within a pre-constructor, an alternative approach might be to allow one constructor in a class to use a different syntax for chaining to the base constructor, and require that all other constructors chain through it.

How would this strike you:

Declaring a field or auto-property as params and including an access qualifier would be similar to readonly, but would require that its value be established before the base constructor runs [a quirky keyword choice, perhaps, but the goal would be to impose stronger semantics than readonly, and the normal usage case would be that the value be established based upon parameters passed to the constructor]. One constructor would be allowed to chain to base within its body, but the compiler would ensure that no code path would be allowed to improperly access the object under construction before calling the base constructor. Access to read-only fields or autoproperties would be permitted if their value had been set; if the CIL would not allow the fields to actually be read, the compiler could cache the previously-written values.

Initialization would perform the following sequence:
  1. The initializers for all fields without params declarations, readonly or otherwise, would be run first, top-to-bottom, as is the case now.
  2. If the invoked constructor chains to another, that chaining would be processed next, as is the case now.
  3. Once chaining has reached the constructor that is going to chain to base, the portion of that constructor prior to the chained call would run.
  4. Next, field initializers for any fields marked params would run; those would be allowed to use the values of any previously-written params fields.
  5. Next the base constructor would run, and after that, the rest of the constructor body.
The compiler should enforce that every params field is written exactly once, and is not read until it is written. Such assurances would mean that a programmer who could tell that value was written with some particular value would be entitled to know that it would always hold that value, without having to look for code in a (possibly-different) constructor which could modify it. The compiler should also validate that every execution path will chain exactly once to a base constructor, but allow for the possibility of choosing which base constructor to invoke at run-time. It should also allow the base-constructor call to be used within a try block, but with the caveat that catch or finally blocks would be subject to the same restrictions as code preceding the base constructor call, and any catch block would be required to rethrow the exception. This would allow, e.g.
bool ok = false;
File theFile = File.Open(filename);
try
{
   base(theFile);
   ok = true;
}
finally
{
  if (!ok) theFile.Close();
}
to avoid leaking a file if the base constructor throws an exception [there's still a weakness if a derived constructor can throw, alas].
May 11, 2014 at 5:34 AM
supercat wrote:
Have you looked at the thread with my "partial constructor" proposal yet? It is perhaps more complicated to implement than it would need to be, but it would be a major step toward making C# capable of expressing things that can be expressed in the CIL. The most fundamental aspects are the ability to have mostly-free-style code run before chaining to a base constructor, providing that the code does not use or expose references to the object under construction, and a guarantee that all constructors will behave consistently. If code can get by without need for nested subroutine-call semantics within a pre-constructor, an alternative approach might be to allow one constructor in a class to use a different syntax for chaining to the base constructor, and require that all other constructors chain through it.
I haven't. But admittedly, I used to expect such syntax:
public MyDerived(string fileName)
{
    ValidationHelper.RequireNotEmpty(fileName);
    base(fileName);
}
That's how Delphi deals with the base constructor call. And I wanted the same for C#.
But later I found this workaround:
public MyDerived(string fileName)
    : base(ValidationHelper.RequireNotEmpty(fileName))
{
}
For the try-finally case you described, I used to investigate that for C++, which is really tricky. But for C#, usually the GC solves it in the end, so we do not need to take on the headache. However, your example with a File object caused me to rethink it, because delayed release by the GC has a side effect.

That being said, it is a rare case. If it really needs to be solved, I'd like to extend the declaration expression syntax with using. Namely, a using declaration expression:
public MyDerived(string fileName)
    : base((using var file = OpenFile(fileName)))
{
}
PS: Without adding any new feature, the same issue can also be worked around by using a factory method.
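A minimal sketch of the factory-method workaround mentioned in the PS, in current C#. TrackedResource and ResourceOwner are hypothetical names; TrackedResource stands in for an open file, and the failDuringConstruction flag exists only to simulate a constructor that throws. The factory takes ownership of the resource and disposes it if construction fails:

```csharp
using System;

// Hypothetical resource standing in for an open file.
public sealed class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}

public class ResourceOwner
{
    private readonly TrackedResource resource;

    // Private, so every instance is created through Create.
    private ResourceOwner(TrackedResource resource, bool failDuringConstruction)
    {
        if (failDuringConstruction)
            throw new InvalidOperationException("Construction failed.");
        this.resource = resource;
    }

    public TrackedResource Resource { get { return resource; } }

    // Takes ownership of the resource: if the constructor throws,
    // the resource is disposed here rather than left to a finalizer.
    public static ResourceOwner Create(TrackedResource resource, bool failDuringConstruction)
    {
        try
        {
            return new ResourceOwner(resource, failDuringConstruction);
        }
        catch
        {
            resource.Dispose();
            throw;
        }
    }
}
```

Because the constructor is private, this sketch only covers classes that are not meant to be derived from.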
May 11, 2014 at 3:31 PM
qrli wrote:
For the try-finally case you described, I used to investigate that for C++, which is really tricky. But for C#, usually the GC solves it in the end, so we do not need to take on the headache. However, your example with a File object caused me to rethink it, because delayed release by the GC has a side effect.
Code which relies upon finalizers for proper operation is generally broken. That the language fails to provide a proper way of handling failed object construction is IMHO a major language defect.
That being said, it is a rare case. If it really needs to be solved, I'd like to extend the declaration expression syntax with using. Namely, a using declaration expression:

public MyDerived(string fileName)
    : base((using var file = OpenFile(fileName)))
I'd think a different syntax should be used, to indicate that the using variable should only be disposed if there's an exception, rather than in case of successful completion.
PS: Without adding any new feature, the same issue can also be worked around by using a factory method.
For non-inheritable classes, sure. If one has a way of passing parameters to pre-chaining initializers, it's possible to have a factory method construct a DisposeManager which includes a List<IDisposable>, pass that to a constructor (which in turn passes it to each derived constructor), and have DisposeManager include a T Register<T>(T it) where T:IDisposable { _disposalList.Add(it); return it; } method. Unfortunately, the present C# language requires the use of ThreadStatic variables to make such a pattern work.
May 17, 2014 at 5:55 PM
It seems to me that primary constructor bodies create more problems than they solve. They provide no additional power of expression (over regular constructors), they hardly help conciseness, they are confusing (people will fall into the trap of thinking it's an initializer body) and can be very hard to find (unless the developer follows a strong style guide saying first fields, then constructor body, then other constructors, then everything else).

I suggest you consider dropping primary constructor bodies.
May 17, 2014 at 7:22 PM
KrisVDM wrote:
I suggest you consider dropping primary constructor bodies.
What would you think of the idea of simply saying that if a primary class parameter has the same name as a field, then field initializers would have to qualify the name with either params. or this. to identify whether the name was referring to the parameter or the field?

Also, what would you think of the idea of allowing exactly one constructor to call to base within its body, and requiring that all other constructors chain through that one and not write any read-only fields? That would both improve the ability of classes to prepare their own fields before chaining to base, and would also provide a way of indicating that any invariants established in that constructor would apply to all class instances (without having to search elsewhere in the class or, for partial classes, through all the files in a project, to ensure that there wasn't some other constructor which would do things differently).
May 18, 2014 at 1:14 AM
I have to agree. It seems that in a number of cases the C# team is proposing exceedingly simple syntax to cover some percentage of use cases, perhaps with an agenda to promote a style of programming, and then everyone flips out that that simplified syntax isn't complicated enough to cover the other percentage of use cases. Obviously normal constructors aren't going away and neither is any form of the syntax that compiles today so it's silly to try to make the new syntax as fully functional as the existing syntax, even if it may save a few additional lines of code.

Since read-only auto-properties have been extended to supporting normal constructors I honestly think that either primary constructors should be entirely dropped or relegated to a very specific form of immutable class, like a formal declaration for an anonymous type. The only things permitted would be field parameters, read-only auto-properties and expression-bodied members for calculated properties on those auto-properties. Like with anonymous types that class would automatically implement overrides for GetHashCode and Equals. These classes would then also fit in with the proposal for "record" classes and "matching" operators proposed here.
public class Rectangle(public int Width, public int Height)
{
    public int Area => Width * Height;
}

Rectangle rect = new Rectangle(25, 10);

if (rect matches Rectangle(25, var height))
{
    Console.WriteLine("Found a Rectangle with a Width of 25, and it has a height of \{height} and an area of \{rect.Area}.");
}
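
For comparison, a sketch of how similar code can be written in later C# versions (9.0 and up), using a positional record and a positional pattern. This is not a claim about what the 2014 proposal would have compiled to; RectangleDemo and Describe are hypothetical names used only for illustration:

```csharp
using System;

// Positional record: the compiler generates the Width and Height properties,
// value-based Equals/GetHashCode, and a Deconstruct method.
public record Rectangle(int Width, int Height)
{
    public int Area => Width * Height;
}

public static class RectangleDemo
{
    public static string Describe(Rectangle rect)
    {
        // Positional pattern: matches when Width is 25 and captures Height.
        if (rect is Rectangle(25, var height))
        {
            return $"Width 25, height {height}, area {rect.Area}";
        }
        return "No match";
    }
}
```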

May 18, 2014 at 6:45 PM
I have to agree. It seems that in a number of cases the C# team is proposing exceedingly simple syntax to cover some percentage of use cases, perhaps with an agenda to promote a style of programming, and then everyone flips out that that simplified syntax isn't complicated enough to cover the other percentage of use cases.
As I see it, there are two separate issues and goals here. One goal is to provide a means of using a short declaration to create types which meet certain patterns, rather than having to fill out all the boilerplate code necessary to implement such patterns. The pattern, generally speaking, is a data type which serves to aggregate named members with constructor-specified values. There are valid usage cases for the patterns being read-only fields, read-write fields, read-only properties, and read-write properties, and I see no reason not to accommodate arbitrary combinations of the above, since it would merely require a relatively simple "macro expansion".

Another goal, which might share a small part of the syntax but is otherwise entirely separate, is to make field initialization expressions, which run before the base constructor, more useful. Presently, a field declaration like protected readonly int[] arr = new int[20];, promises that--no matter what else any other code in the class may do--the field arr will never be seen as null, nor as identifying any array other than the one created by this declaration. It doesn't matter if the base constructor calls virtual methods that access the field, or if other constructors would want to set it to something else (the compiler won't let them). I would regard that as a very helpful promise--one much stronger and more useful than the watered down promise one would get with:
protected readonly int[] arr;

myType(int length) : base()
{
  arr = new int[length];
}
Here, there's no guarantee about whether virtual methods might observe arr as null, nor whether arr might get replaced with some other array. The amount of code one has to examine to be certain that arr will never be seen as anything other than the twenty-element array created by the earlier declaration is vastly smaller than the amount one would have to examine to achieve similar certainty in the latter situation.
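The difference between the two declarations above can be observed in current C#. In this minimal sketch, Base, Derived, Observe and the two flag fields are hypothetical names. The base constructor calls a virtual method, which sees the initializer-assigned field already populated but the body-assigned field still null:

```csharp
using System;

public abstract class Base
{
    public bool InitializedFieldSeenNull;
    public bool BodyAssignedFieldSeenNull;

    protected Base()
    {
        // Virtual call from a base constructor: the derived class's
        // constructor body has not run yet at this point.
        Observe();
    }

    protected abstract void Observe();
}

public class Derived : Base
{
    // Field initializers run before the base constructor is invoked.
    protected readonly int[] initialized = new int[20];

    // Assigned in the constructor body, which runs after the base constructor.
    protected readonly int[] assignedInBody;

    public Derived(int length) : base()
    {
        assignedInBody = new int[length];
    }

    protected override void Observe()
    {
        InitializedFieldSeenNull = initialized == null;
        BodyAssignedFieldSeenNull = assignedInBody == null;
    }
}
```

After new Derived(5), InitializedFieldSeenNull is false and BodyAssignedFieldSeenNull is true.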

The second goal of the primary constructor syntax, as I see it, is to expand the number of cases where the compiler would be able to make the former guarantee without requiring icky code. There are some rather nasty ways one could achieve such guarantees using present syntax and ThreadStatic variables, but having to say something like:
class myClassMidLayer : myClassBase
{
  [ThreadStatic] static threadStackHelper<int> intParams = new threadStackHelper<int>();

  int[] arr = new int[intParams["Length"]];

  internal myClassMidLayer(int length, ...params...) : base()
  {
     ... Remainder of constructor
  }
}
class myClass : myClassMidLayer
{
  myClass(int length, ...) : base(intParams.PushFrame().Def("Length", length), ...params...)
  {
     intParams.PopFrame();
  }
}
to achieve the proper semantics would be pretty incredibly hideous.
May 20, 2014 at 8:44 AM
Edited May 20, 2014 at 8:59 AM
I don't have much to add, as I'm not a language expert.... but as a run-of-the-mill developer who manages to use lambdas, async etc. and appreciates the other syntax changes very much to make things easier for me, this proposal confuses the heck out of me and I don't really see how it's going to save me that much time or, more importantly, reduce bugs in my day-to-day job.

Which, at the end of the day, isn't that the point of these updates to C# 6.0?

The only scenario I would like it would be in DTOs, which in another post someone has said that this:

public class DTO(int x, int y, List<int> z)

becomes auto-compiled to a full class with getters and setters. Now that is really useful. However, if we're going with the full "power version" described here then my vote is to drop it, as once it's in, you can't take it out.
May 21, 2014 at 11:48 PM
bunceg wrote:
I don't have much to add, as I'm not a language expert.... but as a run-of-the-mill developer who manages to use lambdas, async etc. and appreciates the other syntax changes very much to make things easier for me, this proposal confuses the heck out of me and I don't really see how it's going to save me that much time or, more importantly, reduce bugs in my day-to-day job.

Which, at the end of the day, isn't that the point of these updates to C# 6.0?

... if we're going with the full "power version" described here then my vote is to drop it ...
Oh, how i second that vote!! I'm confused by this initialization scope and i read part 1 and part 2 of this topic plus another brief that went over each of the new proposals.
this.myVar = myVar;
I don't understand why not stick with that. it's so clear. why confuse us? maybe the explanations were not clear. maybe i would want this feature. but i'm lost with how to tell which myVar would be which when you have different 'initialization scope'.

I love the idea to put underscores in numbers! that's just great!!
I love finally adding filtered exceptions to C#. not that it's a useful feature, but handicapping C# (the best .NET language) where VB and F# can do things that just aren't available in C# is a very poor design choice in my opinion.