
Primary Constructors--focus on field initializers

Topics: C# Language Design, VB Language Design
May 3, 2014 at 7:42 PM
A field declaration like private readonly Foo myFoo = new Foo(); has a wonderful advantage over any other kind of declaration: that line in and of itself is sufficient to guarantee that every access to field myFoo will always see a reference to the same instance of Foo, regardless of anything that constructors or virtual methods might do. In that regard, it is much superior to
private readonly Foo myFoo;

myThing(int someNumber) : base (someNumber)
{
  myFoo = new Foo(someNumber);
}
because the latter form does nothing to guarantee that myFoo won't be read before it is written. Further, one must examine every constructor throughout the entire class to know whether all of them will write it the same way.
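
A minimal sketch of the difference, using hypothetical types (ThingBase, InitializedThing, and AssignedThing are not from the post). It assumes a base constructor that calls a virtual method, which is the situation in which a read-before-write becomes observable:

```csharp
// Hypothetical base class whose constructor calls a virtual method.
class ThingBase
{
    protected ThingBase() { Observe(); }
    protected virtual void Observe() { }
}

// Field initializer: runs before the base constructor is invoked.
class InitializedThing : ThingBase
{
    private readonly object myFoo = new object();
    public bool SawNull;
    protected override void Observe() { SawNull = myFoo == null; }
}

// Constructor-body assignment: runs only after the base constructor returns.
class AssignedThing : ThingBase
{
    private readonly object myFoo;
    public bool SawNull;
    public AssignedThing() { myFoo = new object(); }
    protected override void Observe() { SawNull = myFoo == null; }
}
```

With these definitions, the override in InitializedThing sees the field already set, while the override in AssignedThing runs before the constructor body and sees null.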

Unfortunately, despite its huge advantages, the combined declare-and-initialize construct is unusable in most real-world situations because the proper content for a write-once field will usually depend upon things to which field initializers have no access.

I would suggest that the primary focus when designing enhanced forms of constructor should be to have a form of combined field declarations/initialization which makes the promises associated with the present ones, but is more widely usable. Although the existing method of initializing readonly fields in the main constructor body must remain for compatibility, I would suggest that it be used as seldom as possible in place of a much-more-statically-verifiable approach.

In particular, I would like to see a syntax for partial-constructor "methods" something like:
partial new partialMethodName( ...params...)
{
   method code and declarations;
}
and
partial late new partialMethodName( ...params...)
{
   method code and declarations;
}
Within a partial constructor, declarations with access specifiers would declare fields in the surrounding class. Non-late partial constructors would be forbidden from using methods or properties of this (see the later note regarding auto-properties). The combination of field declarations and partial-constructor calls made by any partial constructor must be identical for every non-exceptional code path, and any exception caught from partial-constructor calls must be unconditionally rethrown. Additionally, constructors must unconditionally cause every partial constructor to be invoked exactly once, and all constructors must invoke them in the same order. Partial constructors tagged late must be invoked after the base constructor; those not tagged must be invoked before it.

Such rules should allow a compiler to easily statically verify that each readonly variable is written exactly once and that, except for variables in "late" partial constructors, it cannot possibly be read before that. The assurance that variables cannot be observed before they are written would not be applicable to those declared in late partial constructors, but the late declaration would make the lack of assurance clear.

One slight conceptual difficulty I see would be with the inability to use properties within partial constructors, even in cases where the property to be used would have been statically recognizable as having been set. To ease that, I would suggest adding {... private var;} and {... private readonly var;} as options in non-virtual auto-property declarations. Such declarations would mean that within the class, the property name would be regarded as synonymous with the backing field.

A similar feature would be helpful in VB.NET, though the syntax would of course have to be different. Perhaps something like Partial Initializer Fred(...params...). Invocation of partial initializers before a chained call to New() or Base() should be pretty straightforward.
May 7, 2014 at 10:40 PM
What problem is this trying to solve?

I think the main point of readonly fields is to be sure that the value won't change during the object's lifecycle, but the constructor logic is usually simple enough and completely under your control.

Why is determining that all readonly fields are written before they are read, and not set more than once, an issue for you?
May 8, 2014 at 12:29 AM
Edited May 8, 2014 at 12:32 AM
Olmo wrote:
What problem is this trying to solve?
Two main problems:

First of all, the designers of C# thought it important to allow subclasses to initialize fields before chaining to a base constructor (which is why field initializers are processed then), but field initializers are not usable in cases that would require access to either constructor parameters or previously-initialized fields. Of the cases where early initialization is important, those where initial values would depend upon constructor parameters are the most important but least well-supported (not quite totally impossible, but impossible without a level of hideous hackery that most shops would likely consider unacceptable). If early initialization is still a design goal, the language should provide proper support for a very common use case: fields whose values depend either upon constructor parameters or upon values computed from those parameters.
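
For illustration, here is one workaround of this kind. It is an assumption about the sort of hack being alluded to, not something spelled out in the post, and ArrayHolder, pendingSize, and Create are hypothetical names. The sketch smuggles the constructor argument to the field initializers through a thread-static field set by a factory method:

```csharp
using System;

class ArrayHolder
{
    // Hypothetical side channel: carries the argument to the field initializers.
    [ThreadStatic] private static int pendingSize;

    // These run before the base constructor, but can only see static state.
    protected readonly int[] arr1 = new int[pendingSize];
    protected readonly int[] arr2 = new int[pendingSize];

    // Private so that every instance is forced through Create.
    private ArrayHolder() { }

    public static ArrayHolder Create(int size)
    {
        pendingSize = size;   // must be set before 'new' runs the initializers
        return new ArrayHolder();
    }

    public int Length1 => arr1.Length;
    public int Length2 => arr2.Length;
}
```

It works, but the dependency on size is invisible at the field declarations and relies on every creation path setting pendingSize first, which is arguably the kind of fragility the paragraph above objects to.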

Secondly and perhaps more importantly, if a type has multiple constructors, it should be possible to avoid duplication of common aspects. Presently, if a type has a readonly field whose value would depend upon constructor parameters or the values of other fields, the code to initialize that field must be repeated in every constructor body. Given:
public class Foo
{
  protected readonly int[] arr1, arr2;
  public Foo(int size) : base()
  {
    arr1 = new int[size];
    arr2 = new int[size];
  }
  ....
}
It would appear that arr1 and arr2 would always be identically-sized arrays, but the only way to be sure of that is to examine the body of every single constructor, and somehow ensure that every constructor that is ever written in future initializes those variables to identically-sized arrays.

By contrast, given
public class Foo
{
  partial new MakeArrays(int size)
  {
    protected readonly int[] arr1 = new int[size];
    protected readonly int[] arr2 = new int[size];
  }
  ...
  public Foo(int size) MakeArrays(size), base()
  {
  }
  ....
}
one could be certain that the class invariant of both arrays being the same size would always be upheld, and would even apply during execution of the base constructor.
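
For comparison, a sketch of how a similar guarantee can be approximated with the primary constructors that later shipped in C# 12. This assumes a C# 12 or later compiler, which postdates this thread, and SizedBase, ReportLength, and LengthSeenByBase are hypothetical names added so the effect is observable:

```csharp
// Hypothetical base class that reads derived state from its constructor.
class SizedBase
{
    public int LengthSeenByBase = -1;
    protected SizedBase() { LengthSeenByBase = ReportLength(); }
    protected virtual int ReportLength() => 0;
}

// C# 12 primary constructor: 'size' is in scope in the field initializers,
// and those initializers run before the base constructor.
class Foo(int size) : SizedBase
{
    protected readonly int[] arr1 = new int[size];
    protected readonly int[] arr2 = new int[size];

    // Any additional constructor has to chain to the primary one.
    public Foo() : this(4) { }

    // Returns the shared length, or -1 if the invariant does not hold.
    protected override int ReportLength() =>
        arr1.Length == arr2.Length ? arr1.Length : -1;
}
```

Because every other constructor must chain to the primary one, both arrays are created from the same size on every path, and the base constructor already observes them at their final length. This is not the partial-constructor syntax proposed above, only a related mechanism.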
I think the main point of readonly fields is to be sure that the value won't change during the object's lifecycle, but the constructor logic is usually simple enough and completely under your control.
A field which never has and never will identify anything other than a particular object may be regarded as synonymous with that object. If two such fields identify objects which share some immutable trait (such as array length), the trait values may be regarded as synonymous. If there is ever a time when an invariant may be observed not to hold, reasoning about a program becomes more difficult.

I'll agree that constructors are often simple, but that doesn't mean they always are. If a class has two or more constructors that differ enough that they need to have different bodies, being able to factor the common portions into a shared block of code will make it much easier to ensure that all expected invariants will always hold, without having to worry about the possibility that identical-looking chunks of code in the constructors don't quite match perfectly.

Further, while constructors are discouraged from calling virtual methods, such calls are legal and in some cases cannot be avoided. If a constructor is going to call a virtual method and a subclass might override it, the only way to ensure that an invariant will hold is to establish it before the base constructor call.
Why is determining that all readonly fields are written before they are read, and not set more than once, an issue for you?
If a compiler statically verifies that a field will never be written after it has been read, and a programmer observes that a field is written with a particular value, then someone inspecting that code can know that any read of that field will yield that value without having to examine any other code or determine when the read or write occurs. If a field is initialized from a partial initializer's parameters, that dependency will be clear. By contrast, if one constructor simply neglects to set a field, the fact that it may never be set would be far from obvious.
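
A small sketch of that last point, using a hypothetical Buffers class: the current language accepts a constructor that never assigns a readonly reference-type field, so the omission only shows up at run time.

```csharp
class Buffers
{
    private readonly int[] data;

    public Buffers(int size) { data = new int[size]; }

    // Compiles without error even though 'data' is never assigned here.
    public Buffers() { }

    // True only if the constructor that ran actually assigned the field.
    public bool HasData => data != null;
}
```

Nothing at the declaration of data indicates that the parameterless constructor leaves it null; a reader has to inspect every constructor to find out.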

Also, from what I understand, the .NET verifier won't allow a reference to an object instance under construction to be passed to any method, nor stored anywhere, prior to calling the base constructor. Putting the code which is shared among constructors into what looks like a method will improve legibility, but it's not possible to invoke a real instance method prior to chaining to the base constructor. The only way to allow that syntax is to have each constructor incorporate the code from all the partial constructors. While some aspects of the proposed restrictions could be relaxed, I think there's great value in knowing that if something can happen in a certain way, it will always happen in that way.