Alternative “primary constructor” idea

Topics: C# Language Design
Apr 8, 2014 at 10:30 AM
Edited Apr 8, 2014 at 1:45 PM
The following rather significant problems have been identified with the current design of primary constructors:
  • You can’t have argument validation (e.g. ArgumentNullException).
  • You can’t have more than one.
  • You can’t invoke any base constructor from any other constructor than the primary constructor. This means your entire class can rely on only one of the base class’s constructors.
  • It clutters up the class declaration line and it is visually far removed from all the other constructors. (This issue becomes even more significant if primary constructors can have base-constructor calls and/or entire bodies.)
Let’s look at an example:
class Person
{
    public string Name { get; private set; }
    public DateTime Birthdate { get; private set; }

    public Person(string name, DateTime birthdate)
    {
        if (name == null)
            throw new ArgumentNullException("name");
        Name = name;
        Birthdate = birthdate;
    }
}
In this example,
  • A primary constructor cannot be used because argument validation is required.
  • If primary constructors could have a body, that body would be lexically outside the class body, which is very weird.
  • Even with a primary constructor, both public properties still need to be typed out.
  • The type of each property is still typed twice (once in the constructor parameter declaration, once in the property declaration).

Proposal

I would like to propose the following alternative:
class Person
{
    public string Name { get; }         // Getter-only auto-properties
    public DateTime Birthdate { get; }

    public Person(this.Name, this.Birthdate)
    {
        if (Name == null)
            throw new ArgumentNullException("Name");
    }
}
The proposal is to add a syntax element where a constructor parameter can be this.X instead of <type> X. The meaning is that the property or field X is automatically assigned the value of the parameter, and the type of the parameter is automatically the type of the property or field.
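
To make the intended meaning concrete, here is a rough sketch of what the constructor above could be taken as shorthand for in C# that compiles today. The parameter names name and birthdate are placeholders invented for this sketch (under the proposal they would not be written out), and it assumes the assignments happen before the constructor body runs.

```csharp
using System;

class Person
{
    public string Name { get; }
    public DateTime Birthdate { get; }

    // Hand-written equivalent of: public Person(this.Name, this.Birthdate)
    public Person(string name, DateTime birthdate)
    {
        // Assumed to be generated from the this.X parameters.
        Name = name;
        Birthdate = birthdate;

        // The user-written body then validates through the member.
        if (Name == null)
            throw new ArgumentNullException("Name");
    }
}
```

With this reading, a failed validation throws after the assignment has already happened, which should be harmless here because the constructor never returns the object.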

Benefits:
  • Argument validation can still be done.
  • Base constructors can still be applied in the same familiar old syntax.
  • The type of each property only needs to be typed once.
  • You can have any number of such constructors which set different subsets of the fields/properties in the class and/or call different base constructors and/or perform different argument validations.
  • They are still “just” constructor parameters, which means that existing features of parameters (default values, custom attributes, params) can be re-used without modification.
Comments?
Apr 8, 2014 at 11:06 AM
I like this proposed feature a lot, though I feel there are some things that need to be addressed. These are not necessarily problems that make the feature unimplementable, but they still need to be considered:
  • When a field is initialized with a this parameter, does the corresponding field initializer run?
  • At what point are this parameter values assigned to their corresponding members? As far as I can tell, there are four possibilities:
    • At the very beginning of the ctor, before the field initializers (bad – unless you suppress individual initializers, you'll overwrite the this parameter values!)
    • After field initializers, but before the base/this ctor call (potentially bad if using : this(...) ctor call, which may overwrite the this parameter value)
    • After the base/this ctor call, but before the rest of the ctor body
    • At the very end of the ctor (might permit modification of the this parameters if necessary, but has problems)
  • Your example seems to indicate that Name (and presumably Birthdate) are available inside the ctor body. Do they refer to the parameters, or to the members on the class? If the former, what happens if you reassign the parameter? Maybe they should be readonly.
  • Can this parameters be used to initialize members in a base class? My intuitive reaction is to say no, that should be the responsibility of the base ctor, but maybe a compelling case can be made for it.
While I agree that the primary constructor is limited for exactly the reasons you've mentioned, it does have one additional benefit, if I understand that proposal correctly: it automatically declares the primary ctor parameters as fields on the class.

This proposal would also slightly complicate the grammar, in that constructor parameter lists would now be different from other parameter lists, but that may be a minor point.
Apr 8, 2014 at 12:53 PM
Edited Apr 8, 2014 at 1:11 PM
When a field is initialized with a this parameter, does the corresponding field initializer run?
No. The this.X parameter replaces the field initializer.
At what point are this parameter values assigned to their corresponding members?
After some deliberation and discussion, I think it should be before this/base constructor calls, primarily because that is the order in which it’s written.
Do they refer to the parameters, or to the members on the class?
I think the names should refer to the members, and the parameters should be inaccessible. This helps the “pit of success”: You can no longer accidentally assign to the parameter instead of the field and then expect another method to see the new value in the field.
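The mistake in question is easy to reproduce in today's C#. A minimal sketch (the Counter class is invented purely for illustration): the assignment below compiles, but it changes the parameter and leaves the field at its default value.

```csharp
class Counter
{
    private int count;

    public Counter(int count)
    {
        // Intended to set the field, but 'count' binds to the parameter here,
        // so the field is never written.
        count = count + 1;
    }

    public int Count { get { return count; } }
}
```

With this class, new Counter(5).Count is 0 rather than 6. If the name referred to the member, as suggested above, the same line would update the field.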
Can this parameters be used to initialize members in a base class?
My first instinct would have been yes, but after some consideration, I think I agree with you that the answer should be no, mostly because members in the base class would typically be initialised by a base constructor call.
it automatically declares the primary ctor parameters as fields on the class.
I actually see that as a downside. It means you cannot use the new syntax at all if you want a set of constructors none of which covers all the fields and only the fields.
This proposal would also slightly complicate the grammar, in that constructor parameter lists would now be different from other parameter lists
I don’t think it complicates the grammar any more than the existing design does, which augments the primary constructor’s parameter list with access modifiers for the fields (i.e. you can write class Point(public int X, public int Y, private bool Foo)).
Apr 8, 2014 at 12:57 PM
Another thing that just occurred to me: if a constructor declares a parameter this.X, it should probably be illegal to declare a local variable named X in the constructor. E.g.:
public Thing(this.Foo)
{
    var Foo = 4.2; // Error: 'Foo' is already used to mean something else
}
public Thing(this.Bar)
    : base(out var Bar) // Likewise
{ }
Apr 8, 2014 at 2:06 PM
I just want to point out that you definitely can do argument validation in the current design.
public class C(object x) {
    public object X { get; } = Argument.NotNull("x", x);
}
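Argument.NotNull is not defined in this thread, so the following is only a guess at a minimal stand-in, with the signature inferred from the call above. It is shown with an ordinary constructor so that it compiles without the preview syntax.

```csharp
using System;

// Hypothetical helper: returns the value unchanged, or throws if it is null.
static class Argument
{
    public static T NotNull<T>(string name, T value) where T : class
    {
        if (value == null)
            throw new ArgumentNullException(name);
        return value;
    }
}

class C
{
    public object X { get; }

    public C(object x)
    {
        // Validation and assignment in a single expression.
        X = Argument.NotNull("x", x);
    }
}
```

Because the helper returns its argument, the same expression could sit in a property initializer, which is what makes it usable with a primary constructor.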
Apr 8, 2014 at 2:13 PM
Interesting, ashmind, thanks for pointing that out. This way of validating arguments, however, is completely different from the usual one, so I'm not sure I really like that. On the other hand, it might just take some getting used to. But anyway, having a real body for primary constructors would allow you to do all the things that you can do with regular constructors, including argument validation. So I'd clearly prefer that.

By the way, the syntax proposed by Timwi above is, in my opinion, in many ways superior to the current design in the Roslyn preview. It's not as closely aligned to the F# syntax as the current implementation, but I don't think that is a necessity. A slightly enhanced regular constructor syntax like the above seems to be a better fit for the overall design of C#.
Apr 8, 2014 at 5:06 PM
Having a constructor parameter this.X implicitly define a field X seems somewhat ad hoc; it would seem cleaner to say that if a class specifies constructor parameters in its signature, then any constructor which does not chain to another of the same class must have parameters of matching names and types (but possibly others as well). As another shorthand, it might be useful to allow members specified in the "class signature" to include storage class specifiers along with a property declaration, so
class Foo(int variableOnly, private int privateField, public int publicProperty {get;})
{
   int someOtherField = variableOnly*2;
   string name;
   Foo(string name, int variableOnly, int privateField, int publicProperty)
   {
     this.name = name;
   }
}
would define fields "privateField", "someOtherField", and "name" as well as a backing field for "publicProperty". The identifier "variableOnly" would not have an associated field, and would only be usable within field initializations or constructors. If no constructor was specified other than the type's signature, a default constructor would be generated which was equivalent to one with matching signature and no body.
Apr 9, 2014 at 5:21 PM
Edited Apr 9, 2014 at 5:26 PM
I have to say I much prefer your proposal because it feels more like idiomatic C#... the current primary constructor syntax only works when no overloads are specified and is going to get confusing when mixed in with generics and constraints.

Not to mention that it is completely unclear how the current syntax is going to work with protected / private constructors.
class MyClass<T1, T2, T3> : BaseClass<T1, T2>, IInterface<T3>
where T1: IEnumerable<T1>, new()
where T2: IOtherInterface
(int foo, T2 bar, T3 Interesting)
{
  ...
}
Yikes.
class MyClass<T1, T2, T3> : BaseClass<T1, T2>, IInterface<T3>
where T1: IEnumerable<T1>, new()
where T2: IOtherInterface
{
   private int foo;
   protected T2 bar;
   public T3 Interesting;

   public MyClass(this.foo, this.bar, this.Interesting) : base(foo)
   {
   }
}
With your syntax we could keep on doing everything we are used to:
class MyClass<T1, T2, T3> : BaseClass<T1, T2>, IInterface<T3>
where T1: IEnumerable<T1>, new()
where T2: IOtherInterface
{
   private int foo;
   protected T2 bar;
   public T3 Interesting;

   protected MyClass(this.foo, this.bar, this.Interesting) : base(foo)
   {
   }

   public MyClass(this.Interesting, this.bar) : this(42, bar, Interesting) { ... }
}
Apr 9, 2014 at 6:45 PM
I like the current design of primary constructors better. This feature is entirely about syntax sugar: it's not going to force you to write your classes differently going forward or to rewrite old classes. For writing quick DTOs, the primary constructor syntax expresses the type much more succinctly than is currently possible. Obviously, it's not going to work for every class or every situation and could be abused. But the same could be said of any new feature added to the language.

The alternative you propose barely saves any verbosity. F# can declare a type on a single line; the closer we can get to that in C# without all the added verbosity, the better.
Apr 9, 2014 at 9:30 PM
Just to check my understanding, we are talking about
class Point(int x, int y, private Color color)
{
    public int X{get;} = x;
    public int Y{get;} = y;
}

// Characters: 102
vs
class Point
{
    public int X{get;}
    public int Y{get;}
    private Color color;

    public Point (this.X, this.Y, this.color)
    {
    }
}

// Characters: 137
So a difference of 35 characters is worth introducing implicit variable capture and a second constructor syntax?

For a simple point with x, y both public which is about the simplest type worth having:
class Point
{
    public int X{get;}
    public int Y{get;}

    public Point (this.X, this.Y) {}
}
class Point(int x, int y)
{
    public int X{get;} = x;
    public int Y{get;} = y;
}
It would save 12 characters...

I would argue that such a small character difference is a fair price to pay for something that continues to read like the C# we are used to and brings all of the benefits.
Apr 9, 2014 at 9:36 PM
I just realised that this new syntax also lends itself well to IntelliSense: once you have defined the fields, creating the constructor could mostly be left to IntelliSense...
cl Point{pu int X_Coordinate{get;}
pu int Y_Coordinate{get;}

pu P(th.X,th.Y){}}
Would probably be all you would have to type to get
class Point
{
    public int X_Coordinate{get;}
    public int Y_Coordinate{get;}

    public Point (this.X_Coordinate, this.Y_Coordinate) {}
}
Apr 9, 2014 at 9:42 PM
Edited Apr 9, 2014 at 9:45 PM
@dgkimpton
To extend your line of reasoning, then what's wrong with:
class Point
{
    public int X{get;}
    public int Y{get;}

    public Point (int x, int y) {X = x; Y = y;}
}

This is even more like the C# you're used to (in fact it's identical) and it's only 11 characters more. The whole point of adding the new primary constructor syntax is to allow very terse declarations of classes. Personally, I've always hated the decision to have constructors be declared by rewriting the type name (I think TypeScript gets it right here). Forcing you to continue to write the constructor body for no other reason than to "look like existing C#" seems very backward-looking to me.

And as far as IntelliSense generating the constructor for you goes, R# can already generate a constructor that assigns all of the properties. It's nice, but the language still fights against you by making you update things in three places as requirements change. I don't think "ease of autocompleting boilerplate I wish I didn't have to write" is a point in favor of a feature.
Apr 9, 2014 at 9:48 PM
MgSam wrote:
This feature is entirely about syntax sugar
If constructor arguments were available to field initializers, that would be more than just syntactic sugar. It would allow a class to do something like:
protected int[] myArray = new int[size];
and have myArray initialized before the base constructor runs, something which is otherwise not possible. There is no technical impediment to a language making constructor parameters available to field initializers; the ability to do so should IMHO be the primary motivation behind going beyond the existing constructor design.
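The ordering this relies on can be observed in current C#: field initializers of a derived class run before the base constructor, while assignments in the derived constructor body run after it. A small sketch, with class and member names invented for the demonstration:

```csharp
using System;

class Base
{
    public string SeenByBase { get; private set; }

    protected Base()
    {
        // Virtual call from the base constructor: the derived constructor
        // body has not run yet at this point.
        SeenByBase = Describe();
    }

    protected virtual string Describe() { return "base"; }
}

class Derived : Base
{
    private readonly string fromInitializer = "initializer"; // runs before Base()
    private readonly string fromBody;                         // assigned after Base()

    public Derived(string value)
    {
        fromBody = value;
    }

    protected override string Describe()
    {
        return fromInitializer + "/" + (fromBody ?? "null");
    }
}
```

Here new Derived("x").SeenByBase is "initializer/null": the virtual call made from the base constructor sees the initialized field but not the one assigned in the body. Letting initializers use constructor parameters would allow the second field to be ready at that point as well.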
Apr 9, 2014 at 9:57 PM
Edited Apr 9, 2014 at 9:59 PM
I grant that the one thing the proposed syntax doesn't allow is the implicit capture of fields... but how about we look at it another way:

Implicit generation of a default constructor?
class Point
{
    public = int X_Coordinate{get;}
    public = int Y_Coordinate{get;}
    private = Color color;
}
would automatically generate a constructor in the same way as the current default constructor:
    public Point (int X_Coordinate, int Y_Coordinate, Color color)
    {
        this.X_Coordinate = X_Coordinate;
        this.Y_Coordinate = Y_Coordinate;
        this.color = color;
    }
Now you have the ability to write an extremely terse definition if all you want to do is initialise some fields, and a much more expressive form if you also want to do more advanced initialisation. And all of them only require specifying the type once.

The exact syntax could vary; some examples below:
class Point
{
    = public int X_Coordinate{get;}
    public int Y_Coordinate{get;}=;
    public = int Y_Coordinate{get;}
}
It would be super cool if we could lose the superfluous ; inside the property scope as well:
class Point
{
    = public int X_Coordinate{get}
    = public int Y_Coordinate{get}
    = private Color color;
}
And maybe allow chaining?
class Point
{
    = public int X_Coordinate{get}, Y_Coordinate{get}
    = private Color color;
}
{edit}
Which for a traditional Point class would yield:
class Point
{
    = public int X{get}, Y{get}
}
Apr 9, 2014 at 10:12 PM
Edited Apr 9, 2014 at 10:23 PM
Combined with the originally introduced = at the end for validation and importing of static classes, you could get to something neat although admittedly a little harder to read because it is so terse.
public class Point
{
    = public int X {get} = Max(0, X);
    = public int Y {get} = Max(0, Y);
    = private int R;
    = private int G;
    = private int B;

    public Color Color {get{return FromArgb(R,G,B);}}
    public int Area {get{return Y*X;}}
}
{edit}
To bring this back to the original proposal they combine nicely:
public class Point
{
    = public int X {get} = Max(0, X);
    = public int Y {get} = Max(0, Y);
    private int R = 0;
    private int G = 0;
    private int B = 0;

    public Point(this.X, this.Y, this.R, this.G, this.B) {}

    public Color Color {get{return FromArgb(R,G,B);}}
    public int Area {get{return Y*X;}}
}
So now you have two constructors
new Point(3, 4);
new Point(3, 4, 255, 128, 0);


Just throwing some ideas out there.

I think personally that if you want validation of arguments there is nothing wrong with just defining the constructor... so for me I think the pinnacle might be something like:
public class Point
{
    = public int X {get}, int Y {get}
    private int R = 0, G = 0, B = 0;

    public Point(this.X, this.Y, this.R, this.G, this.B) {}

    public Color Color {get{return FromArgb(R,G,B);}}
    public int Area {get{return Y*X;}}
}
Using member initialisation would also easily allow for default parameter values:
public class Point
{
    = public int X {get}, int Y {get}
    = private int R = 0, G = 0, B = 0, A=255;

    public Color Color {get{return FromArgb(A,R,G,B);}}
    public int Area {get{return Y*X;}}
}