C# Language Design Notes for Aug 27, 2014 (Part II)

Topics: C# Language Design
Coordinator
Aug 29, 2014 at 12:40 AM
Edited Aug 29, 2014 at 12:42 AM
This is Part II. Part I is here.

Generated parameterless constructors

The current rule is that initializers are only allowed if there are constructors that can run them. This seems reasonable, but look at the following code:
struct S 
{
    string label = "<unknown>";
    bool pending = true;
    public S(){}
    …
}
Do we really want to force people to write that trivial constructor? Had this been a class, they would just have relied on the compiler-generated default constructor.

It is probably desirable to at least do what classes do and generate a default constructor when there are no other constructors. Of course we wouldn’t generate one when there are no initializers either: that would be an unnecessary (and probably slightly breaking) change over what we do today, as the generated constructor would do exactly the same as the default new S() behavior anyway.
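For comparison, here is a minimal sketch of the class behavior mentioned above. The type `Job` and its members are hypothetical names chosen for illustration and do not come from the notes.

```csharp
using System;

// Hypothetical class, used only to illustrate the class behavior the notes refer to.
class Job
{
    // No constructor is written by hand. The compiler generates a default
    // constructor, and these field initializers run as part of it.
    public string Label = "<unknown>";
    public bool Pending = true;
}

static class Program
{
    static void Main()
    {
        var job = new Job();
        Console.WriteLine(job.Label);    // prints "<unknown>"
        Console.WriteLine(job.Pending);  // prints "True"
    }
}
```

Option 2 below would give a struct with initializers and no explicit constructors the same treatment.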

A question, though, is whether we should generate a parameterless constructor to run initializers even if there are also parameterful ones. After all, don’t we want to ensure that initializers get run in as many cases as possible?

This seems somewhat attractive, though it does mean that a struct with initializers doesn’t get to choose not to have a generated parameterless constructor that runs the initializers.

Also, in the case that there’s a primary constructor it becomes uncertain what it would mean for a parameterless constructor to run the initializers: after all they may refer to primary constructor parameters that aren’t available to the parameterless constructor:
struct Name(string first, string last)
{   
    string first = first;
    string last = last;
}
How is a generated parameterless constructor supposed to run those initializers? To make this work, we would probably have to make the parameterless constructor chain to the primary constructor (all other constructors must chain to the primary constructor), passing default values as arguments.

Alternatively we could require that all structs with primary constructors also provide a parameterless constructor. But that kind of defeats the purpose of primary constructors in the first place: doing the same with less code.

In all we seem to have the following options:
  1. Don’t generate anything. If you have initializers, you must also provide at least one constructor. The only change from today’s design is that one of those constructors can be parameterless.
  2. Only generate a parameterless constructor if there are no other constructors. This most closely mirrors class behavior, but it may be confusing that adding an explicit constructor “silently” changes the meaning of new S() back to zero-initialization. (The above diagnostic would help with that, though).
  3. Generate a parameterless constructor only when there is not a primary constructor and
    a. Still fall back to zero-initialization for new S() in this case
    b. Require a parameterless constructor to be explicitly specified
    This seems to introduce an arbitrary distinction between primary constructors and other constructors that prevents easy refactoring back and forth between them.
  4. Generate a parameterless constructor even when there is a primary constructor
    a. using default values and/or
    b. some syntax to provide the arguments as part of the primary constructor
    This seems overly magical, and again treats primary constructors more differently than was the intent with their design.

Conclusion

This is a hard one, and we didn’t reach agreement. We probably want to do at least option 2, since part of our goal is for structs to become more like classes. But we need to think more about the tradeoffs between that and the more drastic (but also more helpful?) approaches.

Definite assignments for imported structs

Unlike classes, private fields in structs do need to be observed in various ways on the consumer side – they cannot be considered entirely an implementation detail.

In particular, in order to know if a struct is definitely assigned we need to know if its fields have all been initialized. For inaccessible fields, there is no sure way to do that piecemeal, so if such inaccessible fields exist, the struct-consuming code must insist that the struct value as a whole has been constructed or otherwise initialized.
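A small sketch of what "piecemeal" means here, assuming a struct whose fields are all accessible to the consumer. The type `Point` and its fields are hypothetical.

```csharp
using System;

// Hypothetical struct with only accessible fields.
struct Point
{
    public int X;
    public int Y;
}

static class Program
{
    static void Main()
    {
        Point p;      // declared, but not yet definitely assigned
        p.X = 1;
        p.Y = 2;      // every field is now assigned, so p counts as definitely assigned

        Point copy = p;                       // allowed: p can be used as a whole value
        Console.WriteLine(copy.X + copy.Y);   // prints 3
    }
}
```

If `Point` also had a private field, consuming code could not assign it this way, so the only way to make `p` definitely assigned would be to initialize it as a whole, for example with `new Point()` or `default(Point)`.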

So the key is to know if the struct has inaccessible fields. The native compiler had a long-running bug that would cause it to check imported structs for inaccessible fields only where those fields were of value type! So if the struct had only a private field of a reference type, the compiler would fail to ensure that it was definitely assigned.

In Roslyn we started out implementing the specification, which was of course stricter and turned out to break some code (that was buggy and should probably have been fixed). Instead we then went to the opposite extreme and just stopped ensuring definite assignment of these structs altogether. This led to its own set of problems, primarily in the form of a new set of bugs that went undetected because of the laxer rules.

Ideally we would go back to implementing the spec. This would break old code, but have the best experience for new code. If we had a “quirks” mode approach, we could allow e.g. the lang-version flag to be more lax on older versions. Part of migrating a code base to the new version of the language would involve fixing this kind of issue.

Conclusion

Unfortunately we do not have the notion of a quirks mode. Like a number of issues before, this one alone does not warrant introducing one – after all, it is a new kind of upgrade tax on customers. We should compile a list of things we would do if we had a quirks mode, and evaluate if the combined value would be enough to justify it.

Definite assignment for structs should be on that list. In the meantime however, the best we can do is to revert to the behavior of the native compiler, so that’s what we’ll do.
Aug 29, 2014 at 2:49 AM
I wrote a comment on the other message before seeing this one. I would favor requiring a struct have a parameterless constructor if it has field initializers, so that the field initializers have something clear to "attach to", with a couple of exceptions:
  1. If a field initializer simply sets a field to its type-default value as a means of expressly indicating that the field may be read without having been written with anything else, such an initializer wouldn't need to do anything that wouldn't already be done for default(T), thus wouldn't have to attach to anything.
  2. If a field initializer sets a field to a value which is passed to the primary constructor, then it should only be "expected" to run in cases where that primary constructor is called.
On a related note, would there be any technical difficulty with allowing compile-time constants to be created of structure types which have entirely public fields, or fields which are initialized directly from passed-in parameters? Given:
struct Point3d(double x, double y, double z)
{
  public double X=x,Y=y,Z=z;
}
the value of new Point3d(1,2,3); should be recognizable as being the concatenation of the byte values of 1.0d, 2.0d, and 3.0d.
Sep 2, 2014 at 7:42 PM
I'd vote for option 2. I think with the diagnostic this is the best path forward. I think making structs behave more like classes is a good goal.

That being said, I think changing struct behavior should be pretty low priority, behind all the features you have listed on the feature status page. People generally already know how structs work and their quirks; I don't think it's worth spending precious design minutes slightly improving an area of the language that isn't used super-often and isn't a particularly big pain point.
Sep 3, 2014 at 6:49 AM
MgSam wrote:
... that isn't used super-often and isn't a particularly big pain point.
That might be your experience. I for one often use structs (game development) and I have often been annoyed that I can't define a default constructor, requiring some ugly work-arounds.

So I really appreciate this getting fixed. It's a small improvement, but an important one nonetheless. Now if only type parameters could be restricted to pointer types...
Sep 3, 2014 at 4:32 PM
Edited Sep 3, 2014 at 4:33 PM
Expandable wrote:
That might be your experience. I for one often use structs (game development) and I have often been annoyed that I can't define a default constructor, requiring some ugly work-arounds.
Out of curiosity, outside of the generic new T(), which existing compiled C# code won't handle correctly when type T is a struct with a parameterless constructor, in what cases is new structName() significantly better than structType.staticFactoryMethod(); or structType.staticFactoryMethod(out structType);? Note that a statement var foo = new structType(params); may be arbitrarily translated as either
var foo;
structType.ctor(out foo, params);
or
var temp=default(structType);
structType.ctor(out temp, params);
foo=temp;
but I don't see the fact that the compiler may sometimes use the former version as being a real "advantage".
Sep 3, 2014 at 4:51 PM
@supercat: So are you questioning the existence of constructors in general? You could do the same thing with classes as well. Are you suggesting that not using constructors is the preferred style? That static factory methods are "better" in some sense? Sure you can do that, and occasionally static factory methods are really useful. But constructors are how C# thinks that object construction should work, for both classes and structs. It's just that for structs, it didn't really work that way until now. But it will, if the proposed changes make it into the language. So is it "significantly better", as you ask? Probably not. But then again most features of the language are just syntactic sugar for something else.

The more I think about the proposed changes to struct construction, the more I come to like them. It makes structs indeed very much like classes. Sure the default constructor is not called when allocating an array of structs, but neither is the default constructor of classes. It won't change the fact that you have to know what you're doing when using structs (especially mutable ones), and the suggestion to use classes instead of structs by default won't change because of that. But if you decide to use structs, these changes are useful on more than one occasion.

I don't understand what you mean by "which existing compiled C# code won't handle correctly" and why you think I only need default constructors in conjunction with generics. Personally, I don't care about the new T() issue at all. I'll just recompile my entire code base with the new C# compiler; problem solved. And that's probably what's going to happen anyway to most libraries out there, especially ones imported via NuGet.
Sep 4, 2014 at 8:42 AM
supercat wrote:
in what cases is new structName() significantly better than structType.staticFactoryMethod(); or structType.staticFactoryMethod(out structType);?
Object initializers. You cannot do that with factory methods, either class or struct.
Sep 4, 2014 at 8:59 AM
Edited Sep 4, 2014 at 9:00 AM
BachratyGergely wrote:
Object initializers. You cannot do that with factory methods, either class or struct.
Good point! I knew there was something else but I just couldn't think of it :) If they add event initializers (which is planned for C# 6, but there hasn't been any update on this yet, as far as I am aware), object creation via constructors will have even more benefits over static factory methods.
Sep 4, 2014 at 11:47 PM
BachratyGergely wrote:
Object initializers. You cannot do that with factory methods, either class or struct.
Won't the ability to declare variables within an expression allow the equivalent:
myStruct = (var foo=StructType.CreateDefault(); foo.x=this; foo.y=that; foo);
Of course, that would only be needed if a factory method didn't already set all the fields as desired; otherwise one could use:
myStruct = StructType.Create(x := this, y := that);
Nov 21, 2014 at 3:11 PM
Wouldn't such a feature be confusing and subject to abuse? Isn't having a sensible default value a requirement for structs?
I think it would be too easy for someone to write code like this and expect it to work fine:
public struct Month
{
    int m;
    public Month()   {  this.m = 1;  }
    public Month(int value)
    {
        Contract.Requires(value >= 1 && value <= 12);
        this.m = value;
    }
    public override string ToString()
    {
        // melt the cpu and nuke the device if month is 0
        ...
    }
    ...
}
The author of this struct would expect the value of m to be always between 1 and 12, which is of course not the case.
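A minimal sketch of how a zero value can still appear without any constructor running. It assumes a simplified version of the struct above: the parameterless constructor is left out so the sketch does not depend on the proposed feature, the Contract.Requires call is replaced with a plain exception, and the `Value` property is a hypothetical addition for printing.

```csharp
using System;

public struct Month
{
    int m;

    public Month(int value)
    {
        // Plain range check standing in for the Contract.Requires call.
        if (value < 1 || value > 12)
            throw new ArgumentOutOfRangeException(nameof(value));
        this.m = value;
    }

    // Hypothetical accessor, added only so the field can be observed.
    public int Value { get { return m; } }
}

static class Program
{
    static void Main()
    {
        Month fromDefault = default(Month);   // no constructor runs
        Month[] months = new Month[3];        // elements are zero-initialized

        Console.WriteLine(fromDefault.Value); // prints 0
        Console.WriteLine(months[0].Value);   // prints 0
    }
}
```

Both paths produce a Month whose m is 0, whatever the constructors enforce.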