C# Language Design Notes for Aug 27, 2014 (Part I)

Topics: C# Language Design
Aug 29, 2014 at 12:41 AM

C# Design Notes for Aug 27, 2014

Notes are archived here.

This is Part I. Part II is here.

Agenda

The meeting focused on rounding out the feature set around structs.
  1. Allowing parameterless constructors in structs <allow, but some unresolved details>
  2. Definite assignment for imported structs <revert to Dev12 behavior>

Parameterless constructors in structs

Unlike classes, struct types cannot declare a parameterless constructor in C# and VB today. The reason is that the syntax new S() in C# has historically been reserved for producing a zero-initialized instance of the struct. VB.Net has always had an alternative syntax for that (Nothing) and C# 2.0 also added one: default(T). So the new S() syntax is no longer necessary for this purpose.
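As a small illustration of the equivalence described above, the sketch below compares the two spellings on a struct that declares no constructors. Point and ZeroInitDemo are hypothetical names used only for this example.

```csharp
using System;

// Hypothetical struct for illustration; it declares no constructors.
public struct Point
{
    public int X;
    public int Y;
}

public static class ZeroInitDemo
{
    // Returns true when new Point() and default(Point) are the same all-zero value.
    public static bool NewEqualsDefault()
    {
        Point a = new Point();     // historical meaning: zero-initialize
        Point b = default(Point);  // alternative spelling added in C# 2.0
        return a.X == 0 && a.Y == 0 && a.Equals(b);
    }
}
```

For a struct like this one, either spelling can be used interchangeably.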

It is possible to define parameterless constructors for structs in IL, but none of C#, VB, or F# allows you to. All three languages have mostly sane semantics when consuming one, though: in general, new S() calls the constructor instead of zero-initializing the struct (except in some corner cases visited below).

Not being able to define parameterless constructors in structs has always been a bit of a pain, and now that we’re adding initializers to structs it becomes outright annoying.

Conclusion

We want to add the ability to declare explicit public parameterless constructors in structs, and we also want to think about reducing the number of occurrences of new S() that produce a default value. In the following we explore details and additional proposals.

Accessibility

C#, VB and F# will all call an accessible parameterless constructor if they find one. If there is one, but it is not accessible, C# and VB will backfill default(T) instead. (F# will complain.)

It is problematic to have successful but different behavior of new S() depending on where you are in the code. To minimize this issue, we should make it so that explicit parameterless constructors have to be public. That way, if you want to replace the “default behavior” you do it everywhere.

Conclusion

Parameterless constructors will be required to be public.

Compatibility

Non-generic instantiation of structs with (public) parameterless constructors does the right thing in all three languages today. With generics it gets a little more subtle. All structs satisfy the new() constraint. When new T() is called on a type parameter T, the compiler should generate a call to Activator.CreateInstance – and in VB and F# it does. However, C# tries to be smart, discovers at runtime that T is a struct (if it doesn’t already know from the struct constraint), and emits default(T) instead!
public T M<T>() where T: new() { return new T(); }
Clearly we should remove this “optimization” and always call Activator.CreateInstance in C# as well. This is a bit of a breaking change, in two ways. Imagine the above method is in a library:
  1. The more obvious – but also more esoteric – break is if people today call the library with a struct type (written directly in IL) which has a parameterless constructor, yet they depend on the library not calling that parameterless constructor. That seems extremely unlikely, and we can safely ignore this aspect.
  2. The more subtle issue is if such a library is not recompiled as we start populating the world with structs with parameterless constructors. The library is going to be wrongly not calling those constructors until someone recompiles it. But if it’s a third party library and they’ve gone out of business, no-one ever will.
We believe even the second kind of break is relatively rare. The new() constraint isn’t used much. But it would be nice to validate.
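For reference, the call that the notes say the compiler should generate can also be written by hand. The sketch below is only an illustration of that call, not of what any particular compiler version emits; Factory and Widget are hypothetical names.

```csharp
using System;

// Hypothetical reference type whose field initializer runs as part of construction.
public class Widget
{
    public int Size = 7;
}

public static class Factory
{
    // Spells out Activator.CreateInstance<T>() explicitly instead of
    // relying on whatever the compiler generates for new T().
    public static T Create<T>() where T : new()
    {
        return Activator.CreateInstance<T>();
    }
}
```

Calling Factory.Create<int>() yields 0, and Factory.Create<Widget>() yields an instance whose Size is 7, since the parameterless constructor has run.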

Conclusion

Change the codegen for generic new T() in C# but try to validate that the pattern is rare in known code.

Default arguments

For no good reason C# allows new expressions for value types to occur as default arguments to optional parameters:
void M(S s = new S()){ … }
This is one place where we cannot (and do not) call a parameterless constructor even when there is one. This syntax is plain bad. It suggests one meaning but delivers another.

We should do what we can (custom diagnostic?) to move developers over to default(S) with existing types. More importantly, we should not allow this syntax at all when S has a parameterless constructor. This would be a slight breaking change for the vanishingly rare IL-authored structs that have one today, but so be it.
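A minimal sketch of the preferred spelling, assuming a hypothetical struct Options with no constructors (the names Options and DefaultArgDemo are illustrative only):

```csharp
// Hypothetical struct for illustration.
public struct Options
{
    public int Level;
}

public static class DefaultArgDemo
{
    // default(Options) says what actually happens: the caller receives
    // the zero-initialized value and no constructor is involved.
    public static int ReadLevel(Options options = default(Options))
    {
        return options.Level;
    }
}
```

Calling DefaultArgDemo.ReadLevel() with no argument returns 0.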

Conclusion

Forbid new S() in default arguments when S has a parameterless constructor, and consider a custom diagnostic when it doesn’t. People should use default(S) instead.

Helpful diagnostic

In general, with this change we are trying to introduce more of a distinction between default values and constructed values for structs. Today it is very blurred by the use of new S() for both meanings.

Arguably the use of new S() to get the default value is fine as long as S does not have any explicit constructors. It can be viewed a bit like making use of the default constructor in classes, which gets generated for you if you do not have any explicit constructors.

The confusion is when a struct type “intends” to be constructed, by advertising one or more constructors. Provided that none of those is parameterless, new S() still creates an unconstructed default value. This may or may not be the intention of the calling code. Oftentimes it would represent a bug where they meant to construct it (and run initializers and so forth), but the lack of complaint from the compiler caused them to think everything was all right.

Occasionally a developer really does want to create an uninitialized value even of a struct that has constructors. In those cases, though, their intent would be much clearer if they used the default(S) syntax instead.

It therefore seems that everyone would be well served by a custom diagnostic that would help “clear up the confusion” as it were, by
  • Flagging occurrences of new S() where S has constructors but not a parameterless one
  • Offering a fix to change to default(S), as well as fixes to call the constructors
This would help identify subtle bugs where they exist, and make the developer’s intent clearer when the behavior is intentional.
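The kind of bug such a diagnostic might catch can be sketched as follows. Ratio is a hypothetical struct whose only constructor establishes a non-zero Denominator; new Ratio() compiles without complaint but bypasses it.

```csharp
// Hypothetical struct that "intends" to be constructed:
// its only constructor guarantees Denominator == 1.
public struct Ratio
{
    public readonly int Numerator;
    public readonly int Denominator;

    public Ratio(int numerator)
    {
        Numerator = numerator;
        Denominator = 1;
    }
}

public static class DiagnosticDemo
{
    // new Ratio() runs no constructor here, so Denominator is left at 0.
    public static int DenominatorOfNew()
    {
        Ratio r = new Ratio();
        return r.Denominator;
    }

    // The declared constructor runs, so Denominator is 1.
    public static int DenominatorOfConstructed()
    {
        return new Ratio(3).Denominator;
    }
}
```

Under the proposal, the first method is where the diagnostic would fire, with fixes to write default(Ratio) or to call Ratio(int) instead.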

The issue of course is how disruptive such a diagnostic would be to existing code. Would it be so annoying that they would just turn it off? Also, is the above assumption correct, that the occurrence of any constructor means that the library author intended for a constructor to always run?

Conclusion

We are cautiously interested in such a diagnostic, but concerned that it would be too disruptive. We should evaluate its impact on current code.

Chaining to the default constructor when there’s a parameterless one

A struct constructor must ensure that the struct is definitely assigned. It can do so by chaining to another constructor or by assigning to all fields.

For structs with auto-properties there is an annoying fact: you cannot assign to the underlying field, because its name is hidden, and you cannot assign through the setter, because you are not allowed to invoke a function member until the whole struct is definitely assigned. Catch-22!

People usually deal with this today by chaining to the default constructor – which will zero-initialize the entire struct. If there is a user-defined parameterless constructor, however, that will not work. (Especially not if that is the constructor you are trying to implement!)

There is a workaround. Instead of writing
S(int x): this() { this.X = x; }
You can make use of the fact that in a struct, this is an l-value:
S(int x) { this = default(S); this.X = x; }
It’s not pretty, though. In fact it’s rather magical. We may want to consider adding an alternative syntax for zero-initializing from a constructor; e.g.:
S(int x): default() { this.X = x; }
However, it is also worth noting that auto-properties themselves are evolving. You can now directly initialize their underlying field with an initializer on the auto-property. And for getter-only auto-properties, assignment in the constructor will also directly assign the underlying field. So maybe the problem just isn’t there any longer. You can just zero-initialize the auto-properties directly:
public int X { get; set; } = 0;
Now the definite assignment analysis will be happy when you get to running a constructor body.
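For comparison, the two existing approaches from earlier in this section (chaining with this() and assigning to this) can be sketched side by side on a hypothetical struct Size that has no user-defined parameterless constructor:

```csharp
// Hypothetical struct with auto-properties, for illustration only.
public struct Size
{
    public int Width { get; set; }
    public int Height { get; set; }

    // Chains to the default constructor, which zero-initializes the
    // whole struct, so the setter may be invoked afterwards.
    public Size(int width) : this()
    {
        this.Width = width;
    }

    // Same effect without chaining: in a struct, this is an l-value,
    // so the whole value can be reset before the setters are used.
    public Size(int width, int height)
    {
        this = default(Size);
        this.Width = width;
        this.Height = height;
    }
}
```

Both constructors leave any property they do not set at zero, so new Size(2).Height is 0.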

Conclusion

Do nothing about this right now, but keep an eye on the issue.

Part II
Aug 29, 2014 at 2:37 AM
What would you think of requiring every struct that uses field or property initialization syntax to do anything other than set fields to their natural defaults to explicitly declare a public parameterless constructor? That would help make obvious the idea that default(T) is the same as new T() when there is no parameterless constructor, but has a different meaning when there is.
Aug 29, 2014 at 8:01 AM
What's the motivation for this feature?

Changing the meaning of new S() seems a pretty serious change; adding a diagnostic (I assume you mean a compiler warning?) will only help if the code is recompiled, and if the developer pays attention to warnings. But at least, if the code is not recompiled, it won't change the intended behavior of the code.

If the code is recompiled and the developer doesn't fix the warning, though, it will change the behavior. I agree that this is unlikely to cause issues in most cases, but in the past C# has always tried to avoid introducing changes that changed the behavior of existing code (except perhaps non-functional changes)...

In my experience, not being able to write a parameterless constructor for a struct has rarely been an issue. So is this feature really worth the risk of changing the behavior of existing code?
Aug 29, 2014 at 6:38 PM
Edited Aug 30, 2014 at 1:36 AM
Sounds very similar to my proposal of overridable null values. E.g., the type (class/struct) can specify an instance to use as its default state.
It doesn't have to be a default constructor; the only restriction is that it has to be T! (non-null).

Also, I thought structs already have a parameterless constructor; otherwise you could never create an instance of one.
Aug 29, 2014 at 6:49 PM
Changing the meaning of new S() seems a pretty serious change;
Unless one is linking with outside code written in a language which already allows parameterless constructors to be defined for structures, the only structures whose behavior will be affected will be those written after such abilities are added to C#. I see only two plausible ways by which such structures will be consumed by code which was written prior to the introduction of that feature:
  1. If the structures were modified to add a parameterless constructor after the consumer code was written. While such a scenario is hardly implausible, any problems would be the responsibility of the person modifying the structure--not the language designers.
  2. It is possible that new structure types may be passed to old code via new()-constrained generic type parameters. Under that scenario, older generic methods which were compiled in the days when C# wrongly replaced new T() with default(T) might fail when used with structures that break if zero-filled rather than constructed, but it would seem improbable that a structure would define a constructor but only work correctly if it was zero-filled.
What scenarios do you see where existing code would work correctly, and the changed behavior would cause it to work incorrectly?
Aug 31, 2014 at 12:24 AM
Who does really experience a pain from not being able to define a default constructor for a struct?

Is this feature really worth the subtleity of the difference between (new S[1])[0] and new S() ?
Sep 1, 2014 at 3:34 AM
erikkallen wrote:
Who does really experience a pain from not being able to define a default constructor for a struct?

Is this feature really worth the subtleity of the difference between (new S[1])[0] and new S() ?
That's "subtlety"? (new T[1])[0] is default(T), whether T is a class type or a structure type. The fact that the C# compiler sometimes allows new T() as an alternative syntax for default(T) in contexts like default parameter values where it can't actually call the constructor is a defect in the earlier behavior, not a fundamental problem with struct constructors.