Implicitly typed fields

Topics: C# Language Design, General
Oct 13, 2014 at 2:44 PM
Are there any plans for implicitly typed fields in C#?

This kind of code is quite annoying to write:
class Foo {
    SomeClass<string, AnotherClass<KeyValuePair<string, object>>> fieldName = new SomeClass<string, AnotherClass<KeyValuePair<string, object>>>();
}

Instead, it would be cleaner as:
class Foo {
    fieldName = new SomeClass<string, AnotherClass<KeyValuePair<string, object>>>();
}
Oct 13, 2014 at 4:54 PM
No, but we are considering "type inference on constructors", which would allow you to write
    SomeClass<string, AnotherClass<KeyValuePair<string, object>>> fieldName = new SomeClass();
Oct 13, 2014 at 5:10 PM
nmgafter wrote:
No, but we are considering "type inference on constructors", which would allow you to write
    SomeClass<string, AnotherClass<KeyValuePair<string, object>>> fieldName = new SomeClass();
How about just
SomeClass<string, AnotherClass<KeyValuePair<string, object>>> fieldName = new();
?
Oct 13, 2014 at 5:37 PM
Edited Oct 13, 2014 at 5:37 PM
nmgafter wrote:
No, but we are considering "type inference on constructors", which would allow you to write
    SomeClass<string, AnotherClass<KeyValuePair<string, object>>> fieldName = new SomeClass();
This is less code but not very clear. Being able to use the same syntax from "var" would make the code much more readable.

I know type inference constructors solve some problems, but not this one.

Are there any cons for implicitly typed fields or is it just something the team hasn't considered yet?
Maybe I'm just being naive, but I'd guess most of the functionality and implications have been solved when implicitly typed variables were implemented.

@PauloMorgado, I think the "new()" is even more confusing. =)
Oct 13, 2014 at 6:48 PM
Unlike local variable declarations, field declarations are part of the API shape of a class. As such the declaration (including the type) conveys information important to readers for non-local consumers of the API.

There are also difficulties with the semantics, since the initializing expression for one field in one class could reference another field in another class. It is not clear in what order the compiler should (be required to) resolve the initializing expressions, or how to give a clear diagnostic when things go awry.
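
A minimal sketch of the cross-class dependency described above. The classes A and B are hypothetical and not from the original post. The fields carry explicit types so that the code compiles today; the comments mark what an implicitly typed version would force the compiler to work out.

```csharp
using System;

// Hypothetical classes used only for illustration.
public static class A
{
    // If this were "var X", the compiler would need the type of B.Y first...
    public static long X = B.Y + 1;
}

public static class B
{
    // ...and if this were "var Y", it would need the type of A.X first,
    // so neither declaration could be resolved before the other.
    public static long Y = A.X + 1;
}

public static class Program
{
    public static void Main()
    {
        // With explicit types, each field's type is known from its own
        // declaration, regardless of the order in which classes are bound.
        Console.WriteLine(typeof(A).GetField("X").FieldType); // System.Int64
        Console.WriteLine(typeof(B).GetField("Y").FieldType); // System.Int64
    }
}
```

Any rule for implicitly typed fields would have to say what happens in a cycle like this one, and which of the two declarations the error should be reported on.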
Oct 13, 2014 at 9:03 PM
nmgafter wrote:
Unlike local variable declarations, field declarations are part of the API shape of a class. As such the declaration (including the type) conveys information important to readers for non-local consumers of the API.

There are also difficulties with the semantics, since the initializing expression for one field in one class could reference another field in another class. It is not clear in what order the compiler should (be required to) resolve the initializing expressions, or how to give a clear diagnostic when things go awry.
Can you give an example of that? Just cross references or some other cases?

It would make most sense for private fields and it would be safe for those.
Oct 13, 2014 at 9:10 PM
nmgafter wrote:
There are also difficulties with the semantics, since the initializing expression for one field in one class could reference another field in another class. It is not clear in what order the compiler should (be required to) resolve the initializing expressions, or how to give a clear diagnostic when things go awry.
An approach I've mentioned before would be to define rules allowing var to be used with fields in cases where a local examination of the right-hand side would identify one candidate type, and have the compiler reject any expression yielding a type other than what the rule would require.

I would suggest allowing three scenarios:
  1. The right-hand side of the expression is a new expression. The most common case, and one VB.NET has always provided for.
  2. The right-hand side of the expression is the direct result of a typecast. This isn't as common a scenario, and many such cases would be better handled by simply specifying the cast-to side on the left half of the expression, but there are still many somewhat-common scenarios where this would be helpful because the right-hand side of the expression would require a narrowing cast even if the left-side type were specified.
  3. The right-hand side of the expression is a static method or property invocation in which the class of the member is specified and is the same as the return type of the member [e.g. var foo = MyClass.Create(someArgs); would be allowed if and only if the return type of MyClass.Create were MyClass]. Note that one would have to examine MyClass to know whether the return type was MyClass, but one wouldn't have to examine it to know that foo couldn't possibly be any other type.
Note that in some cases, as needs change, code written as:
var foo = MyClass.Create(someArgs);
might have to be rewritten as either:
var foo = (MyClass)(MyFactory.Create(someArgs));
or
MyClass foo = MyFactory.Create(someArgs);
I don't see the potential need for such rewriting as being a particular argument against the feature, however, since the nature of what is required should be pretty clear.
Oct 13, 2014 at 9:43 PM
supercat wrote:
An approach I've mentioned before would be to define rules allowing var to be used with fields in cases where a local examination of the right-hand side would identify one candidate type, and have the compiler reject any expression yielding a type other than what the rule would require.
Do you really need var? Only declarations are allowed in that place. Just add the necessary access modifiers:
foo1 = 2;
private foo2 = "test";
protected foo3 = 2.5;
Oct 13, 2014 at 10:53 PM
Edited Oct 13, 2014 at 10:54 PM
PauloMorgado wrote:
Do you really need var? Only declarations are allowed in that place. Just add the necessary access modifiers:
foo1 = 2;
private foo2 = "test";
protected foo3 = 2.5;
I forgot to mention that I'd allow type inference of string literals and suffixed numeric literals, and maybe allow unprefixed literals of type Int32 and double only, but not inference of computations unless typecast. Otherwise it's too easy for someone who edits the value of a numeric literal to inadvertently change the type of the field associated with it.
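
The concern about an edited literal silently changing the inferred type can already be observed with local var declarations. The following is a small sketch using locals only, since var on fields is not legal C#; the variable names are illustrative.

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        var a = 2147483647;  // fits in int, so a is System.Int32
        var b = 2147483648;  // one more no longer fits in int, so b is System.UInt32
        var c = 4294967296;  // too large for uint, so c is System.Int64
        var d = 2.5;         // unsuffixed real literal, so d is System.Double
        var e = 2.5f;        // suffixed literal, so e is System.Single

        Console.WriteLine(a.GetType()); // System.Int32
        Console.WriteLine(b.GetType()); // System.UInt32
        Console.WriteLine(c.GetType()); // System.Int64
        Console.WriteLine(d.GetType()); // System.Double
        Console.WriteLine(e.GetType()); // System.Single
    }
}
```

Changing a single digit of an unsuffixed integer literal is enough to move the inferred type from int to uint, which is the kind of accidental change a stricter rule for fields would aim to prevent.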

With regard to var, it could probably be omitted if the access specifier was given, but I think it's helpful to make clear that code is defining a mutable field rather than a constant.
Oct 13, 2014 at 11:10 PM
supercat wrote:
With regard to var, it could probably be omitted if the access specifier was given, but I think it's helpful to make clear that code is defining a mutable field rather than a constant.
In terms of written C#, a constant is a field with the const modifier. Where is the confusion here? It either is a constant or not.
Oct 13, 2014 at 11:23 PM
PauloMorgado wrote:
In terms of written C#, a constant is a field with the const modifier. Where is the confusion here? It either is a constant or not.
Presently every declaration must either specify a type or var. The declaration private foo=23.4; does not match any presently-legal syntax. If it were a shorthand for an existing syntax it might most logically be shorthand for private double foo=23.4; but it would not be unreasonable for it to represent something else--perhaps something which cannot otherwise be specified.

For example, it might be helpful to be able to declare something like:
private mySize = 12.34;

double size1 = mySize;
float size2 = mySize;
System.Decimal size3 = mySize;
and have each variable be given the most precise representation of 12.34 that it is capable of holding.

In considering a potential new syntax, I think it's helpful to imagine what someone who knew that there was a new syntax for something, but didn't know any particulars, would expect it to mean. The existence of multiple possible useful meanings for private mySize = 12.34; would argue against having that as a syntax for field declarations.
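
For comparison, a sketch of how the mySize example behaves in C# as it stands, assuming mySize is declared as a double. Each target type needs its own literal or an explicit cast, and the float and double representations of 12.34 are different values.

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        double mySize = 12.34;

        double size1 = mySize;            // same type, no conversion
        float size2 = (float)mySize;      // explicit cast required (narrowing)
        decimal size3 = (decimal)mySize;  // explicit cast required (no implicit double-to-decimal conversion)

        // Literals written for each type directly.
        float asFloat = 12.34f;
        decimal asDecimal = 12.34m;

        // The float nearest to 12.34 is not the double nearest to 12.34.
        Console.WriteLine((double)asFloat == size1); // False

        Console.WriteLine(size2);
        Console.WriteLine(size3);
        Console.WriteLine(asDecimal);
    }
}
```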
Oct 13, 2014 at 11:30 PM
supercat wrote:
Presently every declaration must either specify a type or var.
Presently every variable declaration must either specify a type or var because, on that same line, the statement could be either a variable declaration or an assignment to a previously declared variable.

That doesn't happen with fields. In a class body you can only declare fields, not assign values to already declared ones.
Oct 14, 2014 at 5:04 PM
nmgafter wrote:
Unlike local variable declarations, field declarations are part of the API shape of a class. As such the declaration (including the type) conveys information important to readers for non-local consumers of the API.
It looks like the bigger issue is using implicitly typed fields in the public API. But couldn't this be only for private fields?
This sounds restrictive, but in practice 99% of fields are private, so it would make the code cleaner in a lot of cases.
There are also difficulties with the semantics, since the initializing expression for one field in one class could reference another field in another class. It is not clear in what order the compiler should (be required to) resolve the initializing expressions, or how to give a clear diagnostic when things go awry.
This would be solved by the rule above. Any public/protected/internal field is part of the API, and then is required to declare the type.

This could also be used in conjunction with constructor type inference:

So, instead of:
class Foo {
    static List<string> initialList = new List<string> { "a", "b", "c" };
    List<string> list = new List<string>();
    SomeListHandler<IList<string>> handler = new SomeListHandler<IList<string>>(list);
    int maxValues = 10;
    string key = "abc";
}
simply write:
class Foo {
    static initialList = new List { "a", "b", "c" };
    list = new List<string>();
    handler = new SomeListHandler(list);
    maxValues = 10;
    key = "abc";
}
This is exactly "var" for private fields. Maybe it needs a "field" keyword in front to help the compiler, but would still be as useful as "var".

On a related note, TypeScript solves member types with this exact code, even for public ones with generics and external references. I know those are very different compilers, but they probably already solved this.
Oct 14, 2014 at 6:47 PM
nvivo wrote:
nmgafter wrote:
Unlike local variable declarations, field declarations are part of the API shape of a class. As such the declaration (including the type) conveys information important to readers for non-local consumers of the API.
It looks like the bigger issue is using implicitly typed fields in the public API. But couldn't this be only for private fields?
This sounds restrictive, but in practice 99% of fields are private, so it would make the code cleaner in a lot of cases.
There are also difficulties with the semantics, since the initializing expression for one field in one class could reference another field in another class. It is not clear in what order the compiler should (be required to) resolve the initializing expressions, or how to give a clear diagnostic when things go awry.
This would be solved by the rule above. Any public/protected/internal field is part of the API, and then is required to declare the type.

This could also be used in conjunction with constructor type inference:

So, instead of:
class Foo {
    static List<string> initialList = new List<string> { "a", "b", "c" };
    List<string> list = new List<string>();
    SomeListHandler<IList<string>> handler = new SomeListHandler<IList<string>>(list);
    int maxValues = 10;
    string key = "abc";
}
simply write:
class Foo {
    static initialList = new List { "a", "b", "c" };
    list = new List<string>();
    handler = new SomeListHandler(list);
    maxValues = 10;
    key = "abc";
}
This is exactly "var" for private fields. Maybe it needs a "field" keyword in front to help the compiler, but would still be as useful as "var".

On a related note, TypeScript solves member types with this exact code, even for public ones with generics and external references. I know those are very different compilers, but they probably already solved this.
The problem isn't a technical one. It can clearly be done through type inference of the result, just as var is handled. From nmgafter's response, requiring the field's type is intentional: the developer is expected to declare the type of the field as part of the structure of the class.
Oct 14, 2014 at 6:53 PM
nmgafter wrote:
No, but we are considering "type inference on constructors", which would allow you to write
    SomeClass<string, AnotherClass<KeyValuePair<string, object>>> fieldName = new SomeClass();
That is kind of awkward looking. What happens if you have both generic and non-generic versions of SomeClass? Would that depend on whether there could be ambiguous overloads of the constructors? Have you considered a Java7-esque "diamond operator" where you still use brackets but can omit the generic type parameters when they can be inferred, e.g.:
private SomeClass<string, AnotherClass<KeyValuePair<string, object>>> fieldName = new SomeClass<>();
// or
private SomeClass<string, AnotherClass<KeyValuePair<string, object>>> fieldName = new SomeClass<,>();
Also, would the inference be based on the variable or field into which the result is being assigned, as your example implies, or would it be based only on arguments to the constructor? I admit that I've never liked the feel of the "diamond operator", as it always felt backwards to have the result define the expression, as opposed to var, which is the other way around. But I think most things in Java are backwards.
Oct 14, 2014 at 6:56 PM
The problem isn't a technical one. It can clearly be done through type inference of the result, just as var is handled. From nmgafter's response, requiring the field's type is intentional: the developer is expected to declare the type of the field as part of the structure of the class.
That doesn't make a lot of sense. What you are saying means "it was designed this way 10 years ago, we won't change that because it was the intention at the time".

Couldn't this be said about any feature being implemented in C# 6? Why infer types, if types were intended to be declared in the constructor? Why use "var" if the type was intended to be part of the declaration?

I hope this is not the case, otherwise it wouldn't make much sense to have a language design forum. =)
Oct 14, 2014 at 7:41 PM
Halo_Four wrote:
That is kind of awkward looking. What happens if you have both generic and non-generic versions of SomeClass? Would that depend on whether there could be ambiguous overloads of the constructors?
How about if the generic version includes an implicit widening operator from the non-generic version? I think that would be perfectly legal, in which case the proposed feature could change the meaning of existing legal code. What benefit would such a new syntax have versus simply allowing
protected var firstField = new MyType(args);
Such a statement would make it extremely obvious what the type of the field in question was. If the rules would require the type of the expression to be clearly implied by the right-hand side, I don't see what harm there would be in allowing the left-hand side to indicate that the right-hand side specifies the type.
Oct 14, 2014 at 9:01 PM
supercat wrote:
Halo_Four wrote:
That is kind of awkward looking. What happens if you have both generic and non-generic versions of SomeClass? Would that depend on whether there could be ambiguous overloads of the constructors?
How about if the generic version includes an implicit widening operator from the non-generic version? I think that would be perfectly legal, in which case the proposed feature could change the meaning of existing legal code. What benefit would such a new syntax have versus simply allowing
protected var firstField = new MyType(args);
Such a statement would make it extremely obvious what the type of the field in question was. If the rules would require the type of the expression to be clearly implied by the right-hand side, I don't see what harm there would be in allowing the left-hand side to indicate that the right-hand side specifies the type.
I was referring specifically to the constructor inference syntax noted by nmgafter, not inference of the field type itself (which appears to be a non-starter given his response.)

To reframe the question with a local, what would the following do?
List<string> list = new List();
Would that be legal and would the compiler infer the generic type argument based on the type to which it is assigned or would some arguments to the constructor be required?
List<string> list = new List(new[] { "foo", "bar" });
What would happen in either case if there was both a List class and a List<T> class?

The following is an example of Java7, where the compiler infers the generic type parameters based on the type to which the expression is assigned:
List<String> list = new ArrayList<>();
Oct 14, 2014 at 9:09 PM
nvivo wrote:
The problem isn't a technical one. It can clearly be done through type inference of the result, just as var is handled. From nmgafter's response, requiring the field's type is intentional: the developer is expected to declare the type of the field as part of the structure of the class.
That doesn't make a lot of sense. What you are saying means "it was designed this way 10 years ago, we won't change that because it was the intention at the time".

Couldn't this be said about any feature being implemented in C# 6? Why infer types, if types were intended to be declared in the constructor? Why use "var" if the type was intended to be part of the declaration?

I hope this is not the case, otherwise it wouldn't make much sense to have a language design forum. =)
Well, I can't quite speak for nmgafter. I only meant to convey that there is nothing stopping the compiler from being able to do this today except for the explicit decision not to allow it. From what it sounds like, the consideration is that the types of the fields are too important a component of the structure of the class to be omitted, and that someone reading the code should be able to quickly glean the type information just by glancing at it.

I would probably agree that if the field is assigned with an initializer that is either a literal or a constructor call, it does make sense to allow some form of shorthand, e.g.
private var s = "FOO"; // s is obviously a string
private var i = 1234; // i is obviously an int
private var d = 1234.56; // d is obviously a double
private var list = new List<string>(); // list is obviously a List<string>
But I probably wouldn't extend that to the results of arbitrary method calls since the type can't be as easily determined without looking at the definition of the method.
Oct 15, 2014 at 12:12 AM
Halo_Four wrote:
I was referring specifically to the constructor inference syntax noted by nmgafter, not inference of the field type itself (which appears to be a non-starter given his response.)
Having the type of the right-hand side inferred based upon the left-hand side would be totally unlike anything else I can think of in C# or Java. Note that the reason Java uses the diamond syntax is that the constructor neither knows nor cares what generic type it's being asked to construct--a situation totally unlike the one in .NET.

The stated reason for disallowing the use of var with class members is that in some situations it may not be clear what the inferred type should be. The natural remedy for that would be to allow use of var with class members in those particular situations where the type is perfectly clear. If the maintainers of C# really have some other reason for disallowing var, then the rest of the world will just have to keep on wasting time cluttering up their code with redundant type specifications.
To reframe the question with a local, what would the following do?
How about:
public class Evil<T>    {    }
public class Evil : Evil<int>    {    }
public static class EvilTest
{
    public static Evil<int> EvilInt = new Evil();
}
Not only do types Evil and the generic Evil<T> both exist, but the non-generic type specified on the right would be assignable to the generic one on the left.
Oct 15, 2014 at 12:26 AM
supercat wrote:
How about:
public class Evil<T>    {    }
public class Evil : Evil<int>    {    }
public static class EvilTest
{
    public static Evil<int> EvilInt = new Evil();
}
Not only do types Evil and the generic Evil<T> both exist, but the non-generic type specified on the right would be assignable to the generic one on the left.
I doubt this is a real problem. The compiler runs in several steps, and type inference must be one of the last ones. Just like extension methods, type inference should be chosen only if no concrete implementations exist.

In this case, there is a class Evil, so you cannot assign it to Evil<int> for the same reason you cannot do "string x = new object()".

But I agree the syntax is very strange. Constructor type inference should be used when initializing things with var. I would go as far as to say that code like this should give a warning or error:
SomeClass<string, AnotherClass<KeyValuePair<string, object>>> fieldName = new SomeClass();
Oct 15, 2014 at 12:37 AM
Edited Oct 15, 2014 at 12:40 AM
nvivo wrote:
I doubt this is a real problem. The compiler runs in several steps, and type inference must be one of the last ones. Just like extension methods, type inference should be chosen only if no concrete implementations exist.
The problem with that is that it creates a situation where code may have one of two very different behaviors depending upon what types the compiler can locate.
In this case, there is a class Evil, so you cannot assign it to Evil<int> for the same reason you cannot do "string x = new object()".
Did you not notice class Evil : Evil<int> ? I've tested the code and it works just fine, with the compiler storing a reference to an instance of non-generic class Evil into a storage location of its base type Evil<int>. For the compiler not to interpret the code in that fashion would represent a breaking change, but for it to interpret the code as public static Evil<int> EvilInt = new Evil<int>(); in the absence of non-generic Evil would mean that adding a class in any scope the compiler could see could silently alter the program's behavior.
But I agree the syntax is very strange. Constructor type inference should be used when initializing things with var.
Allowing var to be used for fields would match an existing syntax in such a way that if a typical programmer who was familiar with var was shown a field declaration using the syntax and informed that the particular declaration was legal, that person would have no trouble whatsoever guessing what it meant, nor what the type of the resulting field would be. Using a different syntax simply muddles things.
Oct 15, 2014 at 12:42 AM
supercat wrote:
Did you not notice class Evil : Evil<int> ? I've tested the code and it works just fine, with the compiler storing a reference to an instance of non-generic class Evil into a storage location of its base type Evil<int>.
I saw, but I didn't know it compiled. I thought you were just supposing. If that is the case, this is certainly a bug. It makes no sense to let ambiguous code pass; C# usually complains about that.
Oct 15, 2014 at 1:07 AM
nvivo wrote:
supercat wrote:
Did you not notice class Evil : Evil<int> ? I've tested the code and it works just fine, with the compiler storing a reference to an instance of non-generic class Evil into a storage location of its base type Evil<int>.
I saw, but I didn't know it compiled. I thought you were just supposing. If that is the case, this is certainly a bug. It makes no sense to let ambiguous code pass; C# usually complains about that.
It's perfectly legal. Evil and Evil<T> are two completely different types in .NET, and it's perfectly legal for a non-generic type to inherit from a generic type as long as it provides the generic type arguments. This is very different from Java where generics are type erasures and exist purely as metadata.

That's why I commented on the ambiguities that constructor inference could potentially cause. It's not all that uncommon to see two class types like that. For example, both Task and Task<T> exist.

As for type inference, if you were to do var x = new Evil(); then clearly x would be an Evil and not an Evil<int>, but x could be assigned to a different variable of type Evil<int> without a type cast or fear of an exception.
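
A short sketch of the point above, reusing supercat's Evil and Evil<T> declarations. It only shows current C# behavior (no constructor inference is involved): var picks the non-generic Evil, and the instance is still assignable to its base type Evil<int>.

```csharp
using System;

public class Evil<T> { }
public class Evil : Evil<int> { }

public static class Program
{
    public static void Main()
    {
        var x = new Evil();   // x is inferred as the non-generic Evil
        Evil<int> y = x;      // implicit reference conversion to the base type, no cast

        Console.WriteLine(x.GetType() == typeof(Evil));                   // True
        Console.WriteLine(y.GetType() == typeof(Evil));                   // True: same object, still an Evil
        Console.WriteLine(y.GetType() == typeof(Evil<int>));              // False
        Console.WriteLine(typeof(Evil<int>).IsAssignableFrom(typeof(Evil))); // True
        Console.WriteLine(ReferenceEquals(x, y));                         // True
    }
}
```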
Oct 15, 2014 at 1:22 AM
Halo_Four wrote:
It's perfectly legal. Evil and Evil<T> are two completely different types in .NET, and it's perfectly legal for a non-generic type to inherit from a generic type as long as it provides the generic type arguments. This is very different from Java where generics are type erasures and exist purely as metadata.
I got the reference to string/object backwards but the rule still applies. It shouldn't be legal; this happens because, for some reason, the type inference is being chosen before the real type. There must be a reason the compiler team chose this, but this should give an error at a later point in the compilation process.

If you declare 2 classes with the same name, but different namespaces, C# complains it doesn't know which one to choose. The same thing should happen here, the identifier is somewhat ambiguous.

If a rule exists, it should be that the real class has preference over type inference. But I'd go with a warning or error. This code shouldn't run like this.

You should file an issue for this.
Oct 15, 2014 at 3:03 AM
nvivo wrote:
Halo_Four wrote:
It's perfectly legal. Evil and Evil<T> are two completely different types in .NET, and it's perfectly legal for a non-generic type to inherit from a generic type as long as it provides the generic type arguments. This is very different from Java where generics are type erasures and exist purely as metadata.
I got the reference to string/object backwards but the rule still applies. It shouldn't be legal; this happens because, for some reason, the type inference is being chosen before the real type. There must be a reason the compiler team chose this, but this should give an error at a later point in the compilation process.

If you declare 2 classes with the same name, but different namespaces, C# complains it doesn't know which one to choose. The same thing should happen here, the identifier is somewhat ambiguous.

If a rule exists, it should be that the real class has preference over type inference. But I'd go with a warning or error. This code shouldn't run like this.

You should file an issue for this.
But they don't have the same name. One is Evil and one is Evil<T>. The generic parameter is a part of the name. Actually the name of the type in the framework is the following:
Evil`1
If there was also Evil<T1, T2> that would be a different type still, with the framework name of:
Evil`2
This is quite intentional and not a bug. They cannot be considered ambiguous because you cannot refer to the type without the generic type parameters. Even if you want a raw type you'd have to use typeof(Evil<>) or typeof(Evil<,>) which is completely unambiguous. If you want the generic type by name through reflection you are required to use the framework name complete with back-ticks and the count of the generic type parameters.

IEnumerable and IEnumerable<T> are completely different interfaces. Task and Task<T> are completely different classes. Action and Action<T> and Action<T1,T2> and Action<T1,T2,T3> etc., are all different delegates.

This is very different from Java where List and List<?> and List<String> are all the same class because the generic type parameter is metadata that doesn't survive through to runtime and the underlying types are all List<Object> with zero runtime enforcement and zero runtime performance enhancements.

The only place where this becomes confusing is where constructor inference is involved, assuming that the generic parameters can be inferred without the need for constructor arguments. I've not seen this proposed feature really fleshed out and the only snippet example is using Tuple<> where constructor arguments were supplied that can be used to infer the generic type parameters.

None of this is an issue with type inference.
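
The naming described above can be checked with reflection. This is a small sketch that declares its own Evil types in the global namespace (the two-parameter version is added only for illustration).

```csharp
using System;

public class Evil { }
public class Evil<T> { }
public class Evil<T1, T2> { }

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(typeof(Evil).Name);       // Evil
        Console.WriteLine(typeof(Evil<>).Name);     // Evil`1
        Console.WriteLine(typeof(Evil<,>).Name);    // Evil`2
        Console.WriteLine(typeof(Evil<int>).Name);  // Evil`1 (a constructed type keeps the arity suffix)

        // Three distinct types, even though the C# identifier is the same.
        Console.WriteLine(typeof(Evil) == typeof(Evil<>));  // False

        // Looking a generic type up by name requires the back-tick form.
        var assembly = typeof(Evil).Assembly;
        Console.WriteLine(assembly.GetType("Evil`1") == typeof(Evil<>)); // True when the types are in the global namespace
    }
}
```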
Oct 15, 2014 at 10:41 AM
Halo_Four wrote:
But they don't have the same name. One is Evil and one is Evil<T>. The generic parameter is a part of the name.
This is not relevant to the semantics of the code. We are discussing how the compiler should interpret tokens to produce the compilation result, not how that result is written in IL.

The token "Evil()" in that specific context may be interpreted in 2 ways:
  • A call to constructor of "Evil"
  • A call to constructor of Evil<int> by inferring the type from the declaration
From my point of view, it makes sense that the concrete class should take precedence over type inference, but from your test this is not what is happening. It may be intentional, just doesn't make much sense to me.

Anyway, this is why I think implicitly typed fields would be a better solution. Type inference solves another problem, and this type of code is just weird.
Oct 15, 2014 at 10:48 AM
Quick note: I just ran the code from @supercat in VS14 CTP4, and it is actually choosing the Evil concrete class. It seems type inference is not even in place yet. Either you are using another compiler version or this discussion doesn't even make sense.
Oct 15, 2014 at 11:43 AM
nvivo wrote:
Quick note: I just ran the code from @supercat in VS14 CTP4, and it is actually choosing the Evil concrete class. It seems type inference is not even in place yet. Either you are using another compiler version or this discussion doesn't even make sense.
We're discussing constructor type inference which is not a feature slated to be implemented in C# 6.0 but was mentioned by nmgafter earlier in this thread.