PrimitiveValueType and NumericValueType base classes

Topics: C# Language Design
Apr 30, 2014 at 5:34 PM
Edited Apr 30, 2014 at 5:37 PM
The following proposal is based on point 5 in this article: http://www.gamasutra.com/blogs/PedroGuida/20140113/208518/Regarding_the_future_of_C_.php

.NET's CLR treats structs in a special way, even though they have a base class: ValueType. So the idea is to have two new classes derived from ValueType to create new types of structs: PrimitiveValueType and NumericValueType.

PrimitiveValueType would inherit directly from ValueType, and would not allow declaring reference-type field members; only value-type field members that do not themselves hold reference types would be allowed.

In order to declare this kind of struct, devs must use the keyword "primitive", like:
public primitive MyNewPrimitiveType
{
   …
}
The benefit here is to open the door for generic pointers in future versions of C#. So, at compile time, it would be safe to have the following operation:
public void Calculate1<TPrimitive>(TPrimitive* myPrimitivePtr)
   where TPrimitive : primitive
{
   …
}
Now, NumericValueType would inherit directly from PrimitiveValueType (and thus, indirectly from ValueType), so it too would only allow declaring value-type field members that do not hold reference types.

In order to declare this kind of primitive struct, devs must use the keyword "number", like:
public number MyNewNumericType
{
   …
}
This would allow the implementation of numeric types like Percentage or Half (that is, half float).

And it would also bring a direct benefit related to generics. For example, devs could write operations like the following:
public TNumeric Calculate2<TNumeric>(TNumeric number1, TNumeric number2, TNumeric number3)
   where TNumeric : number
{
    return number1 * number2 * number3;
}
Again, if generic pointers eventually get implemented for C#, then the Calculate1 operation could be written as:
public void Calculate1<TNumeric>(TNumeric* myNumericPtr)
   where TNumeric : number
{
   …
}
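Since neither operator constraints nor a "number" keyword exist today, the closest workaround is to pass the operation in explicitly. A minimal sketch of that delegate-based substitute (my own illustration, not part of the proposal):

```csharp
using System;

static class NumericWorkaround
{
    // Today's substitute for 'where TNumeric : number': the caller supplies
    // the multiply operation, since generics cannot constrain operators.
    public static T Calculate2<T>(T a, T b, T c, Func<T, T, T> multiply)
    {
        return multiply(multiply(a, b), c);
    }

    static void Main()
    {
        Console.WriteLine(Calculate2(2, 3, 4, (x, y) => x * y));        // 24
        Console.WriteLine(Calculate2(1.5, 2.0, 2.0, (x, y) => x * y));  // 6
    }
}
```

The cost is an extra delegate invocation per operation, which is precisely the overhead a compiler-supported constraint would avoid.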
May 2, 2014 at 6:22 PM
Just a quick note: the request is a bit different from what I first suggested in the mentioned article.

The idea is to have both new kinds of structs: PrimitiveValueType as a specialization of ValueType and NumericValueType as a specialization of PrimitiveValueType, with the former declared using the word "primitive" and the latter using the word "number" (or "numeric", if you prefer).

The key difference between a primitive and a struct is that the former would not be allowed to declare member fields/properties of reference types, which in turn would make it safe to declare pointers to them, and would allow implementing an additional language feature: generic pointers. Today, to allow a pointer to a struct, the compiler first searches for reference types declared inside it, and only permits the pointer when none are found. So, this could be extended to primitives (and thus, to numerics). For instance, a boolean could be redeclared as a primitive.

And a numeric would be a primitive that supports basic math operations, so it would be safe to use one in generic calculations. Int, float, decimal, and so on and so forth, would be numerics, and as I said before, new ones could be implemented either by MSFT or the community, like Half and Percentage (to mention just a few).
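A type like Percentage can already be written as a plain struct today; what the proposal would add is the ability to use it behind a generic number constraint. A minimal sketch (the name and internal representation are illustrative assumptions):

```csharp
using System;

// Illustrative sketch of a Percentage value type exposing the basic
// operators a 'number' constraint would rely on.
public struct Percentage
{
    public readonly double Value;   // 25.0 means 25%

    public Percentage(double value)
    {
        Value = value;
    }

    public static Percentage operator +(Percentage a, Percentage b)
    {
        return new Percentage(a.Value + b.Value);
    }

    // "50% of 50%" is 25%, hence the division by 100.
    public static Percentage operator *(Percentage a, Percentage b)
    {
        return new Percentage(a.Value * b.Value / 100.0);
    }

    public override string ToString()
    {
        return Value + "%";
    }
}
```

Under the proposal this would be declared `public number Percentage` instead, and would then be usable wherever a numeric-constrained type parameter is expected.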

Thoughts?
May 16, 2014 at 3:31 PM
Anyone?
May 16, 2014 at 4:14 PM
In general, all this is likely to require some level of support from the CLR. For example, there's no way to encode additional generic constraints in metadata beyond the existing ones. Well, it may be possible by using modopt/modreq but that sounds like a hack. Also, it's not clear why you would want to add more pointer capabilities to a language that is intended to be type safe.

In particular, the number stuff - just no. The fact that you define 'number' to also be a 'primitive' excludes any numeric-like types that aren't primitives - BigInteger in particular. 'number' is also problematic because it's vague. What operators does 'number' have? All the operators int has? All the operators float has?
May 16, 2014 at 5:19 PM
Edited May 16, 2014 at 5:46 PM
Finally! Someone ... thanks @mdanes for replying.

mdanes wrote:
it's not clear why would you want to add more pointer capabilities to a language that it is intended to be type safe.
I'm getting a bit tired of hearing the same thing over and over. Ditto when someone says "Use C++ for that".
If that were the case then pointers should have been removed from the beginning, and yet they have been present since C# 1.0. Also, that rationale would have discouraged the .NET Team from adding reflection, dynamics, handlers and marshalling. Not to mention the upcoming .NET Native. Even Windows Phone 8 now allows unsafe contexts for .NET (unlike Windows Phone 7).

As a game developer, dealing with pointers allows you to improve perf on special operations since you can, for example, get rid of index checking on arrays, avoid dual lookup steps on arrays holding structs (getting + setting), do stackallocs and pointer math, to mention just a few features you can count on when you need to do advanced ops and/or boost perf.
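For readers unfamiliar with those techniques, here is roughly what they look like in today's unsafe C# (a sketch; requires compiling with /unsafe):

```csharp
using System;

static class PointerPerfDemo
{
    // Sums an array through a pinned pointer, sidestepping the per-element
    // bounds check that plain indexing can incur.
    public static unsafe int Sum(int[] data)
    {
        int total = 0;
        fixed (int* p = data)
        {
            for (int i = 0; i < data.Length; i++)
                total += p[i];
        }
        return total;
    }

    // stackalloc gives a short-lived scratch buffer with no GC allocation.
    public static unsafe int SumOfFirst(int count)
    {
        int* buffer = stackalloc int[count];
        for (int i = 0; i < count; i++)
            buffer[i] = i;

        int total = 0;
        for (int i = 0; i < count; i++)
            total += buffer[i];
        return total;
    }
}
```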

Plus, I'm not adding anything new. Just moving it from struct to the proposed primitive to make things a bit more clear. Today, you can declare a pointer to a struct, only to find that it works only when no reference types are declared as members of the struct (i.e. a member field). With my suggestion, any programmer will know from the beginning that you can declare pointers only for primitives (not structs).

And with that, opening the door to generic pointers will also open the door to new possibilities.

Regarding numerics, like any proposal, it's something that could be refined until it gets closer to a plausible state. Maybe it could also be achieved not with numeric but by using other means to let the compiler know which operations the primitive supports.

As for the C# struct BigInteger, if it holds reference-type fields, maybe it could be redesigned so as to be declared as a primitive. For instance, BigInteger(byte*) instead of BigInteger(Byte[]).
May 16, 2014 at 5:38 PM
What I'd like to see to facilitate math operations and various other scenarios would be a means by which one could have one piece of source code expanded out using different types, in a fashion similar to C++ templates. While generics are generally superior to C++ templates, templates can do some useful things which generics cannot.

For example, if one is writing code to compute the determinant of a floating-point matrix, it would be helpful if the same piece of source code could serve as the guts for both float[,] and double[,]. It would be necessary to have separate methods in the CIL for the two types, but that shouldn't preclude the possibility of something like:
auto template [name, TMatrix, TResult] in { [DeterminantAsFloat, float, float], [DeterminantAsDouble, float, double], [Determinant, double, double] }
  TResult name(TMatrix[,] matrix)
  {
  }
and having the compiler auto-generate three methods with different identifier substitutions. If the debugger can support a separate view for "code that is executing" versus "current source file" [a feature which would also assist with "Edit & Continue"], having the code show up in the former window in expanded form would help alleviate any confusion over what the expansion is doing.
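Expanded by hand (with a trivial 2x2 determinant standing in for the elided guts), the template above would produce something like:

```csharp
// By-hand expansion of the proposed template: three methods whose bodies
// are identical except for the substituted types. The 2x2 determinant is
// only a stand-in for the elided body.
static class MatrixOps
{
    public static float DeterminantAsFloat(float[,] m)
    {
        return m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0];
    }

    public static double DeterminantAsDouble(float[,] m)
    {
        return (double)m[0, 0] * m[1, 1] - (double)m[0, 1] * m[1, 0];
    }

    public static double Determinant(double[,] m)
    {
        return m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0];
    }
}
```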
May 16, 2014 at 5:43 PM
@supercat: that is indeed interesting.
May 16, 2014 at 5:48 PM
Ultrahead wrote:
As a game developer, dealing with pointers allows you to improve perf on special operations since you can, for example, get rid of index checking on arrays, avoid dual lookup steps on arrays holding structs (getting + setting), do stackallocs and pointer math to mention just a few features you can count on when you need to do advanced ops and or boost perf.
From the perspective of speed, safety, and semantics, it would have been immensely useful if .NET had included some forms of value-type arrays as a bona fide runtime feature (rather than a C# hack), so as to allow a class which needs a known-size collection of things which are addressable-by-index to avoid an extra level of indirection when accessing such things.

With regard to arrays of structures, those are already very efficient if the structures are properly designed. For structures which are meant to be used as aggregates (e.g. the coordinates of a point) the MSDN structure-design advice is totally wrong, since it's designed for structures that are used as objects. If a structure has public fields, the statement someArray[23].X = 5; will write directly to that field without having to look at any other part of the structure. In VB.NET, the code
With someArray(index)
  .X = (.X + 3) And 255
  .Y = (.Y - 7) And 255
End With
will find the address of the structure once, and then use that computed address for both reads and both writes; the closest equivalent in C# would be:
void AdjustItem(ref Point pt)
{
  pt.X = (pt.X + 3) & 255;
  pt.Y = (pt.Y - 7) & 255;
}

...
AdjustItem(ref someArray[index]);
which the JIT might be able to in-line efficiently.
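Put together as a self-contained, runnable sketch (the Point2 name is mine, to keep the example standalone):

```csharp
using System;

struct Point2
{
    public int X, Y;
}

static class RefElementDemo
{
    // Takes the struct by reference: the element address is computed once
    // and both fields are updated in place, with no copy of the struct.
    public static void AdjustItem(ref Point2 pt)
    {
        pt.X = (pt.X + 3) & 255;
        pt.Y = (pt.Y - 7) & 255;
    }

    static void Main()
    {
        var someArray = new Point2[4];
        someArray[2].X = 10;
        someArray[2].Y = 20;

        AdjustItem(ref someArray[2]);   // byref to the array element itself
        Console.WriteLine(someArray[2].X + "," + someArray[2].Y);   // 13,13
    }
}
```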
May 16, 2014 at 6:13 PM
Edited May 16, 2014 at 6:13 PM
supercat wrote:
From the perspective of speed, safety, and semantics, it would have been immensely useful if .NET had included some forms of value-type arrays as a bona fide runtime feature (rather than a C# hack), so as to allow a class which needs a known-size collection of things which are addressable-by-index to avoid an extra level of indirection when accessing such things.
We have fixed buffers, but they are bound to a few types.

supercat wrote:
... since it's designed for structures that are used as objects ...
Agreed. And that's why I use pointers whenever I can to avoid such case.
May 16, 2014 at 6:47 PM
Ultrahead wrote:
I'm getting a bit tired of hearing the same thing over and over. Ditto when someone says "Use C++ for that". If that would be the case then pointers should have been removed from the begining and still they were present since C# 1.0.
Well, it's only natural that you get "Use C++ for that" if you're trying to bring certain specific C++ features into C#. C++ has 2 large "problems" that scare people off, manual memory management and pointers. And you're trying to get more support for pointers in C#!
Also, that rationale would have discouraged the .NET Team from adding reflection, dynamics, handlers and marshalling. Not to mention the upcoming .NET Native. Even Windows Phone 8 now allows unsafe contexts for .NET (unlike the Windows Phone 7).
Reflection, dynamic and (event?) handlers don't compromise type safety, pointers do. Marshaling has to exist because otherwise you wouldn't be able to interact with the OS. C# pointers and unsafe contexts were primarily intended to be used in interop scenarios, like marshaling.
As a game developer, dealing with pointers allows you to improve perf on special operations since you can, for example, get rid of index checking on arrays
That could be solved easily by asking the JIT team to provide an option to turn off array bounds checks in full trust. That's easy to implement and makes sense because I don't expect well written and tested code to depend on index out of range exceptions.
Plus, I'm not adding anything new. Just moving it from struct to the proposed primitive to make things a bit more clear.
Well, you did. You asked for a new type that can be inserted in the value type hierarchy and additional generic constraints.
Regarding numerics, like any proposal, it's something that could be refined until getting closed to a plausible state. Maybe, it could be also achieved not with numeric but by using other means to let the compiler know which operations the primitive is supporting.
Yes, of course. Support for operators in generic methods has been requested for ages and there are a few related threads about this already. It's a rather tough nut to crack, in particular because any good implementation is likely to require CLR support.
As of the C# struct BigInteger, if it holds reference-type fields, maybe it could be redesigned so as to be declared as a primitive. For instance, BigInteger(byte*) instead of BigInteger(Byte[]).
Obviously, that's not possible. Unless you suggest that BigInteger should manually manage the allocated byte*.

supercat wrote:
While generics are generally superior to C++ templates, templates can do some useful things which generics cannot.
Hmm, I have no idea how generics could be superior to C++ templates. Maybe because they're simpler to define and use, but they're completely inferior to templates in regards to expressive power.
it would be helpful if the same piece of source code could serve as the guts for both float[,] and double[,]. It would be necessary to have separate methods in the CIL for the two types
No, there's no need for separate methods in this case. Even today it's perfectly possible to do this in CIL. The problem is that it isn't possible to put a constraint like "T : float, double" on a generic parameter.
In VB.NET, the code will find the address of the structure once and then use that computed address for both reads and both writes.
How do you know that? There's no telling what the JIT compiler will do.
the closest equivalent in C# would be
Assuming that the JIT compiler does its job then the following code should also be equivalent:
someArray[i].X = (someArray[i].X + 3) & 255;
someArray[i].Y = (someArray[i].Y - 7) & 255;
May 16, 2014 at 7:56 PM
mdanes wrote:
supercat wrote:
While generics are generally superior to C++ templates, templates can do some useful things which generics cannot.
Hmm, I have no idea how could generics be superior to C++ templates. Maybe because they're simpler to define and use but they're completely inferior to templates in regards to expressive power.
Using generics, one can create a List<Widget>, without anyone ever having to have the source code for both List<T> and Widget. Is that possible using C++ templates? If not, that's a MAJOR win for generics.

Also, it is possible to have a generic method which accepts a T and conditionally calls itself recursively with a Foo<T> or a Bar<T>. One could, if desired, produce a method which could accept any string consisting of Fs and Bs (e.g. "FFBBF") and would call a generic method Boz<T>, passing it a Foo<Foo<Bar<Bar<Foo<T>>>>>. Even though the number of types that could be passed to Boz is practically infinite, only the ones that were actually used would need to be created.
No, there's no need for separate methods in this case. Even today it's perfectly possible to do this in CIL. The problem is that it isn't possible to put a constraint like "T : float, double" on a generic parameter.
Operator and method overloads are bound by the compiler before CIL is produced. Even if the specified generic constraints existed, what kind of CIL could a compiler possibly produce for
bool NumberEquals<T1,T2>(T1 p1, T2 p2) where T1 in {int, long, float, double} where T2 in {int, long, float, double} 
{ return p1==p2; }
All sixteen combinations of operand types are legal with the == token (whether they should have been is another matter), but the code will sometimes have to cast p1 to T2, sometimes cast p2 to T1, and sometimes neither. I am unaware of anything in CIL which could come even close to capturing such semantics.
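The conversion choice is observable today: when an int is compared to a float, the compiler converts the int to float, and that conversion can itself lose precision (a small illustration):

```csharp
using System;

static class ConversionDemo
{
    static void Main()
    {
        int i = 16777217;       // 2^24 + 1: not exactly representable as float
        float f = 16777216f;    // 2^24

        // For i == f the compiler picks the int-to-float conversion, which
        // rounds i down to 2^24, so the comparison succeeds.
        Console.WriteLine(i == f);          // True

        // Converting to double instead preserves 2^24 + 1.
        Console.WriteLine((double)i == f);  // False
    }
}
```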
In VB.NET, the code will find the address of the structure once and then use that computed address for both reads and both writes.
How do you know that? There's no telling what the JIT compiler will do.
If the generated CIL only does the lookup once, why would the JIT do it more than once?
Assuming that the JIT compiler does its job then the following code should also be equivalent:

someArray[i].X = (someArray[i].X + 3) & 255;
someArray[i].Y = (someArray[i].Y - 7) & 255;
If the JIT happens to recognize that nothing in any of that code can change someArray or i, it might be able to recognize that someArray[i] is going to yield the same address all four times it's used, but in VB.NET the fact that the byref is only evaluated once is part of the specified semantics.
May 16, 2014 at 8:10 PM
Ultrahead wrote:
We have fixed buffers, but they are bound to a few types.
By my understanding (correct me if I'm wrong) fixed buffers aren't really a CIL feature; instead, something like:
// Assume `unsafe` context
struct boz {
    public fixed int foo[4];

    int getItem(int x) { return foo[x];}
}
would generate CIL which says boz has a four-int blob which should be labeled foo, and getItem() would compute the address of that blob, add four bytes times the specified index, and dereference the resulting pointer. Nothing in the Runtime would have any understanding of what was going on beyond knowing that foo took four ints' worth of storage, and getItem was doing some unsafe pointer arithmetic.
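That lowering can be sketched by hand: accessing a fixed buffer through a local struct decays to a raw pointer directly (a sketch, assuming an unsafe-enabled project):

```csharp
using System;

// The struct is just an inline 16-byte blob; indexing it is plain pointer
// arithmetic off its starting address, which is what the compiler emits.
unsafe struct Boz
{
    public fixed int foo[4];
}

static class FixedBufferDemo
{
    public static unsafe int FillAndReadLast()
    {
        Boz b = new Boz();
        int* p = b.foo;   // legal without 'fixed': b is a local, already unmovable
        for (int i = 0; i < 4; i++)
            p[i] = i * 10;
        return p[3];
    }

    static void Main()
    {
        Console.WriteLine(FillAndReadLast());   // 30
    }
}
```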

Conceptually, there's no reason the CLR couldn't include fixed-size value arrays as a bona fide feature; I suspect the primary reason no such feature exists stems from a general antipathy toward value types in general. Still, given that one of the goals of .NET was supposed to be efficient inter-operation with COM and with API functions written in older languages, I would have thought allowing marshal-by-reference on API structures that include fixed-sized arrays (without requiring the .NET code to replace e.g. int modes[4] with int mode0,mode1,mode2,mode3;) would have been considered worthwhile.
May 16, 2014 at 8:55 PM
supercat wrote:
Using generics, one can create a List<Widget>, without anyone ever having to have the source code for both List<T> and Widget. Is that possible using C++ templates? If not, that's a MAJOR win for generics.
You need the source of List<T> but not the source of Widget. Yes, the fact that you need the source of List<T> can be considered a drawback, but it is also what makes templates powerful: they're not constrained by non-language-related features such as IL and metadata. You could add templates to C# but then you would run into the same issue - templates wouldn't be visible outside the assembly, so if you want others to use them you'd need to make the source available.
Also, it is possible to have a generic method which accepts a T and conditionally calls itself recursively with a Foo<T> or a Bar<T>. One could, if desired, produce a method which could accept any string consisting of Fs and Bs (e.g. "FFBBF") and would call a generic method Boz<T>, passing it a Foo<Foo<Bar<Bar<Foo>>>>>. Even though the number of types that could be passed to Boz is practically infinite, only the ones that were actually used would need to be created.
I have no idea what you're trying to say here.
Operator and method overloads are bound by the compiler before CIL is produced. Even if the specified generic constraints existed, what kind of CIL could a compiler possibly produce for... I am unaware of anything in the CIL which could come even close to catching such semantics.
Yes, it's not possible to do this for NumberEquals<T1, T2> but it can be done for NumberEquals<T>.
If the generated CIL only does the lookup once, why would the JIT do it more than once?
Because there's nothing to prevent it from doing that. The result of the "lookup" needs to be stored somewhere, ideally in a register. If register pressure is high, it's not inconceivable that the JIT will decide to recalculate the address rather than spill the register to the stack. I kind of doubt that the JIT will do this but there's nothing that stops it.
If the JIT happens to recognize that nothing in any of that code can change someArray or i, it might be able to recognize that someArray[i] is going to yield the same address all four times it's used.
Do you see anything in those 2 lines of code that changes someArray or i? I don't.
By my understanding (correct me if I'm wrong) fixed buffers aren't really a CIL feature; instead, something like:
Fixed buffers are structs with a [StructLayout(Size = x)] attribute. The C# compiler generates unsafe pointer arithmetic to access them.
Conceptually, there's no reason the CLR couldn't include fixed-size value arrays as a bona fide feature; I suspect the primary reason no such feature exists stems from a general antipathy toward value types in general.
Could be, but it's more likely that it's the increased complexity that this would add to the runtime. You'll need a new Array-like type that can represent these arrays and encode the length in it, much like the existing Array type can encode the dimensionality of a normal array. IL instructions that currently deal with normal arrays would need to be changed to also work with fixed arrays. You'll have to figure out what to do about boxing; you probably don't want to box an int[1000]. But if you can't box then things like LINQ won't work on such arrays, and that's inconsistent with normal arrays. In summary, it's a lot of trouble.
May 16, 2014 at 9:21 PM
Edited May 16, 2014 at 9:33 PM
mdanes wrote:
Well, it's only natural that you get "Use C++ for that" if you're trying to bring certain specific C++ features in C#.
Certain specific C++ features? Pointers are already present in C#.

mdanes wrote:
C++ has 2 large "problems" that scare people of, manual memory management and pointers. And you're trying to get more support for pointers in C#!
So, your rationale is based on fear.

mdanes wrote:
Reflection, dynamic and (event?) handlers don't compromise type safety, pointers do. Marshaling has to exists because otherwise you wouldn't be able to interact with the OS. C# pointers and unsafe contexts were primarily intended to be used for in interop scenarios, like marshaling.
Being picky, even reference-type casting could compromise type-safety (not in the CLR of course but in your code). Remember "(type as anothertype).SomeOperation()"?

mdanes wrote:
That could be solved easily by asking the JIT team to provide an option to turn of array bounds checks in full trust. That's easy to implement and makes sense because I don't expect well written and tested code to depend on index out range exceptions.
If it were that easy, it could have been here since C# v1.0 ...

mdanes wrote:
Well, you did. You asked for a new type that can be inserted in the value type hierarchy and additional generic constraints.
I was referring to the possibility of using pointers in C#, which was already there from the beginning.

mdanes wrote:
Yes, of course. Support for operators in generic methods has been requested for ages and there a few related threads about this already.
Well, there's one more request to add to the list, then.

mdanes wrote:
Obviously, that's not possible. Unless you suggest that BigInteger should manually manage the allocated byte*.
I don't know the implementation of BigInteger well enough to state that redefining it is not possible. Nor do I know how it allocates passed data. Maybe there is no need to allocate a byte[] in the first place (ditto for byte*).
May 16, 2014 at 9:49 PM
Ultrahead wrote:
Certain specific C++ features? Pointers are already present in C#.
So, you're rationale is based on fear.
To be clear: I'm not against pointers. I'm just realistic and saying that such a request is not likely to get much traction in the community and in the language team.
Being picky, even reference-type casting could compromise type-safety (not in the CLR of course but in your code). Remember "(type as anothertype).SomeOperation()"?
That's not what type safety is about. Type safety means that you'll never be able to call anothertype.SomeOperation on an object that is not of type anothertype or derived from it. The exception you get doesn't indicate that the type safety has been compromised, on the contrary. Contrast this with C++ where such a call is possible, usually with ugly consequences.
If it were that easy, it could have been here since C# v1.0 ...
It is easy from a technical point of view. Convincing the CLR/JIT teams to do it may be another story. Though it may be easier than convincing the C# team to improve pointer support :)
I was refering to the possibilty of using pointers in C# which was already there from the begining.
So what's all that primitive and number stuff about? Those require CLR support.
I don't know the implementation of BigInteger to state that redefining it is not possible. And I do neither know how it allocates passed data. Maybe there is no need to allocate a byte[] in the first place (ditto for byte*).
Let's get real. BigInteger has to store the number somewhere and since the number has arbitrary precision that somewhere must be an array. There's no way around this.
May 16, 2014 at 10:18 PM
Edited May 16, 2014 at 10:29 PM
mdanes wrote:
... such a request is not likely to get much traction in the community and in the language team.
According to who?

mdanes wrote:
That's not what type safety is about. Type safety means that you'll never be able to call anothertype.SomeOperation on an object that is not of type anothertype or derived from it. The exception you get doesn't indicate that the type safety has been compromised, on the contrary.
I know what type safety is and never said that type-safety on the CLR is compromised. What I meant is that there are many ways a programmer could (accidentally or not) attempt to execute operations unavailable at runtime for the underlying type that will never get caught by the compiler. Take the DLR, if you prefer. If there is room for human error here, then why all the ranting about pointers? You can get memory problems in C#, like leaks, even by dealing erroneously with events.

mdanes wrote:
It is easy from a technical point of view. Convincing the CLR/JIT teams to do it may be another story. Though it may be easier than convincing the C# team to improve pointer support :)
Well, we shall see.

mdanes wrote:
So what's all that primitive and number stuff about? Those require CLR support.
And? That's your argument for saying "No, we don't need this"? First, fear. Now, modifications to the CLR. Why are we suggesting new features, then? Following your arguments, let's stop here and leave C# as it is.

mdanes wrote:
Let's get real. BigInteger has to store the number somewhere and since the number has arbitrary precision that somewhere must be an array. There's no way around this.
Well, then, if BigInteger stores an array, you cannot declare a pointer to it in the first place. And the main purpose of the proposal is to move all structs that can be treated as pointers to the suggested primitive type. So, nothing changes for BigInteger here. Therefore, BigInteger is out of scope as a primitive unless it gets redefined (as is also currently the case for pointers).
May 16, 2014 at 10:37 PM
mdanes wrote:
supercat wrote:
Using generics, one can create a List<Widget>, without anyone ever having to have the source code for both List<T> and Widget. Is that possible using C++ templates? If not, that's a MAJOR win for generics.
You need the source of List<T> but not the source of Widget. Yes, the fact that you need the source of List<T> can be considered a drawback, but it is also what makes templates powerful: they're not constrained by non-language-related features such as IL and metadata. You could add templates to C# but then you would run into the same issue - templates wouldn't be visible outside the assembly, so if you want others to use them you'd need to make the source available.
My idea would not be to have any class expose templates, but rather to simply have templates be a compiler feature which could be used to construct many similar methods without having to cut, paste, and edit each one, and without having to use an external preprocessing tool.
Also, it is possible to have a generic method which accepts a T and conditionally calls itself recursively with a Foo<T> or a Bar<T>.
I have no idea what you're trying to say here.
In C++, the compiler must be able to enumerate every type a program could ever use on any execution path, and code for every such type must be generated before any code can run. In .NET languages, generic types can come into existence while a program is running. The set of types a program uses may very well depend upon information received at run-time, and the set of types a program would be capable of creating (given proper input) could make the number of electrons in the universe seem infinitesimal by comparison.
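That run-time construction is directly visible through reflection; for example (a sketch):

```csharp
using System;
using System.Collections.Generic;

static class RuntimeGenericsDemo
{
    // Wraps a type in one List<> layer per character of input. The input
    // could come from a user at run time, so no finite set of constructed
    // types can be enumerated at compile time.
    public static Type Wrap(Type inner, string input)
    {
        Type t = inner;
        foreach (char c in input)
            t = typeof(List<>).MakeGenericType(t);
        return t;
    }

    static void Main()
    {
        Type t = Wrap(typeof(int), "LL");
        Console.WriteLine(t == typeof(List<List<int>>));   // True
    }
}
```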
Operator and method overloads are bound by the compiler before CIL is produced. Even if the specified generic constraints existed, what kind of CIL could a compiler possibly produce for... I am unaware of anything in the CIL which could come even close to catching such semantics.
Yes, it's not possible to do this for NumberEquals<T1, T2> but it can be done for NumberEquals<T>.
How would one write a bool NumberEquals<T>(T p1, T p2); method which would behave the same as == for any numeric type without needing to have it use special-case code for System.Single and System.Double?
If the generated CIL only does the lookup once, why would the JIT do it more than once?
Because there's nothing to prevent it from doing that. The result of "lookup" needs to be stored somewhere, ideally in a register. If the register pressure is high it's not inconceivable that the JIT will decide to recalculate the address rather than spill the register to stack. I kind of doubt that JIT will do this but there's nothing that stops it.
Conceivable if the array reference and index are local variables and thus can be guaranteed not to be changed by some other thread. In the event that they do change, the JITter would have to ensure that the old values are used in generating all four lvalues.
If the JIT happens to recognize that nothing in any of that code can change someArray or i, it might be able to recognize that someArray[i] is going to yield the same address all four times it's used.
Do you see anything in those 2 lines of code that changes someArray or i? I don't.
No, but how far should the JITter be willing to look for common subexpressions it can reuse? If all the expressions are simple enough, the JITter may be able to find the optimization, but if e.g. someArray were a class field, and one of the expressions invoked a class method of any significant complexity, I would expect that the JITter would assume that someArray might have changed, and thus recompute the lvalue, rather than go to great lengths to ensure that it could not.
By my understanding (correct me if I'm wrong) fixed buffers aren't really a CIL feature; instead, something like:
Fixed buffers are structs with a [StructLayout(Size = x)] attribute. The C# compiler generates unsafe pointer arithmetic to access them.
That's basically what I thought. I consider that a C# hack more than a CIL feature.
Conceptually, there's no reason the CLR couldn't include fixed-size value arrays as a bona fide feature; I suspect the primary reason no such feature exists stems from a general antipathy toward value types in general.
Could be but it's more likely that it's the increased complexity that this would add to the runtime. You'll need a new Array-like type that can represent these arrays and encode the length in it, much like the existing Array type can encode the dimensionality of a normal array. IL instructions that currently deal with normal arrays would need to be changed to also work with fixed arrays. You'll have to figure out to do about boxing, you probably don't want to box a int[1000]. But if you can't box then things like LINQ won't work on such arrays and that's inconsistent with normal arrays. In summary, it's a lot of trouble.
From a type-system perspective, I would expect that e.g. a fixed array of four integers would basically be equivalent to public struct int4 { int v0,v1,v2,v3; }; the only necessary additional feature would be a CIL instruction which given an integer from (in this case) 0 to 3 would return a byref to v0, v1, v2, or v3. The idea wouldn't be that the things should be used for holding huge amounts of data, but rather to allow marshal-by-reference of types which contain arrays.
May 16, 2014 at 11:37 PM
supercat wrote:
How would one write a bool NumberEquals<T>(T p1, T p2); method which would behave the same as == for any numeric type without needing to have it use special-case code for System.Single and System.Double?
ldarg.0
ldarg.1
ceq
ret
Also works with int and long. At least in theory, it's possible that the verifier will reject such code due to the lack of a float/double generic constraint. In any case, this is doable in IL, your original example with T1, T2 isn't because it requires conversions that are normally defined by the language and not by the runtime.
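As a side note on why `==` semantics are hard to reproduce generically in safe C#: the obvious stand-in, `EqualityComparer<T>.Default`, diverges from `==` exactly at the floating-point special case supercat's question points at. A small demo (`NumberEquals` here is just an illustrative wrapper, not a proposed API):

```csharp
using System;
using System.Collections.Generic;

class Demo
{
    // A generic equality check must fall back on Equals, which does
    // NOT match `==` for floating-point NaN.
    static bool NumberEquals<T>(T a, T b) => EqualityComparer<T>.Default.Equals(a, b);

    static void Main()
    {
        Console.WriteLine(double.NaN == double.NaN);             // False: IEEE 754 `==`
        Console.WriteLine(NumberEquals(double.NaN, double.NaN)); // True: Equals treats NaN as equal to itself
        Console.WriteLine(NumberEquals(1, 1));                   // True: fine for integers
    }
}
```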
Conceivable if the array reference and index are local variables and thus can be guaranteed not to be changed by some other thread. In the event that they do change, the JITter would have to ensure that the old values are used in generating all four lvalues.
Yes, that's correct. What can happen is that the array reference is in a register, the index is in another register, and you need yet another register to keep the element address. In a loop, the array and index will be needed in the next iteration, so it makes sense to keep them in registers. The calculated address, maybe, if it's used enough times. Besides, for small array elements such as int or long, the address calculation can be folded into a memory operand.
No, but how far should the JITter be willing to look for common subexpressions it can reuse?
Well, at least in the example you posted that shouldn't be a problem. If i is actually a complex expression, you could store it in a variable for reuse. This simplifies the job for the JIT compiler without needing that ref trick.
but if e.g. someArray were a class field, ... I would expect that the JITter would assume that someArray might have changed
Nope, it doesn't unless the field is volatile.
From a type-system perspective, I would expect that e.g. a fixed array of four integers would basically be equivalent to public struct int4 { int v0,v1,v2,v3; }; ...
Mmm, that could work, but it's not clear how this is better than the current approach.
May 17, 2014 at 12:38 AM
mdanes wrote:
ldarg.0
ldarg.1
ceq
ret
Also works with int and long. At least in theory, it's possible that the verifier will reject such code due to the lack of a float/double generic constraint. In any case, this is doable in IL, your original example with T1, T2 isn't because it requires conversions that are normally defined by the language and not by the runtime.
Would the verifier accept a ceq which could sometimes be invoked when two float values were on the stack, and sometimes invoked with two double values? I know that in Java, when multiple code paths can reach a place in the code, stack entries may only be used as the least-specific type they could possibly contain; if they could contain primitives, but aren't guaranteed to contain a particular primitive type, they can't be read at all. Even if in every code path that could reach the ceq the arguments would be of matching types, I wouldn't expect the verifier to accept that. What's needed isn't just a "numeric type" generic constraint, but a means by which the verifier and code generator could understand generic numeric types.
but if e.g. someArray were a class field, ... I would expect that the JITter would assume that someArray might have changed
Nope, it doesn't unless the field is volatile.
If one of the expressions called an instance method, then unless the JITter could examine all the code reachable from that method, it would have to assume any field might have changed, volatile or not.
From a type-system perspective, I would expect that e.g. a fixed array of four integers would basically be equivalent to public struct int4 { int v0,v1,v2,v3; }; ...
Mmm, that could work, but it's not clear how this is better than the current approach.
With the current approach, the only "safe-code" way to allow marshal-by-reference on a structure which contains any sort of array is to write out the array as a sequence of fields, and then use a switch statement if it's necessary to access them by number. Doable, but rather clunky. The preference seems instead to use marshal-by-value, but that requires recopying everything on each transition between managed and unmanaged code, whether or not anything actually changed.
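The clunky fields-plus-switch workaround described above looks roughly like this (a sketch; `Int4` is the hypothetical type from earlier in the thread):

```csharp
using System;

// The "safe-code" workaround: the "array" is written out as individual
// fields, and an indexer dispatches to them via a switch.
struct Int4
{
    public int V0, V1, V2, V3;

    public int this[int index]
    {
        get
        {
            switch (index)
            {
                case 0: return V0;
                case 1: return V1;
                case 2: return V2;
                case 3: return V3;
                default: throw new IndexOutOfRangeException();
            }
        }
        set
        {
            switch (index)
            {
                case 0: V0 = value; break;
                case 1: V1 = value; break;
                case 2: V2 = value; break;
                case 3: V3 = value; break;
                default: throw new IndexOutOfRangeException();
            }
        }
    }
}

class Demo
{
    static void Main()
    {
        var v = new Int4();
        for (int i = 0; i < 4; i++) v[i] = i * 10;
        Console.WriteLine(v[2]); // 20
    }
}
```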

BTW, if you're familiar with CIL, I've long been curious: what would be the effect of having a class expose a method which was visible in places the types used as parameters or return values weren't? Neither C# nor VB.NET allows such a thing, but I can certainly imagine some useful semantics that could be defined for such scenarios.
May 17, 2014 at 1:42 AM
Edited May 17, 2014 at 1:47 AM
mdanes wrote:
Fixed buffers are structs with a [StructLayout(Size = x)] attribute. The C# compiler generates unsafe pointer arithmetic to access them.
That would be a solution to the BigInteger issue, since you can use pointers for structs with fixed buffers, and byte is a type accepted by this kind of struct. Unfortunately, the size of the buffer has to be set when declared. So if C# allowed declaring the size of the buffer within the constructor (as with readonly fields), hack or not, then the immutable BigInteger type could be redesigned as a "pure" struct (one without reference-type fields) and, therefore, as a primitive.

Say, something like:
public struct BigInteger
{
    private fixed byte _fixedBuffer[]; // Unfortunately, this cannot be done.

    public unsafe BigInteger(byte[] array)
    {
        _fixedBuffer.Length = array.Length; // To set the length if new is not allowed.

        // Do copy operations.
        fixed (byte* buffer = _fixedBuffer)
        {
              ...
        }
    }

    ...
}
If that could be done, then nothing would stop the .NET Team from declaring it like:
public primitive BigInteger
{
    private fixed byte _fixedBuffer[]; // Unfortunately, this cannot be done.

    public unsafe BigInteger(byte[] array)
    {
        _fixedBuffer.Length = array.Length; // To set the length if new is not allowed.

        // Do copy operations.
        fixed (byte* buffer = _fixedBuffer)
        {
              ...
        }
    }

    ...
}
May 17, 2014 at 9:54 AM
supercat wrote:
Would the verifier accept a ceq which could sometimes be invoked when two float values were on the stack, and sometimes invoked with two double values?
Nope. And the compiler itself will likely barf on such code as well; it would have to clone the basic block because in general you need different native instructions for different types. Doable but weird.

Anyway, there's a lot that can be done without mixing types in such a way. You can do Sum, Average, Min, Max. You can do matrix multiplication or dot/cross product of 2 vectors. What doesn't really work is using non primitive types such as Decimal or BigInteger. Those aren't handled by IL instructions and instead you have to call operator overloads. And then the whole thing explodes. To call methods you need to find the methods so you need some kind of constraints. Operator overloads are static so they don't fit in interface constraints. And then people start asking for static interface members but those require even more CLR changes than a constraint such as 'number'. In summary, it's messy.

Templates would solve this easily, but those have the problem you mentioned: the template source is needed. Personally I don't consider that a problem, probably due to the fact that I'm used to it, but I can see that people will probably be baffled when they put a template in project A and find that it can't be used in project B without copying the source code from project A.
If one of the expressions called an instance method, then unless the JITter could examine all the code reachable from that method, it would have to assume any field might have changed, volatile or not.
Probably. But in general you shouldn't assume that if a field is accessed n times in source it will also be accessed n times in the generated native code. There may be fewer than n accesses, but there shouldn't be more than n. For this reason you shouldn't attempt ref tricks like the one you have shown; it's better to measure the performance and look at the generated code before making such tweaks. You know what they say: premature optimization is the root of all evil. It's an exaggeration, but it's certainly not an incorrect view.
With the current approach, the only "safe-code" way to allow marshal-by-reference on a structure which contains any sort of array is to write out the array as a sequence of fields, and then use a switch statement if it's necessary to access them by number. Doable, but rather clunky. The preference seems instead to use marshal-by-value, but that requires recopying everything on each transition between managed and unmanaged code, whether or not anything actually changed.
So what you really want is to have fixed arrays that don't require unsafe code? Yes, that would be nice, even in scenarios that aren't marshaling related. Roslyn itself might find this useful.
BTW, if you're familiar with CIL, I've long been curious: what would be the effect of having a class expose a method which was visible in places the types used as parameters or return values weren't? Neither C# nor VB.NET allows such a thing, but I can certainly imagine some useful semantics that could be defined for such scenarios.
Like a public method where some parameters are internal types? The verifier would very likely reject such a thing but otherwise it should work. This is a policy restriction, not a technical one. That said, I have no idea how you could call such a method from another assembly. You'd need to get your hands on a reference of that internal type but since it is internal you can't define such a reference.

Ultrahead wrote:
That would be a solution to the BigInteger issue since you can use pointers for structs with fixed buffers and byte is a type accepted by this kind of structs.
Problem is, the buffer isn't really fixed. It's fixed for a particular instance of BigInteger but different instances need different sizes. What you show in your last example isn't a fixed buffer because you're changing its size at runtime.
May 17, 2014 at 1:42 PM
Edited May 17, 2014 at 1:45 PM
@mdanes: it's not fixed at compile time, but it gets fixed when the struct gets constructed, as if it were a readonly field. Still unsafe, and copying is required, but if eventually allowed it would help to redefine BigInteger as a primitive.
May 17, 2014 at 4:52 PM
mdanes wrote:
Templates would solve this easily but those have the problem you mentioned, the template source is needed. Personally I don't consider that a problem, probably due to the fact that I'm used to it, but I can see that people will be probably be baffled when they put a template in project A and they find that it can't be used in project B without copying the source code from project A.
Basically my limited idea for templates would be that whoever wrote the template would have to know at that time which expansions would be needed, rather than it being a means by which someone could release general-purpose code for others to use.
Probably. But in general you shouldn't assume that a field is accessed n-times in source it will also be accessed n-times in the generated native code. There may be less than n accesses but there shouldn't be more than n. For this reason you shouldn't attempt to do ref tricks like than one you have shown, it's better to measure the performance and look at the generated code before making such tweaks. You know what they say, premature optimization is the root of all evil. It's an exaggeration but it's certainly not an incorrect view.
Although I mentioned With as offering some of the advantages of pointers from a performance standpoint, I think of it more in terms of semantics: if the programmer's intention is to make repeated accesses to the same structure, it's better to use a language construct that says that, than to either use what happens to be the same expression multiple times and hope that (1) the JITter doesn't produce inferior code as a result, and (2) someone doesn't accidentally make some change to three of the four places while intending to change all four.
So what you really want is to have fixed arrays that don't require unsafe code? Yes, that would be nice, even in scenarios that aren't marshaling related. Roslyn itself might find this useful.
I consider that evidence of the concept's usefulness. As a feature, my instinct would be to view it as an lvalue version of a switch/case. Actually, I wouldn't be surprised if it could be done basically like that in CIL, since CIL has a byref type.

If one were to write in CIL the equivalent of:
ref int foo;
switch(index)
{
   case 0 : ref foo = ref child0; break;
   case 1 : ref foo = ref child1; break;
   case 2 : ref foo = ref child2; break;
   default: throw new IndexOutOfRangeException(...);
}
foo = bigNastyExpression(foo, bar, boz, bozo); // Or whatever--operate on the thing identified by `foo`.
I wonder if that would work reasonably? If so, then it would be possible for C# to let one define an alias to a group of fields which would behave as though it had an indexed getter, and have C# translate indexed operations into code similar to the above.
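For what it's worth, later C# versions (7.0 and up) made this switch-of-byrefs pattern expressible in safe code via ref returns; a sketch (the `Int3` and `ItemRef` names are made up):

```csharp
using System;

struct Int3
{
    public int Child0, Child1, Child2;
}

class Demo
{
    // Selects one field of the struct as a byref, mirroring the
    // switch-of-byrefs CIL sketch above, in safe C# (7.0+).
    public static ref int ItemRef(ref Int3 v, int index)
    {
        switch (index)
        {
            case 0: return ref v.Child0;
            case 1: return ref v.Child1;
            case 2: return ref v.Child2;
            default: throw new IndexOutOfRangeException();
        }
    }

    static void Main()
    {
        var v = new Int3();
        ref int foo = ref ItemRef(ref v, 1);
        foo = 42;                    // writes through the alias into v
        Console.WriteLine(v.Child1); // 42
    }
}
```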
BTW, if you're familiar with CIL, I've long been curious: what would be the effect of having a class expose a method which was visible in places the types used as parameters or return values weren't? Neither C# nor VB.NET allows such a thing, but I can certainly imagine some useful semantics that could be defined for such scenarios.
Like a public method where some parameters are internal types? The verifier would very likely reject such a thing but otherwise it should work. This is a policy restriction, not a technical one. That said, I have no idea how you could call such a method from another assembly. You'd need to get your hands on a reference of that internal type but since it is internal you can't define such a reference.
There are a couple of places I could see such a concept being useful if the verifier wouldn't reject it:
  1. There are times when it would be useful to have an assembly define a reference type which could be given to outside code and received back from outside code, but which could only identify objects created within the assembly. Abstract classes can be used for that in cases where none of the implementations would need to derive from anything else, but being able to use interfaces would alleviate that restriction.
  2. It would be helpful if there were a means by which a function could return a value which could be passed back to other members of its assembly (either as a normal method parameter, or by invoking a member thus causing it to be passed back as this) but which could not otherwise be persisted by the recipient. Presently, if an immutable class is implemented by wrapping the only extant reference to a mutable object (a common pattern) and has methods that allow code like:

    myCar = myCar.WithInteriorColor(desiredColor).WithSoundSystem(desiredSoundSystem).WithSeatCushions(desiredCushions);
it would be necessary for WithInteriorColor to make a copy of myCar, then WithSoundSystem to make a copy of the car received from WithInteriorColor, and WithSeatCushions to make a copy of the car received from WithSoundSystem. If the references returned from WithInteriorColor and WithSoundSystem could not be persisted outside the Car assembly, however, they could safely return the mutable backing object, and allow the later methods to use them. Later, when the result was cast back to Car, that would take the only reference anywhere in the universe to a CarGuts and wrap it in an immutable type, without having to make a defensive copy.

Knowledge that there exists exactly one reference to a particular object is a very useful thing to have; unfortunately, the only way for a piece of code to have such knowledge is in many cases for it to create the object itself and never expose it anywhere. I would like to see more cases in which code could assert in statically verifiable fashion(*) that a particular reference would be the only one anywhere in the universe that identifies its target, since in many cases that could lead to code which was both more efficient and more robust.

(*) In some cases, it would be necessary for code to declare that certain functions (e.g. virtual factory methods) are "presumed" to return the only reference to an object, so static verification wouldn't guard against deliberate rogue code, but would guard against accidental mistakes.

Actually, the idea of having an "only reference to object anywhere in the universe" should go a bit beyond type system kludgery, but it would be nice to have a means of achieving it in at least some cases.
May 18, 2014 at 11:36 AM
Edited May 18, 2014 at 7:03 PM
Ultrahead wrote:
it's not fixed at compile time but it gets fixed when the struct gets constructed as if it were a readonly field. Still unsafe and copying required but if eventually allowed it would help to redefine BigInteger as a primitive.
You can't have a value type with a size defined at runtime; the size must be known at (JIT) compile time and must be the same for all values of a given type. This isn't due to a limitation of the language or runtime, it's simply that values with different sizes wouldn't be usable. You couldn't have an array of BigInteger, because array elements must have the same size. You couldn't put a BigInteger in another struct/class, because that too would have to have a variable size.

supercat wrote:
Although I mentioned With as offering some of the advantages of pointers from a performance standpoint, I think of it more in terms of semantics: if the programmer's intention is to make repeated accesses to the same structure, it's better to use a language construct that says that, than to either use what happens to be the same expression multiple times and hope that (1) the JITter doesn't produce inferior code as a result, and (2) someone doesn't accidentally make some change to three of the four places while intending to change all four.
I think I already expressed some opinions about With in another thread; it's a convenience feature first and foremost, and what you're saying is a side effect. The proper solution would be to allow ref locals and possibly ref return types. Then you could write "ref int x = ref a[i], y = ref b[i];" and that's more than you can do with With. I'm tempted to ask for this feature, but I'd rather see the JIT compiler being improved. In the C++ world it would be quite unexpected for the compiler not to do the right thing with this kind of code. This is a relatively trivial, textbook optimization.
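As it happens, this exact syntax later shipped in C# 7.0 as ref locals; a minimal demo:

```csharp
using System;

class Demo
{
    static void Main()
    {
        int[] a = { 1, 2, 3, 4 };
        int i = 2;

        // The hypothetical "ref int x = ref a[i]" is valid C# since 7.0:
        ref int x = ref a[i];
        x += 10; // updates a[2] through the alias
        Console.WriteLine(a[2]); // 13
    }
}
```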
If one were to write in CIL the equivalent of:
Yes, you could do that. Needless to say, it's not very efficient. Ideally there would be a CIL instruction that allows you to load a field by (non-constant) index from a type that contains only fields of the same type. That would be efficient and verifiable (though it would require runtime checks, just like array accesses).
There are times when it would be useful to have an assembly define a reference type which could be given to outside code and received back from outside code...
Just make the type public and all its methods and constructor internal?
It would be helpful if there were a means by which a function could return a value which could be passed back to other members of its assembly (either as a normal method parameter, or by invoking a member thus causing it to be passed back as this) but which could not otherwise be persisted by the recipient. ...
Hmm, this seems related to immutability. That's a rather complex matter that can't be solved with such simple "workarounds". There's a research project at Microsoft which tries to deal with this kind of stuff, see this paper and this blog post. I would suggest to leave this here since it's not related in any way to this discussion. If you really want, start a separate discussion about it but I don't expect to see that stuff anytime soon in C#.
May 18, 2014 at 12:09 PM
Edited May 18, 2014 at 1:10 PM
@mdanes: I'd appreciate it if you would please separate quotes from different forum members. In your post above, all the quotes from "Although I mentioned ..." and below correspond to @supercat.
mdanes wrote:
You can't have a value type with a size defined at runtime, the size must be known at (JIT) compile time and must be the same for all values of a given type. This isn't due to a limitation of the language or runtime, it's simple that values with different sizes wouldn't be usable. You couldn't have an array of BigInteger because array elements must have the same size. You couldn't put BigInteger in another struct/class because that too will have to have a variable size.
If that's so, then why is BigInteger defined as a struct in the first place? Does it currently have a size at compile time? If so, problem solved: use that size for the fixed array of bytes in it. If not, then what's the big deal with what I'm proposing regarding BigInteger? Today you cannot use pointers with BigInteger, so you have been arguing against the proposal from a use case that is out of scope even today. BigInteger would still be a struct (nothing changes here), but of course not a primitive (since, as you have been stating, there is no way to redefine it so that it gets rid of reference-type fields and/or has a fixed size at compile time).

Again: if you cannot use pointers with BigInteger today, all the arguments you have been posting against my suggestion of a primitive type amount to nothing, since the main purpose of the type is related to pointers.

If you prefer, we could theorize all day about whether BigInteger can or cannot be converted to a struct with no references, and about the issues with its variable size. But please leave that outside the discussion of this proposal, since it has nothing to do with it from the moment you cannot use pointers with BigInteger (ditto for any other type that holds reference types).

I hope the .NET Team considers the proposed primitive value type for upcoming versions of the language.
May 18, 2014 at 1:19 PM
If that's so, then why is BigInteger defined as a struct in the first place? Does it currently have a size at compile time?
BigInteger is only a struct so that it gains the primary behavior of being assigned on the stack. That type cannot have a fixed size by definition since it has no limit to how long the value could be, regardless of any proposal to overhaul .NET's type system.

The whole unified type system is largely voodoo when it comes to the primitive types like byte, short, int, long, float and double. There are actually separate CIL opcodes responsible for dealing with those types, and that CIL must be known and emitted at runtime.
May 18, 2014 at 1:27 PM
Edited May 18, 2014 at 2:20 PM
Halo_Four wrote:
BigInteger is only a struct so that it gains the primary behavior of being assigned on the stack.
Structs are not always allocated on the stack. That depends on where they are declared in your code. A struct declared as a member field in a class goes on the heap along with its containing object.

Halo_Four wrote:
That type cannot have a fixed size by definition since it has no limit to how long the value could be, regardless of any proposal to overhaul .NET's type system.
The limit is given by the resources available to the system, whether it is one phone or a cluster of supercomputers. But again, the characteristics that define a BigInteger have nothing to do with this proposal.

Struct? Checked. Variable size? Checked. Holds a reference type? Checked. Cannot use pointers? Checked. Has it something to do with my proposal, then? No.
May 18, 2014 at 6:00 PM
Non-boxed struct instances may be part of a heap allocation, but do not require a heap allocation of their own. The main features of making something like BigInteger a struct are: (1) structure-type variables can have usable default values, while class-type variables cannot; (2) While a BigInteger struct would have to have a reference-type field capable of identifying an immutable backing store for large numbers, it could also use additional fields in ways which might improve efficiency.

I don't know how BigInteger is actually implemented, but if I were designing a LargeInteger struct, I might include an Int64 and a UInt64[] and specify that the numerical value was the sum of a two's-complement integer encapsulated in the UInt64[] (zero if null) and the integer stored in the Int64 field. I would expect that many instances of LargeInteger would actually be used to hold numbers small enough to fit in an Int64; no separate backing-store heap object would be required for those. Additionally, adding such a LargeInteger to a big LargeInteger would only require a new heap allocation if the sum of the two Int64 backing fields would be incapable of fitting in an Int64.

The fact that not all instances of a type would be able to fit inside a fixed amount of space does not imply that there would not be value in using a struct, and taking advantage of information contained therein to reduce the number of backing-store allocations required.
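A minimal sketch of that hybrid design, assuming the sum-of-parts encoding described above; everything here (names included) is hypothetical rather than the real System.Numerics.BigInteger, and only the small-value fast path is implemented:

```csharp
using System;

// Hypothetical hybrid layout: the value is _small plus whatever the optional
// backing array encodes, so numbers that fit in an Int64 need no heap object.
struct LargeInteger
{
    private readonly long _small;
    private readonly ulong[] _bits; // null => the value is just _small

    public LargeInteger(long value) { _small = value; _bits = null; }

    public static LargeInteger operator +(LargeInteger a, LargeInteger b)
    {
        if (a._bits == null && b._bits == null)
        {
            long sum = unchecked(a._small + b._small);
            // Signed-overflow check: if both inputs share a sign that differs
            // from the sum's, the real type would spill into a new backing array.
            bool overflow = ((a._small ^ sum) & (b._small ^ sum)) < 0;
            if (!overflow)
                return new LargeInteger(sum); // common case: no allocation
        }
        throw new NotImplementedException("big-number path not sketched");
    }

    public long ToInt64() => _bits == null ? _small : throw new NotImplementedException();
}

class Demo
{
    static void Main()
    {
        var x = new LargeInteger(40) + new LargeInteger(2);
        Console.WriteLine(x.ToInt64()); // 42
    }
}
```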
May 18, 2014 at 7:28 PM
Ultrahead wrote:
If that's so, then why is BigInteger defined as a struct in the first place?
supercat hinted at this in his last post, but if that wasn't clear, the answer is simpler: it's simply an optimization. An array must be used to store the actual number, which means a reference type allocated on the GC heap. You could make BigInteger a class, but then it would be yet another object that needs to be allocated on the GC heap. It turns out that there is no need for that, and BigInteger can be made a struct.
Again: if you cannot use pointers with BigInteger today, all the arguments you have been posting against my suggestion of a primitive type equal zero since the main purpose of the type is related to pointers.
Again, your original post is in no way clear that it is all about pointers. Half of it presents NumericValueType and the 'number' generic constraint, and I presented BigInteger as a type that looks like a number but doesn't fit your idea of a number. This makes your idea questionable: should we add 'number' because it solves certain problems but doesn't solve other, similar problems, or should we try to find something else that solves all the problems (namely, using operators in generics)?

P.S - sorry about the missing author names, I usually add them but I forgot this time. I've fixed that post.
May 18, 2014 at 7:55 PM
Edited May 18, 2014 at 7:57 PM
@mdanes: it's ok; thanks for fixing it :)

Well, when I mentioned primitive I also mentioned:
The benefit here is to open the door for generic pointers in future versions of C#. So, at compile time, it would be safe to have the following operation:
public void Calculate1(TPrimitive* myPrimitivePtr) 
   where TPrimitive : primitive
{
   …
}
And in my second post I added:
The key difference between a primitive and a struct is that the former would not allow declaring member fields/properties of reference types, which in turn would allow declaring pointers safely, and would allow implementing an additional feature of the language: generic pointers.
And since you mentioned some devs are afraid of pointers and dealing with memory, I thought that was clear enough. But anyway, now that it is clear to all, let's focus on the suggestion. I strongly believe a primitive struct would add value to the language. And I have no problem if it's called something other than "primitive" to avoid semantic issues with number types like BigInteger.

Now, about the numeric type. Some time ago, someone suggested on my blog adding new constraints for generics that let you specify math operations, like op+. Thus, my example for numerics would change to:
public TNumeric Calculate2(TNumeric number1, TNumeric number2, TNumeric number3)
   where TNumeric: op+, op*
{
    return number1 + number2 * number3;
}
So, why not have both? The numeric type for a common set of math operations, and operation constraints to deal with a wider range of scenarios.
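For what it's worth, something very close to this op-constraint idea eventually shipped in .NET 7 / C# 11 as generic math: System.Numerics.INumber&lt;T&gt; exposes the operators through static abstract interface members, so the Calculate2 example can be written as:

```csharp
using System;
using System.Numerics;

class Demo
{
    // Requires .NET 7+: INumber<T>'s static abstract operators stand in
    // for the op+/op* constraints suggested above.
    public static T Calculate2<T>(T number1, T number2, T number3)
        where T : INumber<T>
        => number1 + number2 * number3;

    static void Main()
    {
        Console.WriteLine(Calculate2(2, 3, 4)); // 14, i.e. 2 + 3 * 4
    }
}
```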
May 18, 2014 at 8:05 PM
supercat wrote:
I don't know how BigInteger is actually implemented
You can easily look it up in Reference Source. The relevant part is:
        // For values int.MinValue < n <= int.MaxValue, the value is stored in sign
        // and _bits is null. For all other values, sign is +1 or -1 and the bits are in _bits
        internal int _sign;
        internal uint[] _bits;
May 19, 2014 at 1:06 AM
svick wrote:
You can easily look it up in Reference Source. The relevant part is:
To what extent does anything in the reference source constitute a promise that a particular aspect of something's implementation won't change?

Also, I wonder what the performance implications are of using sign to represent just the sign of numbers when _bits is non-null, versus having it represent an offset? I'm not sure what fraction of operations typically performed on BigInteger would benefit, but those that would could benefit enormously.
May 19, 2014 at 5:30 PM
Ultrahead wrote:
Now, about the numeric type. Sometime ago, someone suggested on my blog adding new constraints for generics that let you specify math operations, like op+. ... So, why not having both? The numeric type for a common set of math operations and operation constraints to deal with a wider range of scenarios.
Well, technically there's nothing that prevents us from having both, but the two features overlap, and that's something people tend to avoid when designing languages. If you have both, then it's inevitable that people get confused and start asking questions: what's the difference, which one's best, which one should I use?

And there's the third option, the templates we talked about above. If the language ever gets something like C++ templates, then all this number/op+ stuff becomes mostly useless, because templates solve this and more.

In short, if today a more restrictive feature gets added to the language and tomorrow a more powerful feature gets added, the restrictive feature may become language baggage. For example, anonymous methods: almost nobody uses that feature today, because pretty much everyone uses lambda expressions.

Anyway, the only thing we can do in this regard is wait and see. The fact that pretty much all of this stuff requires CLR support means that we won't see this done in the short term.
May 19, 2014 at 5:55 PM
mdanes wrote:
Anyway, the only thing we can do in this regard is wait and see.
Agreed.
The fact that pretty much all of this stuff requires CLR support means that we won't see this done in the short term.
As usual, time will tell :)
May 19, 2014 at 8:14 PM
mdanes wrote:
And there's the 3rd option, the templates we talked above. If the language will ever get something like C++ templates then all this number/op+ stuff becomes mostly useless because templates solves this and more.
There is no plausible mechanism by which anything substantially resembling the .NET Framework could ever allow an assembly to export templates with anything near the versatility that a language could support with non-exportable templates, without severe run-time costs. While it would be a good idea to make a template feature as versatile as conveniently practical, I see no reason to expect that the language would suddenly move to full C++-style template support. Given that templates cannot plausibly be exportable beyond an assembly, their design should focus on features which will be useful despite that limitation.
Jun 7, 2014 at 9:50 AM
I've thought about supporting numeric types in generic calculations. It would be good if all numeric types implemented something like an INumericCalculator<T> interface:
    public interface ICalculator<T>
    {
        T Add(T second);
        T Subtract(T second);
        T Multiply(T second);
        T Divide(T second);
        T Modulo(T second);
    }

    public interface IIntegerCalculator<T> : ICalculator<T>
    {
        T And(T second);
        T Or(T second);
        T Xor(T second);
    }
This should not need any big change in the CLR - just in the numeric type definitions - and it would allow everyone to create custom numeric types. Optionally, the C# compiler could map the numeric operators to these interface methods when a type implements them, but we don't really need that: we already have operator overloading, and this interface alone lets us use numeric types in generics. And if we use these interfaces as generic constraints, we can even call the methods without boxing.
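For illustration, here is how a generic routine could consume the proposed interface (the Sum helper below is mine, not part of the proposal):

```csharp
// Illustrative sketch using the ICalculator<T> interface proposed above.
// With the struct constraint, the Add calls are constrained calls on the
// value type itself, so no boxing occurs.
public static class GenericMath
{
    public static T Sum<T>(T[] values) where T : struct, ICalculator<T>
    {
        T total = values[0];
        for (int i = 1; i < values.Length; i++)
        {
            total = total.Add(values[i]);
        }
        return total;
    }
}
```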
Jun 7, 2014 at 5:47 PM
At present, it is not possible to overload operators to act upon interface types. If it were possible, there are many situations in which having a new version of a class implement an interface not implemented by an older one would break consumer code because doing so would cause operator bindings to become ambiguous. Even if rules existed to resolve such ambiguity, the double-dispatch nature of operator overloading would make it difficult to ensure that adding an interface wouldn't cause an operator to unexpectedly switch from one overload to another. Things would be further muddled by C#'s use of == as both an overloadable comparison operator and a non-overloaded reference comparison operator. If an interface were defined for which == was overloaded, then having new versions of a class implement that interface would change the meaning of existing code which used == for reference-equality testing upon that class.

All that being said, having each built-in numeric type implement ICalculateWith<T> for itself, as well as IAlwaysPreciselyConvertibleTo<T>, IAlwaysRoughlyConvertibleTo<T>, IPossiblyPreciselyConvertibleTo<T>, and IPossiblyRoughlyConvertibleTo<T> [and maybe corresponding ...From versions as well] for all other numeric types (and possibly a new UniversalNumericPrimitive type), would make it possible to define a struct Calc<T> where T:ICalculateWith<T> which could use overloaded operators. In theory, the jitter should be able to act upon things of the Calc<T> type almost as well as things of the underlying primitive types.
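A minimal sketch of that idea, assuming a hypothetical ICalculateWith<T> interface with instance methods for the basic operations (none of these types exist in the BCL):

```csharp
// Hypothetical — ICalculateWith<T> is not a real BCL interface.
public interface ICalculateWith<T>
{
    T Add(T other);
    T Multiply(T other);
}

// A thin wrapper struct that exposes overloaded operators on top of the interface.
public struct Calc<T> where T : struct, ICalculateWith<T>
{
    public readonly T Value;

    public Calc(T value) { Value = value; }

    public static Calc<T> operator +(Calc<T> a, Calc<T> b)
    {
        return new Calc<T>(a.Value.Add(b.Value));
    }

    public static Calc<T> operator *(Calc<T> a, Calc<T> b)
    {
        return new Calc<T>(a.Value.Multiply(b.Value));
    }

    public static implicit operator Calc<T>(T value) { return new Calc<T>(value); }
}
```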

With regard to UniversalNumericPrimitive, the type Decimal is 16 bytes, but doesn't use all 16; the next largest numeric primitive is 8 bytes; it would be possible to design a type which could accurately represent any value that a Double or Decimal could hold, and which would implement IPossiblyPreciselyConvertibleTo<double> and IAlwaysRoughlyConvertibleTo<double>, and both IPossiblyPreciselyConvertibleTo<decimal> and IPossiblyRoughlyConvertibleTo<decimal>. Operations involving double would convert to the nearest double and yield a double result. IPossiblyPreciselyConvertibleTo would disallow conversion in that case, since there would be no way of knowing if precision had been lost.
Aug 15, 2014 at 8:25 PM
Would love to hear the comments from the devs about this suggestion.
Sep 10, 2014 at 12:44 PM
I also have to agree with this discussion. There MUST be some numeric datatype in one of the next C#/.NET releases. For most of the things we are currently doing with image processing, mathematical routines are highly necessary. I have spent many, many hours creating generic methods for mathematical operations and casting because there was no other possibility. Counting these routines, there are at least 4000 lines for the primitive operations (+, -) and another 4000 lines for casting from one numeric type to another.

I also spent many hours checking the performance of "natural" C# methods that avoid pointers. There is a factor of up to 12 (!) between conventional C# methods and pointer arithmetic.
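For context, the kind of comparison being described usually looks like this (illustrative code, not the actual benchmark; the exact speedup depends heavily on the workload):

```csharp
// Safe version: the JIT emits array bounds checks it cannot always eliminate.
public static void AddSafe(float[] a, float[] b, float[] result)
{
    for (int i = 0; i < result.Length; i++)
        result[i] = a[i] + b[i];
}

// Unsafe version: pinning the arrays and using raw pointers avoids those checks.
public static unsafe void AddUnsafe(float[] a, float[] b, float[] result)
{
    fixed (float* pa = a, pb = b, pr = result)
    {
        for (int i = 0; i < result.Length; i++)
            pr[i] = pa[i] + pb[i];
    }
}
```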

C++ is NOT an option for solving the performance problems. I don't want to create a C++ library, which would also be platform dependent, just because I need more performance.

If C# is to make inroads in the research field, features like those described above and in many other discussions are really a MUST and not an option. You can also read in Jeffrey Richter's CLR via C#, in the section on generics, that many researchers have requested this feature and are eagerly waiting for it to finally arrive.
Sep 10, 2014 at 2:44 PM
msedi wrote:
I also have to agree in this discussion. There MUST be some numeric datatype in one of the next C#/.NET releases.
What would you think of the idea of having a syntax which would basically instruct the compiler to repeat a block of code once for each type from a list, in a fashion more like C++ templates than .NET generics? The .NET generics mechanism is in many ways superior to C++ templates, since it means that a List<T> is not limited to types which exist when the List<T> is compiled, but it cannot handle situations where different types require completely different CIL instructions. On the other hand, in cases where the number of types a method would need to work with is very limited and known at compile time, having to compile the method separately for each one would not be a major problem.
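As a sketch, such a feature might look something like this (entirely hypothetical syntax, shown only to illustrate the idea):

```csharp
// Hypothetical "expand per type" syntax — not valid C#.
// The compiler would emit one concrete Sum method per listed type,
// binding the + operator separately in each expansion.
public static T Sum<T>(T[] values) expand T : (int, long, float, double)
{
    T total = default(T);
    foreach (T v in values)
        total = total + v;
    return total;
}
```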
Sep 10, 2014 at 4:01 PM
What would you think of the idea of having a syntax which would basically instruct the compiler to repeat a block of code once for each type from a list, in a fashion more like C++ templates than .NET generics?
Do you mean something like iterators? For data conversion, for example, I tried Array.ConvertAll, and it was the slowest of all 12 methods I checked.
I think there are two things:
  1. having some mechanism to iterate quickly over one or more arrays of a numeric datatype
  2. having the mathematical operations on numeric datatypes.
I have tremendous requirements on performance, but on the other hand I'm also glad when my code is easier to read, even if that costs a little performance. My current tests have shown that good-looking C# code, with the mechanisms currently supported, costs too much performance for me to be willing to give up the ugly code I had to write.

Do you have an example of what exactly you mean by a compiler instruction, and do you have some estimate of the performance?

Thanks
Martin
Sep 10, 2014 at 5:31 PM
Edited Sep 10, 2014 at 5:32 PM
msedi wrote:
Do you mean something like iterators? Let's say for data conversion I have tried the Array.ConvertAll and this was the slowest method of all 12 methods I have checked.
See my May 16 11:38am post above for a better description of what I had in mind.
Oct 17, 2014 at 8:48 PM
It's too bad Microsoft never seems to have liked the 80-bit type the x87 uses for calculation; if .NET had supported that, then methods that need to work with any numeric primitive could simply operate on type Extended [unlike double, which is incapable of storing most Int64 or UInt64 values outside the range +/- 9,007,199,254,740,992, an Extended type would have had no problem storing any Int64 or UInt64 value precisely].
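The limitation mentioned in brackets is easy to demonstrate:

```csharp
// double has a 53-bit significand, so not every Int64 value survives a round trip.
long exact = 9007199254740993;       // 2^53 + 1
double d = exact;                    // rounds to 9007199254740992 (2^53)
Console.WriteLine((long)d == exact); // False
```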
Coordinator
Oct 19, 2014 at 6:21 AM
I'm struck by this discussion thread about numerics,
and also this one about calli https://roslyn.codeplex.com/discussions/570152
and then a load of related UserVoice requests:
  • GPGPU programming: http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2730068-greatly-increase-support-for-gpu-programming-in-c
  • GPGPU programming again: http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2265408-gpgpu-programming
  • .NET for Xbox: http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/4233646-allow-net-games-on-xbox-one
  • Optimized math functions: http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2616469-optimize-math-functions
  • The numeric interfaces that you mentioned, darkman666: http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2372338-number-interface-for-int-short-double-float-etc
  • Static interface methods: http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2268973-virtual-static-method-in-interfaces-generics

There are several buckets...

(1) Write numeric methods that can operate on any numeric type. That's what we want INumeric for, and also static methods on interfaces. If it failed to let you actually use + and - and the other mathematical operators, it would be a failure. We've examined this issue numerous times since I've been present at LDMs. We never pursued it, partly because this didn't seem like the actual main pain point in writing numeric code: solving it would make some small cases better but not be generally useful. I mean, is it really useful to have a matrix library that works on both floats and doubles? Why not standardize on one floating point type throughout for your maths libraries? If you're using a maths library geared towards games, by all means let it use floats throughout. Swift is interesting because it lets you declare after-the-fact that any type implements an interface, a sort of "extension interface implementation" for existing types, but it does so at the cost of dynamic dispatch.

(2) Write faster interop code. Better buffers that need fewer copies, calli and function pointers, ability to pin memory more easily, seamless interop with native code, less marshalling, more efficient marshalling. Swift does a really good job at this.

(3) Generate fast code. This includes SIMD, which has already been added. But here I think we should be looking primarily at ProjectN.


The only perf code I've written (in both C++ and VB) is audio: sampling, decoding, FFT, playback. So I'm learning a lot more from reading these threads. It makes me want to see several concrete project examples, ones that people feel are good projects to make run fast, and run a bunch of proposals against this real world code to see if they have the payoff you'd hope.
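To make bucket (1) concrete, "static methods on interfaces" would allow something along these lines (hypothetical syntax; C# does not support static interface members today):

```csharp
// Hypothetical: an interface declaring static operator members,
// so a generic method can use + and * directly on T.
public interface INumeric<T> where T : INumeric<T>
{
    static T operator +(T left, T right);
    static T operator *(T left, T right);
}

public static T Dot<T>(T[] a, T[] b) where T : struct, INumeric<T>
{
    T total = a[0] * b[0];
    for (int i = 1; i < a.Length; i++)
        total = total + a[i] * b[i];
    return total;
}
```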
Oct 21, 2014 at 7:31 PM
@lwischik To your first point, it seems kind of presumptuous for you guys to dismiss a highly requested use-case for a feature as not actually being useful. Certain types have common behaviors (addition), and in real-world code you can't always just say "always use a float". Thus, being able to lower upfront coding costs and maintenance costs by not writing the same algorithms multiple times for different types is a big win. When you're writing a library you want to provide the best possible API to serve clients that you possibly can.
Nov 17, 2014 at 8:24 PM
Edited Nov 17, 2014 at 8:33 PM
By the way, it is unfortunate that this thread veered off the main point, which was a common interface for scalar numbers. I'm not sure pointers have much to do with that -- regardless, that is a separate thread.

I see this issue as the PRIMARY barrier to .NET becoming a mature numeric computation language. It does not make sense to write at least 4 versions of a Vector or Matrix library (float, double, Complex<float>, and Complex<double>) where the operations are so similar in 80% of the functions. We need to extend Generics so they can handle a 'where' clause that restricts T to INumber, for example.
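Concretely, the goal is to write a type like this once, instead of four near-identical copies (the INumber constraint here is hypothetical):

```csharp
// Hypothetical 'where T : INumber' constraint with operator support —
// one implementation instead of separate float, double,
// Complex<float> and Complex<double> versions.
public struct Vector3<T> where T : INumber
{
    public T X, Y, Z;

    public static T Dot(Vector3<T> a, Vector3<T> b)
    {
        return a.X * b.X + a.Y * b.Y + a.Z * b.Z;
    }
}
```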

lwischik wrote:
(1) Write numeric methods that can operate on any numeric type. That's what we want INumeric for, and also static methods on interfaces. If it failed to let you actually use + and - and the other mathematical operators, it would be a failure. We've examined this issue numerous times since I've been present at LDMs. We never pursued it, partly because this didn't seem like the actual main pain point in writing numeric code: solving it would make some small cases better but not be generally useful. I mean, is it really useful to have a matrix library that works on both floats and doubles? Why not standardize on one floating point type throughout for your maths libraries? If you're using a maths library geared towards games, by all means let it use floats throughout. Swift is interesting because it lets you declare after-the-fact that any type implements an interface, a sort of "extension interface implementation" for existing types, but it does so at the cost of dynamic dispatch.
If we had a common interface for scalar numbers such that byte, short, ushort, int, uint, long, ulong, float, double, Complex<float>, and Complex<double> all implemented it, it would be AWESOME for scientific library developers like me. You might have separate ones for integer types vs. floating-point types, but if you did, I'd want one common to both as well. In my view, you would need to support the following:
  • Type descriptions: min, max, smallest_difference (i.e. smallest difference between two values -- 1 for integers, ~1e-X for floats)
  • Require implementation for all the common arithmetic operations and their associated operators: Add, +, Sub, -, Mult, *, Div, /, Mod, %, Abs
Notice that it is important to bring complex numbers into the mix, as they show up a lot in machine learning and signal processing applications.
Nov 18, 2014 at 12:46 AM
DrBrianWomack wrote:
Notice that it is important to bring complex numbers into the mix, as they show up a lot in machine learning and signal processing applications.
And also in videogames: quaternions.
Nov 18, 2014 at 2:17 PM
Edited Nov 18, 2014 at 2:18 PM
lwischik wrote:
(3) Generate fast code. This includes SIMD, which has already been added.
There is no ARM NEON support at the moment, which is kind of a must for the modern mobile world :).

Project use case: making Farseer, Chipmunk.NET and others even faster thus delivering a better user experience.
Oct 8, 2015 at 8:22 PM
This branch could really help to bring PrimitiveValueType base class to life: http://xoofx.com/blog/2015/09/27/struct-inheritance-in-csharp-with-roslyn-and-coreclr/
Oct 9, 2015 at 4:06 AM