String Interpolation for C# (v2)

Topics: C# Language Design, VB Language Design
Oct 25, 2014 at 4:30 PM

An interpolated string is a way to construct a value of type String (or IFormattable) by writing the text of the string along with expressions that will fill in "holes" in the string. The compiler constructs a format string and a sequence of fill-in values from the interpolated string.

When it is treated as a value of type String, it is a shorthand for an invocation of
String.Format(string format, params object[] args)
When it is converted to the type IFormattable, the result of the string interpolation is an object that stores a compiler-constructed format string along with an array storing the evaluated expressions. The object's implementation of
IFormattable.ToString(string format, IFormatProvider formatProvider)
is an invocation of
String.Format(IFormatProvider provider, String format, params object[] args)
By taking advantage of the conversion from an interpolated string expression to IFormattable, the user can cause the formatting to take place later in a selected locale. See the section System.Runtime.CompilerServices.FormattedString for details.
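
To make the point about locale concrete, here is a minimal sketch that calls String.Format directly with two different providers. The class name DeferredFormattingSketch and the CommaDecimals provider are made up for illustration; they are not part of the proposal. The same format string and argument produce different text depending on the provider supplied at formatting time.

```csharp
using System;
using System.Globalization;

public static class DeferredFormattingSketch
{
    // Hypothetical stand-in for a locale that writes decimals with a comma.
    private static readonly NumberFormatInfo CommaDecimals =
        new NumberFormatInfo { NumberDecimalSeparator = "," };

    // Formats the value using the invariant culture.
    public static string Invariant(double value)
    {
        return string.Format(CultureInfo.InvariantCulture, "value = {0}", value);
    }

    // Formats the same value with the same format string, but with the
    // comma-decimal provider.
    public static string WithCommaDecimals(double value)
    {
        return string.Format(CommaDecimals, "value = {0}", value);
    }
}
```

With an argument of 1.5, Invariant yields "value = 1.5" and WithCommaDecimals yields "value = 1,5".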

Note: the converted interpolated string has precisely the same number of "holes" in the format string as there are expression holes in the interpolated string. Consequently the syntax for an interpolated string makes it an error to express certain strings that would be ambiguous in a composite format string. See the section "Escaping Braces" in Composite Formatting.

Lexical Grammar

An interpolated string is treated initially as a token with the following lexical grammar:
interpolated-string:
    $ " "
    $ " interpolated-string-literal-characters "

interpolated-string-literal-characters:
    interpolated-string-literal-part interpolated-string-literal-characters
    interpolated-string-literal-part

interpolated-string-literal-part:
    single-interpolated-string-literal-character
    interpolated-string-escape-sequence
    interpolation

single-interpolated-string-literal-character:
    Any character except " (U+0022), { (U+007B), and } (U+007D)

interpolated-string-escape-sequence:
    ""
    {{
    }}

interpolation:
    { interpolation-contents }

interpolation-contents:
    balanced-text
    balanced-text : interpolation-format

balanced-text:
    balanced-text-part
    balanced-text-part balanced-text

balanced-text-part:
    Any character except @, ", $, (, [, and {
    verbatim-string-literal
    @ identifier-or-keyword
    regular-string-literal
    interpolated-string
    ( balanced-text )
    [ balanced-text ]
    { balanced-text }
    delimited-comment
    single-line-comment

interpolation-format:
    literal-interpolation-format

literal-interpolation-format:
    interpolation-format-part
    interpolation-format-part literal-interpolation-format

interpolation-format-part:
    Any character except ", :, {, }, and new-line-character
It is an error if the character following an interpolation is }.

This lexical grammar is ambiguous in that it allows a colon appearing in interpolation-contents to be considered part of the balanced-text, or as the separator between the balanced-text and the interpolation-format. This ambiguity is resolved by considering it to be a separator between the balanced-text and interpolation-format.

Syntactic Grammar

An interpolated-string token is reclassified, and portions of it are reprocessed lexically and syntactically, during syntactic analysis as follows:
  • If the interpolated-string contains no interpolation, then it is reclassified as an interpolated-string-whole.
  • Otherwise
    • the portion of the interpolated-string before the first interpolation is reclassified as an interpolated-string-start terminal;
    • the portion of the interpolated-string after the last interpolation is reclassified as an interpolated-string-end terminal;
    • the portion of the interpolated-string between one interpolation and another interpolation is reclassified as an interpolated-string-mid terminal;
    • the balanced-text of each interpolation-contents is reprocessed according to the language's lexical grammar, yielding a sequence of terminals;
    • the colon in each interpolation-contents that contains an interpolation-format is classified as a colon terminal;
    • each interpolation-format is reclassified as a regular-string-literal terminal; and
    • the resulting sequence of terminals undergoes syntactic analysis as an interpolated-string-expression.
expression:
    interpolated-string-expression

interpolated-string-expression:
    interpolated-string-whole
    interpolated-string-start interpolations interpolated-string-end

interpolations:
    single-interpolation
    single-interpolation interpolated-string-mid interpolations

single-interpolation:
    interpolation-start
    interpolation-start : regular-string-literal

interpolation-start:
    expression
    expression , expression

Semantics

An interpolated-string-expression has type string, but there is an implicit conversion from expression from an interpolated-string-expression to the types System.IFormattable and System.Runtime.CompilerServices.FormattedString. By the existing rules of the language (7.5.3.3 Better conversion from expression, bullet 1), the conversion to string is a better conversion from expression than either of these types.

An interpolated-string-expression is translated into an intermediate format string and object array which capture the contents of the interpolated string using the semantics of Composite Formatting. If treated as a value of type string, the formatting is performed using string.Format(string format, params object[] args) or equivalent code. If it is converted to System.IFormattable or System.Runtime.CompilerServices.FormattedString, an object of type System.Runtime.CompilerServices.FormattedString is constructed using the format string and argument array, and that object is the value of the interpolated-string-expression.

The format string is constructed of the literal portions of the interpolated-string-start, interpolated-string-mid, and interpolated-string-end portions of the expression. Each interpolated-string-escape-sequence is replaced by a single copy of the doubled character.
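
As a rough illustration of that construction, the following sketch stitches literal pieces and numbered format items into a composite format string. FormatStringSketch.Build is a hypothetical helper, not something the compiler or framework provides, and it assumes the literal pieces contain no brace characters so that the escaping question does not arise.

```csharp
using System;
using System.Text;

public static class FormatStringSketch
{
    // literals holds the start, mid and end pieces, so it has one more
    // element than holeFormats. holeFormats[i] is null when the i-th
    // interpolation has no format part.
    public static string Build(string[] literals, string[] holeFormats)
    {
        if (literals.Length != holeFormats.Length + 1)
            throw new ArgumentException("expected one more literal than holes");

        var sb = new StringBuilder();
        for (int i = 0; i < literals.Length; i++)
        {
            // Assumption of this sketch: braces in literal text are not handled.
            if (literals[i].IndexOfAny(new[] { '{', '}' }) >= 0)
                throw new ArgumentException("braces in literals are out of scope");

            sb.Append(literals[i]);
            if (i < holeFormats.Length)
            {
                // Emit the format item {i} or {i:format}.
                sb.Append('{').Append(i);
                if (holeFormats[i] != null)
                    sb.Append(':').Append(holeFormats[i]);
                sb.Append('}');
            }
        }
        return sb.ToString();
    }
}
```

For the pieces "Name = ", ", hours = " and "" with formats null and "hh", Build returns "Name = {0}, hours = {1:hh}".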

The evaluation order needs to be specified.

The definite assignment rules need to be specified.

single-interpolation Semantics

This section should describe in detail the construction of a format item from a single-interpolation, and the corresponding element of the object array.

If an interpolation-start has a comma and a second expression, the second expression must evaluate to a compile-time constant of type int, which is used as the alignment of a format item.
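
For reference, this is how the alignment component behaves in an ordinary composite format string. The sketch calls String.Format directly; the class and method names are illustrative only.

```csharp
using System;

public static class AlignmentSketch
{
    // Positive alignment right-aligns the value in a field of that width.
    public static string RightAligned(int value)
    {
        return string.Format("[{0,5}]", value);
    }

    // Negative alignment left-aligns the value.
    public static string LeftAligned(int value)
    {
        return string.Format("[{0,-5}]", value);
    }

    // Alignment can be combined with a format specifier.
    public static string AlignedAndFormatted(int value)
    {
        return string.Format("[{0,6:D4}]", value);
    }
}
```

For the value 42 these return "[   42]", "[42   ]" and "[  0042]" respectively.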

Notes

The compiler is free to use any overload of String.Format in the translated code, as long as doing so preserves the semantics of calling string.Format(string format, params object[] args).

Examples

The interpolated string
$"{hello}, {world}!"
is translated to
String.Format("{0}, {1}!", hello, world)
The interpolated string
$"Name = {myName}, hours = {DateTime.Now:hh}"
is translated to
String.Format("Name = {0}, hours = {1:hh}", myName, DateTime.Now)
The interpolated string
$"{{{6234:D}}}"
is an error, because "It is an error if the character following an interpolation is }."

If you want to format something in the invariant locale, you can do so using this helper method
public static string INV(IFormattable formattable)
{
    return formattable.ToString(null, System.Globalization.CultureInfo.InvariantCulture);
}
and writing your interpolated strings this way
   string coordinates = INV("longitude={longitude}; latitude={latitude}");

System.Runtime.CompilerServices.FormattedString

The following platform class is used to translate an interpolated string to the type System.IFormattable.
namespace System.Runtime.CompilerServices 
{ 
    public struct FormattedString : System.IFormattable 
    {
        private readonly String format;
        private readonly object[] args;
        public FormattedString(String format, params object[] args)
        {
            this.format = format;
            this.args = args;
        }
        public String Format => this.format;
        public object[] Args => this.args;
        string IFormattable.ToString(string ignored, IFormatProvider formatProvider)
        {
            return String.Format(formatProvider, format, args);
        }
    } 
} 
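
A sketch of how this could fit together with the INV helper from the Examples section. FormattedStringSketch is a local copy of the proposed type under a different name, so that nothing here implies the type already exists in the framework, and Coordinates shows by hand what the compiler might emit for the interpolated string. Both names are assumptions of this example.

```csharp
using System;
using System.Globalization;

// Local stand-in for the proposed platform type.
public struct FormattedStringSketch : IFormattable
{
    private readonly string format;
    private readonly object[] args;

    public FormattedStringSketch(string format, params object[] args)
    {
        this.format = format;
        this.args = args;
    }

    public string Format { get { return format; } }
    public object[] Args { get { return args; } }

    // The format argument is ignored, as in the proposal.
    string IFormattable.ToString(string ignored, IFormatProvider formatProvider)
    {
        return string.Format(formatProvider, format, args);
    }
}

public static class InvariantSketch
{
    // Same helper as in the Examples section.
    public static string INV(IFormattable formattable)
    {
        return formattable.ToString(null, CultureInfo.InvariantCulture);
    }

    // Hand-written equivalent of what the compiler might produce for
    // INV($"longitude={longitude}; latitude={latitude}").
    public static string Coordinates(double longitude, double latitude)
    {
        return INV(new FormattedStringSketch(
            "longitude={0}; latitude={1}", longitude, latitude));
    }
}
```

Calling Coordinates(1.5, -2.25) gives "longitude=1.5; latitude=-2.25" regardless of the current culture, because the formatting is deferred until INV supplies the invariant culture.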

Changes since v1

  1. Interpolated strings are now more like verbatim strings than ordinary strings:
    • They may occupy multiple lines and contain newlines
    • The only characters that require escaping are ", {, and } which are escaped by doubling
    • The number of fill-ins in the format string is guaranteed to be the same as the number of interpolations
  2. System.Runtime.CompilerServices.FormattedString changes:
    • FormattedString is now a struct type.
    • There is a conversion from an interpolated string expression to FormattedString
    • Added properties Format and Args to access the compiler-generated composite format string and fill-in values
Marked as answer by nmgafter on 10/25/2014 at 8:30 AM
Oct 25, 2014 at 4:31 PM
This draft reflects some proposed changes. It is not yet the consensus of the language design groups that we should make such changes. I've written the spec revision so that we can discuss the issues in the context of a concrete proposal.
Marked as answer by nmgafter on 10/25/2014 at 8:31 AM
Oct 25, 2014 at 6:26 PM
That's funny, I suggested this exact syntax about a week or so ago as an addition to the escape-sequence notation. The idea at the time was that the $ prefix would indicate an interpolated verbatim string whereas \{x} would be how you express interpolation within a normal string.

Any consideration for handling both?
// produces identical strings
var s1 = "The file location is:  \"\\\\\{server}\\\{path}\\\{file}.\{ext}\" {\{size} bytes}";
var s2 = $"The file location is:  ""\\{server}\{path}\{file}.{ext}"" {{{size} bytes}}";
Of course only supporting the prefixed flavor in VB.NET makes sense since that language has no history of escape sequences and has always used doubling-up as a form of escaping specific characters.
Oct 26, 2014 at 4:38 PM
nmgafter wrote:
An interpolated-string-expression has type string, but there is an implicit conversion from expression from an interpolated-string-expression to the types System.IFormattable and System.Runtime.CompilerServices.FormattedString. By the existing rules of the language (7.5.3.3 Better conversion from expression, bullet 1), the conversion to string is a better conversion from expression than either of these types.
Does this guarantee that var s = $"{hello}, {world}!"; is not ambiguous between string and System.Runtime.CompilerServices.FormattedString as type for s?

(If yes, the approach like this should perhaps be considered for lambdas as well, as for now lambda expression's type is ambiguous between Func<...> and Expression<Func<...>>.)
Oct 26, 2014 at 7:56 PM
Looks promising, though I have some questions about some corner cases.

Can I do something like this?
"{hello}, {world}!".ToString(null, CultureInfo.InvariantCulture)

"{hello}, {world}!".ToInvariant()
Can the compiler pick up extension methods for both String and IFormattable? Would Intellisense pick up both? If there are multiple matches which would be picked: string, IFormattable or FormattedString?

nmgafter wrote:
It is an error if the character following an interpolation is }.
So that means if (for whatever reason) I have to add a } directly after the hole I would write:
$"{{{6234:D}{"}"}"    // results in "{6234}"
and if I want a } in the format specifier (again purely academic) I would write:
$"{6234.ToString("{0.00}")}"    // results in "{6234.00}"
Would it cause problems if in $"{{{6234:D}}}" the first closing brace is always assumed to be the end of the interpolation? Or was this restriction (no } after the interpolation end) put into place to prevent accidents when migrating from String.Format and all its ambiguities?


The following expression
$"{{w}}"
can be both
interpolated-string-escape-sequence single-interpolated-string-literal-character interpolated-string-escape-sequence
and
interpolation
=> { interpolation-contents }
=> { balanced-text }
=> { balanced-text-part }
=> { { balanced-text } }
Maybe the { balanced-text } rule should not be allowed on the first balanced-text-part in balanced-text, only on subsequent ones.
Oct 26, 2014 at 11:30 PM
Why the change from regular strings to verbatim strings?
Oct 27, 2014 at 3:02 AM
I strongly suggest reserving $ for those who want to extend C# syntax personally. Use @@ instead (@@"...").
Oct 27, 2014 at 11:18 AM
Knat wrote:
I strongly suggest reserving $ for those who want to extend C# syntax personally. Use @@ instead (@@"...").
Why? What is the background of your suggestion? Is there any advantage in the proposed change?
Oct 27, 2014 at 1:33 PM
Edited Oct 27, 2014 at 7:44 PM
BachratyGergely wrote:
Can I do something like this?
"{hello}, {world}!".ToString(null, CultureInfo.InvariantCulture)

"{hello}, {world}!".ToInvariant()
No. Implicit conversions are never considered for extension methods.
(Just as Extension(this long x) will not work for 1.Extension())
Can the compiler pick up extension methods for both String and IFormattable? Would Intellisense pick up both? If there are multiple matches which would be picked: string, IFormattable or FormattedString?
No, the spec says the result is always string but with implicit conversions to IFormattable and FormattedString.
If a string overload exists it will be used (after the call to string.Format()).

Don't know what happens if you have no string overload but both Func(IFormattable) and Func(FormattedString).
In normal code you are not allowed to define implicit conversions to an interface so there is some compiler magic going on.

My guess is that the rule will be
  1. MyFunc(string)
  2. MyFunc(FormattedString)
  3. MyFunc(IFormattable)
I would prefer the result to be a FormattedString with an implicit conversion to string (and IFormattable of course).

Unfortunately that might confuse developers who do
    var str = $"{x} {y}";
    var foo = str.Length; // or some other string member
or
    var x = new MutableReferenceTypeWithToString();
    var str = $"{x}";
    x.y = 20;
    Console.WriteLine(str);
(changing var to string solves both)

Not sure why that is a big issue. Personally I never have code like
var str = string.Format(...);
unless I intend to use the result in a method call directly after.
i.e. as a more readable version of
    MyMethod(abc, string.Format(...), def);
Oct 27, 2014 at 6:32 PM
Edited Oct 27, 2014 at 6:34 PM
PauloMorgado wrote:
Why the change from regular strings to verbatim strings?
From the section

Changes since v1

  1. Interpolated strings are now more like verbatim strings than ordinary strings:
    • They may occupy multiple lines and contain newlines. Useful because they may occupy multiple lines and contain newlines.
    • The only characters that require escaping are ", {, and } which are escaped by doubling. Useful because \ is a common use case, and it makes the interpolated holes easier to read.
    • The number of fill-ins in the format string is guaranteed to be the same as the number of interpolations. Useful because we believe it will be expected.
  2. System.Runtime.CompilerServices.FormattedString changes:
    • FormattedString is now a struct type. Useful because you can avoid an object creation if you use this as a target type.
    • There is a conversion from an interpolated string expression to FormattedString. Ditto.
    • Added properties Format and Args to access the compiler-generated composite format string and fill-in values. We believe they will be useful to some clients, for example for localization.
Oct 27, 2014 at 11:51 PM
nmgafter wrote:
PauloMorgado wrote:
Why the change from regular strings to verbatim strings?
From the section

Changes since v1

  1. Interpolated strings are now more like verbatim strings than ordinary strings:
    • They may occupy multiple lines and contain newlines. Useful because they may occupy multiple lines and contain newlines.
    • The only characters that require escaping are ", {, and } which are escaped by doubling. Useful because \ is a common use case, and it makes the interpolated holes easier to read.
    • The number of fill-ins in the format string is guaranteed to be the same as the number of interpolations. Useful because we believe it will be expected.
  2. System.Runtime.CompilerServices.FormattedString changes:
    • FormattedString is now a struct type. Useful because you can avoid an object creation if you use this as a target type.
    • There is a conversion from an interpolated string expression to FormattedString. Ditto.
    • Added properties Format and Args to access the compiler-generated composite format string and fill-in values. We believe they will be useful to some clients, for example for localization.
Yes. But, why was that decision made?
Oct 28, 2014 at 12:12 AM
PauloMorgado wrote:
Yes. But, why was that decision made?
Read message two in this thread.
Oct 28, 2014 at 1:27 AM
nmgafter wrote:
PauloMorgado wrote:
Yes. But, why was that decision made?
Read message two in this thread.
This one?

nmgafter wrote:
This draft reflects some proposed changes. It is not yet the consensus of the language design groups that we should make such changes. I've written the spec revision so that we can discuss the issues in the context of a concrete proposal.
So, the change was to move away from consensus? :)

My initial take on this was that a prefix should be used. It would allow combining it with the verbatim one (@).

But some emphasis was that this should be used not for final user texts but for programmatic strings (like keys). And for that, verbatim didn't make much sense and escaped holes started making more sense.

But then I started to move away from escaped holes.

Is the verbatim decision to allow escaping holes with double curly brackets?

Would you consider having the $ modifier applied to regular strings ($"...") and verbatim strings ($@"...")?
Oct 28, 2014 at 2:05 AM
PauloMorgado wrote:
nmgafter wrote:
PauloMorgado wrote:
Yes. But, why was that decision made?
Read message two in this thread.
This one?

nmgafter wrote:
This draft reflects some proposed changes. It is not yet the consensus of the language design groups that we should make such changes. I've written the spec revision so that we can discuss the issues in the context of a concrete proposal.
So, the change was to move away from consensus? :)
This is not a "change", it is a proposal. The other proposed specs remain the same. There is not yet consensus on any particular specification.
Is the verbatim decision to allow escaping holes with double curly brackets?
This doesn't reflect a decision. It reflects a proposal. In this proposal if you want a { character in the string you write {{ inside an interpolated string. That is the same convention used in String.Format.
Would you consider having the $ modifier applied to regular strings ($"...") and verbatim strings ($@"...")?
Yes, we will discuss that possibility.
Oct 28, 2014 at 2:08 AM
nmgafter wrote:
PauloMorgado wrote:
Would you consider having the $ modifier applied to regular strings ($"...") and verbatim strings ($@"...")?
Yes, we will discuss that possibility.
I think that would make most everyone happy.
Oct 28, 2014 at 2:11 AM
How do you imagine escaping { should work in a $"..." string? How should it work in a $@"..." string? Should it be doubled up for both of them?
Oct 28, 2014 at 2:31 AM
nmgafter wrote:
How do you imagine escaping { should work in a $"..." string? How should it work in a $@"..." string? Should it be doubled up for both of them?
I wouldn't change anything on this version of the spec to cover $@"...". The verbatim-ish way of escaping double quotes was kept from verbatim strings and extended to curly braces.

For $"..." I would keep the way things are escaped: the \ in regular strings (\n, \x20, \u0020). It's not the way string.Format escapes its holes, but that's not C#, that's string.Format. One might feel tempted to escape the holes the same way string.Format does, but there's no need for that because they are different things.
Oct 28, 2014 at 2:35 AM
PauloMorgado wrote:
nmgafter wrote:
How do you imagine escaping { should work in a $"..." string? How should it work in a $@"..." string? Should it be doubled up for both of them?
I wouldn't change anything on this version of the spec to cover $@"...". The verbatim-ish way of escaping double quotes was kept from verbatim strings and extended to curly braces.

For $"..." I would keep the way things are escaped: the \ in regular strings (\n, \x20, \u0020). It's not the way string.Format escapes its holes, but that's not C#, that's string.Format. One might feel tempted to escape the holes the same way string.Format does, but there's no need for that because they are different things.
OK, then, for verbatim interpolated strings we guarantee the same number of fill-ins as interpolations, and for non-verbatim interpolated strings we allow the compiler to insert its own interpolations in the result?
Oct 28, 2014 at 2:39 AM
nmgafter wrote:
PauloMorgado wrote:
nmgafter wrote:
How do you imagine escaping { should work in a $"..." string? How should it work in a $@"..." string? Should it be doubled up for both of them?
I wouldn't change anything on this version of the spec to cover $@"...". The verbatim-ish way of escaping double quotes was kept from verbatim strings and extended to curly braces.

For $"..." I would keep the way things are escaped: the \ in regular strings (\n, \x20, \u0020). It's not the way string.Format escapes its holes, but that's not C#, that's string.Format. One might feel tempted to escape the holes the same way string.Format does, but there's no need for that because they are different things.
OK, then, for verbatim interpolated strings we guarantee the same number of fill-ins as interpolations, and for non-verbatim interpolated strings we allow the compiler to insert its own interpolations in the result?
Now you lost me!

I just meant that, for regular strings, you escape {s with \{. I'm not sure if you even need to escape the }s. If the hole is closed, then the closing curly brace is just a closing curly brace.
Oct 28, 2014 at 5:52 AM
PauloMorgado wrote:
I just meant that, for regular strings, you escape {s with \{. I'm not sure if you even need to escape the }s. If the hole is closed, then the closing curly brace is just a closing curly brace.
Unfortunately that doesn't work well with String.Format. What code would be generated by $"\{{6234:D}}" ? Before you send your response, please try it to make sure it produces "{6234}".

See the section "Escaping Braces" in Composite Formatting if this seems confusing.
Oct 28, 2014 at 10:23 AM
nmgafter wrote:
PauloMorgado wrote:
I just meant that, for regular strings, you escape {s with \{. I'm not sure if you even need to escape the }s. If the hole is closed, then the closing curly brace is just a closing curly brace.
Unfortunately that doesn't work well with String.Format. What code would be generated by $"\{{6234:D}}" ? Before you send your response, please try it to make sure it produces "{6234}".

See the section "Escaping Braces" in Composite Formatting if this seems confusing.
It can't be done. And that proves that the fact that I was not in bed at the time I made that statement doesn't mean I wasn't already asleep. :)

The truth is that I was not aware of the fact that string.Format worked that way. I had given a quick look at that part of the spec but, because I never needed to use it, I didn't know it was broken in that way. It has to be broken. There's no other way to reason about it.

You (and anyone that has read that spec) know that the generated code must be:
string output = string.Format("{0}{1:D}{2}", new object[] {"{", 6234, "}"});

Now, the easy way would be to escape the closing curly brace: $"\{{6234:D}\}". But you (Roslyn) are doing that and it depends on how you handle the holes. If you handle the holes by tracking whether you're inside one or not, when you find a } outside the hole, you know that you have to handle it.
Oct 28, 2014 at 12:04 PM
And this is primarily why I like the syntax I mentioned above (which is really just a combination of this proposal with a previous proposal). It would provide interpolation in both normal C# strings and in verbatim C# strings with syntax that is consistent in both cases.
// produces identical strings
var s1 = "The file location is:  \"\\\\\{server}\\\{path}\\\{file}.\{ext}\" {\{size} bytes}";
var s2 = $"The file location is:  ""\\{server}\{path}\{file}.{ext}"" {{{size} bytes}}";
This does have the one disadvantage of not being able to determine if the string contains interpolation at a glance that the prefix provides, but since the C# escaped strings will probably tend to be shorter (due to not being able to contain embedded newlines) I don't think that would pose a problem.
Oct 28, 2014 at 1:52 PM
Edited Oct 28, 2014 at 1:53 PM
I think change 2 is fine, but I'm not a fan of change 1, parts 1 and 2. I'd prefer the normal escaping syntax + backslash-escaping for curly brackets that was proposed earlier for the following reasons:
  • I find doubling up characters is harder to read than backslash escaping.
  • Having all the normal escape sequences in a string is useful (newlines, tabs, etc).
  • A language feature shouldn't be tied to the specific .NET framework implementation detail of String.Format.
  • As mentioned in this discussion, the double curly braces method of escaping in String.Format is fundamentally broken. The compiler should use normal escaping syntax and be smart enough to work around this limitation of String.Format.
If you want verbatim strings to allow interpolation, then make an interpolated verbatim string in addition to the regular string interpolation: $@"{foo}".
Oct 28, 2014 at 10:28 PM
If you want to see how string.Format works, have a look at my S.F.D. project on CodePlex. It started as a direct reimplementation of string.Format that was modified to include diagnostics.

The hole's arg id (index \ identifier) and alignment are handled by string.Format.
The format arg is passed to the IFormatProvider of the type referred to in the arghole.

Note string.Format only reports the first problem in the string. S.F.D. reports all of them, including the format arg of the common core types.

Why can't it be supported via library extension methods?

String String.InterpolateWith(this string interpolationText, params Object[] args)

Same grammar used as String.Format except the Arghole.Identifier is an identifier.
Errors and issues can be reported back via diagnostic analysers.

Isn't that the point of Roslyn?
Oct 29, 2014 at 12:33 PM
Will the compiler optimize boxings?
Oct 29, 2014 at 5:32 PM
Edited Oct 29, 2014 at 6:12 PM
I might be putting stick into a fire now...but...

why isn't it done like that?

string helloWorld = "dadadudasdasd";

$"AHahahah that is {helloWorld} hehehaha" // rewrite this into
"AHahahah that is " + helloWorld.ToString() + " hehehaha";
or string.Concat("AHahahah that is ", helloWorld.ToString(), " hehehaha")

that way also opens up such scenarios:

$"AHahahah that is {helloWorld.ToUpper()} hehehaha"
$"AHahahah that is {x+y} hehehaha"

It might be statically typed that way.
Oct 29, 2014 at 5:41 PM
arekbal wrote:
I might be putting stick into a fire now...but...

why isn't it done like that?

string helloWorld = "dadadudasdasd";

$"AHahahah that is {helloWorld} hehehaha" // rewrite this into
"AHahahah that is " + helloWorld.ToString() + " hehehaha";
or string.Concat("AHahahah that is ", helloWorld.ToString(), " hehehaha")

that way also opens up such scenarios:

$"AHahahah that is {helloWorld.ToUpper()} hehehaha"
$"AHahahah that is {x+y} hehehaha"

It might be statically typed that way.
It was mentioned, I think during the previous proposal, that optimizations for these cases could be handled, such as when all of the holes contain string expressions, but those optimizations would still behave identically as if the interpolation were interpreted as a call to String.Format instead. Using String.Format allows for the use of format specifiers and also enables this functionality of resolving to an IFormattable or FormattedString.
Oct 29, 2014 at 5:52 PM
arekbal wrote:
I might be putting stick into a fire now...but...
The .ToString() method of many types uses the current culture for formatting. By providing a way to target-type IFormattable, we enable the programmer to select the culture in which the data is formatted. For your example it doesn't make a difference.
Oct 29, 2014 at 6:40 PM
You guys are right about using String.Format. I am aware of ToString overloads and formats. I used ToString as a starting point.

My concern here is only about the whole string interpolation thingy not being aware of params and their validity during compilation.

If you can make it work such that it blows up during compilation because the code in brackets is an invalid expression (not even mentioning string formatters) then I would be happy.

Cheers.
Oct 29, 2014 at 6:44 PM
arekbal wrote:
You guys are right about using String.Format. I am aware of ToString overloads and formats. I used ToString as a starting point.

My concern here is only about the whole string interpolation thingy not being aware of params and their validity during compilation.

If you can make it work such that it blows up during compilation because the code in brackets is an invalid expression (not even mentioning string formatters) then I would be happy.

Cheers.
The expressions would be validated at compile-time. The compiler would translate them into String.Format calls and the expressions turned into actual code.
var person = new Person();
string s = $"Hello {person.FristName}!";
would be converted to:
var person = new Person();
string s = string.Format("Hello {0}!", person.FristName); // compiler error, FristName does not exist!
This is done behind the scenes and the compiler would report that the compile-error is occurring within the expression embedded within the string.
Oct 29, 2014 at 6:49 PM
Thanks for clarification!!!
Nov 10, 2014 at 1:17 AM
madrian wrote:
No. Implicit conversions are never considered for extension methods.
(Just as Extension(this long x) will not work for 1.Extension())
 
Indeed, but it's not exactly the same. If I understand the proposal correctly, an interpolation expression doesn't have a type of its own; depending on context, it will be either string or IFormattable (similar to lambda expressions, which have no type of their own without context). So it's not really an implicit conversion, since the expression doesn't have a type yet.

Anyway, it would be really nice if we could use extension methods on interpolation expressions, because of this:
string coordinates = INV($"longitude={longitude}; latitude={latitude}");
(from Neal's example; I added the '$', which I assume he had just forgotten)
Typically, INV will actually be a method in a utility class, since it will be used in many places. So the code will actually look like this:
string coordinates = StringUtils.Invariant($"longitude={longitude}; latitude={latitude}");
It would be much better if we could write it like this:
string coordinates = $"longitude={longitude}; latitude={latitude}".Invariant();
Nov 10, 2014 at 10:07 PM
tom103 wrote:
madrian wrote:
No. Implicit conversions are never considered for extension methods.
(Just as Extension(this long x) will not work for 1.Extension())
 
Indeed, but it's not exactly the same. If I understand the proposal correctly, an interpolation expression doesn't have a type of its own
quoting the spec...

Semantics

An interpolated-string-expression has type string, but there is an implicit conversion from expression from an interpolated-string-expression to the types System.IFormattable and System.Runtime.CompilerServices.FormattedString. By the existing rules of the language (7.5.3.3 Better conversion from expression, bullet 1), the conversion to string is a better conversion from expression than either of these types.
Nov 11, 2014 at 12:03 AM
nmgafter wrote:
tom103 wrote:
madrian wrote:
No. Implicit conversions are never considered for extension methods.
(Just as Extension(this long x) will not work for 1.Extension())
 
Indeed, but it's not exactly the same. If I understand the proposal correctly, an interpolation expression doesn't have a type of its own
quoting the spec...

Semantics

An interpolated-string-expression has type string, but there is an implicit conversion from expression from an interpolated-string-expression to the types System.IFormattable and System.Runtime.CompilerServices.FormattedString. By the existing rules of the language (7.5.3.3 Better conversion from expression, bullet 1), the conversion to string is a better conversion from expression than either of these types.
OK, so I guess no extension methods for interpolated strings... too bad.
Nov 11, 2014 at 5:24 PM
@tom103 You can sort of think of an interpolated string as inheriting from string.
Nov 11, 2014 at 5:37 PM
AdamSpeight2008 wrote:
@tom103 You can sort of think of an interpolated string as inheriting from string.
Yes, so extension methods for String will work directly on interpolated strings, but extension methods for IFormattable will not. That's what I meant when I said "no extension methods for interpolated string", but my wording was inaccurate ;)
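A minimal sketch of the distinction drawn above, assuming the feature as it later shipped in C# 6. StringUtils, Invariant and Shout are hypothetical names used only for illustration. An ordinary static method that takes IFormattable receives the interpolated string through the implicit conversion, while an extension method is only found when it is declared on string.

```csharp
using System;
using System.Globalization;

static class StringUtils
{
    // Hypothetical helper: a plain static method, so the interpolated string
    // argument is converted to IFormattable and formatted with the invariant culture.
    public static string Invariant(IFormattable text) =>
        text.ToString(null, CultureInfo.InvariantCulture);

    // Hypothetical extension method on string: usable directly on an
    // interpolated string, because the expression has type string.
    public static string Shout(this string text) => text.ToUpperInvariant() + "!";
}

static class Program
{
    static void Main()
    {
        double longitude = 2.5, latitude = 48.25;

        // Conversion to IFormattable applies to a normal method argument.
        Console.WriteLine(StringUtils.Invariant($"longitude={longitude}; latitude={latitude}"));

        // Extension method on string: the string is already formatted
        // (with the current culture) before Shout runs.
        Console.WriteLine($"longitude={longitude}".Shout());
    }
}
```

An extension method declared on IFormattable would not be considered for the second call, which is why the utility-class form is needed for culture-aware helpers.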
Nov 20, 2014 at 8:41 PM
It's not clear to me if this replaces v1 or is an alternative to v1, but after reading the three discussions I will join the group that supports both { and $" for C# interpolated strings, though I think I would prefer having both $" and $@" so that non-verbatim interpolated strings are accessible in two ways.
Nov 20, 2014 at 11:18 PM
NetMage wrote:
It's not clear to me if this replaces v1 or is an alternative to v1, but after reading the three discussions I will join the group that supports both { and $" for C# interpolated strings, though I think I would prefer having both $" and $@" so that non-verbatim interpolated strings are accessible in two ways.
I'm working on v3 now, which supports both interpolated "ordinary" strings like

$"Hello, \"{name}\"\n"

and interpolated verbatim strings

$@"Hello, ""{name}""
"

Nov 21, 2014 at 12:58 AM
nmgafter wrote:
NetMage wrote:
It's not clear to me if this replaces v1 or is an alternative to v1, but after reading the three discussions I will join the group that supports both { and $" for C# interpolated strings, though I think I would prefer having both $" and $@" so that non-verbatim interpolated strings are accessible in two ways.
I'm working on v3 now, which supports both interpolated "ordinary" strings like

$"Hello, \"{name}\"\n"

and interpolated verbatim strings

$@"Hello, ""{name}""
"

Neat. I assume that in the first case you'd have no need to escape { or } but in the latter case you'd double-em up? Good to see escape sequences coming back.

Any particular reasoning for requiring prefixes in both cases? Just to immediately call attention to the fact that it's an interpolated string? You guys should also get the Visual Studio guys to add a separate font/color display item for interpolated strings also, just like they do for verbatim strings.
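For comparison, a small sketch of how the two forms behave in the feature as it eventually shipped in C# 6, which may differ from the v3 draft being discussed here. In the shipped language, literal braces are written by doubling them in both the regular and the verbatim form. The method names are hypothetical.

```csharp
using System;

static class Program
{
    // Regular interpolated string: backslash escapes for quotes, doubled braces for literal braces.
    public static string Regular(string name) => $"Hello, \"{name}\" {{literal}}";

    // Verbatim interpolated string: doubled quotes for quotes, doubled braces for literal braces.
    public static string Verbatim(string name) => $@"Hello, ""{name}"" {{literal}}";

    static void Main()
    {
        Console.WriteLine(Regular("World"));  // Hello, "World" {literal}
        Console.WriteLine(Verbatim("World")); // Hello, "World" {literal}
    }
}
```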
Nov 21, 2014 at 4:33 AM
That sounds good, but I think I wasn't clear - I'd like to see the { in non-prefixed strings supported as well - a little Perl TMTOWTDI.
Nov 21, 2014 at 7:30 AM
NetMage wrote:
That sounds good, but I think I wasn't clear - I'd like to see the { in non-prefixed strings supported as well - a little Perl TMTOWTDI.
Any particular reason why?
Nov 21, 2014 at 10:13 AM
In the VS2015 preview, am I correct that System.Runtime.CompilerServices.FormattedString is not implemented yet?
And that the $ prefix is not included yet?
Nov 21, 2014 at 2:37 PM
nmgafter wrote:
namespace System.Runtime.CompilerServices 
{ 
    public struct FormattedString : System.IFormattable 
    {
        private readonly String format;
        private readonly object[] args;
// [skipped]
        public object[] Args => this.args;
        string IFormattable.ToString(string ignored, IFormatProvider formatProvider)
        {
            return String.Format(formatProvider, format, args);
        }
    } 
} 
Is it intended that Args is returned as a mutable collection? Wouldn't it be better to return a copy or a read-only wrapper?
Nov 21, 2014 at 4:54 PM
VladD wrote:
Is it intended that Args is returned as a mutable collection? Wouldn't it be better to return a copy or a read-only wrapper?
What would be better would be if the Framework were to define an immutable array type. Given the lack of such a thing, I would suggest having public readable properties for object this[int] and int ArgsLength as well as Object[] GetArgsArray() and void CopyToArgsArray(Object[] dest) methods. Structures should only expose properties of array types in cases where the structure is expected to identify an array rather than encapsulate its contents (not the case here), and members which return defensive copies of arrays should be methods rather than properties.
Nov 23, 2014 at 4:48 AM
Edited Nov 23, 2014 at 4:49 AM
supercat wrote:
What would be better would be if the Framework were to define an immutable array type.
But there is ImmutableArray<T>.
Nov 23, 2014 at 12:13 PM
r_keith_hill wrote:
But there is ImmutableArray<T>.
If so, I wonder why the reflection-related methods still have no high-performance counterparts with ImmutableArray as the return type.
Nov 23, 2014 at 4:57 PM
Edited Nov 23, 2014 at 4:57 PM
nmgafter wrote:
namespace System.Runtime.CompilerServices 
{ 
    public struct FormattedString : System.IFormattable 
    {
        private readonly String format;
        private readonly object[] args;
// [skipped]
        public object[] Args => this.args;
        string IFormattable.ToString(string ignored, IFormatProvider formatProvider)
        {
            return String.Format(formatProvider, format, args);
        }
    } 
} 
Why no access to the ArgHoles?
Also it would be better to cache the result of the call to String.Format.
private string _Cached_ = default(String);

string IFormattable.ToString(string ignored, IFormatProvider formatProvider)
{
  if( _Cached_ == null ) { _Cached_ = String.Format(formatProvider, format, args); }
  return _Cached_ ;
}
Nov 23, 2014 at 7:46 PM
AdamSpeight2008 wrote:
Also it would be better to cache the result of the call to String.Format.
private string _Cached_ = default(String);

string IFormattable.ToString(string ignored, IFormatProvider formatProvider)
{
  if( _Cached_ == null ) { _Cached_ = String.Format(formatProvider, format, args); }
  return _Cached_ ;
}
IFormattable f = $"Today is {DateTime.Today:dddd}"; // format the DateTime itself; the DayOfWeek enum does not accept "dddd"
Console.WriteLine(f.ToString(String.Empty, new CultureInfo("pt-PT")));
Console.WriteLine(f.ToString(String.Empty, new CultureInfo("en-US")));
Nov 23, 2014 at 8:55 PM
AdamSpeight2008 wrote:
Why no access to the ArgHoles?
I'm not sure what that means. That is what Args is.
Also it would be better to cache the result of the call to String.Format.
private string _Cached_ = default(String);

string IFormattable.ToString(string ignored, IFormatProvider formatProvider)
{
  if( _Cached_ == null ) { _Cached_ = String.Format(formatProvider, format, args); }
  return _Cached_ ;
}
It is possible for the "current locale" to be different each time you call String.Format, so we'd have to cache that too. And make the cache thread-safe. That is getting pretty heavyweight for something that started out as a small struct.
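The objection can be sketched with a stand-in type. CachedFormatted below is a hypothetical class modeled on the struct quoted above with the proposed cache added; it is not the real FormattedString. Two hand-built format providers are used so the output does not depend on installed culture data.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in for the quoted struct, with the naive cache added.
sealed class CachedFormatted : IFormattable
{
    private readonly string format;
    private readonly object[] args;
    private string cached; // not keyed by format provider: this is the flaw

    public CachedFormatted(string format, params object[] args)
    {
        this.format = format;
        this.args = args;
    }

    public string ToString(string ignored, IFormatProvider formatProvider)
    {
        if (cached == null) { cached = string.Format(formatProvider, format, args); }
        return cached;
    }
}

static class Program
{
    // Formats the same value with two different providers and returns both results.
    public static string[] FormatTwice()
    {
        var commaDecimal = new NumberFormatInfo { NumberDecimalSeparator = ",", NumberGroupSeparator = "." };
        IFormattable f = new CachedFormatted("{0:N2}", 1234.5);
        string first = f.ToString(null, commaDecimal);
        string second = f.ToString(null, CultureInfo.InvariantCulture); // served from the cache
        return new[] { first, second };
    }

    static void Main()
    {
        foreach (string s in FormatTwice()) Console.WriteLine(s);
        // Both lines print 1.234,50; without the cache the second would be 1,234.50.
    }
}
```

A correct cache would have to remember which provider produced the cached text and be safe under concurrent calls, which is the extra weight the reply refers to.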
Nov 24, 2014 at 1:05 PM
I was slightly confused; I thought this was for compile time, not run time. ArgHole = Argument Hole: in "X;= {0,-5:C3}" the ArgHole would be {0,-5:C3}.
Nov 24, 2014 at 5:27 PM
r_keith_hill wrote:
But there is ImmutableArray<T>.
There are a number of ways in which a type like ImmutableArray<T> cannot be as good as a "special" immutable-array type could be with Runtime and language support.
  1. A "special" immutable-array type could support covariance (so a "reference to immutable array of Animal" could hold a reference to an "array of Cat").
  2. Both ImmutableArray<T> and Array<T> could derive from a common ReadableArray<T> type, allowing code which needed to read from a supplied array to accept either a mutable or immutable array.
  3. A "special" immutable-array type could allow elements to be accessed using array-access instructions, rather than method calls, allowing for more efficient operation.
  4. If there were an immutable-array class type which--like Array--held the contents of the array itself, references to such arrays could be passed directly as type Object without boxing.
  5. In many cases, a normal pattern for creating an immutable array holding desired data is to populate a mutable array with the data, and produce an immutable array object with a copy of all the data from the mutable, and then destroy all reference to the mutable array. This could be made much more efficient with compiler/verifier/runtime support for an ImmutableArrayBuilder<T> structure which could not be assigned, passed by value, stored in a field, or boxed, but had a "special" ability to write the encapsulated array. Those restrictions would make it safe for an ImmutableArrayBuilder<T> to support a Freeze method which would read and invalidate the array reference stored therein, and then return the read reference.
At present, if code wants to create an instance of a generic, arbitrary-sized immutable-array type which doesn't have runtime support, it will be essentially impossible to avoid creating at least two new object instances, typically a mutable array into which the data will be placed, and a second copy of the array which would be created by the wrapper object. Further, it will be necessary to either construct an immutable-array-wrapper class object or else have the immutable-array wrapper be subject to boxing.

Because of all these limitations, building an immutable array once rather than constructing a defensive copy of data each time it's requested will only result in a cost savings if the same data would be requested multiple times; if data's only requested once, returning a mutable array will be cheaper.
Nov 25, 2014 at 12:34 AM
supercat wrote:
There are a number of ways in which a type like ImmutableArray<T> cannot be as good as a "special" immutable-array type could be with Runtime and language support.
In my opinion, it must be just IReadOnlyList<T>, with whatever unspecified specific implementation.
  3. A "special" immutable-array type could allow elements to be accessed using array-access instructions, rather than method calls, allowing for more efficient operation.
I think the JITter can usually inline access to the underlying data, right?
  5. In many cases, a normal pattern for creating an immutable array holding desired data is to populate a mutable array with the data, and produce an immutable array object with a copy of all the data from the mutable, and then destroy all reference to the mutable array. This could be made much more efficient with compiler/verifier/runtime support for an ImmutableArrayBuilder<T> structure which could not be assigned, passed by value, stored in a field, or boxed, but had a "special" ability to write the encapsulated array. Those restrictions would make it safe for an ImmutableArrayBuilder<T> to support a Freeze method which would read and invalidate the array reference stored therein, and then return the read reference.
But this doesn't apply to our case! In our case, we can have an immutable list built once in the constructor, so the reference to it could be safely given out in the accessor without fear that a malicious client will tamper with the underlying data.
At present, if code wants to create an instance of a generic, arbitrary-sized immutable-array type which doesn't have runtime support, it will be essentially impossible to avoid creating at least two new object instances, typically a mutable array into which the data will be placed, and a second copy of the array which would be created by the wrapper object. Further, it will be necessary to either construct an immutable-array-wrapper class object or else have the immutable-array wrapper be subject to boxing.

Because of all these limitations, building an immutable array once rather than constructing a defensive copy of data each time it's requested will only result in a cost savings if the same data would be requested multiple times; if data's only requested once, returning a mutable array will be cheaper.
We already have the first mutable array with all the data, so this part is free. So, we need only one array copy + one wrapper allocation. For handing out a mutable copy each time, we need a fresh array copy each time, with potentially unbounded (dependent only on the user) size. So even in the case of just one request, we have an overhead of only one immutable wrapper creation. What we win is constant getter time and clear semantics, which is important as well.
Nov 25, 2014 at 12:58 AM
We discussed the safety/performance trade-off in FormattedString (which we're renaming to FormattableString) possibly producing a fresh copy of the array when returning it from the Args property, so as to protect it from mutation (or using an immutable type such as ImmutableArray<object>). In the end we decided it wasn't worth it. Don't use APIs that mutate the Args.
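A minimal sketch of what "mutating the Args" means in practice. FormattedStringSketch is a hypothetical stand-in modeled on the struct quoted earlier in the thread (its constructor is an assumption); it is not the framework type. Because the property hands out the backing array itself, a caller that writes to it changes what later formatting calls produce.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in modeled on the quoted struct; the constructor is assumed.
struct FormattedStringSketch : IFormattable
{
    private readonly string format;
    private readonly object[] args;

    public FormattedStringSketch(string format, params object[] args)
    {
        this.format = format;
        this.args = args;
    }

    // Returns the backing array itself, not a copy.
    public object[] Args => args;

    public string ToString(string ignored, IFormatProvider formatProvider) =>
        string.Format(formatProvider, format, args);
}

static class Program
{
    // Formats once, overwrites an argument through Args, then formats again.
    public static string[] MutateAndFormat()
    {
        var f = new FormattedStringSketch("Hello, {0}!", "Alice");
        string before = f.ToString(null, CultureInfo.InvariantCulture);
        f.Args[0] = "Mallory"; // the kind of mutation the post warns against
        string after = f.ToString(null, CultureInfo.InvariantCulture);
        return new[] { before, after };
    }

    static void Main()
    {
        foreach (string s in MutateAndFormat()) Console.WriteLine(s);
        // Hello, Alice!
        // Hello, Mallory!
    }
}
```

Returning a copy or a read-only wrapper would prevent this at the cost of an extra allocation per access, which is the trade-off described above.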
Nov 25, 2014 at 5:12 PM
VladD wrote:
In my opinion, it must be just IReadOnlyList<T>, with whatever unspecified specific implementation.
IMHO, an interface called IReadOnlyList<T> should promise that exposing a reference to any legitimate implementation should not enable the recipient of the reference to alter the sequence encapsulated thereby (if T is a mutable type, the recipient of the reference could mutate the items contained within the list, but could not change the identities of items in the list). I'd much rather have seen IReadableList<out T> with the following members:
IReadableList<T> Snapshot();
IFullyMutableList<T> AsMutable();
IReadOnlyList<T> AsReadOnly();
IImmutableList<T> AsImmutable();
The first would return the implementing object if immutable, or else return a new mutable list with a copy of its contents. The second would return the implementation itself if mutable, or a new mutable list with a copy of its contents. The third would return the implementing object if it was read-only (or immutable), or else would return at its convenience a read-only wrapper or an immutable copy. The fourth would return the implementing object if immutable, or else return a new immutable copy. Such a design would make it possible for code which receives a list of arbitrary type to do whatever needs to be done with it to make it useful while avoiding most redundant copy operations. As it is, there's no way for code which receives a reference to an IReadOnlyList and wants a "permanent" copy of its contents to know whether it needs to make one, or whether simply holding the reference will suffice.
  3. A "special" immutable-array type could allow elements to be accessed using array-access instructions, rather than method calls, allowing for more efficient operation.
I think the JITter can usually inline access to the underlying data, right?
Sometimes, but generally not with a collection of structures. Given struct Point3d {public float X,Y,Z;}, struct Triangle3d {public Point3d A, B, C;}, and ImmutableArray<Triangle3d> arr, reading arr[2].A.X as a property would require making a temporary copy of all members of arr[2], then making a temporary copy of field A of that temporary copy, and then reading field X of that. It might in theory be possible for the JITter to figure out how to reduce all that to an array const-lvalue lookup and field dereference, but I wouldn't expect it to do so.

Further, the only way an operation like Array.Copy could be performed efficiently when the source is an ImmutableArray would be if Array and ImmutableArray were in the same assembly, or if one type made its internals visible to the other and the other type knew how to use them.
But this doesn't apply to our case! In our case, we can have an immutable list built once in the constructor, so the reference to it could be safely given out in the accessor without fear that a malicious client will tamper with the underlying data.
Built in whose constructor? If a type like ImmutableArray is going to promise immutability, it must know that its backing store has never been exposed to any code that might mutate it in future. If the backing store is a mutable array, that would imply that either the ImmutableArray constructor made the array itself, or else that it has some means of knowing that the array came from code that hasn't exposed it, and won't expose it, to anything that might modify it in future. Using IReadOnlyList<T> might make sense, though if IReadableList<T> (as defined above) existed, that would have been better for reasons discussed below.
Because of all these limitations, building an immutable array once rather than constructing a defensive copy of data each time it's requested will only result in a cost savings if the same data would be requested multiple times; if data's only requested once, returning a mutable array will be cheaper.
We already have the first mutable array with all the data, so this part is free. So, we need only one array copy + one wrapper allocation. For handing out a mutable copy each time, we need a fresh array copy each time, with potentially unbounded (dependent only on the user) size. So even in the case of just one request, we have an overhead of only one immutable wrapper creation. What we win is constant getter time and clear semantics, which is important as well.
The only way you win constant "getter" time is if the underlying type uses as its backing store an array whose contents are never going to change. If it doesn't, then the first "get" following a change will require constructing a new array. If the type uses a field to cache a reference to the array it returned, a second "get" could return a reference to the same object; as a consequence, if two or more "get" operations are performed without any modifications having occurred between them, and if the class exposing the data used a field to cache the reference, it would represent a "win".
Dec 3, 2014 at 11:48 PM
PauloMorgado wrote:
NetMage wrote:
That sounds good, but I think I wasn't clear - I'd like to see the { in non-prefixed strings supported as well - a little Perl TMTOWTDI.
Any particular reason why?
Ignoring the obvious "it makes typing easier" :), I see it as maintenance changes versus new code.

If you are working on existing code that contains standard strings and you want to use string interpolation, having to prefix the string and then escape the existing braces in the string with backslashes or doubling them up (for a verbatim string) seems error prone. OTOH, if you are writing new code where you know you intend to use string interpolation, a prefix and not having to escape each hole is a nicer writing and reading experience.

I will resurrect supercat's dead syntax proposal and suggest moving interpolation expressions outside the string for regular strings, and leaving them inside the string for prefixed strings.
var greeting = "Hello, "(name)"";
The closing is unfortunate, but the hole is indicated by the double character "( and closed by )" - following the expression is an optional format specifier. A verbatim string would look exactly the same in this case (they already treat " as a meta character anyway). Since we use parens for the escape hole, we use them for prefixed holes as well:
var greeting = $"Hello, (name)";
Obviously \( and \) would quote parens in a prefixed templated string. Templated verbatim strings would be indicated by combining prefixes and verbatim templated strings wouldn't be allowed.
var fullname = $@"(drive):\(path)\(filename).txt";
var fullname2 = @""(drive)":\"(path)"\"(filename)".txt";
Not sure why you wouldn't just use concatenation at the beginning of the fullname2 definition though:
var fullname2 = drive + @":\"(path)"\"(filename)".txt";
This does add another character to escaped holes but should handle most situations, though those creating lisp interpreters or horn clause analyzers may find it painful.
var ans = $"\((car).\((cdr)\)\)";
var ans2 = "\("(car)".("(cdr)"))";
A leading paren in a regular string can't be the start of a hole, but it feels to me like it should be escaped anyway.
Dec 4, 2014 at 10:54 AM
NetMage wrote:
If you are working on existing code that contains standard strings and you want to use string interpolation, having to prefix the string and then escape the existing braces in the string with backslashes or doubling them up (for a verbatim string) seems error prone.
I wouldn't be too concerned about this. Regardless of the final syntax, I expect VS to ship with a diagnostic and code fix to help with conversion to the new syntax.
Jan 28, 2015 at 8:37 AM
Edited Jan 28, 2015 at 8:38 AM
Invariant vs Current culture is a very common scenario. Can we introduce a unique prefix to distinguish invariant from current, rather than using helper methods?
To me, for example, $i"{this.willBeFormatted} as Invariant" + $", but {this.willBeFormatted} as Current" seems a very good option
Jan 28, 2015 at 3:46 PM
@LostTheBlack Unfortunately we ran out of runway to consider design changes a couple of months ago.
Jan 30, 2015 at 9:58 PM
Edited Jan 30, 2015 at 9:59 PM
LostTheBlack wrote:
Invariant vs Current culture is a very common scenario. Can we introduce a unique prefix to distinguish invariant from current, rather than using helper methods?
To me, for example, $i"{this.willBeFormatted} as Invariant" + $", but {this.willBeFormatted} as Current" seems a very good option
It's really a shame that there has historically not been a clearer line drawn between formatting for human consumption versus machine consumption. If there had been e.g. separate format specifiers for "Short-form of month in current culture" and "Three-character month string chosen from [Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec] regardless of culture", then code which used the latter format specifier wouldn't have to worry about culture settings. As it is, the present design invites bugs in code which generates what are supposed to be machine-readable strings.
Jan 30, 2015 at 10:17 PM
nmgafter wrote:
We discussed the safety/performance trade-off in FormattedString (which we're renaming to FormattableString) possibly producing a fresh copy of the array when returning it from the Args property, so as to protect it from mutation (or using an immutable type such as ImmutableArray<object>). In the end we decided it wasn't worth it. Don't use APIs that mutate the Args.
What's the current status of this feature? In VS2015 CTP 5, I can see that it uses the final syntax ($"hello {name}"), but the implicit conversion to FormattableString doesn't seem to work, and I can't find this type anywhere... There are a few references to it in the Roslyn code, but that's all. Apparently, in the unit tests, its source is added manually; I tried to copy the FormattableString and FormattableStringFactory classes from there, but the compiler ignores them. Is this feature implemented at all? Do I need to build the compiler from the latest source code to use it?
Jan 30, 2015 at 10:23 PM
FormattableString is something that is going to be included in mscorlib. It is not a type that is going to be defined in the compiler libraries. I can check but I don't think the framework has this type in CTP5.
Jan 30, 2015 at 10:29 PM
jmarolf wrote:
FormattableString is something that is going to be included in mscorlib. It is not a type that is going to be defined in the compiler libraries. I can check but I don't think the framework has this type in CTP5.
No, it doesn't. But I thought I could add it myself in my own source code, the same way it was possible to use extension methods when targeting .NET 2 by creating the ExtensionAttribute class in our own code. I tried, but it didn't seem to work... Does the class have to be in mscorlib?
Jan 30, 2015 at 10:40 PM
Your approach would work; it's just that the FormattableString piece isn't something the compiler understands in CTP5. I know it's not there for VB. I believe C# narrowly missed it as well. If you want to try that out, you'll need to build from source.
Jan 30, 2015 at 11:02 PM
jmarolf wrote:
Your approach would work; it's just that the FormattableString piece isn't something the compiler understands in CTP5. I know it's not there for VB. I believe C# narrowly missed it as well. If you want to try that out, you'll need to build from source.
OK, thanks for the clarification
Mar 19, 2015 at 12:27 AM
Edited Mar 19, 2015 at 12:27 AM
I thought about things like this when I wrote the Regex Engine for the Micro Framework. (http://www.codeproject.com/Articles/386890/String-Manipulation-in-the-NET-Micro-Framework)

I would definitely rather see this feature than Primary Constructors although I feel that there are still various other things which could be enhanced, (Events, Recursion, Property Syntax and Intrinsic[s])

If this work is done then it would also make sense to provide a built in way to convert from a Literal String to an Escaped String and the other way around.

The functionality is already (mostly) present in the Regex Engine anyway.

http://referencesource.microsoft.com/#System/regex/system/text/regularexpressions/RegexParser.cs,4070f9c78dfd04a7

Escape and Unescape.
Mar 19, 2015 at 1:53 AM
We are no longer using codeplex for Roslyn. Please take the discussion to https://github.com/dotnet/roslyn