
String Interpolation: Will interpolation with pluralization be handled?

Topics: APIs, C# Language Design, VB Language Design
Jan 13, 2015 at 5:44 PM
Edited Jan 14, 2015 at 3:09 AM
It was pointed out to me that C# 6.0 is headed towards using an interpolation syntax I really liked a few years ago, with the $ formatting hint. Very cool! Back then I also liked the idea of using a string.Interpolate method that could do things like handle pluralization in the interpolation in case a language is more complex than English and the string may change substantially based on a count of items.

In English for example, it seems like you can do this:
// This will look odd when the count is 1 ("1 products") because of the pluralization rules of the English language
var s = $"{product.Count} products";
or
// This works in English in some cases, but it depends on the usage and placement of the text. In other languages with more complex pluralization the entire string changes.
var s = $"{product.Count} product(s)";
Here is an example in Russian, translating "N product" or "N products". Notice how the word for product changes beyond a simple 0-or-1 distinction, as Russian has more complex pluralization rules.

0 продуктов
1 продукт
2 продукта
5 продуктов
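
The rule behind the list above can be sketched in a few lines. This is a minimal illustration for non-negative integers only, not a framework API; the class and method names (RussianPlural, Select, Products) are made up for this example.

```csharp
using System;

public static class RussianPlural
{
    // Picks one of three noun forms for a non-negative integer count:
    //   one  -> count ends in 1, except 11        (1 продукт, 21 продукт)
    //   few  -> count ends in 2-4, except 12-14   (2 продукта, 23 продукта)
    //   many -> everything else                   (0, 5-20, 25 продуктов)
    public static string Select(int count, string one, string few, string many)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));

        int lastDigit = count % 10;
        int lastTwo = count % 100;

        if (lastDigit == 1 && lastTwo != 11) return one;
        if (lastDigit >= 2 && lastDigit <= 4 && (lastTwo < 12 || lastTwo > 14)) return few;
        return many;
    }

    // Hypothetical convenience wrapper for the "product" example.
    public static string Products(int count) =>
        $"{count} {Select(count, "продукт", "продукта", "продуктов")}";
}
```

Note that 21 takes the same form as 1 and 11 takes the same form as 5, which is why a lookup keyed only on 0, 1 and "everything else" is not enough.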

It would be relatively powerful to bake this in/add to the interpolation design in the future in a way that is standard and familiar for all .NET framework developers. Of course something like this could be done as an extension method or with custom code, but a uniform method for this in the framework would have immense benefits for people doing internationalization and trying to display "friendly"/"natural language usage" strings across platforms (e.g. web, OS/Windows apps, etc.).

Here are just some possible examples (other languages and platforms support similar pluralization concepts):
public class PluralizationItem
{
     public string Token { get; set; }
     public string Number { get; set; } // This could perhaps be Count, but in practice Count tends to be a calculated/read-only type property in most code bases.
     public string ReplacementValue { get; set; } 
}
// Proposed overload: Interpolate(template, IEnumerable<PluralizationItem> pluralizations)
var p = string.Interpolate(s, pluralizations);
or
public class PluralizationReplacementItem
{
     public string Token { get; set; } // This could perhaps also be "Key"
     public IEnumerable<PluralizationReplacementPair> Replacements { get; set; } 
}

public class PluralizationReplacementPair
{
     public string Number { get; set; }
     public string ReplacementValue { get; set; } 
}
// Proposed overload: Interpolate(template, IEnumerable<PluralizationReplacementItem> replacementValues)
var p = string.Interpolate(s, replacementValues);
or
public class PluralizationItem
{
     public string Number { get; set; } // This could perhaps be Count, but in practice Count tends to be a calculated/read-only type property in most code bases.
     public string ReplacementValue { get; set; } 
}
// Proposed overload: Interpolate(template, tokenOrKey, IEnumerable<PluralizationItem> pluralizations)
var p = string.Interpolate(s, "products", pluralizations);
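
To make the last shape concrete, here is a minimal sketch of how such a helper might behave. Everything in it is an assumption for illustration: no string.Interpolate method exists in the framework, the PluralInterpolation class is hypothetical, the explicit count parameter is added because the shapes above do not say where the count comes from, and treating Number as either an exact count or the fallback "other" is just one possible reading.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class PluralizationItem
{
    // Assumption: an exact count as text ("0", "1", ...) or "other" as the fallback.
    public string Number { get; set; }

    // Text that replaces the token; may contain {0} for the count.
    public string ReplacementValue { get; set; }
}

public static class PluralInterpolation
{
    // Hypothetical helper, not a framework API. Replaces "{token}" in the template
    // with the ReplacementValue whose Number equals the count, falling back to the
    // item whose Number is "other".
    public static string Interpolate(string template, string token, int count,
                                     IEnumerable<PluralizationItem> pluralizations)
    {
        var items = pluralizations.ToList();
        string key = count.ToString(CultureInfo.InvariantCulture);

        var match = items.FirstOrDefault(i => i.Number == key)
                 ?? items.FirstOrDefault(i => i.Number == "other");
        if (match == null)
            throw new ArgumentException("No item matches the count and no \"other\" fallback was supplied.",
                                        nameof(pluralizations));

        // Substitute the count into the chosen form, then drop the form into the template.
        string replacement = string.Format(CultureInfo.InvariantCulture, match.ReplacementValue, count);
        return template.Replace("{" + token + "}", replacement);
    }
}
```

Exact-count matching like this is enough for English but does not scale to languages where, say, 21 behaves like 1; a fuller design would probably key the items on plural categories rather than literal numbers.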
Jan 14, 2015 at 11:56 AM
I don't think this is something the language could/should care about.

This would be a library's responsibility.
Jan 15, 2015 at 1:54 PM
I agree with Paulo that once you get into internationalization scenarios, the responsibility really should fall on the framework and tooling. Even with the proposed FormattedString class, I think that string interpolation in the compiler will be fairly limited in what it can accomplish.

I've long thought that it would be very helpful for Visual Studio to expand the Resources RESX designer to support easier localization through a form of interpolation. Typically, the current method is to embed the composite format string as a resource item that is read out, with the onus of formatting falling on either the consumer or a separate helper class. What I would like to see is that, when embedding a composite format string (with either indexed or named argument holes), the Resources designer allows the developer to specify metadata associated with each of those holes, specifically the type (defaulting to System.String) and a name (if named argument holes are not supported). The ResXFileCodeGenerator would then generate methods for those resources that accept the argument hole values as parameters. This would make it much easier to use resource files and encourage good practices.

e.g. Greetings: "Hello {name}!" -> string greetings = Resources.Greetings("World");
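
A rough sketch of what the generated backing class might look like under this idea. This is not what ResXFileCodeGenerator emits today; the dictionary stands in for the real ResourceManager lookup so the example runs without a .resx file, and the Welcome resource with its typed int hole is invented for illustration.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class Resources
{
    // Stand-in for the ResourceManager lookup (assumption, to keep the sketch self-contained).
    private static readonly Dictionary<string, string> Strings = new Dictionary<string, string>
    {
        ["Greetings"] = "Hello {name}!",
        ["Welcome"] = "Welcome back {name}, this is visit {visits}.",
    };

    // One generated method per resource; one parameter per named hole.
    public static string Greetings(string name) =>
        Strings["Greetings"].Replace("{name}", name);

    // A hole declared as System.Int32 in the designer would surface as an int parameter.
    public static string Welcome(string name, int visits) =>
        Strings["Welcome"]
            .Replace("{name}", name)
            .Replace("{visits}", visits.ToString(CultureInfo.CurrentCulture));
}
```

With this shape, a missing or mistyped argument becomes a compile-time error rather than a runtime formatting failure.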

I've finally gotten around to submitting this concept as a suggestion to UserVoice: "Support String Interpolation in Resources Project Items"

I do like your idea of supporting pluralization in the formatted resource. I've always liked how AngularJS approaches plural forms, in the form of a JSON object describing the replacement value for each number or group name (e.g. one, few). I added a comment to the UserVoice suggestion specifically mentioning plural forms, with some rough ideas as to how it could be accomplished. However, I quickly realized that even my contrived example, which is only modestly more complicated than yours, immediately runs into a common issue with internationalization: more of the phrase is affected by a change in quantity.
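
One way to cope with more of the phrase changing is to store a whole phrase per plural form rather than a single word. The sketch below is loosely modeled on the AngularJS approach of mapping explicit numbers and group names to templates, with "{}" standing for the count. The PluralPhrases class is hypothetical, and the category function covers English only.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class PluralPhrases
{
    // English cardinal integers only need "one" and "other" (assumption: no fractions).
    public static string Category(int count) => count == 1 ? "one" : "other";

    // Looks up an explicit number first ("0"), then the category ("one"),
    // then falls back to "other". Each entry is a complete phrase.
    public static string Format(IDictionary<string, string> forms, int count)
    {
        string number = count.ToString(CultureInfo.InvariantCulture);
        string template;

        if (!forms.TryGetValue(number, out template) &&
            !forms.TryGetValue(Category(count), out template))
        {
            template = forms["other"];
        }

        return template.Replace("{}", number);
    }
}
```

Because each entry is a full sentence, a translator is free to change word order, verb agreement or anything else that depends on the quantity.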
Jan 16, 2015 at 1:29 AM
PauloMorgado wrote:
I don't think this is something the language could/should care about.

This would be a library's responsibility.
I agree, but for that reason I think language features like this should make it easy for code to control which library functions are used.
Jan 25, 2015 at 4:49 PM
I think that the resource editor (.resx files) is definitely the wrong place to be working with localization. That should all be done through Multilingual App Toolkit (MAT), and its workflow. It works with industry-standard XLIFF files. So ideas on pluralization should be done through that.

I've only localized one app in a really serious way but I found that this kind of pluralization was way at the bottom of the problems I faced. Much more difficult was how to get formatting in the localized strings (bold, italic); how to cope with translators who can't use tools and who produce "messy" translations; how to deal with the UI when the translations sometimes take twice as much space, especially on a space-constrained phone app or in a Metro-style "typography-first" design.
Jan 26, 2015 at 10:38 PM
lwischik wrote:
I think that the resource editor (.resx files) is definitely the wrong place to be working with localization. That should all be done through Multilingual App Toolkit (MAT), and its workflow. It works with industry-standard XLIFF files. So ideas on pluralization should be done through that.

I've only localized one app in a really serious way but I found that this kind of pluralization was way at the bottom of the problems I faced. Much more difficult was how to get formatting in the localized strings (bold, italic); how to cope with translators who can't use tools and who produce "messy" translations; how to deal with the UI when the translations sometimes take twice as much space, especially on a space-constrained phone app or in a Metro-style "typography-first" design.
Perhaps. That tool just generates the individual satellite resx files and depends on the built-in resource project item and code generation to create the backing class, so I don't think that invalidates anything that I've said. It might move some of the metadata support, but you'd still need the existing ResXFileCodeGenerator tool to play along.

Either way, my point is that i18n "interpolation" is better solved by tooling support. I think vanilla Visual Studio does very little to help a developer discover the proper path towards i18n, and the more barriers there are in the way, the less likely a developer is to take those issues into consideration. You're absolutely correct that the actual translations are probably the least of the concerns.