
Use readonly keyword for immutable classes/structs

Topics: C# Language Design
May 15, 2014 at 9:21 AM
Edited May 15, 2014 at 9:32 AM
At first, I thought of suggesting a new keyword, "immutable", but then I thought: "why not use readonly?"

The idea is to allow the readonly modifier on a class/struct declaration to enforce immutability. The compiler would then check that the type is indeed immutable by allowing fields with public read access (and perhaps internal/protected ones?) to be set only through constructors.
public readonly struct Vector2fi
{
    ...

    public Vector2fi(float x, float y) { ... }

    ...

    public float X { get; private set; }

    public float Y { get; set; } // The compiler will show an error here

    ...

    public void DoSomeMath()
    {
        this.X += 2f * this.Y; // The compiler will show an error here
    }

    ...
}
And along with this suggestion: https://roslyn.codeplex.com/discussions/543883

... this could be rewritten as:
public readonly primitive Vector2fi // or even numeric
{
    ...

    public Vector2fi(float x, float y) { ... }

    ...

    public float X { get; private set; }

    public float Y { get; private set; } // Ok

    ...

    public float DoSomeMath()
    {
        return this.X + 2f * this.Y; // Ok
    }

    ...
}
... which could be useful to work with pointers so as to avoid index checking on arrays.
May 15, 2014 at 3:06 PM
Edited May 15, 2014 at 11:38 PM
While I can see some value in this proposal, I think that in practice it might be too limiting. In many immutable systems the implementation isn't actually fully immutable. For example, consider a tree whose nodes are populated lazily. Once a node is constructed and returned, its value must never change, so that the observed behavior is immutable.

However, the implementation could absolutely employ a mutable design. Sometimes that's even the only way to construct immutable data structures. For example, let's say you need an immutable tree that supports efficient access to a node's parent. One simple way to implement this is:
using System.Collections.Generic;
using System.Threading;

public sealed class Tree
{
    private readonly TreeNode _root;

    // Mutable field that stores the mapping from child to parent.
    // NOTE: The implementation promises to assign this field in a thread-safe manner.
    private IReadOnlyDictionary<TreeNode, TreeNode> _parentFromChild;

    public Tree(TreeNode root)
    {
        _root = root;
    }

    public TreeNode GetParent(TreeNode node)
    {
        if (_parentFromChild == null)
            Interlocked.CompareExchange(ref _parentFromChild, GetParents(_root), null);

        TreeNode parent;
        _parentFromChild.TryGetValue(node, out parent);
        return parent;
    }

    private static IReadOnlyDictionary<TreeNode, TreeNode> GetParents(TreeNode root)
    {
        // ...
    }

    public TreeNode Root { get { return _root;} }
}
May 15, 2014 at 5:11 PM
Proper support for immutability would require a type system that could distinguish between "unrestricted reference to Foo" and "restricted reference to Foo", and allow an "unrestricted reference to DerivedFoo" to be converted to either "unrestricted reference to Foo" or "restricted reference to DerivedFoo", even though those two target types are unrelated to each other. I'm not sure whether there would be any practical way to extend the .NET runtime to accommodate such a concept, since generic types are at present presumed to be castable to Object, but casting a restricted reference to an unrestricted Object would have to be prohibited.

Otherwise, if the only intention is to impose shallow immutability by forbidding writes to fields after construction is complete, I'm not quite clear in what ways readonly would be considered inadequate aside from the present lack of a means of declaring read-only auto-properties--an omission which should be fixed in any case.
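To make "shallow immutability" concrete, here is a minimal sketch. The Polygon type and its members are made up for illustration and are not part of the proposal. It shows that a readonly field protects the reference from reassignment but does not stop anyone from changing the object the reference points to.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration only.
public sealed class Polygon
{
    // readonly: the field can only be assigned in a constructor or initializer.
    private readonly List<float> _xs;

    public Polygon(IEnumerable<float> xs)
    {
        _xs = new List<float>(xs);
    }

    // Exposes the mutable list itself, so callers can still change its contents.
    public List<float> Xs { get { return _xs; } }
}

public static class ShallowDemo
{
    public static int CountAfterOutsideAdd()
    {
        var p = new Polygon(new[] { 1f, 2f });
        // Reassigning p's _xs field is impossible here, but this still compiles:
        // readonly guards the reference, not the list it refers to.
        p.Xs.Add(3f);
        return p.Xs.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterOutsideAdd()); // prints 3
    }
}
```

Deep immutability would additionally require that everything reachable through the field is itself immutable, which is the part the type system cannot currently express.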
May 15, 2014 at 5:31 PM
Edited May 15, 2014 at 10:01 PM
@terrajobst: I use Interlocked a lot, so I'm familiar with that scenario. However, in your example the mutable field is private and not meant for public access, whereas my suggestion is meant for fields with public access (and maybe for internal ones; I'm not quite sure yet about protected ones).

Having said that, members for which mutability should be allowed could be marked with a specific attribute, say [Mutable], when "readonly" is present on the class/struct.
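A rough sketch of how that opt-out might look. MutableAttribute and Particle are hypothetical names, and nothing in today's compiler gives the attribute any meaning. Since "readonly" is not accepted on a class declaration, the modifier appears only in a comment.

```csharp
using System;

// Hypothetical marker attribute; it does not exist in the framework.
[AttributeUsage(AttributeTargets.Field | AttributeTargets.Property)]
public sealed class MutableAttribute : Attribute { }

// Under the proposal this would read "public readonly class Particle".
public sealed class Particle
{
    public Particle(float x, float y)
    {
        X = x;
        Y = y;
    }

    // Publicly readable, set only from the constructor: allowed by the proposed check.
    public float X { get; private set; }
    public float Y { get; private set; }

    // Publicly settable, but explicitly exempted from the proposed check.
    [Mutable]
    public object Tag { get; set; }
}

public static class MutableDemo
{
    public static void Main()
    {
        var p = new Particle(1f, 2f);
        p.Tag = "selected";
        Console.WriteLine(p.Tag); // prints selected
    }
}
```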

Side note: I guess you meant "node" instead of "child" on the GetParent method.
May 15, 2014 at 11:44 PM
Edited May 15, 2014 at 11:48 PM
Side note: I guess you meant "node" instead of "child" on the GetParent method.
Yes. If only this site would use Roslyn as the code editor ;-)
my suggestion is meant for fields with public access
I see; I missed that. My assumption was that the readonlyness would be viral, along the lines of what @supercat said. In that case it seems the readonly modifier doesn't do much; it would only enforce a shallow version of immutability and at best help the developer express intent within the scope of the class.

As an aside, despite the fact that I'm listed as a developer, I'm actually not a member of the Roslyn team. So don't take my comments as an official statement from the C#/VB team. I'm working on the .NET Framework team, so my comments are written from a consumption standpoint.
May 16, 2014 at 1:59 AM
Edited May 16, 2014 at 3:21 PM
terrajobst wrote:
Yes. If only this site would use Roslyn as the code editor ;-)
Lol. Indeed.
May 17, 2014 at 4:53 AM
Edited May 17, 2014 at 5:00 AM
An interesting paper to read regarding a similar subject is Uniqueness and Reference Immutability for Safe Parallelism by Joe Duffy et al of Microsoft Research.
http://homes.cs.washington.edu/~csgordon/papers/oopsla12.pdf
Extended version of above paper: http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf

It describes some interesting solutions regarding a readonly concept and Immutability.

I'm wondering if the C# team has kept up with Joe Duffy's work?
May 17, 2014 at 7:52 PM
bpschoch wrote:
An interesting paper to read regarding a similar subject is Uniqueness and Reference Immutability for Safe Parallelism by Joe Duffy et al of Microsoft Research.
http://homes.cs.washington.edu/~csgordon/papers/oopsla12.pdf
Extended version of above paper: http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf
I've only looked briefly at the paper, but it seems to capture a major concept I've been irked about for a long time. The majority of reference-type variables exist not for the purpose of identifying an object, but rather for the purpose of encapsulating the object identified thereby; this requires that all use of the variable must obey one of two rules, but there is no convention to indicate which of the rules the variable is supposed to follow.

For a variable to usefully encapsulate a value, it is necessary that only the owner of the variable be able to change the value encapsulated therein. This can be achieved either by
  1. Ensuring that the object identified by the variable is never changed by anyone, or
  2. Ensuring that the object is owned exclusively by the owner of the variable, and no outside references will be used in any ways the owner does not expect.
Some types which hold arrays adhere to the first pattern; others adhere to the second. Unfortunately, there is nothing in the type system--or even in field naming conventions--which would distinguish between the two usages.

If an object has a field arr of type int[] which encapsulates the sequence {1,2,3} and it wishes for that field to encapsulate {2,2,3}, there are two ways that could be accomplished:
arr[0] = 2;
or
var temp = (int[])arr.Clone();
temp[0] = 2;
arr = temp;
Which approach is correct would depend upon which of the above rules is being followed.
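A small sketch of why the choice matters; the ArrayRules class and its method names are invented for illustration. If a second reference to the same array exists, the in-place write is visible through it, while the copy-then-swap approach leaves the original array untouched.

```csharp
using System;

public static class ArrayRules
{
    // Appropriate under rule 2 (exclusive ownership): nobody else can observe the write.
    public static int[] UpdateInPlace(int[] arr)
    {
        arr[0] = 2;
        return arr;
    }

    // Appropriate under rule 1 (the array is never changed): modify a private copy instead.
    public static int[] UpdateByCopy(int[] arr)
    {
        var temp = (int[])arr.Clone();
        temp[0] = 2;
        return temp;
    }

    public static void Main()
    {
        int[] shared = { 1, 2, 3 };
        int[] alias = shared;               // a second reference to the same array

        int[] copy = UpdateByCopy(shared);
        Console.WriteLine(alias[0]);        // prints 1: the alias is unaffected
        Console.WriteLine(copy[0]);         // prints 2

        UpdateInPlace(shared);
        Console.WriteLine(alias[0]);        // prints 2: the alias sees the change
    }
}
```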

I would posit that if there were a means by which variables and generic-type arguments could specify what sort of semantics the references contained therein were supposed to have, then issues of "deep cloning" versus "shallow cloning" would largely evaporate. If an object has a field which encapsulates an unsharable value, the corresponding field in a proper clone should identify a clone of that value; if a field encapsulates an immutable value or identifies a sharable entity, the corresponding field in a proper clone should identify the same object; if a field identifies an unsharable entity, then unless "manual" cloning code can resolve the situation, the containing object is unclonable.

BTW, equality testing could also be handled largely automatically if virtual methods existed to test definitions of equivalence:
  1. Two objects are equivalent if there is nothing that any other object could do to make them not equivalent.
  2. Two objects are value-equal if the only way they could become not value-equal would be if code that holds a reference to them changes them.
Much of the murkiness around Equals is a result of the fact that each definition of equivalence is useful in different cases; I believe that if objects could test them separately, 99% of ambiguity would be eliminated.
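As a sketch of how the two tests could differ, here they are written as a hypothetical interface rather than as virtual methods on Object. IEquivalence and MutableBuffer are invented names. For a mutable type, two distinct instances with the same contents are value-equal but not equivalent, because a later write to one of them would make them differ.

```csharp
using System;
using System.Linq;

// Hypothetical interface; nothing like it exists in .NET.
public interface IEquivalence<T>
{
    bool IsEquivalentTo(T other);   // definition 1
    bool IsValueEqualTo(T other);   // definition 2
}

public sealed class MutableBuffer : IEquivalence<MutableBuffer>
{
    private readonly int[] _items;

    public MutableBuffer(params int[] items)
    {
        _items = (int[])items.Clone();
    }

    public void Set(int index, int value)
    {
        _items[index] = value;
    }

    // A mutable object can only be equivalent to itself.
    public bool IsEquivalentTo(MutableBuffer other)
    {
        return ReferenceEquals(this, other);
    }

    // Same contents right now; only a holder of a reference could change that.
    public bool IsValueEqualTo(MutableBuffer other)
    {
        return other != null && _items.SequenceEqual(other._items);
    }
}

public static class EquivalenceDemo
{
    public static void Main()
    {
        var a = new MutableBuffer(1, 2, 3);
        var b = new MutableBuffer(1, 2, 3);
        Console.WriteLine(a.IsValueEqualTo(b));   // prints True
        Console.WriteLine(a.IsEquivalentTo(b));   // prints False
        a.Set(0, 9);
        Console.WriteLine(a.IsValueEqualTo(b));   // prints False
    }
}
```

For a deeply immutable type the two methods would presumably return the same answer, since no holder of a reference could change either object.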

How do those concepts sound?