
Provide access to the binder

Topics: APIs
Apr 5, 2014 at 1:59 PM
For more advanced C# language processing scenarios (in my case: translation of C# to JavaScript), access to BoundExpression (and its subtypes) is needed. Can you please consider opening up these types?
Apr 5, 2014 at 3:41 PM
Are there specific questions that you are unable to answer with a SemanticModel? If so, we would love to hear about them. We have resisted opening access to bound nodes so far because it would add a huge new API surface to explain, document, version, etc.
Apr 5, 2014 at 4:07 PM
Edited Apr 5, 2014 at 4:08 PM
As I understand it, there is no straightforward way of determining that, given these declarations:
class A {
    public static int operator +(A a, B b) { return 0; }
}
class B {}
class C {
    public static implicit operator int(C c) { return 0; }
}
A a;
B b;
C c;

void F(int a, int b = 12, int c = 0, int d = 0) {}
the statement
long d = a + b - c;
means
(long)(A.op_Addition(a, b) - C.op_Implicit<int>(c))
Or that
F(3, c: 2, b: 4)
means
F(3, 4, 2, 0)
(with the exception of argument evaluation order)
Apr 5, 2014 at 4:21 PM
Additionally, I think that if you are writing a tool that works at the semantic level, you want to operate on a semantic tree as opposed to a syntactic one. For example, you probably don't want to have to write different code to handle
obj["name"]
and
obj.$name
because the meaning is the same.
Apr 5, 2014 at 4:36 PM
Or, given
class C {
    int A;
    static int B;

    void M() {
        A = 2;
        B = 3;
    }
}
that the "A = 2" statement actually means "this.A = 2".

If I think more (or even try to port my current codebase from NRefactory to Roslyn), I will probably find more examples of where a semantic tree, as opposed to a syntactic one, is useful.
Apr 5, 2014 at 5:27 PM
And, while I realize I could probably do all this myself, it also means that everyone else doing semantic analysis of C# programs needs to do the same, potentially leading to every such tool having its own version of the binder. Most of those implementations will be buggier than Roslyn's, leading to a quality decrease in the kind of tools that Roslyn should enable.

As the API currently stands, I think it is clearly inferior to NRefactory for semantic analysis.
Apr 7, 2014 at 11:22 AM
Edited Apr 7, 2014 at 11:25 AM
I'm currently using NRefactory in a tool that cross-compiles shaders written in C# to GLSL and HLSL, and I was considering porting that tool to Roslyn. If I remember correctly, Anders Hejlsberg mentioned at Build that Roslyn is well suited for such tasks, but I see the same problems as erikkallen above with Roslyn's current public API. How do you write cross-compilers using Roslyn? Or was Anders not aware that the API necessary for writing cross-compilers is internal?
Apr 11, 2014 at 4:58 PM
What's the status on this? Has it been silently decided against, or are you still considering it?
Apr 12, 2014 at 1:05 AM
Hey Erik,

Sorry for the slow response here. In answer to your question, it has been a long-standing design decision not to expose the bound tree. The shape of the bound nodes is actually pretty fragile compared to the shape of the syntax nodes and can be changed around a lot depending on what scenarios need to be addressed. We might store something somewhere one day for performance and remove it the next for compatibility, or vice versa. The concern is that once the shape of the bound tree becomes an API contract, it significantly inhibits our ability to evolve the language or our implementation strategies. Also, we've found the restriction a great forcing function for evolving the more targeted APIs to be more useful.

For example, for this code:
long d = a + b - c;
meaning
(long)(A.op_Addition(a, b) - C.op_Implicit<int>(c))
(I think there's an operator missing somewhere in there.) This is actually discoverable with the current API. If you acquire a SemanticModel (our abstraction over the bound trees) and call GetTypeInfo passing in the expression "a + b - c", you will get back a TypeInfo object which tells you the kind of conversion being performed, if any, the type of the expression, and the type the expression is being converted to. The implicit conversion to long is discoverable there. If you call GetTypeInfo on the expression 'c', the TypeInfo object would show that a user-defined conversion from type C to int was being performed, and the IMethodSymbol for C.op_Implicit<int> would also be part of the Conversion object. And if you pass the expression "a + b" to the SemanticModel.GetSymbolInfo method, you'll get back the IMethodSymbol for A.op_Addition.
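[Editor's note] A minimal sketch of that flow, assuming a recent Microsoft.CodeAnalysis.CSharp package (some member names, such as the GetConversion extension method, have shifted since the 2014 preview):

using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

class Demo {
    static void Main() {
        var tree = CSharpSyntaxTree.ParseText(@"
class A { public static int operator +(A a, B b) { return 0; } }
class B {}
class C { public static implicit operator int(C c) { return 0; } }
class P { static A a; static B b; static C c; static void M() { long d = a + b - c; } }");

        var compilation = CSharpCompilation.Create("demo",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        // The whole expression: natural type int, implicitly converted to long.
        var sum = tree.GetRoot().DescendantNodes().OfType<BinaryExpressionSyntax>()
            .First(e => e.ToString() == "a + b - c");
        var typeInfo = model.GetTypeInfo(sum);
        Console.WriteLine(typeInfo.Type + " -> " + typeInfo.ConvertedType); // int -> long

        // The operand 'c': a user-defined conversion, exposing its IMethodSymbol.
        var cRef = sum.DescendantNodes().OfType<IdentifierNameSyntax>()
            .First(n => n.Identifier.Text == "c");
        Console.WriteLine(model.GetConversion(cRef).MethodSymbol); // C.op_Implicit(C)

        // The subexpression 'a + b': GetSymbolInfo yields the operator method.
        var aPlusB = sum.DescendantNodes().OfType<BinaryExpressionSyntax>()
            .First(e => e.ToString() == "a + b");
        Console.WriteLine(model.GetSymbolInfo(aPlusB).Symbol); // A.op_Addition(A, B)
    }
}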

Regarding your example with the named parameters, there is an item on our backlog to add an API which, given an ArgumentSyntax, will return the IParameterSymbol it corresponds to. Today you could do it by first walking up to the containing InvocationExpression, passing it to GetSymbolInfo to get the symbol for the method being invoked, and then matching the parameters manually by name or position (tedious, we know).
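[Editor's note] A hedged sketch of that manual matching (DetermineParameter is an illustrative name, not an existing Roslyn API; params-array expansion is ignored for brevity):

using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

static class SemanticModelExtensions {
    public static IParameterSymbol DetermineParameter(this SemanticModel model, ArgumentSyntax argument) {
        var invocation = argument.Ancestors().OfType<InvocationExpressionSyntax>().First();
        var method = model.GetSymbolInfo(invocation).Symbol as IMethodSymbol;
        if (method == null)
            return null;

        // Named argument: match by name.
        if (argument.NameColon != null)
            return method.Parameters.FirstOrDefault(
                p => p.Name == argument.NameColon.Name.Identifier.ValueText);

        // Positional argument: match by index.
        var index = ((ArgumentListSyntax)argument.Parent).Arguments.IndexOf(argument);
        return index < method.Parameters.Length ? method.Parameters[index] : null;
    }
}

For F(3, c: 2, b: 4) this maps the first argument to parameter a by position, and the named arguments to c and b by name.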

It may not have been implemented in the preview, but the specification does call for GetSymbolInfo on obj.$name to return the symbol for the indexer being used (consistent with the VB quick-info experience in VS2013 RTM when hovering over obj!name). I wouldn't be surprised if the Find All References engine uses this to consider obj.$name a reference to the indexer.
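[Editor's note] For the bracket form, the same lookup is expressible today; a hedged fragment, reusing the tree and model from the sketch above and assuming the parsed source contains an element access such as obj["name"]:

var access = tree.GetRoot().DescendantNodes()
    .OfType<ElementAccessExpressionSyntax>().First(); // obj["name"]
var indexer = model.GetSymbolInfo(access).Symbol as IPropertySymbol;
Console.WriteLine(indexer != null && indexer.IsIndexer); // True when an indexer binds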

To understand the meaning of A = 2, one would pass the IdentifierNameSyntax expression "A" into GetSymbolInfo, which would return the IFieldSymbol for C.A. That symbol would reveal that the field is not static, from which it can be inferred that the ThisParameter is the implicit receiver.
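[Editor's note] As a fragment, again reusing the earlier tree and model, and assuming the parsed source includes the class C above:

var target = tree.GetRoot().DescendantNodes().OfType<AssignmentExpressionSyntax>()
    .First(a => a.Left.ToString() == "A").Left;
var field = (IFieldSymbol)model.GetSymbolInfo(target).Symbol;
Console.WriteLine(field.IsStatic); // False, so the implicit receiver is 'this'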

These are the strategies on which our entire IDE is built, and we use them for all of our refactorings. It is perhaps a little less direct than you would intuit, but for now it is the API surface area we believe it is wisest to expose. If there are shortcomings in these APIs, or additional APIs you need which are not here, let us know and we'll see how we can accommodate them. That said, when and whether to expose the bound tree is an ongoing decision we reweigh periodically, and the future is not written in stone.

Regards,

Anthony D. Green | Program Manager | Visual Basic & C# Languages Team
Apr 12, 2014 at 8:25 AM
Thanks for the detailed response, Anthony, especially for explaining how the SemanticModel should be used. Obviously, some documentation is missing here. Anyway, you're saying that, in your opinion, the SemanticModel should provide all the information necessary to write a cross-compiler from C# to some other target language? Would it be possible to write the later stages of Roslyn itself with access only to the SemanticModel? If that's the case, I can live with the fact that getting the required information is a bit more complicated for the moment, given that it allows you greater flexibility in your design.
Apr 12, 2014 at 9:39 AM
@ADGreen: First of all, thank you for your reply.

Perhaps it is because I have worked with NRefactory for years, but I do find Roslyn's API for doing these things very quirky. For example, even with the API you have now shown me, it is still my responsibility to ensure, in every single place, that the proper conversions are applied; information which would be immediately obvious in a semantic model.

And, while it is very impressive that all IDE features are built on top of the public API, I feel that this is less impressive than it sounds, since the API has clearly been designed with IDEs in mind (I find the AST API to be incredibly solid; good job there!). But, in order to ensure that Roslyn is generally useful for any conceivable use of a C# language service, I think you need to consider refactoring the CodeGen namespace to use only public symbols. If (and only if) such a refactoring wouldn't duplicate much of the functionality in the binder, I will agree that my opinion on this subject is just due to my being used to NRefactory, and I will change my mind.

Perhaps there could be some middle ground where a semantic tree is exposed, but as a hierarchy consisting of nothing but method calls (to potentially synthesized methods)? That wouldn't increase the documentation/backwards-compatibility surface area very much (you need the synthesized methods for the current API anyway, and only one additional class would be required for the model).
Apr 30, 2014 at 12:54 AM
Just for information on the process: if I implemented a function like the one you are describing —
"Regarding your example with the named parameters, there is an item on our backlog to add an API which, given an ArgumentSyntax, will return the IParameterSymbol it corresponds to."
— would you accept a patch for it (provided, of course, that the patch quality is good enough)? Is someone already working on this? Do you have any tips?
Apr 30, 2014 at 5:53 PM
I don't think we can promise to accept a patch, but we'd definitely consider it. Nobody is working on this right now.