Why is LINQ not exploited in the Code?

Topics: General
Apr 9, 2014 at 3:35 PM
Edited Apr 9, 2014 at 3:39 PM
To the team,

I think you're doing an amazing job creating a compiler. Looking through the source code for the VB compiler as a competent programmer, I see lots of opportunities where LINQ could have been used but wasn't.
E.g. Any:
        Friend Function Any(kind As SyntaxKind) As Boolean
            Dim i As Integer
            For i = 0 To Me._count - 1
                If (Me._nodes(i).Value.RawKind = kind) Then
                    Return True
                End If
            Next i
            Return False
        End Function
Could have been written as:-
Friend Function Any(kind As SyntaxKind) As Boolean
 Return Me._nodes.Any(Function(n) n.Value.RawKind = kind)
End Function
It seems strange (to me) not to utilise it. Could you provide some of the rationale behind this?


Also, there is potential to reduce the line count if single-line If statements were used:
 If (Me._nodes(i).Value.RawKind = kind) Then Return True
instead of the three-line version:
If (Me._nodes(i).Value.RawKind = kind) Then
    Return True
End If
Apr 9, 2014 at 3:57 PM
In the contribution guidelines you can read that LINQ is forbidden in compiler hot paths. LINQ causes allocations and therefore GC pressure, which kills performance.
Marked as answer by TomasMatousek on 4/9/2014 at 4:48 PM
Developer
Apr 9, 2014 at 4:46 PM
Apr 9, 2014 at 7:33 PM
That may explain the LINQ, but what about the 3-line If?
Coordinator
Apr 10, 2014 at 2:05 AM
Hey all,

In answer to the first question, as others have suggested, the reason we don't exploit LINQ inside of the compiler is that LINQ, through the creation of lambdas and IEnumerables/IEnumerators, results in allocations, which create GC pressure, which in turn impacts performance. Even though we do use struct enumerators and our own immutable value-type collections, it's still very easy to accidentally fall back on LINQ and end up boxing away all of those savings. Given that the compilers are the central component in the entire IDE and are accessed very frequently (read: every keystroke), it's important to keep their implementations as lightweight as possible. In the IDE codebase (which is not on CodePlex) we do make use of LINQ in all the places you'd expect, only avoiding it when profiling guides us to do so.

As for the 3-line IF, yes, they could be made shorter. I think the most obvious origin for this is that both the VB and C# compilers were originally written in C++, and half of our codebase is maintained in C#. In both C++ and C#, developers are trained (violently, if necessary) not to write their IFs this way:

if (condition) consequence;

or this way:

if (condition)
    consequence;

but instead like this:

if (condition)
{
    consequence;
}

This is the practice because in C# and C++ it's easy for developers to misread the structure of the code and possibly introduce bugs by adding an extra line of code with the wrong indentation and mistaking it for being included under the IF. Of course, in VB, due to the enforcement of indentation by the editor, such bugs wouldn't happen, but after years of training in C-like languages the habit carries over. I, for one, do use the single-line If form in my own code (particularly for guard statements), but some on the team find them hard to read at times, especially if the condition is very long and the consequence is scrolled off screen, so we've mostly avoided them. As with all large codebases actively developed by a team, peace is only maintained through the delicate ceasefire known as "coding conventions" (don't even get us started on the "Great 'var'/type-inference battle of 2010" or the "Import alias skirmish of 2011"). I don't actually believe the single-line IF is forbidden by any of our current conventions; I just think those of us in "the resistance" are too terrified of bringing the wrath of the code reviewers down upon us to brave using them out in the open :) In fact, I fear I've said too much already!

Regards,

Anthony D. Green | Program Manager | Visual Basic & C# Languages Team
Apr 10, 2014 at 3:24 PM
You know, I had a sneaking suspicion that it would be down to the C-style programmers, who worry about their dangling bits. (cough Apple cough)
So the C-style coders are mortals as well (confused by a one-line if!?), and only profess to be the one true god of programming style.

If the condition is too long, why not use implicit line continuations? (You could also switch off the pretty printer and align the code manually.)
If (condition0) AndAlso
   (condition1) AndAlso
   (condition2) Then consequence
On the LINQ aspect.

Array.Exists wasn't used because _Nodes isn't an array. But let's say it was.
If Array.Exists(Me._Nodes, Function(n) n.Value.RawKind = kind) Then
Would this result in an allocation because of the lambda function?

Are the "coding conventions" available? E.g. for submissions / pull requests.
Developer
Apr 10, 2014 at 7:59 PM
Edited Apr 10, 2014 at 8:00 PM
You can look at the How to Contribute page and scroll down to the Coding Conventions section.
Developer
Apr 12, 2014 at 2:50 AM
Edited Apr 12, 2014 at 2:50 AM
On the LINQ aspect.

Array.Exists wasn't used because _Nodes isn't an array. But let's say it was.
If Array.Exists(Me._Nodes, Function(n) n.Value.RawKind = kind) Then
Would this result in an allocation because of the lambda function?
Yes, it captures the variable 'kind' in an object instance that hosts the delegate. So each time the block of code surrounding the If statement is executed, an object is allocated.
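To make the allocation concrete, here is a rough hand-written sketch of roughly what such a capturing lambda turns into. The names KindClosure, Matches, ContainsKind and ContainsKindLoop are hypothetical, and plain Integer values stand in for the syntax nodes; the names and exact shape the compiler really generates differ.

```vb
Imports System

Module ClosureSketch
    ' Hypothetical stand-in for the class the compiler generates to hold
    ' the captured variable 'kind'. Real generated names differ.
    Private NotInheritable Class KindClosure
        Public kind As Integer

        Public Function Matches(n As Integer) As Boolean
            Return n = kind
        End Function
    End Class

    ' Roughly equivalent to: Array.Exists(nodes, Function(n) n = kind)
    Function ContainsKind(nodes As Integer(), kind As Integer) As Boolean
        ' Heap allocation: the object that hosts the captured 'kind'.
        Dim closure As New KindClosure With {.kind = kind}
        ' Heap allocation: the delegate bound to that object.
        Dim predicate As New Predicate(Of Integer)(AddressOf closure.Matches)
        Return Array.Exists(nodes, predicate)
    End Function

    ' The hand-written loop performs no heap allocation at all.
    Function ContainsKindLoop(nodes As Integer(), kind As Integer) As Boolean
        For i As Integer = 0 To nodes.Length - 1
            If nodes(i) = kind Then Return True
        Next
        Return False
    End Function

    Sub Main()
        Dim nodes = {1, 2, 3}
        Console.WriteLine(ContainsKind(nodes, 2))
        Console.WriteLine(ContainsKindLoop(nodes, 5))
    End Sub
End Module
```

Both functions return the same answer; the difference is only that the first one allocates on every call, which is what adds up in a hot path.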
May 16, 2014 at 12:26 PM
Always including “End If” and explicit “{ }” on if etc. means a lot of merge errors in a source code control system produce compiler errors rather than runtime bugs.
May 16, 2014 at 3:55 PM
Edited May 16, 2014 at 3:56 PM
Suppose the programmer could annotate the use of a LINQ method such that the compiled result was the longer inline version.
Friend Function Any(kind As SyntaxKind) As Boolean
 Return Me._nodes.Any<Inline>(Function(n) n.Value.RawKind = kind)
End Function
would generate this instead:
        Friend Function Any(kind As SyntaxKind) As Boolean
            Dim i As Integer
            For i = 0 To Me._count - 1
                If (Me._nodes(i).Value.RawKind = kind) Then
                    Return True
                End If
            Next i
            Return False
        End Function
May 16, 2014 at 5:00 PM
iansk wrote:
Always including “End If” and explicit “{ }” on if etc. means a lot of merge errors in a source code control system produce compiler errors rather than runtime bugs.
I can certainly see how improper merging of code with C-style if statements would result in code which is semantically wrong but still passes compilation. To what extent can that happen with VB.NET-style If Then? For that matter, to what extent could it happen in C# if there was a rule that allowed the non-brace form only if the conditional code was written on the same line as the if?

Also, with regard to merging errors, I wonder whether it would be advantageous to require the use of line continuation characters in VB, and make the merge tool regard groups of connected lines as a single line?
May 22, 2014 at 10:48 AM
Edited May 22, 2014 at 10:51 AM
@AdamSpeight2008, take a look at LinqOptimizer; it does things like this for you.

The optimisation you talk about is one of those listed on the main page; the others are:
  • Lambda inlining
  • Loop fusion
  • Nested loop generation
  • Anonymous Types-Tuples elimination
  • Specialized strategies and algorithms
There are some performance numbers available at https://github.com/nessos/LinqOptimizer/wiki/Performance.