JIT: Expand loop detection in if-conversion#82170
Conversation
The BB weight approach does not detect unnatural loops. Fix dotnet#82106
|
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions Issue DetailsThe BB weight approach does not detect unnatural loops. Fix #82106 Let's see how costly this is...
|
|
Surprisingly, not too impactful. cc @dotnet/jit-contrib PTAL @a74nh @AndyAyersMS |
| // Don't optimise the block if it is inside a loop | ||
| // When inside a loop, branches are quicker than selects. | ||
| // Detect via the block weight as that will be high when inside a loop. | ||
| if (m_startBlock->getBBWeight(m_comp) > BB_UNITY_WEIGHT) | ||
| { | ||
| JITDUMP("Skipping if-conversion inside loop (via weight)\n"); | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Would it be better to leave this "cheap" weight check first in the function, and only insert the "expensive" check here, after the long sequence of potentially expensive (appearing) checks above?
There was a problem hiding this comment.
I actually tried this locally, but I did not see any impact from moving the cheap check.
There was a problem hiding this comment.
An alternative option would be to put all of the checks into a function bool canIfConvert().
Then optIfConvert would be:
if (!m_comp->compStressCompile(Compiler::STRESS_IF_CONVERSION_INNER_LOOPS, 25))
{
if (m_startBlock->getBBWeight(m_comp) > BB_UNITY_WEIGHT)
{
if (verbose && canIfConvert()) JITDUMP("Skipping if-conversion inside loop (via weight)\n");
return false;
}
if (m_comp->optReachable(m_finalBlock, m_startBlock, nullptr))
{
if (verbose && canIfConvert()) JITDUMP("Skipping if-conversion inside loop (via FG walk)\n");
return false;
}
}
if (!canIfConvert()) return false;
.... do the if conversion
There was a problem hiding this comment.
It seems like that would do the optReachable check before identifying the candidate which is what I wanted to avoid, as the flow-graph walk is expensive. Also, it is using m_finalBlock which is found as part of identifying the candidate. I moved the other check to simplify the code (as it does not have noticeable impact).
Of course if the flow graph walk was prohibitively expensive we could just compute reachability sets in this phase, but since the TP impact of this approach didn't turn out too bad I think we can leave it like this.
There was a problem hiding this comment.
ah, didn't spot the use of final block, ok, agreed, leave as is.
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak Issue DetailsThe BB weight approach does not detect unnatural loops. Fix #82106 For more context, see #82106 (comment)
|
|
I'd be curious to see any diffs or performance changes for Arm64. Otherwise looks good to me. |
Let's see during perf triage. FWIW, the benchmark I looked at originally regressed with if-conversion on arm64 as well: https://pvscmdupload.blob.core.windows.net/reports/allTestHistory%2frefs%2fheads%2fmain_arm64_ubuntu%2020.04%2fSystem.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern%3a%20%22aei%22%2c%20Options%3a%20Compiled).html Here's a benchmark that improved a bit on arm64 when else-conditions were added to if-conversion, but that this PR is affecting: https://pvscmdupload.blob.core.windows.net/reports/allTestHistory%2frefs%2fheads%2fmain_arm64_ubuntu%2020.04%2fBilinearTest.Interpol_Scalar.html I'm not sure there is so much we can do about this without investing in much more expensive/complicated analysis (that would probably be easier if the pass was still at a point where we had SSA). I assume the existing logic actually did mean to check for any kind of loop construct? |
The BB weight approach does not detect unnatural loops.
Fix #82106
For more context, see #82106 (comment)