Skip to content

Ensure DecomposeRotate correctly orders parameters.#86066

Merged
tannergooding merged 4 commits intodotnet:mainfrom
tannergooding:fix-86027
May 11, 2023
Merged

Ensure DecomposeRotate correctly orders parameters.#86066
tannergooding merged 4 commits intodotnet:mainfrom
tannergooding:fix-86027

Conversation

@tannergooding
Copy link
Member

@tannergooding tannergooding commented May 10, 2023

This resolves #86027

LSH_HI emits itself as the following, which operates as (tgt << amt) | (upper >>> (32 - amt))

mov tgt, upper
shld tgt, lower, amt

RSH_LO emits itself as the following, which operates as (tgt >>> amt) | (upper << (32 - amt))

mov tgt, lower
shrd tgt, upper, amt

LSRA and CodeGen were incorrectly swapping the operands for GT_RSH_LO which hasn't been caught because the morph logic almost exclusively generates GT_ROL. That is, RotateRight(x, 1) was generating a GT_ROL(x, 63) node, which means the GT_ROR path was effectively unused.

This was changed in .NET 8 when support for recognizing these as intrinsic on import was added and so RotateRight was imported directly as GT_ROR

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 10, 2023
@ghost ghost assigned tannergooding May 10, 2023
@ghost
Copy link

ghost commented May 10, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

This resolves #86207

LSH_HI emits itself as the following, which operates as (tgt << amt) | (upper >>> (32 - amt))

mov tgt, upper
shld tgt, lower, amt

RSH_LO emits itself as the following, which operates as (tgt >>> amt) | (upper << (32 - amt))

mov tgt, lower
shrd tgt, upper, amt

LSRA and CodeGen were incorrectly swapping the operands for GT_RSH_LO which hasn't been caught because the morph logic almost exclusively generates GT_ROL. That is, RotateRight(x, 1) was generating a GT_ROL(x, 63) node, which means the GT_ROR path was effectively unused.

This was changed in .NET 8 when support for recognizing these as intrinsic on import was added and so RotateRight was imported directly as GT_ROR

Author: tannergooding
Assignees: tannergooding
Labels:

area-CodeGen-coreclr

Milestone: -

Comment on lines +4848 to +4849
inst_Mov(targetType, tgtReg, regHi, /* canSkip */ true);
inst_RV_RV_IV(ins, emitTypeSize(targetType), tgtReg, regLo, count);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual operands here are all swapped around. This is generating result = (upper << count) | (lower >>> (32 - count)) for example.

For DecomposeRotate when count < 32, this ends up being the following (which is correct):

result.lower = (value.lower << count) | (value.upper >>> (32 - count))
result.upper = (value.upper << count) | (value.lower >>> (32 - count))

This is because DecomposeRotate passes it in as hiCopy, loOp1 and then as loCopy, hiOp1.

Ideally this would be swapped so that its passed in as loOp1, hiCopy then hiOp1, loCopy with codegen doing:

    regNumber tgtReg = tree->GetRegNum();
    assert(regHi != tgtReg);

    inst_Mov(targetType, tgtReg, regLo, /* canSkip */ true);
    inst_RV_RV_IV(ins, emitTypeSize(targetType), tgtReg, regHi, count);

However, swapping these + LSRA was leading to failures and I didn't bother investigating further since its already working "as is".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, swapping these + LSRA was leading to failures

I wonder if that is because you need to change LIR emitted for the ROR (the order of args/clones)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had thought I had gotten that all when testing, but it's possible I missed something when trying. It's likely not worth extra effort to solve though since this is all working now.

[Fact]
public static int TestRotateLeft1()
{
long[] firstOp = new long[] { 7822136809956075968, 1 };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

presumably nothing prevents us from folding firstOp[0] to a constant and then perform constant folding in VN so the actual instruction will never be emitted

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but the bug repros in T0 and Debug, so it should get hit in general unless we start doing those foldings there as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it sounds that we waste TP in Tier0 to recognize this pattern

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug repro'd in T0 because we recognize long.RotateRight in import. It's a primitive intrinsic which is trivial to recognize, much like the other ones we recognize in T0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a primitive intrinsic which is trivial to recognize

Ah ok, I though it's about that bigger one in morph that recognizes plain shifts

@tannergooding
Copy link
Member Author

Hmmm. More tests are failing that I didn't see locally. I must've messed something up here.

@tannergooding tannergooding merged commit c13325f into dotnet:main May 11, 2023
@tannergooding tannergooding deleted the fix-86027 branch May 11, 2023 19:10
@ghost ghost locked as resolved and limited conversation to collaborators Jun 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Invalid ROL/ROR logic when accessing array element on x86

2 participants