Lowering Vector512() methods : comparison + shift#84942
Lowering Vector512() methods : comparison + shift#84942tannergooding merged 4 commits intodotnet:mainfrom
Conversation
7cde562 to
34fb88b
Compare
34fb88b to
00ced07
Compare
00ced07 to
8b7ec63
Compare
|
cc @dotnet/jit-contrib |
|
There's some merge conflicts that need to be resolved, let me know if you need any assistance with them. |
8b7ec63 to
8eed177
Compare
Have resolved them |
ad69641 to
46b5933
Compare
…corresponding *ANY(), *ALL() and ConditionaSelect() ShiftLeft, ShiftRight, ShiftRightArithmetic
| // TODO-XArch-CQ: It's a non-trivial amount of work to support these | ||
| // for floating-point while only utilizing AVX. It would require, among | ||
| // other things, inverting the comparison and potentially support for a | ||
| // new Avx.TestNotZ intrinsic to ensure the codegen remains efficient. | ||
| assert(compIsaSupportedDebugOnly(InstructionSet_AVX2)); | ||
| intrinsic = NI_Vector256_op_Equality; |
There was a problem hiding this comment.
For other reviewers, this is just a copy/paste of an existing comment that used to be shared across the total of GE/GT/LE/LT
It's actually a lot easier for us to handle this today if we wanted to and is maybe a small/easy win we can do for .NET 8 (of course in a separate PR), so having the logic duplicated now will make doing that a bit simpler
|
Manually triggered replay to try to get past infra issue: https://dev.azure.com/dnceng-public/public/_build/results?buildId=245191&view=results |
Thanks @BruceForstall .Looks like it passed on rerun. Do let me know if any other changes are needed or this is good to go. |
| intrinsic = NI_Vector512_op_Equality; | ||
| } | ||
| else if (simdSize == 32) | ||
| if (simdSize == 32) |
There was a problem hiding this comment.
just curious, why we decided to check for 32 , then 64 and then (implicit) 16 or less? Is it because Vector256 is more common and should hit that condition first from TP perspective?
There was a problem hiding this comment.
Yep. Initially when we had throughput issues, this was one of the things we tried. Making check for 32 the first(being the most common case)
kunalspathak
left a comment
There was a problem hiding this comment.
A suggestion and a question. Feel free to do it in follow-up PR or including it with the next PR.
Co-authored-by: Kunal Pathak <[email protected]>
|
Resolved conflict with multiply pr |
|
Resolved conflict with #85070 |
Includes following
Comparison :
LessThan(),LessThanOrEqual(),GreaterThan(),GreaterThanOrEqual()+corresponding*ANY(),*ALL()andConditionalSelect()Arithmetic :
ShiftLeft,ShiftRight,ShiftRightArithmeticOpen : Some cases for
ConditionalSelect()uses blend. Skipping for now. Have some other thoughts here anyway : usingVPTERNLOG()??@dotnet/avx512-contrib