Improving the SIMD codegen for SIMD12 load/store by tannergooding · Pull Request #80083 · dotnet/runtime

tannergooding · 2022-12-31T21:10:48Z

This adds support for containment on TYP_SIMD12 loads/stores and improves the codegen to require fewer temporary registers and use better instructions when available.

Improved Load:

- lea      rax, bword ptr [rcx+30H]
- vmovss   xmm0, dword ptr [rax+08H]
- vmovsd   xmm1, qword ptr [rax]
- vshufps  xmm1, xmm0, 68
+ vmovsd   xmm0, qword ptr [rcx+30H]
+ vinsertps xmm0, dword ptr [rcx+38H], 2

Improved Store:

- vmovsd   qword ptr [rdx], xmm1
- vpshufd  xmm0, xmm1, 2
- vmovss   dword ptr [rdx+08H], xmm0
+ vmovsd   qword ptr [rdx], xmm0
+ vextractps dword ptr [rdx+08H], xmm0, 2

Combined, this saves 9 bytes of codegen and improves the PerfScore by 1.5.

The total diffs are all relatively similar: they emit vmovsd + vinsertps or vmovsd + vextractps and remove the now-unnecessary lea by containing the address computation instead.
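For readers less familiar with these instructions, the new load/store shape can be modeled in portable C++. This is only an illustrative sketch, not the JIT's code: the `Xmm` type and every function name below are hypothetical, and each helper approximates one instruction by its lane behavior (an 8-byte move for lanes 0-1, then a 4-byte insert or extract for lane 2).

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Hypothetical model of a 128-bit XMM register as four float lanes.
using Xmm = std::array<float, 4>;

// Models the memory-source form of movsd: loads 8 bytes into lanes 0-1
// and zeroes lanes 2-3.
Xmm LoadLow64(const unsigned char* mem) {
    Xmm r{};  // all lanes start at zero
    std::memcpy(r.data(), mem, 8);
    return r;
}

// Models insertps with a memory source: reads 4 bytes and writes them into
// the chosen lane, leaving the other lanes unchanged.
Xmm InsertLane(Xmm dst, const unsigned char* mem, unsigned lane) {
    std::memcpy(&dst[lane & 3], mem, 4);
    return dst;
}

// Models the memory-destination form of movsd: stores lanes 0-1 (8 bytes).
void StoreLow64(unsigned char* mem, const Xmm& src) {
    std::memcpy(mem, src.data(), 8);
}

// Models extractps with a memory destination: stores one lane (4 bytes).
void ExtractLane(unsigned char* mem, const Xmm& src, unsigned lane) {
    std::memcpy(mem, &src[lane & 3], 4);
}

// 12-byte load: two memory reads, no address temporary, lane 3 ends up zero.
Xmm LoadSimd12(const unsigned char* mem) {
    return InsertLane(LoadLow64(mem), mem + 8, 2);
}

// 12-byte store: two memory writes straight from the source register,
// so no shuffle into a scratch register is needed.
void StoreSimd12(unsigned char* mem, const Xmm& v) {
    StoreLow64(mem, v);
    ExtractLane(mem + 8, v, 2);
}
```

In this model, both halves of each operation address memory relative to the same base pointer, which mirrors how the contained `[rcx+30H]` / `[rcx+38H]` operands make the separate `lea` unnecessary. Only 12 bytes are ever read or written.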

ghost · 2022-12-31T21:11:00Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Author:	tannergooding
Assignees:	-
Labels:	`area-CodeGen-coreclr`
Milestone:	-

tannergooding · 2023-01-03T18:59:00Z

CC. @dotnet/jit-contrib, this should be ready for review.

This gives small size savings on x64 (~2k bytes in fullopts and ~0.5k bytes in minopts) and a small throughput (TP) win on x64.

tannergooding · 2023-01-04T23:28:18Z

/azp run runtime-coreclr jitstress-isas-x86, runtime-coreclr jitstress-isas-arm, runtime-coreclr outerloop

azure-pipelines · 2023-01-04T23:28:44Z

Azure Pipelines successfully started running 3 pipeline(s).

tannergooding · 2023-01-05T16:38:33Z

/azp run runtime-coreclr jitstress-isas-x86, runtime-coreclr jitstress-isas-arm, runtime-coreclr outerloop

azure-pipelines · 2023-01-05T16:38:53Z

Azure Pipelines successfully started running 3 pipeline(s).

tannergooding · 2023-01-05T20:16:02Z

Fixed the jitstress failure. The change now results in ~3.3k savings on x86/x64 and a -0.01% TP improvement.

src/coreclr/jit/emitxarch.cpp

TIHan

This looks good to me. There seems to be a failure for ARM though, but it looks like just one test case at the moment.

tannergooding · 2023-01-05T21:21:46Z

> There seems to be a failure for ARM though, but it looks like just one test case at the moment.

It's an unrelated/existing GC timeout. I've retriggered it and it should pass on rerun.

src/coreclr/jit/simdcodegenxarch.cpp

tannergooding · 2023-01-06T02:59:36Z

/azp run runtime-coreclr jitstress-isas-x86, runtime-coreclr jitstress-isas-arm, runtime-coreclr outerloop

azure-pipelines · 2023-01-06T03:00:03Z

Azure Pipelines successfully started running 3 pipeline(s).

Improving the SIMD codegen for SIMD12 load/store

1c5d0fa

ghost added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Dec 31, 2022

ghost assigned tannergooding Dec 31, 2022

tannergooding added 3 commits December 31, 2022 14:23

Apply formatting patch

375bf44

Ensure the right index is used for insertps

4a60feb

Ensure emitIns_SIMD_* is used for insertps to handle the non-rmw form

6d66374

build-analysis bot mentioned this pull request Jan 1, 2023

Tracking issue for CI build timeouts #76454

Closed

Fix the input size flag for extractps

c0fd1d3

tannergooding marked this pull request as ready for review January 2, 2023 05:41

tannergooding mentioned this pull request Jan 2, 2023

Rewrite how Matrix3x2 and Matrix4x4 are implemented #80091

Merged

runfoapp bot mentioned this pull request Jan 2, 2023

Long Running Test: Interop/MonoAPI/MonoMono/PInvokeDetach/PInvokeDetach.sh #73040

Closed

tannergooding added 2 commits January 5, 2023 07:41

Merge remote-tracking branch 'dotnet/main' into better-simdcodegen

f95db7d

Fix an issue where the second half of a TYP_SIMD12 store used the wrong register

7ede98d

TIHan reviewed Jan 5, 2023

src/coreclr/jit/emitxarch.cpp (resolved)

TIHan reviewed Jan 5, 2023

SingleAccretion reviewed Jan 5, 2023

src/coreclr/jit/simdcodegenxarch.cpp (3 resolved threads)

tannergooding force-pushed the better-simdcodegen branch from 5c4afad to ca3ada1 Compare January 5, 2023 22:16

Ensure relocatable handles for TYP_SIMD12 load/stores are not contained

7bee874

tannergooding force-pushed the better-simdcodegen branch from ca3ada1 to 7bee874 Compare January 5, 2023 23:12

Ensure arm32 can build

799b029

TIHan approved these changes Jan 6, 2023

runfoapp bot mentioned this pull request Jan 6, 2023

Infra improvements for Helix #68176

Closed

tannergooding merged commit 87312f7 into dotnet:main Jan 6, 2023

ghost locked as resolved and limited conversation to collaborators Feb 5, 2023

tannergooding deleted the better-simdcodegen branch July 1, 2025 14:40
