
[Codegen] Create CombineSourceLayoutTransformation pass for MapGatherOp#23165

Merged
Abhishek-Varma merged 10 commits into iree-org:main from Abhishek-Varma:avarma_map_gather_combine_layout
Feb 10, 2026

Conversation

@Abhishek-Varma
Contributor

@Abhishek-Varma Abhishek-Varma commented Jan 16, 2026

-- This commit creates a separate pass `iree-codegen-combine-source-layout-transformation`,
adding support for folding relayout operations into the map_gather op.
-- Folding of the following ops is supported through the changes in this PR :-
1. `linalg::CopyOp`
2. `linalg::TransposeOp`
3. `tensor::ExpandShapeOp`
4. `tensor::CollapseShapeOp`
5. `tensor::ExtractSliceOp`
6. `tensor::PadOp`
-- Consequently, `getPaddingValue` and `createIdentityMapGather` member methods were added to `MapGatherOp`.
-- The implementation of the `iree-codegen-combine-source-layout-transformation` pass resides in the same file as `MapScatter`'s relayout folding implementation (which has been renamed to `iree-codegen-combine-result-layout-transformation`).
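For intuition, a map_gather can be thought of as a function from output indices to source indices, and folding a relayout op into it amounts to composing that index mapping with the relayout's own index transformation. Below is a toy Python sketch of folding a transpose; the names and representation here are made up for illustration and are not IREE's actual API.

```python
# Toy model: a "gather" is a map from output indices to source indices.
# Folding a transpose of the source composes the permutation into that map.

def make_identity_map():
    # An identity gather reads source[idx] for output index idx.
    return lambda idx: idx

def fold_transpose(index_map, perm):
    # For perm (1, 0), the transposed tensor t satisfies t[i, j] = src[j, i],
    # so a gather that read t at idx must now read src at the permuted idx.
    return lambda idx: tuple(index_map(idx)[p] for p in perm)

src = [[1, 2, 3],
       [4, 5, 6]]          # shape 2x3
perm = (1, 0)              # transpose -> shape 3x2

idx_map = fold_transpose(make_identity_map(), perm)

transposed = [[src[idx_map((i, j))[0]][idx_map((i, j))[1]]
               for j in range(2)] for i in range(3)]
print(transposed)          # [[1, 4], [2, 5], [3, 6]]
```

The same composition idea applies to the other relayout ops: each one contributes an index transformation that gets absorbed into the gather's mapping.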

Contributor

@hanhanW hanhanW left a comment


I did not review all the code yet, but I have a first high-level question: do we want both transformations (gather and scatter) to be in a single pass? They look orthogonal to me, and it is not clear which folding is prioritized.

The other high-level comment is about passing a bool-ish argument (i.e., your new enum) to functions. It makes this much more difficult to follow (especially in what is already difficult code). This comment depends on the first mega comment about whether we want to split it into two passes or not.

@nirvedhmeshram
Contributor

-- Unlike map_scatter, we don't allow tensor.pad op in the relayout op chain

I think we want to allow this, as the use case mentioned in #23038 has a pad op in it?
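For intuition on why pad is the interesting case: folding a tensor.pad between the source and the gather means some gathered indices fall outside the original source, so the gather needs a padding value for out-of-bounds reads. A toy 1-D Python sketch of that semantics (illustrative only, not IREE code):

```python
# Toy illustration: folding a pad into a gather turns in-bounds reads into
# source reads and out-of-bounds reads into the padding value.

def gather_with_pad(source, low_pad, pad_value, out_len):
    out = []
    for i in range(out_len):
        src_idx = i - low_pad            # index into the unpadded source
        if 0 <= src_idx < len(source):
            out.append(source[src_idx])  # in-bounds: read the source
        else:
            out.append(pad_value)        # out-of-bounds: padding value
    return out

print(gather_with_pad([1, 2, 3], low_pad=2, pad_value=0, out_len=6))
# [0, 0, 1, 2, 3, 0]
```

This is presumably why a `getPaddingValue` hook on the gather op is needed once pad folding is supported.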

@hanhanW
Contributor

hanhanW commented Jan 21, 2026

Hey @Abhishek-Varma I did not get a chance to review it today. I'll prioritize it tomorrow. You can continue working on other changes and use stacked PR features. (or just send the PR out and ask me/others to review the last commit.)

hanhanW

This comment was marked as abuse.

@hanhanW
Contributor

hanhanW commented Jan 21, 2026

I don't follow why a review comment was generated by LLM agents. I hid it and deleted the app.

Contributor

@hanhanW hanhanW left a comment


I haven't reviewed the core logic yet, but can we clean up first? I think some of it is not used at all, and it is adding tech debt and overhead for reviewers.

@Abhishek-Varma Abhishek-Varma force-pushed the avarma_map_gather_combine_layout branch from 7b5bfa1 to 6821f91 Compare January 29, 2026 06:28
@Abhishek-Varma Abhishek-Varma changed the title [Codegen] Add support of map_gather in CombineLayoutTransformation for Dispatch Scope [Codegen] Create CombineLayoutTransformationForMapGather pass Jan 30, 2026
@Abhishek-Varma Abhishek-Varma force-pushed the avarma_map_gather_combine_layout branch from 9dc047e to 7e3b728 Compare January 30, 2026 09:08
@Abhishek-Varma
Contributor Author

Abhishek-Varma commented Jan 30, 2026

Hi @hanhanW.

I've now changed the algorithm: I create an identity map_gather at the result of load_from_buffer and iteratively fuse in consumer relayout ops, instead of the earlier approach of creating an identity map_gather at the end of a relayout op chain. It's more intuitive now.

I've also refactored the code because most of the foldings follow a similar structure - so reviewing it will be easier now!
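The iterative direction described above can be sketched roughly like this (hypothetical names and a toy op representation, not IREE's actual code): start from an identity gather at the load, then repeatedly fold whichever relayout op consumes the gather's result, stopping at the first non-relayout consumer.

```python
from dataclasses import dataclass

RELAYOUT_OPS = {"linalg.copy", "linalg.transpose", "tensor.expand_shape",
                "tensor.collapse_shape", "tensor.extract_slice", "tensor.pad"}

@dataclass
class Op:
    name: str
    consumer: "Op | None" = None  # single-use chain, for simplicity

def combine_source_layouts(load_op):
    """Walk forward from the load, absorbing each relayout consumer."""
    folded = []                      # stands in for mutating the map_gather
    op = load_op.consumer            # identity map_gather sits at the load
    while op is not None and op.name in RELAYOUT_OPS:
        folded.append(op.name)       # i.e. fold_into_gather(gather, op)
        op = op.consumer
    return folded

# A chain load -> transpose -> expand_shape -> matmul: the first two fold,
# the matmul terminates the walk.
mm = Op("linalg.matmul")
ex = Op("tensor.expand_shape", mm)
tr = Op("linalg.transpose", ex)
ld = Op("load_from_buffer", tr)
print(combine_source_layouts(ld))  # ['linalg.transpose', 'tensor.expand_shape']
```

The key difference from the earlier approach is the direction of the walk: the gather is anchored at the known source buffer and grows forward, rather than being created at the end of a pre-discovered chain.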

Locally, I've included this pass at:

```cpp
funcPassManager.addPass(
    createCombineLayoutTransformationPass(combineLayoutOptions));
```

and have switched off the temporary workaround flag iree-codegen-test-c-promtion :-

I used the above setup to test the original regression with the repro dispatch, and I confirm it resolves the issue :-

```
real    0m10.902s
user    0m10.810s
sys     0m0.163s
```

NOTE: I first confirmed by simply switching off the iree-codegen-test-c-promtion flag that compilation of the repro dispatch was taking > 10 minutes. This was done as a way to "revert" the temporary workaround first and confirm the issue exists. Then I followed the above setup and observed that, with the pass introduced through this PR, the compilation completes successfully.

I've also verified the output of the dispatch, and even isolated a few cases, like the transpose op, and verified their e2e correctness too.

I'm attaching output.mlir and input.mlir for reference, showing that the pass can handle complex relayout chains.

So, once this PR merges, I can raise another PR that removes the temporary workaround and adds this pass to addGPUTileAndFusePassPipeline.

CC: @MaheshRavishankar @nirvedhmeshram @jtuyls @Max191

… Dispatch scope

-- This commit adds support for the map_gather op in CombineLayoutTransformation in
   Dispatch scope.
-- Unlike map_scatter, we don't allow the tensor.pad op in the relayout op chain
   for map_gather. This is because map_scatter has a known output buffer at the
   end of the chain, while map_gather only has a known source buffer at the
   beginning of the relayout op chain.

Signed-off-by: Abhishek Varma <[email protected]>
Contributor

@Max191 Max191 left a comment


Overall, looks good. I left one main comment about the tensor.pad handling, and then mostly nits.

One more high level question: Have you verified numerical correctness with any e2e examples using the CombineLayoutTransformationForMapGather pass yet?

Contributor

@hanhanW hanhanW left a comment


Mostly looks good. I was debating whether I should ask for refactoring, or defer that until we're evaluating whether to collapse the two passes or not. Moving them to the same file and keeping them as two separate passes looks good to me.

Signed-off-by: Abhishek Varma <[email protected]>
@Abhishek-Varma Abhishek-Varma force-pushed the avarma_map_gather_combine_layout branch from 35808c2 to 48f1f33 Compare February 3, 2026 12:42
@Abhishek-Varma
Contributor Author

One more high level question: Have you verified numerical correctness with any e2e examples using the CombineLayoutTransformationForMapGather pass yet?

Hi @Max191 - yes, I did in fact check e2e correctness, as reported here. I verified the repro's output as well as a standalone transpose op example.

And today I also checked the e2e correctness of a bunch of complex relayout op chains.

Here's what I did :-

  1. Formed a complex chain of relayout ops.
  2. Applied the current pass.
  3. Added the output to the e2e map_gather test already checked in, along with corresponding check.expect* ops specifying the expected tensor values.

The tests passed! :D

Contributor

@hanhanW hanhanW left a comment


Mostly looks good, just some nits. I think we over-abstract the "factor" functions.

@hanhanW
Contributor

hanhanW commented Feb 6, 2026

The PR title and description need to be updated.

@Abhishek-Varma Abhishek-Varma changed the title [Codegen] Create CombineLayoutTransformationForMapGather pass [Codegen] Create CombineSourceLayoutTransformation pass for MapGatherOp Feb 9, 2026
Contributor

@hanhanW hanhanW left a comment


just one final nit. lgtm

Contributor

@Max191 Max191 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice work! Just a couple nits.

Signed-off-by: Abhishek Varma <[email protected]>
@Abhishek-Varma Abhishek-Varma enabled auto-merge (squash) February 10, 2026 13:34
@Abhishek-Varma Abhishek-Varma merged commit 1320245 into iree-org:main Feb 10, 2026
57 checks passed
MaheshRavishankar pushed a commit to MaheshRavishankar/iree that referenced this pull request Feb 24, 2026
…Op (iree-org#23165)

