Initial plumbing for inner_tiled on CPU with data-tiled MMA attribute. #23494

Merged: bjacob merged 4 commits into iree-org:main from bjacob:inner_tiled_cpu on Mar 3, 2026

@bjacob (Collaborator) commented Feb 17, 2026

This is just the initial round of plumbing to get to the point where one can at least build and validate an inner_tiled op on CPU, meaning with a kind parameter that is a CPU-specific data-tiled MMA layout attribute whose intrinsic enum designates a SIMD intrinsic.

Most of this was written by AI.

@bjacob bjacob requested review from Max191 and jtuyls February 17, 2026 19:01
@bjacob bjacob marked this pull request as ready for review February 17, 2026 19:01
@bjacob bjacob requested a review from hanhanW as a code owner February 17, 2026 19:01
@egebeysel (Contributor) left a comment

Very interesting stuff, so I appointed myself to review, if you don't mind :) I don't really have many comments regarding the static cases; it LGTM on that front.

However, I'm actively working on scalable vector ISAs (SVE, SME, RVV) and have some concerns about whether the current semantics of the operation can model them. I left my thoughts as comments, but if I'm not mistaken, the operation currently does not allow them at all.

I'm willing to work on this and bolt in the missing pieces myself. I think it's important that we do this early, rather than later after we have switched to this op instead of mmt4d. That said, I'm not sure how that would play out with the current infrastructure built around the inner_tiled op. Let me know what you think!

Also, more of a question than a review comment: what is the incentive for switching from mmt4d to inner_tiled in the CPU pipeline? Is it mainly to unify the data-tiling work for the CPU and GPU pipelines, or is there an actual semantic gap between these ops on CPU?

@egebeysel (Contributor)

cc @banach-space

@bjacob (Collaborator, Author) commented Feb 20, 2026

Thanks @egebeysel for all the comments. We had a conversation offline and then here: https://discord.com/channels/689900678990135345/1456679894812590326/1474440363199434843

The summary is that we agree that further changes are needed anyway, but the present PR lays some groundwork that is useful regardless. I would like to go ahead and get this merged just to take it off of my stack of WIP PRs.

@krzysz00 (Contributor) left a comment

Interface implementations and such lgtm

@egebeysel (Contributor) left a comment

Thanks for the offline discussion. LGTM as well :)

@hanhanW (Contributor) left a comment

Just a few nits. I don't know what the final picture is, but I'm sure that it is going to be great. I'm looking forward to it, and thanks for starting the work!

bjacob added 2 commits March 2, 2026 17:36
Signed-off-by: Benoit Jacob <[email protected]>
Signed-off-by: Benoit Jacob <[email protected]>
@bjacob bjacob force-pushed the inner_tiled_cpu branch from 8a338cd to b1be72a Compare March 2, 2026 18:53
bjacob added 2 commits March 2, 2026 18:59
Signed-off-by: Benoit Jacob <[email protected]>
Signed-off-by: Benoit Jacob <[email protected]>
@bjacob bjacob merged commit 771eac5 into iree-org:main Mar 3, 2026
57 checks passed

4 participants