[Codegen] Migrate all ops to VectorizableOpInterface and remove TypeSwitch by hanhanW · Pull Request #23713 · iree-org/iree

hanhanW · 2026-03-09T20:27:28Z

The revision inlines the implementation from the VectorExt transforms to avoid a circular dependency. All ops are now vectorized through the interface, so the TypeSwitch is removed from the driver.

It also updates vectorizeGatherLikeGenericToTransferGather so that it no longer rewrites the result internally. The driver is expected to perform the replacement.
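For readers unfamiliar with the pattern, here is a minimal standalone sketch of interface-based dispatch where the driver, not the op, performs the replacement. It does not use MLIR or IREE; `Op`, `GatherLikeOp`, `Values`, and `runDriver` are hypothetical names invented for illustration, and the string values only stand in for IR values.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <vector>

using Values = std::vector<std::string>;

// Hypothetical stand-in for an IR operation; not an IREE or MLIR type.
struct Op {
  std::string name;
  Values results;
  virtual ~Op() = default;
  // Interface hooks: ops that cannot be vectorized keep these defaults.
  virtual bool isVectorizable() const { return false; }
  // Returns replacement values only; it does not rewrite the op itself.
  virtual std::optional<Values> vectorize() const { return std::nullopt; }
};

struct GatherLikeOp : Op {
  GatherLikeOp() {
    name = "gather_like";
    results = {"%scalar"};
  }
  bool isVectorizable() const override { return true; }
  std::optional<Values> vectorize() const override {
    return Values{"%transfer_gather"};
  }
};

// Driver: no per-type switch, and it owns the replacement step.
// Returns the number of ops that were replaced.
int runDriver(std::vector<std::unique_ptr<Op>> &ops) {
  int replaced = 0;
  for (std::unique_ptr<Op> &op : ops) {
    if (!op->isVectorizable())
      continue;
    std::optional<Values> newValues = op->vectorize();
    if (!newValues)
      continue; // Leave the op untouched on failure.
    op->results = *newValues; // Stand-in for rewriter.replaceOp(op, values).
    ++replaced;
  }
  return replaced;
}
```

Calling `runDriver` on a list holding a plain `Op` and a `GatherLikeOp` replaces only the latter. Keeping the replacement in one place lets every interface implementation follow the same contract.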

It is the last step towards https://lists.lfaidata.foundation/g/iree-technical-discussion/message/15

Assisted-by: Claude

hanhanW · 2026-03-09T20:30:19Z

It depends on

sommerlukas · 2026-03-10T09:25:36Z

compiler/src/iree/compiler/Codegen/Common/GenericVectorization.cpp

+    if (succeeded(result)) {
+      rewriter.replaceOp(op, *result);
+    }


Should we signal pass failure here if the result is failure? The interface documentation says returning failure after isVectorizable returned true is a bug, so maybe we should signal this here. Otherwise, we will have silent failures that we need to track back to here later on.

hanhanW · 2026-03-10T17:56:51Z

Good point, let me give it a shot. The code aims to match the current behavior, but let's see whether it triggers silent bugs.

It triggers a bug in the CUDA backend: #23750

I reverted it and added a debug message. I think it is better to bypass the upstream bug (if one exists) so the compiler is more stable.

sommerlukas · 2026-03-10T09:26:32Z

compiler/src/iree/compiler/Codegen/Dialect/VectorExt/Transforms/VectorizeIREEVectorExtOps.cpp

-#include "iree/compiler/Dialect/LinalgExt/Utils/Utils.h"
-#include "mlir/Dialect/Linalg/IR/Linalg.h"
-#include "mlir/Dialect/Linalg/Transforms/Transforms.h"
+#include "mlir/Dialect/Arith/IR/Arith.h"


Why do we need a new header if we only delete code?

hanhanW · 2026-03-10T17:59:29Z

Good catch! I'll remove it and the VectorOps.h include below.

They were required because we had dialect deps in Passes.td. I removed all of them.

(Previously, it relied on indirect includes to solve the issue. That's why Claude added it.)

compiler/src/iree/compiler/Codegen/Dialect/VectorExt/Transforms/Transforms.h

sommerlukas · 2026-03-10T09:33:30Z

compiler/src/iree/compiler/Codegen/Interfaces/VectorizableOpInterface.cpp

+  }
+
+  for (Value val : newOutputs) {
+    Value out = tensor::EmptyOp::create(rewriter, fullOp.getLoc(), vectorSizes,


Nit: We could modernize this code with an ImplicitLocOpBuilder.

hanhanW · 2026-03-10T18:01:29Z

I'm not convinced that ImplicitLocOpBuilder is the modern builder. It has been around for a long time, and we don't use it. FYI, the past usage that I've seen is mostly in the stablehlo project, and that seems to be someone's preference.

I agree that it may be a proper utility for creating ops that require a body, though, like a generic op.
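To make the trade-off concrete, here is a toy model of the idea behind an implicit-location builder: the location is captured once instead of being passed to every create call. This is not MLIR code; `ToyBuilder`, `ToyImplicitLocBuilder`, and `buildBothWays` are hypothetical names, and the op names are arbitrary strings.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy model, not MLIR: an op is recorded as "name@location".
struct ToyBuilder {
  std::vector<std::string> created;
  void create(const std::string &loc, const std::string &opName) {
    created.push_back(opName + "@" + loc);
  }
};

// Wrapper that captures the location once and forwards it on every call.
class ToyImplicitLocBuilder {
public:
  ToyImplicitLocBuilder(std::string loc, ToyBuilder &builder)
      : loc(std::move(loc)), builder(builder) {}
  void create(const std::string &opName) { builder.create(loc, opName); }

private:
  std::string loc;
  ToyBuilder &builder;
};

// Builds the same three ops in both styles and returns the two records.
std::pair<std::vector<std::string>, std::vector<std::string>> buildBothWays() {
  ToyBuilder explicitBuilder;
  explicitBuilder.create("loc0", "tensor.empty");
  explicitBuilder.create("loc0", "arith.addf");
  explicitBuilder.create("loc0", "linalg.yield");

  ToyBuilder wrapped;
  ToyImplicitLocBuilder implicitBuilder("loc0", wrapped);
  implicitBuilder.create("tensor.empty");
  implicitBuilder.create("arith.addf");
  implicitBuilder.create("linalg.yield");

  return {explicitBuilder.created, wrapped.created};
}
```

Both styles record identical ops. The wrapper saves repetition mainly when many ops share one location, for example inside a region body.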

compiler/src/iree/compiler/Codegen/Interfaces/VectorizableOpInterface.cpp

sommerlukas · 2026-03-10T09:38:04Z

compiler/src/iree/compiler/Codegen/Common/GenericVectorization.cpp

+    // Filter out PadOp/PackOp/UnPackOp when masking is disabled.
+    // TODO(hanchung): Enable the vectorization without masking. This is mostly
+    // legacy code because it used to not working without masking.


Probably not as relevant here, as the dependency on masking should go away, but when should such a check happen in the driver vs. when should we pass the option to isVectorizable and decide there?

hanhanW · 2026-03-10T17:54:38Z

Passing InputVectorSize implies masking, which has been the default behavior for a long time. This option is used to decide whether vector size inference is required (see below). Reason: some targets do not support masking, and it would require emulation in this context. The option provides a path to not vectorize the op on such targets, like old aarch64 targets.

Just for more context: this is legacy code. The option was added because the non-masking implementation was done after masking support.
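The driver-side check described in this thread can be sketched roughly as follows. This is one reading of the discussion, not the code in GenericVectorization.cpp: `OpKind`, `passesMaskingFilter`, and `chooseInputVectorSizes` are hypothetical names, and treating an empty size list as "no masking" is an assumption made for the sketch.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical op categories; only the three filtered kinds matter here.
enum class OpKind { Generic, Pad, Pack, UnPack };

// Pad/pack/unpack are filtered out when masking is disabled.
bool passesMaskingFilter(OpKind kind, bool enableMasking) {
  if (enableMasking)
    return true;
  return kind != OpKind::Pad && kind != OpKind::Pack && kind != OpKind::UnPack;
}

// Returns the vector sizes to hand to the vectorizer, or nullopt when the
// driver skips the op. Assumption: an empty list means "no input vector
// sizes", so no masking is requested.
std::optional<std::vector<int64_t>>
chooseInputVectorSizes(OpKind kind, bool enableMasking,
                       const std::vector<int64_t> &inferredSizes) {
  if (!passesMaskingFilter(kind, enableMasking))
    return std::nullopt;
  if (!enableMasking)
    return std::vector<int64_t>{};
  // With masking enabled, size inference is required and its result is used.
  return inferredSizes;
}
```

With masking disabled, a pad op is skipped while a generic op proceeds without input vector sizes. With masking enabled, every kind receives the inferred sizes.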

Groverkss · 2026-03-10T16:24:47Z

compiler/src/iree/compiler/Codegen/Interfaces/VectorizableOpInterface.cpp

+    FailureOr<linalg::VectorizationResult> result = linalg::vectorize(
+        rewriter, op, vectorSizes, scalableDims, vectorizeNDExtract,
+        flatten1DDepthwiseConv, /*assumeDynamicDimsMatchVecSizes=*/false,
+        createNamedContraction);


I would eventually like to move away from upstream vectorization, but nothing actionable for now.

hanhanW · 2026-03-10T18:03:42Z

That would be tricky, as we may need to duplicate the code for other ops like elementwise ops, contractions, reductions, etc.

hanhanW · 2026-03-11T01:19:47Z

compiler/src/iree/compiler/Codegen/LLVMCPU/Passes.cpp

+      // Decompose vectorized+bufferized map_store ops before lowering to loops.
+      .addPass(IREE::LinalgExt::createDecomposeMapStorePass)


I need to drop this commit; it was an experimental change.

kuhar

just drive-by nits

kuhar · 2026-03-11T12:46:29Z

compiler/src/iree/compiler/Codegen/Common/GenericVectorization.cpp

+
+  // Build DictionaryAttr options from pass options. These are forwarded to
+  // upstream linalg::vectorize().
+  SmallVector<NamedAttribute> linalgOptionsList;


nit: We know there are at most 2 elements

Suggested change

-  SmallVector<NamedAttribute> linalgOptionsList;
+  SmallVector<NamedAttribute, 2> linalgOptionsList;

Note that this is only true in this snapshot; it is not true for the existing upstream API: https://github.com/llvm/llvm-project/blob/3f65a03e8abb3e6fb3372cf4c254d6c9f090e2e0/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp#L2728-L2732

FailureOr<VectorizationResult> mlir::linalg::vectorize(
    RewriterBase &rewriter, Operation *op, ArrayRef<int64_t> inputVectorSizes,
    ArrayRef<bool> inputScalableVecDims, bool vectorizeNDExtract,
    bool flatten1DDepthwiseConv, bool assumeDynamicDimsMatchVecSizes,
    bool createNamedContraction) {

I'm happy to make the change for this snapshot, but I'd like to point out that it effectively asks future contributors to update the size whenever they add new options.
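As a rough illustration of why the number of options is open-ended, the sketch below maps a keyed option dictionary onto the boolean parameters named in the upstream signature quoted above. It is standalone C++ rather than MLIR: `VectorizeFlags` and `flagsFromOptions` are hypothetical, `std::map` stands in for a DictionaryAttr, and the `false` defaults are assumptions.

```cpp
#include <map>
#include <string>

// Hypothetical mirror of the boolean knobs in the upstream signature.
// The defaults here are assumptions for the sketch.
struct VectorizeFlags {
  bool vectorizeNDExtract = false;
  bool flatten1DDepthwiseConv = false;
  bool assumeDynamicDimsMatchVecSizes = false;
  bool createNamedContraction = false;
};

// Reads each known key if present; missing keys keep their defaults.
// Unknown keys are ignored here, although rejecting them may be preferable.
VectorizeFlags flagsFromOptions(const std::map<std::string, bool> &options) {
  VectorizeFlags flags;
  auto read = [&options](const char *key, bool &slot) {
    std::map<std::string, bool>::const_iterator it = options.find(key);
    if (it != options.end())
      slot = it->second;
  };
  read("vectorizeNDExtract", flags.vectorizeNDExtract);
  read("flatten1DDepthwiseConv", flags.flatten1DDepthwiseConv);
  read("assumeDynamicDimsMatchVecSizes", flags.assumeDynamicDimsMatchVecSizes);
  read("createNamedContraction", flags.createNamedContraction);
  return flags;
}
```

A caller that sets only `vectorizeNDExtract` gets the defaults for the other three flags, so adding a new key later does not require touching existing call sites.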

kuhar · 2026-03-11T12:48:36Z

compiler/src/iree/compiler/Codegen/Common/GenericVectorization.cpp

+  SmallVector<NamedAttribute> linalgOptionsList;
+  linalgOptionsList.push_back(
+      rewriter.getNamedAttr("vectorizeNDExtract", rewriter.getBoolAttr(true)));
+  if (vectorizeToTransferGather) {
+    linalgOptionsList.push_back(rewriter.getNamedAttr(
+        "vectorizeToTransferGather", rewriter.getBoolAttr(true)));
+  }


nit: Don't use Builder::getNamedAttr, you can just pass in a StringRef to the NamedAttribute constructor:

Suggested change

-  SmallVector<NamedAttribute> linalgOptionsList;
-  linalgOptionsList.push_back(
-      rewriter.getNamedAttr("vectorizeNDExtract", rewriter.getBoolAttr(true)));
-  if (vectorizeToTransferGather) {
-    linalgOptionsList.push_back(rewriter.getNamedAttr(
-        "vectorizeToTransferGather", rewriter.getBoolAttr(true)));
-  }
+  SmallVector<NamedAttribute> linalgOptionsList = {{"vectorizeNDExtract", rewriter.getBoolAttr(true)}};
+  if (vectorizeToTransferGather) {
+    linalgOptionsList.emplace_back(
+        "vectorizeToTransferGather", rewriter.getBoolAttr(true));
+  }

Is it your coding preference? I don't see an issue with using Builder::getNamedAttr. If we already construct things from the builder, I'd lean towards being consistent.

In terms of LSP integration, the function signature is provided in this case. Brace initializers do not expose the signature easily to the tools (at least it does not work with my tools).
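The two styles under discussion build the same list, which the toy model below tries to show. It uses `std::vector` and a made-up `ToyNamedAttr` instead of SmallVector and NamedAttribute, and `makeNamedAttr` is only a stand-in for a builder helper, so none of these names are real MLIR APIs.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a named attribute with a boolean payload.
struct ToyNamedAttr {
  std::string name;
  bool value;
  ToyNamedAttr(std::string name, bool value)
      : name(std::move(name)), value(value) {}
};

// Stand-in for a builder helper: a named function with a visible signature.
ToyNamedAttr makeNamedAttr(const std::string &name, bool value) {
  return ToyNamedAttr(name, value);
}

// Style 1: go through the helper and push_back the result.
std::vector<ToyNamedAttr> viaHelper(bool toTransferGather) {
  std::vector<ToyNamedAttr> options;
  options.push_back(makeNamedAttr("vectorizeNDExtract", true));
  if (toTransferGather)
    options.push_back(makeNamedAttr("vectorizeToTransferGather", true));
  return options;
}

// Style 2: brace-initialize and emplace_back straight into the constructor.
std::vector<ToyNamedAttr> viaConstructor(bool toTransferGather) {
  std::vector<ToyNamedAttr> options = {{"vectorizeNDExtract", true}};
  if (toTransferGather)
    options.emplace_back("vectorizeToTransferGather", true);
  return options;
}

// Element-wise comparison of two option lists.
bool sameOptions(const std::vector<ToyNamedAttr> &a,
                 const std::vector<ToyNamedAttr> &b) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i].name != b[i].name || a[i].value != b[i].value)
      return false;
  return true;
}
```

Since the resulting lists are identical, the choice comes down to consistency with surrounding builder calls versus brevity, which matches the tone of the exchange above.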

compiler/src/iree/compiler/Codegen/Common/GenericVectorization.cpp

kuhar · 2026-03-11T12:50:08Z

compiler/src/iree/compiler/Codegen/Interfaces/VectorizableOpInterface.cpp

+buildPartialGenericOp(RewriterBase &rewriter, linalg::GenericOp fullOp,
+                      ArrayRef<int64_t> vectorSizes,
+                      SmallVector<Operation *> partial,
+                      DenseMap<Value, std::pair<Value, AffineMap>> &tmap) {


What does tmap stand for? I think this would be worth expanding or adding a docstring for.

It is explained in the comment below. Let me add a function comment and move the documentation there.

compiler/src/iree/compiler/Codegen/Interfaces/VectorizableOpInterface.cpp

compiler/src/iree/compiler/Codegen/Dialect/VectorExt/Transforms/Passes.td

sommerlukas · 2026-03-12T17:42:31Z

compiler/src/iree/compiler/Codegen/Common/GenericVectorization.cpp

+    if (failed(result)) {
+      return signalPassFailure();
+    }


I'm a bit confused between this code change and the reply to my previous comment. Does the bug #23750 currently mean we can't signal pass failure here or can we just keep this change?

I'm sorry about the confusion. Apparently, I made changes without committing. 🤦‍♂️

What happens with operations that remain unvectorized because their vectorization failed? Will they be handled by some alternative path or will they just fail later in the pipeline?

It depends on the pipeline setup, but the basic lowering should handle it. They should be lowered to scalar code. There may be bufferization issues (like a large stack buffer or shared memory), though.

I find it difficult to say what the best strategy is here. I agree with your point that failing the pass here could mean that upstream changes/bugs cause instability. On the other hand, silently falling back to scalar code could either be inefficient (slow, which we would only catch in benchmarks, potentially with some delay that requires bisecting) or lead to mis-compilation later on in the pipeline (stack buffer too large etc.).

The right choice probably depends on how bad the fallback can become. As there is an existing bug, maybe we can stick with this approach for now and add a TODO to add the pass failure later.

We can add an option that escalates a silent failure to an actual failure in such cases, so it may be easier to discover the cause when we see benchmark regressions. This will require more work because there may be other similar "bugs" when we look at all the pipelines.

iree/compiler/src/iree/compiler/Codegen/Utils/CodegenOptions.h

Lines 14 to 29 in ccebc03

// A base class that defines common codegen options that are shared across
// different backends (e.g., CPU and GPU). Derived classes can add
// backend-specific options as needed.
//
// Note: We need static members because they are shared across all derived
// instances to bind LLVM cl::opt registration at the single storage when
// multiple backends inherit from this class.
struct CodegenOptions {
  // Path to a module containing a tuning spec.
  static std::string tuningSpecPath;

  // Whether to add attributes for the tuner on root ops.
  static bool setTunerAttributes;

  void bindOptions(OptionsBinder &binder);
};
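The escalation option floated in this thread could look roughly like the sketch below. It is a standalone model under stated assumptions, not a proposal for the actual flag: `FailurePolicy`, `Candidate`, `DriverReport`, and `runWithPolicy` are hypothetical names, and the `skipped` list stands in for the debug message the driver emits today.

```cpp
#include <string>
#include <vector>

// Hypothetical policy knob: keep going on failure, or fail the pass.
enum class FailurePolicy { FallBackToScalar, FailPass };

// Toy candidate op: whether vectorization succeeds is given up front.
struct Candidate {
  std::string name;
  bool vectorizeSucceeds;
};

struct DriverReport {
  bool passFailed = false;
  int vectorized = 0;
  std::vector<std::string> skipped; // Stand-in for debug messages.
};

// Walks the candidates in order. Under FailPass, the first vectorization
// failure stops the walk and marks the pass as failed.
DriverReport runWithPolicy(const std::vector<Candidate> &ops,
                           FailurePolicy policy) {
  DriverReport report;
  for (const Candidate &op : ops) {
    if (op.vectorizeSucceeds) {
      ++report.vectorized;
      continue;
    }
    if (policy == FailurePolicy::FailPass) {
      report.passFailed = true;
      return report;
    }
    report.skipped.push_back(op.name);
  }
  return report;
}
```

Running the same candidate list under both policies shows the trade-off discussed above: the fallback policy finishes and records what was skipped, while the strict policy surfaces the first failure immediately.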

hanhanW requested review from Groverkss, MaheshRavishankar, Max191 and qedawkins as code owners March 9, 2026 20:27

hanhanW mentioned this pull request Mar 9, 2026

[vectorization] Add vectorization of non-projected linalg.generic #23664

Merged

sommerlukas reviewed Mar 10, 2026

Groverkss approved these changes Mar 10, 2026

hanhanW force-pushed the users/hanhanW/vec-iface-c3a branch from ba02055 to dcccae8 Compare March 10, 2026 17:46

hanhanW requested review from IanWood1, krzysz00, kuhar and nirvedhmeshram as code owners March 10, 2026 17:46

hanhanW force-pushed the users/hanhanW/vec-iface-c3b branch from a3305bc to d5dfe0f Compare March 10, 2026 17:46

hanhanW commented Mar 10, 2026

hanhanW force-pushed the users/hanhanW/vec-iface-c3a branch from dcccae8 to c8f23d0 Compare March 10, 2026 23:07

hanhanW force-pushed the users/hanhanW/vec-iface-c3b branch from d5dfe0f to 355a2cd Compare March 10, 2026 23:08

hanhanW force-pushed the users/hanhanW/vec-iface-c3a branch from c8f23d0 to 54492ef Compare March 11, 2026 01:16

hanhanW force-pushed the users/hanhanW/vec-iface-c3b branch from 355a2cd to d9fbaa3 Compare March 11, 2026 01:16

hanhanW commented Mar 11, 2026

hanhanW removed request for IanWood1, krzysz00, kuhar and nirvedhmeshram March 11, 2026 01:20

kuhar reviewed Mar 11, 2026

Base automatically changed from users/hanhanW/vec-iface-c3a to main March 12, 2026 16:20

hanhanW added 4 commits March 12, 2026 09:37

[Codegen] Migrate all ops to VectorizableOpInterface and remove TypeSwitch

3ab5e82

Signed-off-by: hanhanW <[email protected]>

Defer the replacement to driver.

c646cc5

Signed-off-by: hanhanW <[email protected]>

address comments

e2904b5

Signed-off-by: hanhanW <[email protected]>

Do not signal a failure if it fails.

63a8891

Signed-off-by: hanhanW <[email protected]>

hanhanW force-pushed the users/hanhanW/vec-iface-c3b branch from d9fbaa3 to 63a8891 Compare March 12, 2026 16:38

hanhanW added 2 commits March 12, 2026 10:11

address comments

d4ea0c6

Signed-off-by: hanhanW <[email protected]>

Add size to SmallVector

2c037a3

Signed-off-by: hanhanW <[email protected]>

sommerlukas reviewed Mar 12, 2026

Do not signal failure.

5a81d3b

Signed-off-by: hanhanW <[email protected]>

hanhanW requested a review from sommerlukas March 12, 2026 18:29

spell out auto

d35f1e5

Signed-off-by: hanhanW <[email protected]>

sommerlukas approved these changes Mar 12, 2026

hanhanW merged commit d1e7974 into main Mar 12, 2026
59 of 60 checks passed

hanhanW deleted the users/hanhanW/vec-iface-c3b branch March 12, 2026 20:49

hanhanW mentioned this pull request Mar 18, 2026

Move unrolling to common and use in CPU pipeline #23837

Closed
