[Codegen] Introduce VectorizableOpInterface and migrate LinalgExt::GatherOp#23653
[Codegen] Introduce VectorizableOpInterface and migrate LinalgExt::GatherOp#23653
Conversation
bc0a2cb to
16bcb57
Compare
…therOp Introduce a VectorizableOpInterface that provides an unified extension point for vectorization. Operations implement this interface to declare, whether it is vectorizable, how they should be vectorized, and the driver dispatches through it. - Implement external model for IREE::LinalgExt::GatherOp, inlining the vectorization logic from vectorizeLinalgExtGatherToTransferGather. - Update GenericVectorization.cpp to dispatch through the interface for ops that implement it, with TypeSwitch fallback for unmigrated ops. It is a step towards https://lists.lfaidata.foundation/g/iree-technical-discussion/message/15 Assisted-by: Claude Signed-off-by: hanhanW <[email protected]>
16bcb57 to
c35263b
Compare
Signed-off-by: hanhanW <[email protected]>
Signed-off-by: hanhanW <[email protected]>
sommerlukas
left a comment
There was a problem hiding this comment.
Just a small nit, but please also wait for @Groverkss to review.
| DictionaryAttr options) const { | ||
| auto gatherOp = cast<IREE::LinalgExt::GatherOp>(op); | ||
| int64_t batchRank = gatherOp.getBatchRank(); | ||
| Location loc = gatherOp.getLoc(); |
There was a problem hiding this comment.
Might be worth creating an ImplicitLocOpBuilder here and use it in the rest of this function.
There was a problem hiding this comment.
I tried ImplicitLocOpBuilder when it was created, but I did not like the idea much. It was useful if you create operations in a single line, though I still don't like the idea.
The other point is that it extends OpBuilder, not RewriterBase, so methods like modifyOpInPlace don't work.
There was a problem hiding this comment.
I'm not sure what you mean by
operations in a single line
If we create an ImplicitLocOpBuilder once in the beginning of vectorize, there is nine places where we pass rewriter, loc and could be passing that builder instead.
For things like modifyOpInPlace, you can keep using the rewriter, I think.
| auto indicesVecRead = vector::TransferReadOp::create( | ||
| rewriter, loc, indicesVecTy.clone(indicesTy.getElementType()), | ||
| gatherOp.getIndices(), SmallVector<Value>(indicesTy.getRank(), zero), | ||
| std::nullopt); |
There was a problem hiding this comment.
Note that eventually I would like to move this to an interface. Using transfer_read/transfer_write to vectorize something is not a very nice way of vectorization. We should be doing dialect conversion imo. We can do that after the VectorTile size analysis lands.
There was a problem hiding this comment.
Is there a plan for VectorTile size analysis yet? Is it something similar to what we implemented:
iree/compiler/src/iree/compiler/Codegen/Utils/Utils.h
Lines 363 to 372 in 4e3da92
There was a problem hiding this comment.
There is a PR with a first attempt at the analysis here: #23668
The idea is that we take the tile sizes from to_layout operations (that come from lowering_config) and propagate that to the other compute ops. It's then annotated as a discardable attribute, so we can use it in GenericVectorization. There, we use three sources of information in this order:
lowering_config- Result of the VectorTileSizeAnalysis
inferSizesFromIR
There was a problem hiding this comment.
The quick comment is about getVectorScalableFlags method. The lowering_config can implement the method to provide vector tile sizes. It's a little fragile because lowering config can be messed up during the transformation, so it is default off on x86 backends.
There was a problem hiding this comment.
So, you mean we need to propagate the scalable flags in the analysis?
There was a problem hiding this comment.
I meant the design idea was using the interface method, if the lowering config has the information. Scalable flags is just something similar to vector size and only some CPU backends use it.
Introduce a VectorizableOpInterface that provides an unified extension point for vectorization. Operations implement this interface to declare, whether it is vectorizable, how they should be vectorized, and the driver dispatches through it.
No new tests because it is an NFC in terms of functionality. It just follows different mechanism for vectorization.
It is a step towards https://lists.lfaidata.foundation/g/iree-technical-discussion/message/15
Assisted-by: Claude