[PCF] Add producer fusion into pcf.generic/loop ops #23447
qedawkins merged 2 commits into iree-org:main
Add FuseProducersPass (iree-pcf-fuse-producers) that fuses DPS producers into pcf.generic and pcf.loop ops through their tied init arguments.

For each tied init that is the single result of a TilingInterface + DestinationStyleOpInterface producer (e.g. linalg.fill, linalg.transpose), the pass:

1. Replaces the scoped op's init with the producer's DPS init.
2. At each pcf.read_slice on the corresponding sref, generates a tiled version of the producer via generateResultTileValue.
3. For vector-typed read_slices, converts the tiled tensor result to a vector via vector.transfer_read.
4. Erases the original producer if it has no remaining uses.

The pass requires SyncOnReturn semantics on the sref, under which overlapping reads and writes are undefined behavior, allowing all reads to be treated as seeing the init value.
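To make the idea behind the rewrite concrete, here is a minimal, self-contained C++ sketch of why recomputing a producer tile at each read site can stand in for reading a slice of the fully materialized producer result. It is not the pass implementation and uses no MLIR or IREE APIs: `Tensor2D`, `Slice`, `transposeFull`, `readSlice`, `transposeTile`, `makeIota`, and `fusedMatchesUnfused` are all hypothetical names introduced for illustration, and a plain transpose stands in for a producer such as linalg.transpose. It assumes, as the description states for SyncOnReturn, that every read may be treated as seeing the init value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2-D tensor (row-major); not an MLIR/IREE type.
struct Tensor2D {
  std::size_t rows, cols;
  std::vector<float> data;
  Tensor2D(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
  float &at(std::size_t r, std::size_t c) {
    assert(r < rows && c < cols);
    return data[r * cols + c];
  }
  float at(std::size_t r, std::size_t c) const {
    assert(r < rows && c < cols);
    return data[r * cols + c];
  }
};

// Hypothetical slice request: offsets and sizes, in the spirit of a read_slice.
struct Slice {
  std::size_t row, col, numRows, numCols;
};

// Test helper: fill a tensor with 0, 1, 2, ... in row-major order.
Tensor2D makeIota(std::size_t rows, std::size_t cols) {
  Tensor2D t(rows, cols);
  for (std::size_t i = 0; i < t.data.size(); ++i)
    t.data[i] = static_cast<float>(i);
  return t;
}

// Unfused producer: materialize the whole transposed tensor up front.
Tensor2D transposeFull(const Tensor2D &in) {
  Tensor2D out(in.cols, in.rows);
  for (std::size_t r = 0; r < in.rows; ++r)
    for (std::size_t c = 0; c < in.cols; ++c)
      out.at(c, r) = in.at(r, c);
  return out;
}

// Unfused read: copy a slice out of the materialized producer result.
Tensor2D readSlice(const Tensor2D &t, Slice s) {
  Tensor2D tile(s.numRows, s.numCols);
  for (std::size_t i = 0; i < s.numRows; ++i)
    for (std::size_t j = 0; j < s.numCols; ++j)
      tile.at(i, j) = t.at(s.row + i, s.col + j);
  return tile;
}

// Fused read: compute only the requested tile of the transpose, directly
// from the producer's input, without materializing the full result.
Tensor2D transposeTile(const Tensor2D &in, Slice s) {
  Tensor2D tile(s.numRows, s.numCols);
  for (std::size_t i = 0; i < s.numRows; ++i)
    for (std::size_t j = 0; j < s.numCols; ++j)
      tile.at(i, j) = in.at(s.col + j, s.row + i);
  return tile;
}

// The property the sketch demonstrates: both paths yield the same tile.
bool fusedMatchesUnfused(const Tensor2D &in, Slice s) {
  return readSlice(transposeFull(in), s).data == transposeTile(in, s).data;
}
```

In this toy model, `transposeTile` plays the role of the tiled producer generated at a read site, and `readSlice(transposeFull(...))` plays the role of the original, unfused program. The equivalence only holds because nothing writes to the materialized result between its creation and the read, which is the situation the SyncOnReturn assumption is described as guaranteeing.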
Max191 left a comment
Overall LGTM, just some nits.
One question: Is fusion through inits expected to happen separately from fusion via implicit capture? Why does this pass only perform fusion through the init args?
compiler/src/iree/compiler/Codegen/Dialect/PCF/Transforms/Transforms.h
compiler/src/iree/compiler/Codegen/Dialect/PCF/Transforms/test/fuse_producers.mlir
Fusion through implicit capture doesn't need anything special since it's just
So then we won't have pcf.read_slice ops on anything but the sref block args?
You'd have to convert to sref somewhere, and that wouldn't be via the pcf.generic/loop, since fusion through its tied inits is what this handles and that's the only source of sref the loop provides. So if we did have to handle pcf.read_slice somewhere else, it would be orthogonal to this. In general, I don't think making management of implicit capture the responsibility of the capturing op is a good idea outside of conversion to/from isolated-from-above. It breaks the core MLIR assumption that all uses are the same.