Tags: DavidChan0519/iree

snapshot-20211109.639

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
[LLVMGPU][NFC] Unify hal.interface conversion for ROCM and CUDA (iree-org#7568)

snapshot-20211109.638

Implementation of Util::AlignOp with tests and integration into compiler passes (iree-org#7437)

Adds a Util.Align op to `UtilOps.td`. Align takes two arguments, the value to align and the alignment, and returns the aligned value.

The `--iree-hal-pack-allocations` pass produces `Util::Align` ops instead of arithmetic ops for the alignment. The `--iree-vm-conversion` pass converts the alignment ops directly to VM arithmetic/const ops (bypassing the arithmetic ops altogether).

iree-org#5405
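
For readers unfamiliar with the arithmetic such an op replaces, here is a minimal sketch of an align computation. It assumes the op rounds up and that the alignment is a power of two; the commit message states neither, and the function name `align_up` is hypothetical, not part of IREE.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment.

    Assumption: alignment is a positive power of two, which is what
    makes the mask trick below valid.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding (alignment - 1) pushes any unaligned value past the next
    # boundary; masking off the low bits then snaps it back onto it.
    return (value + alignment - 1) & ~(alignment - 1)
```

An already aligned value is returned unchanged, so applying the function twice gives the same result as applying it once.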

snapshot-20211108.637

Adding -iree-stream-schedule-execution + -concurrency passes. (iree-org#7549)

The passes themselves are rather simple and call into a partitioning
routine that performs the real work, the intent being that we can have
many such routines and select one based on scoped attributes
in the IR (kind of like lowering configs in codegen). Today there's
just a reference implementation that does a single level of concurrency.
The hope is that someone who actually knows how to write a good
partitioning algorithm can contribute something better, but it's at
least no worse than what we have today and better than simple ML
systems that have no concurrency.

Though the passes are similar they operate at different scopes and
will have different partitioning algorithms. I thought about trying
to unify them; however, keeping them separate allows us to do things
like use a more complex execution partitioning pass while using the
same generic concurrency scheduling etc - including disabling the
concurrency scheduling entirely for debugging or environments where
there may be no benefits to such scheduling (single core execution,
etc). It's easy enough to reason about how they could be unified that
I wanted to err on the side of flexibility until we have an owner and
at least one or two more algorithms we can use to feel out the shape of
things.

A benefit of the independent execution and concurrency partitioning is
that debugging either is much simpler (and there's pretty good `-debug`
output). Since the concurrency scheduling operates only within the
scheduled execution regions there's no need to worry about host/device
interactions or the parent op CFG.
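
To illustrate what a single level of concurrency partitioning might look like, here is a minimal sketch that groups operations into waves by dependency depth. This is not the reference implementation from the commit; the function name, the `deps` input format, and the depth-based grouping are all assumptions chosen for illustration, and the dependency graph is assumed to be acyclic.

```python
def partition_into_waves(deps: dict[str, list[str]]) -> list[list[str]]:
    """Group ops into waves whose members do not depend on each other.

    deps maps each op name to the ops it depends on (hypothetical
    format). Every op must appear as a key; the graph must be acyclic.
    """
    depth: dict[str, int] = {}

    def depth_of(op: str) -> int:
        # An op with no dependencies has depth 0; otherwise it sits one
        # level below its deepest dependency.
        if op not in depth:
            depth[op] = 1 + max((depth_of(d) for d in deps[op]), default=-1)
        return depth[op]

    waves: dict[int, list[str]] = {}
    for op in deps:
        waves.setdefault(depth_of(op), []).append(op)
    # Waves run one after another; ops inside a wave may run concurrently.
    return [waves[level] for level in sorted(waves)]
```

Because the grouping only looks at dependencies between ops inside one region, it never has to consider host/device interactions or control flow in the parent op, which matches the scoping described above.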

snapshot-20211108.636

[pydm] Implement sufficient support to run a couple of types of fibonacci (iree-org#7301)

* Not yet compliant with anything we want but minimally works for default integer sizes (the VM seems to have issues with fp).
* Required also building out initial support for tuples and lists in order to get proper support for multiple returns (used for promotion RTL helpers).
* Adds while loop.
* Various fixes.

snapshot-20211107.635

[pydm] Implement sufficient support to run a couple of types of fibonacci (iree-org#7301)

* Not yet compliant with anything we want but minimally works for default integer sizes (the VM seems to have issues with fp).
* Required also building out initial support for tuples and lists in order to get proper support for multiple returns (used for promotion RTL helpers).
* Adds while loop.
* Various fixes.

snapshot-20211107.634

[pydm] Implement sufficient support to run a couple of types of fibonacci (iree-org#7301)

* Not yet compliant with anything we want but minimally works for default integer sizes (the VM seems to have issues with fp).
* Required also building out initial support for tuples and lists in order to get proper support for multiple returns (used for promotion RTL helpers).
* Adds while loop.
* Various fixes.

snapshot-20211106.633

NFC: improve naming and doc for tiled and distributed loop info (iree-org#7558)

The logic behind rediscovering the loop tiling and distribution
information is quite dense. This makes at least the API clearer.

snapshot-20211106.632

Adding -iree-stream-refine-usage pass. (iree-org#7537)

This adds the resource usage analysis pass using DFX to solve for
usage across a whole module and a pass that applies that analyzed
usage information back into the types.
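
As a rough illustration of solving for usage across a whole module, here is a minimal fixed-point sketch. It assumes that usage observed at a consumer propagates back to the value that feeds it, and that the solver iterates until nothing changes; the function name, the usage labels, and the `flows` edge list are hypothetical and do not reflect the actual DFX API.

```python
def solve_usage(
    local_usage: dict[str, set[str]],
    flows: list[tuple[str, str]],
) -> dict[str, set[str]]:
    """Propagate usage sets along flow edges until a fixed point.

    local_usage maps each value to the usage seen directly on it.
    flows lists (producer, consumer) pairs; both are hypothetical inputs.
    """
    usage = {name: set(bits) for name, bits in local_usage.items()}
    changed = True
    while changed:
        changed = False
        for producer, consumer in flows:
            # Assumption: whatever the consumer is used for also counts
            # as a use of the producer it was derived from.
            merged = usage[producer] | usage[consumer]
            if merged != usage[producer]:
                usage[producer] = merged
                changed = True
    return usage
```

The sets only ever grow and are bounded by the finite set of labels, so the loop terminates; a second step could then write each solved set back onto the corresponding type.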

snapshot-20211105.631

Adding -iree-stream-refine-usage pass. (iree-org#7537)

This adds the resource usage analysis pass using DFX to solve for
usage across a whole module and a pass that applies that analyzed
usage information back into the types.

snapshot-20211105.630

Adding -iree-stream-refine-usage pass. (iree-org#7537)

This adds the resource usage analysis pass using DFX to solve for
usage across a whole module and a pass that applies that analyzed
usage information back into the types.