ASPLOS'26 Tutorial: RAIC

Reconfigurable AI Computing

Mar 23 (Afternoon) @ ASPLOS'26

Abstract: The rapid proliferation of AI workloads across devices with vastly different computational capabilities has created an ecosystem of irregular and dynamic runtime demands. No single dataflow or layout can optimally serve all workloads, motivating years of research into reconfigurable accelerators that can flexibly adapt their dataflow and layout to the workload’s needs. However, despite significant academic progress, no publicly available reconfigurable accelerator platform currently exists for the community to learn from or build upon.

This tutorial fills that gap by offering the first hands-on experience in learning, simulating, and deploying a reconfigurable accelerator, from hardware to compiler co-design. Participants will gain practical knowledge across three pillars:

  • Pillar 1 – Hardware (FEATHER): We introduce FEATHER, a state-of-the-art reconfigurable accelerator architecture that enables low-cost switching between dataflows and layouts, efficiently supporting diverse workload patterns. The tutorial provides RTL simulation guidance and hands-on exercises to explore how FEATHER achieves this flexibility.
  • Pillar 2 – Accelerator Design Language (Allo): To enable scalable accelerator generation, we present Allo, a Python-embedded Accelerator Design Language that allows participants to generate FEATHER variants at different scales with only minor frontend program changes. This facilitates rapid scaling and evolution of the accelerator across device classes and feature extensions.
  • Pillar 3 – Compiler Infrastructure (ACT): Distinct workloads demand specialized dataflow and layout mappings for optimal performance. We introduce ACT, an ecosystem that automatically generates software tools, such as compilers, for the Allo-generated FEATHER accelerator. The generated compiler explores the large space of dataflow and layout mappings to identify the most latency-efficient configuration for each workload.
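The mapping search in Pillar 3 can be pictured as a cost-model-driven enumeration: score every (dataflow, layout) pair under a latency model and keep the cheapest. The sketch below is illustrative only; the dataflow and layout names, the affinity numbers, and the function names are hypothetical stand-ins, not ACT's or LayoutLoop's actual interfaces.

```python
from itertools import product

# Hypothetical mapping space: these names are illustrative, not
# FEATHER's actual configurations.
DATAFLOWS = ["weight-stationary", "output-stationary", "row-stationary"]
LAYOUTS = ["row-major", "column-major", "tiled"]

# Toy affinity factors standing in for a real analytical cost model
# (e.g. one accounting for PE utilization, data reuse, and on-chip
# reordering cost). Lower is better; values are made up.
AFFINITY = {
    ("weight-stationary", "row-major"): 1.3,
    ("weight-stationary", "column-major"): 1.6,
    ("weight-stationary", "tiled"): 1.2,
    ("output-stationary", "row-major"): 1.4,
    ("output-stationary", "column-major"): 1.5,
    ("output-stationary", "tiled"): 1.0,
    ("row-stationary", "row-major"): 1.1,
    ("row-stationary", "column-major"): 1.7,
    ("row-stationary", "tiled"): 1.3,
}

def estimate_latency(dataflow, layout, workload):
    # Latency proxy: MAC count of an (M, N, K) matrix multiplication,
    # scaled by the mapping's affinity factor.
    m, n, k = workload
    return m * n * k * AFFINITY[(dataflow, layout)]

def search_best_mapping(workload):
    # Exhaustively enumerate the (small) mapping space and keep the
    # lowest-latency configuration, as a per-layer compiler pass might.
    return min(product(DATAFLOWS, LAYOUTS),
               key=lambda dl: estimate_latency(dl[0], dl[1], workload))

best = search_best_mapping((64, 64, 256))
print(best)  # -> ('output-stationary', 'tiled')
```

A real mapping space is far larger (tiling factors, loop orders, spatial partitioning), which is why the generated compiler, rather than brute force alone, handles the exploration.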

By the end of the tutorial, participants will understand how to compile diverse workloads, generate scalable accelerators, simulate performance via deployable RTL models, and tune the system for real-world deployment scenarios, empowering the ASPLOS community to advance the next generation of reconfigurable AI accelerators.

RAIC Resources

List of Topics

1:30 PM ~ 1:40 PM

Overview of the tutorial

1:40 PM ~ 2:20 PM

Introduction to Reconfigurable AI Computing

Speaker: Tushar Krishna

  • Flexible Dataflow Demand
  • Flexible Layout Demand
  • Various Deployment Scenarios
2:30 PM ~ 3:30 PM

FEATHER – Reconfigurable AI Accelerator

Speaker: Jianming Tong

  • FEATHER microarchitecture (15 min)
  • FEATHER+ microarchitecture (5 min)
  • FEATHER's ISA (MINISA) (10 min)
  • FEATHER's mapper (Mapping, Layout) Search (5 min)
  • LayoutLoop: FEATHER's Analytical Modeling (5 min)
  • RTL simulation flow for FEATHER(+) (5 min)
  • Hands-on: MINISA visualization (5 min)
Break -- Jupyter Notebook setup (3:30 PM ~ 4:00 PM)
4:00 PM ~ 4:50 PM

ACT Ecosystem – Automatically generating software support for accelerators

Speaker: Devansh Jain

  • Talk: Introduction to ACT Ecosystem (20 min)
  • Hands-on: Quick tutorial on TAIDL (ACT’s ISA specification language) (10 min)
  • Hands-on: Writing FEATHER ISA in TAIDL and generating its compiler (10 min)
  • Hands-on: Compilation using the generated FEATHER compiler (10 min)
4:50 PM ~ 5:40 PM

Allo – Python-embedded Accelerator Design Language (ADL)

Speaker: Niansong Zhang

  • Talk: Introduction to Allo (20 min)
  • Hands-on: Deploy MINISA Traces on Allo-FEATHER (10 min)
  • Hands-on: End-to-end Demonstration of Matrix Multiplications -> ACT -> Allo-FEATHER (10 min)
5:40 PM ~ 6:00 PM

Conclusion and Ongoing Development

  • Conclusion and Summary (5 min)
  • Q&A (15 min)

Organizers

Citation

If you use FEATHER/MINISA in your research, please cite our papers:

@inproceedings{tong2026MINISA,
  author = {Tong, Jianming and Li, Yujie and Jain, Devansh and Mendis, Charith and Krishna, Tushar},
  title = {MINISA: Minimal Instruction Set Architecture for Next-gen Reconfigurable Inference Accelerator},
  year = {2026},
  booktitle = {Proceedings of the 34th Annual International Symposium on Performance Analysis of Systems and Software},
  keywords = {minimal instruction set architecture, reconfigurable accelerator, virtual neurons},
  location = {Seoul, Korea},
  series = {ISPASS '26}
}

@inproceedings{tong2024FEATHER,
  author = {Tong, Jianming and Itagi, Anirudh and Chatarasi, Prasanth and Krishna, Tushar},
  title = {FEATHER: A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching},
  year = {2024},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  booktitle = {Proceedings of the 51st Annual International Symposium on Computer Architecture},
  keywords = {flexible accelerator, dataflow-layout coswitching},
  location = {Buenos Aires, Argentina},
  series = {ISCA '24}
}

If you use Allo in your research, please cite our papers; the PLDI'24 paper below provides full details:

@article{chen2024allo,
  author       = {Chen, Hongzheng and Zhang, Niansong and Xiang, Shaojie and Zeng, Zhichen and Dai, Mengjia and Zhang, Zhiru},
  title        = {Allo: A Programming Model for Composable Accelerator Design},
  journal      = {Proceedings of the ACM on Programming Languages},
  volume       = {8},
  number       = {PLDI},
  articleno    = {171},
  year         = {2024},
  month        = jun,
  publisher    = {ACM},
  doi          = {10.1145/3656401}
}

@article{fang2025dato,
  author       = {Fang, Shihan and Chen, Hongzheng and Zhang, Niansong and Li, Jiajie and Meng, Han and Liu, Adrian and Zhang, Zhiru},
  title        = {Dato: A Task-Based Programming Model for Dataflow Accelerators},
  journal      = {arXiv preprint arXiv:2509.06794},
  year         = {2025},
  url          = {https://arxiv.org/abs/2509.06794}
}

@inproceedings{zhuang2025aries,
  author       = {Zhuang, Jinming and Xiang, Shaojie and Chen, Hongzheng and Zhang, Niansong and Yang, Zhuoping and Mao, Tony and Zhang, Zhiru and Zhou, Peipei},
  title        = {ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines},
  booktitle    = {Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays},
  series       = {FPGA},
  year         = {2025},
  publisher    = {ACM},
  note         = {Best Paper Nominee}
}

@inproceedings{pouchet2024formal,
  author       = {Pouchet, Louis-No{\"e}l and Tucker, Emily and Zhang, Niansong and Chen, Hongzheng and Pal, Debjit and Rodr{\'i}guez, Gabriel and Zhang, Zhiru},
  title        = {Formal Verification of Source-to-Source Transformations for HLS},
  booktitle    = {Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays},
  series       = {FPGA},
  year         = {2024},
  publisher    = {ACM},
  note         = {Best Paper Award}
}

@article{chen2024llmfpga,
  author       = {Chen, Hongzheng and Zhang, Jiahao and Du, Yixiao and Xiang, Shaojie and Yue, Zichao and Zhang, Niansong and Cai, Yaohui and Zhang, Zhiru},
  title        = {Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference},
  journal      = {ACM Transactions on Reconfigurable Technology and Systems},
  year         = {2024},
  publisher    = {ACM},
  note         = {FCCM 2024 Journal Track}
}