ASPLOS'26 Tutorial: RAIC

Reconfigurable AI Computing

Mar 23 (Afternoon) @ ASPLOS'26

Abstract: The rapid proliferation of AI workloads across devices with vastly different computational capabilities has created an ecosystem of irregular and dynamic runtime demands. No single dataflow or layout can optimally serve all workloads, motivating years of research into reconfigurable accelerators that can flexibly adapt their dataflow and layout to the workload’s needs. However, despite significant academic progress, no publicly available reconfigurable accelerator platform currently exists for the community to learn from or build upon.

This tutorial fills that gap by offering the first hands-on experience in learning, simulating, and deploying a reconfigurable accelerator, from hardware to compiler co-design. Participants will gain practical knowledge across three pillars:

  • Pillar 1 – Hardware (FEATHER): We introduce FEATHER, a state-of-the-art reconfigurable accelerator architecture that enables low-cost switching between dataflows and layouts, efficiently supporting diverse workload patterns. The tutorial provides RTL simulation guidance and hands-on exercises to explore how FEATHER achieves this flexibility.
  • Pillar 2 – Accelerator Design Language (Allo): To enable scalable accelerator generation, we present Allo, a Python-embedded Accelerator Design Language that allows participants to generate FEATHER variants at different scales with only minor frontend program changes. This facilitates rapid scaling and evolution of the accelerator across device classes and feature extensions.
  • Pillar 3 – Compiler Infrastructure (ACT): Distinct workloads demand specialized dataflow and layout mappings for optimal performance. We introduce ACT, an ecosystem that automatically generates software tools, such as compilers, for the Allo-generated FEATHER accelerator. The generated compiler explores the large space of dataflow and layout mappings to identify the most latency-efficient configuration for each workload.
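The mapping search in Pillar 3 can be pictured as a cost-model-driven enumeration: score every (dataflow, layout) pair under a latency model and keep the cheapest. The sketch below is illustrative only; the dataflow and layout names, the affinity numbers, and the function names are hypothetical stand-ins, not ACT's or LayoutLoop's actual interfaces.

```python
from itertools import product

# Hypothetical mapping space: these names are illustrative, not
# FEATHER's actual configurations.
DATAFLOWS = ["weight-stationary", "output-stationary", "row-stationary"]
LAYOUTS = ["row-major", "column-major", "tiled"]

# Toy affinity factors standing in for a real analytical cost model
# (e.g. one accounting for PE utilization, data reuse, and on-chip
# reordering cost). Lower is better; values are made up.
AFFINITY = {
    ("weight-stationary", "row-major"): 1.3,
    ("weight-stationary", "column-major"): 1.6,
    ("weight-stationary", "tiled"): 1.2,
    ("output-stationary", "row-major"): 1.4,
    ("output-stationary", "column-major"): 1.5,
    ("output-stationary", "tiled"): 1.0,
    ("row-stationary", "row-major"): 1.1,
    ("row-stationary", "column-major"): 1.7,
    ("row-stationary", "tiled"): 1.3,
}

def estimate_latency(dataflow, layout, workload):
    # Latency proxy: MAC count of an (M, N, K) matrix multiplication,
    # scaled by the mapping's affinity factor.
    m, n, k = workload
    return m * n * k * AFFINITY[(dataflow, layout)]

def search_best_mapping(workload):
    # Exhaustively enumerate the (small) mapping space and keep the
    # lowest-latency configuration, as a per-layer compiler pass might.
    return min(product(DATAFLOWS, LAYOUTS),
               key=lambda dl: estimate_latency(dl[0], dl[1], workload))

best = search_best_mapping((64, 64, 256))
print(best)  # -> ('output-stationary', 'tiled')
```

A real mapping space is far larger (tiling factors, loop orders, spatial partitioning), which is why the generated compiler, rather than brute force alone, handles the exploration.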

By the end of the tutorial, participants will understand how to compile diverse workloads, generate scalable accelerators, simulate performance via deployable RTL models, and tune the system for real-world deployment scenarios, empowering the ASPLOS community to advance the next generation of reconfigurable AI accelerators.

RAIC Resources

List of Topics

1:30 PM ~ 1:40 PM

Overview of the tutorial

1:40 PM ~ 2:20 PM

Introduction to Reconfigurable AI Computing

Speaker: Tushar Krishna

  • Flexible Dataflow Demand
  • Flexible Layout Demand
  • Various Deployment Scenarios
2:30 PM ~ 3:30 PM

FEATHER – Reconfigurable AI Accelerator

Speaker: Jianming Tong

  • FEATHER microarchitecture (15 min)
  • FEATHER+ microarchitecture (5 min)
  • FEATHER's ISA (MINISA) (10 min)
  • FEATHER's mapper (Mapping, Layout) Search (5 min)
  • LayoutLoop: FEATHER's Analytical Modeling (5 min)
  • RTL simulation flow for FEATHER(+) (5 min)
  • Hands-on: MINISA visualization (5 min)
Break -- Jupyter Notebook setup (3:30 PM ~ 4:00 PM)
4:00 PM ~ 4:50 PM

ACT Ecosystem – Automatically generating software support for accelerators

Speaker: Devansh Jain

  • Talk: Introduction to ACT Ecosystem (20 min)
  • Hands-on: Quick tutorial on TAIDL (ACT’s ISA specification language) (10 min)
  • Hands-on: Writing FEATHER ISA in TAIDL and generating its compiler (10 min)
  • Hands-on: Compilation using the generated FEATHER compiler (10 min)
4:50 PM ~ 5:40 PM

Allo – Python-embedded Accelerator Design Language (ADL)

Speaker: Niansong Zhang

  • Talk: Introduction to Allo (20 min)
  • Hands-on: Deploy MINISA Traces on Allo-FEATHER (10 min)
  • Hands-on: End-to-end Demonstration of Matrix Multiplications -> ACT -> Allo-FEATHER (10 min)
5:40 PM ~ 6:00 PM

Conclusion and Ongoing Development

  • Conclusion and Summary (5 min)
  • Q&A (15 min)

Organizers

Citation

If you use FEATHER/MINISA in your research, please cite our papers:

@inproceedings{tong2026MINISA,
  author = {Tong, Jianming and Li, Yujie and Jain, Devansh and Mendis, Charith and Krishna, Tushar},
  title = {MINISA: Minimal Instruction Set Architecture for Next-gen Reconfigurable Inference Accelerator},
  year = {2026},
  booktitle = {Proceedings of the 34th Annual International Symposium on Performance Analysis of Systems and Software},
  keywords = {minimal instruction set architecture, reconfigurable accelerator, virtual neurons},
  location = {Seoul, Korea},
  series = {ISPASS '26}
}

@inproceedings{tong2024FEATHER,
  author = {Tong, Jianming and Itagi, Anirudh and Chatarasi, Prasanth and Krishna, Tushar},
  title = {FEATHER: A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching},
  year = {2024},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  booktitle = {Proceedings of the 51st Annual International Symposium on Computer Architecture},
  keywords = {flexible accelerator, dataflow-layout coswitching},
  location = {Buenos Aires, Argentina},
  series = {ISCA '24}
}

If you use Allo in your research, please cite our papers; the PLDI'24 paper below provides full details:

@article{chen2024allo,
  author       = {Chen, Hongzheng and Zhang, Niansong and Xiang, Shaojie and Zeng, Zhichen and Dai, Mengjia and Zhang, Zhiru},
  title        = {Allo: A Programming Model for Composable Accelerator Design},
  journal      = {Proceedings of the ACM on Programming Languages},
  volume       = {8},
  number       = {PLDI},
  articleno    = {171},
  year         = {2024},
  month        = jun,
  publisher    = {ACM},
  doi          = {10.1145/3656401}
}

@article{fang2025dato,
  author       = {Fang, Shihan and Chen, Hongzheng and Zhang, Niansong and Li, Jiajie and Meng, Han and Liu, Adrian and Zhang, Zhiru},
  title        = {Dato: A Task-Based Programming Model for Dataflow Accelerators},
  journal      = {arXiv preprint arXiv:2509.06794},
  year         = {2025},
  url          = {https://arxiv.org/abs/2509.06794}
}

@inproceedings{zhuang2025aries,
  author       = {Zhuang, Jinming and Xiang, Shaojie and Chen, Hongzheng and Zhang, Niansong and Yang, Zhuoping and Mao, Tony and Zhang, Zhiru and Zhou, Peipei},
  title        = {ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines},
  booktitle    = {Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays},
  series       = {FPGA},
  year         = {2025},
  publisher    = {ACM},
  note         = {Best Paper Nominee}
}

@inproceedings{pouchet2024formal,
  author       = {Pouchet, Louis-No{\"e}l and Tucker, Emily and Zhang, Niansong and Chen, Hongzheng and Pal, Debjit and Rodr{\'i}guez, Gabriel and Zhang, Zhiru},
  title        = {Formal Verification of Source-to-Source Transformations for HLS},
  booktitle    = {Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays},
  series       = {FPGA},
  year         = {2024},
  publisher    = {ACM},
  note         = {Best Paper Award}
}

@article{chen2024llmfpga,
  author       = {Chen, Hongzheng and Zhang, Jiahao and Du, Yixiao and Xiang, Shaojie and Yue, Zichao and Zhang, Niansong and Cai, Yaohui and Zhang, Zhiru},
  title        = {Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference},
  journal      = {ACM Transactions on Reconfigurable Technology and Systems},
  year         = {2024},
  publisher    = {ACM},
  note         = {FCCM 2024 Journal Track}
}