This repository contains the code and data processing scripts used to train SMART, a surrogate model for predicting application runtime on large-scale Dragonfly systems.
We use two datasets, D1 and D2, which are publicly available at:
- generate_graph_data.ipynb: Processes the raw data to generate the feature tensors X, the labels Y, and the adjacency matrix used for modeling.
- smart.ipynb: Loads the preprocessed data and trains the surrogate model for runtime prediction.
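
As a rough illustration of the three preprocessing outputs (X, Y, and the adjacency matrix), here is a minimal plain-Python sketch. The function names, the record layout, and the toy topology are all assumptions made for illustration; they are not taken from generate_graph_data.ipynb, and the notebooks may structure the data differently.

```python
# Hypothetical sketch: names, shapes, and record layout are assumptions,
# not taken from generate_graph_data.ipynb.

def build_adjacency(num_nodes, edges):
    """Return a symmetric 0/1 adjacency matrix as a nested list."""
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        adj[u][v] = 1
        adj[v][u] = 1  # links are treated as undirected in this sketch
    return adj


def build_dataset(records):
    """Split records into features X and labels Y.

    Each record is assumed to be (per_node_features, runtime), where
    per_node_features holds one feature list per node.
    """
    X = [features for features, _ in records]
    Y = [runtime for _, runtime in records]
    return X, Y


# Toy example: 3 nodes connected in a line, 2 samples, 2 features per node.
edges = [(0, 1), (1, 2)]
records = [
    ([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], 12.5),
    ([[0.2, 0.1], [0.4, 0.3], [0.6, 0.5]], 14.0),
]

A = build_adjacency(3, edges)
X, Y = build_dataset(records)

# Dimensions of X: samples, nodes, features per node.
print(len(X), len(X[0]), len(X[0][0]))
print(A)
```

In this sketch X is indexed as sample, then node, then feature, Y holds one runtime per sample, and A is shared across all samples. A real pipeline would typically convert these nested lists to array or tensor types before training.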
- SMART: A Surrogate Model for Predicting Application Runtime in Dragonfly Systems
  X. Wang, P. L. Rizzini, S. Medya and Z. Lan, "SMART: A Surrogate Model for Predicting Application Runtime in Dragonfly Systems", In Proceedings of AAAI 2026.
- Extended Version
  The extended version includes additional experiments, ablation studies, hyperparameter variations, temporal model comparisons, and detailed supplementary analysis that go beyond the main AAAI 2026 paper.
  arXiv Link
If you use this code or dataset, please cite the AAAI paper above.