Automatically parallelizing batch inference on deep neural networks using Fiats and Fortran 2023 `do concurrent`
- Rouson, Damian;
- Bai, Zhe;
- Bonachea, Dan;
- Ergawy, Kareem;
- Gutmann, Ethan;
- Klemm, Michael;
- Rasmussen, Katherine;
- Richardson, Brad;
- Shende, Sameer;
- Torres, David;
- Zhang, Yunhao
Published Web Location
https://doi.org/10.25344/S4VG6T

Abstract
This paper introduces novel programming strategies that leverage features of the Fortran 2023 standard of the International Organization for Standardization (ISO) to automatically parallelize computations on deep neural networks. The paper focuses on the interplay of object-oriented, parallel, and functional programming paradigms in the Fiats deep learning library. We demonstrate how several infrequently used language features enable efficient, parallel execution. Specifically, the ability to explicitly declare a procedure `pure` facilitates inference inside the language’s loop-parallelism construct `do concurrent`. Also, explicitly prohibiting the overriding of a parent type’s type-bound procedures eliminates the need for dynamic dispatch in performance-critical code. Finally, this paper uses batch inference calculations on a neural-network surrogate for atmospheric aerosol dynamics to demonstrate that the LLVM Flang compiler’s automatic parallelization of `do concurrent` achieves roughly the same performance and scalability as OpenMP compiler directives. We also demonstrate that double-precision inference requires 37–72% more runtime than default-real precision, with most observed overheads in the 57–60% range.
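The two language features highlighted in the abstract can be illustrated with a minimal sketch. The module and procedure names below are hypothetical, not Fiats source code: a `pure`, `non_overridable` type-bound inference procedure is invoked inside a `do concurrent` loop over a batch, which a compiler such as LLVM Flang may then parallelize automatically.

```fortran
! Hypothetical sketch (not Fiats code) of the abstract's two features:
! a pure, non_overridable type-bound procedure called inside `do concurrent`.
module sketch_network_m
  implicit none

  type :: network_t
    real, allocatable :: weights(:,:)
    real, allocatable :: biases(:)
  contains
    ! non_overridable lets the compiler resolve the call statically,
    ! eliminating dynamic dispatch in the performance-critical loop.
    procedure, non_overridable :: infer
  end type

contains

  ! Declaring the procedure pure is what makes referencing it
  ! inside `do concurrent` legal (and parallelization safe).
  pure function infer(self, inputs) result(outputs)
    class(network_t), intent(in) :: self
    real, intent(in) :: inputs(:)
    real, allocatable :: outputs(:)
    outputs = max(0., matmul(self%weights, inputs) + self%biases) ! ReLU layer
  end function

end module

program batch_inference
  use sketch_network_m
  implicit none
  type(network_t) :: network
  real :: batch(3, 1000), results(2, 1000)
  integer :: i

  network = network_t(weights=reshape([1.,0.,0.,1.,1.,1.], [2,3]), &
                      biases=[0., 0.])
  call random_number(batch)

  ! Each iteration is independent, so the compiler may execute the
  ! loop in parallel (e.g., LLVM Flang can map it to OpenMP).
  do concurrent (i = 1:1000)
    results(:, i) = network%infer(batch(:, i))
  end do

  print *, 'first result: ', results(:, 1)
end program
```

Because every iteration touches a distinct column of `results` and `infer` is `pure`, no synchronization or explicit directives are required for correct parallel execution.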