Computational science is being revolutionized by the integration of AI and simulation and, in particular, by deep learning surrogate models that can replace all or part of traditional large-scale HPC computations. Such surrogates can achieve performance improvements of up to several orders of magnitude, saving both compute time and energy. The Surrogate Benchmark Initiative (SBI) project creates a community repository and FAIR (Findable, Accessible, Interoperable, and Reusable) data ecosystem for HPC application surrogate benchmarks. The SBI team comes from Argonne National Laboratory, Indiana University, Rutgers University, and the University of Tennessee, Knoxville. SBI repositories include the data, code, and all relevant collateral artifacts that the science and engineering community needs to use and reuse these data sets and surrogates. SBI repositories generate active research from both the participants in SBI and the broad community of AI and domain scientists. Fields covered by SBI include Applied Math, Astrophysics, Biomolecular Sciences, Climate, Computational Biology, Computational Fluids, Computer Science, Cosmology, Earthquake Science, Fusion, High Energy Physics, Molecular Docking, Nanoengineering, and Plasma Physics.
SBI collaborates with MLPerf, a leading machine learning benchmarking activity, and mirrors its process as much as possible. MLPerf has over 1400 members from over 80 member institutions (mainly from industry) and strong existing involvement of the Department of Energy (DOE) laboratories through the HPC and science data MLPerf working groups. SBI builds tutorials around each deposited benchmark and collaborates with users from a broad range of fields to make new surrogates and new SBI benchmarks based on an initial set of four produced in-house. SBI working groups and other community activities are set up to advance all issues around the surrogate concept. SBI will design and build general middleware to support both the generation of surrogates (training from HPC simulations) and their use. This middleware will also make it easier for general users to develop new surrogates, and help make the major performance gains of surrogates pervasive across DOE computational science. SBI benefits both application communities and computer systems research.
SBI will also support several AI research areas. First, the benchmarks will drive research on efficient generic surrogate architectures and how they fit with different hardware systems. Second, there will be research on the uncertainty quantification of surrogate estimates, and on how to build this into surrogates. Third, there will be important studies of the amount of training data needed to obtain reliable surrogates for a given target accuracy. Finally, the SBI team has already derived some simple but effective performance models or surrogates, but these need extension as deeper uses of surrogates become understood and exhibited in SBI repository depositions.
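As a purely illustrative sketch of the uncertainty quantification and training-data questions above, the following toy example fits a bootstrap ensemble of linear surrogates to a stand-in "simulation" and uses the spread of the ensemble predictions as an uncertainty estimate. Everything here is an assumption made for illustration: `toy_simulation`, the linear surrogate, the noise level, and all function names are hypothetical, and bootstrap ensembling is only one common approach, not a description of SBI's methods.

```python
import random
import statistics


def toy_simulation(x):
    """Hypothetical stand-in for an expensive HPC simulation."""
    return 3.0 * x + 2.0 + 0.5 * x * x


def fit_line(xs, ys):
    """Closed-form ordinary least squares fit of y = a*x + b."""
    mx = statistics.mean(xs)
    my = statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def bootstrap_ensemble(xs, ys, n_members, rng):
    """Fit one linear surrogate per bootstrap resample of the training set."""
    n = len(xs)
    members = []
    while len(members) < n_members:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: a line cannot be fitted
        by = [ys[i] for i in idx]
        members.append(fit_line(bx, by))
    return members


def predict_with_uncertainty(members, x):
    """Return the ensemble mean and standard deviation of predictions at x."""
    preds = [a * x + b for a, b in members]
    return statistics.mean(preds), statistics.stdev(preds)


def spread_vs_training_size(sizes, x_query, n_members=50, seed=0):
    """For each training-set size, report (mean, spread) at x_query."""
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
        # Assumed noise level of 0.1 on the simulated outputs.
        ys = [toy_simulation(x) + rng.gauss(0.0, 0.1) for x in xs]
        members = bootstrap_ensemble(xs, ys, n_members, rng)
        results[n] = predict_with_uncertainty(members, x_query)
    return results


if __name__ == "__main__":
    for n, (mean, spread) in spread_vs_training_size([10, 50, 400], 0.5).items():
        print(f"n={n:4d}  prediction={mean:.3f}  spread={spread:.4f}")
```

In this toy setting the ensemble spread shrinks as the training set grows, which is the kind of relationship between training data volume and surrogate reliability that such studies would quantify. Note that the spread does not capture the systematic error of using a linear surrogate for a mildly nonlinear simulation.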
The project will explicitly fund staff to ensure that non-project users are appropriately supported and that SBI's use of FAIR data principles is effective. MLPerf’s user interface technology is being extended to enable convenient FAIR use of SBI repositories and access through interactive Python notebooks.