Machine Learning in Science: encoding physical constraints and good development practices -- Workshop for AMLD 2021

hanveiga/amld-2021-repML

Machine Learning in Science: encoding physical constraints and good development practices

Workshop at AMLD (Applied Machine Learning Days), EPFL, 2021

Organizers:

  • Maria Han Veiga, Postdoctoral fellow at Michigan Institute for Data Science, University of Michigan
  • Miles Timpe, Postdoctoral fellow at Institute of Computational Science, University of Zurich

Description:

Advances in artificial intelligence (AI) have touched nearly every industry and scientific discipline. Machine learning (ML) models are now routinely used throughout academia, but often by practitioners with little to no practical training in ML. Indeed, most AI/ML resources focus on industry applications, where the purpose and goals of ML models often differ from those in academia.

We therefore propose a full-day workshop focused on AI/ML in science. The workshop is intended to touch on topics critical to the application of ML models in scientific contexts, including how to integrate domain knowledge into machine learning models (e.g., encoding physical constraints), ensuring reproducibility, and best development practices.

The workshop will begin with a theoretical introduction to AI/ML in science, followed by a hands-on session to introduce participants to best practices, as well as how to encode physics into ML models.

In the theoretical session, we will motivate the problem by showing concrete examples in astrophysics where constraints are not only desirable but necessary, and we will establish connections with other scientific and engineering fields. Then, we will present different methods designed to be physics-preserving, such as equivariant neural networks.
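To make the idea of equivariance concrete, here is a minimal sketch in plain Python. It is not taken from the workshop notebooks, and the function names (`rotate`, `radial_scale`, `shift_x`, `max_equivariance_error`) are hypothetical, chosen only for illustration. A map f is rotation-equivariant if rotating the input and then applying f gives the same result as applying f and then rotating the output. Scaling a 2D vector by a function of its norm has this property, while shifting along a fixed axis does not.

```python
import math


def rotate(v, theta):
    """Rotate a 2D vector counter-clockwise by theta radians."""
    x, y = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def radial_scale(v):
    """Scale a vector by a function of its norm (rotation-equivariant)."""
    x, y = v
    scale = math.tanh(math.hypot(x, y))
    return (scale * x, scale * y)


def shift_x(v):
    """Shift along a fixed axis (NOT rotation-equivariant)."""
    return (v[0] + 1.0, v[1])


def max_equivariance_error(f, points, theta):
    """Largest distance between f(rotate(p)) and rotate(f(p)) over all points."""
    worst = 0.0
    for p in points:
        a = f(rotate(p, theta))
        b = rotate(f(p), theta)
        worst = max(worst, math.hypot(a[0] - b[0], a[1] - b[1]))
    return worst


points = [(1.0, 0.0), (0.3, -2.0), (-1.5, 0.8)]
err_equivariant = max_equivariance_error(radial_scale, points, 0.7)
err_broken = max_equivariance_error(shift_x, points, 0.7)
print(err_equivariant, err_broken)
```

The first error should be zero up to floating-point rounding, while the second is clearly nonzero. Equivariant neural networks build this kind of guarantee into every layer rather than hoping the model learns it from data.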

In the practical session, before exploring physically constrained methods, we will first spend an hour showing practices and tools that aid the development of ML models in a scientific context (e.g., AutoML frameworks), especially with respect to reproducibility, which is of critical importance to the scientific method. Finally, attendees will be able to experiment with the presented methods on a series of novel datasets from different domains, such as astrophysics [1] and engineering.
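As a rough illustration of what reproducibility means in practice, the sketch below uses only the Python standard library. It is an assumption about one common pattern, not the tooling used in the workshop: every source of randomness is driven by a seed stored in the configuration, and each result is tagged with a fingerprint of that configuration. The names `config_fingerprint` and `run_experiment`, and the configuration keys, are hypothetical.

```python
import hashlib
import json
import random


def config_fingerprint(config: dict) -> str:
    """Short, stable hash of a configuration (keys sorted for determinism)."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def run_experiment(config: dict) -> dict:
    """Toy 'experiment': draw seeded Gaussian samples and report their mean."""
    rng = random.Random(config["seed"])  # local generator, no global state
    samples = [rng.gauss(0.0, 1.0) for _ in range(config["n_samples"])]
    return {
        "fingerprint": config_fingerprint(config),
        "mean": sum(samples) / len(samples),
    }


config = {"seed": 42, "n_samples": 1000}
first = run_experiment(config)
second = run_experiment(config)
print(first == second)
```

Running the same configuration twice yields identical results, and storing the fingerprint alongside each result makes it possible to trace a number in a paper back to the exact settings that produced it.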

[1] Timpe, Miles et al. (2020), Simulations of planetary-scale collisions between rotating, differentiated bodies, v2, Dryad, Dataset, https://doi.org/10.5061/dryad.j6q573n94

Tentative Schedule:

09:00 - 09:15 Virtual coffee / Introduction (MT, MV)
09:15 - 09:30 Virtual env set-up (MT)
09:30 - 10:00 Intro to AEGIS + Basic Example (Notebook 1) (MT)
10:00 - 10:30 Intro to rule-based constraints (Notebook 2) (MT)
10:30 - 10:45 Coffee break
10:45 - 11:15 Reproducibility and Scientific Pipelines (MV)
11:15 - 12:00 Reproducibility tools (Notebook 3) (MT)
12:00 - 14:00 Lunch break
14:00 - 14:45 Encoding physical constraints (MV)
14:45 - 15:00 Coffee break
15:00 - 16:00 Encoding physical constraints hands-on session (Notebook 4) (MV, MT)
16:00 - 17:00 Free hacking / break-out rooms
