So, what is a physics-informed neural network?

Machine learning has become increasingly popular across science, but do these algorithms actually “understand” the scientific problems they are trying to solve? In this article we explain physics-informed neural networks, which are a powerful way of incorporating physical principles into machine learning.

A machine learning revolution in science

Machine learning has caused a fundamental shift in the scientific method. Traditionally, scientific research has revolved around theory and experiment: one hand-designs a well-defined theory, continuously refines it using experimental data, and then uses it to make new predictions.

But today, with rapid advances in the field of machine learning and dramatically increasing amounts of scientific data, data-driven approaches have become increasingly popular. Here an existing theory is not required, and instead a machine learning algorithm can be used to analyse a scientific problem using data alone.

Learning to model experimental data

Let’s look at one way machine learning can be used for scientific research. Imagine we are given some experimental data points generated by an unknown physical phenomenon, e.g. the orange points in the animation below.

A common scientific task is to find a model which is able to accurately predict new experimental measurements given this data.

Fig 1: example of a neural network fitting a model to some experimental data

One popular way of doing this is to use a neural network. Given the location of a data point as input (denoted x), a neural network can be used to output a prediction of its value (denoted u), as shown in the figure below:

Fig 2: schematic of a neural network

To learn a model, we tune the network’s free parameters (denoted by \theta in the figure above) so that the network’s predictions closely match the available experimental data. This is usually done by minimising the mean-squared-error between its predictions and the training points:

    \[\min_{\theta}~\frac{1}{N} \sum^{N}_{i=1} \left(u_{\mathrm{NN}}(x_{i};\theta) - u_{\mathrm{true}}(x_i)\right)^2\]
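
As a concrete illustration, here is a minimal PyTorch sketch of this data-fitting setup. The network size, optimiser settings and placeholder data below are illustrative assumptions, not the exact settings used to produce the animations:

```python
import torch

# a small fully-connected network mapping an input location x to a predicted value u
model = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

# placeholder "experimental" measurements of a damped oscillation
# (illustrative stand-ins for the orange points in the animation)
x_data = torch.linspace(0.0, 0.4, 10).reshape(-1, 1)
u_data = torch.exp(-2.0 * x_data) * torch.cos(20.0 * x_data)

optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(10_000):
    optimiser.zero_grad()
    loss = torch.mean((model(x_data) - u_data) ** 2)  # mean-squared-error on the data
    loss.backward()
    optimiser.step()
```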

The result of training such a neural network using the experimental data above is shown in the animation.

The “naivety” of purely data-driven approaches

The problem is, using a purely data-driven approach like this can have significant downsides. Have a look at the actual values of the unknown physical process used to generate the experimental data in the animation above (grey line).

You can see that whilst the neural network accurately models the physical process within the vicinity of the experimental data, it fails to generalise away from this training data. By only relying on the data, one could argue it hasn’t truly “understood” the scientific problem.

The rise of scientific machine learning (SciML)

What if I told you that we already knew something about the physics of this process? Specifically, that the data points are actually measurements of the position of a damped harmonic oscillator:

Fig 3: a 1D damped harmonic oscillator

This is a classic physics problem, and we know that the underlying physics can be described by the following differential equation:

    \[m\frac{d^2u}{dx^2} + \mu \frac{du}{dx} + k u = 0\]

where m is the mass of the oscillator, \mu is the coefficient of friction and k is the spring constant.
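
For reference, in the under-damped case (\mu^2 < 4mk) this equation has the well-known closed-form solution

    \[u(x) = e^{-\delta x}\left(A\cos(\omega x) + B\sin(\omega x)\right), \qquad \delta = \frac{\mu}{2m}, \quad \omega = \sqrt{\frac{k}{m} - \delta^{2}}\]

where the constants A and B are fixed by the initial conditions – a decaying oscillation like the grey line in the animations above.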

Given the limitations of “naive” machine learning approaches like the one above, researchers are now looking for ways to incorporate this type of prior scientific knowledge into machine learning workflows, giving rise to the blossoming field of scientific machine learning (SciML).

So, what is a physics-informed neural network?

One way to do this for our problem is to use a physics-informed neural network [1,2]. The idea is very simple: add the known differential equations directly into the loss function when training the neural network.

This is done by sampling a set of input training locations (\{x_{j}\}) and passing them through the network. Next, gradients of the network’s output with respect to its input are computed at these locations (these are analytically available for most neural networks and can be computed easily using automatic differentiation). Finally, the residual of the underlying differential equation is computed using these gradients and added as an extra term in the loss function.
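
In PyTorch, for example, these gradients can be computed with torch.autograd.grad. Below is a minimal sketch reusing the model from the earlier snippet; the sampling locations and the constants m, \mu and k are illustrative assumptions:

```python
# sample input training locations for the physics loss, tracking gradients w.r.t. x
x_physics = torch.linspace(0.0, 1.0, 30).reshape(-1, 1).requires_grad_(True)

m, mu, k = 1.0, 4.0, 400.0  # assumed oscillator constants, for illustration only

u = model(x_physics)
# first derivative du/dx via automatic differentiation
du_dx = torch.autograd.grad(u, x_physics, torch.ones_like(u), create_graph=True)[0]
# second derivative d2u/dx2
d2u_dx2 = torch.autograd.grad(du_dx, x_physics, torch.ones_like(du_dx), create_graph=True)[0]

# residual of the damped harmonic oscillator equation at each sampled location
residual = m * d2u_dx2 + mu * du_dx + k * u
physics_loss = torch.mean(residual ** 2)
```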

Fig 4: schematic of physics-informed neural network

Let’s do this for the problem above. This amounts to using the following loss function to train the network:

    \begin{align*}\min_{\theta}~&\frac{1}{N} \sum^{N}_{i=1} \left(u_{\mathrm{NN}}(x_{i};\theta) - u_{\mathrm{true}}(x_i)\right)^2 \\ +~&\frac{1}{M} \sum^{M}_{j=1} \left( \left[ m\frac{d^2}{dx^2} + \mu \frac{d}{dx} + k \right] u_{\mathrm{NN}}(x_{j};\theta) \right)^2\end{align*}

We can see that the additional “physics loss” in the loss function tries to ensure that the solution learned by the network is consistent with the known physics.
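
Putting the two terms together, one training run of the physics-informed network might look like the sketch below; the relative weighting of the two loss terms is a tunable choice made here for illustration, not a value taken from the original experiments:

```python
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(20_000):
    optimiser.zero_grad()

    # data loss: fit the experimental measurements
    data_loss = torch.mean((model(x_data) - u_data) ** 2)

    # physics loss: penalise the residual of the differential equation
    u = model(x_physics)
    du_dx = torch.autograd.grad(u, x_physics, torch.ones_like(u), create_graph=True)[0]
    d2u_dx2 = torch.autograd.grad(du_dx, x_physics, torch.ones_like(du_dx), create_graph=True)[0]
    physics_loss = torch.mean((m * d2u_dx2 + mu * du_dx + k * u) ** 2)

    loss = data_loss + 1e-4 * physics_loss  # weighting is an assumed hyperparameter
    loss.backward()
    optimiser.step()
```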

And here’s the result when we train the physics-informed network:

Fig 5: a physics-informed neural network learning to model a harmonic oscillator

Remarks

The physics-informed neural network is able to predict the solution far away from the experimental data points, and thus performs much better than the naive network. One could argue that this network does indeed have some concept of our prior physical principles.

The naive network is performing poorly because we are “throwing away” our existing scientific knowledge; with only the data at hand, it is like trying to understand all of the data generated by a particle collider, without having been to a physics class!

Whilst we focused on a specific physics problem here, physics-informed neural networks can be easily applied to many other types of differential equations too, and are a general-purpose tool for incorporating physics into machine learning.

Conclusion

We have seen that machine learning offers a new way of carrying out scientific research, placing an emphasis on learning from data. By incorporating existing physical principles into machine learning we are able to create more powerful models that learn from data and build upon our existing scientific knowledge.

Our own work on physics-informed neural networks

We have carried out research on physics-informed neural networks! Read the following for more:

Moseley, B., Markham, A., & Nissen-Meyer, T. (2021). Finite Basis Physics-Informed Neural Networks (FBPINNs): a scalable domain decomposition approach for solving differential equations. arXiv.

Moseley, B., Markham, A., & Nissen-Meyer, T. (2020). Solving the wave equation with physics-informed deep learning. arXiv.

References

1. Lagaris, I. E., Likas, A., & Fotiadis, D. I. (1998). Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks.

2. Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics.

Physics problem inspired by this blog post: https://beltoforion.de/en/harmonic_oscillator/

6 Replies to “So, what is a physics-informed neural network?”

  1. Hi Ben, an awesome attempt at simplifying the issue. I didn’t get the concept of simply adding the “physics” to the cost function … wouldn’t this imply that you know the answer upfront!? (in your example, the “1d wave equation”) … extending on that, would it be possible to give ML access to a huge library of all known physical problems in the literature (which are limited) and let ML add the “right physics” to the cost function … maybe we should have another layer of optimization that takes care of this step …

    Anyway, fascinating subject.

  2. Dear Ben,

    Thanks for the amazing blog post. Very informative. I work in computational electromagnetics and I was wondering if it is possible (or feasible) to solve an inverse problem associated with these non-linear differential equations? For example, what if we want to estimate a parameter (which is a function of space) given the physical quantity as input? For the above example of the harmonic oscillator, can we estimate the coefficient of friction at different locations given a physical quantity such as speed as a function of space? Obviously, most inverse problems are ill-posed, so we have to assume we have fewer measurements of speed (at multiple locations) and we want to estimate the friction coefficient at a larger number of points.

    1. Hi Amartansh,
      Thanks a lot for your comment! Yes, PINNs can (and frequently are) used for inverse problems too. This is conceptually very simple – one just optimises the unknown parameters of the PDE alongside the free parameters of the network during training. So in the harmonic oscillator example, we could easily learn the coefficient of friction too (assuming we have enough real data points that the inverse problem is not ill-posed, as you mention). We could also learn an unknown PDE function which varies in space. One way would be to have another network define this function (and then backpropagate through it to jointly optimise its weights and the PINN’s weights together).
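
      A minimal sketch of the first idea, reusing the PyTorch setup from the article (the name mu_param and its initial value are illustrative assumptions):

      ```python
      # treat the unknown friction coefficient as a trainable parameter
      mu_param = torch.nn.Parameter(torch.tensor(1.0))

      # optimise it jointly with the network's weights
      optimiser = torch.optim.Adam(list(model.parameters()) + [mu_param], lr=1e-3)

      # inside the training loop, the physics loss simply uses mu_param in place
      # of the known constant, so its value is learned alongside the solution:
      #   residual = m * d2u_dx2 + mu_param * du_dx + k * u
      ```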

  3. This is really great, and thanks for the example – it helps to understand the post.

    I have another question: suppose I do not know the exact differential equation, but I know something, e.g. from dimensional reasoning. I know that the problem can be described using dimensional analysis and presented as 3 or 4 non-dimensional parameters. What would be the right way to discover those combinations of dimensional parameters?

    1. Hi Alex,
      That is a really interesting idea. PINNs have also been used for learning underlying equations themselves, e.g. https://arxiv.org/abs/2005.03448. This paper uses a PINN to estimate the gradients with respect to the input coordinates of the real data points, and then does a sparse optimisation over linear combinations of these gradients to try to discover the underlying differential equation. I think doing this, and then adding a dimensional analysis constraint during the optimisation would be really interesting!

  4. Thanks for the nice article. Just a correction that the original PINNs papers (two of them, for forward and inverse problems) were first published in 2017 on arXiv. We did the work in the 2015-2016 time period and all this work is documented in the DARPA EQUiPS reports.
