We used AI to search for resources on the Moon

Read our NeurIPS 2019 paper here!

As part of NASA’s 2019 Frontier Development Lab, we used AI to search for resources on the surface of the Moon which could one day help humans settle there.

The Moon is a mysterious place that most of us have only seen through the lens of a telescope, but with companies like SpaceX offering commercial space flights, NASA’s new plans to put humans back on the Moon by 2024 and talk of a growing lunar economy it may start to feel much closer in the future.

Finding resources on the surface is a key part of building permanent lunar bases and sustaining our presence there.

The difficulty is that we are unsure where these resources are. Part of the problem is that the majority of the Moon is unexplored; there is a large amount of satellite data covering the surface, but without the presence of ground truth observations (i.e. humans or probes actually going there) we are not entirely confident which parts of this data correspond to resources.

Fig. 1: Optical image of the Tycho crater (approximately 120 km across). Credit: Lunar Reconnaissance Orbiter Camera

Locating metal

An important resource is metal, and one place we might expect to find it is around the Moon’s craters. We know that some of the asteroids which have impacted the Moon would have been metallic and so it is likely they left deposits in these areas.

Their temperature may be a key way to detect them. The Moon is heated by the Sun during the day and cools during the evening. Just like when you recoil from your car seatbelt in the summer because it is too hot to touch, metals are excellent at retaining their heat and we expect these craters to be hotter in the evening compared to their surroundings.

Fig. 2: Evening temperatures around the Tycho crater (left) and some examples of the temperature variation over the lunar day at different locations in and around the crater (right). Taken from the DIVINER instrument.

The DIVINER instrument aboard the Lunar Reconnaissance Orbiter has recorded the surface temperature of the Moon over 10+ years, however understanding whether a temperature anomaly relates to metal is difficult because there are many factors of variation on the lunar surface which can affect its temperature. One example is the albedo of the surface, or how reflective it is, which can affect how much heat it absorbs.

Unsupervised machine learning

We used a fully data-driven, unsupervised machine learning approach to disentangle these factors of variation in the surface temperature and uncover those which may relate to metals. We trained a variational autoencoder (VAE) to reconstruct examples of the surface temperature variation over the lunar day and then looked inside its hidden representation to understand what it found were the major factors of variation.

Fig. 3: New anomaly maps generated by our AI. Our VAE found 4 main factors of variation in the temperature variations; the plots in the bottom row show what each factor responds to in an input temperature profile and the left two columns show maps of the VAE’s latent value for each factor. We see the AI is able to disentangle the effect of albedo (latent variable 2), onset of radiation time (latent variable 1) and potentially changes in thermal conductivity relating to metal (latent variable 3).

We found that the VAE could disentangle different physical effects, such as the onset time of radiation and the albedo, allowing us to isolate and map those areas which potentially have high thermal conductivity relating to metals. Fig. 3 shows anomaly maps creating using our VAE which potentially locates metallic-like signatures over the Tycho crater.


Our AI-generated maps could help future mission planners and progress our vision of a fully data-driven approach to space exploration. See our final presentation below for more information!

Read the full NeurIPS 2019 workshop paper here: https://ml4physicalsciences.github.io/files/NeurIPS_ML4PS_2019_115.pdf

The team: Ben Moseley, Valentin Bickel, Jérôme Burelbach, Nicole Relatores, Daniel Angerhausen, Frank Soboczenski and Dennis Wingo


Bayesian positioning with pymc3

Today we rely on GPS for almost everything, from driving our cars to auditing our financial transactions. However when tracking animals this system is invasive because it requires the use of GPS collars. In our recent work researching elephants in Kenya, we investigated whether the vibrations elephants make through the ground could be used as an alternative way to track them.

See the full code!

In this post we will explain how we used Bayesian inference to carry out probabilistic time-difference-of-arrival positioning, using the pymc3 python library.

The problem

In this setup we have a source of vibrational waves (the elephant’s footsteps) and an array of receivers (seismometers). The receivers are synchronised in time and each measures the Time Of Arrival (TOA) of the waves at their location. In the simplest case, their TOA is given by Newton’s formula, TOA = distance between the source and receiver / speed of the waves. The goal is, given the TOA at each receiver, can we determine the location of the source?

Fig 1: The classic multilateration problem: given the time of arrival at each receiver location from a source which transmits energy waves outwards at some speed, can we determine the location of the source?

This is a classic multilateration problem and the simplest way to solve it is to draw the locus of possible source locations for each receiver and search for where they overlap (Fig. 1). However, when the time of arrival measurements are noisy and the speed of the wave is uncertain, the loci will not always overlap and this method becomes ambiguous (Fig. 2). This method also assumes we know the wave speed and the time at which the source was emitted, both of which are not always known (particularly when tracking elephants with vibrations!).

Fig 2: Noisy time of arrival measurements can result in ambiguous source positions when using the loci method.

The solution: Bayesian inference

Instead we will use some of the latest machinery for Bayesian inference in the pymc3 library to build a fully probabilistic solution to the multilateration problem. This will allow us to solve the problem without knowing the source emission time and the wave speed and quantify our uncertainty in the solution.

We represent the problem using Bayes’ theorem:

    \[ p(\mathbf{x}, v \mid \mathbf{t}) = { p(\mathbf{t} \mid \mathbf{x}, v) \, p(\mathbf{x}) \, p(v) \over p(\mathbf{t}) }\]

where the location of the source \mathbf{x}, wave speed v and observed arrival times \mathbf{t} are all random variables.

p(\mathbf{x}, v \mid \mathbf{t}) is the posterior distribution of the source location and wave speed given the arrival times and this is what we want to obtain.

The data likelihood p(\mathbf{t} \mid \mathbf{x}, v) is just the forward physics model plus observational noise and p(\mathbf{x}) and p(v) define prior distributions over the source location and wave speed which we must choose.

The difficulty in obtaining the posterior is in computing the evidence p(\mathbf{t}), which is intractable for many Bayesian inference problems.

Instead we obtain samples from the posterior using the powerful NUTS sampler in pymc3. Thankfully, pymc3 is very simple to use and the workflow only consists of two steps:

  1. Define a forward physics model
  2. Carry out inference on this model by sampling from its posterior distribution

We carry out these steps using the code below.

Step 1: define the forward model

First we define our forward model in pymc3:

import numpy as np
import pymc3 as pm
import matplotlib
import matplotlib.pyplot as plt

#stations = [array of receiver coordinates] (to be defined below)
#t_obs = [array of actual observed TDOA measurements from the receivers] (to be defined below)

with pm.Model():

    # Priors
    x = pm.Uniform("x", lower=0, upper=1000, shape=2)# prior on the source location (m)
    v = pm.Normal("v", mu=346, sigma=20)# prior on the wave speed (m/s)
    t1 = pm.Uniform("t1", lower=-2, upper=4)# prior on the time offset (s)

    # Physics
    d = pm.math.sqrt(pm.math.sum((stations - x)**2, axis=1))# distance between source and receivers
    t0 = d/v# time of arrival (TOA) of each receiver
    t = t0-t1# time difference of arrival (TDOA) from the time offset

    # Observations
    Y_obs = pm.Normal('Y_obs', mu=t, sd=0.05, observed=t_obs)# we assume Gaussian noise on the TDOA measurements

This model defines the source position and wave speed as pymc3 random variables and uses the pm.Uniform and pm.Normal classes to define their priors distributions. Newton’s equation is then used to calculate the Time-Difference-Of-Arrivals (TDOAs) at each receiver relative to some arbitrary time offset t1, using the pm.math library.

t1 is nuisance variable which would allow us to compute the source emission time if we knew its value; we must include this in our model if we are to use measurements without knowing the source emission time.

We use a normal distribution to model the observational noise and use the observed argument of the pm.Normal class to pass actual observed TDOA values from each receiver to the model.

Finally we wrap the entire model within the pm.Model(): context manager to allow pymc3 to automatically track these variables during inference.

Step 2: carry out inference

To carry out inference we generate some observations to test our model with:

# generate some test observations
stations = np.random.randint(250,750, size=(N_STATIONS,2))# station positions (m)
x_true = np.array([500,500])# true source position (m)
v_true = 346.# speed of sound (m/s)
t1_true = 1.0# can be any constant, used to calculate TDOA values (s)
d_true = np.linalg.norm(stations-x_true, axis=1)
t0_true = d_true/v_true# true time of flight values
t_obs = t0_true-t1_true# true time difference of arrival values
t_obs = t_obs+0.05*np.random.randn(*t_obs.shape)# noisy observations
print("t_obs (s):", t_obs)

# t_obs and stations are passed to the pymc3 model above

# plot them
plt.scatter(stations[:,0], stations[:,1], marker="^", s=80, label="Receivers")
plt.scatter(x_true[0], x_true[1], s=40, label="True source position")
plt.xlim(250, 750)
plt.ylim(250, 750)
plt.xlabel("x (m)")
plt.ylabel("y (m)")

t_obs (s): [-0.30165118 -0.36521993 -0.61286123 -0.68923435]

and then sample from the posterior of our model by using:

# Sample posterior
trace = pm.sample(draws=2000,

pymc3 auto-assigns the NUTS sampler and draws 2000 samples from the posterior, which are stored in the trace object. All we need to do is plot these samples to obtain an estimate of the location with its associated uncertainty, and we are done!

# get means and standard deviations of posterior samples
summary = pm.summary(trace)
mu = np.array(summary["mean"])
sd = np.array(summary["sd"])

# plot
plt.scatter(stations[:,0], stations[:,1], marker="^", s=80, label="Receivers")
plt.scatter(x_true[0], x_true[1], s=40, label="True source position")
ell = matplotlib.patches.Ellipse(xy=(mu[1], mu[2]),
          width=4*sd[1], height=4*sd[2],
          angle=0., color='black', label="Posterior (2\sigma)", lw=1.5)
plt.xlim(250, 750)
plt.ylim(250, 750)
plt.xlabel("x (m)")
plt.ylabel("y (m)")

Checking the convergence of the sampler

pymc3 includes some useful tools for checking that the sampler has converged correctly. One thing we can do is to call the pm.traceplot(trace) function to plot the samples and posterior distributions of each of the variables:

Using the velocity prior as a regulariser

A neat thing about using a Bayesian framework is that the wave speed prior p(v) regularises our solution. Note that we would typically need 4 receivers to solve for the source coordinates, TOAs of the receivers and the wave speed. However, with a strong enough wave speed prior, we can get away with only 3 receivers. By switching N_STATIONS = 3 in the code above, we obtain:


We have built a probabilistic time-difference-of-arrival positioning method using the pymc3 library. This allows us to solve the multilateration problem without knowing the source emission time and the wave speed and quantify our uncertainty in the solution.

See the full code here: https://github.com/benmoseley/bayesian-time-difference-of-arrival-positioning


Can we track elephants using the vibrations they make through the ground?

In February this year I joined a research trip in Kenya and helped to investigate whether it is possible to track elephants using the vibrations they make through the ground.

Many elephants suffer from poaching, habitat destruction and conflicts with local people and being able to track them more effectively would aid conservation efforts.

Current techniques, such as positioning based on satellite imagery, camera traps or acoustic sensing, either lack temporal resolution, are expensive, or are limited to small areas. One solution which may address these challenges is to track elephants using the vibrations they make.

Elephants make vibrations through the ground as they walk

Elephants have many different types of behaviour which generate ground vibrations, including running, playing, bathing and walking. They also make low-frequency vocalisations known as “rumbling”, thought to be a means of communication between elephants, which propagate vibrations into the ground.

A key benefit of using these vibrations is that we think they can travel long distances, potentially allowing us to track the elephants over large areas and at low cost.

We listen for vibrations in the ground using seismometers, which are buried in the ground and are therefore non-invasive. These vibrations are noisy and hard to interpret; we are investigating the use of AI to help us accurately identify and locate the signals made by elephants.

We record the vibrations the elephants make using seismometers buried in the ground

Once we are able to accurately detect elephants, we will use Bayesian positioning to accurately locate them (see my other post on this).

See more info here: https://www.eng.ox.ac.uk/news/track-elephants-vibrations-footsteps/


Hello world!

Brand new website launched!

I am excited to launch the website (it’s my first forage into website development so bear with me..), watch out for posts on AI and machine learning soon!