A Gentle Introduction to Bayesian Inference using PyMC3: Detecting a Signal in Astronomical Data
A very common question in astronomy is the following: is that blip I see in my data actually a detection of something or have I been staring at my screen for too long?
This seemingly simple question can actually be a difficult one to answer using a Frequentist approach (i.e. not Bayesian — yes, the real definition is a wee bit more involved; check the link to another Medium article on the subject).
Enter Bayesian Inference!
In the broadest terms imaginable, Bayesian inference is a statistical methodology that answers the quintessential question posed above. Bayesian inference relies on the elegant rule aptly named Bayes' Rule, which states:

P(M|D) = P(D|M) P(M) / P(D)
This statement illustrates the relationship between the posterior distribution, P(M|D), the likelihood distribution, P(D|M), the prior, P(M), and the marginal likelihood, P(D). With that said, this is not an article about the theory of Bayesian inference, but rather an example of how to use a python implementation of Bayesian inference (PyMC3) to solve a real problem. So let’s get on to the problem!
Let’s say you take a spectrum of your favorite astronomical object with your favorite instrument on your favorite telescope (for me it would be the Perseus Cluster with SITELLE on the CFHT but to each their own). Your spectrum would be represented as two arrays: the first one containing the wavelength information and the second one containing the number of counts in a given wavelength bin. So maybe it looks something like this:
import numpy as np

x = np.linspace(-5, 5, 50)  # Wavelength data: fifty points between -5 and 5

# True parameters of the underlying model
sigma = 1   # Width of the emission line
mu = 0      # Center of the emission line
A = 20      # Amplitude of the emission line
B = 100     # Constant background level

# Define the underlying model -- a Gaussian on a constant background
y = A * np.exp(-(x - mu)**2 / (2 * sigma**2)) + B

# Add some Gaussian noise
y_noise = np.random.normal(0, 1, 50)
data = y + y_noise
Here we have our 50 wavelength samples. The data recovered from the telescope are modeled with a simple normal distribution centered at x=0 with sigma=1. The amplitude (A) of the emission line (modeled with the Gaussian) is 20 units (counts/s if you like). Additionally, there is a constant background component (B) equal to 100 units. We also tack on Gaussian noise because … astronomy…
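To see what this synthetic spectrum looks like (and to convince yourself the line really is buried in there), a quick matplotlib plot does the trick. This is purely a sanity check and plays no role in the inference itself:

import matplotlib.pyplot as plt

plt.plot(x, data, "k.", label="Simulated spectrum")  # noisy data
plt.plot(x, y, "r-", label="Underlying model")       # noiseless truth
plt.xlabel("Wavelength (arbitrary units)")
plt.ylabel("Counts")
plt.legend()
plt.show()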
We want to know if there is an emission line in this data (modeled by a Gaussian) and, if so, what is the amplitude of the emission line. Additionally, we want to calculate the constant background value if there is any background.
To do this, we need to define a couple of things:
1 — The Prior Function
2 — The Likelihood Function
3 — The Model Function
We already know the model is a Gaussian with constant background, so each point predicted by the model will be described by the following equation:

y_m(x) = A exp(−(x − μ)² / (2σ²)) + B
One down, two to go! Now, what about the likelihood function? If we wanted a photon-counting (discrete) approach, we would adopt a Poisson distribution. Instead, we will pretend this is continuous and use a normal distribution. Thus, our likelihood function is defined as:

P(D|M) = ∏_x (2π N_x)^(−1/2) exp(−(D_x − y_m(x))² / (2 N_x))
Here we have used the fact that the signal-to-noise at each point is equivalent to the square root of the number of counts (N_x).
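To make the formula concrete, here is a minimal sketch of the log of this likelihood written in plain NumPy (the function name log_likelihood is just for illustration; PyMC3 will build this for us in a moment):

import numpy as np

def log_likelihood(A, B, x, data):
    # Model prediction at each wavelength (mu=0 and sigma=1 fixed, as in our data)
    y_m = A * np.exp(-x**2 / 2) + B
    # Per-point variance: sigma_x**2 = N_x
    var = data
    return np.sum(-0.5 * np.log(2 * np.pi * var) - (data - y_m)**2 / (2 * var))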
Finally, we need to choose a prior function. Let's say I don't know anything about my distribution (not really a stretch here). So we are simply going to say A and B have to be bigger than zero and, by eye, A < 50 and B < 200. How is this going to look in Python? How do I code this?
Instead of coding this piece-by-piece in python, we are going to use the PyMC3 module. So how does this look in PyMC3 parlance? Let’s first take a look and then digest the purpose of each line.
import pymc3 as pm

# Set up the model
basic_model = pm.Model()

with basic_model:
    # Priors for unknown model parameters
    A = pm.Uniform("A", lower=0, upper=50)
    B = pm.Uniform("B", lower=0, upper=200)
    # Expected value of outcome (Gaussian line plus constant background)
    y_m = A * np.exp(-(x**2) / 2) + B
    # Likelihood of observations, with per-point noise sigma_x = sqrt(N_x)
    Y_obs = pm.Normal("Y_obs", mu=y_m, sigma=np.sqrt(data), observed=data)
First things first, we gotta import the bugger. Then we will call the Model class to initiate an empty model. Using the suggested, and standard, methodology, we “open” the model using the “with basic_model:” line. Now we can define our model.
We begin by defining our priors with:
# Priors for unknown model parameters
A = pm.Uniform("A", lower=0, upper=50)
B = pm.Uniform("B", lower=0, upper=200)
We then define our expected outcome assuming a Gaussian model (y_m):
# Expected value of outcome
y_m = A * np.exp(-(x**2) / 2) + B
Finally, we define our likelihood function:
# Likelihood of observations
Y_obs = pm.Normal("Y_obs", mu=y_m, sigma=np.sqrt(data), observed=data)
And with that final line, we have completely described our Bayesian model; we have our likelihood, underlying physical model, and prior. We are now going to use the magic that is PyMC3 to solve for our posterior distribution and marginal likelihood (also called the evidence):
# Now sample
with basic_model:
    # Draw posterior samples
    trace = pm.sample_smc(100, parallel=True)
This lovely line of code actually has a LOT going on under the hood! In fact, the documentation describes it like this:
“Behind this innocent line, PyMC3 has hundreds of oompa loompas singing and baking a delicious Bayesian inference just for you!”
And with that, we have our results! Let’s take a quick look at the marginalized posterior for our two model parameters A and B.
In the figure above we see the resulting marginalized posterior distribution where each line represents the results from an MCMC chain (for more information check out the documentation for sample_smc).
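A figure like this can be reproduced straight from the trace; one way (in recent PyMC3 versions, where traceplot delegates to ArviZ under the hood) is:

import matplotlib.pyplot as plt

# Marginal posterior (left) and per-chain sampled values (right) for A and B
pm.traceplot(trace)
plt.show()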
We can also calculate the best-fit (maximum a posteriori, or MAP) values with the following lines:
map_estimate = pm.find_MAP(model=basic_model)
print(map_estimate)
In our case, this spits out A ~ 20 and B ~ 100, which is exactly what we hoped to find!
There are a million other fantastic quantities you can directly calculate now that we have sampled our posterior distribution; one example is shown below, and I leave you to explore the PyMC3 documentation for the rest!
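For instance, a posterior summary with means, standard deviations, and credible intervals is a one-liner:

# Summary statistics for A and B computed from the posterior samples
print(pm.summary(trace))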
I hope this has provided you with a concrete (and simple) usage of PyMC3!