I’m still a little fuzzy on how
pymc3 things work. Luckily it turns out that
pymc3’s getting started tutorial includes this task.
Data generation corresponds to Bayesian Linear Regression part 2: demo data. (The order of the first two posts of the original series is interchangeable.)
I need to generate observed data to learn from. I’ll use the same model as before, where the underlying function \( f \) is a line: \( f(x) = w_m x + w_b \).
There’s Gaussian noise on the observations, so the observed values are \( y = f(x) + \varepsilon \), with \( \varepsilon \sim \mathcal{N}(0, \sigma_y^2) \).
Following the pymc3 tutorial and my old posts, I’ll generate observed data.
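A minimal sketch of that data generation, with assumed stand-in values for the true slope, intercept, and noise (the original posts’ actual constants may differ):

```python
import numpy as np

# Assumed constants for illustration; the original posts' values may differ.
true_w_b = -1.0      # intercept of the underlying line
true_w_m = 2.0       # slope of the underlying line
true_sigma_y = 0.5   # standard deviation of the observation noise
N = 20               # number of observations

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=N)
f = true_w_m * x + true_w_b                    # noiseless line f(x)
Y = f + rng.normal(0.0, true_sigma_y, size=N)  # noisy observations
```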
Sampling from the prior
This part corresponds to Bayesian Linear Regression part 1: plotting samples from the weight prior.
Now I’ll use
pymc3! To sample from the weights prior, I need to set up my model. First I’ll copy over my hyperparameters from the old post.
(Note: I updated the subscript on the slope from the old posts’ notation.)
Before, I sampled weights from the prior using:
w = np.random.randn(line_count, D) @ np.linalg.cholesky(V_0).T + w_0  # scale by the Cholesky factor of the covariance V_0
Because \( V_0 \) is a diagonal matrix, this is like sampling from two single-variable Gaussians: \( w_b \sim \mathcal{N}(w_{0,b}, V_{0,11}) \) and \( w_m \sim \mathcal{N}(w_{0,m}, V_{0,22}) \).
In pymc3, I define this model and sample some weights from it.
There are a few differences so far!
Importantly, by assuming independence in the priors, pymc3 is going to learn a model that assumes \( w_b \) and \( w_m \) are independent. This is different from the original post, which also learned a covariance between the two weights.
Aside from the model setup, the action is already a little different! Before, I was sampling from the normal distribution using np.random.randn, which can be done by sampling from a uniform distribution (a super common operation) and applying a transformation to make it Gaussian.
pymc3 uses fancier sampling approaches (my last post on Gibbs sampling covers another fancy sampling approach!). This is going to be a common theme in this post: the Gaussian linear regression model I’m using in these posts is a small Gaussian model, which is easy to work with and has a closed-form posterior. That doesn’t work in complex models, so pymc3 uses approximate methods like fancy sampling instead. In other words, I’m using fancy things in pymc3 that are overkill for my particular model.
Next I can plot some of the slopes/intercepts sampled from the prior. It looks similar to the plots I got in my last post.
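A sketch of that plot, drawing ten lines whose weights come from an assumed standard-normal prior on each:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
# Assumed prior: independent N(0, 1) on intercept and slope.
w_b = rng.normal(0.0, 1.0, size=10)
w_m = rng.normal(0.0, 1.0, size=10)

xs = np.linspace(-1, 1, 50)
fig, ax = plt.subplots()
for b, m in zip(w_b, w_m):
    ax.plot(xs, m * xs + b, color="tab:blue", alpha=0.5)
ax.set(xlabel="x", ylabel="f(x)", title="Lines sampled from the prior")
fig.savefig("prior_samples.png")
```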
Sampling from the posterior
In the posterior post, I had a closed-form for the posterior of a Gaussian likelihood with a Gaussian prior.
In pymc3, I create a deterministic random variable exp_f that is \( f = mx + b \), where w_m and w_b are the random variables defined above, and x holds the x-values for the observed data. I set that as the mean of a Normal distribution with the \( \sigma_y \) noise (and, like the other posts, assume I know the true noise: sigma_y = true_sigma_y). Then I can pass in the observed values Y I came up with at the beginning. Then I sample again!
Now I can plot a few of these. This is jumping ahead to “Sampling from the posterior” in Bayesian Linear Regression part 4: Plots. This is another interesting thing about pymc3:
Before, I used the closed-form of the mean and variance of the normal distribution that represents the posterior. Then I sampled from that distribution to plot samples from the posterior.
In pymc3, I skip computing the posterior and instead use clever methods to sample directly from it. This is useful when the posterior is hard to compute.
pymc3 also gives me a cool plot of the samples.
Thoughts: Plotting error
That’s all I’m going to post for now. I wasn’t able to recreate all of the original series’ graphs using pymc3.
One of my favorite graphs from the series was plotting a shaded area for the uncertainty, from Bayesian Linear Regression part 4: Plots. There are two issues: in this post, I’m assuming the slope and intercept are independent, which is slightly different from that post, and I think that means the uncertainty is incorrectly clumped around 0 (because the form of the function that plots the uncertainty is based on \( ax^2 + 2bx + c \), where \( b \) is the covariance). I started using
pm.MvNormal, which would find a covariance between the slope and intercept, but
pm.summary(posterior_trace) gives me separate standard deviations for individual variables. I’ll post a follow-up if I can sort this out!
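One possible direction for that follow-up: whatever the priors look like, the covariance between the weights can be estimated directly from the posterior samples with np.cov (sketched here on stand-in sample arrays):

```python
import numpy as np

# Stand-in posterior samples; in the real model these would come from
# posterior_trace["w_m"] and posterior_trace["w_b"].
rng = np.random.default_rng(2)
w_m_samples = rng.normal(2.0, 0.1, size=1000)
w_b_samples = -0.5 * w_m_samples + rng.normal(0.0, 0.05, size=1000)

# 2x2 empirical covariance matrix; the off-diagonal entry is the
# covariance between slope and intercept.
cov = np.cov(np.stack([w_m_samples, w_b_samples]))
```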