This is another post from my PMR exam review. In this post, I’ll implement Gibbs Sampling.

Gibbs sampling is useful for sampling from high-dimensional distributions where single-variable conditional distributions are known.

For example, say it’s too expensive to sample from $$p(x_0, x_1, x_2, …, x_d)$$. With Gibbs sampling, I initialize all variables to arbitrary values. Then while taking each sample, I also iterate through the dimensions and replace its value with a sample from the univariate conditional distribution. For example I’d update $$x_1$$ using $$p(x_1 \mid x_0, x_2, …, x_d)$$, which is easy to sample over because it’s only one dimension.

## Data generation

My plan is to sample a bunch of points using Gibbs sampling and compare them to points sampled from the true distribution. In this section, I’ll define the true joint distribution.

I’m going to use my favorite 2D Gaussian. This will be

To show what samples from this distribution should look like, I’m going to use my favorite rule: if $$\epsilon \sim \mathcal N(0, 1)$$, such as data from np.random.randn, then I can sample from $$\mathcal{N}(\mu, \sigma^2)$$ by using $$\sigma\epsilon + \mu$$. In the case of multivariate Gaussians, I use the Cholesky decomposition of the covariance matrix (kinda like taking the square root of a variance to get the standard deviation) and then use it and the mean to adjust numbers generated using np.random.randn.

## Conditionals

Gibbs sampling requires conditional distributions for each variable.

In the case of Gaussians, there’s a closed-form for the conditional.

### Conditionals of multivariate Gaussians

(You can probably skip this part and the next code block if you want.)

Below I have a function that, given the joint and variable index, gives me another function that can sample from the corresponding conditional Gaussian.

In the case of multivariate Gaussians, I’ll just use the formula from my MLPR notes. It says if I have a multivariate Gaussian

then the conditional is

Now set up the conditionals for this particular problem.

## Gibbs sampling

Now I’ll implement the Gibbs sampling algorithm! What’s cool is that gibbs_sample function only needs the univariate conditionals and how many samples to take.

## Visualizing

Now I can try it out! I’ll use Gibbs sampling to sample a few points and then plot it on top of the real joint distribution from the “Data generation” section.

One thing to keep in mind about Gibbs sampling is that it only updates one dimension at a time. This means that samples from around the same time are correlated with each other. I drew the line connecting sequential samples to show this.

Now I can also sample a bunch of points and see how it compares to the original distribution. It looks the same! What’s cool is that the one using Gibbs sampling only used samples from the univariate conditionals!