What is rejection sampling and why would you need it? Suppose that we have a probability density function (PDF) that is impossible to analyze analytically. This is where rejection sampling can be used for: a simple to implement (yet not always effective) sampling method.

## Introduction

Suppose we have a function that is not computationally tracktable. How can we draw samples from it? One (of the many) methods that is out there, is rejection sampling. In this article, we will try to draw samples from the following distribution:

The secret function used for this plot is , which is a multimodal Gaussian. (Note: this function is mathematically tractable, but it is used for demonstration purposes.)

## Approximation

Now we have to come up with a probability density function , such that for all . In other words, we need a PDF that is, when multiplied by a constant , higher than . On our example, one such a function is and . How did I found this function? First of all, Gaussians are easy to compute. Therefore, I choose a Gaussian which has its mean around the “middle” of . I then tuned its variance and the constant by just trying out some values. The result is the following , which is indeed always more than :

## The mathematics

So, what to do next? We sample pairs of where are samples from the approximation Gaussian and is sampled from a uniform probability distribution in the interval . Then we look if , since if that is the case, then lives below the function we want to approximate! Note that this pair is also a sample of the function we wished to approximate.

## Results

The theory is summarized in the following Python script:

The resulted samples are the following:

These samples clearly lie below the function which we wished to sample from:

So this method works. However, note that this method is not efficient. We throw away many points and therefore waste a large amount of time. A better method is for example Markov Chain Monte Carlo (MCMC). The rejection sampling method is efficient in easy probability density functions, but you will have many problems (read: you throw a way a lot of points) if you have functions with many peeks.

## Get updates in your inbox

Join over 8,000 data science learners.