Markov Chain Monte Carlo
Markov chain Monte Carlo (MCMC) is a methodology that uses Markov sequences to efficiently simulate from otherwise intractable distributions. That is, given a probability distribution $\pi$, the goal of MCMC is to simulate a random variable $X$ whose distribution is $\pi$. The distribution may be continuous or discrete; here we assume it is discrete.
SLLN
How does MCMC work? The SLLN provides strong support.

Let $X_1, X_2, \ldots$ be a sequence of independent, identically distributed random variables, each having finite mean $\mu$. The Strong Law of Large Numbers (SLLN) states that the averages $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ converge with probability 1 to the common mean $\mu$, denoted $\bar{X}_n \to \mu$ a.s. Equivalently, $P\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1$.
Monte Carlo Integration
One of the most common applications of Monte Carlo simulation (based on the SLLN) is Monte Carlo integration. Suppose we want to compute the integral $I = \int_a^b g(x)\,dx$ of a function $g$ over a bounded interval $[a, b]$. Writing $I = (b-a)\int_a^b g(x)\,\frac{1}{b-a}\,dx = (b-a)\,E[g(U)]$, where $U \sim \text{Uniform}(a, b)$, we can generate i.i.d. $U_1, \ldots, U_n \sim \text{Uniform}(a, b)$, and by the SLLN the Monte Carlo estimator satisfies $\frac{b-a}{n}\sum_{i=1}^{n} g(U_i) \to I$ with probability 1.
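The estimator above can be sketched in a few lines of Python (a minimal illustration; the function name `mc_integrate` and the test integrand $g(x) = x^2$ are hypothetical choices, not from the text):

```python
import random

def mc_integrate(g, a, b, n=100_000):
    """Estimate the integral of g over [a, b] as (b - a) times the
    sample mean of g(U_1), ..., g(U_n), with U_i ~ Uniform(a, b)."""
    total = 0.0
    for _ in range(n):
        u = random.uniform(a, b)
        total += g(u)
    return (b - a) * total / n

# Example: the integral of x^2 over [0, 1] is exactly 1/3.
random.seed(0)
estimate = mc_integrate(lambda x: x * x, 0.0, 1.0)
```

By the SLLN, `estimate` converges to $1/3$ as `n` grows; with $n = 100{,}000$ the standard error is already below $10^{-3}$.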
Markov Chain
Another supporting result is the following ergodic theorem:
Assume that $X_0, X_1, \ldots$ is an ergodic Markov chain with stationary distribution $\pi$. Let $r$ be a bounded, real-valued function. Let $X$ be a random variable with distribution $\pi$. Then $\frac{1}{n}\sum_{m=0}^{n-1} r(X_m) \to E[r(X)] = \sum_j r(j)\,\pi_j$ with probability 1.
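As a sketch of this theorem, the following simulates a hypothetical two-state chain (the transition matrix `P` and the choice $r(x) = x$ are illustrative assumptions) and compares the long-run average against $E[r(X)]$ computed from the stationary distribution:

```python
import random

# Two-state ergodic chain on {0, 1} with transition matrix
#   P = [[0.5, 0.5],
#        [0.2, 0.8]].
# Solving pi P = pi gives the stationary distribution pi = (2/7, 5/7),
# so with r(x) = x we expect the ergodic average to approach 5/7.
P = [[0.5, 0.5], [0.2, 0.8]]

random.seed(1)
state = 0
n = 200_000
total = 0
for _ in range(n):
    total += state  # accumulate r(X_m) = X_m
    state = 0 if random.random() < P[state][0] else 1
ergodic_avg = total / n  # close to 5/7 ~ 0.714
```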
Metropolis-Hastings Algorithm
The most common MCMC method is the Metropolis–Hastings algorithm. It produces an irreducible, aperiodic Markov chain with stationary distribution $\pi$. Specifically:
Let $\pi$ be a discrete probability distribution. Let $T$ be the transition matrix of any irreducible Markov chain with the same state space as $\pi$. $T$ is used as a proposal chain: it generates elements of a sequence that the algorithm decides whether or not to accept (we assume we know how to sample from $T$). The algorithm constructs a reversible Markov chain $X_0, X_1, \ldots$ whose stationary distribution is $\pi$.
To describe the transition mechanism for $X_0, X_1, \ldots$, assume that at time $n$ the chain is at state $i$, i.e. $X_n = i$. The next step of the chain is determined by a two-step procedure.

- Choose a new state $j$ according to $T$, that is, with probability $T_{ij}$.
- Decide whether or not to accept $j$. Let $a(i, j) = \frac{\pi_j T_{ji}}{\pi_i T_{ij}}$. If $a(i, j) \geq 1$, then accept $j$. Otherwise, accept $j$ with probability $a(i, j)$. If $j$ is not accepted, then we keep $i$ as the next state of the chain (i.e. $X_{n+1} = i$).
Some remarks

- In practice the accept/reject step is driven by a uniform random variable: draw $U \sim \text{Uniform}(0, 1)$ and accept $j$ if $U \leq a(i, j)$.
- If $T$ is a symmetric matrix, then $a(i, j) = \pi_j / \pi_i$, so only ratios of $\pi$ are needed and the normalizing constant cancels.
- If $T$ is ergodic, then the resulting chain is ergodic.
- In long runs, the early iterations may be biased by the initial state; this motivates burn-in (i.e., the discarding of the initial iterations).
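The two-step procedure above can be sketched as follows (a minimal illustration; the target $\pi_j \propto j + 1$ on $\{0, \ldots, 9\}$ and the uniform proposal matrix are hypothetical choices, not from the text):

```python
import random

# Hypothetical target: pi_j proportional to (j + 1) on {0, ..., 9}.
# Only the ratios pi_j / pi_i matter, so no normalizing constant is needed.
weights = [j + 1 for j in range(10)]

def mh_chain(n_steps, seed=2):
    """Metropolis-Hastings with the symmetric proposal T_ij = 1/10
    (propose a uniformly random state), so a(i, j) = pi_j / pi_i."""
    random.seed(seed)
    state = 0
    samples = []
    for _ in range(n_steps):
        proposal = random.randrange(10)                   # step 1: draw j from T
        accept_prob = weights[proposal] / weights[state]  # a(i, j)
        if random.random() <= accept_prob:                # step 2: accept w.p. min(1, a)
            state = proposal
        samples.append(state)                             # else keep state i
    return samples

samples = mh_chain(100_000)
# Empirical frequency of state 9; its stationary probability is 10/55.
freq9 = samples.count(9) / len(samples)
```

Note that when $a(i, j) \geq 1$ the comparison `random.random() <= accept_prob` always succeeds, so the single test implements both branches of the acceptance rule.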
Gibbs Sampling
The Gibbs sampler is a special case of the Metropolis–Hastings algorithm, with a particular choice of proposal distribution.
Suppose $\mathbf{Z} = (X, Y)$ is a random vector such that $\mathbf{Z} \sim \pi$. Let $f(x, y)$ be the pdf of $\pi$ on $\mathbb{R}^2$. Then we have the marginal pdfs $f_X(x) = \int f(x, y)\,dy$ and $f_Y(y) = \int f(x, y)\,dx$. Since it is hard to generate directly from $f$, which is a higher-dimensional distribution, we can use the conditional distributions $f_{X \mid Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}$ and $f_{Y \mid X}(y \mid x) = \frac{f(x, y)}{f_X(x)}$. Then we can use the following algorithm:
For the $n$th step, with $(X_n, Y_n) = (x, y)$ given:

- Generate $U \sim \text{Uniform}(0, 1)$.
- If $U \leq 1/2$, put $Y_{n+1} = y$ and generate $X_{n+1}$ from $f_{X \mid Y}(\cdot \mid y)$; if $U > 1/2$, put $X_{n+1} = x$ and generate $Y_{n+1}$ from $f_{Y \mid X}(\cdot \mid x)$.
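The steps above can be sketched for a hypothetical target, a standard bivariate normal with correlation $\rho = 0.6$, whose conditional distributions have the well-known closed form used in the comments (the target and $\rho$ are illustrative assumptions, not from the text):

```python
import random

# Gibbs sampler for a standard bivariate normal with correlation rho.
# Its conditionals are themselves normal:
#   X | Y = y ~ N(rho * y, 1 - rho^2),  Y | X = x ~ N(rho * x, 1 - rho^2).
rho = 0.6
cond_sd = (1 - rho ** 2) ** 0.5

random.seed(3)
x, y = 0.0, 0.0
xs = []
for _ in range(200_000):
    if random.random() <= 0.5:
        # keep y, resample x from f_{X|Y}(. | y)
        x = random.gauss(rho * y, cond_sd)
    else:
        # keep x, resample y from f_{Y|X}(. | x)
        y = random.gauss(rho * x, cond_sd)
    xs.append(x)

# Long-run moments of the X-coordinate should match the N(0, 1) marginal.
mean_x = sum(xs) / len(xs)
second_moment_x = sum(v * v for v in xs) / len(xs)
```

Each step resamples only one coordinate from a one-dimensional conditional, which is the whole point: the sampler never needs to draw from the joint $f$ directly.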