Frequentism and Bayesianism IV: How to be a Bayesian in Python
I've been spending a lot of time recently writing about frequentism and Bayesianism.
- In Frequentism and Bayesianism I: a Practical Introduction I gave an introduction to the main philosophical differences between frequentism and Bayesianism, and showed that for many common problems the two methods give basically the same point estimates.
- In Frequentism and Bayesianism II: When Results Differ I went into a bit more depth on when frequentism and Bayesianism start to diverge, particularly when it comes to the handling of nuisance parameters.
- In Frequentism and Bayesianism III: Confidence, Credibility, and why Frequentism and Science Don't Mix I talked about the subtle difference between frequentist confidence intervals and Bayesian credible intervals, and argued that in most scientific settings frequentism answers the wrong question.
Here I want to step back from the philosophical debate and return to more practical issues: in particular, demonstrating how you can apply these Bayesian ideas in Python. The workhorse of modern Bayesianism is Markov Chain Monte Carlo (MCMC), a class of algorithms used to efficiently sample posterior distributions.
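To make "sampling a posterior" concrete before turning to real packages, here is a minimal sketch of one of the simplest MCMC algorithms, random-walk Metropolis, written with only the standard library. Everything in it is an illustrative assumption on my part: the toy problem (estimating the mean of Gaussian data with known unit variance and a flat prior), the function names `log_posterior` and `metropolis`, the step size, and the burn-in length. It is not how any particular package implements its sampler.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    """Unnormalized log-posterior for the mean of Gaussian data (flat prior on mu)."""
    return -0.5 * sum((x - mu) ** 2 for x in data) / sigma ** 2


def metropolis(log_prob, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler; returns a list of n_steps samples."""
    samples = []
    current = start
    current_lp = log_prob(current)
    for _ in range(n_steps):
        # Propose a Gaussian jump around the current position
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(current))
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        # A rejected proposal repeats the current position in the chain
        samples.append(current)
    return samples


rng = random.Random(42)
data = [rng.gauss(3.0, 1.0) for _ in range(50)]  # hypothetical toy data

chain = metropolis(lambda mu: log_posterior(mu, data),
                   start=0.0, n_steps=5000, step_size=0.5, rng=rng)

burned = chain[1000:]  # discard early samples taken before the chain settles
posterior_mean = sum(burned) / len(burned)
sample_mean = sum(data) / len(data)
print(posterior_mean, sample_mean)
```

With a flat prior the posterior for the mean is centered on the sample mean, so the two printed numbers should come out close to one another. In practice you would rarely hand-roll a sampler like this; dedicated packages take care of the proposal tuning and bookkeeping that this sketch glosses over.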
Below I'll explore three mature Python packages for performing Bayesian analysis via MCMC:
- emcee: the MCMC Hammer
- pymc: Bayesian Statistical Modeling in Python
- pystan: The Python Interface to Stan
I'm less concerned here with speed benchmarks between the three than with a comparison of their respective APIs. This post is not meant to be a tutorial in any of the three; each of them is well documented and the links above include introductory tutorials for that purpose. Rather, what I want to do here is a side-by-side comparison that gives a feel for how each package is used. I'll propose a single, relatively non-trivial test problem, and show the implementation and results of this problem using all three packages. Hopefully, seeing the three approaches side by side will help you choose which package is best suited to your particular application.