Status: draft
Last modified: 2016-10-26

In a Nutshell

Probabilistic programming (PP) combines the “Math of Data” (statistics) with the “Math of Symbols & Logic” (computer programs) [1].
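To make that combination concrete, here is a minimal sketch in plain Python (standard library only, not a real PPL). The program part is an ordinary function that makes random choices; the statistics part is conditioning on an observation, done here by crude rejection sampling. The names `model` and `infer_both_heads` are made up for illustration, and a real PPL such as WebPPL or Figaro would supply the inference step for you.

```python
import random

def model(rng):
    """Generative model: ordinary code that flips two fair coins."""
    a = rng.random() < 0.5
    b = rng.random() < 0.5
    return a, b

def infer_both_heads(num_samples=100_000, seed=0):
    """Estimate P(both heads | at least one head) by rejection sampling."""
    rng = random.Random(seed)
    accepted = 0
    both = 0
    for _ in range(num_samples):
        a, b = model(rng)
        if not (a or b):
            # Discard runs that contradict the observation "at least one head".
            continue
        accepted += 1
        if a and b:
            both += 1
    return both / accepted

estimate = infer_both_heads()
print(estimate)  # should land close to the exact answer, 1/3
```

Rejection sampling is only the simplest possible inference method. It is used here because it fits in a few lines, not because it is what any particular PPL does internally.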

Probabilistic programming is different from other methods (e.g., machine learning (ML), Bayes nets, or Markov chain Monte Carlo (MCMC) using BUGS or Stan) and better for many types of risk analysis and modeling because it has the potential to be:

Compared to deep learning, probabilistic programming has advantages [3]:

Why is PP Good For Risk Analysis?

For risk analysts, the main benefit of probabilistic programming is that it has the potential to help us take on many hairy problems that we either tackle with great difficulty now or don’t tackle at all.

From statistics, you will be familiar with discrete and continuous probability distributions, where the support is either a discrete set (finite or countably infinite) or a range of real numbers.

What’s different and better about probabilistic programming is that you can work with probability distributions over infinite sets of complex objects.
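As a rough illustration of a distribution over structured objects, the following Python sketch (standard library only) samples random arithmetic expression trees. The function names `random_expr` and `evaluate`, the leaf probability of 0.6, and the choice of operators are all assumptions picked for this example. Because there is no cap on tree size, the support is an infinite set of trees, yet each sample is finite: every node has 0.8 expected children, so sampling stops with probability 1.

```python
import random

def random_expr(rng):
    """Sample an arithmetic expression tree as an int leaf or a nested tuple."""
    if rng.random() < 0.6:
        # Leaf: a single digit.
        return rng.randint(1, 9)
    # Internal node: an operator with two independently sampled subtrees.
    op = rng.choice(["+", "*"])
    return (op, random_expr(rng), random_expr(rng))

def evaluate(expr):
    """Compute the numeric value of a sampled expression tree."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "+" else l * r

rng = random.Random(1)
for _ in range(5):
    tree = random_expr(rng)
    print(tree, "=>", evaluate(tree))
```

Each run of `random_expr` is one draw from the distribution; anything computed from the tree, such as its value, is then a random variable too.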

Unlike Bayesian Networks, you aren’t limited to dependence or conditional structures that are directed acyclic graphs (DAGs). You can model dependence or conditional structures of arbitrary* complexity, including circular or recursive systems.

*As long as they are finite and computable.
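One simple reading of “recursive” here is a model defined in terms of itself. The Python sketch below (standard library only; the name `geometric` and the parameter values are illustrative assumptions) defines a geometric distribution recursively. The number of random choices is not fixed in advance, which a Bayesian network with a fixed, finite set of nodes cannot express directly.

```python
import random

def geometric(rng, p):
    """Count failures before the first success, defined recursively."""
    if rng.random() < p:
        return 0                      # success: stop recursing
    return 1 + geometric(rng, p)      # failure: count it and try again

rng = random.Random(42)
samples = [geometric(rng, 0.5) for _ in range(50_000)]
mean = sum(samples) / len(samples)
print(mean)  # theory says (1 - p) / p, which is 1.0 for p = 0.5
```

The recursion ends with probability 1, which is the “finite and computable” condition from the footnote in practice.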

That should blow your mind!

Caveats

Probabilistic programming is fairly new – just now transitioning from research to practical applications. Not everything is as simple, easy, and “no-brainer” as we might like. But things are improving fast. Now is the time to start learning and experimenting.

Probabilistic Programming Languages (PPLs)

There are many probabilistic programming languages (PPLs) in development and use. None is perfect, and each has pros and cons.

For real™ development, I prefer Figaro. For teaching and interactive tutorials like this one, I prefer WebPPL. I have some experience with Anglican, and it was pretty good once I figured out Clojure.


Endnotes and Credits

1. This pithy phrase comes from Joshua Tenenbaum’s presentation: “How to Grow a Mind: Statistics, Structure and Abstraction”

2. The spreadsheet is arguably the greatest programmer productivity tool of all time because it eliminated the need to write so many of the most common number-processing programs. The DARPA Probabilistic Programming for Advancing Machine Learning (PPAML) program has aims in this direction, at least regarding machine learning and some aspects of AI.

3. These points are adapted from Avi Pfeffer’s MLconf presentation: “Probabilistic Programming with Figaro”