In a Nutshell
Probabilistic programming (PP) combines the “Math of Data” (statistics) with the “Math of Symbols & Logic” (computer programs) [1].
Probabilistic programming differs from other methods (e.g., machine learning (ML), Bayes nets, or Markov chain Monte Carlo (MCMC) using BUGS or Stan) and is better for many types of risk analysis and modeling because it has the potential to be:
- More powerful and generally applicable – able to handle complex problems that are hard or impossible with other methods
- Easier to learn, iterate, extend, maintain … – you focus your time on the model rather than computational details of inference
- The “spreadsheet” of risk modeling – i.e., a multi-purpose tool with a low barrier to entry and a smooth learning curve [2].
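To make the "focus on the model" point concrete, here is a minimal sketch in plain Python (not WebPPL or any other PPL). The names `breach_model` and `rejection_infer`, and every number in the model, are hypothetical and chosen only for illustration. The idea it tries to show: the model is just a program that generates outcomes, and inference is a separate, generic routine. Real PPLs supply far more efficient inference algorithms than the brute-force rejection sampling used here.

```python
import random
from collections import Counter


def breach_model(rng):
    """Generative model: returns (latent, observed). All numbers are made up."""
    weak_controls = rng.random() < 0.3          # prior: 30% of sites have weak controls
    p_breach = 0.4 if weak_controls else 0.05   # breach probability depends on controls
    breach = rng.random() < p_breach
    return weak_controls, breach


def rejection_infer(model, observed, n_samples=100_000, seed=0):
    """Generic inference: estimate P(latent | observation == observed).

    Runs the model many times and keeps only the runs whose
    observation matches the evidence (rejection sampling).
    """
    rng = random.Random(seed)
    kept = Counter()
    for _ in range(n_samples):
        latent, obs = model(rng)
        if obs == observed:
            kept[latent] += 1
    total = sum(kept.values())
    return {value: count / total for value, count in kept.items()}


# Given that a breach happened, how likely is it that controls were weak?
posterior = rejection_infer(breach_model, observed=True)
print(posterior)
```

For this toy model you can check the answer by hand with Bayes' rule: P(weak controls | breach) = 0.12 / 0.155, or roughly 0.77, and the sampled estimate should land close to that. Note that `rejection_infer` knows nothing about breaches or controls, so changing the model does not require touching the inference code.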
Compared to deep learning, probabilistic programming has several advantages [3]:
- Easier to incorporate rich domain knowledge
- Can work well with “Small Data”
- More explainable and understandable
- Can make probabilistic inferences on complex data types of variable size
Why is PP Good For Risk Analysis?
For risk analysts, the main benefit of probabilistic programming is that it has the potential to help us take on many hairy problems that we either do with great difficulty now, or we don’t do at all.
From statistics, you will be familiar with discrete and continuous probability distributions, where the support is either a discrete set (finite or countably infinite) or a range of real numbers.
What’s different and better about probabilistic programming is that you can work with probability distributions over infinite sets of complex objects.
Unlike Bayesian Networks, you aren’t limited to dependence or conditional structures that are directed acyclic graphs (DAGs). You can model dependence or conditional structures of arbitrary* complexity, including circular or recursive systems.
*As long as they are finite and computable.
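Here is a small sketch of what "a distribution over complex objects" can look like, written in plain Python rather than a PPL. The function name `random_loss_events` and its parameters are hypothetical, picked just for this illustration. The model is recursive: it keeps adding loss events until a coin flip says stop, so each sample is a list of unbounded (but always finite) length, not a single number.

```python
import random


def random_loss_events(rng, p_more=0.6):
    """Recursively generate a list of loss amounts of random length.

    With probability p_more there is one more loss event and the
    function calls itself; otherwise the list ends.
    """
    if rng.random() >= p_more:
        return []
    loss = rng.lognormvariate(0.0, 1.0)  # illustrative loss-size distribution
    return [loss] + random_loss_events(rng, p_more)


rng = random.Random(42)
samples = [random_loss_events(rng) for _ in range(20_000)]

# Each sample is a whole list; summarize the distribution of list lengths.
lengths = [len(s) for s in samples]
mean_length = sum(lengths) / len(lengths)
print("mean number of events:", mean_length)
print("longest sampled list:", max(lengths))
```

There is no fixed upper bound on the list length, yet every run terminates with probability 1. With `p_more=0.6` the expected number of events is 0.6 / 0.4 = 1.5, so the sampled mean should be near that value.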
That should blow…your…mind!
Caveats
Probabilistic programming is fairly new – just now transitioning from research to practical applications. Not everything is as simple, easy, and “no-brainer” as we might like. But things are improving fast. Now is the time to start learning and experimenting.
Probabilistic Programming Languages (PPLs)
There are many probabilistic programming languages (PPLs) in development and use. None is perfect, and each has pros and cons.
- WebPPL – used in this tutorial. A functional programming language built on top of JavaScript. Good for interactive development, web demos, web applications, and teaching. Has an R interface. Pronounced “web people”. Successor to the Church language.
- Figaro – A functional programming language implemented as a Scala library; Scala itself runs on the Java Virtual Machine (JVM). Good if you want/need the benefits of a mixed functional/object language like Scala (e.g. Akka actor framework, etc.).
- Anglican – A functional programming language built on top of Clojure (a dialect of Lisp designed for functional programming on the JVM). Like other Lisps, Clojure treats code as data, which is nice for symbolic AI. It also has a macro system. Some good inference algorithms for certain classes of models.
- … and many more – click here for descriptions and links.
For real™ development, I prefer Figaro. For teaching and interactive tutorials like this one, I prefer WebPPL. I have some experience with Anglican, and it was pretty good once I figured out Clojure.
Endnotes and Credits
1. This pithy phrase comes from Joshua Tenenbaum’s presentation: “How to Grow a Mind: Statistics, Structure, and Abstraction”
2. The spreadsheet is arguably the greatest programmer productivity tool of all time because it eliminated the need to write so many of the most common number processing programs. The DARPA Probabilistic Programming for Advancing Machine Learning (PPAML) program has aims in this direction, at least regarding machine learning and some aspects of AI.
3. These points are adapted from Avi Pfeffer’s MLconf presentation: “Probabilistic Programming with Figaro”