I am beginning a new project this week: Causal Inference. This is something I have been reading about, and wrestling with, for quite some time. Now seems a good point to take some time out, form a project, and see what I can get done on the topic.
Causal Inference uses statistics and mathematical modelling to distinguish causal factors in systems. The basic rule in statistics is that you cannot prove causation from a statistical association alone. A focus on associative (correlative) inference is fine for passive prediction, but it is worse than useless when you actually want to influence the system.
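To make that distinction concrete, here is a minimal simulation of my own (an illustrative toy, not taken from any of the libraries discussed below): a hidden common cause makes two variables correlate strongly even though neither influences the other, so an intervention on one has no effect on the other.

```julia
# A confounder Z drives both X and Y: they correlate strongly,
# yet intervening on X has no effect on Y.
using Random, Statistics

Random.seed!(42)
n = 100_000
z = randn(n)                 # hidden common cause
x = z .+ 0.3 .* randn(n)     # X "listens" to Z
y = z .+ 0.3 .* randn(n)     # Y also listens to Z, never to X

observed_corr = cor(x, y)    # strong association (~0.9)

# Simulate the intervention do(X = 2): overwrite X by fiat,
# cutting the Z -> X arrow. Y's mechanism never mentions X.
x_do = fill(2.0, n)
y_do = z .+ 0.3 .* randn(n)

observed_corr               # ~0.9: strong passive association
mean(y_do) - mean(y)        # ~0.0: no causal effect of X on Y
```

Predicting Y from X passively works beautifully here; predicting what happens to Y when you *set* X fails completely, and only the causal graph tells you which regime you are in.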
Physics is, in a sense, all about the study of cause and effect. Parameters in physical models are treated as real things, and experimentalists set out first to measure, and then to manipulate, them. Quantum mechanics transformed physics twice over: initially by introducing the probabilistic wavefunction as a fundamental unit, and secondly by occasionally reversing the arrow of time in causality. Physicists and philosophers of physics have spent decades debating what is causal versus what is merely associative.
Biology, my field, has so far maintained a strong separation between statistics and causality. On the one hand, biology is all about gathering data and calculating summary statistics (see my previous article, where I refer to the origins of the discipline as ‘stamp collecting’). On the other hand, biology is largely about studying ‘mechanism’. However, try to write a paper in which you use (advanced) statistical methods to demonstrate the validity of your mechanistic explanation and you will be severely tested by your reviewers. I’ve actually done it – and succeeded – and it hurts!
Bridging the gap in biology between correlative statistical training and a modern-day mechanistic world view is already a huge, and growing, issue.
I would love to write the definitive introduction to Causal Inference. But time is short, and frankly I think this guy has already done a wonderful job. I will merely try to expand upon the bits which are particularly important to my work.
I will be writing my code in Julia. This is a fantastic language which has finally reached relative stability. Julia sets out to combine the linguistic simplicity of scripting languages, such as Python, with the computational speed of C. This is exactly the niche in which I want to be writing my code. As an added bonus (coincidence?), the syntax of the language maps particularly well onto abstract mathematical thinking, via a mechanism called (dynamic) multiple dispatch, which makes it much quicker to write complex numerical code. I began using Julia in 2014, founded the Julia Users Berlin meet-up in early 2015, and I am delighted to finally have a project through which I can learn the recently stabilised language/API.
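A quick toy illustration of what multiple dispatch looks like (the `Gaussian`/`PointMass` types here are invented purely for the example): each mathematical case gets its own method, and Julia selects the method from the runtime types of *all* arguments, much as a maths text defines an operation case by case.

```julia
abstract type Distribution end

struct Gaussian <: Distribution
    μ::Float64
    σ²::Float64
end

struct PointMass <: Distribution
    x::Float64
end

# The sum of two independent Gaussians is Gaussian; adding a point
# mass merely shifts the mean. One method per mathematical case.
Base.:+(a::Gaussian, b::Gaussian)  = Gaussian(a.μ + b.μ, a.σ² + b.σ²)
Base.:+(a::Gaussian, b::PointMass) = Gaussian(a.μ + b.x, a.σ²)
Base.:+(a::PointMass, b::Gaussian) = b + a        # commute to reuse the case above

Gaussian(0.0, 1.0) + Gaussian(1.0, 2.0)  # Gaussian(1.0, 3.0)
PointMass(2.0) + Gaussian(0.0, 1.0)      # Gaussian(2.0, 1.0)
```

The same `+` symbol carries the whole family of definitions, which is exactly how mathematical notation behaves on paper.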
There are a number of related projects. All of them are extremely recent, and none of them is in any sense complete:
- For Python, Microsoft recently open-sourced DoWhy. The structure I would like to write is similar.
- For R, Google released CausalImpact. There might be some nice ideas in there for the handling of time series, since the focus seems to be on views and clicks.
- And in Julia, there IS an existing project: CausalInference.jl. If it makes more sense to contribute to this project rather than starting my own, then that is what I will do.
- Finally, Omega is a probabilistic programming language, written in Julia, which incorporates the Testing and Counterfactual elements of causal inference – the inference parts of causal inference. For reasons I will go into below, I may end up contributing here, since the bits I am most likely to work on logically precede these steps.
What is my purpose here? I am working on a project, without pay, for a number of months in order to:
- Learn a new toolset – I have already read a lot about Causal Inference; I find it incredibly interesting; I am amazed that it is not in more widespread use; I want to learn what works and what does not. I am already learning why this approach is not in wider use; more on that another day.
- Find out what is really possible – Reading books like Pearl’s Book of Why, it feels as though the problem of Generalised Artificial Intelligence is already solved and just waiting to be implemented. But this is not really true. So what can be done with this toolset? What is it good for? And what are the tasks for which other toolsets are far more powerful?
- Get back into programming and data – I’ve spent two years leading teams of Software Engineers, Machine Learning Engineers, and Data Scientists. I architected systems. I gave nuanced feedback on how I wanted systems to be built. But I barely touched a line of code myself. For mentoring reasons, I barely even looked at data myself. Along with my consulting work, I have become incredibly good at leading teams on advanced AI/ML/Data/Scientific projects. But I don’t like the feeling of letting my fundamental skills get rusty. I want to keep some skin in the game.
- Decide what to do next – It seems that one of my key skills is learning advanced technologies and then leading teams to develop tools which leverage at least a small corner of these techniques. I want to work either on biomedical applications of AI or on Generalised AI. Causal Inference has strong links to both. Let’s see what I can learn, and then we’ll see what I will work on next.
What can Causal Inference realistically do?
I fundamentally agree with the Microsoft DoWhy project in how it divides up the world of Causal Inference. Essentially, there are two different aspects to any causality toolset:
- Graph theory driven analytics
- Inference / Prediction
The analytics involves (1) building a causal model, and (2) analysing which hypotheses are ‘provable’ and which are unconstrained by the model. This is probably where I will begin my project, and it is the bit which might be a welcome contribution to the Omega project (they don’t need it, but it would be useful).
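As a sketch of what step (1) and (2) could look like, here is a deliberately tiny toy of my own devising (the parent-map representation and function names are invented for this post, not DoWhy's or Omega's API). It leans on the textbook result that adjusting for the observed parents of the treatment closes every backdoor path:

```julia
# Encode a toy causal DAG  Z -> X -> Y,  Z -> Y  as a parent map.
parents = Dict(
    :Z => Symbol[],     # confounder, no parents
    :X => [:Z],         # treatment, caused by Z
    :Y => [:Z, :X],     # outcome, caused by Z and X
)

# A valid backdoor adjustment set for the effect of `treatment` on
# anything downstream: the treatment's own parents.
adjustment_set(g, treatment) = g[treatment]

# The effect is identifiable from data iff every variable in the
# adjustment set was actually measured.
identifiable(g, treatment, observed) =
    all(v -> v in observed, adjustment_set(g, treatment))

adjustment_set(parents, :X)              # [:Z]
identifiable(parents, :X, [:Z, :X, :Y])  # true:  Z is measured
identifiable(parents, :X, [:X, :Y])      # false: Z is latent
```

Real toolkits go much further (full d-separation, minimal adjustment sets, instrumental variables), but the shape of the question is the same: given this graph and these observed variables, which causal claims are even testable?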
The inference/prediction involves (1) fitting a predictive model, which obeys the causal model, to the data and estimating the (causal) effect, and (2) testing the robustness of the fitted model numerically.
In a broad sense, part 1 is about finding out which causal effects are measurable for a given data set and hypothesis, while part 2 is the prediction of the outcome of ‘What if…’ style manipulations. In true mathematical fashion, first you need to know what it is possible to show; then you can actually demonstrate effect size, and so on.
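Continuing the toy example above, here is what the estimation step can look like in the simplest linear case (my own illustrative sketch, using only ordinary least squares; the numbers are synthetic): once the graph tells you that Z is the backdoor set, regressing the outcome on both treatment and Z recovers the causal effect, while the naive regression does not.

```julia
# Simulate data from the DAG  Z -> X -> Y,  Z -> Y.
using Random, Statistics, LinearAlgebra

Random.seed!(1)
n = 50_000
z = randn(n)
x = 0.8 .* z .+ randn(n)
y = 2.0 .* x .+ 1.5 .* z .+ randn(n)   # true causal effect of X on Y is 2.0

# Naive regression of Y on X alone is confounded by Z:
naive = [x ones(n)] \ y                # slope biased upward, well above 2.0

# Adjusting for Z, the backdoor set identified from the graph:
adjusted = [x z ones(n)] \ y           # slope ~2.0, the causal effect

naive[1], adjusted[1]
```

The point is the division of labour: the graph analysis (part 1) tells you *which* regression is the right one to run; the fitting and robustness checks (part 2) then put a number, with uncertainty, on the effect.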
Relationship to my previous work
My interest in Causal Inference grew out of two aspects of my previous work. Firstly, I use mathematics to explain and predict biophysical processes. With the best will in the world, it is exceedingly difficult to prove to all participants that your mathematics successfully explain a biological process. For this reason, I have wandered through most of the sub-topics of mathematical and statistical modelling by now. Causal Inference seems to be the logical endpoint of epidemiology, if things go well.
Secondly, I worked on a project involving Actor-Critic learning a number of years ago. Part of this project involved the creation of an entirely abstract representation of a Critic’s worldview. I never published this work; basically, nobody has ever seen it. But it contributes remarkably to learning in an Actor-Critic framework, and it looks as though it might actually be implemented in the brain. This work originally led me to Pearl’s causality framework, and I would love to link the two. If the hypothesis representation in the causal model can be made symbolic, in accordance with my previous work, then it might be a very interesting step forward in AI and Learning Systems.