Modelling the Modeller

Something I’ve struggled with on and off over the 20 years that I have been making mathematical models is explaining those models to others. I have tried to bring people along and develop their understanding. But mainly what I observed was that some people just got it and others did not.

I have certainly improved my own skill at explaining. This comes down to having streamlined stories and simpler take-home messages. Telling a clearer story certainly improves my audiences’ self-satisfaction, but ultimately some of them get the whole message and others do not.

I used to mock physicists for their penchant for making models of models rather than solving the real-world problems for which the original models were intended. Now I discover that I want to do the same thing. As my own experience in model-making and in the supervision of teams building models has grown, I find myself modelling the process that I see in front of me. (This is one of the reasons why I wrote my first academic paper in years.) Today I want to present a model of the modelling process and of the modeller.

I developed this model out of personal frustration last winter, and I have found it to be extremely powerful in helping me to work with others on modelling projects.

Multi-scale Modelling

I’m going to let you in on a secret. Models rarely have more than 3 levels. There is a practical principle in physics modelling which basically says that all of the terms in a model should operate on the same scale. Then maybe you can build a layer on top which takes multiple smaller scale models and tries to couple them. This requires the combined model to operate at a different scale to the sub-models. But going to 3 layers is already almost impossible.
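A minimal sketch of a 2-level model may make the layering concrete. Everything in it is an illustrative assumption rather than a model from a real project: the `Patch` class, the logistic growth rule, the `couple` function and all parameter values are hypothetical. The point is only that the local model runs on a fine time step while the coupling layer runs on a coarser one.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Patch:
    """Level 1: a local model whose terms all act on the same (fast) scale."""
    population: float
    growth_rate: float
    capacity: float

    def step(self, dt: float) -> None:
        # Logistic growth, integrated with a plain Euler step.
        p = self.population
        self.population = p + dt * self.growth_rate * p * (1.0 - p / self.capacity)


def couple(patches: list[Patch], migration: float) -> None:
    """Level 2: exchange between patches, applied on a coarser time scale."""
    mean = sum(p.population for p in patches) / len(patches)
    for p in patches:
        # Relax each patch toward the mean; the total population is conserved.
        p.population += migration * (mean - p.population)


def simulate(patches: list[Patch], fine_steps: int = 10, coarse_steps: int = 5,
             dt: float = 0.1, migration: float = 0.2) -> list[float]:
    """Run many fine local steps between each coarse coupling step."""
    for _ in range(coarse_steps):
        for _ in range(fine_steps):
            for p in patches:
                p.step(dt)
        couple(patches, migration)
    return [p.population for p in patches]
```

Adding a third layer would mean wrapping `simulate` in yet another loop with its own scale and its own state, which is where keeping the whole thing in one head starts to get hard.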

In practice, you see many local-level models, quite a few 2-level models and almost no 3-level models. Like every rule of thumb there are exceptions. Interestingly, modern machine-learning methods, particularly convolutional neural networks, finally seem to be able to take advantage of information at arbitrary levels of description.

This idea that ML models might be processing many more levels of description than traditional physics models is what pushed my thinking in a new direction about the humans who do mathematical modelling. I always thought that the levels principle from physics was grounded in some theory. It isn’t. Sure, there are aspects of numerical precision theory, but they can be taken care of. I realised that the reduced number of scales or levels we see in human-constructed mathematical models is often a product of the thinking styles of the modellers rather than anything inherent in the data.

3 Types of Modeller

From what I have observed, there are three types of mathematical modeller. I would go so far as to say, I am describing a fundamental mode of human thought and there are three types of thinker in this model.

The first type of mathematical modeller works entirely at a local rules level. If I assign a problem carefully to them, they have the ability to follow rules and arrive at a result. They treat their work as an input-output relationship. Everything follows a pattern and they work according to the established patterns.

As somebody who typically sits outside the system and observes, I find this behaviour bizarre. It is often the sign of a beginner. We often teach people the mechanics of how to handle data before we tell them why it works that way. But as I have gained in experience I have learned that this is also a fundamental mode of behaviour which some people maintain their entire working lives. They may still be an excellent engineer, as long as somebody else is taking care of the macro-level issues.

The second type of mathematical modeller works across two levels. This person is able to do the fundamental implementation work but they are also able to see how this work fits into an external layer. The two metaphors I use here are software APIs and analytic reports.

Such a practitioner might understand that the function they are writing is a component in a statistical toolkit and stands alongside similar functions. They may also be supervising multiple engineers who are doing the implementing while the modeller is making sure that the work still fits the API requirements. Producing for an API is similar to producing for a Report. Somebody else has commissioned the report but this person is capable of combining multiple analyses into a coherent analysis.

The vast majority of data science professionals operate at this level. I would even say this is their sweet spot. If they trained in academia then they already know how to write reports. In general, data scientists love structured work environments and so working at this level is the highest they can rise without being overly confronted by the frustratingly woolly needs of other corporate divisions.

Finally, the third type of mathematical modeller is capable of working across three levels. This is an exceedingly rare skill set. I have to admit that I took a very long time to appreciate how few people actually work at this level even at some of the elite institutions where I studied. Before I explain this trait I want to emphasise that no category is inherently better than the others. This just happens to be the category into which I fit best.

The 3-level modeller is capable of designing a process, or code, which works at the lowest level, forms an intermediate interaction-layer such as an API, where useful functions are grouped together, and delivers on a useful goal in essentially an entirely different mode of action. That’s all super abstract so let me try to describe two examples.

The first example is designing an API. Junior software engineers often ask me how ‘I know’ what level of abstraction to use when writing an API. My answer is usually, “Experience,” but the truth is more complex than this. What I am actually doing is maintaining a model in my head of the level of complexity of the code in the function, how that fits into the unfolding API, and how much freedom this will enable or limit when somebody else is subsequently using the API to develop a product. That’s the definition of a good API. It shouldn’t contain abstractions just for abstraction’s sake, but it should not overly limit the ultimate use-cases of the final product.

The second example is from structuring business units and product development in a rapidly growing company. This is very similar to the API example. How do I deploy my resources so that we don’t waste energy but we position ourselves well for not just our immediate plan but also for many likely alternatives? I look for chunks which make sense together. These might be completed components for the product or they may be business development discussions which are only half completed. I put placeholder boxes around these chunks in my mind. Then I try out configurations which allow us the greatest long-term flexibility but at relatively low time-and-energy costs.

I find myself more and more nowadays applying this 3-level modelling strategy to digital health projects. It is rare to be able to develop medical AI in a single step. Sometimes this is a data issue, but more often it is a regulatory/validation issue. Often what we want to do is use one model to form an insight and an entirely separate model to decide what to do with that insight. The problem is that both of these models are statistical in nature, and they carry inherent risks that we might mistreat the patient. The key is to develop the separation I describe here. The misstep would be to try it all in one go.
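The separation between the insight model and the decision model might look something like the sketch below. It is purely illustrative: the logistic curve, the uncertainty proxy, the thresholds and the action labels are all made-up assumptions, and none of them is a clinical rule.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Insight:
    risk: float         # estimated probability in [0, 1]
    uncertainty: float  # toy measure of how unsure the estimate is


def insight_model(marker: float) -> Insight:
    """Stage 1: turn a measurement into a risk estimate (toy logistic curve)."""
    risk = 1.0 / (1.0 + math.exp(-(marker - 5.0)))
    # Hypothetical uncertainty proxy: largest where the curve is steepest.
    uncertainty = risk * (1.0 - risk)
    return Insight(risk, uncertainty)


def decision_model(insight: Insight, act_above: float = 0.8,
                   max_uncertainty: float = 0.2) -> str:
    """Stage 2: decide what to do with the insight, as a separate model."""
    if insight.uncertainty > max_uncertainty:
        return "refer to clinician"
    return "flag for follow-up" if insight.risk >= act_above else "no action"


def pipeline(marker: float) -> str:
    # The only contact between the two stages is the Insight object.
    return decision_model(insight_model(marker))
```

Because the two stages only meet at `Insight`, each one can be tested and validated on its own, for example by feeding `decision_model` hand-built insights without running the first model at all.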


People are all different from one another. I took a very long time to learn this. There is no skill set better than any other. But I find this description of abstract thinking strategies very useful in dealing with others. It also has helped me to feel less frustrated when somebody doesn’t understand my work; they just think about it differently.
