I am asked quite often how I see Data Science in the biomedical industry. I have, of course, many answers, each of which is context-dependent. However, one theme which I find recurring frequently is a sort of straw-man debate which seems to attract technical practitioners.
The debate is usually structured as follows:
How do you see the validation of medical AI products working in practice?
Answer: clinical trials, test-validation sets, blah, blah
But doesn’t this lead to enormous overheads?
Answer: yes, but there are shortcuts
But if you take these shortcuts then don’t you run the risk of running into costly failures when you finally run the clinical trials?
It goes on….
We need to short-circuit this debate somewhat. Yes, the entirety of the debate has merit. If nothing else it gets us all onto the same working page. But beyond a certain point we have to decide that there are different types of models and that each comes with an associated risk and inherent costs. The most expansive model will (should) perform best but will cost the most to produce. If a simpler model were capable of being as accurate then that is the model you would have produced in the first place. We can lower the late-stage risk by incurring higher early-stage costs, but this is a trade-off which entails a slower development process.
From my perspective, this leads naturally to three levels of modelling that Data Science can provide to the biomedical industry. These models form a natural ordering from most detailed and expensive to simplest and cheapest.
- Production / Product
- Internal Decision Making
- Research
Production / Product Models
Models which are to be deployed in production in a biomedical context undergo the highest levels of scrutiny. They form the core of the industrialisation of data modelling. These models have already gone through an extensive discovery process and are now at a point where they will contribute to decisions about human health (in production).
The only shortcuts on Production models are to be found during the earlier phases of development. These will all disappear by the time you run a clinical trial (see my paper on the present necessity of clinical trials for medical AI).
Research Models
The alternative to Production models is usually seen to be Research models. Data Scientists need to be allowed to perform discovery on data sets. They cannot be expected to turn every data dive into a product / production model.
In general, I like my teams to produce exploratory code which is shared with other team members. I don’t like the production of solo graphs without the potential to check the source later. This means that code should be shared, at least within the group, and I find that it leads to a slight homogenisation of workflows.
Of course, I expect Data Scientists to behave professionally. But I also accept that mistakes will be made. And we cannot invoke the entire machinery of automated testing for every data dive. Models from the Research aspect of the job are indicative – and no more than this!
Internal / Decision Making
There is a hidden intermediate level of model, which I actually see as one of the most important levels of modelling performed by professional Data Scientists. This is the level of modelling which will contribute to internal decision making. This can be a slightly simplified version of the desired production model – perhaps using early/small data sets – which can set goals and contribute to planning. It might be an automated set of metrics which are run on the data set – a canary in the mine – which can signal when the source data set has changed its basic characteristics.
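As a rough illustration of the canary idea, here is a minimal Python sketch which compares a new batch of a single numeric column against a reference batch and reports when its basic characteristics have moved. The function names, the choice of metrics (mean and missing fraction) and both thresholds are my own assumptions for the sake of the example, not a recommended standard.

```python
from statistics import mean, stdev


def summarise(values):
    """Basic characteristics of one numeric column (None marks a missing value)."""
    present = [v for v in values if v is not None]
    return {
        "missing_fraction": 1 - len(present) / len(values),
        "mean": mean(present),
        "std": stdev(present),
    }


def canary_check(baseline, current, max_mean_shift=0.5, max_missing_increase=0.05):
    """List the ways in which `current` has drifted away from `baseline`.

    max_mean_shift is measured in units of the baseline standard deviation.
    Both thresholds are illustrative assumptions. The baseline is assumed
    not to be constant, otherwise its standard deviation would be zero.
    """
    ref, new = summarise(baseline), summarise(current)
    alerts = []
    shift = abs(new["mean"] - ref["mean"]) / ref["std"]
    if shift > max_mean_shift:
        alerts.append(f"mean shifted by {shift:.2f} baseline standard deviations")
    increase = new["missing_fraction"] - ref["missing_fraction"]
    if increase > max_missing_increase:
        alerts.append(f"missing fraction rose by {increase:.2f}")
    return alerts


# Hypothetical measurements, invented purely for the example.
baseline = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9]
stable = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1]
drifted = [6.0, 6.2, None, 5.9, 6.1, None, 6.0, 6.3]

print(canary_check(baseline, stable))   # no alerts: an empty list
print(canary_check(baseline, drifted))  # two alerts: mean shift and missing values
```

A check like this is cheap enough to run on every data delivery, which is what makes it an Internal / Decision Making tool rather than a one-off Research artefact. It signals that something has changed, not why it changed.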
The straw-man which I referred to in my introduction invokes a debate about whether Internal / Decision Making models should follow best practices for Production or be allowed to reside in the Research corner. I say that neither answer is appropriate! This is an intermediate level of modelling, which is more expensive to produce than Research models. It may even contribute to the Production pipeline as a set of automated tests or in the decision-making process about where resources should be allocated. However, it cannot be validated to the ultimate level of a Production model (e.g. via a clinical trial). This is neither practical nor possible. The tooling around this level of model is also less well developed.
In many ways, this class of models is both the hardest and the most interesting aspect of Data Science in industry today.