I am asked quite often how I see Data Science in the biomedical industry. I have, of course, many answers, each of which is context-dependent. However, one theme that I find recurring frequently is a sort of straw-man debate that seems inherently to attract technical practitioners.
The debate is usually structured as follows:
Continue reading “Data Science in Biomedical Industry”
How do you see the validation of medical AI products working in practice?
Answer: clinical trials, test-validation sets, blah, blah
But doesn’t this lead to enormous overheads?
Answer: yes, but there are shortcuts
But if you take these shortcuts then don’t you run the risk of running into costly failures when you finally run the clinical trials?
It goes on….
I had the opportunity to talk recently with a relatively advanced researcher in machine learning methods. The conversation turned briefly to the study of embeddings when he mentioned that most of his work involves things that can be embedded in Euclidean space. Since I’ve been spending a bit of time thinking about embeddings recently, I asked him some questions to get the official ML take on the subject. I was reasonably gratified to learn that – although most ML engineers don’t think much about embeddings – the research on this topic considers the embedding to be tightly bound to the network architecture. It is not possible to study abstract embeddings, divorced from applications. I fully agree with this point of view.
Continue reading “ML Embeddings and the Neuronal Code”
Randomised controlled trials (RCTs) have been the gold standard for statistical evidence of treatment effect for over 100 years. Their strength lies in their attempt to avoid major sources of bias when comparing the evidence. However, they are costly to run, particularly in the domain of personalised medicine, to which medical AI products typically belong.
Continue reading “RCTs vs Real-World Evidence for medical AI”
I have been invited to speak at the Dynamics of Immune Responses workshop/seminar/conference in May-June 2020. The invitation arose through my previous efforts to found a company in this space.
There is a growing awareness in the field of immunology of the potential for using mathematical techniques. The wedge issue here is the cascade of data appearing via new cytometry techniques; large data looks like a math issue to most people. I, of course, come from the other side of the spectrum: everything looks like a math issue to me. By founding my company, I wanted to stimulate drug development that engages with immune system dynamics.
Continue reading “Invited Speaker: Dynamics of Immune Responses”
First, a mea culpa: I have a huge backlog of relatively heavy articles that I really want to add to the blog. But I’ve been busy getting married – congratulations to me – and I didn’t have enough time. I strongly believe in following relatively strict guidelines on writing and editing articles, where I set myself deadlines and avoid over-writing on topics – it is just a blog after all – but for deep insights I do also have a minimum standard that I want to meet before I’m willing to hit the Publish button.
Continue reading “Build – Test – Move”
I am beginning a new project this week; the topic is Causal Inference. This is something I have been reading about, and wrestling with, for quite some time. Now seems a good point to take some time out, form a project, and see what I can get done on the topic.
Continue reading “New Project: Causal Inference”
Today is my last official day under contract to Fosanis GmbH. I had my first encounter with the founders following my talk at the Digital Health Forum in March 2018. Following that initial meeting I became an advisor, writing a major funding proposal and bringing scientific techniques to the core of the product. In November 2018, following the closure of my own company, I became a full-time member of staff – as Head of Data Science – and led the project on the basis of the ideas contained in my funding proposal.
Continue reading “Closing a Chapter @ Fosanis”
This topic occurred to me following my recent talk at a dental conference at Charité Berlin. Upon hearing that I have a strong interest in inference, my fellow keynote mentioned that it drives him crazy that random forests, and similar algorithms, work so much better than DNNs on genomic data. He challenged me to come up with a reason for why this is the case.
I think that I know why. The problem I have is that I suspect that I can never prove it. That issue of not being able to prove things in machine learning is probably an equally interesting topic, for a future article, but here I want to address my theory of why random forests work better than DNNs for analysing genome data.
Continue reading “Why do Trees work better than DNNs on genome data?”
How do I really feel about this topic? I think that I can only work out the answer to this question by writing about it.
My suspicion is that those who shout loudest about personalised medicine know least about it. I fear that the promises being made publicly are categorically not possible. My hope is that I am wrong on this.
Continue reading “Personalised Medicine – A statistical theory approach”
Apparently, it’s that time again. I just gave my second invited keynote at a conference at Charité Berlin. It was really fun.
The audience were dentists – academic dentists. I confess that I struggled to understand why they thought I would be a good fit for their conference. My previous keynote was at the BIH Digital Health Forum – a much more obviously appropriate audience. But, perhaps strangely, the fit was very good.
Continue reading “Keynote @ Charité Berlin”