Aesclea - Our Oncology Foundation Model

It's no secret that ML and AI are going to transform healthcare, but building the models to do so is incredibly difficult. Unlike 'easier' domains, such as self-driving cars or coding, where the action space is clearly constrained and the input signal is (almost) fully observable, the medical domain is unbounded and only partially observed. Here, the signal about a patient is severely undersampled: we extract certain observations about what is actually happening internally, such as blood tests and vitals, but miss a whole host of information, such as genomics or a detailed log of the patient's history (did they live with asbestos in 1967?). The analogy in those 'easier' domains would be training a self-driving car that only sees a narrow, constrained view and, for most of the time, doesn't know its own speed or heading; or training a coding agent that never sees the full code files. Which, as I'm sure you can understand, would be incredibly difficult.

So if this is so hard, why even bother? Because we believe this is where the real value of AI lies, not in replacing low-skilled workers, which is where it is typically being applied today. But medical prediction is a much harder problem, for two reasons. First, as innovators it requires us to reimagine what is physically possible, without the guiding light of existing human performance to show us the way. Second, for the reason above, the data often simply isn't in a format amenable to modelling right now; but we're getting there, and there are now methods to extract data that we can use to great effect.

Something that has enabled a step change towards this is the advent of LLMs, which now make it possible to extract key events and values from a patient's clinical notes at large scale. Is this perfect? No: information is not always in the notes, and we need to know a priori what data to look for. But it lets us leverage predictive features that until now were simply not available at this scale. A typical extraction of a patient's events looks something like the sequence below, and can be produced for less than $0.01.

EVENT_DATE,EVENT_TOKEN
1972-07-03 00:00:00,DEMO_SEX_MALE
2018-03-12 09:00:00,ADM_Outpatient
2018-03-12 00:00:00,DIAG_I10
2019-11-01 00:00:00,DIAG_R91
2019-11-01 00:00:00,LAB_PDL1_POSITIVE_60PCT
2019-11-01 00:00:00,LAB_EGFR_L858R
2023-01-15 08:00:00,PROC_BRONCHOSCOPY
2023-01-15 00:00:00,PATHOLOGY_NSCLC_ADENOCARCINOMA
2023-01-15 14:23:00,ADM_Routine/Elective
2023-02-10 00:00:00,MED_TYROSINE_KINASE_INHIBITOR
2023-02-10 07:10:00,ADM_Start_Treatment
2023-06-05 00:00:00,DIAG_R63
2023-06-05 00:00:00,MED_ANTIEMETIC
2023-06-05 10:44:00,ADM_Followup
2024-03-22 00:00:00,PROC_LOBECTOMY_RIGHT
2024-03-22 06:50:00,ADM_Surgery
2024-09-14 00:00:00,LAB_TMB_HIGH
2024-09-14 00:00:00,MED_IMMUNOTHERAPY
2024-09-14 09:00:00,ADM_Infusion_Center
2024-09-18 09:00:00,CENSORED
(this is simulated data, not PHI)
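To make the format concrete, here's a minimal sketch of how an extraction like the one above can be turned into a model-ready token sequence. The rows, helper names, and the idea of interleaving day-gap tokens are illustrative assumptions on my part, not our actual pipeline:

```python
import csv
import io
from datetime import datetime

# A few rows in the same shape as the extraction above (simulated, not PHI).
RAW = """EVENT_DATE,EVENT_TOKEN
2018-03-12 00:00:00,DIAG_I10
2019-11-01 00:00:00,LAB_EGFR_L858R
2023-02-10 00:00:00,MED_TYROSINE_KINASE_INHIBITOR
"""

def parse_events(raw_csv: str) -> list[tuple[datetime, str]]:
    """Parse (EVENT_DATE, EVENT_TOKEN) rows into a time-ordered event sequence."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    events = [
        (datetime.strptime(row["EVENT_DATE"], "%Y-%m-%d %H:%M:%S"), row["EVENT_TOKEN"])
        for row in reader
    ]
    return sorted(events, key=lambda e: e[0])

def with_day_gaps(events: list[tuple[datetime, str]]) -> list[str]:
    """Interleave inter-event gaps (in days) -- one common way to expose
    elapsed time to a sequence model alongside the event tokens themselves."""
    out: list[str] = []
    prev = None
    for ts, tok in events:
        if prev is not None:
            out.append(f"GAP_{(ts - prev).days}D")
        out.append(tok)
        prev = ts
    return out

seq = with_day_gaps(parse_events(RAW))
```

The resulting flat token sequence (events separated by explicit time-gap tokens) is the kind of input an autoregressive sequence model can consume directly.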

So how do we model this? Well, sequence modelling problems have been worked on for a very long time (with moderate success) until a very popular architecture called the transformer emerged in 2017. The transformer is nearly eight years old now, but despite this, its impact in healthcare is only just starting to be felt. At its heart, it's a fantastic architecture for modelling autoregressive sequences (to which language is particularly well suited), but perhaps of greater impact is its ability to model the sequences of key medical events that occur over a patient's lifespan.

It's the combination of large-scale data extraction and the computational power of these models that enables a new era of predictive modelling of patient outcomes. Why is the transformer such a powerful model? Because it is excellent at dynamically modelling pairwise interactions between events and then systematically composing them into complex interdependencies across many events. We're starting to see models such as Delphi-2M and SleepFM show impressive performance at predicting outcomes such as mortality or disease incidence. What is even more powerful is these models' natural ability to predict a whole range of potential outcomes, not just single events such as mortality.
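That pairwise-then-compose intuition can be made concrete. Self-attention scores every pair of positions in the sequence, normalises the scores, and mixes the representations accordingly. A toy pure-Python sketch (single head, identity projections, no causal mask, made-up event vectors — an illustration of the mechanism, not our model):

```python
import math

def softmax(xs: list[float]) -> list[float]:
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(x: list[list[float]]) -> list[list[float]]:
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Every event scores every other event via a pairwise dot product, then
    each output is a softmax-weighted mix of all the event vectors.
    """
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

# Three toy "event" embeddings: the first two are similar, the third is not,
# so the first output mixes mostly with the first two events.
events = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
mixed = self_attention(events)
```

A real autoregressive model adds learned projections, many heads and layers, and a causal mask so each event only attends to its past; stacking layers is what composes the pairwise scores into the higher-order interdependencies mentioned above.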

This is the big swing we're taking at Radical: building a single unified model of the entire history of oncology patients, one that encapsulates the relationships between key events across a patient's lifespan and then simulates what outcomes for that patient look like. Doing so is incredibly valuable: it lets patients make more informed decisions about their care, flags when patients should seek extra testing, and surfaces any other medical outcome directly modelled in the data. This is an inflection point in what's becoming possible: the architectures are now powerful enough to model these long-range dependencies and compose them into robust predictions, and, just as essentially, hospitals are opening up their data silos to make these predictions happen. At Radical we're partnering with top US centres, giving us data from 11m patients and billions of temporal events to power our unified Aesclea model.
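"Simulating what outcomes look like" here means autoregressive rollouts: repeatedly sample the next event from the model's predicted distribution, run many trajectories, and read off outcome frequencies. The sketch below uses a hard-coded stub in place of a trained model — all the tokens, probabilities, and function names are invented for illustration:

```python
import random
from collections import Counter

def next_event_distribution(history: list[str]) -> dict[str, float]:
    # Stub standing in for a trained model's next-event head; a real model
    # would condition on the full tokenised patient history.
    if history and history[-1] == "MED_IMMUNOTHERAPY":
        return {"RESPONSE_PARTIAL": 0.5, "RESPONSE_COMPLETE": 0.2, "PROGRESSION": 0.3}
    return {"MED_IMMUNOTHERAPY": 0.6, "PROGRESSION": 0.4}

def sample(dist: dict[str, float], rng: random.Random) -> str:
    # Draw one event from a categorical distribution.
    r = rng.random()
    acc = 0.0
    for event, p in dist.items():
        acc += p
        if r < acc:
            return event
    return event  # guard against floating-point round-off

def rollout(history: list[str], steps: int, rng: random.Random) -> list[str]:
    # Extend a patient history by sampling one event at a time.
    h = list(history)
    for _ in range(steps):
        h.append(sample(next_event_distribution(h), rng))
    return h

# Estimate outcome frequencies over many simulated futures.
rng = random.Random(0)
history = ["DIAG_C34", "PROC_BRONCHOSCOPY"]
outcomes = Counter(rollout(history, steps=2, rng=rng)[-1] for _ in range(10_000))
```

The Counter of terminal events is a Monte Carlo estimate of the outcome distribution, which is exactly the "range of potential outcomes, not just single events" that makes these models so useful.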

The model is still cooking, and we'll share results very soon, so watch this space! But if you're interested in what we're building, want to brainstorm, or simply fancy a chat, please reach out to us here; we'd be really keen to talk to you.