Abstract
Background An accurate estimate of expected survival time helps people near the end of life make informed decisions about their medical care.
Objectives Use advanced machine learning methods to develop an interpretable survival model for older people admitted to residential aged care.
Setting A large Australasian provider of residential aged care services.
Participants All residents aged 65 years and older, admitted for long-term residential care between July 2017 and August 2023.
Sample size 11,944 residents from 40 individual care facilities.
Predictors Age category, gender, predictors related to falls, health status, co-morbidities, cognitive function, mood state, nutritional status, mobility, smoking history, sleep, skin integrity, and continence.
Outcome Probability of survival at all time points post-admission. The final model is calibrated to estimate the probability of survival at 6 months post-admission.
Statistical Analysis Cox Proportional Hazards (CoxPH), Elastic Net (EN), Ridge Regression (RR), Lasso, Gradient Boosting (GB), XGBoost (XGB) and Random Forest (RF) were tested in 20 experiments using different train/test splits at a 90/10 ratio. Model accuracy was evaluated with the Concordance Index (C-index), Harrell’s C-index, dynamic AUROC, Integrated Brier Score (IBS) and calibrated ROC analysis. XGBoost was selected as the optimal model and calibrated for time-specific predictions at 1, 3, 6 and 12 months post-admission using Platt scaling. SHapley Additive exPlanations (SHAP) values from the 6-month model were plotted to demonstrate the global and local effect of specific predictors on survival probabilities.
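As an illustration of the concordance metric named above, the following is a minimal sketch of Harrell's C-index for right-censored data. It is not the study's implementation: the function name `harrell_c_index`, the toy inputs, and the choice to skip pairs with tied observed times are assumptions made for this example.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up time per resident
    events      -- 1 if death was observed, 0 if the resident was censored
    risk_scores -- model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied observed times are skipped
        # Let a be the resident with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # the model ranked the earlier death as higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four residents, one of whom is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
perfect = harrell_c_index(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1])
mixed = harrell_c_index(toy_times, toy_events, [0.9, 0.1, 0.5, 0.7])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-index values should be read.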
Results For predicting survival across all time periods, the GB, XGB and RF ensemble models had the best C-index values of 0.714, 0.712 and 0.712 respectively. We selected the XGB model for further development and calibration and to provide interpretable outputs. When predicting survival at 6 months, the calibrated XGB model had a dynamic AUROC of 0.746 (95% CI 0.744-0.749). For individuals with a 0.2 survival probability (80% risk of death within 6 months) the model had a negative predictive value of 0.74. Increased age, male gender, reduced mobility, poor general health status, elevated pressure ulcer risk, and lack of appetite were identified as the strongest predictors of imminent mortality.
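Platt scaling, the calibration method used for the time-specific predictions, fits a logistic curve that maps raw model scores to probabilities of a binary outcome. The sketch below shows the idea with plain gradient descent on the log loss. It is a simplified illustration rather than the study's code: the function names, the toy scores and labels, and the omission of Platt's target smoothing and of any handling for residents censored before the time horizon are all assumptions.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the mean log loss.

    scores -- raw, uncalibrated model outputs
    labels -- 1 if the outcome occurred by the horizon (e.g. 6 months), else 0
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = _sigmoid(a * s + b)
            grad_a += (p - y) * s  # derivative of the log loss with respect to a
            grad_b += (p - y)      # derivative of the log loss with respect to b
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_predict(score, a, b):
    """Map a raw score to a calibrated probability."""
    return _sigmoid(a * score + b)


# Hypothetical held-out scores and outcomes (not perfectly separable).
toy_scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]
a, b = fit_platt(toy_scores, toy_labels)
p_low = platt_predict(-2.0, a, b)
p_high = platt_predict(2.0, a, b)
```

In practice the curve would be fitted on data not used to train the underlying model, with one fit per prediction horizon.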
Conclusions This study demonstrates the effective application of machine learning in developing a survival model for people admitted to residential aged care. The model has adequate predictive accuracy and confirms clinical intuition about specific mortality risk factors at both the cohort and the individual level. Advancements in explainable AI, as demonstrated in this study, not only improve clinical usability of machine learning models by increasing transparency about how predictions are generated but may also reveal novel clinical insights.
Section 1: What is already known on this topic
Existing models for estimating survival in aged care settings have been primarily based on prognostic indices, which lack the advanced capabilities of machine learning approaches.
There is a notable absence of machine learning and AI tools that make models and their predictions highly interpretable in residential aged care settings, a property that is crucial for clinical decision-making.
Section 2: What this study adds
Our study applies and demonstrates the utility of machine learning models for survival prediction in residential aged care settings, with a focus on six-month survival probabilities.
The study performs extensive experiments using numerous algorithms, and demonstrates how multiple tools can be used in concert to provide personalized and highly interpretable predictions that enable clinicians to discuss care preferences with patients and families in an informed manner.
This research sets a benchmark on how various AI technologies can be integrated with machine learning to offer effective solutions and greater transparency for clinical decision-making in aged care settings specifically, and predictive healthcare analytics more generally.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethics approval for this study was granted by the Aotearoa Research Ethics Committee (formerly New Zealand Ethics Committee, NZEC22_11) and noted by the Human Ethics Committee (Ohu Matatika 2) of Massey University New Zealand.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The data is confidential; however, the predictive model has been made available online with URL links provided in the article.