medRxiv

GenAI Exceeds Clinical Experts in Predicting Acute Kidney Injury following Paediatric Cardiopulmonary Bypass

Mansour Sharabiani, Alireza Mahani, Alex Bottle, Yadav Srinivasan, Richard Issitt, Serban Stoica
doi: https://doi.org/10.1101/2024.05.14.24307372
Mansour Sharabiani
School of Public Health, Imperial College London, UK
  • For correspondence: mt5605{at}imperial.ac.uk
Alireza Mahani
Statman Solution Ltd., 128 City Road, London, EC1V 2NX, UK
  • For correspondence: statman{at}statmansolution.com
Alex Bottle
School of Public Health, Imperial College London, UK
Yadav Srinivasan
Cardiac Surgery Department, Great Ormond Street Hospital for Children, London, UK
Richard Issitt
Perfusion Department, Great Ormond Street Hospital for Children, London, UK
Serban Stoica
Cardiac Surgery Department, Bristol Royal Children’s Hospital, Bristol, UK

Abstract

The emergence of large language models (LLMs) offers new opportunities to leverage the often-unused information in clinical text. This study examines the utility of LLM-generated text embeddings for predicting postoperative acute kidney injury (AKI) in paediatric cardiopulmonary bypass (CPB) patients from electronic health record (EHR) text, and explores methods for explaining their output. AKI is a significant complication in paediatric CPB, and predicting it can markedly improve patient outcomes by enabling timely interventions. We evaluate a range of text embedding algorithms, including Doc2Vec, top-performing sentence transformers on Hugging Face, and commercial LLMs from Google and OpenAI. We benchmark the out-of-sample predictive performance of these ‘AI models’ against a ‘baseline model’ and an established, clinically defined ‘expert model’. The baseline model includes patient gender, age, height, body mass index and length of operation. The majority of AI models surpass not only the baseline model but also the expert model. An ensemble of AI and clinical-expert models improves discriminative performance by nearly 23% compared to the baseline model. The consistency of patient clusters formed from AI-generated embeddings with clinical-expert clusters, measured via the adjusted Rand index and adjusted mutual information metrics, illustrates their medical validity. We use text-generating LLMs to explain the output of embedding LLMs, e.g., by summarising the differences between AI and expert clusters and/or by providing descriptive labels for the AI clusters. Such ‘explainability’ can increase medical practitioners’ trust in AI applications and help generate new hypotheses, e.g., by correlating cluster memberships with outcomes of interest.
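
To make the benchmarking idea concrete, the following is a minimal sketch of how discriminative performance (AUC) might be compared between a baseline model and an ensemble. It is not the authors' code: the labels and risk scores are made-up toy values, the ensemble is assumed to be a simple average of predicted risks, and "improvement" is assumed to mean relative change in AUC. The function names are hypothetical.

```python
"""Toy illustration: AUC, a score-averaging ensemble, and relative AUC change.

All data below are invented for illustration; none come from the study.
"""


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble_scores(*model_scores):
    """Assumed ensemble rule: average the predicted risks patient by patient."""
    return [sum(vals) / len(vals) for vals in zip(*model_scores)]


def relative_improvement(new_auc, base_auc):
    """Assumed definition of improvement: relative change in AUC."""
    return (new_auc - base_auc) / base_auc


# Toy outcome labels (1 = AKI, 0 = no AKI) and toy predicted risks.
labels = [0, 0, 1, 0, 1, 1]
baseline = [0.2, 0.6, 0.4, 0.5, 0.7, 0.3]
ai_model = [0.1, 0.4, 0.6, 0.5, 0.8, 0.45]
expert = [0.3, 0.2, 0.5, 0.6, 0.9, 0.7]

combined = ensemble_scores(ai_model, expert)
base_auc = auc(labels, baseline)
ens_auc = auc(labels, combined)

print(f"baseline AUC: {base_auc:.3f}")
print(f"ensemble AUC: {ens_auc:.3f}")
print(f"relative change: {relative_improvement(ens_auc, base_auc):+.1%}")
```

In practice the AUC values would be estimated out of sample, for example with cross-validation, rather than on the data used to fit the models.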

Highlights

  • LLMs outperform clinical experts in predicting risk of AKI after paediatric CPB.

  • LLMs generate clinically plausible explanations and hypotheses using embeddings.

  • Successful application of LLMs in paediatric CPB suggests potential in other specialised fields.

  • Fine-tuning LLMs on domain data and forming ensembles of AI and clinical experts may boost accuracy.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study did not receive any funding.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Ethics committee of Great Ormond Street Hospital for Children, London gave ethical approval for this work (audit number 3045).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • Abbreviations: AKI = Acute Kidney Injury; CPB = Cardiopulmonary Bypass; KDIGO = Kidney Disease Improving Global Outcomes; BDG = Broad Diagnosis Grouping; TSP = Transformed Specific Procedure

  • Added sentence transformers and Google LLMs to the list of benchmarked embedding algorithms; expanded the explainability section.

Data Availability

All data produced in the present study are available upon reasonable request to the authors.

  • 10. Glossary

    Acute Kidney Injury (AKI)
    A sudden decrease in kidney function, often occurring after surgery, particularly in paediatric patients undergoing cardiopulmonary bypass (CPB).
    Adjusted Mutual Information (AMI)
    A measure of agreement between two clusterings, adjusted for chance, based on the mutual information between the clusterings.
    Adjusted Rand Index (ARI)
    A metric used to measure the similarity between two data clusterings, adjusted for the chance grouping of elements.
    Area Under the Receiver Operating Characteristic Curve (AUC)
    A performance measurement for classification models at various threshold settings, indicating the ability of the model to distinguish between classes.
    Bag-of-Codes (BoC)
    A text embedding technique where each medical code in a patient’s record is represented as a binary indicator in a vector.
    Cardiopulmonary Bypass (CPB)
    A technique used during heart surgery where a machine temporarily takes over the function of the heart and lungs, allowing surgeons to operate on a still heart.
    Cross-Validation (CV)
    A statistical method used to estimate the performance of machine learning models, where the data is split into multiple folds, and the model is trained and validated on different folds.
    Doc2Vec
    A text embedding technique that learns distributed representations of documents, allowing for the transformation of entire documents into fixed-length vectors.
    Ensemble Model
    A machine learning technique that combines the predictions of multiple models to improve accuracy and robustness.
    Explainability
    Techniques used to interpret and understand the predictions made by complex machine learning models, often to increase trust and provide insights into the decision-making process.
    Fine-Tuning
    The process of adjusting a pre-trained model on a new dataset, typically with a smaller learning rate, to adapt the model to a specific task or domain.
    Hyperparameters
    Parameters of a machine learning model that are set before training and control the learning process, such as the number of clusters in k-means or the learning rate in neural networks.
    KDIGO
    Kidney Disease Improving Global Outcomes; a set of guidelines used to define and classify the severity of acute kidney injury.
    Large Language Models (LLMs)
    Advanced machine learning models, often based on transformer architectures, that are trained on vast amounts of text data and can perform a variety of natural language processing tasks.
    Partial Risk Adjustment in Surgery (PRAiS)
    A model used in the UK to predict 30-day mortality risk after paediatric heart surgery, incorporating various clinical variables.
    Spherical K-Means
    A variant of the k-means clustering algorithm that uses cosine distance instead of Euclidean distance, making it suitable for clustering high-dimensional data like text embeddings.
    Text Embedding
    A method of converting text into numeric vectors that capture the semantic meaning of the text, used in machine learning models for various predictive tasks.
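
Two of the glossary terms, Bag-of-Codes and the Adjusted Rand Index, can be illustrated with a short sketch. This is an assumption-laden toy example, not the study's implementation: the medical codes (`DX_A`, `PROC_X`, and so on) and cluster labels are invented, and the function names are hypothetical. The ARI is computed with the standard pair-counting formula over the contingency table of the two clusterings.

```python
from collections import Counter
from math import comb


def bag_of_codes(records):
    """Encode each patient's codes as a binary indicator vector.

    Returns the sorted vocabulary and one 0/1 vector per record.
    """
    vocabulary = sorted({code for record in records for code in record})
    vectors = []
    for record in records:
        present = set(record)
        vectors.append([1 if code in present else 0 for code in vocabulary])
    return vocabulary, vectors


def adjusted_rand_index(labels_a, labels_b):
    """Chance-adjusted agreement between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: trivially identical
    # Pairs of items placed together in both clusterings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each clustering separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both put everything in one cluster
    return (together_both - expected) / (maximum - expected)


# Hypothetical patient records, each a list of made-up codes.
records = [["DX_A", "PROC_X"], ["DX_B"], ["DX_A", "PROC_Y"]]
vocabulary, vectors = bag_of_codes(records)
print(vocabulary)
print(vectors)

# Hypothetical cluster assignments for six patients.
ai_clusters = [0, 0, 1, 1, 2, 2]
expert_clusters = ["x", "x", "y", "y", "y", "z"]
print(round(adjusted_rand_index(ai_clusters, expert_clusters), 3))
```

The ARI depends only on which patients are grouped together, not on the cluster names, so relabelled but otherwise identical clusterings score 1.0.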
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
    Posted September 02, 2024.