PT - JOURNAL ARTICLE
AU - Bershan, Sivan
AU - Meisel, Andreas
AU - Mergenthaler, Philipp
TI - Classifying the risk for myasthenic crisis using data-driven explainable machine learning with informative feature design and variance control – a pilot study
AID - 10.1101/2023.08.19.23294175
DP - 2023 Jan 01
TA - medRxiv
PG - 2023.08.19.23294175
4099 - http://medrxiv.org/content/early/2023/08/21/2023.08.19.23294175.short
4100 - http://medrxiv.org/content/early/2023/08/21/2023.08.19.23294175.full
AB - Importance: Myasthenic crisis (MC) is a critical progression of Myasthenia gravis (MG), requiring intensive care treatment and invasive therapies. Classifying patients at high risk for MC facilitates treatment decisions and helps prevent disease progression.
Objective: To test whether machine learning models trained with real-world routine clinical data can aid in precisely identifying MG patients at risk for MC.
Design: This is a pseudo-prospective cohort study of MG patients presenting since January 2010.
Setting: Single center.
Participants: A cohort of 51 MG patients was used for model training based on a defined set of real-world clinical data. The cohort was created from a convenience sample of 13 MC patients matched on sex, five-year age band, antibody status, and thymus pathology with MG patients who had not suffered an MC. Data analyses and model refinements were performed from June 2022 to May 2023.
Exposure: Classification of MG patients to high or low risk for MC using Lasso regression or random forest machine learning models.
Main Outcomes and Measures: The accuracy of the risk classification was assessed by patient.
Results: This study included 51 MG patients (13 MC, 38 non-MC; median age MC group 70.5, non-MC group 65.5).
Classification of MG patients as high or low risk for MC, based on simple or compound features derived from real-world routine clinical data, reached a mean cross-validated AUC of 68.8% for the regularized Lasso regression and 76.5% for the random forest model. Feature importance scores suggest that multimorbidity may play a role in risk classification. Different thresholds were applied to tune model performance to optimal parameters. Studying result stability across 100 runs further indicated that the random forest model was better suited to cope with feature variance. Studying feature importance across 5100 model runs identified explainable features to distinguish MG patients at high or low risk for MC.
Conclusions and Relevance: This study showed the feasibility of classifying risk for MC based on real-world routine clinical data using machine learning. The models showed accurate and consistent performance, indicating the utility of personalized risk assessment in MG patients using machine learning models.
Question: Can machine learning models be used to classify Myasthenia gravis patients into groups at high or low risk for myasthenic crisis with high precision based on explainable data-driven features derived from real-world clinical data?
Findings: In this pseudo-prospective study of 51 Myasthenia gravis patients, the risk of myasthenic crisis using real-world clinical data was accurately classified employing two machine learning models with explainable features.
Meaning: These findings suggest that it is possible to classify the risk for myasthenic crisis in patients based on real-world clinical data with high precision.
Competing Interest Statement: S.B. is co-owner of exago.ml, a geoanalytics-focused machine learning company. A.M. has received speaker honoraria, consulting fees or (institutional) financial research support from Alexion Pharmaceuticals Inc., Argenx, Grifols SA, Hormosan Pharma GmbH, Janssen, Octapharma, and UCB.
He is chairman of the medical advisory board of the German Myasthenia Gravis Society. P.M. has been on the board of HealthNextGen.
Funding Statement: This study did not receive dedicated funding. P.M. is an Einstein Junior Fellow funded by the Einstein Foundation Berlin and acknowledges funding support by the Einstein Foundation Berlin (EJF-2020-602; EVF-2021-619) and the Leducq Foundation for Cardiovascular and Neurovascular Research (Consortium International pour la Recherche Circadienne sur l'AVC). Besides funding, the sponsoring organizations did not play any role in the design and conduct of the consensus meetings; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Ethics committee of Charité - Universitätsmedizin Berlin gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
Feature categories and lists are published as supplement to this manuscript (Supplementary Table 1). Ethical approval currently does not permit sharing of raw data. The analysis code will be made available upon reasonable request.
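
Illustrative note (not part of the original record): the abstract reports a mean cross-validated AUC, and the sketch below shows one common way such a figure could be computed. It is not the authors' analysis code, which the record says is available only on request. The function names (`auc`, `stratified_folds`, `mean_cv_auc`), the choice of 5 folds, and the assumption that each patient's score comes from a model fitted without that patient's fold are all hypothetical. Only the class sizes (13 MC, 38 non-MC) are taken from the abstract.

```python
import random
from statistics import mean


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed=0):
    """Deal the shuffled indices of each class round-robin into k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def mean_cv_auc(labels, scores, k=5, seed=0):
    """Average the per-fold AUCs; scores are assumed to be held-out predictions."""
    folds = stratified_folds(labels, k, seed)
    return mean(auc([labels[i] for i in f], [scores[i] for i in f]) for f in folds)


# Small check of the pairwise definition: 3 of 4 pairs are ranked correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

# Class sizes from the abstract (13 MC, 38 non-MC); the scores are synthetic noise.
labels = [1] * 13 + [0] * 38
rng = random.Random(1)
scores = [rng.random() for _ in labels]
print(mean_cv_auc(labels, scores))
```

With only 13 positive cases, each of the 5 folds holds 2 or 3 MC patients, so per-fold AUCs are coarse. That coarseness is one reason a study of this size might repeat the procedure over many runs, as the abstract describes.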