A Bayesian Model for Prediction of Rheumatoid Arthritis from Risk Factors

Leon Lufkin, Marko Budišić, Sumona Mondal, Shantanu Sur
doi: https://doi.org/10.1101/2020.07.09.20150326
Leon Lufkin
1 The Clarkson School, Clarkson University, Potsdam, NY 13699, US
Marko Budišić
2 Department of Mathematics, Clarkson University, Potsdam, NY 13699, US
Sumona Mondal
2 Department of Mathematics, Clarkson University, Potsdam, NY 13699, US
Shantanu Sur
3 Department of Biology, Clarkson University, Potsdam, NY 13699, US
  • For correspondence: ssur{at}clarkson.edu

ABSTRACT

Rheumatoid arthritis (RA) is a chronic autoimmune disorder that typically manifests as destructive joint inflammation but also affects multiple other organ systems. The pathogenesis of RA is complex: a variety of factors, including comorbidities and demographic and socioeconomic variables, are known to influence the incidence and progression of the disease. In this work, we aimed to predict RA from a set of 11 well-known risk factors and their interactions using Bayesian logistic regression. We considered up to third-order interactions between the risk factors and implemented factor analysis of mixed data (FAMD) to account for both the continuous and categorical nature of these variables. The predictive model was further optimized over the area under the receiver operating characteristic curve (AUC) using a genetic algorithm (GA). We used data from the National Health and Nutrition Examination Survey (NHANES). Our optimal predictive model has a smoothed AUC of 0.826 (95% CI: 0.801–0.850) on a validation dataset and 0.805 (95% CI: 0.781–0.829) on a holdout test dataset. Our model identified multiple second- and third-order interactions that demonstrate a strong association with RA, implying a potential role for risk factor interactions in the disease mechanism. Interestingly, we found that including higher-order interactions in the model only marginally improves overall predictive ability. Our findings on the contribution of RA risk factors and their interactions to disease prediction could be useful in developing strategies for early diagnosis of RA, thus opening potential avenues for improved patient outcomes and reduced healthcare burden on society.
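
To give a sense of the size of the candidate feature space described above, the following is a minimal illustrative sketch, not the authors' pipeline. It enumerates main effects and interactions up to third order for 11 placeholder factors and computes an empirical (unsmoothed) AUC by pairwise comparison. The factor names `x1`…`x11`, the helper functions, and the use of plain numeric products are assumptions made for illustration. The study itself applies FAMD to mixed continuous and categorical variables, fits a Bayesian logistic regression, and selects terms with a genetic algorithm, none of which is reproduced here.

```python
from itertools import combinations
from math import prod


def interaction_terms(names, max_order=3):
    """Return every combination of distinct factor names up to max_order."""
    terms = []
    for order in range(1, max_order + 1):
        terms.extend(combinations(names, order))
    return terms


def expand_row(row, terms):
    """Build one feature per term as the product of its factor values."""
    return [prod(row[name] for name in term) for term in terms]


def auc(labels, scores):
    """Empirical AUC: chance that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical placeholder names for the 11 risk factors.
factors = [f"x{i}" for i in range(1, 12)]
terms = interaction_terms(factors)
print(len(terms))  # 11 + 55 + 165 = 231 candidate terms
```

With 11 numeric factors this gives 231 candidate terms, which suggests why a search procedure such as a genetic algorithm can be useful for choosing a subset. Categorical factors expanded into indicator columns would change this count.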

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

No external funding was received for this work.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The study was conducted using publicly available datasets from the National Health and Nutrition Examination Survey (NHANES) and no IRB approval was required to use the data.

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

Datasets used in this work are publicly available for download from the NHANES website:

https://wwwn.cdc.gov/nchs/nhanes/

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted July 11, 2020.
A Bayesian Model for Prediction of Rheumatoid Arthritis from Risk Factors
Leon Lufkin, Marko Budišić, Sumona Mondal, Shantanu Sur
medRxiv 2020.07.09.20150326; doi: https://doi.org/10.1101/2020.07.09.20150326
Subject Area

  • Rheumatology