Epistatic Features and Machine Learning Improve Alzheimer’s Risk Prediction Over Polygenic Risk Scores

Stephen Hermes, Janet Cady, Steven Armentrout, James O’Connor, Sarah Carlson, Carlos Cruchaga, Thomas Wingo, Ellen McRae Greytak, The Alzheimer’s Disease Neuroimaging Initiative
doi: https://doi.org/10.1101/2023.02.10.23285766
Stephen Hermes
1Parabon NanoLabs, Inc., Reston, Virginia, USA
Janet Cady
1Parabon NanoLabs, Inc., Reston, Virginia, USA
Steven Armentrout
1Parabon NanoLabs, Inc., Reston, Virginia, USA
James O’Connor
1Parabon NanoLabs, Inc., Reston, Virginia, USA
Sarah Carlson
1Parabon NanoLabs, Inc., Reston, Virginia, USA
Carlos Cruchaga
2Department of Psychiatry, Washington University, St. Louis, MO, USA
3Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University St. Louis, MO, USA
Thomas Wingo
4Goizueta Alzheimer’s Disease Center, Emory University School of Medicine, Atlanta, GA, USA
5Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA
6Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
Ellen McRae Greytak
1Parabon NanoLabs, Inc., Reston, Virginia, USA
  • For correspondence: ellen{at}parabon.com

Abstract

Background Polygenic risk scores (PRS) are linear combinations of genetic markers weighted by effect size that are commonly used to predict disease risk. For complex heritable diseases such as late-onset Alzheimer’s disease (LOAD), PRS models fail to capture much of the heritability. Additionally, PRS models depend heavily on the population structure of the data on which effect sizes are estimated, and generalize poorly to new data.
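The definition of a PRS as a weighted linear combination can be made concrete with a minimal sketch. The function name, the three effect sizes, and the genotype below are hypothetical illustrations, not values or code from the study; the sketch assumes each marker is encoded as an allele dosage (0, 1, or 2 copies of the effect allele).

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], effect_sizes: Sequence[float]) -> float:
    """Return the weighted sum of allele dosages, one weight per marker."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical per-marker effect sizes (e.g. log odds ratios); not from the study.
effect_sizes = [0.5, -0.25, 1.0]
# Hypothetical genotype for one individual, as effect-allele counts per marker.
dosages = [2, 1, 0]

score = polygenic_risk_score(dosages, effect_sizes)
print(score)  # 0.75
```

Because the score is a plain sum, each marker contributes independently of every other marker, which is why a model of this form cannot represent interactions between loci.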

Objective The goal of this study is to construct a paragenic risk score that, in addition to single genetic marker data used in PRS, incorporates epistatic interaction features and machine learning methods to predict lifetime risk for LOAD.

Methods We construct a new state-of-the-art genetic model for lifetime risk of Alzheimer’s disease. Our approach innovates over PRS models in two ways: First, by directly incorporating epistatic interactions between SNP loci using an evolutionary algorithm guided by shared pathway information; and second, by estimating risk via an ensemble of machine learning models (gradient boosting machines and deep learning) instead of simple logistic regression. We compare the paragenic model to a PRS model from the literature trained on the same dataset.
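As a rough illustration of the two ideas named above, the sketch below appends pairwise interaction features to single-marker dosages and averages the outputs of several models. It rests on stated assumptions: the abstract does not say how interactions are encoded or how ensemble members are combined, so the product encoding and the simple mean are placeholders, and the SNP pairs are hard-coded here rather than selected by an evolutionary algorithm. All names are hypothetical.

```python
from typing import List, Sequence, Tuple


def add_interaction_features(
    dosages: Sequence[int], pairs: Sequence[Tuple[int, int]]
) -> List[int]:
    """Append one feature per SNP pair (assumed encoding: product of dosages)."""
    interactions = [dosages[i] * dosages[j] for i, j in pairs]
    return list(dosages) + interactions


def ensemble_probability(model_probs: Sequence[float]) -> float:
    """Combine per-model risk estimates (assumed rule: unweighted mean)."""
    if not model_probs:
        raise ValueError("at least one model probability is required")
    return sum(model_probs) / len(model_probs)


# Hypothetical genotype over four SNPs and two hypothetical interacting pairs.
dosages = [2, 1, 0, 1]
pairs = [(0, 1), (2, 3)]
features = add_interaction_features(dosages, pairs)
print(features)  # [2, 1, 0, 1, 2, 0]

# Hypothetical outputs from two already-trained models for this individual.
risk = ensemble_probability([0.75, 0.25])
print(risk)  # 0.5
```

The interaction columns let a downstream model assign weight to a combination of genotypes that differs from the sum of the two single-marker effects.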

Results The paragenic model is significantly more accurate than the PRS model under 10-fold cross-validation, obtaining an AUC of 83% and near-clinically significant matched sensitivity/specificity of 75%, and remains significantly more accurate when evaluated on an independent holdout dataset. Additionally, the paragenic model maintains accuracy within APOE genotypes.
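The two reported metrics can be illustrated with a small sketch. It computes AUC as the fraction of case/control pairs in which the case receives the higher score (ties count half), and finds the decision threshold at which sensitivity and specificity are closest to equal. The scores and labels are toy values, not study data, and the threshold search is one assumed way of obtaining a matched operating point; the abstract does not describe the authors' procedure.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case outscores a random control (ties = 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0 for c in cases for k in controls
    )
    return wins / (len(cases) * len(controls))


def matched_sens_spec(
    scores: Sequence[float], labels: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) minimizing |sens - spec|."""
    n_cases = sum(1 for y in labels if y == 1)
    n_controls = len(labels) - n_cases
    candidates: List[Tuple[float, float, float, float]] = []
    for t in sorted(set(scores)):
        # Predict "case" when the score is at or above the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens, spec = tp / n_cases, tn / n_controls
        candidates.append((abs(sens - spec), t, sens, spec))
    _, t, sens, spec = min(candidates)
    return t, sens, spec


# Toy risk scores and true labels (1 = case, 0 = control); not study data.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

print(auc(scores, labels))                # 0.875
print(matched_sens_spec(scores, labels))  # (0.6, 0.75, 0.75)
```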

Conclusion Paragenic models show potential for improving lifetime disease risk prediction for complex heritable diseases such as LOAD over PRS models.

Competing Interest Statement

SH, JC, SA, JO, SC, and EG are employees of Parabon NanoLabs, Inc. CC has received research support from: GSK and EISAI. The funders of the study had no role in the collection, analysis, or interpretation of data; in the writing of the report; or in the decision to submit the paper for publication. CC is a member of the advisory board of Vivid Genomics and Circular Genomics. TW is a co-founder of revXon.

Funding Statement

Research reported in this publication was supported by the National Institute on Aging of the National Institutes of Health under award number 2R44AG050366-02. Participant recruitment at Emory was supported in part by awards P30 AG066511 and R01 AG070937.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Advarra IRB gave ethical approval for this work.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • † Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf

  • Funding statement corrected

  • https://neurogenomics.wustl.edu/

Data Availability

The data supporting the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted March 15, 2023.

Subject Area

  • Genetic and Genomic Medicine