medRxiv

Machine Learning Prediction of Progression in FEV1 in the COPDGene Study

Adel Boueiz, Zhonghui Xu, Yale Chang, Aria Masoomi, Andrew Gregory, Sharon M. Lutz, Dandi Qiao, James D. Crapo, Jennifer G. Dy, Edwin K. Silverman, Peter J. Castaldi, for the COPDGene investigators
doi: https://doi.org/10.1101/2022.01.10.22268804
Adel Boueiz
1Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
2Pulmonary and Critical Care Division, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
  • For correspondence: adel.boueiz{at}channing.harvard.edu
Zhonghui Xu
1Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
Yale Chang
3Department of Electrical and Computer Engineering, Northeastern University, Boston, MA
Aria Masoomi
3Department of Electrical and Computer Engineering, Northeastern University, Boston, MA
Andrew Gregory
1Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
Sharon M. Lutz
4Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, MA
Dandi Qiao
1Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
James D. Crapo
5Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, CO
Jennifer G. Dy
3Department of Electrical and Computer Engineering, Northeastern University, Boston, MA
Edwin K. Silverman
1Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
2Pulmonary and Critical Care Division, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
Peter J. Castaldi
1Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
6Division of General Medicine and Primary Care, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA

ABSTRACT

Background The heterogeneous nature of COPD complicates the identification of the predictors of disease progression and consequently the development of effective therapies. We aimed to improve the prediction of disease progression in COPD by using machine learning and incorporating a rich dataset of phenotypic features.

Methods We included 4,496 smokers with available data from their enrollment and 5-year follow-up visits in the Genetic Epidemiology of COPD (COPDGene) study. We constructed supervised random forest models to predict 5-year progression in FEV1 from 46 baseline demographic, clinical, physiologic, and imaging features. Using cross-validation, we randomly partitioned participants into training and testing samples. We also validated the results in the COPDGene 10-year follow-up visit.
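The random partitioning step described above can be sketched as a k-fold split. This is a minimal illustration, not the authors' code: the abstract does not state the number of folds, so `k=5`, the `seed`, and the helper name `k_fold_splits` are assumptions, and the random forest model itself is left out.

```python
import random


def k_fold_splits(participant_ids, k=5, seed=0):
    """Yield (train_ids, test_ids) pairs; each participant is tested exactly once.

    k and seed are illustrative assumptions, not values from the abstract.
    """
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)        # random, reproducible ordering
    folds = [ids[i::k] for i in range(k)]   # k roughly equal-sized folds
    for i in range(k):
        test_ids = folds[i]
        train_ids = [pid for j, fold in enumerate(folds) if j != i for pid in fold]
        yield train_ids, test_ids


if __name__ == "__main__":
    n_participants = 4496  # cohort size reported in the abstract
    for fold_no, (train, test) in enumerate(k_fold_splits(range(n_participants)), start=1):
        print(f"fold {fold_no}: {len(train)} training, {len(test)} testing")
```

In each fold, a model would be fit on the training participants only and evaluated on the held-out testing participants, so that no participant contributes to both fitting and evaluation in the same fold.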

Results Predicting the change in FEV1 over time is more challenging than predicting the future absolute FEV1 level. Nevertheless, the area under the ROC curve for predicting subjects in the top quartile of observed disease progression was 0.70 in the 10-year follow-up data. Model accuracy was best for GOLD 1-2 subjects, and accurate prediction was harder to achieve in advanced stages of the disease. Predictive variables differed in their relative importance, and that importance also varied by GOLD grade.
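To make the top-quartile AUC metric concrete, the sketch below labels the participants with the largest observed decline as positives and scores a set of predictions against those labels. It rests on assumptions: decline is expressed as a positive number (for example, mL lost per year), the quartile cutoff uses no interpolation, the function names are hypothetical, and the eight data points are invented for illustration rather than taken from the study.

```python
def top_quartile_labels(observed_decline):
    """Return 1 for values at or above the 75th-percentile cutoff, else 0.

    Assumes larger values mean faster progression. With tied values at the
    cutoff, more than a quarter of participants can be labeled positive.
    """
    ranked = sorted(observed_decline)
    cutoff = ranked[(3 * len(ranked)) // 4]  # simple cutoff, no interpolation
    return [1 if d >= cutoff else 0 for d in observed_decline]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented toy values (not study data): observed and predicted decline.
    observed = [10, 20, 30, 40, 50, 60, 70, 80]
    predicted = [15, 25, 20, 45, 75, 50, 60, 90]
    labels = top_quartile_labels(observed)
    print(f"AUC for top-quartile progression: {roc_auc(labels, predicted):.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives context for the reported value of 0.70.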

Conclusion This state-of-the-art approach, along with deep phenotyping, predicts FEV1 progression with reasonable accuracy, although there is significant room for improvement in future models. This prediction model facilitates the identification of smokers at increased risk for rapid disease progression. Such findings may be useful in the selection of patient populations for targeted clinical trials.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work was supported by NHLBI K08 HL141601, R01 HL124233, R01 HL126596, R01 HL147326, U01 HL089897, and U01 HL089856. The COPDGene study (NCT00608764) is also supported by the COPD Foundation through contributions made to an Industry Advisory Board that has included AstraZeneca, Bayer Pharmaceuticals, Boehringer-Ingelheim, Genentech, GlaxoSmithKline, Novartis, Pfizer, and Sunovion.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Institutional review board (IRB) approval was obtained. IRB Protocol Title: Genetic Epidemiology of COPD. IRB Protocol Number: Brigham and Women's Hospital / 2007P000554

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Footnotes

  • Authors’ email addresses: Adel Boueiz (adel.boueiz{at}channing.harvard.edu), Zhonghui Xu (zhonghui.xu{at}channing.harvard.edu), Yale Chang (changyalee{at}gmail.com), Aria Masoomi (masoomi.a{at}northeastern.edu), Andrew Gregory (andrew.gregory{at}channing.harvard.edu), Sharon M. Lutz (sharon.m.lutz{at}gmail.com), Dandi Qiao (dandi.qiao{at}channing.harvard.edu), James D. Crapo (CrapoJ{at}njhealth.org), Jennifer G. Dy (jdy{at}ece.neu.edu), Edwin K. Silverman (ed.silverman{at}channing.harvard.edu), Peter J. Castaldi (peter.castaldi{at}channing.harvard.edu)

  • Supplement: This article has an online data supplement, which is accessible from this issue’s table of contents online.

  • Notation of prior abstract publication/presentation: Boueiz A, Chang Y, Cho MH, DeMeo DL, Dy J, Silverman EK, Castaldi PJ. Machine Learning Prediction of 5-Year Progression of FEV1 in the COPDGene Study. [abstract]. American Journal of Respiratory and Critical Care Medicine 2018;197(A7430).

Data Availability

All data produced in the present study are available upon reasonable request to the authors.

ABBREVIATION LIST

  • AUC: Area under the curve
  • BMI: Body mass index
  • COPD: Chronic Obstructive Pulmonary Disease
  • COPDGene study: Genetic Epidemiology of COPD study
  • ΔFEV1: Annualized five-year changes in FEV1
  • FEV1: Forced expiratory volume in one second
  • FVC: Forced vital capacity
  • GOLD: Global Initiative for Chronic Obstructive Lung Disease spirometric grading system
  • HU: Hounsfield units
  • IQR: Interquartile range
  • %LAA-950: Percent of CT scan low attenuation area below -950 HU at end-inspiration
  • MMRC: Modified Medical Research Council
  • NHW: Non-Hispanic White
  • RF: Random forest
  • RMSE: Root mean squared error
  • ROC: Receiver operating characteristic
  • SGRQ: St. George’s Respiratory Questionnaire

Copyright

The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

Posted January 11, 2022.
    Subject Area

    • Respiratory Medicine