medRxiv

Deep Learning Utilizing Suboptimal Spirometry Data to Improve Lung Function and Mortality Prediction in the UK Biobank

Davin Hill, Max Torop, Aria Masoomi, Peter J. Castaldi, Edwin K. Silverman, Sandeep Bodduluri, Surya P. Bhatt, Taedong Yun, Cory Y. McLean, Farhad Hormozdiari, Jennifer Dy, Michael H. Cho, Brian D. Hobbs
doi: https://doi.org/10.1101/2023.04.28.23289178
Davin Hill
1 Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
2 Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
Max Torop
1 Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
Aria Masoomi
1 Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
Peter J. Castaldi
2 Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
3 Division of General Medicine and Primary Care, Brigham and Women’s Hospital, Boston, MA, USA
4 Harvard Medical School, Boston, MA, USA
Edwin K. Silverman
2 Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
4 Harvard Medical School, Boston, MA, USA
5 Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, MA, USA
Sandeep Bodduluri
6 Division of Pulmonary, Allergy and Critical Care Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
Surya P. Bhatt
6 Division of Pulmonary, Allergy and Critical Care Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
Taedong Yun
7 Google Research, Cambridge, MA, USA
Cory Y. McLean
7 Google Research, Cambridge, MA, USA
Farhad Hormozdiari
7 Google Research, Cambridge, MA, USA
Jennifer Dy
1 Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
For correspondence: jdy{at}ece.neu.edu remhc{at}channing.harvard.edu
Michael H. Cho
2 Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
4 Harvard Medical School, Boston, MA, USA
5 Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, MA, USA
For correspondence: jdy{at}ece.neu.edu remhc{at}channing.harvard.edu
Brian D. Hobbs
2 Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
4 Harvard Medical School, Boston, MA, USA
5 Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, MA, USA

Abstract

Background Spirometry measures lung function by selecting the best of multiple efforts that meet pre-specified quality control (QC) criteria and reporting two key metrics: forced expiratory volume in 1 second (FEV1) and forced vital capacity (FVC). We hypothesize that discarded submaximal and QC-failing efforts contribute meaningfully to the prediction of airflow obstruction and all-cause mortality.
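As a rough illustration of the two metrics named above, the following Python sketch reads FEV1 and FVC off a single sampled volume-time curve. The function name `fev1_fvc`, the 10 ms sampling interval, and the synthetic exponential curve are assumptions made for this example and are not taken from the study. Clinical spirometry software also back-extrapolates the start of exhalation, a step omitted here.

```python
import math


def fev1_fvc(volume_l, dt_s):
    """Return (FEV1, FVC, FEV1/FVC) from a cumulative exhaled-volume curve.

    volume_l: exhaled volume in litres, sampled every dt_s seconds from t = 0.
    Simplification (assumption): t = 0 is taken as the first sample.
    """
    if not volume_l or dt_s <= 0:
        raise ValueError("need a non-empty curve and a positive sampling interval")
    fvc = max(volume_l)  # largest volume exhaled during the effort
    if fvc <= 0:
        raise ValueError("curve contains no exhaled volume")
    pos = 1.0 / dt_s  # fractional sample index that corresponds to t = 1 s
    lo = int(pos)
    if lo >= len(volume_l) - 1:
        # Effort ended at or before 1 s: fall back to the last sample.
        fev1 = volume_l[-1]
    else:
        # Linear interpolation between the two samples around t = 1 s.
        frac = pos - lo
        fev1 = volume_l[lo] + frac * (volume_l[lo + 1] - volume_l[lo])
    return fev1, fvc, fev1 / fvc


# Synthetic 6-second effort (hypothetical): V(t) = 4 * (1 - exp(-t / 0.5)) litres.
DT = 0.01
example_curve = [4.0 * (1.0 - math.exp(-(i * DT) / 0.5)) for i in range(601)]
example_fev1, example_fvc, example_ratio = fev1_fvc(example_curve, DT)
```

In this toy curve the ratio is well above 0.7, so the effort would not be flagged by the FEV1/FVC threshold used later in the abstract.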

Methods We evaluated volume-time spirometry data from the UK Biobank. We identified “best” spirometry efforts as those passing QC with the maximum FVC. “Discarded” efforts were either submaximal or failed QC. To create a combined representation of lung function, we implemented a contrastive learning approach, Spirogram-based Contrastive Learning Framework (Spiro-CLF), which used all recorded volume-time curves per participant and applied different transformations (e.g., flow-volume, flow-time). In a held-out 20% testing subset, we applied the Spiro-CLF representation of a participant’s overall lung function to 1) binary predictions of FEV1/FVC < 0.7 and FEV1 percent predicted (FEV1PP) < 80%, indicative of airflow obstruction, and 2) Cox regression for all-cause mortality.
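The effort labelling and curve transformations described in the Methods can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Spiro-CLF implementation: the function names, the dictionary keys `volume` and `passed_qc`, and the finite-difference estimate of flow are all hypothetical choices made for this example.

```python
def flow_time(volume_l, dt_s):
    """Flow (L/s) between consecutive samples, by finite differences."""
    return [(b - a) / dt_s for a, b in zip(volume_l, volume_l[1:])]


def flow_volume(volume_l, dt_s):
    """(exhaled volume, flow) pairs, i.e. flow expressed against volume."""
    return list(zip(volume_l[1:], flow_time(volume_l, dt_s)))


def split_efforts(efforts):
    """Return (best_index, discarded_indices) for one participant.

    efforts: list of dicts with keys 'volume' (list of litres) and
    'passed_qc' (bool). Both key names are assumptions for this sketch.
    The best effort is the QC-passing effort with the largest FVC;
    every other effort (submaximal or QC-failing) counts as discarded.
    """
    passing = [i for i, e in enumerate(efforts) if e["passed_qc"]]
    if not passing:
        raise ValueError("participant has no QC-passing effort")
    best = max(passing, key=lambda i: max(efforts[i]["volume"]))
    discarded = [i for i in range(len(efforts)) if i != best]
    return best, discarded


# Toy participant with three efforts (made-up values).
toy_efforts = [
    {"volume": [0.0, 1.0, 2.0, 3.0], "passed_qc": True},   # submaximal
    {"volume": [0.0, 2.0, 3.5, 4.0], "passed_qc": False},  # failed QC
    {"volume": [0.0, 1.5, 3.0, 3.6], "passed_qc": True},   # best
]
toy_best, toy_discarded = split_efforts(toy_efforts)
```

Note that the QC-failing effort in the toy data has the largest volume but is still labelled as discarded, which matches the definition of “best” given above.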

Findings We included 940,705 volume-time curves from 352,684 UK Biobank participants with 2-3 spirometry efforts per individual (66.7% with 3 efforts) and at least one QC-passing spirometry effort. Of all spirometry efforts, 24.1% failed QC and 37.5% were submaximal. Spiro-CLF prediction of FEV1/FVC < 0.7 using discarded spirometry efforts had an area under the receiver operating characteristic curve (AUROC) of 0.981 (0.863 for FEV1PP prediction). Incorporating discarded spirometry efforts in all-cause mortality prediction was associated with a concordance index (c-index) of 0.654, which exceeded the c-indices from FEV1 (0.590), FVC (0.559), or FEV1/FVC (0.599) from each participant’s single best effort.
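For readers less familiar with the two metrics reported in the Findings, the sketch below computes them directly from their pairwise definitions in plain Python. It assumes Harrell's form of the c-index and the rank-based (Mann-Whitney) form of the AUROC; the abstract does not state which estimators or software the authors used, and the function names here are hypothetical.

```python
def auroc(labels, scores):
    """Probability that a random positive scores above a random negative.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risk):
    """Harrell's c-index for right-censored survival data.

    A pair (i, j) is comparable when i had an observed event and
    times[i] < times[j]. It is concordant when the earlier event has
    the higher predicted risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Both functions return 0.5 for uninformative predictions and 1.0 for perfect ranking, which gives a scale for reading the reported values of 0.981 and 0.654. The quadratic loops are fine for small examples but would be too slow at UK Biobank scale.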

Interpretation A contrastive learning model using raw spirometry curves can accurately predict lung function from submaximal and QC-failing efforts. This model also predicts all-cause mortality better than standard lung function measurements.

Funding MHC is supported by NIH R01HL137927, R01HL135142, HL147148, and HL089856.

BDH is supported by NIH K08HL136928, U01 HL089856, and an Alpha-1 Foundation Research Grant.

DH is supported by NIH 2T32HL007427-41.

EKS is supported by NIH R01 HL152728, R01 HL147148, U01 HL089856, R01 HL133135, P01 HL132825, and P01 HL114501.

PJC is supported by NIH R01HL124233 and R01HL147326.

SPB is supported by NIH R01HL151421 and UH3HL155806.

TY, FH, and CYM are employees of Google LLC.

Competing Interest Statement

BDH receives grant support from Bayer. MHC has received grant support from GlaxoSmithKline and Bayer, consulting fees from Genentech and AstraZeneca, and speaking fees from Illumina. EKS has received grant support from GlaxoSmithKline and Bayer. PJC has received grant support from Bayer. SPB has received consulting fees from Sanofi/Regeneron and Boehringer Ingelheim, and CME fees from IntegrityCE. His institute has received funds from Sanofi and Nuvaira for the conduct of clinical trials. TY, FH, and CYM are employees of Google LLC and own Alphabet stock.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This study used only publicly available data from the UK Biobank Resource.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

This research has been conducted using the UK Biobank Resource under application number 20915. Feature representations and models produced in the present study are available upon reasonable request to the authors.

https://github.com/davinhill/Spiro-CLF

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted April 29, 2023.
Subject Area

  • Respiratory Medicine