medRxiv

Cluster analysis of transcriptomic datasets to identify endotypes of Idiopathic Pulmonary Fibrosis

Luke M Kraven, Adam R. Taylor, Philip L. Molyneaux, Toby M. Maher, John E. McDonough, Marco Mura, Ivana V. Yang, David A. Shwartz, Yong Huang, Imre Noth, Shwu-Fan Ma, Astrid J. Yeo, William A. Fahy, R. Gisli Jenkins, Louise V. Wain
doi: https://doi.org/10.1101/2021.07.16.21260633
Luke M Kraven
1Department of Health Sciences, University of Leicester, Leicester, United Kingdom
2Research & Development, GlaxoSmithKline, Stevenage, United Kingdom
Adam R. Taylor
2Research & Development, GlaxoSmithKline, Stevenage, United Kingdom
Philip L. Molyneaux
3National Institute for Health Research Respiratory Clinical Research Facility, Royal Brompton Hospital, London, United Kingdom
4National Heart and Lung Institute, Imperial College, London, United Kingdom
Toby M. Maher
3National Institute for Health Research Respiratory Clinical Research Facility, Royal Brompton Hospital, London, United Kingdom
4National Heart and Lung Institute, Imperial College, London, United Kingdom
5Keck School of Medicine, University of Southern California, Los Angeles, California, USA
John E. McDonough
6Division of Pulmonary, Critical Care & Sleep Medicine, Yale School of Medicine, New Haven, CT, USA
Marco Mura
7Division of Respirology, Western University, London, ON, Canada
Ivana V. Yang
8Department of Medicine, University of Colorado, Aurora, Colorado, USA
David A. Shwartz
8Department of Medicine, University of Colorado, Aurora, Colorado, USA
Yong Huang
9Division of Pulmonary & Critical Care Medicine, University of Virginia, Charlottesville, Virginia, USA
Imre Noth
9Division of Pulmonary & Critical Care Medicine, University of Virginia, Charlottesville, Virginia, USA
Shwu-Fan Ma
9Division of Pulmonary & Critical Care Medicine, University of Virginia, Charlottesville, Virginia, USA
Astrid J. Yeo
2Research & Development, GlaxoSmithKline, Stevenage, United Kingdom
William A. Fahy
2Research & Development, GlaxoSmithKline, Stevenage, United Kingdom
R. Gisli Jenkins
3National Institute for Health Research Respiratory Clinical Research Facility, Royal Brompton Hospital, London, United Kingdom
4National Heart and Lung Institute, Imperial College, London, United Kingdom
Louise V. Wain
1Department of Health Sciences, University of Leicester, Leicester, United Kingdom
10National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, United Kingdom
  • For correspondence: lvw1{at}le.ac.uk

ABSTRACT

Background Considerable clinical heterogeneity in Idiopathic Pulmonary Fibrosis (IPF) suggests the existence of multiple disease endotypes. Identifying these endotypes could allow for a biomarker-driven personalised medicine approach in IPF. We aimed to improve our understanding of the pathogenesis of IPF by identifying clinically distinct groups of patients with IPF that could represent distinct disease endotypes.

Methods We co-normalised, pooled and clustered three publicly available blood transcriptomic datasets (total 220 IPF cases). We compared clinical traits across clusters and used gene enrichment analysis to identify biological pathways and processes that were over-represented among the genes that were differentially expressed across clusters. A gene-based classifier was developed and validated using three additional independent datasets (total 194 IPF cases).
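
The co-normalise, pool and cluster steps can be illustrated with a minimal sketch. The abstract does not name the co-normalisation or clustering algorithms used, so the sketch substitutes two simple stand-ins: per-dataset z-scoring of each gene (which assumes the datasets have broadly similar patient mixes) and k-means with a deterministic farthest-point start. The toy expression values, the three-gene layout and all function names are hypothetical.

```python
from statistics import mean, pstdev

def standardise_per_dataset(dataset):
    """Z-score each gene (column) within one dataset (rows = samples).
    A simple stand-in for co-normalisation, not the authors' method."""
    columns = list(zip(*dataset))
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return [[(x - m) / s for x, (m, s) in zip(row, stats)] for row in dataset]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(samples, k):
    """Deterministic start: first sample, then repeatedly the sample
    farthest from the centroids chosen so far."""
    centroids = [samples[0]]
    while len(centroids) < k:
        centroids.append(
            max(samples, key=lambda s: min(squared_distance(s, c) for c in centroids))
        )
    return centroids

def kmeans(samples, k, n_iter=100):
    """Plain k-means (Lloyd's algorithm); returns one cluster label per sample."""
    centroids = farthest_point_init(samples, k)
    labels = []
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: squared_distance(s, centroids[c]))
            for s in samples
        ]
        updated = []
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            updated.append([mean(col) for col in zip(*members)] if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

# Hypothetical toy data: 3 genes, two samples from each of three groups per dataset.
# dataset_b is on a different scale and offset, mimicking a platform/batch effect.
dataset_a = [
    [5.1, 1.0, 0.9], [4.9, 1.1, 1.0],
    [1.0, 5.2, 1.1], [0.9, 4.8, 1.0],
    [1.1, 0.9, 5.0], [1.0, 1.0, 4.9],
]
dataset_b = [
    [25.3, 13.0, 13.3], [24.7, 12.7, 13.0],
    [13.0, 25.0, 12.7], [13.3, 25.6, 13.0],
    [12.7, 13.3, 24.4], [13.0, 13.0, 25.0],
]

# Co-normalise each dataset separately, then pool and cluster into three groups.
pooled = standardise_per_dataset(dataset_a) + standardise_per_dataset(dataset_b)
labels = kmeans(pooled, k=3)
print(labels)
```

In this toy setting, samples from the same underlying group receive the same cluster label regardless of which dataset they came from, which is the property that pooling after co-normalisation is meant to deliver.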

Findings We identified three clusters of IPF patients with statistically significant differences in lung function (P=0·009) and mortality (P=0·009) between groups. Gene enrichment analysis implicated dysregulation of mitochondrial homeostasis, apoptosis, cell cycle and innate and adaptive immunity in the pathogenesis underlying these groups. We developed and validated a 13-gene cluster classifier that predicted mortality in IPF (high-risk clusters vs low-risk cluster: hazard ratio=4·25, 95% confidence interval=[2·14, 8·46], P=3·7×10⁻⁵).
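
As a reading aid, the reported hazard ratio, confidence interval and P value can be checked against one another. The sketch below assumes a Wald-type interval that is symmetric on the log-hazard scale, as is usual for Cox regression output; the abstract does not state how the interval was computed, so this is a consistency check rather than a reproduction of the analysis.

```python
import math
from statistics import NormalDist

# Values reported in the Findings paragraph.
hazard_ratio = 4.25
ci_lower, ci_upper = 2.14, 8.46

# Assumption: 95% Wald interval, exp(log(HR) +/- z_crit * SE).
z_crit = NormalDist().inv_cdf(0.975)

# Standard error of log(HR), recovered from the width of the interval on the log scale.
se_log_hr = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)

# Wald z statistic and two-sided P value under a standard normal reference.
z_stat = math.log(hazard_ratio) / se_log_hr
p_value = math.erfc(abs(z_stat) / math.sqrt(2))

# For a log-symmetric interval, the point estimate is the geometric mean of the bounds.
ci_midpoint = math.sqrt(ci_lower * ci_upper)

print(f"SE(log HR) = {se_log_hr:.3f}, z = {z_stat:.2f}, P = {p_value:.2e}")
print(f"geometric mean of CI bounds = {ci_midpoint:.2f}")
```

Under that assumption the recovered P value is close to the reported 3·7×10⁻⁵, and the geometric mean of the interval bounds is close to the reported hazard ratio of 4·25, so the three figures are mutually consistent.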

Interpretation We have identified blood gene expression signatures capable of discerning groups of IPF patients with significant differences in survival. These clusters could be representative of distinct pathophysiological states, which would support the theory of multiple endotypes of IPF. Although more work must be done to confirm the existence of these endotypes, our classifier could be a useful tool in patient stratification and outcome prediction in IPF.

Funding L.V.W. holds a GSK/British Lung Foundation Chair in Respiratory Research (C17-1). R.G.J. is supported by a National Institute for Health Research (NIHR) Research Professorship (NIHR reference RP-2017-08-ST2-014). P.L.M. is supported by an Action for Pulmonary Fibrosis Mike Bray fellowship. T.M. Maher is supported by a National Institute for Health Research Clinician Scientist Fellowship (CS-2013-13-017) and a British Lung Foundation Chair in Respiratory Research (C17-3). I.N. is supported by a National Heart, Lung, and Blood Institute (NHLBI) grant (R01HL145266). D.A.S. is supported by NHLBI grants (UG3HL151865, R01HL097163, P01HL092870, X01HL134585 and UH3HL123442) and a United States Department of Defense grant (W81XWH-17-1-0597). The GSE110147 study was supported by the Roche Multi Organ Transplant Academic Enrichment Fund, Lawson Research Institute Internal Research Fund and Western Strategic Support for CIHR Success, Seed Grant. The research was partially supported by the NIHR Leicester Biomedical Research Centre; the views expressed are those of the author(s) and not necessarily those of the National Health Service (NHS), the NIHR, or the Department of Health.

Evidence before this study We searched PubMed Central in February 2020 with the search terms “idiopathic pulmonary fibrosis”, “gene expression” and “cluster analysis” with no restrictions on publication date or language. Previous transcriptomic cluster analyses have found that differences in gene expression can be used to predict disease status, severity and outcome in IPF. A transcriptomic prognostic biomarker has also been developed that can predict outcome in IPF using blood expression data from 52 genes.

Added value of this study By utilising new methods of data co-normalisation and machine learning, we were able to combine multiple publicly available datasets and perform one of the largest transcriptomic studies in IPF to date, with a total of 416 IPF cases across the discovery and validation stages. We identified three clusters of patients, one of which appeared to contain, on average, the healthiest subjects with favourable lung function and survival over time. These clusters were defined using expression from groups of genes that were significantly enriched for many different biological pathways and processes, including metabolic changes, apoptosis, cell cycle and immune response, and so could be representative of distinct pathophysiological states. Additionally, we developed a 13-gene expression-based classifier to assign individuals with IPF to one of the clusters and validated this classifier using three additional independent cohorts of IPF patients (totalling 194 IPF cases). As the clusters were associated with survival, our classifier could potentially be used to predict outcome in IPF.

Implications of all the available evidence Our findings support the hypothesis that the disease consists of multiple endotypes. The clusters identified in this study could provide some valuable insight into the underlying biological processes that may be driving the considerable clinical heterogeneity in IPF. With further development, our gene expression-based classifier could be a useful tool for patient stratification and outcome prediction in IPF.

Competing Interest Statement

The authors have declared no competing interest.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Individuals used in this study come from a previously reported study with appropriate institutional ethics approval at each study centre. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

The data that support the findings of this study are openly available via the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/).

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE38958

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE33566

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE93606

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132607

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE27957

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28042

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted November 19, 2021.

Subject Area

  • Respiratory Medicine