Skip to main content
medRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search

Using genetic information to define idiopathic pulmonary fibrosis in UK Biobank

View ORCID ProfileOlivia C Leavy, View ORCID ProfileRichard J Allen, View ORCID ProfileLuke M Kraven, Ann Morgan, View ORCID ProfileMartin D Tobin, View ORCID ProfileJennifer K Quint, View ORCID ProfileR Gisli Jenkins, View ORCID ProfileLouise V Wain
doi: https://doi.org/10.1101/2022.04.01.22273306
Olivia C Leavy
1Department of Health Sciences, University of Leicester, Leicester, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Olivia C Leavy
Richard J Allen
1Department of Health Sciences, University of Leicester, Leicester, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Richard J Allen
Luke M Kraven
1Department of Health Sciences, University of Leicester, Leicester, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Luke M Kraven
Ann Morgan
2National Heart and Lung Institute, Imperial College London, London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Martin D Tobin
1Department of Health Sciences, University of Leicester, Leicester, UK
3National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Martin D Tobin
Jennifer K Quint
2National Heart and Lung Institute, Imperial College London, London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Jennifer K Quint
R Gisli Jenkins
2National Heart and Lung Institute, Imperial College London, London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for R Gisli Jenkins
Louise V Wain
1Department of Health Sciences, University of Leicester, Leicester, UK
3National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Louise V Wain
  • For correspondence: lvw1{at}leicester.ac.uk
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Supplementary material
  • Data/Code
  • Preview PDF
Loading

Abstract

Introduction Idiopathic pulmonary fibrosis (IPF) is a rare lung disease characterised by progressive scarring in the alveoli. IPF can be defined in population studies using electronic healthcare records (EHR) but recent genetic studies of IPF using EHR have shown an attenuation of effect size for known genetic risk factors when compared to clinically-derived datasets, suggesting misclassification of cases.

Methods We used EHR (ICD-10, Read (2 & 3)) and questionnaire data to define IPF cases in UK Biobank, and evaluated these definitions using association results for the largest genetic risk variant for IPF (rs35705950-T, MUC5B). We further evaluated the impact of exclusions based on co-occurring codes for non-IPF pulmonary fibrosis and restricting codes according to changes in diagnostic practice.

Results Odds ratio (OR) estimates for rs35705950-T associations with IPF defined using EHR and questionnaire data in UK Biobank were significant and ranged from 2.06 to 3.09 which was lower than those reported using clinically-derived IPF datasets (95% confidence intervals: 3.74, 6.66). Code-based exclusions of cases gave slightly closer effect estimates to those previously reported, but sample sizes were substantially reduced.

Discussion We show that none of the UK Biobank IPF codes replicate the effect size for the association of rs35705950-T on IPF risk when using clinically-derived IPF datasets. Further code-based exclusions also did not lead to effect estimates closer to those expected. Whilst the apparent increased sample sizes available for IPF from general population cohorts may be of benefit, future studies should take these limitations of the case definition into account.

What is already known on this topic UK Biobank is a very large prospective cohort that can be utilised to increase sample sizes for studies of rare diseases such as idiopathic pulmonary fibrosis (IPF). However, effect size estimates for genetic risk factors for IPF in UK Biobank and other general population cohorts, when defining cases using electronic healthcare records (EHR), are smaller than those estimated from clinically-derived IPF datasets.

What this study adds Using Hospital Episode Statistics (HES) data, primary care data, death registry data and self-report data in UK Biobank, we used the association rs35705950-T, the largest genetic risk factor for IPF, to evaluate code-based definitions of IPF. We show that none of the available IPF coding replicates the effect size for rs35705950-T on IPF risk that is observed in clinically-derived IPF datasets.

How this study might affect research, practice or policy Research using large general population cohorts and datasets for observational studies of IPF should take these limitations of EHR definitions of IPF into consideration.

Competing Interest Statement

LVW reports current and recent research funding from GSK, Genentech and Orion Pharma, and consultancy for Galapagos. RGJ is a trustee of Action for Pulmonary Fibrosis and reports personal fees from Astra Zeneca, Biogen, Boehringer Ingelheim, Bristol Myers Squibb, Chiesi, Daewoong, Galapagos, Galecto, GlaxoSmithKline, Heptares, NuMedii, PatientMPower, Pliant, Promedior, Redx, Resolution Therapeutics, Roche, Veracyte and Vicore. JKQ has received grants from The Health Foundation, MRC, GSK, Bayer, BI, AUK-BLF, HDR UK, Chiesi and AZ and personal fees for advisory board participation or speaking fees from GlaxoSmithKline, Boehringer Ingelheim, AstraZeneca, Chiesi, Insmed and Bayer.

Funding Statement

LVW holds a GSK/Asthma+Lung UK Chair in Respiratory Research (C17-1). LVW and RGJ are supported by MRC Programme grant MR/V00235X/1. RJA is an Action for Pulmonary Fibrosis Research Fellow. RGJ is supported by a National Institute for Health Research (NIHR) Research Professorship (NIHR reference RP-2017-08-ST2-014). LMK was funded by a Medical Research Council (MRC) PhD studentship (MR/N013913/1). MDT is supported by a Wellcome Trust Investigator Award (WT202849/Z/16/Z). The research was partially supported by the NIHR Leicester Biomedical Research Centre; the views expressed are those of the author(s) and not necessarily those of the National Health Service (NHS), the NIHR, or the Department of Health. This research has been conducted using the UK Biobank Resource under application 77050. This research used the SPECTRE High Performance Computing Facility at the University of Leicester.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

UK Biobank has approval by the Research Ethics Committee (REC) under approval number 16/NW/0274.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

UK Biobank data is publicly accessible upon approval of an application through www.ukbiobank.ac.uk.

https://www.ukbiobank.ac.uk/

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Back to top
PreviousNext
Posted April 01, 2022.
Download PDF

Supplementary Material

Data/Code
Email

Thank you for your interest in spreading the word about medRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
Using genetic information to define idiopathic pulmonary fibrosis in UK Biobank
(Your Name) has forwarded a page to you from medRxiv
(Your Name) thought you would like to see this page from the medRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
Using genetic information to define idiopathic pulmonary fibrosis in UK Biobank
Olivia C Leavy, Richard J Allen, Luke M Kraven, Ann Morgan, Martin D Tobin, Jennifer K Quint, R Gisli Jenkins, Louise V Wain
medRxiv 2022.04.01.22273306; doi: https://doi.org/10.1101/2022.04.01.22273306
Twitter logo Facebook logo LinkedIn logo Mendeley logo
Citation Tools
Using genetic information to define idiopathic pulmonary fibrosis in UK Biobank
Olivia C Leavy, Richard J Allen, Luke M Kraven, Ann Morgan, Martin D Tobin, Jennifer K Quint, R Gisli Jenkins, Louise V Wain
medRxiv 2022.04.01.22273306; doi: https://doi.org/10.1101/2022.04.01.22273306

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • Respiratory Medicine
Subject Areas
All Articles
  • Addiction Medicine (349)
  • Allergy and Immunology (668)
  • Allergy and Immunology (668)
  • Anesthesia (181)
  • Cardiovascular Medicine (2648)
  • Dentistry and Oral Medicine (316)
  • Dermatology (223)
  • Emergency Medicine (399)
  • Endocrinology (including Diabetes Mellitus and Metabolic Disease) (942)
  • Epidemiology (12228)
  • Forensic Medicine (10)
  • Gastroenterology (759)
  • Genetic and Genomic Medicine (4103)
  • Geriatric Medicine (387)
  • Health Economics (680)
  • Health Informatics (2657)
  • Health Policy (1005)
  • Health Systems and Quality Improvement (985)
  • Hematology (363)
  • HIV/AIDS (851)
  • Infectious Diseases (except HIV/AIDS) (13695)
  • Intensive Care and Critical Care Medicine (797)
  • Medical Education (399)
  • Medical Ethics (109)
  • Nephrology (436)
  • Neurology (3882)
  • Nursing (209)
  • Nutrition (577)
  • Obstetrics and Gynecology (739)
  • Occupational and Environmental Health (695)
  • Oncology (2030)
  • Ophthalmology (585)
  • Orthopedics (240)
  • Otolaryngology (306)
  • Pain Medicine (250)
  • Palliative Medicine (75)
  • Pathology (473)
  • Pediatrics (1115)
  • Pharmacology and Therapeutics (466)
  • Primary Care Research (452)
  • Psychiatry and Clinical Psychology (3432)
  • Public and Global Health (6527)
  • Radiology and Imaging (1403)
  • Rehabilitation Medicine and Physical Therapy (814)
  • Respiratory Medicine (871)
  • Rheumatology (409)
  • Sexual and Reproductive Health (410)
  • Sports Medicine (342)
  • Surgery (448)
  • Toxicology (53)
  • Transplantation (185)
  • Urology (165)