medRxiv

Evaluating the quality of prostate cancer diagnosis recording in routinely collected primary care data for observational research: A study using multiple linked English electronic health records databases

Gayasha Somathilake, Elizabeth Ford, Jo Armes, Sotiris Moschoyiannis, Michelle Collins, Patrick Francsics, Agnieszka Lemanska
doi: https://doi.org/10.1101/2024.08.21.24312333
Gayasha Somathilake
1School of Health Sciences, Faculty of Health and Medical Sciences, University of Surrey, UK
  • For correspondence: g.somathilake{at}surrey.ac.uk
Elizabeth Ford
2Department of Primary Care and Public Health, Brighton and Sussex Medical School, UK
Jo Armes
1School of Health Sciences, Faculty of Health and Medical Sciences, University of Surrey, UK
Sotiris Moschoyiannis
3Computer Science Research Centre, Faculty of Engineering and Physical Sciences, University of Surrey, UK
Michelle Collins
4Royal Surrey County Hospital, Guildford, UK
Patrick Francsics
4Royal Surrey County Hospital, Guildford, UK
Agnieszka Lemanska
1School of Health Sciences, Faculty of Health and Medical Sciences, University of Surrey, UK
5Data Science, National Physical Laboratory, Teddington, UK

Abstract

Background Primary care data in the UK are widely used for cancer research, but the reliability of recording key events such as diagnoses remains uncertain. Data linkage can mitigate these uncertainties; however, researchers may avoid linkage due to high costs, tight timelines, and sample size limitations. Hence, this study aimed to assess the quality of prostate cancer (PCa) diagnoses in primary care. We utilised Clinical Practice Research Datalink (CPRD) primary care data linked to National Cancer Registration and Analysis Service (NCRAS) and Hospital Episode Statistics (HES) in England. We compared accuracy, completeness, and timing of diagnosis recording between sources to facilitate decision-making regarding data source selection for future research.

Methods Incident PCa diagnoses (2000-2016) for males aged ≥46 years recorded in at least one study data source were examined. The accuracy of a data source was estimated as the proportion of diagnoses recorded in that source that were also confirmed by any linked source. Completeness was estimated as the proportion of all diagnoses in the linked sources that had a matching diagnosis in that source.
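
The two definitions above can be illustrated with a minimal sketch, assuming each data source has been reduced to a set of patient identifiers with an incident PCa diagnosis. The function names and toy identifiers are hypothetical; this is not the authors' code or data.

```python
def accuracy(source, linked):
    """Proportion of diagnoses in `source` confirmed by at least one linked source."""
    confirmed_by_any = set().union(*linked)
    return len(source & confirmed_by_any) / len(source)


def completeness(source, linked):
    """Proportion of all diagnoses in the linked sources that also appear in `source`."""
    all_linked = set().union(*linked)
    return len(source & all_linked) / len(all_linked)


# Toy patient identifiers (hypothetical, for illustration only)
cprd = {1, 2, 3, 4}
ncras = {2, 3, 4, 5, 6}
hes = {3, 4, 6, 7}

print(accuracy(cprd, [ncras, hes]))      # 3 of 4 CPRD cases confirmed -> 0.75
print(completeness(cprd, [ncras, hes]))  # 3 of 6 linked cases found in CPRD -> 0.5
```

Passing a single linked source, for example `accuracy(cprd, [ncras])`, gives the pairwise comparison, while passing both gives the overall figure.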

Results The study included 51,487 patients with a PCa diagnosis in at least one data source. CPRD demonstrated 86.9% accuracy and 68.2% completeness against NCRAS, and 75.1% accuracy and 61.1% completeness against HES. Overall, CPRD showed the highest accuracy (93%) but the lowest completeness (60.7%). Diagnosis dates in CPRD were more concordant with NCRAS (90.6% within 6 months) than with HES (61.2%). Over time, accuracy and completeness improved, especially after 2004. Comparison of diagnosis dates showed a median delay of 2 weeks in CPRD relative to NCRAS and 1 week relative to HES. CPRD Aurum exhibited better quality than GOLD.
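
As a rough illustration of the date comparison, the sketch below computes the share of paired diagnosis dates that fall within 6 months of each other and the median delay. The abstract does not state how "within 6 months" was operationalised, so the 183-day window, the function name, and the toy dates are assumptions.

```python
from datetime import date
from statistics import median

SIX_MONTHS_DAYS = 183  # assumption: "6 months" approximated as 183 days


def date_concordance(pairs):
    """pairs: (cprd_date, reference_date) for patients present in both sources.

    Returns the proportion of pairs within the window and the median
    difference in days (positive means CPRD recorded the diagnosis later).
    """
    diffs = [(cprd - ref).days for cprd, ref in pairs]
    within = sum(abs(d) <= SIX_MONTHS_DAYS for d in diffs) / len(diffs)
    return within, median(diffs)


# Toy dates (hypothetical, for illustration only)
pairs = [
    (date(2010, 3, 15), date(2010, 3, 1)),   # CPRD 14 days later
    (date(2012, 6, 1), date(2012, 6, 1)),    # same day
    (date(2014, 1, 10), date(2013, 1, 10)),  # CPRD 365 days later
    (date(2015, 5, 8), date(2015, 4, 17)),   # CPRD 21 days later
]

within, median_delay = date_concordance(pairs)
print(within)        # 3 of 4 pairs within 183 days -> 0.75
print(median_delay)  # median of [0, 14, 21, 365] -> 17.5
```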

Conclusions While the accuracy of PCa diagnoses in CPRD compared to linked sources was high, completeness was low. Therefore, linking to HES or NCRAS should be considered for improved case capture, acknowledging their inherent limitations.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This research was funded by the University of Surrey, UK, as part of the doctoral studentship awarded to Gayasha Somathilake.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The study was approved by the Medicines and Healthcare Products Regulatory Agency (MHRA) Independent Scientific Advisory Committee (protocol number 19_050R).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

The data utilised in this study were obtained from the CPRD, facilitated by the UK MHRA. However, the authors' license for using these data does not permit the sharing of raw data with third parties. For information regarding access to CPRD data, interested parties may refer to the following link: Research applications | CPRD.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted August 21, 2024.
Subject Area

  • Primary Care Research