
An AI Method for Assessing Coding Consistency in a Large Dataset

Stuart J. Nelson, Ying Yin, Yijun Shao, Phillip Ma, Mark S. Tuttle, Qing Zeng-Treitler
doi: https://doi.org/10.1101/2024.01.17.24301268
Stuart J. Nelson
1Biomedical Informatics Center, George Washington University, Washington DC, USA
  • For correspondence: stunelson{at}gwu.edu
Ying Yin
1Biomedical Informatics Center, George Washington University, Washington DC, USA
2Center for Data Science and Outcomes Research, Washington DC VA Medical Center, Washington, DC, USA
Yijun Shao
1Biomedical Informatics Center, George Washington University, Washington DC, USA
2Center for Data Science and Outcomes Research, Washington DC VA Medical Center, Washington, DC, USA
Phillip Ma
1Biomedical Informatics Center, George Washington University, Washington DC, USA
2Center for Data Science and Outcomes Research, Washington DC VA Medical Center, Washington, DC, USA
Mark S. Tuttle
3Apelon, Inc., Hingham, MA, USA
Qing Zeng-Treitler
1Biomedical Informatics Center, George Washington University, Washington DC, USA
2Center for Data Science and Outcomes Research, Washington DC VA Medical Center, Washington, DC, USA

Abstract

Objective We developed a method to assess the consistency of the assignment of ICD codes, using coding performed at a United States health system at the time of the transition from ICD-9CM to ICD-10CM.

Methods Using clusters of equivalent codes derived from the US Centers for Disease Control General Equivalence Mapping (GEM) tables, ICD assignments occurring during the ICD-9CM to ICD-10CM transition were evaluated in EHR data from the US Veterans Administration Central Data Warehouse, using a deep learning model based on 860 covariates. The model was then used to detect abrupt changes across the transition; changes at each VA station were also examined.
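
The abstract does not state how the clusters of equivalent codes were built from the GEM tables. One plausible reading, offered here only as a minimal sketch, is to treat each GEM row as an edge between an ICD-9CM code and an ICD-10CM code and to take the connected components of that graph as clusters. The function name build_clusters and the code labels in toy_pairs are hypothetical placeholders, not real ICD codes and not the authors' implementation.

```python
from collections import defaultdict


def build_clusters(gem_pairs):
    """Group codes into connected components of the mapping graph.

    gem_pairs: iterable of (icd9_code, icd10_code) tuples, one per mapping row.
    Returns a sorted list of clusters; each cluster is a sorted list of
    (code_system, code) tuples.
    NOTE: connected components are an assumption about how clusters are formed.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tag each code with its system so identical strings never collide.
    for icd9, icd10 in gem_pairs:
        union(("ICD9", icd9), ("ICD10", icd10))

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical placeholder labels, not real ICD codes.
toy_pairs = [
    ("9-A", "10-X"),
    ("9-A", "10-Y"),
    ("9-B", "10-Y"),
    ("9-C", "10-Z"),
]
clusters = build_clusters(toy_pairs)
for cluster in clusters:
    print(cluster)
```

In this toy input, 9-A and 9-B end up in the same cluster because both map to 10-Y, while 9-C and 10-Z form a cluster of their own. Counting assignments per cluster rather than per code is what would make frequencies comparable on either side of the transition under this reading.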

Results Many of the 687 most-used code clusters had ICD-10CM assignments that differed greatly from those predicted by the GEM from the codes used in ICD-9CM. Notably, the observed transition patterns varied widely across care locations.

Conclusion Machine learning can model variability across time and across location, enabling an assessment of coding consistency. Expert review is not scalable; a deep learning model applied to a large dataset of EHR records provides an approximation of ground truth.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work was supported by VA HSRD grant 1I21HX003278-01A1, and by AHRQ grant R01 HS28450-01A1.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The IRBs of the Veterans Administration Health Services Research Division and the George Washington University School of Medicine and Health Sciences have determined this research is exempt from review, as it involves deidentified data.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

The data is available through the US Veterans Administration.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted January 17, 2024.
Subject Area

  • Health Informatics