medRxiv

Patient Phenotyping for Atopic Dermatitis with Transformers and Machine Learning

Andrew Wang, Rachel Fulton, Sy Hwang, David J. Margolis, Danielle L. Mowery
doi: https://doi.org/10.1101/2023.08.25.23294636
Andrew Wang
1 Department of Computer and Information Science, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, PA
BS
Rachel Fulton
2 Lankenau Medical Center, Dermatology Services, Wynnewood, PA
MD
Sy Hwang
3 Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
MS MS
David J. Margolis
4 Department of Dermatology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
5 Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
MD PhD
Danielle L. Mowery
3 Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
4 Department of Dermatology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
PhD MS MS FAMIA
  • For correspondence: dlmowery{at}pennmedicine.upenn.edu

Abstract

Background Atopic dermatitis (AD) is a chronic skin condition that affects millions of people worldwide. Research into the causes and treatment of this disease has great potential to benefit these individuals. However, recruitment for AD clinical trials is a non-trivial task because diagnostic precision and phenotypic definitions vary between clinicians, and because clinicians must spend considerable time finding, recruiting, and enrolling patients as study subjects. Thus, there is a need for automatic and effective patient phenotyping for cohort recruitment.

Objective We present an approach for identifying patients whose electronic health records (EHRs) suggest that they may have AD.

Methods We created a vectorized representation of each patient and trained various supervised machine learning methods to classify whether a patient has AD.
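
The abstract does not say how the patient vectors were built or which features they contain. As a minimal sketch only, assume each clinical note has already been mapped to a fixed-length embedding (for example by a BERT-style encoder), that a patient vector is the mean of that patient's note embeddings, and that a k-nearest-neighbors classifier (one of the methods named in the abbreviations list, swapped in here for simplicity in place of XGBoost) assigns the label. The function names, the pooling choice, and all toy numbers below are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import dist


def mean_pool(note_embeddings):
    """Average a patient's note embeddings into one patient vector (assumed pooling)."""
    n = len(note_embeddings)
    return [sum(column) / n for column in zip(*note_embeddings)]


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 2-dimensional "note embeddings" and an AD label (1 = AD).
patients = {
    "p1": ([[0.9, 0.1], [0.8, 0.2]], 1),
    "p2": ([[0.7, 0.3]], 1),
    "p3": ([[0.1, 0.9], [0.2, 0.8]], 0),
    "p4": ([[0.3, 0.7]], 0),
}

train_vectors = [mean_pool(notes) for notes, _ in patients.values()]
train_labels = [label for _, label in patients.values()]

# Classify an unseen patient from their (hypothetical) note embeddings.
new_patient = mean_pool([[0.85, 0.15], [0.75, 0.25]])
prediction = knn_predict(train_vectors, train_labels, new_patient, k=3)
print(prediction)
```

In practice the embedding step, the pooling strategy, and the classifier would each be chosen and tuned on labeled data; the sketch only illustrates the overall shape of "vectorize each patient, then classify".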

Results The most accurate AD classifier, XGBoost (Extreme Gradient Boosting), achieved a class-balanced accuracy of 0.8036, a precision of 0.8400, and a recall of 0.7500.
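
For readers less familiar with these metrics, the sketch below shows how precision, recall, and class-balanced accuracy follow from true/false positive and negative counts (TP, FP, TN, FN). The abstract does not report a confusion matrix; the counts used here are hypothetical, chosen only because they happen to reproduce the reported figures, and should not be read as the study's actual results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, specificity, and balanced accuracy from counts."""
    precision = tp / (tp + fp)      # fraction of predicted AD cases that are AD
    recall = tp / (tp + fn)         # fraction of true AD cases that were found
    specificity = tn / (tn + fp)    # fraction of non-AD cases correctly rejected
    balanced_accuracy = (recall + specificity) / 2  # mean of per-class recalls
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts (NOT from the paper), for illustration only.
metrics = classification_metrics(tp=21, fp=4, tn=24, fn=7)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Balanced accuracy averages recall over the two classes, so it is not inflated when one class is much more common than the other.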

Conclusions An automated approach for identifying patient cohorts has the potential to accelerate and standardize patient recruitment for AD studies, thereby reducing clinician burden and informing the discovery of better treatment options for AD.

Competing Interest Statement

David J. Margolis is or recently has been a consultant for Pfizer, Leo, and Sanofi with respect to studies of atopic dermatitis and served on an advisory board for the National Eczema Association.

Funding Statement

This study was partially funded by the National Institutes of Health (NIH) National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) P30-AR069589 as part of the Penn Skin Biology and Diseases Resource-Based Center (Core: David J. Margolis, Danielle L. Mowery).

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

IRB of University of Pennsylvania gave ethical approval for this work.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • + First author

  • * Co-last authors

Data Availability

To protect patient privacy, the clinical data is not available.

  • Abbreviations

    AD: atopic dermatitis
    BERT: Bidirectional Encoder Representations from Transformers
    EHR: Electronic Health Records
    ICD: International Classification of Disease
    UKWP: United Kingdom Working Party
    HR: Hanifin and Rajka
    AI: Artificial Intelligence
    NLP: Natural Language Processing
    ML: Machine Learning
    MLP: Multi-layer Perceptron
    ReLU: Rectified Linear Unit
    SGD: Stochastic Gradient Descent
    KNN: K-Nearest Neighbors
    XGBoost: Extreme Gradient Boosting
    AdaBoost: Adaptive Boosting
    SVM: Support Vector Machines
    TP: True Positive
    TN: True Negative
    FP: False Positive
    FN: False Negative
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
    Posted August 28, 2023.

    Subject Area

    • Health Informatics