medRxiv

Predicting Autism Spectrum Disorder: Transformer-Based Deep Learning Ensemble Framework Using Health Administrative & Birth Registry Data

Kevin Dick, Emily Kaczmarek, Robin Ducharme, Alexa C. Bowie, Alysha L.J. Dingwall-Harvey, Heather Howley, Steven Hawken, Mark C. Walker, Christine M. Armour
doi: https://doi.org/10.1101/2024.07.03.24309684
Kevin Dick
1BORN Ontario, Children’s Hospital of Eastern Ontario, Ottawa, Canada
2Prenatal Screening Ontario, Better Outcomes Registry & Network, Ottawa, Canada
3Children’s Hospital of Eastern Ontario Research Institute (CHEO-RI), Ottawa, Canada
Emily Kaczmarek
3Children’s Hospital of Eastern Ontario Research Institute (CHEO-RI), Ottawa, Canada
Robin Ducharme
4Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada
Alexa C. Bowie
4Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada
Alysha L.J. Dingwall-Harvey
3Children’s Hospital of Eastern Ontario Research Institute (CHEO-RI), Ottawa, Canada
4Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada
Heather Howley
1BORN Ontario, Children’s Hospital of Eastern Ontario, Ottawa, Canada
2Prenatal Screening Ontario, Better Outcomes Registry & Network, Ottawa, Canada
3Children’s Hospital of Eastern Ontario Research Institute (CHEO-RI), Ottawa, Canada
Steven Hawken
3Children’s Hospital of Eastern Ontario Research Institute (CHEO-RI), Ottawa, Canada
4Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada
5School of Epidemiology and Public Health, University of Ottawa, Ottawa, Canada
6Department of Obstetrics and Gynecology, University of Ottawa, Ottawa, Canada
7ICES, Toronto, Canada
Mark C. Walker
1BORN Ontario, Children’s Hospital of Eastern Ontario, Ottawa, Canada
3Children’s Hospital of Eastern Ontario Research Institute (CHEO-RI), Ottawa, Canada
4Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada
5School of Epidemiology and Public Health, University of Ottawa, Ottawa, Canada
6Department of Obstetrics and Gynecology, University of Ottawa, Ottawa, Canada
8International and Global Health Office, University of Ottawa, Ottawa, Canada
9Department of Obstetrics, Gynecology & Newborn Care, The Ottawa Hospital, Ottawa, Canada
10Department of Pediatrics, University of Ottawa, Ottawa, Canada
Christine M. Armour
1BORN Ontario, Children’s Hospital of Eastern Ontario, Ottawa, Canada
2Prenatal Screening Ontario, Better Outcomes Registry & Network, Ottawa, Canada
3Children’s Hospital of Eastern Ontario Research Institute (CHEO-RI), Ottawa, Canada
10Department of Pediatrics, University of Ottawa, Ottawa, Canada
11Department of Genetics, CHEO, Ottawa, Canada
  • For correspondence: carmour{at}cheo.on.ca

Abstract

Background Early diagnosis and access to resources, support and therapy are critical for improving long-term outcomes for children with autism spectrum disorder (ASD). ASD is typically detected using a case-finding approach based on symptoms and family history, resulting in many delayed or missed diagnoses. While population-based screening would be ideal for early identification, available screening tools have limited accuracy. This study aims to determine whether machine learning models applied to health administrative and birth registry data can identify young children (aged 18 months to 5 years) who are at increased likelihood of developing ASD.

Methods We assembled the study cohort using individually linked maternal-newborn data from the Better Outcomes Registry and Network (BORN) Ontario database. The cohort included all live births in Ontario, Canada between April 1, 2006, and March 31, 2018, linked to datasets from Newborn Screening Ontario (NSO), Prenatal Screening Ontario (PSO), and the Canadian Institute for Health Information (CIHI), specifically the Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS). The NSO and PSO datasets provided screening biomarker values and outcomes, while DAD and NACRS contained diagnosis codes and intervention codes for mothers and offspring. Extreme Gradient Boosting models and large-scale ensembled Transformer deep learning models were developed to predict ASD diagnosis between 18 and 60 months of age. Using explainable artificial intelligence methods, we identified the factors that contribute most to an increased likelihood of ASD at both the individual and population levels.
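
As an illustration of how an ensemble's predictions can be combined and scored, the sketch below averages per-child scores across several models and computes the area under the receiver operating characteristic curve (AUROC). It is a minimal example under stated assumptions: the abstract does not say how the Transformer ensemble members are combined, so simple averaging (soft voting) is assumed here, and the function names, labels, and scores are hypothetical toy values rather than study data.

```python
from statistics import mean


def ensemble_average(member_scores):
    """Average each child's score across ensemble members (soft voting).

    Assumption: simple averaging; the abstract does not specify the
    combination rule used by the authors.
    """
    return [mean(scores) for scores in zip(*member_scores)]


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen case receives a
    higher score than a randomly chosen non-case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = later ASD diagnosis, 0 = no diagnosis.
labels = [1, 0, 1, 0, 0, 1]
# One row of predicted scores per (hypothetical) ensemble member.
member_scores = [
    [0.9, 0.2, 0.4, 0.5, 0.1, 0.7],
    [0.6, 0.3, 0.8, 0.2, 0.4, 0.5],
    [0.7, 0.6, 0.5, 0.3, 0.2, 0.9],
]

combined = ensemble_average(member_scores)
print("single-member AUROC:", auroc(labels, member_scores[0]))
print("ensemble AUROC:", auroc(labels, combined))
```

The pairwise AUROC above is quadratic in cohort size and is meant only to make the definition concrete; a cohort of this scale would call for a rank-based implementation.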

Results The final study cohort included 703,894 mother-offspring pairs, with 10,964 identified cases of ASD. The best-performing ensemble of Transformer models achieved an area under the receiver operating characteristic curve of 69.6% for predicting ASD diagnosis, with a sensitivity of 70.9% and a specificity of 56.9%. We determine that our model can be used to identify an enriched pool of children with the greatest likelihood of developing ASD, demonstrating the feasibility of this approach.
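
To make the notion of an "enriched pool" concrete, the following sketch applies Bayes' rule to the figures reported above. It is an illustrative calculation only: it assumes the reported sensitivity and specificity come from a single operating threshold and hold at the cohort-wide prevalence, neither of which the abstract states, and the abstract itself does not report a positive predictive value. The function and variable names are hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of model-flagged children who are true cases (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Figures taken from the Results paragraph.
cases, cohort = 10_964, 703_894
sensitivity, specificity = 0.709, 0.569

# Assumption: the test-set prevalence matches the full cohort.
prevalence = cases / cohort
ppv = positive_predictive_value(sensitivity, specificity, prevalence)

# Enrichment: how much more common ASD is among flagged children
# than in the cohort as a whole.
enrichment = ppv / prevalence

print(f"prevalence = {prevalence:.4f}")
print(f"PPV        = {ppv:.4f}")
print(f"enrichment = {enrichment:.2f}x")
```

Under these assumptions, the flagged group contains a higher proportion of cases than the overall cohort, while most flagged children would still not go on to receive an ASD diagnosis. This is consistent with using the model to prioritize formal assessment rather than to diagnose.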

Conclusions This study highlights the feasibility of employing machine learning models and routinely collected health data to systematically identify young children at high likelihood of developing ASD. Ensemble transformer models applied to health administrative and birth registry data offer a promising avenue for universal ASD screening. Such early detection enables targeted and formal assessment for timely diagnosis and early access to resources, support, or therapy.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This project is supported by an anonymous donation to develop the CHEO Precision Child and Youth Mental Health Initiative.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The Children's Hospital of Eastern Ontario's Research Ethics Board (REB# 22/06PE) and the ICES Privacy Office (ICES# 2023 901 377 000) gave ethical approval for this work.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • ↵* kdick{at}bornontario.ca; carmour{at}cheo.on.ca

  • Article Description: This study examines whether routinely collected health data can determine the likelihood that young children will develop autism spectrum disorder. We developed machine learning models that show promising results, improve upon existing screening tools and highlight the potential for early detection.

Data Availability

All data produced in the present study are available upon reasonable request to ICES.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted July 05, 2024.

Subject Area

  • Health Systems and Quality Improvement