medRxiv

Classification and Biomarker Identification In Autism Using Conjunctive Clause Evolutionary Algorithm

Yu Han, Patricia A. Prelock, John P. Hanley, Emily L. Coderre, Donna M. Rizzo
doi: https://doi.org/10.1101/2020.11.09.20227843
Yu Han
1Department of Communication Sciences and Disorders, University of Vermont, Burlington, VT, USA
  • For correspondence: yhan8{at}uvm.edu
Patricia A. Prelock
1Department of Communication Sciences and Disorders, University of Vermont, Burlington, VT, USA
John P. Hanley
2Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT, USA
Emily L. Coderre
1Department of Communication Sciences and Disorders, University of Vermont, Burlington, VT, USA
Donna M. Rizzo
3Department of Civil and Environmental Engineering, University of Vermont, Burlington, VT, USA

Abstract

Autism spectrum disorder (ASD) is a developmental disability that can cause significant social, communication and behavioral challenges. Many challenges remain for ASD diagnosis and treatment. Current diagnostic criteria are based on behavioral symptoms alone. Wait times for appointments at diagnostic centers range from 9 to 13 months, and a single diagnostic appointment can last several hours. There is an urgent need to identify ASD-associated biomarkers and features to help automate diagnostics and develop predictive ASD models. The present study adopts a novel evolutionary algorithm, the conjunctive clause evolutionary algorithm (CCEA), to select the features most significant for distinguishing individuals with and without ASD, using a unique dataset with a small number of samples and a very large number of feature measurements. The dataset comprises both behavioral and neuroimaging measurements from a total of 21 children from 7 to 14 years old. Potential biomarker candidates including volume, area, cortical thickness and mean curvature in specific regions in the cingulate cortex, frontal cortex and temporal-parietal junction areas were identified. Behavioral features associated with theory of mind were also selected. Study findings demonstrate how machine learning tools can advance ASD research in the era of big data to benefit this special population in the future.

1 Introduction

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by impairments in social interaction, and restricted and repetitive behaviors [1–4]. According to the Centers for Disease Control and Prevention (CDC) 2020 report, the number of U.S. children diagnosed with ASD increased from 1 in every 150 in 2000 to about 1 in every 54 in 2016 [5]. The economic burden of pediatric ASD is due substantially to the costs associated with an increased use of health services, school support, ASD-related therapy, family services, and caregiver time. Total societal costs in the United States for children with ASD were estimated at $11.5 billion in 2011 [6, 7]. While genetic and environmental factors have been linked to the development of ASD, at present there is no identified cause or cure for ASD.

Some symptoms of ASD are not evident until age two or later. In fact, a child may appear to be developing normally until the age of two, when they stop learning new skills and may even lose skills already acquired [8, 9]. Currently, the diagnosis of autism is based on behavioral symptoms alone. Two common behavioral assessment tools guide the diagnostic process: the Autism Diagnostic Observation Schedule, Second Edition (ADOS-2) and the Autism Diagnostic Interview-Revised (ADI-R) [10, 11]. A typical diagnostic appointment, however, consists of evaluations lasting several hours at a designated clinical office. Because ASD diagnostic examinations are rigorous and time-consuming, the demand exceeds the capacity to see patients. As a result, many diagnostic centers have expanding wait lists for appointments. This bottleneck translates to delays in diagnosis of 13 months and longer for minority and lower socio-economic status groups [2, 8, 12–18]. It is also believed that a substantial number of individuals on the spectrum remain undetected [19]. With growing awareness of ASD, there is a high demand for a faster and automated ASD diagnostic approach that might allow for more efficient diagnosis and early identification of high-risk populations [20].

Building an automated diagnostic and predictive model of ASD is timely, as many studies have adopted machine learning approaches to identify significant biomarkers that include both behavioral and biological features. Duda and colleagues (2016) applied machine learning to distinguish ASD from attention deficit hyperactivity disorder (ADHD) using a 65-item Social Responsiveness Scale [21]. Bone et al. (2015) trained their models to distinguish individuals with autism from healthy controls using the same Social Responsiveness Scale and the Autism Diagnostic Interview-Revised score [22]. Other studies aggregated items from the ADOS and scores from the Autism Quotient (AQ) to accurately classify an ASD group. However, behavioral measures may be interpreted as subjective, and the selected features can vary widely depending on which tests are used. Consequently, it becomes important to identify consistent markers associated with ASD.

As a result of the wide range in ASD behavioral measures and their subjective nature, many studies have searched for brain-based biological markers to identify a common etiology across individuals with ASD. These brain-based biological markers are less subjective than behavioral measures. Currently, markers that are measurable via magnetic resonance imaging (MRI) are highly desirable because they may represent potential targets for both treatments and diagnostic tools [23]. Independent structural MRI studies have found differences in whole brain volume and developmental trajectories between individuals with ASD and those without ASD [24–33]. Other structural brain abnormalities associated with ASD include cortical folding signatures in the following regions of the brain: temporal-parietal junction, anterior insula, posterior cingulate, lateral and medial prefrontal cortex, corpus callosum, intra-parietal sulcus, and occipital cortex [24–33]. Evidence also shows that an accelerated expansion of cortical surface area, but not cortical thickness, causes an early overgrowth of the brain in children with ASD [34], while other studies suggest that individuals with ASD tend to have thinner cortices and reduced surface area as an effect of aging [35].

Machine learning (ML) has been introduced to the neuroimaging field to identify abnormal brain regions in individuals with ASD. The support vector machine (SVM) is an algorithm that resists over-fitting and can achieve high classification accuracy without requiring large sample sizes. The SVM algorithm is able to distinguish individuals with ASD from corresponding controls using features extracted from functional connections and grey matter volume [36–40]. Other ML classifiers applied to ASD include deep neural networks [41] and the random forests (RF) algorithm; the latter uses random ensembles of independently grown decision trees [42]. Although these methods have demonstrated high accuracy for classifying ASD, they have not identified precise neuroimaging-based biomarkers and features associated with ASD. The majority of studies have adopted data from the Autism Brain Imaging Data Exchange (ABIDE) dataset, which includes 1112 existing resting-state functional magnetic resonance imaging (rs-fMRI) datasets with corresponding structural MRI and phenotypic information from 539 individuals with ASD and 573 age-matched typical controls collected from 24 international brain imaging laboratories [43].

Classification across a heterogeneous population is challenging [44, 45], particularly when neuroimaging data are pooled from multiple acquisition sites (e.g., the ABIDE dataset, which has considerable variation in demographic and phenotypic profiles). Variance introduced in the data by scanner hardware, imaging protocols, operator characteristics, regional demographics, and other site-specific acquisition factors can affect classification performance. This problem is especially relevant for ASD given the inherent heterogeneity of the population. It is often difficult to collect neuroimaging data from individuals with autism given the loudness of the scanner and the difficulty of remaining still. In fact, most individual site datasets have small sample sizes that can lead to over-fitting and classification inaccuracies. Moreover, while many traditional ML algorithms were designed to classify large amounts of data (e.g., ABIDE) rather than optimize the selection of features, the ultimate goal for ML-based diagnostic classification in neuroimaging is to identify discriminative features that provide insight into abnormal structure and dysfunctional connectivity patterns in the affected population [46].

The present study employed an evolutionary algorithm, the conjunctive clause evolutionary algorithm (CCEA), to select the features that best distinguish individuals with and without ASD. The dataset has a relatively large number of features (both behavioral and neuroimaging measurements) from a total of 26 children, divided into a training set (7 children with ASD and 14 neurotypical (NT) children) and a testing set (1 child with ASD and 4 NT children). Children in the testing set were enrolled as a separate cohort after the CCEA was trained.

The neuroimaging measurements included brain volume, brain surface area, cortical thickness, and cortical curvature extracted from MRI whole brain T1 weighted scans. Behavioral measurements included scores from the Comprehensive Assessment of Spoken Language (CASL) [47], the Universal Nonverbal Intelligence Test-2 (UNIT-2) [48, 49], the Theory of Mind Task Battery (ToMTB) [50] and the Theory of Mind Inventory-2 (ToMI-2) [50]. The present study examined the validity of using the CCEA algorithm for feature selection in ASD, particularly to address challenges associated with traditional ML algorithms when working with small datasets. It aims to identify discriminative biomarkers and behavioral features to help develop an automatic diagnostic and predictive system for ASD.

Although some ML-based methods have been applied to ASD, the suitability of machine learning and the choice of algorithms with regard to the specific behavior examined, as well as the quality and quantity of the data obtained from individual studies, need further investigation [51]. We believe this study is the first to:

  • Classify ASD and select discriminative biomarkers among children from 7 to 14 years old.

  • Include both behavioral and biological measurements in the feature selection model.

  • Identify models (sets of features) that most strongly correlate with ASD in children, given a dataset with a relatively small sample size (i.e., N=26) and large number of features (i.e., 247 neuroimaging features and 14 behavioral features).

2 Methods

2.1 Participants

A total of 8 children with ASD (1 female, mean age = 11) and 18 NT children (7 female, mean age = 10.28) were enrolled in the study. All children participated in 2-3 hours of baseline behavioral assessments in which both groups completed the CASL, UNIT-2, ToMTB, and ToMI-2, while the ASD group also completed the ADOS-2 and the Social Communication Questionnaire-Lifetime version (SCQ) [52] to confirm their ASD diagnosis. All children with ASD demonstrated understanding of the instructions given in the behavioral and functional magnetic resonance imaging (fMRI) tasks.

2.2 Behavioral Measurements

The CASL is an orally administered, research-based assessment consisting of 15 subtests measuring language for individuals ranging from 3 to 21 years of age. For the present study, only the basic subtests that establish the CASL language core were used: Antonyms, Sentence Completion, Syntax Construction, Paragraph Comprehension, and Pragmatic Judgment. The UNIT-2 is a multidimensional assessment of intelligence for individuals with speech, language, or hearing impairments. It consists of nonverbal tasks that test symbolic memory, non-symbolic quantity, analogic reasoning, spatial memory, numerical series, and cube design.

The ToMTB and ToMI-2 are two norm-referenced tools and behavioral tasks used as outcome measures to assess theory of mind (ToM) [53, 54]. ToM is the ability to reason about the thoughts and feelings of self and others, including the ability to predict what others will do or how they will feel in a given situation on the basis of their inferred beliefs. ToM is a core social deficit in ASD. Scores from both ToMTB and ToMI-2 provide valid representations of a child’s social cognition level. The ToMI-2 is a parent-informant measure of a child’s functional level of ToM. Each of the 60 items assesses a particular ToM dimension using items that range from simple content to those that evaluate more complex skills. Each item is rated on a 20-unit continuous scale anchored by “Definitely Not” and “Definitely.” Respondents indicate their response with a vertical hash mark at the point on the scale that best reflects their attitude. Item, subscale, and composite scores range from 0-20 with higher values reflecting greater parental confidence that the child possesses a particular ToM skill. The ToMI-2 is designed to be a socially and ecologically valid index of ToM as it occurs in everyday social interactions. It has demonstrated excellent test-retest reliability, internal consistency, and criterion-related validity for both NT and ASD children as well as contrasting-group validity and statistical evidence of construct validity (i.e., factor analysis). The ToMTB directly assesses a child’s understanding of a series of scenarios tapping theory of mind. It consists of 15 test questions within nine tasks, arranged in ascending difficulty. Tasks are presented as short vignettes that appear in a story-book format. Each page has color illustrations and accompanying text. For all tasks, children are presented with one correct response option and three plausible distracters. Memory control questions are included that must be passed for credit on the test questions. The ToMTB has strong test-retest reliability [55–57].

We included the following scores in the CCEA for feature selection: the total score of the CASL, the full scale score of the UNIT-2, the abbreviated score of the UNIT-2, the total score of the ToMTB, the total composite mean of the ToMI-2 (i.e., assessing overall ToM ability), the early subscale mean of the ToMI-2 (i.e., assessing early developing ToM ability such as regulating desire-based emotion and recognition of happy and sad), the basic subscale mean of the ToMI-2 (i.e., assessing basic ToM ability such as recognition of surprise), and the advanced subscale mean of the ToMI-2 (i.e., assessing advanced ToM ability such as recognition of embarrassment). We also included scores from single items assessing recognition of simple emotions such as happy and sad, as well as more complex emotions such as surprise and embarrassment, which children with ASD often find difficult to recognize and process [58–62]. There were 13 behavioral features in total. Table 1 provides demographic information for all participants (N denotes an NT subject and A denotes an ASD subject), including their age, gender and scores on the 13 behavioral tests. Results from t-tests showed significant differences (i.e., p < 0.05) between the NT group and the ASD group on scores of CASL, UNIT-2 full scale, ToMTB total, ToMI-2 total, ToMI-2 early subscale, ToMI-2 basic subscale, ToMI-2 advanced subscale, ToMI-2 embarrassment and ToMI-2 desire based, with NT subjects scoring higher.

Table 1: Demographic Summary Table: NT vs. ASD

2.3 MRI Acquisition and Preprocessing

All data were acquired using the MRI Center for Biomedical Imaging 3T Philips Achieva dStream scanner and 32-channel head coil at the University of Vermont (UVM). Parameters for T1 acquisition were TR 800 ms, TE 30 ms, flip angle 52 degrees, and 2.4 mm isotropic imaging resolution with a 216 × 216 × 144 mm³ field of view, using a multiband acceleration factor of 6 (60 slices, no gap). Participants watched three videos at home before coming to the MRI center. The first was a cartoon video explaining what an MRI is and what one might experience while lying in an MRI scanner [63]. The second video, recorded at the UVM MRI mock scanner room, helped visualize the real setting and procedures a child would experience. The third video explained the procedures for wearing earplugs. All participants practiced lying still and became familiar with the scanner noise in the mock scanner room. The T1 structural scan was preprocessed using the Human Connectome Project (HCP) minimal preprocessing pipelines, including spatial artifact/distortion removal, surface generation, cross-modal registration, and alignment to standard space. These pipelines are specially designed to capitalize on the high quality data offered by the HCP. The final standard space makes use of a recently introduced CIFTI file format and the associated grayordinates spatial coordinate system. This allows for combined cortical surface and subcortical volume analyses while reducing the storage and processing requirements for high spatial and temporal resolution data [64]. Brain anatomical features were extracted using the FreeSurfer aparcstats2table script [65], including volume, cortical thickness, mean curvature, and area of all ROIs for each subject. These ROIs are defined using the automatic segmentation procedures that assign one of 37 labels to each brain voxel, including left and right caudate, putamen, pallidum, thalamus, lateral ventricles, hippocampus, and amygdala [66]. There are 276 brain features included in total.
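The per-hemisphere, per-measure extraction described above can be sketched as a small command builder. This is a minimal illustration, not the authors' script: the subject IDs and output file names are hypothetical, the four measure names are assumed to correspond to the `--meas` options for volume, thickness, mean curvature, and area, and the commands are only printed, not executed.

```python
import shlex

# Assumption: these names map to aparcstats2table's --meas options for the
# four anatomical measures listed in the text.
MEASURES = ("volume", "thickness", "meancurv", "area")


def build_aparcstats_cmd(subjects, hemi, measure, tablefile):
    """Return the argv list for one aparcstats2table call (not executed here)."""
    if hemi not in ("lh", "rh"):
        raise ValueError("hemi must be 'lh' or 'rh'")
    if measure not in MEASURES:
        raise ValueError(f"unsupported measure: {measure}")
    return [
        "aparcstats2table",
        "--subjects", *subjects,
        "--hemi", hemi,
        "--meas", measure,
        "--tablefile", tablefile,
    ]


subjects = ["sub-01", "sub-02"]  # hypothetical subject IDs
commands = [
    build_aparcstats_cmd(subjects, hemi, meas, f"{hemi}_{meas}.tsv")
    for hemi in ("lh", "rh")
    for meas in MEASURES
]

for cmd in commands:
    print(shlex.join(cmd))  # one table per hemisphere and measure
```

Each resulting table has one row per subject and one column per ROI, so concatenating the eight tables column-wise would yield a single subject-by-feature matrix of the kind a feature selection algorithm consumes.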

2.4 Conjunctive Clause Evolutionary Algorithm

We used a novel evolutionary algorithm to identify the features associated with ASD. The CCEA is a machine learning tool that searches for both the combinations of features associated with a given category (e.g., ASD) and their corresponding ranges of values [67]. The CCEA is capable of finding feature interactions even in the absence of main effects, and can therefore find feature combinations that would be difficult to discover using traditional statistics. The CCEA selects for the best conjunctive clauses (CC) of the form $(F_i \in a_i) \wedge (F_j \in a_j) \wedge \cdots \wedge (F_z \in a_z)$, where $F_i$ represents a risk factor $i$ whose value lies in the range $a_i$, and the symbol $\wedge$ represents a conjunction (i.e., logical AND). The benefit of the CCEA is that it produces parsimonious models that are correlated with a selected category (e.g., ASD). Model parsimony is measured using the order of the conjunctive clause, which is the total number of features in the conjunctive clause. One example of a parsimonious second-order conjunctive clause is: a person with a right hemisphere isthmus cingulate volume of 3,300 – 4,100 AND a right hemisphere posterior cingulate volume of 4,100 – 6,200 is more likely to have ASD than someone who does not meet these criteria.
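To make the clause form concrete, the following minimal sketch evaluates the second-order example above against two subjects. The feature names and subject values are hypothetical, and treating the range bounds as inclusive is an assumption; the sketch only illustrates how a conjunctive clause is applied, not how the CCEA searches for one.

```python
from typing import Dict, Tuple

# A clause maps each feature name to its (low, high) range.
# Assumption: both bounds are inclusive.
Clause = Dict[str, Tuple[float, float]]


def matches(clause: Clause, subject: Dict[str, float]) -> bool:
    """True only if every feature value lies inside its range (logical AND)."""
    return all(lo <= subject[name] <= hi for name, (lo, hi) in clause.items())


# Second-order clause taken from the example in the text (feature names are hypothetical).
clause: Clause = {
    "rh_isthmuscingulate_volume": (3300.0, 4100.0),
    "rh_posteriorcingulate_volume": (4100.0, 6200.0),
}

# Hypothetical subjects, not study data.
subject_a = {"rh_isthmuscingulate_volume": 3650.0, "rh_posteriorcingulate_volume": 5000.0}
subject_b = {"rh_isthmuscingulate_volume": 3650.0, "rh_posteriorcingulate_volume": 3900.0}

order = len(clause)  # clause order = number of features
print(order, matches(clause, subject_a), matches(clause, subject_b))  # 2 True False
```

Subject B satisfies the first range but not the second, so the conjunction fails; this is why adding features to a clause can only keep or shrink the set of matching subjects.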

The fitness of each conjunctive clause (CC) is evaluated using the hypergeometric probability mass function (PMF), and only the most-fit conjunctive clauses are saved. The hypergeometric PMF is not a p-value and thus is not constrained by issues associated with what threshold is “significant” [68–70]. To prevent overfitting, the CCEA performs a feature sensitivity analysis on each conjunctive clause to ensure that each feature contributes to the overall fitness. For each feature in a conjunctive clause, the sensitivity is calculated by taking the difference between the conjunctive clause fitness and the fitness when that feature is removed. Thus, a feature’s sensitivity may be viewed as the amount of fitness that it contributes to the conjunctive clause. To visualize the fitness landscape, both positive predictive value and class coverage are calculated. Positive predictive value (PPV) is the number of true positives divided by the sum of true and false positives, and class coverage is the number of true positives divided by the sum of true positives and false negatives (i.e., the percent of ASD individuals that match the conjunctive clause). In this work, the CCEA was run five times using the training set to ensure a more thorough search of the fitness landscape.
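The quantities defined above can be computed directly from which subjects match a clause. The sketch below uses the training set sizes from this study (7 ASD, 14 NT) with hypothetical match patterns. Two points are assumptions rather than details given in the text: the mapping of clause counts onto the hypergeometric parameters, and the use of −log10(PMF) as a convenience score in which larger means a less likely chance association.

```python
from math import comb, log10


def hypergeom_pmf(k, n, K, N):
    """P(exactly k class members among n matches), given K class members in N subjects."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def clause_metrics(matched, is_asd):
    """matched[i]: subject i satisfies the clause; is_asd[i]: subject i has ASD."""
    tp = sum(m and a for m, a in zip(matched, is_asd))
    fp = sum(m and not a for m, a in zip(matched, is_asd))
    fn = sum(a and not m for m, a in zip(matched, is_asd))
    ppv = tp / (tp + fp) if tp + fp else 0.0
    coverage = tp / (tp + fn) if tp + fn else 0.0
    pmf = hypergeom_pmf(tp, tp + fp, tp + fn, len(is_asd))
    return ppv, coverage, pmf


def score(pmf):
    """Assumed convenience score: larger means a less likely chance association."""
    return -log10(pmf)


is_asd = [True] * 7 + [False] * 14                 # training set: 7 ASD, 14 NT
full = [True] * 7 + [False] * 14                   # hypothetical: clause matches only the 7 ASD
reduced = [True] * 7 + [True] * 3 + [False] * 11   # hypothetical: one feature removed, 3 NT also match

ppv, coverage, pmf_full = clause_metrics(full, is_asd)
_, _, pmf_reduced = clause_metrics(reduced, is_asd)

# Sensitivity of the removed feature: fitness with it minus fitness without it.
sensitivity = score(pmf_full) - score(pmf_reduced)
print(ppv, coverage, pmf_full, sensitivity)
```

In this hypothetical case the full clause has a PMF of 1/116280, and removing the feature lets three NT subjects through, so the feature carries a positive sensitivity and would be retained.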

3 Results

3.1 Training Set: 7 ASD and 14 NT

In the training set, 2438 CCs (i.e., models) were generated, ranging from first-order to fifth-order. The PPV of the 2438 models ranges from 46.47% to 100% and their class coverage ranges from 42.86% to 100%. Among these models, we looked for the most parsimonious (i.e., lower-order) models to draw meaningful conclusions and to avoid the overfitting risk that comes with higher-order models. As a result, we selected the 8 second-order models (i.e., those having only two features) with the highest fitness (PMF) among the total 520 second-order models. These 8 best performing models have 100% PPV and 100% class coverage. All of these models contain only brain anatomical features (Table 2). Because of our desire to examine the behavioral features, we expanded our analysis to include third-order models (i.e., model combinations with three features). There were 651 third-order models in total; some consisted only of anatomical brain features, while others had two behavioral features plus one brain anatomical feature. We selected the six best performing third-order models with the highest fitness (PMF); each had 100% PPV and 100% class coverage. Each of these third-order models (Table 3) contained two behavioral features and one brain anatomical feature.
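The selection step described above (restrict to one clause order, require 100% PPV and class coverage, then rank by fitness) can be sketched as follows. The `Model` record, its field names, and the example values are hypothetical; `fitness` is assumed to be a score where higher is better, and how it is derived from the hypergeometric PMF is left open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Model:
    cc_id: int
    order: int        # number of features in the clause
    ppv: float        # positive predictive value, 0..1
    coverage: float   # class coverage, 0..1
    fitness: float    # assumption: higher means more fit


def select_best(models: List[Model], order: int, top_n: int) -> List[Model]:
    """Keep perfect-PPV, perfect-coverage models of one order, best fitness first."""
    perfect = [
        m for m in models
        if m.order == order and m.ppv == 1.0 and m.coverage == 1.0
    ]
    return sorted(perfect, key=lambda m: m.fitness, reverse=True)[:top_n]


# Hypothetical candidate models, not the study's results.
candidates = [
    Model(cc_id=1, order=2, ppv=1.0, coverage=1.0, fitness=5.1),
    Model(cc_id=2, order=2, ppv=0.8, coverage=1.0, fitness=4.0),
    Model(cc_id=3, order=2, ppv=1.0, coverage=1.0, fitness=4.7),
    Model(cc_id=4, order=3, ppv=1.0, coverage=1.0, fitness=5.0),
    Model(cc_id=5, order=2, ppv=1.0, coverage=0.9, fitness=4.9),
]

best_second_order = select_best(candidates, order=2, top_n=8)
print([m.cc_id for m in best_second_order])  # [1, 3]
```

Filtering on order before ranking keeps the comparison among models of equal parsimony, which mirrors the way second-order and third-order models are reported separately.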

Table 2: Second-Order CCs
Table 3: Third-Order CCs

Selected second-order models have 100% PPV and 100% class coverage. Using CC 113 (Table 2) as an example, these models would be interpreted as follows: any subject whose posterior cingulate gyrus volume was within the range of 3500 to 4600 mm³ AND whose left rostral middle frontal gyrus volume was within the range of 20,000 to 25,000 mm³ would be classified as having ASD. The volume of the left hemisphere posterior cingulate gyrus and the volume of the right hemisphere isthmus of the cingulate gyrus were the two features that appeared most frequently (i.e., four times) across all models, suggesting that the volume of the cingulate gyrus is a potentially important biomarker for ASD. Figure 1 provides a 2D visualization of the range of feature values (numerical boundaries) associated with these models and the placement of each subject within this range. Green dots represent ASD subjects and group together within the rectangle defining the range of values in Table 2.

Figure 1: Second-Order CC 2D Visualization

Selected third-order models have 100% PPV and 100% class coverage. Using CC 46 (Table 3) as an example, any subject who had a total score on ToMTB within the range of 5 to 13 AND an early subscale mean score on ToMI-2 within the range of 12 to 18 AND a mean curvature value of the left hemisphere pars orbitalis within the range of 0.17 to 2 would fall in the ASD class. The ToMTB total score feature occurred in all of our best-fit third-order models, and the ToMI-2 early subscale mean score occurred in all but one (CC 1163) of the models, where the ToMI-2 total composite mean played a role instead. Such a finding further suggests that the ToMTB and ToMI-2 might be effective tools for ASD testing and diagnosis. Figure 2 is a 3D visualization of the CC feature value boundaries and classification placement of each subject, where green dots represent ASD subjects and group together within the pink cube defined by feature values in Table 3.

Figure 2: Third-Order CC 3D Visualization

3.2 Testing Set: 1 ASD and 4 NT

A cohort of new subjects comprising 1 child with ASD and 4 NT children was recruited separately at a later time. This later cohort served as a testing set to validate the original 2438 models generated from the training set. Three third-order models (Table 4) and four fourth-order models (Table 5) were selected from the total 2438 models generated using the training set, as these 7 models were the only ones that retained 100% PPV and 100% class coverage on the testing set. Among the features in these models, cortical thickness of the left hemisphere pericalcarine cortex and mean curvature of the right hemisphere pars orbitalis appeared most frequently, indicating the important roles of these areas in ASD. The inclusion of this testing set provided additional observations to examine the robustness of the best-fit models identified using the original training set (i.e., parsimonious models with 100% PPV and 100% class coverage).
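The validation step described above amounts to re-applying each trained clause to the held-out subjects and keeping only those that still reach 100% PPV and 100% class coverage. The sketch below assumes inclusive range bounds and uses hypothetical feature names, clause ranges, and subject values; only the cohort composition (1 ASD, 4 NT) comes from the text.

```python
def matches(clause, subject):
    """True if every feature value lies inside its (low, high) range (inclusive bounds assumed)."""
    return all(lo <= subject[f] <= hi for f, (lo, hi) in clause.items())


def ppv_and_coverage(clause, subjects, labels):
    """Return (PPV, class coverage) of one clause on a labelled cohort."""
    hits = [matches(clause, s) for s in subjects]
    tp = sum(h and y for h, y in zip(hits, labels))
    fp = sum(h and not y for h, y in zip(hits, labels))
    fn = sum(y and not h for h, y in zip(hits, labels))
    ppv = tp / (tp + fp) if tp + fp else 0.0
    coverage = tp / (tp + fn) if tp + fn else 0.0
    return ppv, coverage


def validated(clauses, subjects, labels):
    """IDs of clauses that keep 100% PPV and 100% coverage on the cohort."""
    return [cid for cid, c in clauses.items()
            if ppv_and_coverage(c, subjects, labels) == (1.0, 1.0)]


# Hypothetical testing cohort: 1 ASD subject followed by 4 NT subjects.
test_subjects = [
    {"x": 5.0, "y": 1.0},
    {"x": 9.0, "y": 1.2},
    {"x": 5.5, "y": 3.0},
    {"x": 1.0, "y": 0.5},
    {"x": 8.0, "y": 4.0},
]
test_labels = [True, False, False, False, False]

# Hypothetical trained clauses.
trained = {
    "cc_a": {"x": (4.0, 6.0), "y": (0.5, 2.0)},  # matches only the ASD subject
    "cc_b": {"x": (4.0, 6.0)},                   # also matches one NT subject
    "cc_c": {"y": (2.5, 5.0)},                   # misses the ASD subject
}

print(validated(trained, test_subjects, test_labels))  # ['cc_a']
```

With a single ASD subject in the cohort, class coverage on the testing set can only be 0% or 100%, so the PPV condition (no NT subject matching) does most of the discriminating among clauses.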

Table 4: Third-Order CCs
Table 5: Fourth-Order CCs

4 Discussion

The present study classified ASD and selected discriminative biomarkers and behavioral features in children from 7 to 14 years old using a small dataset collected from a single research site. Machine learning (ML) tools have long been used in ASD research, but building a definitive prediction model and diagnostic system for ASD remains a distant goal. Despite some progress, other studies often pool datasets across different research sites for classification purposes rather than identifying discriminative features for diagnostic and treatment development purposes [21, 22, 36–43, 71, 72]. Additionally, traditional ML algorithms do not work well with ASD datasets because of high variance and the heterogeneous nature of the disorder [44, 45]. Meanwhile, including individuals with ASD in a research study requires a tremendous amount of effort given the social and language challenges of this population. Thus, nearly all ASD datasets have a large number of features with a small sample size, a combination that is poorly suited to many ML algorithms and often leads to overfitting and poor classification accuracy. The present study, however, has demonstrated exceptionally good performance (i.e., 100% PPV and 100% class coverage) of the CCEA with a dataset containing a large number of features yet a small sample size, and in this case identified features significantly associated with ASD.

The features selected by the CCEA included volume, area, cortical thickness and mean curvature in specific regions in the cingulate cortex, frontal cortex and temporal-parietal junction areas as biomarkers for ASD (e.g., the pericalcarine cortex, posterior cingulate cortex, isthmus of the cingulate gyrus, pars orbitalis, etc.). Such findings are consistent with previous literature suggesting that individuals with ASD have abnormalities in these brain regions [24–33, 73–75]. Additionally, third-order models from the training set include measurements from the ToMI-2 and the ToMTB as significant features [55–57], which further supports the use of these tools in ASD assessments. Such findings can potentially help clinicians and researchers address specific domains in ToM to improve the social skills of children with ASD. Besides the 8 second-order and 6 third-order models selected from the training set, it is impressive that the CCEA also validated 7 additional best performing third- and fourth-order models on the testing set. Although it would have been ideal if the 8 second-order and 6 third-order models identified on the training set had perfectly modeled the testing set, the fact that an additional 7 of the original models classify the testing set with 100% PPV and 100% class coverage further demonstrates the robustness of the original models generated using the training set, as well as the exceptional performance of the CCEA in this study.

5 Limitations and Future Directions

ASD research often struggles to balance two competing problems: a large sample size with high heterogeneity between subjects, or a more homogeneous sample that is small. Additionally, most currently available ML algorithms are designed for solving classification problems using big data. Although the present study is limited by its relatively small sample size, it is impressive that the CCEA was able to identify features with exceptional classification performance, and candidate biomarkers, using a small dataset containing a large number of features. Given the strong performance of the CCEA at the current sample size, a larger sample would make more comprehensive results possible.

The present study has established important biomarker candidates of ASD. These biomarker candidates fall into the same brain regions that studies adopting traditional neuroimaging measurements have identified as showing abnormalities in ASD. Thus, it provides evidence that AI methodologies can perform as well as traditional approaches in the field of neuroscience and ASD in selecting neuroanatomical biomarkers. Although AI techniques have been adopted to help with diagnosis and treatment development in medicine, ASD is exceptionally challenging given its highly heterogeneous nature. Extracting solid biomarkers will require a large, diverse, and comprehensive dataset, and analyzing such a dataset with traditional approaches can be very time-consuming and less accurate. Under such circumstances, ML techniques can help advance the development of an automatic diagnostic and predictive system for ASD. In summary, the present study has provided a new direction in adopting AI techniques in ASD research and medicine in general.

Data Availability

The data in this manuscript were collected by the research team under Dr. Patricia Prelock; access can be granted upon request.

Acknowledgements

This project was supported by a private donor committed to advancing research in autism spectrum disorder. We thank Jay V. Gonyea, Administrative Director, and Scott Hipko, Senior Research Technologist, in the MRI Research Unit at the University of Vermont, for their support in acquiring the MRI scans. We thank Dr. Richard Watts, Ph.D., Director at the FAS Brain Imaging Center, and Dr. Joeseph Orr, Ph.D., Assistant Professor at Texas A&M University, for sharing their knowledge in MRI data pre-processing.

Footnotes

  • * yu.han{at}uvm.edu

References

  1. [1].↵
    M. Fitzgerald, “The clinical gestalts of autism: Over 40 years of clinical experience with autism.,” Recent Research and Clinical Applications, vol. 2, 2017. doi: 10.577265906cf0.
    OpenUrlCrossRef
  2. [2].
    S. Levy, M. Duda, N. Haber, and D. P. Wall, “Sparsifying machine learning models identify stable subsets of predictive features for behavioral detection of autism.,” Molecular Autism, vol. 8, 2017. doi: 10.1186s13229-017-0180-6cf0.
    OpenUrlCrossRef
  3. [3].
    X. Bi, Y. Wang, Q. Shu, Q. Sun, and Q. Xu, “Classification of autism spectrum disorder using random support vector machine cluster.,” Front Genet., vol. 9, 2018.
  4. [4].↵
    A. Diagnostic and S. Manual, 5th ed DSM-5. Washington, DC: American Psychiatric Association; 2013.
  5. [5].↵
    C. for Disease Control and Prevention, “2020 community report on autism,” vol. 13, 2020.
  6. [6].↵
    L. Achenie, A. Scarpa, R. Factor, T. Wang, D. Robins, and D. Mccrickard, “A machine learning strategy for autism screening in toddlers.,” Journal of Developmental & Behavioral Pediatrics, vol. 40, 2019.
  7. [7].↵
    N. J. Lavelle T.A. Weinstein M.C., “Economic burden of autism spectrum disorders,” Pediatrics, vol. 133, 2014.
  8. [8].↵
    P. F. Bolton, J. Golding, A. Emond, and S. CD., “Autism spectrum disorder and autistic traits in the avon longitudinal study of parents and children: Precursors and early signs,” J Am Acad Child Adolesc Psychiatry, vol. 51, p. 3, 2012.
    OpenUrlPubMed
  9. [9].↵
    J. Kleinman, P. Ventola, J. Pandey, A.D. Verbalis, M. Barton, S. Hodgson, J. Green, T. Dumont-Mathieu, D.L. Robins, and D. Fein, “Diagnostic stability in very young children with autism spectrum disorders,” J Autism Dev Disord, vol. 38, p. 4, 2008.
    OpenUrl
  10. [10].↵
    M. C. Lord and A. Couteur, “A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders,” J Autism Dev Disord, vol. 24, p. 659, 1994.
    OpenUrlCrossRefPubMedWeb of Science
  11. [11].↵
    C. Lord, S. Risi, and L. Lambrecht, “The autism diagnostic observation schedule-generic: A standard measure of social and communication deficits associated with the spectrum of autism,” J Autism Dev Disord, vol. 30, p. 205, 2000.
    OpenUrlCrossRefPubMedWeb of Science
  12. [12].
    J. Baio, “Prevalence of autism spectrum disorders: Autism and developmental disabilities monitoring network, 14 sites, united states,” vol. 61, 2012.
  13. [13].
    R. A. Rhoades, A. Scarpa, and B. Salley, “The importance of physician knowledge of autism spectrum disorders: Results of a parent survey,” BMC Pediatr, vol. 7, p. 37, 2007.
    OpenUrlPubMed
  14. [14].
    D. Mandell, J. Listerud, and S. Levy, “Race differences in the age at diagnosis among medicaid-eligible children with autism,” J Am Acad Child Adolesc Psychiatry, vol. 41, p. 1447, 2002.
    OpenUrlCrossRefPubMedWeb of Science
  15. [15].
    D. S. Mandell, M. M. Novak, and C. Zubritsky, “Factors associated with age of diagnosis among children with autism spectrum disorders,” Pediatrics, vol. 116, p. 1480, 2005.
    OpenUrlAbstract/FREE Full Text
  16. [16].
    M. J. Morrier, K. L. Hess, and L. J. Heflin, “Ethnic disproportionality in students with autism spectrum disorders,” Multicultural Educ Fall, vol. 16, p. 31, 2008.
    OpenUrl
  17. [17].
    L. D. Wiggins, J. Baio, and C. Rice, “Examination of the time between first evaluation and first autism spectrum diagnosis in a population-based sample,” J Dev Behav Pediatr, vol. 27, p. 2, 2006.
    OpenUrl
  18. [18].
    R. Bernier, A. Mao, and J. Yen, “Psychopathology, families and culture: Autism,” Child Adolesc Psychiatr Clin N Am, vol. 19, p. 4, 2010.
    OpenUrl
  19. [19].↵
    F. Thabtah and D. Peebles, “A new machine learning model based on induction of rules for autism detection.,” Health Informatics Journal, vol. 1460, 2019. doi: 10.1177/146045821882471100.
    OpenUrlCrossRef
  20. [20].↵
    M. N. Parikh, H. Li, and L. He, “Enhancing diagnosis of autism with optimized machine learning models and personal characteristic data.,” Frontiers in Computational Neuroscience, vol. 13, 2019.
  21. [21].↵
    M. Duda, R. Ma, N. Haber, and D. P. Wall, “Use of machine learning for behavioral distinction of autism and adhd.,” Transl Psychiatry, vol. 6, 2016.
  22. [22].↵
    D. Bone, M. S. Goodwin, M. P. Black, C. C. Lee, K. Audhkhasi, and S. Narayanan, “Applying machine learning to facilitate autism diagnostics: Pitfalls and promises.,” J Autism Dev Disord., vol. 45, p. 1121, 2015. doi: 10.1007/s10803-014-2268-6.
    OpenUrlCrossRef
  23. [23].↵
    E. Feczko, N. Balba, O. Miranda-Dominguez, M. Cordova, S. Karalunas, L. Irwin, and D. Fair, “Sub-typing cognitive profiles in autism spectrum disorder using a functional random forest algorithm.,” NeuroImage, vol. 172, 2018.
  24. [24].↵
    R. Chen, Y. Jiao, and H. E.H., “Structural mri in autism spectrum disorder,” Pediatr Res, vol. 69, p. 63, 2011.
    OpenUrl
  25. [25].
    D. Amaral, C. Schumann, and C. Nordahl, “Neuroanatomy of autism,” Trends in Neurosci, 2008.
  26. [26].
    P. Brambilla, “Brain anatomy and development in autism: Review of structural mri studies.,” Brain Res, vol. 61, p. 557, 2003.
    OpenUrl
  27. [27].
    N. Lange, “Longitudinal volumetric brain changes in autism spectrum disorder ages 35 years,” Autism Res, vol. 8, p. 82, 2015.
    OpenUrlCrossRefPubMed
  28. [28].
    D. L. Dierker, “Analysis of cortical shape in children with simplex autism,” Cereb. Cortex, vol. 25, p. 1042, 2015.
    OpenUrlCrossRefPubMed
  29. [29].
    C. W. Nordahl, “Cortical folding abnormalities in autism revealed by surface-based morphometry.,” J Neurosci, vol. 27, p. 11 725, 2007.
    OpenUrl
  30. [30].
    S. L. Valk, A. Di Martino, M. P. Milham, and B. BC., “Multicenter mapping of structural network alterations in autism,” Hum Brain Mapp, vol. 36, p. 2364, 2015.
    OpenUrlCrossRefPubMed
  31. [31].
    G. L. Wallace, “Longitudinal cortical development during adolescence and young adulthood in autism spectrum disorder: Increased cortical thinning but comparable surface area changes,” Acad. Child Adolesc. Psychiatry, vol. 54, p. 464, 2015.
    OpenUrl
  32. [32].
    R. Kucharsky Hiess, “Corpus callosum area and brain volume in autism spectrum disorder: Quantitative analysis of structural mri from the abide database,” J. Autism Dev Disord, vol. 45, p. 3107, 2015.
    OpenUrl
  33. [33].↵
    M. Shokouhi, J. H. Williams, G. D. Waiter, and B. Condon, “Changes in the sulcal size associated with autism spectrum disorder revealed by sulcal morphometry.,” Autism Res, vol. 5, p. 245, 2012.
    OpenUrlCrossRefPubMedWeb of Science
  34. [34].↵
    S. Ha, I.-J. Sohn, N. Kim, H. J. Sim, and K.-A. Cheon, “Characteristics of brains in autism spectrum disorder: Structure, function and connectivity across the lifespan,” Experimental Neurobiology, vol. 24, no. 4, p. 273, 2015. doi: 10.5607/en.2015.24.4.273.
    OpenUrlCrossRef
  35. [35].↵
    C. Ecker, A. Shahidiani, Y. Feng, and E. Daly, “The effect of age, diagnosis, and their interaction on vertex-based measures of cortical thickness and surface area in autism spectrum disorder,” J Neural Transm (Vienna), vol. 121, pp. 1157–1170, 2014.
    OpenUrl
  36. [36].↵
    I. Gori, A. Giuliano, F. Muratori, I. Saviozzi, P. Oliva, and R. Tancredi, “Gray matter alterations in young children with autism spectrum disorders: Comparing morphometry at the voxel and regional level,” J Neuroimaging, vol. 25, p. 866, 2015.
    OpenUrl
  37. [37].
    Y. Jin, C. Wee, F. Shi, K. Thung, D. Ni, and P. Yap, “Identification of infants at high-risk for autism spectrum disorder using multiparameter multiscale white matter connectivity networks”, Hum. Brain Mapp, vol. 36, p. 4880,
  38. [38].
    H. Chen, X. Duan, F. Liu, F. Lu, X. Ma, and Y. Zhang, “Multivariate classification of autism spectrum disorder using frequency-specific resting-state functional connectivity”, Prog. Neuropsychopharmacol. Biol. Psychiatry, vol. 64, p. 1,
  39. [39].
    P. Odriozola, L. Q. Uddin, C. J. Lynch, J. Kochalka, T. Chen, and V. Menon, “Insula response and connectivity during social and non-social attention in children with autism.,” Cogn. Affect. Neurosci., vol. 11, p. 433, 2015.
    OpenUrl
  40. [40].↵
    G. Chanel, S. Pichon, L. Conty, S. Berthoz, C. Chevallier, and G. J., “Classification of autistic individuals and controls using cross-task characterization of fmri activity.,” NeuroImage Clin.,
  41. [41].↵
    G. J. Katuwal, S. A. Baum, N. D. Cahill, and M. AM., “Divide and conquer: Sub-grouping of asd improves asd detection based on brain morphometry,” PLoS One, vol. 11, p. 1, 2016.
    OpenUrlCrossRefPubMed
  42. [42].↵
    C. P. Chen, “Diagnostic classification of intrinsic functional connectivity highlights somatosensory, default mode, and visual regions in autism.,” Clin NeuroImage, vol. 8, p. 238, 2015.
    OpenUrl
  43. [43].↵
    C. Craddock, Y. Benhajali, C. Chu, F. Chouinard, A. Evans, and A. S. Jakab, “The neuro bureau preprocessing initiative: Open sharing of preprocessed neuroimaging data and derivatives,” Front. Neuroinform., vol. 7, p. 41, 2013.
    OpenUrl
  44. [44].↵
    C. Kelly, B. B. Biswal, R. C. Craddock, F. Castellanos, and M. Milham, “Characterizing variation in the functional connectome: Promise and pitfalls,” Trends in Cognitive Sciences, vol. 16, p. 3, 2012.
    OpenUrlPubMed
  45. [45].↵
    W. Huf, K. Kalcher, R. N. Boubela, G. Rath, A. Vecsei, P. Filzmoser, and E. Moser, “On the generalizability of resting-state fmri machine learning classifiers,” Frontiers in Human Neuroscience, vol. 8, p. 502, 2014.
    OpenUrl
  46. [46].↵
    P. Lanka, D. Rangaprakash, M. N. Dretsch, J. S. Katz, T. S. Denney, and G. Deshpande, “Supervised machine learning for diagnostic classification from large-scale neuroimaging datasets,” Brain Imaging and Behavior, vol. 720, 2019. doi: 10.1007/s11682-019-00191-8cf0.
    OpenUrlCrossRef
  47. [47].↵
    E. Carrow-Woolfolk, “Comprehensive assessment of spoken language second edition,” 2017.
  48. [48].↵
    B. A. Bracken and R. S. McCallum, “Universal nonverbal intelligence test (2nd ed.),” 2016.
  49. [49].↵
    F. Moore, M. A. R. S., and B. A. Bracken, Handbook for Nonverbal Assessment. Springer, 2017.
  50. [50].↵
    T. L. Hutchins and P. A. Prelock, “Technical manual for the theory of mind inventory-2. copyrighted manuscript,” Theoryofmindinventory.com, 2016.
  51. [51].↵
    B. Li, A. Sharma, J. Meng, S. Purushwalkam, and E. Gowen, “Applying machine learning to identify autistic adults using imitation: An exploratory study.,” Plos One, vol. 12, 2017.
  52. [52].↵
    M. Rutter, A. Bailey, and C. S. C. Q. Lord, The Social Communication Questionnaire Manual. Western Psychological Services, 2003.
  53. [53].↵
    M. Baron-Cohen, An essay on autism and theory of mind. MIT Press, 1995.
  54. [54].↵
    S. Baron-Cohen, A. M. Leslie, and U. Frith, “Does the autistic-child have a theory of mind,” Cognition, vol. 277, p. 85, 1985.
    OpenUrl
  55. [55].↵
    T. L. Hutchins, P. Prelock, and W. Chase, “Test-retest reliability of a theory of mind task battery for children with autism spectrum disorders,” Focus Autism Other Dev Disabil, vol. 23, p. 195, 2008.
    OpenUrl
  56. [56].
    T. L. Hutchins, P. A. Prelock, and L. Bonazinga, “Psychometric evaluation of the theory of mind inventory (tomi): A study of typically developing children and children with autism spectrum disorder,” J Autism Dev Disord, vol. 42, p. 327, 2012.
    OpenUrlPubMed
  57. [57].↵
    T. L. H. M. Lerner and P. A. Prelock, “Brief report: Preliminary evaluation of the theory of mind inventory and its relationship to measures of social skills,” J Autism Dev Disord, vol. 41, p. 4, Apr. 2011.
    OpenUrl
  58. [58].↵
    J. Hadwin and J. Perner, “Pleased and surprised: Childrens cognitive theory of emotion,” British Journal of Developmental Psychology, vol. 9, p. 215, 1991.
    OpenUrl
  59. [59].
    T. Ruffman and T. R. Keenan, “The belief-based emotion of surprise: The case for a lag in understanding relative to false belief,” Developmental Psychology, vol. 32, p. 1, 1996.
    OpenUrl
  60. [60].
    T. Russel and T. R. Keenan, “The belief-based emotion of surprise: The case for a lag in understanding relative to false belief,” Developmental Psychology, vol. 32, 1996.
  61. [61].
    B. Seider and S. L. D., “A developmental analysis of elementary school-aged childrens concepts of pride and embarrassment,” Child Development, vol. 59, p. 367, 1988.
    OpenUrlCrossRefPubMedWeb of Science
  62. [62].↵
    A. Hillier and L. Allinson, “Understanding embarrassment among those with autism: Breaking down the complex emotion of embarrassment among those with autism,” Journal of Autism and Developmental Disorders, vol. 32, p. 6, 2002.
    OpenUrl
  63. [63].↵
    Youtube, [Online]. Available: https://www.youtube.com/watch?v=Q_Pa6KFL1Nw%5C&t=139s%5Ccf2.
  64. [64].↵
    M. F. Glasser, S. N. Sotiropoulos, J. A. Wilson, T. S. Coalson, B. Fischl, and J. L. Andersson, “The minimal preprocessing pipelines for the human connectome project,” NeuroImage, vol. 80, p. 105, 2013.
    OpenUrlCrossRefPubMedWeb of Science
  65. [65].↵
    FreeSurfer, [Online]. Available: https://surfer.nmr.mgh.harvard.edu/fswiki/aparcstats2table.
  66. [66].↵
    F. B, “Whole brain segmentation: Automated labeling of neuroanatomical structures in the human brain.,” Neuron, vol. 33, pp. 341–355, 2002.
    OpenUrlCrossRefPubMedWeb of Science
  67. [67].↵
    J. P. Hanley, D. M. Rizzo, J. S. Buzas, and M. J. Eppstein, “A tandem evolutionary algorithm for identifying causal rules for complex data,” Evol Comput, vol. 28, no. 1, pp. 87–114, 2020.
    OpenUrl
  68. [68].↵
    R. L. Wasserstein, A. L. Schirm, and N. A. Lazar, “Moving to a world beyond p < 0.05.,” The American Statistician, vol. 93, pp. 1–19, 2019.
    OpenUrl
  69. [69].
    R. Wasserstein and N. Lazar, “The asa 92s statement on p-values: Context, process, and purpose,” Amer Statist, vol. 70, pp. 129–133, 2016.
    OpenUrl
  70. [70].↵
    R. Nuzzo, “Scientific method: Statistical errors,” Nature, vol. 506, pp. 150–152, 2014.
    OpenUrlCrossRefPubMedWeb of Science
  71. [71].
    Y. Zhang and L. Wu, “Classification of fruits using computer vision and a multiclass support vector machine.,” Sensors, vol. 1248, 2012.
  72. [72].
    D. Li, W. Yang, and S. Wang, “Classification of foreign fibers in cotton lint using machine vision and multi-class support vector machine.,” Electron. Agric., vol. 74, p. 274, 2010.
    OpenUrl
  73. [73].
    K. Nickel, L. T. Elst, J. Manko, J. Unterrainer, R. Rauh, and C. Klein, “Volume loss distinguishes between autism and (comorbid) attention-deficit/hyperactivity disorder freesurfer analysis in children,” Frontiers in Psychiatry, vol. 2, p. 9, 2018.
    OpenUrl
  74. [74].
    B. A. Zielinski, M. B. Prigge, J. A. Nielsen, A. L. Froehlich, T. J. Abildskov, and J. S. Anderson, “Longitudinal changes in cortical thickness in autism and typical development,” Brain, vol. 2, no. 137, pp. 1799–1812, 2014.
    OpenUrl
  75. [75].
    M. C. Postema, D. van Rooij, and E. Anagnostou, “Altered structural brain asymmetry in autism spectrum disorder in a study of 54 datasets,” Nat Commun, vol. 10, p. 4958, 2019. [Online]. Available: https://doi.org/10.1038/s41467-019-13005-8.
    OpenUrl
Posted November 12, 2020.

Subject Area

  • Neurology