Comparison of three bioinformatics tools in the detection of ASD candidate variants from whole exome sequencing data

Apurba Shil; Liron Levin; Hava Golan; Gal Meiri; Analya Michaelovski; Yair Tsadaka; Adi Aran; Ilan Dinstein; Idan Menashe

doi:10.1101/2023.04.23.23288995

Abstract

Background Autism spectrum disorder (ASD) is a heterogenous multifactorial neurodevelopmental condition with a significant genetic susceptibility component. Thus, identifying genetic variations associated with ASD is a complex task. Whole-exome sequencing (WES) is an effective approach for detecting extremely rare protein-coding single-nucleotide variants (SNVs) and short insertions/deletions (INDELs). However, interpreting these variants’ functional and clinical consequences requires integrating multifaceted genomic information.

Methods We compared the concordance and effectiveness of three bioinformatics tools in detecting ASD candidate variants (SNVs and short INDELs) from WES data of 220 ASD family trios registered in the National Autism Database of Israel. We studied only rare (<1% population frequency) proband-specific variants. According to the American College of Medical Genetics (ACMG) guidelines, the pathogenicity of variants was evaluated by the InterVar and TAPES tools. In addition, likely gene-disrupting (LGD) variants were detected based on an in-house bioinformatics tool, Psi-Variant, that integrates results from seven in-silico prediction tools.

Results Overall, 605 variants in 499 genes distributed in 193 probands were detected by these tools. The overlap between the tools was 64.1%, 17.0%, and 21.6% for InterVar–TAPES, InterVar–Psi-Variant, and TAPES–Psi-Variant, respectively. The intersection between InterVar and Psi-Variant (I ∩ P) was the most effective approach in detecting variants in known ASD genes (OR = 5.38, 95% C.I. = 3.25–8.53), while the union of InterVar and Psi-Variant (I ∪ P) achieved the highest diagnostic yield (30.9%).

Conclusions Our results suggest that integrating different variant interpretation approaches in detecting ASD candidate variants from WES data is superior to each approach alone. The inclusion of additional criteria could further improve the detection of ASD candidate variants.

Background

Autism spectrum disorder (ASD) comprises a collection of heterogeneous neurodevelopmental disorders that share two behavioral characteristics: difficulties in social communication and restricted, repetitive behaviors and interests^1,2. The etiology of ASD has a significant genetic component, as is evident from multiple twin and family studies^3–6. Yet, over the years, very few genetic causes of ASD have been discovered; thus, despite extensive research, the overall genetic architecture of ASD remains obscure^6,7.

The emergence of next-generation sequencing (NGS) approaches in the past decade has transformed the genetic research of complex traits⁸. These NGS technologies have facilitated high-throughput DNA sequencing for large cohorts of patients, allowing the comparison of multiple variant types, including single-nucleotide variants (SNVs) and short insertions/deletions (INDELs), between large groups of patients^9–12. In this realm, whole-exome sequencing (WES) is particularly suitable for studying the genetics of heterogeneous traits such as ASD, as it focuses on a relatively limited number of protein-coding variants^9,10,13–18.

However, understanding the functional consequences of coding variants is not a trivial task. In 2015, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) published standards and guidelines to generalize sequence variant interpretation and to address the issue of inconsistent interpretation across laboratories⁸. The resulting system for classifying variants recommends 28 criteria (16 for pathogenic and 12 for benign variants) and provides a set of scoring rules based on variant population allele frequency, variant functional annotation, variant familial segregation, etc.^8,19. Variants are classified as pathogenic (P), likely pathogenic (LP), variants of uncertain significance (VUS), likely benign (LB), or benign (B). Subsequently, multiple in-silico tools were developed to implement these ACMG/AMP criteria for annotating the prospective pathogenicity of variants detected in WES studies.

While the ACMG/AMP scoring approach is highly effective for detecting de-novo highly penetrant mutations for rare Mendelian disorders, it is less suitable for detecting inherited partially penetrant variants²⁰. Such variants, usually annotated as VUS in terms of the ACMG/AMP criteria, are expected to contribute significantly to the risk of developing neurodevelopmental conditions, including ASD^{9,17,18,21,22}. Thus, relying solely on the ACMG/AMP criteria for variant annotation in WES studies of ASD may result in an under-representation of susceptibility variants, which will lead to a lower diagnostic yield for ASD. To overcome this potential limitation, we have developed “Psi-Variant,” a pipeline to detect different types of likely gene-disrupting (LGD) variants, including protein-truncating and deleterious missense variants. We applied Psi-Variant, in comparison with InterVar and TAPES (two variant interpretation tools that use the ACMG/AMP criteria), to a large WES dataset of ASD to evaluate the concordance between these tools in detecting variants and to assess their effectiveness in detecting ASD susceptibility variants.

Methods

Study sample

Initially, the study sample comprised 250 children diagnosed with ASD who are registered in the National Autism Database of Israel (NADI)^23,24 and whose parents gave consent for participation in this study. Based on our clinical records, none of the parents in the study has been diagnosed with ASD, intellectual disability, or any other type of neurodevelopmental disorder. Genomic DNA was extracted from saliva samples from participating children and their parents using Oragene®•DNA (OG-500/575) collection kits (DNA Genotek, Canada).

Whole exome sequencing

WES was performed on the above-mentioned samples using the Illumina Nextera exome capture kit and Illumina HiSeq sequencers at the Broad Institute as part of the Autism Sequencing Consortium, as described previously¹¹. Sequencing reads were aligned to Genome Reference Consortium Human Build 38 and aggregated into BAM/CRAM files, which were analyzed using the Genome Analysis Toolkit (GATK)²⁵ to generate a joint variant call format (vcf) file for all subjects in the study. We excluded data for 30 probands from the raw vcf file due to incomplete pedigree information or low-quality WES data. Thus, WES data for 220 ASD trios were analyzed in this study (Fig. 1).

Fig. 1

Analysis workflow for detecting LP/P/LGD variants from the WES data. InterVar and TAPES detected LP/P variants by implementing ACMG/AMP criteria. Psi-Variant detected LGD variants by utilizing in-house criteria.

Data analysis

The variant detection process in this study is outlined in Fig. 1 and explained below.

Data cleaning

The raw vcf file contained 1,935,632 variants. From this file, we removed variants with missing genotypes, variants in regions with low read coverage (≤ 20 reads), and variants with low genotype quality (GQ ≤ 50). In addition, we removed all common variants (i.e., those with a population minor allele frequency > 1%)²⁶ as well as those that did not pass GATK’s “VQSR” and “ExcessHet” filters. Thereafter, we used an in-house machine learning (ML) algorithm to remove other potentially false-positive variants. The details of this ML algorithm and its efficiency in classifying true-positive and false-positive variants are summarized in supplementary file S1. Finally, we used the pedigree structure of the families to identify proband-specific genotypes, including de-novo variants, recessively inherited variants, and X-linked variants (in males). Recessively inherited variants were defined as autosomal variants that occur at the same locus in both copies of a gene, where both parents are carriers, whereas X-linked variants were defined as a single altered copy of a gene on chromosome X in males. From these genotypes, we removed multiallelic variants and variants classified as “de-novo” that appeared in more than two individuals in the sample. Compound heterozygous variants (in cis/trans) were not considered in this study.
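The filtering thresholds and the pedigree-based genotype classes described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the encoding of genotypes as alternate-allele counts (0, 1, 2), and the requirement that an X-linked variant in a male proband be inherited from a heterozygous mother are assumptions made for illustration. The GATK filters and the ML-based false-positive filter are not modeled.

```python
from typing import Optional

MIN_DEPTH = 20   # variants covered by <= 20 reads are removed
MIN_GQ = 50      # variants with genotype quality <= 50 are removed
MAX_MAF = 0.01   # variants with population MAF > 1% are removed


def passes_quality(depth: int, gq: int, maf: float) -> bool:
    """Apply the coverage, genotype-quality, and rarity thresholds."""
    return depth > MIN_DEPTH and gq > MIN_GQ and maf <= MAX_MAF


def classify_proband_genotype(chrom: str, proband: int, father: int,
                              mother: int, proband_is_male: bool) -> Optional[str]:
    """Classify a trio genotype; each genotype is an alternate-allele count.

    Returns 'de_novo', 'autosomal_recessive', 'x_linked_male', or None when
    the genotype is not proband-specific under these (assumed) rules.
    """
    if proband == 0:
        return None  # proband does not carry the variant
    if father == 0 and mother == 0:
        return "de_novo"  # absent from both parents
    on_x = chrom in ("X", "chrX")
    if not on_x and proband == 2 and father == 1 and mother == 1:
        return "autosomal_recessive"  # both parents are carriers
    if on_x and proband_is_male and mother == 1 and father == 0:
        return "x_linked_male"  # assumed: inherited from a carrier mother
    return None


print(passes_quality(depth=35, gq=99, maf=0.0001))
print(classify_proband_genotype("chr2", 2, 1, 1, proband_is_male=False))
```

Under this encoding, a variant carried by a parent and transmitted in the heterozygous state on an autosome returns None, which mirrors the study's focus on proband-specific genotypes.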

Identifying ASD candidate variants

We searched for candidate ASD variants using three complementary approaches. First, we applied InterVar¹⁹ and TAPES²⁷, two commonly used, publicly available command-line tools that implement the ACMG/AMP criteria⁸, to detect LP/P variants. In addition, we assigned the ACMG/AMP PS2 criterion to all the de-novo variants to detect additional LP/P variants from the list of VUS. Since InterVar and TAPES are less sensitive in detecting recessive likely gene-disrupting (LGD) variants²⁰, we developed an integrated in-house tool, Psi-Variant, to detect LGD variants. The Psi-Variant workflow starts by using Ensembl’s Variant Effect Predictor (VEP)²⁶ to annotate the functional consequences of each variant in a multi-sample vcf file. Then, all frameshift indels, nonsense, and splice acceptor/donor variants are further analyzed by LoFtool²⁸, and those with scores of < 0.25 are annotated as intolerant variants. In addition, Psi-Variant applies six different in-silico tools to all missense substitutions and annotates them as “deleterious/damaging” if at least three (≥ 50%) of the tools meet the following cutoffs: SIFT²⁹ (< 0.05), PolyPhen-2³⁰ (≥ 0.15), CADD³¹ (> 20), REVEL³² (> 0.50), M_CAP³³ (> 0.025), and MPC³⁴ (≥ 2). These scores were extracted from the dbNSFP database³⁵.
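The Psi-Variant decision rules described above can be sketched in Python as follows. This is an illustrative sketch, not the code in the Psi-Variant repository: the dictionary keys, function names, and the choice to treat a missing score as a non-damaging vote are assumptions. Only the cutoffs and the three-of-six rule come from the text.

```python
from typing import Dict, Optional

# Cutoffs as stated in the text; each rule returns True for a "damaging" vote.
MISSENSE_RULES = {
    "SIFT": lambda s: s < 0.05,
    "PolyPhen2": lambda s: s >= 0.15,
    "CADD": lambda s: s > 20,
    "REVEL": lambda s: s > 0.50,
    "M_CAP": lambda s: s > 0.025,
    "MPC": lambda s: s >= 2,
}
MIN_DAMAGING_VOTES = 3   # at least three of the six tools
LOFTOOL_CUTOFF = 0.25    # LoFtool scores below this are "intolerant"


def is_deleterious_missense(scores: Dict[str, Optional[float]]) -> bool:
    """Count damaging votes; a missing score casts no vote (assumption)."""
    votes = 0
    for tool, rule in MISSENSE_RULES.items():
        score = scores.get(tool)
        if score is not None and rule(score):
            votes += 1
    return votes >= MIN_DAMAGING_VOTES


def is_intolerant_lof(loftool_score: Optional[float]) -> bool:
    """Flag frameshift, nonsense, or splice variants by LoFtool score."""
    return loftool_score is not None and loftool_score < LOFTOOL_CUTOFF


example = {"SIFT": 0.01, "PolyPhen2": 0.90, "CADD": 27.5,
           "REVEL": 0.30, "M_CAP": None, "MPC": 1.1}
print(is_deleterious_missense(example))  # three damaging votes
print(is_intolerant_lof(0.12))
```

Keeping the cutoffs in a single table makes it easy to see how changing one threshold or the vote count would shift the balance between sensitivity and false positives.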

Comparison between InterVar, TAPES, and Psi-Variant

We compared the number of variants detected by the three tools and the percentages of variants detected by different combinations. Thereafter, we used the list of ASD genes (n = 1031) from the SFARI Gene database³⁶ (accessed on 11 January 2022) as the gold standard to compute the odds ratio (OR) and positive predictive value (PPV) for detecting candidate ASD variants in SFARI genes. In addition, we assessed the detection yield for each tool combination by computing the proportion of children with detected candidate ASD variants in SFARI genes.
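As a hedged illustration of the two effectiveness measures, the sketch below computes a PPV and an odds ratio with a 95% confidence interval from a 2×2 table of detected versus undetected variants in gold-standard (SFARI) versus other genes. The paper does not state how its confidence intervals were derived; the Woolf (log) method used here is an assumption, and the counts are hypothetical rather than taken from the study.

```python
import math
from typing import Tuple


def ppv(detected_in_gold: int, detected_total: int) -> float:
    """Fraction of detected variants that fall in gold-standard genes."""
    return detected_in_gold / detected_total


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a: detected, in gold-standard gene     b: detected, other gene
    c: not detected, in gold-standard gene d: not detected, other gene
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
a, b, c, d = 10, 90, 5, 95
estimate, lower, upper = odds_ratio_ci(a, b, c, d)
print(f"PPV = {ppv(a, a + b):.3f}")
print(f"OR = {estimate:.2f} (95% C.I. = {lower:.2f}-{upper:.2f})")
```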

Software

Data storage, management, and analysis were conducted on a high-performance computing cluster in a Linux environment using Python version 3.5 and RStudio version 1.1.456. All the statistical analyses and data visualizations were performed in RStudio.

Results

Detection of candidate variants by the different tools

A total of 605 variants in 193 probands (highlighted in the supplementary Table S2) were detected by at least one of InterVar (n = 220), TAPES (n = 199), or Psi-Variant (n = 483) from a dataset of 2,035 high-quality, ultra-rare variants with proband-specific genotypes (Fig. 1). Of these, 90 variants (14.9%) were detected by all three tools. The highest concordance in detected variants was observed between InterVar and TAPES (64.3%), followed by TAPES and Psi-Variant (21.6%) and InterVar and Psi-Variant (17.0%).
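The pairwise concordance figures can be read as set overlaps. The sketch below assumes that concordance is the size of the intersection divided by the size of the union of two tools' variant sets; this definition is an assumption, although it is consistent with the three-tool figure reported above (90 of the 605 variants in the union, i.e., 14.9%). The variant identifiers are toy values.

```python
from typing import Set


def concordance(a: Set[str], b: Set[str]) -> float:
    """Percentage of variants shared by two tools, relative to their union."""
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Toy variant identifiers, for illustration only.
intervar = {"v1", "v2", "v3"}
tapes = {"v2", "v3", "v4"}
print(concordance(intervar, tapes))  # 2 shared out of 4 distinct variants

# Three-tool share quoted in the text: 90 shared variants out of 605 in total.
print(round(100 * 90 / 605, 1))
```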

The characteristics of the detected variants are shown in Table 1. Significantly higher rates of LoF and missense variants were detected by all three tools compared to the rates of these variants in the clean vcf file (P < 0.001). As expected, missense variants comprised the majority of detected variants, with 81.6%, 58.8%, and 51.4% of the variants detected by Psi-Variant, TAPES, and InterVar, respectively. Notably, a higher number of frameshift variants were detected by Psi-Variant than by InterVar and TAPES (58 vs. 39 and 22, respectively), but the percentages of these variants out of the total number of detected variants were lower due to the markedly higher number of variants detected by Psi-Variant.

Table 1 Characteristics of the variants detected by InterVar, TAPES, and Psi-Variant from the WES data

Almost all (≥ 95%) variants detected by either InterVar or TAPES were de-novo variants, while de-novo variants comprised only 36.2% of the variants detected by Psi-Variant, which also detected a high proportion of X-linked and autosomal recessive variants (37.1% and 26.7%, respectively). Examination of the distribution of the detected variants in genes associated with ASD according to the SFARI Gene database³⁶ revealed a two-fold enrichment of variants in ASD genes (for all detection tools) compared to their proportion in the clean vcf file, and an even higher enrichment of variants in high-confidence ASD genes (P < 0.001).

Effectiveness of ASD candidate variant detection

To assess the effectiveness of the different tools in detecting ASD candidate variants, we calculated the PPV and the OR for detecting ASD genes (i.e., those listed in the SFARI Gene database³⁶) for different combinations of the three tools. The results of these analyses are shown in Fig. 2. Utilization of any of the three tools resulted in a significant enrichment of ASD genes, with the highest enrichment being observed in variants detected by InterVar (PPV = 0.178; OR = 4.10, 95% confidence interval (C.I.) = 2.77–5.90), followed by TAPES (PPV = 0.158; OR = 3.53, 95% C.I. = 2.28–5.27) and Psi-Variant (PPV = 0.143; OR = 3.21, 95% C.I. = 2.39–4.22). Notably, better performance in detecting ASD candidate variants was obtained at the intersection of the variants detected by InterVar and Psi-Variant (I ∩ P) (PPV = 0.222; OR = 5.38, 95% C.I. = 3.25–8.53). The I ∩ P combination was also the most effective in detecting variants in high-confidence ASD genes (i.e., those with a score of 1 in the SFARI Gene database³⁶) (Fig. 2A–2B). However, the I ∩ P combination had a relatively low diagnostic yield of 9.1% for SFARI genes. On the other hand, the union of InterVar and Psi-Variant (I ∪ P) achieved a diagnostic yield of 30.9% (Fig. 2C), more than three times that of I ∩ P, but had reduced effectiveness in detecting variants in SFARI genes (PPV = 0.141; OR = 3.18, 95% C.I. = 2.43–4.10) (Fig. 2A–2B).
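The trade-off between the intersection and the union of two tools can be sketched as set operations over variant calls. In the Python sketch below, diagnostic yield is taken to be the percentage of probands with at least one detected variant in a gold-standard gene, following the definition given in the Methods. The tuple layout, gene names, and proband identifiers are hypothetical.

```python
from typing import Set, Tuple

Call = Tuple[str, str, str]  # (proband ID, variant ID, gene symbol)


def diagnostic_yield(calls: Set[Call], gold_genes: Set[str],
                     n_probands: int) -> float:
    """Percentage of probands with at least one call in a gold-standard gene."""
    solved = {proband for proband, _variant, gene in calls if gene in gold_genes}
    return 100.0 * len(solved) / n_probands


# Hypothetical calls from two tools in a cohort of four probands.
intervar_calls = {("P1", "v1", "GENE_A"), ("P2", "v2", "GENE_X")}
psi_calls = {("P1", "v1", "GENE_A"), ("P3", "v3", "GENE_B")}
gold = {"GENE_A", "GENE_B"}

union_yield = diagnostic_yield(intervar_calls | psi_calls, gold, n_probands=4)
intersection_yield = diagnostic_yield(intervar_calls & psi_calls, gold, n_probands=4)
print(union_yield, intersection_yield)
```

The union can only add probands and the intersection can only remove them, which is why the union gives the higher yield while the intersection gives the more specific call set.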

Fig. 2

Effectiveness of InterVar (I), TAPES (T), Psi-Variant (P), and their combinations in detecting candidate variants in ASD genes. A Positive predictive value (PPV) of detecting candidate variants in SFARI 1 and all SFARI genes. B Odds Ratios (ORs) of detecting candidate variants in SFARI 1 and all SFARI genes. C Diagnostic yield (%) achieved by the different tool combinations for detecting candidate variants in SFARI 1 and all SFARI genes.

Discussion

In this study, we assessed the concordance and effectiveness of three bioinformatics tools in the interpretation of variants detected in the WES of children with ASD. There was better agreement in variant detection between InterVar and TAPES than between Psi-Variant and each of these two tools, probably because both InterVar and TAPES are based on the ACMG/AMP guidelines⁸, while Psi-Variant uses the interpretation of seven in-silico tools in assessing the functional consequences of LGD variants. In addition, most (94%) of the variants detected by either InterVar or TAPES were de-novo variants, compared to only 36% of the variants detected by Psi-Variant. This difference may be attributed to the fact that the ACMG/AMP guidelines are particularly designed to detect de-novo highly penetrant variants, while inherited variants (autosomal recessive and X-linked) are usually classified as VUS²⁰. Importantly, such rare inherited variants have been found to be associated with a variety of neurodevelopmental conditions, including ASD^{9,17,18,21,22}. Another major difference between these tools lies in the detection of in-frame insertions/deletions, which comprised ∼20% of the variants detected by either InterVar or TAPES, while such variants were discarded by Psi-Variant. We decided to exclude these variants from Psi-Variant because their clinical relevance has been demonstrated in several genetic disorders^37,38 but not in ASD^39–41.

Another important factor that could affect the concordance between the three tools is the annotation tools they use. Specifically, both InterVar and TAPES use AnnoVar⁴² for their variant annotation, while Psi-Variant uses Ensembl’s VEP²⁶. It has already been shown that AnnoVar and VEP have a low concordance in the classification of LoF variants⁴³. In addition, each tool, InterVar, TAPES, and Psi-Variant, utilizes a different set of in-silico tools for the classification of missense variants, with SIFT²⁹ alone being shared by all three tools. These differences are probably the reason for the large differences in the detection of missense variants between the three tools (Table 1).

Today, there are no accepted guidelines for the detection of ASD susceptibility variants from WES data. Many genetic labs use the ACMG/AMP guidelines⁸, leading to a relatively low diagnostic yield^44,45. Our findings suggest that different combinations of bioinformatics tools for variant interpretation may improve the detection of ASD susceptibility variants. Furthermore, combining these tools provides more flexibility in selecting the desired proportion between the detection yield and false positives. Thus, future guidelines for the detection of ASD susceptibility variants should consider the integration of different variant interpretation criteria.

Of note, many of the variants detected by the integrative pipeline affect genes with no known association with ASD, according to the SFARI Gene database³⁶. This finding highlights the capability of the integrative pipeline to detect novel ASD genes. Obviously, the association of these genes and variants with ASD susceptibility needs to be validated in additional studies.

The results of this study should be considered under the following limitations. First, the effectiveness assessments of the different tools and their combinations were based on ASD genes from the SFARI Gene database³⁶. While this is the most commonly used database for ASD genes and is continuously updated, it is based on data curated from the literature and may thus include genes falsely associated with ASD. Second, the variant detection analyses were performed on WES data of a cohort from the Israeli population, which may not necessarily be representative of the genetic architecture of ASD. Third, the tools used in this study were designed to detect only extremely rare variants with relatively large functional effects. Thus, a more effective approach for the detection of ASD susceptibility variants should also include the interpretation of other types of genomic variations, such as copy-number and compound heterozygote variants^46–51, as well as other variants with milder functional effects^17,52,53. Finally, it should be noted that there are many other approaches for variant interpretation from WES data. Thus, it is possible that combinations of other approaches would be more effective in the detection of ASD susceptibility variants from WES data than the approaches investigated in this study.

Conclusions

Our findings suggest that a combination of different bioinformatics tools is more effective in the detection of ASD candidate variants from WES data than each of the examined tools alone. Future guidelines for the detection of ASD susceptibility variants should consider integrating different variant interpretation approaches to improve the detection of ASD candidate variants from whole exome sequencing data.

Declarations

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of Soroka University Medical Center (SOR-076-15; 17 April 2016).

Ethics approval and consent to participate

Informed consent was obtained from all the families involved in the study.

Consent to publication

Not applicable.

Availability of data and materials

WES data were generated as part of the ASC and are available in dbGaP with study accession: phs000298.v4.p3. The generated results and codes are available in a GitHub public repository: https://github.com/AppWick-hub/Psi-Variant or available upon reasonable request to the corresponding author Prof. Idan Menashe (idanmen{at}bgu.ac.il).

Competing interests

The authors declare no competing interests.

Funding

This study was supported by a grant from the Israel Science Foundation (1092/21).

Authors’ contributions

Conceptualization: A.S. and I.M.; methodology: A.S. and I.M.; software: A.S. and L.L.; validation: A.S. and I.M.; formal analysis: A.S.; resources: H.G., G.M., A.M., Y.T., A.A. and I.D.; data curation: A.S.; writing—original draft preparation: A.S. and I.M.; writing— review and editing: I.M., and A.S.; supervision: I.M.; project administration: I.M.; funding acquisition: I.M. All the authors have read and agreed to the published version of the manuscript.

Acknowledgments

We thank the families who participated in this research; without their contributions, genetic studies would be impossible.

Footnotes

Additional results on tool-specific diagnostic yield (Fig. 2C) have been added. In addition, a list of the LP/P/LGD variants (N = 605) detected by at least one of InterVar, TAPES, and Psi-Variant has been provided (Supplementary Table S2).

List of abbreviations

ACMG/AMP: American College of Medical Genetics and Genomics/Association for Molecular Pathology
ASD: autism spectrum disorder
C.I.: confidence interval
GATK: Genome Analysis Toolkit
LGD: likely gene disrupting
LoF: loss of function
LP: likely pathogenic
ML: machine learning
NADI: National Autism Database of Israel
NGS: next-generation sequencing
OR: odds ratio
P: pathogenic
PPV: positive predictive value
SNV: single-nucleotide variant
VEP: Variant Effect Predictor
vcf: variant call format
VUS: variants of uncertain significance
WES: whole exome sequencing.

References

1. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, 5th edn. American Psychiatric Publishing (2013).
2. Lai, M.-C., Lombardo, M. V. & Baron-Cohen, S. Autism. Lancet (2014).
3. Yoo, H. Genetics of Autism Spectrum Disorder: Current Status and Possible Clinical Applications. Exp Neurobiol 24, 257 (2015).
4. Lord, C. et al. Autism spectrum disorder. The Lancet 392, 508–520 (2018).
5. Ronald, A. & Hoekstra, R. A. Autism spectrum disorders and autistic traits: A decade of new twin studies. American Journal of Medical Genetics, Part B: Neuropsychiatric Genetics 156, 255–274 (2011).
6. Hallmayer, J. et al. Genetic Heritability and Shared Environmental Factors Among Twin Pairs With Autism. Arch Gen Psychiatry 68, 1095–1102 (2011).
7. Devlin, B. et al. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485, 242–246 (2012).
8. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine 17, 405–424 (2015).
9. Satterstrom, F. K. et al. Autism spectrum disorder and attention deficit hyperactivity disorder have a similar burden of rare protein-truncating variants. Nat Neurosci 22, 1961–1965 (2019).
10. Fu, J. M. et al. Rare coding variation provides insight into the genetic architecture and phenotypic context of autism. Nat Genet 54, (2022).
11. Satterstrom, F. K. et al. Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism. Cell 1–17 (2020) doi:10.1016/j.cell.2019.12.036.
12. Wu, D. et al. Large-Scale Whole-Genome Sequencing of Three Diverse Asian Populations in Singapore. Cell 179, 736–749.e15 (2019).
13. Satterstrom, F. K. et al. Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism. Cell 1–17 (2020) doi:10.1016/j.cell.2019.12.036.
14. Feliciano, P. et al. Exome sequencing of 457 autism families recruited online provides evidence for autism risk genes. NPJ Genom Med 4, (2019).
15. De Rubeis, S. et al. Synaptic, transcriptional and chromatin genes disrupted in autism. Nature 515, 209–215 (2014).
16. Ishay, R. T. et al. Diagnostic Yield and Economic Implications of Whole-Exome Sequencing for ASD Diagnosis in Israel. (2022).
17. Wang, T., Zhao, P. A. & Eichler, E. E. Rare variants and the oligogenic architecture of autism. Trends in Genetics 1–9 (2022) doi:10.1016/j.tig.2022.03.009.
18. Doan, R. N. et al. Recessive gene disruptions in autism spectrum disorder. Nat Genet 51, (2019).
19. Li, Q. & Wang, K. InterVar: Clinical Interpretation of Genetic Variants by the 2015 ACMG-AMP Guidelines. Am J Hum Genet 100, 267–280 (2017).
20. Houge, G. et al. Stepwise ABC system for classification of any type of genetic variant. European Journal of Human Genetics 30, 150–159 (2022).
21. Wilfert, A. B. et al. Recent ultra-rare inherited variants implicate new autism candidate risk genes. Nat Genet 53, 1125–1134 (2021).
22. Halvorsen, M. et al. Exome sequencing in obsessive–compulsive disorder reveals a burden of rare damaging coding variants. Nat Neurosci 24, 1071–1076 (2021).
23. Dinstein, I. et al. The National Autism Database of Israel: a Resource for Studying Autism Risk Factors, Biomarkers, Outcome Measures, and Treatment Efficacy. Journal of Molecular Neuroscience 70, 1303–1312 (2020).
24. Meiri, G. et al. Brief Report: The Negev Hospital-University-Based (HUB) Autism Database. J Autism Dev Disord 47, 2918–2926 (2017).
25. McKenna, A. et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20, 1297 (2010).
26. McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol 17, 1–14 (2016).
27. Xavier, A., Scott, R. J. & Talseth-Palmer, B. A. TAPES: A tool for assessment and prioritisation in exome studies. PLoS Comput Biol 15, 1–9 (2019).
28. Fadista, J., Oskolkov, N., Hansson, O. & Groop, L. LoFtool: A gene intolerance score based on loss-of-function variants in 60 706 individuals. Bioinformatics 33, 471–474 (2017).
29. Ng, P. C. & Henikoff, S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res 31, 3812–3814 (2003).
30. Adzhubei, I., Jordan, D. M. & Sunyaev, S. R. Predicting functional effect of human missense mutations using PolyPhen-2. Current Protocols in Human Genetics vol. 2 (2013).
31. Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: Predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res 47, D886–D894 (2019).
32. Ioannidis, N. M. et al. REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants. Am J Hum Genet 99, 877–885 (2016).
33. Jagadeesh, K. A. et al. M-CAP eliminates a majority of variants of uncertain significance in clinical exomes at high sensitivity. Nat Genet 48, 1581–1586 (2016).
34. Samocha, K. E. et al. Regional missense constraint improves variant deleteriousness prediction. bioRxiv (2017) doi:10.1101/148353.
35. Liu, X., Li, C., Mou, C., Dong, Y. & Tu, Y. dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Med 12, 1–8 (2020).
36. Abrahams, B. S. et al. SFARI Gene 2.0: A community-driven knowledgebase for the autism spectrum disorders (ASDs). Mol Autism 4, 2–4 (2013).
37. Sergouniotis, P. I. et al. The role of small in-frame insertions/deletions in inherited eye disorders and how structural modelling can help estimate their pathogenicity. Orphanet J Rare Dis 11, 1–8 (2016).
38. Sallah, S. R. et al. Assessing the Pathogenicity of In-Frame CACNA1F Indel Variants Using Structural Modeling. Journal of Molecular Diagnostics 24, 1232–1239 (2022).
39. Iossifov, I. et al. De Novo Gene Disruptions in Children on the Autistic Spectrum. Neuron 74, 285–299 (2012).
40. Dong, S. et al. De novo insertions and deletions of predominantly paternal origin are associated with autism spectrum disorder. Cell Rep 9, 16–23 (2014).
41. Kopp, N., Amarillo, I., Martinez-Agosto, J. & Quintero-Rivera, F. Pathogenic paternally inherited NLGN4X deletion in a female with autism spectrum disorder: Clinical, cytogenetic, and molecular characterization. Am J Med Genet A 185, 894–900 (2021).
42. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38, 1–7 (2010).
43. McCarthy, D. J. et al. Choice of transcripts and software has a large effect on variant annotation. Genome Med 6, (2014).
44. Trost, B. et al. Genomic architecture of autism from comprehensive whole-genome sequence annotation. Cell 185, 4409–4427.e18 (2022).
45. Tammimies, K. et al. Molecular diagnostic yield of chromosomal microarray analysis and whole-exome sequencing in children with autism spectrum disorder. JAMA - Journal of the American Medical Association 314, 895–903 (2015).
46. Husson, T. et al. Rare genetic susceptibility variants assessment in autism spectrum disorder: detection rate and practical use. Transl Psychiatry 10, (2020).
47. Turner, T. N. et al. Genomic Patterns of De Novo Mutation in Simplex Autism. Cell 171, 710–722.e12 (2017).
48. Leppa, V. M. M. et al. Rare Inherited and De Novo CNVs Reveal Complex Contributions to ASD Risk in Multiplex Families. Am J Hum Genet 99, 540–554 (2016).
49. Krumm, N. et al. Excess of rare, inherited truncating mutations in autism. Nat Genet 47, 582–588 (2015).
50. Lin, B. D. et al. The role of rare compound heterozygous events in autism spectrum disorder. doi:10.1038/s41398-020-00866-7.
51. Tuncay, I. O. et al. The genetics of autism spectrum disorder in an East African familial cohort. Cell Genomics 3, 100322 (2023).
52. Du, Y. et al. Nonrandom occurrence of multiple de novo coding variants in a proband indicates the existence of an oligogenic model in autism. Genetics in Medicine 22, 170–180 (2020).
53. Guo, H. et al. Genome sequencing identifies multiple deleterious variants in autism patients with more severe phenotypes. Genetics in Medicine 21, 1611–1620 (2019).

Posted August 16, 2023.

Subject Area

Genetic and Genomic Medicine

