Abstract
Background Single-nucleotide variants (SNVs) within gene coding sequences can significantly impact pre-mRNA splicing, bearing profound implications for pathogenic mechanisms and precision medicine. However, reliable splicing analysis often faces practical limitations, especially when the relevant tissues are challenging to access. While in silico predictions are valuable, they alone do not meet clinical classification standards. In this study, we aim to harness the well-established full-length gene splicing assay (FLGSA) in conjunction with SpliceAI to prospectively interpret the splicing effects of all potential coding SNVs within the four-exon SPINK1 gene, a gene associated with chronic pancreatitis.
Results We began with a retrospective correlation analysis of 27 previously FLGSA-analyzed SPINK1 coding SNVs, proceeded to a prospective correlation analysis of 35 newly FLGSA-tested SPINK1 coding SNVs, then extrapolated the data, and ended with further validation. In total, we analyzed 67 SPINK1 coding SNVs, representing 9.3% of all 720 possible coding SNVs and affecting 19.2% of the 240 coding nucleotides. Of these 67 FLGSA-analyzed SNVs, 12 were found to impact splicing: nine produced both wild-type and aberrant transcripts, while the remaining three generated aberrant transcripts exclusively. These splice-altering SNVs were predominantly concentrated in exons 1 and 2, particularly at the first and/or last coding nucleotide of each exon. Eleven of the 12 were missense variants, constituting 2.17% of the 506 potential missense variants, and one was synonymous, accounting for 0.61% of the 164 potential synonymous variants. Through extensive cross-correlation of the FLGSA-obtained and SpliceAI-predicted data, we extrapolated that none of the 653 unanalyzed coding SNVs in the SPINK1 gene is likely to exert a significant effect on splicing.
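The proportions quoted in the Results follow directly from the reported counts. The short Python sketch below recomputes them as a consistency check; the variable names and the `pct` helper are illustrative and are not taken from the study.

```python
# Counts as reported in the abstract (variable names are illustrative).
coding_nt = 240                      # coding nucleotides in SPINK1
total_snvs = 720                     # all possible coding SNVs
analyzed = 67                        # SNVs analyzed by FLGSA
splice_altering = 12                 # SNVs found to impact splicing
missense_total, missense_hits = 506, 11
synonymous_total, synonymous_hits = 164, 1


def pct(part, whole, digits=2):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


summary = {
    # Each nucleotide can change to three alternative bases.
    "possible_snvs": coding_nt * 3,
    "analyzed_pct": pct(analyzed, total_snvs, 1),
    "unanalyzed": total_snvs - analyzed,
    "splice_altering_pct": pct(splice_altering, total_snvs),
    "missense_pct": pct(missense_hits, missense_total),
    "synonymous_pct": pct(synonymous_hits, synonymous_total),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The recomputed values agree with the figures given in the text (720 possible SNVs, 9.3% analyzed, 653 unanalyzed, and 1.67%, 2.17% and 0.61% for the splice-altering fractions).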
Conclusions Integrating FLGSA with SpliceAI, we conclude that less than 2% (1.67%) of all possible SPINK1 coding SNVs have a discernible influence on splicing outcomes. Our findings underscore the importance of performing splicing analysis in the broader genomic sequence context of the study gene, highlight the inherent uncertainties associated with intermediate SpliceAI scores (i.e., those ranging from 0.20 to 0.80), and have general implications for the shift from “retrospective” to “prospective” analysis in terms of variant classification.
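As a rough illustration of the intermediate score range mentioned above, the Python sketch below bins a SpliceAI score relative to the 0.20 and 0.80 bounds. Only those two bounds come from the text; the function name, the "low" and "high" labels, and the choice to treat both bounds as inclusive are assumptions made for the example.

```python
def spliceai_band(score: float) -> str:
    """Bin a SpliceAI score relative to the 0.20-0.80 intermediate range.

    The labels and the inclusive handling of the two bounds are
    assumptions for illustration, not definitions from the study.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("SpliceAI scores are expected to lie in [0, 1]")
    if score < 0.20:
        return "low"
    if score <= 0.80:
        return "intermediate"
    return "high"


# Example scores (made up for demonstration only).
for s in (0.05, 0.20, 0.55, 0.80, 0.93):
    print(s, spliceai_band(s))
```

Under this sketch, a variant falling in the "intermediate" band would be one whose prediction carries the uncertainty noted in the Conclusions and would be a natural candidate for experimental testing.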
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This research was funded by the National Natural Science Foundation of China (81800569 to HW, 82000611 to J-HL, and 82000606 to X-YT), the Shanghai Pujiang Program (2020PJD061 to J-HL), and the Shanghai Sailing Program (20YF1459400 to X-YT). Support for this study also came from the Institut National de la Sante et de la Recherche Medicale (INSERM), the Association des Pancreatites Chroniques Hereditaires and the Association Gaetan Saleun, France.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
* These authors share co-first authorship.
Data Availability
All data produced in the present work are contained in the manuscript.