medRxiv

Ensemble learning for higher diagnostic precision in schizophrenia using peripheral blood gene expression profile

Vipul Vilas Wagh, Suchita Agrawal, Shruti Purohit, Tejaswini Pachpor, Leelavati Narlikar, Vasudeo Paralikar, Satyajeet Khare
doi: https://doi.org/10.1101/2023.02.11.23285788
Vipul Vilas Wagh
1Symbiosis School of Biological Sciences, Symbiosis International (Deemed University), Pune 412115, India
Suchita Agrawal
2Psychiatry Unit, KEM Hospital Research Centre, Pune 411011, India
Shruti Purohit
2Psychiatry Unit, KEM Hospital Research Centre, Pune 411011, India
Tejaswini Pachpor
3MES Abasaheb Garware College, Pune 411004, India
Leelavati Narlikar
4Department of Data Science, Indian Institute of Science Education and Research, Pune 411008, India
Vasudeo Paralikar
2Psychiatry Unit, KEM Hospital Research Centre, Pune 411011, India
  • For correspondence: paralikarv2010{at}gmail.com
Satyajeet Khare
1Symbiosis School of Biological Sciences, Symbiosis International (Deemed University), Pune 412115, India
  • For correspondence: satyajeetkhare{at}gmail.com

Abstract

The need for molecular biomarkers for schizophrenia is well recognized. Peripheral blood gene expression profiling and machine learning (ML) tools have recently become popular for biomarker discovery. Because a diagnosis of schizophrenia carries stigma, diagnostic models need high precision, that is, few false positives. In this study, we propose a strategy to develop higher-precision ML models using ensemble learning. We performed a meta-analysis using peripheral blood gene expression microarray data. The ML models, support vector machines (SVM) and prediction analysis for microarrays (PAM), were developed using differentially expressed genes as features. The ensemble of SVM-radial and PAM predicted test samples with a precision of 81.33% (SD: 0.078). The precision of the ensemble model was significantly higher than that of SVM-radial (63.83%, SD: 0.081) and PAM (66.89%, SD: 0.097). The feature genes identified were enriched for biological processes such as response to stress, response to stimulus, regulation of the immune system, and metabolism of organic nitrogen compounds. The network analysis of feature genes identified PRF1, GZMB, IL2RB, ITGAL, and IL2RG as hub genes. Additionally, the ensemble model developed using microarray data classified the RNA-Sequencing samples with moderately high precision (72.00%, SD: 0.08). The pipeline developed in this study allows the classification of a single microarray or RNA-Sequencing sample. In summary, this study developed robust models for clinical application and suggests ensemble learning for higher diagnostic precision in psychiatric disorders.
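
One way an ensemble of two classifiers can raise precision is a consensus rule: a sample is called positive only when both base models agree. The abstract does not state which combination rule the authors used, so the sketch below is an assumption for illustration, not the study's method. The function names `consensus_vote` and `precision` and all label vectors are hypothetical toy values, not data from the study.

```python
from typing import List, Sequence


def consensus_vote(pred_a: Sequence[int], pred_b: Sequence[int]) -> List[int]:
    """Call a sample positive (1) only when both base models call it positive."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    return [1 if a == 1 and b == 1 else 0 for a, b in zip(pred_a, pred_b)]


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Precision = TP / (TP + FP); 0.0 when no sample is predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy labels (1 = schizophrenia, 0 = control), invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
svm_pred = [1, 1, 1, 0, 1, 1, 0, 0]  # stand-in for SVM-radial calls
pam_pred = [1, 1, 0, 1, 0, 1, 1, 0]  # stand-in for PAM calls

ensemble_pred = consensus_vote(svm_pred, pam_pred)

for name, pred in [("SVM", svm_pred), ("PAM", pam_pred), ("ensemble", ensemble_pred)]:
    print(f"{name}: precision = {precision(y_true, pred):.3f}")
```

In this toy case each base model makes two false-positive calls, but only one of them is shared, so the consensus rule removes the others. The cost is that true positives on which the models disagree are also dropped, which lowers recall.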

Research highlights

  • An ensemble of Support Vector Machines (SVM) and Prediction Analysis for Microarrays (PAM) classified schizophrenia samples with higher precision than either algorithm alone.

  • The pipeline developed in this analysis produced robust models that can classify a single microarray sample.

  • Cross-platform validation of the ensemble model using RNA-Sequencing data resulted in moderately high precision (72.00%).

Figure: Graphical abstract. Blood-based SCZ diagnosis using ensemble learning for higher precision

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

The study was funded by an intramural research grant (MjRP/19-20/1516) from Symbiosis Centre for Research & Innovation (SCRI), SIU, Pune, India. The first author of this study received research fellowships from UGC, New Delhi, to carry out this work.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

KEM Hospital Research Centre Ethics Committee (KEMHRC ID No. 2001) and Symbiosis International (Deemed University) Independent Ethics Committee (SIU/IEC/99) approved the study protocol. Informed consent was obtained from all the participants. The consent for participants with schizophrenia was supported by the consent of a first-degree relative. Clinical interviews were administered by a trained psychiatrist and a psychologist. The diagnosis was confirmed by a senior psychiatrist. All the participants were compensated for their travel and time.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • 1. External test data validation: The title of section 3.4, "Cross-platform validation of ensemble models," has been changed to "External cross-platform validation of the ensemble models."
  • 2. RNA-Sequencing QC included: The FastQC report has been included as a supplementary Figure 9 and the alignment percentage as a supplementary Table 1.
  • 3. Results updated for 20 RNA-Seq samples: The results in section 3.4 and Figure 4 have been updated for 20 samples.
  • 4. Updated Title: The title of the manuscript is updated to "Ensemble learning for higher diagnostic precision in schizophrenia using peripheral blood gene expression profile"

Data Availability

Data availability: The RNA-Sequencing data of this study will be available from the corresponding authors upon publication. Code availability: The R scripts used for the analysis are available on GitHub (https://github.com/macdlab/2023_VW_SCZ_Ensemble)


Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted May 26, 2023.
Ensemble learning for higher diagnostic precision in schizophrenia using peripheral blood gene expression profile
Vipul Vilas Wagh, Suchita Agrawal, Shruti Purohit, Tejaswini Pachpor, Leelavati Narlikar, Vasudeo Paralikar, Satyajeet Khare
medRxiv 2023.02.11.23285788; doi: https://doi.org/10.1101/2023.02.11.23285788

Subject Area

  • Psychiatry and Clinical Psychology