RT Journal Article
SR Electronic
T1 An artificial neural network approach integrating plasma proteomics and genetic data identifies PLXNA4 as a new susceptibility locus for pulmonary embolism
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2020.10.05.20207001
DO 10.1101/2020.10.05.20207001
A1 Razzaq, Misbah
A1 Iglesias, Maria Jesus
A1 Ibrahim-Kosta, Manal
A1 Goumidi, Louisa
A1 Soukarieh, Omar
A1 Proust, Carole
A1 Roux, Maguelonne
A1 Suchon, Pierre
A1 Boland, Anne
A1 Daiain, Delphine
A1 Olaso, Robert
A1 Butler, Lynn
A1 Deleuze, Jean-François
A1 Odeberg, Jacob
A1 Morange, Pierre-Emmanuel
A1 Trégouët, David-Alexandre
YR 2020
UL http://medrxiv.org/content/early/2020/10/06/2020.10.05.20207001.abstract
AB Venous thromboembolism is the third most common cardiovascular disease and is composed of two entities, deep vein thrombosis (DVT) and its fatal form, pulmonary embolism (PE). While PE is observed in ∼40% of patients with documented DVT, there are limited biomarkers that can help identify patients at high PE risk. To fill this need, we implemented a two-hidden-layer artificial neural network (ANN) on 376 antibodies and 19 biological traits measured in the plasma of 1388 DVT patients, with or without PE, of the MARTHA study. We used the LIME algorithm to obtain a linear approximation of the resulting ANN prediction model. As MARTHA patients were genotyped with DNA arrays, a genome-wide association study (GWAS) was conducted on the LIME estimate. Detected single nucleotide polymorphisms (SNPs) were tested for association with PE risk in MARTHA. Main findings were replicated in the EOVT study composed of 143 PE patients and 196 DVT-only patients. The derived ANN model for PE achieved an accuracy of 0.89 and 0.79 in our training and testing sets, respectively. 
A GWAS on the LIME approximation identified a strong statistical association peak (p = 5.3×10⁻⁷) at the PLXNA4 locus, with lead SNP rs1424597, at which the minor A allele was further shown to associate with an increased risk of PE (OR = 1.49 [1.12 – 1.98], p = 6.1×10⁻³). Further association analysis in EOVT revealed that, in the combined MARTHA and EOVT samples, the rs1424597-A allele was associated with increased PE risk (OR = 1.74 [1.27 – 2.38], p = 5.42×10⁻⁴) in patients over 37 years of age but not in younger patients (OR = 0.96 [0.65 – 1.41], p = 0.848). Using an original integrated proteomics and genetics strategy, we identified PLXNA4 as a new susceptibility gene for PE whose exact role now needs to be further elucidated. Author Summary: Pulmonary embolism is a severe and potentially fatal condition characterized by the presence of a blood clot (or thrombus) in the pulmonary artery. Pulmonary embolism is often the consequence of the migration of a thrombus from a deep vein to the lung. Together with deep vein thrombosis, pulmonary embolism forms venous thromboembolism, the third most common cardiovascular disease, whose prevalence strongly increases with age. While pulmonary embolism is observed in ∼40% of patients with deep vein thrombosis, there are currently limited biomarkers that can help predict which patients with deep vein thrombosis are at risk of pulmonary embolism. Here we deployed an artificial intelligence-based methodology integrating both plasma proteomics and genetics data to identify novel biomarkers for PE. We thus identified the PLXNA4 gene as a novel molecular player involved in the pathophysiology of pulmonary embolism. 
In particular, using two independent cohorts totalling 1,881 patients with venous thromboembolism, among whom 467 experienced pulmonary embolism, we identified a genetic polymorphism in the PLXNA4 gene that associates with a ∼2-fold increased risk of pulmonary embolism in patients older than ∼40 years. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: Mi.R, O.S, Ma.R, and the production of the MARTHA genomics data were financially supported by the GENMED Laboratory of Excellence on Medical Genomics [ANR-10-LABX-0013], a research program managed by the National Research Agency (ANR) as part of the French Investment for the Future. This work benefited from financial support from the EPIDEMIOM-VTE Senior Chair from the Initiative of Excellence of the University of Bordeaux. Bioinformatics and statistical analyses benefited from the CBiB computing centre of the University of Bordeaux. The proteomics screening was financed by a grant from Stockholm County Council (SLL 2017-0842) and from Familjen Erling Perssons Foundation. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Each individual study on which the work is based was approved by its institutional ethics committee and informed written consent was obtained in accordance with the Declaration of Helsinki. 
Ethics approvals were obtained from the Departement santé de la direction générale de la recherche et de l’innovation du ministère (Projects DC: 2008-880 & 09.576) and from the institutional ethics committees of the Kremlin-Bicetre Hospital. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. Data are available upon specific request to corresponding authors.