medRxiv

A corpus of GA4GH Phenopackets: case-level phenotyping for genomic diagnostics and discovery

Daniel Danis, Michael J Bamshad, Yasemin Bridges, Pilar Cacheiro, Leigh C Carmody, Jessica X Chong, Ben Coleman, Raymond Dalgleish, Peter J Freeman, Adam S L Graefe, Tudor Groza, Julius O B Jacobsen, Adam Klocperk, Maaike Kusters, Markus S Ladewig, Anthony J Marcello, Teresa Mattina, Christopher J Mungall, Monica C Munoz-Torres, Justin T Reese, Filip Rehburg, Bárbara C S Reis, Catharina Schuetz, Damian Smedley, Timmy Strauss, Jagadish Chandrabose Sundaramurthi, Sylvia Thun, Kyran Wissink, John F Wagstaff, David Zocche, Melissa A Haendel, Peter N Robinson
doi: https://doi.org/10.1101/2024.05.29.24308104
Daniel Danis
1The Jackson Institute for Genomic Medicine, 10 Discovery Drive, Farmington CT 06032, USA
2Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
Michael J Bamshad
3Department of Pediatrics, Division of Genetic Medicine, University of Washington, 1959 NE Pacific Street, Box 357371, Seattle, WA 98195, USA
4Brotman-Baty Institute for Precision Medicine, 1959 NE Pacific Street, Box 357657, Seattle WA 98195, USA
5Department of Pediatrics, Division of Genetic Medicine, Seattle Children’s Hospital, Seattle, WA 98195, USA
Yasemin Bridges
6William Harvey Research Institute, Queen Mary University of London, London, UK
Pilar Cacheiro
6William Harvey Research Institute, Queen Mary University of London, London, UK
Leigh C Carmody
1The Jackson Institute for Genomic Medicine, 10 Discovery Drive, Farmington CT 06032, USA
Jessica X Chong
3Department of Pediatrics, Division of Genetic Medicine, University of Washington, 1959 NE Pacific Street, Box 357371, Seattle, WA 98195, USA
4Brotman-Baty Institute for Precision Medicine, 1959 NE Pacific Street, Box 357657, Seattle WA 98195, USA
Ben Coleman
7Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, USA
1The Jackson Institute for Genomic Medicine, 10 Discovery Drive, Farmington CT 06032, USA
Raymond Dalgleish
8Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
Peter J Freeman
9Division of Informatics, Imaging and Data Science, The University of Manchester, Manchester, UK
Adam S L Graefe
2Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
Tudor Groza
10Rare Care Centre, Perth Children’s Hospital, Nedlands, WA 6009, Australia
11SingHealth Duke-NUS Institute of Precision Medicine, 5 Hospital Drive Level 9, Singapore 169609, Singapore
12Telethon Kids Institute, Nedlands, WA 6009, Australia
Julius O B Jacobsen
6William Harvey Research Institute, Queen Mary University of London, London, UK
Adam Klocperk
13Department of Immunology, 2nd Faculty of Medicine, Charles University and University Hospital in Motol, Prague, Czech Republic
Maaike Kusters
14Department of Paediatric Immunology, Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK
15University College London Institute of Child Health, London, United Kingdom
Markus S Ladewig
16Department of Ophthalmology, University Clinic Marburg - Campus Fulda, Fulda, Germany
Anthony J Marcello
3Department of Pediatrics, Division of Genetic Medicine, University of Washington, 1959 NE Pacific Street, Box 357371, Seattle, WA 98195, USA
Teresa Mattina
17Medical Genetics, University of Catania, Italy
18Morgagni foundation and Clinic, Catania, Italy
Christopher J Mungall
19Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Monica C Munoz-Torres
20Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus
Justin T Reese
19Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Filip Rehburg
2Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
Bárbara C S Reis
21Department of Immunology, National Institute of Women’s, Children’s and Adolescents’ Health Fernandes Figueira, Rio de Janeiro, Brazil
22High Complexity Laboratory, National Institute of Women’s, Children’s and Adolescents’ Health Fernandes Figueira, Rio de Janeiro, Brazil
Catharina Schuetz
23Department of Pediatrics, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
24University Center for Rare Diseases, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
Damian Smedley
6William Harvey Research Institute, Queen Mary University of London, London, UK
Timmy Strauss
23Department of Pediatrics, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
24University Center for Rare Diseases, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
Jagadish Chandrabose Sundaramurthi
1The Jackson Institute for Genomic Medicine, 10 Discovery Drive, Farmington CT 06032, USA
Sylvia Thun
2Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
Kyran Wissink
2Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
25Utrecht University, Utrecht, the Netherlands
John F Wagstaff
26University of Leicester, Leicester, UK
David Zocche
27North West Thames Regional Genetics Service, Northwick Park & St Mark’s Hospitals, London, UK
Melissa A Haendel
28University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Peter N Robinson
2Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
1The Jackson Institute for Genomic Medicine, 10 Discovery Drive, Farmington CT 06032, USA
29ELLIS-European Laboratory for Learning and Intelligent Systems
  • For correspondence: peter.robinson{at}jax.org

Summary

The Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema was released in 2022 and approved by ISO as a standard for sharing clinical and genomic information about an individual, including phenotypic descriptions, numerical measurements, genetic information, diagnoses, and treatments. A phenopacket can be used as an input file for software that supports phenotype-driven genomic diagnostics and for algorithms that facilitate patient classification and stratification for identifying new diseases and treatments. There has been a great need for a collection of phenopackets to test software pipelines and algorithms. Here, we present phenopacket-store. Version 0.1.12 of phenopacket-store includes 4916 phenopackets representing 277 Mendelian and chromosomal diseases associated with 236 genes and 2872 unique pathogenic alleles, curated from 605 different publications. This is the first large-scale collection of case-level, standardized phenotypic information derived from case reports in the literature with detailed descriptions of the clinical data. It will be useful for many purposes, including the development and testing of software for prioritizing genes and diseases in diagnostic genomics, machine learning analysis of clinical phenotype data, patient stratification, and genotype-phenotype correlations. The corpus also provides best-practice examples for curating literature-derived data using the GA4GH Phenopacket Schema.
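
To illustrate how a phenopacket might serve as an input file for downstream software, the following is a minimal sketch in Python that reads one phenopacket in JSON form and separates observed from excluded phenotypic features. It uses only the standard library. The embedded JSON is a hand-written, hypothetical example and is not taken from phenopacket-store; the `summarize` helper is likewise an illustrative name. The field names (`phenotypicFeatures`, `type`, `excluded`, `diseases`, `term`) are assumed to follow the JSON serialization of version 2 of the GA4GH Phenopacket Schema.

```python
import json

# Hypothetical, hand-written phenopacket for illustration only.
# It is NOT a record from phenopacket-store.
EXAMPLE_JSON = """
{
  "id": "example-case-1",
  "subject": {"id": "individual-1"},
  "phenotypicFeatures": [
    {"type": {"id": "HP:0001166", "label": "Arachnodactyly"}},
    {"type": {"id": "HP:0001083", "label": "Ectopia lentis"}, "excluded": true}
  ],
  "diseases": [
    {"term": {"id": "OMIM:154700", "label": "Marfan syndrome"}}
  ]
}
"""


def summarize(phenopacket: dict) -> dict:
    """Collect observed and excluded phenotype term IDs and disease IDs."""
    observed = []
    excluded = []
    for feature in phenopacket.get("phenotypicFeatures", []):
        term_id = feature["type"]["id"]
        # A missing "excluded" field is treated as "observed" here.
        if feature.get("excluded", False):
            excluded.append(term_id)
        else:
            observed.append(term_id)
    diseases = [d["term"]["id"] for d in phenopacket.get("diseases", [])]
    return {
        "id": phenopacket.get("id"),
        "observed": observed,
        "excluded": excluded,
        "diseases": diseases,
    }


summary = summarize(json.loads(EXAMPLE_JSON))
print(summary)
```

A phenotype-driven prioritization tool would typically consume the observed and excluded term lists together with variant data; this sketch stops at extracting them and makes no claim about how any particular tool processes the corpus.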

Competing Interest Statement

Dr. Haendel is a founder of Alamya Health.

Funding Statement

Research reported in this publication was supported by the National Human Genome Research Institute (NHGRI) at the National Institutes of Health (NIH) under award nos. 1RM1HG010860 and 5U24HG011449 and by the National Institute of Child Health and Human Development (NICHD) at the NIH under award number 5R01HD103805.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This study used only data that was available in published case reports or case series (605 publications with PubMed identifiers).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted May 29, 2024.
medRxiv 2024.05.29.24308104; doi: https://doi.org/10.1101/2024.05.29.24308104