RT Journal Article
SR Electronic
T1 An end-to-end workflow for statistical analysis and inference of large-scale biomedical datasets
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2020.01.09.20017095
DO 10.1101/2020.01.09.20017095
A1 Heidari, Elyas
A1 Sadeghi, Mohammad Amin
A1 Balazadeh-Meresht, Vahid
A1 Ahmadi, Nastaran
A1 Sadr, Mahmoud
A1 Sharifi-Zarchi, Ali
A1 Mirzaei, Masoud
YR 2020
UL http://medrxiv.org/content/early/2020/01/23/2020.01.09.20017095.abstract
AB As medical and epidemiological studies have grown larger in scale, the challenges associated with extracting useful and relevant information from their data have mounted. General health surveys provide a good example of such studies, as they usually cover large populations and are conducted over long periods in multiple locations. The challenges associated with interpreting the results of such studies include: the presence of both categorical and continuous variables and the need to compare them within a single statistical framework; the presence of variations in data resulting from technical limitations in data collection; the danger of selection and information biases in hypothesis-directed study design and implementation; and the complete inadequacy of p values in identifying significant relationships. As a solution to these challenges, we propose an end-to-end analysis workflow using the MUltivariate analysis and VISualization (MUVIS) package within the R statistical software. MUVIS consists of a comprehensive set of statistical tools that follow the basic tenet of unbiased exploration of associations within a dataset. We validate its performance by applying MUVIS to data from the Yazd Health Study (YaHS). YaHS is a prospective cohort study consisting of a general health survey of more than 30 health-related measurements and a questionnaire with over 300 questions, acquired from 10050 participants. Given the nature of the YaHS dataset, most of the identified associations are corroborated by a large body of medical literature. Nevertheless, some more interesting and less investigated connections were also found, and these are presented here.
We conclude that MUVIS provides a robust statistical framework for extraction of useful and relevant information from medical datasets and their visualization in easily comprehensible ways. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work was not funded by any academic/scientific institution. Author Declarations: All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript (Yes). All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived (Yes). I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance) (Yes). I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable (Yes). The source code for the MUVIS package in R: https://github.com/bAIo-lab/muvis ; the source code for the application of MUVIS on the YaHS dataset: https://github.com/vdblm/YaHS ; the YaHS questionnaire: http://www.yahs-ziba.com/index.php/colaboration/study-catalogue