RT Journal Article
SR Electronic
T1 An end-to-end workflow for statistical analysis and inference of large-scale biomedical datasets
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2020.01.09.20017095
DO 10.1101/2020.01.09.20017095
A1 Heidari, Elyas
A1 Sadeghi, Mohammad Amin
A1 Balazadeh-Meresht, Vahid
A1 Ahmadi, Nastaran
A1 Sadr, Mahmoud
A1 Sharifi-Zarchi, Ali
A1 Mirzaei, Masoud
YR 2020
UL http://medrxiv.org/content/early/2020/01/23/2020.01.09.20017095.abstract
AB Over time, as medical and epidemiological studies have grown larger in scale, the challenges associated with extracting useful and relevant information from these data have mounted. General health surveys provide a good example of such studies, as they usually cover large populations and are conducted over long periods in multiple locations. The challenges associated with interpreting the results of such studies include: the presence of both categorical and continuous variables and the need to compare them within a single statistical framework; variations in the data resulting from technical limitations in data collection; the danger of selection and information biases in hypothesis-directed study design and implementation; and the complete inadequacy of p values in identifying significant relationships. As a solution to these challenges, we propose an end-to-end analysis workflow using the MUltivariate analysis and VISualization (MUVIS) package within the R statistical software. MUVIS consists of a comprehensive set of statistical tools that follow the basic tenet of unbiased exploration of associations within a dataset. We validate its performance by applying MUVIS to data from the Yazd Health Study (YaHS). YaHS is a prospective cohort study consisting of a general health survey of more than 30 health-related measurements and a questionnaire with over 300 questions, collected from 10050 participants.
Given the nature of the YaHS dataset, most of the identified associations are corroborated by a large body of medical literature. Nevertheless, some more interesting and less investigated connections were also found, and these are presented here. We conclude that MUVIS provides a robust statistical framework for extracting useful and relevant information from medical datasets and visualizing it in easily comprehensible ways.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was not funded by any academic/scientific institution.
Author Declarations:
All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript. Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
- The source code for the MUVIS package in R: https://github.com/bAIo-lab/muvis
- The source code for the application of MUVIS on the YaHS dataset: https://github.com/vdblm/YaHS
- The YaHS questionnaire: http://www.yahs-ziba.com/index.php/colaboration/study-catalogue