RT Journal Article
SR Electronic
T1 A Comprehensive Study of GPT-4V’s Multimodal Capabilities in Medical Imaging
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.11.03.23298067
DO 10.1101/2023.11.03.23298067
A1 Li, Yingshu
A1 Liu, Yunyi
A1 Wang, Zhanyu
A1 Liang, Xinyu
A1 Liu, Lingqiao
A1 Wang, Lei
A1 Cui, Leyang
A1 Tu, Zhaopeng
A1 Wang, Longyue
A1 Zhou, Luping
YR 2023
UL http://medrxiv.org/content/early/2023/11/04/2023.11.03.23298067.abstract
AB This paper presents a comprehensive evaluation of GPT-4V’s capabilities across diverse medical imaging tasks, including Radiology Report Generation, Medical Visual Question Answering (VQA), and Visual Grounding. While prior efforts have explored GPT-4V’s performance in medical imaging, to the best of our knowledge, our study represents the first quantitative evaluation on publicly available benchmarks. Our findings highlight GPT-4V’s potential in generating descriptive reports for chest X-ray images, particularly when guided by well-structured prompts. However, its performance on the MIMIC-CXR dataset benchmark reveals areas for improvement in certain evaluation metrics, such as CIDEr. In the domain of Medical VQA, GPT-4V demonstrates proficiency in distinguishing between question types but falls short of prevailing benchmarks in terms of accuracy. Furthermore, our analysis reveals the limitations of conventional evaluation metrics like the BLEU score, advocating for the development of more semantically robust assessment methods. In the field of Visual Grounding, GPT-4V exhibits preliminary promise in recognizing bounding boxes, but its precision is lacking, especially in identifying specific medical organs and signs.
Our evaluation underscores the significant potential of GPT-4V in the medical imaging domain, while also emphasizing the need for targeted refinements to fully unlock its capabilities.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This study did not receive any funding.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study used ONLY openly available human data that were originally located at: https://physionet.org/content/mimic-cxr/2.0.0/.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
All data produced are available online at https://physionet.org/content/mimic-cxr/2.0.0/.