Influence of Prior Probability Information on Large Language Model Performance in Radiological Diagnosis

Takahiro Fukushima; Ryo Kurokawa; Akifumi Hagiwara; Yuki Sonoda; Yusuke Asari; Mariko Kurokawa; Jun Kanzawa; Wataru Gonoi; Osamu Abe

doi:10.1101/2024.08.27.24312693

Abstract

Background Large language models (LLMs) show promise in radiological diagnosis, but their performance may be affected by the context of the cases presented.

Purpose To investigate how providing information about prior probabilities influences the diagnostic performance of an LLM in radiological quiz cases.

Materials and Methods We analyzed 322 consecutive cases from Radiology’s “Diagnosis Please” quiz using Claude 3.5 Sonnet under three conditions: without context (Condition 1), with the LLM informed that the cases were quiz cases (Condition 2), and with the cases presented as primary care cases (Condition 3). Diagnostic accuracy was compared between conditions using McNemar’s test.
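McNemar's test fits this design because the same 322 cases are scored under each condition, so only the cases whose correctness differs between two conditions carry information. The following is a minimal sketch of the exact (binomial) form of the test, not the authors' analysis code. The abstract does not say whether the exact or the chi-square version was used, and it does not report the discordant-pair counts, so the function name `mcnemar_exact` and the counts in the usage line are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases correct under condition A only (hypothetical input)
    c: cases correct under condition B only (hypothetical input)
    Cases with the same outcome under both conditions do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis, each discordant pair is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Illustrative counts only; these are NOT taken from the study.
p_value = mcnemar_exact(b=5, c=15)
print(f"p = {p_value:.4f}")
```

Feeding in per-case correctness for two conditions would require first tabulating the two discordant counts; the overall accuracies alone are not enough to reproduce the reported p-values.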

Results The overall accuracy rate significantly improved in Condition 2 compared to Condition 1 (70.2% vs. 64.9%, p=0.029). Conversely, the accuracy rate significantly decreased in Condition 3 compared to Condition 1 (59.9% vs. 64.9%, p<0.001).

Conclusion Providing context about prior probabilities significantly affects the diagnostic performance of the LLM in radiological cases. This suggests that LLMs may incorporate Bayesian-like principles in their diagnostic approach, highlighting the potential for optimizing LLM performance in clinical settings by providing relevant contextual information.
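To make "Bayesian-like" concrete, the sketch below applies Bayes' rule in odds form: posterior odds equal prior odds times the likelihood ratio of the imaging findings. It illustrates the principle only and is not a model of how the LLM computes its answers. The function name `posterior` and all numeric values are assumptions chosen for illustration; none come from the study.

```python
def posterior(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability of a diagnosis via Bayes' rule in odds form.

    prior: probability of the diagnosis before seeing the findings (0 < prior < 1)
    likelihood_ratio: P(findings | diagnosis) / P(findings | no diagnosis)
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers: the same findings (likelihood ratio 20) for a rare
# disease, read under two assumed priors.
quiz_prior = 0.05           # assumed: rare diseases are over-represented in quiz cases
primary_care_prior = 0.001  # assumed: the same disease is very uncommon in primary care

print(f"quiz setting:         {posterior(quiz_prior, 20):.3f}")
print(f"primary care setting: {posterior(primary_care_prior, 20):.3f}")
```

With identical findings, the lower assumed prior yields a much lower posterior for the rare diagnosis, which is the direction of effect reported when quiz cases were presented as primary care cases.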

Key Results The LLM’s overall accuracy improved from 64.9% to 70.2% when it was informed that the cases were quiz cases (p=0.029).

The LLM’s overall accuracy decreased from 64.9% to 59.9% when the cases were presented with an incorrect primary care context (p<0.001).

Results suggest LLMs may utilize Bayesian-like principles in diagnostic reasoning, similar to human radiologists.

Summary Statement Providing context about prior probabilities significantly influences the LLM’s diagnostic performance in radiological cases, suggesting potential for optimizing LLM use in clinical practice through contextual information.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study did not receive any funding.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

This study exclusively used data from previously published articles.

Abbreviations

LLM: large language model

The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.