Guidelines For Rigorous Evaluation of Clinical LLMs For Conversational Reasoning
Shreya Johri, Jaehwan Jeong, Benjamin A. Tran, Daniel I. Schlessinger, Shannon Wongvibulsin, Zhuo Ran Cai, Roxana Daneshjou, Pranav Rajpurkar
doi: https://doi.org/10.1101/2023.09.12.23295399
Shreya Johri
1 Department of Biomedical Informatics, Harvard Medical School, Boston, United States
Jaehwan Jeong
1 Department of Biomedical Informatics, Harvard Medical School, Boston, United States
4 Department of Computer Science, Stanford University, Stanford, United States
Benjamin A. Tran, MD
5 Medstar Georgetown University Hospital/Washington Hospital Center, Department of Dermatology, Washington, DC, United States
Daniel I. Schlessinger, MD
6 Department of Dermatology, Northwestern University, Chicago, IL, United States
Shannon Wongvibulsin, MD PhD
7 Division of Dermatology, David Geffen School of Medicine at the University of California, Los Angeles, California, United States
Zhuo Ran Cai, MD
3 Department of Dermatology, Stanford University, Stanford, United States
Roxana Daneshjou, MD PhD
2 Department of Biomedical Data Science, Stanford University, Stanford, United States
3 Department of Dermatology, Stanford University, Stanford, United States
Pranav Rajpurkar, PhD
1 Department of Biomedical Informatics, Harvard Medical School, Boston, United States
Data Availability
The data used in this study are available in the following repository: https://github.com/rajpurkarlab/craft-md
Posted January 23, 2024.
Guidelines For Rigorous Evaluation of Clinical LLMs For Conversational Reasoning
Shreya Johri, Jaehwan Jeong, Benjamin A. Tran, Daniel I. Schlessinger, Shannon Wongvibulsin, Zhuo Ran Cai, Roxana Daneshjou, Pranav Rajpurkar
medRxiv 2023.09.12.23295399; doi: https://doi.org/10.1101/2023.09.12.23295399