Leveraging Large Language Models for Identifying Interpretable Linguistic Markers and Enhancing Alzheimer’s Disease Diagnostics

Tingyu Mo; Jacqueline C. K. Lam; Victor O. K. Li; Lawrence Y. L. Cheung

doi:10.1101/2024.08.22.24312463

Abstract

Alzheimer’s disease (AD) is a progressive and irreversible neurodegenerative disorder. Early detection of AD is crucial for timely disease intervention. This study proposes a novel LLM frame-work, which extracts interpretable linguistic markers from LLM models and incorporates them into supervised AD detection models, while evaluating their model performance and interpretability. Our work consists of the following novelties: First, we design in-context few-shot and zero-shot prompting strategies to facilitate LLMs in extracting high-level linguistic markers discriminative of AD and NC, providing interpretation and assessment of their strength, reliability and relevance to AD classification. Second, we incorporate linguistic markers extracted by LLMs into a smaller AI-driven model to enhance the performance of downstream supervised learning for AD classification, by assigning higher weights to the high-level linguistic markers/features extracted from LLMs. Third, we investigate whether the linguistic markers extracted by LLMs can enhance theaccuracy and interpretability of the downstream supervised learning-based models for AD detection. Our findings suggest that the accuracy of the LLM-extracted linguistic markers-led supervised learning model is less desirable as compared to their counterparts that do not incorporate LLM-extracted markers, highlighting the tradeoffs between interpretability and accuracy in supervised AD classification. Although the use of these interpretable markers may not immediately lead to improved detection accuracy, they significantly improve medical diagnosis and trustworthiness. These interpretable markers allow healthcare professionals to gain a deeper understanding of the linguistic changes that occur in individuals with AD, enabling them to make more informed decisions and provide better patient care.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study did not receive any funding

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

https://dementia.talkbank.org/ADReSSo-2021/

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

All data produced are available online at https://dementia.talkbank.org/ADReSSo-2021/

The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.