Skip to main content
medRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search

Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications

David Restrepo, Chenwei Wu, Sebastián Andrés Cajas, Luis Filipe Nakayama, Leo Anthony Celi, Diego M López
doi: https://doi.org/10.1101/2024.06.03.24308401
David Restrepo
1Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
2Departamento de Telemática, Universidad del Cauca, Popayán, Cauca, Colombia
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • For correspondence: davidres{at}mit.edu
Chenwei Wu
3Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, United States of America
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Sebastián Andrés Cajas
4John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, Massachusetts, United States of America
5School of Computer Science, University College Dublin, Belfield, Dublin, Ireland
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Luis Filipe Nakayama
1Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
6Department of Ophthalmology, São Paulo Federal University, São Paulo, São Paulo, Brazil
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Leo Anthony Celi
1Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
7Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, Massachusetts, United States of America
8Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Diego M López
2Departamento de Telemática, Universidad del Cauca, Popayán, Cauca, Colombia
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Data/Code
  • Preview PDF
Loading

Abstract

Objective Large-scale multi-modal deep learning models and datasets have revolutionized various domains such as healthcare, underscoring the critical role of computational power. However, in resource-constrained regions like Low and Middle-Income Countries (LMICs), GPU and data access is limited, leaving many dependent solely on CPUs. To address this, we advocate leveraging vector embeddings for flexible and efficient computational methodologies, aiming to democratize multimodal deep learning across diverse contexts.

Background and Significance Our paper investigates the computational efficiency and effectiveness of leveraging vector embeddings, extracted from single-modal foundation models and multi-modal Vision-Language Models (VLM), for multimodal deep learning in low-resource environments, particularly in health-care applications. Additionally, we propose an easy but effective inference-time method to enhance performance by further aligning image-text embeddings.

Materials and Methods By comparing these approaches with traditional multimodal deep learning methods, we assess their impact on computational efficiency and model performance using accuracy, F1-score, inference time, training time, and memory usage across 3 medical modalities such as BRSET (ophthalmology), HAM10000 (dermatology), and SatelliteBench (public health).

Results Our findings indicate that embeddings reduce computational demands without compromising the model’s performance, and show that our embedding alignment method improves the performance of the models in medical tasks.

Discussion This research contributes to sustainable AI practices by optimizing computational resources in resource-constrained environments. It highlights the potential of embedding-based approaches for efficient multimodal learning.

Conclusion Vector embeddings democratize multimodal deep learning in LMICs, especially in healthcare. Our study showcases their effectiveness, enhancing AI adaptability in varied use cases.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This research received no external funding.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

All the data used in this study is publicly available.

https://physionet.org/content/brazilian-ophthalmological/1.0.0/

https://physionet.org/content/multimodal-satellite-data/1.0.0/

https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Back to top
PreviousNext
Posted June 04, 2024.
Download PDF
Data/Code
Email

Thank you for your interest in spreading the word about medRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications
(Your Name) has forwarded a page to you from medRxiv
(Your Name) thought you would like to see this page from the medRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications
David Restrepo, Chenwei Wu, Sebastián Andrés Cajas, Luis Filipe Nakayama, Leo Anthony Celi, Diego M López
medRxiv 2024.06.03.24308401; doi: https://doi.org/10.1101/2024.06.03.24308401
Twitter logo Facebook logo LinkedIn logo Mendeley logo
Citation Tools
Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications
David Restrepo, Chenwei Wu, Sebastián Andrés Cajas, Luis Filipe Nakayama, Leo Anthony Celi, Diego M López
medRxiv 2024.06.03.24308401; doi: https://doi.org/10.1101/2024.06.03.24308401

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • Health Informatics
Subject Areas
All Articles
  • Addiction Medicine (349)
  • Allergy and Immunology (668)
  • Allergy and Immunology (668)
  • Anesthesia (181)
  • Cardiovascular Medicine (2648)
  • Dentistry and Oral Medicine (316)
  • Dermatology (223)
  • Emergency Medicine (399)
  • Endocrinology (including Diabetes Mellitus and Metabolic Disease) (942)
  • Epidemiology (12228)
  • Forensic Medicine (10)
  • Gastroenterology (759)
  • Genetic and Genomic Medicine (4103)
  • Geriatric Medicine (387)
  • Health Economics (680)
  • Health Informatics (2657)
  • Health Policy (1005)
  • Health Systems and Quality Improvement (985)
  • Hematology (363)
  • HIV/AIDS (851)
  • Infectious Diseases (except HIV/AIDS) (13695)
  • Intensive Care and Critical Care Medicine (797)
  • Medical Education (399)
  • Medical Ethics (109)
  • Nephrology (436)
  • Neurology (3882)
  • Nursing (209)
  • Nutrition (577)
  • Obstetrics and Gynecology (739)
  • Occupational and Environmental Health (695)
  • Oncology (2030)
  • Ophthalmology (585)
  • Orthopedics (240)
  • Otolaryngology (306)
  • Pain Medicine (250)
  • Palliative Medicine (75)
  • Pathology (473)
  • Pediatrics (1115)
  • Pharmacology and Therapeutics (466)
  • Primary Care Research (452)
  • Psychiatry and Clinical Psychology (3432)
  • Public and Global Health (6527)
  • Radiology and Imaging (1403)
  • Rehabilitation Medicine and Physical Therapy (814)
  • Respiratory Medicine (871)
  • Rheumatology (409)
  • Sexual and Reproductive Health (410)
  • Sports Medicine (342)
  • Surgery (448)
  • Toxicology (53)
  • Transplantation (185)
  • Urology (165)