medRxiv

Accurate Skin Lesion Classification Using Multimodal Learning on the HAM10000 Dataset

Abdulmateen Adebiyi, Nader Abdalnabi, Emily Hoffman Smith, Jesse Hirner, Eduardo J. Simoes, Mirna Becevic, Praveen Rao
doi: https://doi.org/10.1101/2024.05.30.24308213
Abdulmateen Adebiyi
1Department of Electrical Engineering and Computer Science, University of Missouri, United States
MS
  • For correspondence: aaangd{at}umsystem.edu
Nader Abdalnabi
2MU Institute for Data Science and Informatics, University of Missouri, United States
MS
Emily Hoffman Smith
5Department of Dermatology, Saint Louis University, United States
MD
Jesse Hirner
3Department of Dermatology, University of Missouri, United States
MD
Eduardo J. Simoes
2MU Institute for Data Science and Informatics, University of Missouri, United States
4Department of Biomedical Informatics, Biostatistics and Medical Epidemiology, University of Missouri, United States
MD
Mirna Becevic
2MU Institute for Data Science and Informatics, University of Missouri, United States
3Department of Dermatology, University of Missouri, United States
PhD
Praveen Rao
1Department of Electrical Engineering and Computer Science, University of Missouri, United States
2MU Institute for Data Science and Informatics, University of Missouri, United States
PhD
  • For correspondence: aaangd{at}umsystem.edu

Abstract

Objectives Our aim is to evaluate the performance of multimodal deep learning to classify skin lesions using both images and textual descriptions compared to learning only on images.

Materials and Methods We used the HAM10000 dataset, which contains approximately 10,000 skin lesion images. We combined the images with patients’ data (sex, age, and lesion location) to train and evaluate a multimodal deep learning classification model. The dataset was split into 70% for training, 20% for validation, and 10% for testing. We compared the multimodal model’s performance to well-known deep learning models that use only images for classification.

Results We used accuracy and the area under the receiver operating characteristic curve (AUCROC) as the metrics to compare the models’ performance. Our multimodal model achieved the best accuracy (94.11%) and AUCROC (0.9426) among the models compared.

Conclusion Our study showed that a multimodal deep learning model can outperform traditional deep learning models for skin lesion classification on the HAM10000 dataset. We believe our approach can enable primary care clinicians to screen for skin cancer in patients (residing in areas lacking access to expert dermatologists) with higher accuracy and reliability.

Lay Summary Skin cancer, which includes basal cell carcinoma, squamous cell carcinoma, melanoma, and less frequent lesions, is the most common type of cancer. Around 9,500 people in the United States are diagnosed with skin cancer every day. Recently, multimodal learning has gained a lot of traction for classification tasks. Most previous works used only images for skin lesion classification. In this work, we used the images and patient metadata (sex, age, and lesion location) in HAM10000, a publicly available dataset, for multimodal deep learning to classify skin lesions. We used the ALBEF (Align before Fuse) model for multimodal deep learning. We compared the performance of ALBEF to well-known deep learning models that use only images (e.g., Inception-v3, DenseNet121, ResNet50). The ALBEF model outperformed all other models, achieving an accuracy of 94.11% and an AUCROC score of 0.9426 on HAM10000. We believe our model can enable primary care clinicians to accurately screen for skin cancer in patients.

Background and Significance

Skin cancer is the most common type of cancer diagnosed worldwide [1]. It is estimated that approximately 9,500 people in the United States (US) are diagnosed with skin cancer each day [2]. It is predicted that around 20% of people in the US will develop skin cancer [2]. The two most common skin cancer types are basal cell cancer and squamous cell cancer, while melanoma is the third most common [2]. However, melanoma has the highest mortality, with a long-term survival of less than 10%, despite a recent decline in mortality attributed to better treatment [3].

There are geographical differences in the incidence and mortality of skin cancer [4]. The incidence and mortality of melanoma are higher for individuals living in rural and underserved areas than for their urban counterparts [4]. Many factors may contribute to this geographical disparity in melanoma incidence and death, including increased ultraviolet radiation exposure and lower adoption of sun protection strategies among rural compared to urban residents [5]. In addition, barriers to health care access and availability contribute to late detection and delay effective treatment of skin cancers. The shortage and uneven distribution of dermatologists also contribute to late detection of melanoma. Patients face considerable wait times, ranging from 33.9 to 73.4 days, to consult a dermatologist about changing moles. Interestingly, even when medical care is offered at no cost, a significant number of patients decline it if the travel distance for their appointment exceeds 20 miles [6].

This situation is accentuated in isolated rural areas. Tele-dermatology has proven effective in mitigating geographic isolation, and various studies have affirmed the accuracy and reliability of diagnosis and treatment delivered through telemedicine. Nonetheless, numerous obstacles hinder its widespread adoption and implementation. Alongside privacy and liability concerns, dermatologists identify the absence of a consistent reimbursement system as the primary impediment for both store-and-forward and live-interactive tele-dermatology [7]. Karavan et al. found that 40% of patients diagnosed with melanoma in traditional clinics resided in areas where tele-dermatology services were underutilized [8].

Primary care clinicians (PCCs) based in the community often serve as the initial point of contact for patients and may assume a crucial role in offering screening and early diagnosis, particularly for individuals lacking sufficient access to dermatologists [7]. PCCs, while essential, face limitations compared to dermatologists in early detection and have identified insufficient training during medical school and residency as a hindrance to effective skin screening [9]. In primary care settings, the rapid and accurate diagnosis of skin conditions is of paramount importance for patient care [9]. However, distinguishing between various skin lesions can be a challenging task, and misdiagnosis can lead to serious consequences [10].

Prior work showed the effectiveness of state-of-the-art deep learning models for skin lesion classification (into malignant and benign classes) using dermoscopy images [11]. With growing interest in multimodal deep learning models [12], it is now possible to combine skin lesion images and textual data (e.g., lesion location, patient age) for model training and inference. In this work, we investigate whether multimodal models can improve the accuracy of skin cancer diagnosis compared to models that use only images. Our approach uses the Human Against Machine with 10000 training images (HAM10000) dataset, which comprises a diverse range of dermatological images [13], including basal cell cancer and melanoma.

Methods

Dataset

In this study, we used the well-known HAM10000 dataset [13], which is a large dermatoscopic image collection of common pigmented skin lesions. The images were categorized into 7 classes, namely, Actinic Keratoses (AKIEC), Basal Cell Carcinoma (BCC), Benign Keratosis (BKL), Dermatofibroma (DF), Melanocytic Nevi (NV), Melanoma (MEL), and Vascular Skin Lesion (VASC). Table 1 shows examples of skin lesion images from each class along with the number of images per class.

Table 1: Examples of skin lesions in HAM10000

In addition to the images, the dataset had 7 variables, which are described in Table 2.

Table 2: Brief description of the variables in HAM10000
Table 3: Different metrics used in our evaluation
Table 4: Comparison of the performance of the different models on the HAM10000 dataset (the best model is shown in bold)

Deep Learning Models

In recent years, deep learning has gained a lot of traction in image understanding, image classification, language translation, speech recognition, and natural language processing. Convolutional neural networks (CNNs) have shown excellent performance in large-scale image classification and object detection competitions such as the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [14]. We present a few popular CNNs and the multimodal model that we used in our work.

Inception-V3

Very deep CNNs with many layers are prone to overfitting and consume substantial computational resources. The Inception architecture was introduced by Szegedy et al. in 2014 [15] to address these problems by using sparsely connected architectures. It uses the inception module, which applies multiple convolutions (e.g., 1×1, 3×3, and 5×5 convolutions) and a max pooling layer in parallel. The outputs are concatenated to create the input for the next stage. The first version of the architecture, GoogLeNet, has 22 layers and won the ILSVRC 2014 competition [15]. Inception-V3 is a later refinement introduced in the paper “Rethinking the Inception Architecture for Computer Vision” [16]; it achieves around 78.1% accuracy on the ImageNet dataset.

ResNet

He et al. introduced the deep residual neural network (ResNet) architecture, which won first place in the ILSVRC 2015 competition [17]. ResNet uses skip connections between layers to mitigate the vanishing gradient problem: within each residual block, the gradient can flow directly backward through the skip connection from later layers to earlier ones. In our work, we use ResNet50, which has 50 layers.

DenseNet

Huang et al. introduced the DenseNet architecture [18], in which every layer in the model is connected to every other layer in a feed-forward manner. DenseNet was proposed to address the vanishing gradient problem while remaining computationally efficient. It promotes feature reuse, resulting in a more compact model. In our work, we use DenseNet121, which has 121 layers.

ALBEF

Li et al. introduced the ALBEF model [19]. It is a state-of-the-art deep learning model that learns a joint representation of image and text data. The model combines the Vision Transformer (ViT-B/16) as the image encoder and BERT as the text encoder. The image encoder is initialized with weights pre-trained on ImageNet-1k [20], and the input image is encoded into a sequence of embeddings.

In our work, we used the joint text-image encoder, which aligns the BERT text encoder’s embeddings with those of the image encoder (the Vision Transformer), to encode both the text and the images. We then added a linear fully connected layer on top of it to predict the 7 output classes: AKIEC, BCC, BKL, DF, NV, MEL, and VASC.
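As a rough illustration of the classification head described above, the following pure-Python sketch maps a fused embedding to 7 class logits with a single linear layer. The random vector stands in for the joint encoder's output, and the names `make_linear_head`, `linear_forward`, and `predict_class` are hypothetical; this is not the authors' implementation, only a minimal sketch under those assumptions.

```python
import random

CLASSES = ["AKIEC", "BCC", "BKL", "DF", "NV", "MEL", "VASC"]
EMBED_DIM = 768  # embedding dimension reported in the training settings


def make_linear_head(in_dim, out_dim, seed=0):
    """Create small random weights and zero biases (illustrative initialisation)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear_forward(weights, bias, x):
    """Compute logits = W x + b, one output per class."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def predict_class(logits):
    """Return the class name with the largest logit."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return CLASSES[best]


# Stand-in for the joint text-image embedding (assumption: one 768-d vector).
rng = random.Random(1)
fused_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]

weights, bias = make_linear_head(EMBED_DIM, len(CLASSES))
logits = linear_forward(weights, bias, fused_embedding)
predicted = predict_class(logits)
```

In a real pipeline the head's weights would be learned jointly with the encoder rather than drawn at random.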

Data Pre-processing

Each original image in HAM10000 was of size 600×450 pixels. As different deep learning models used different image sizes as input, we had to resize the original images. For ResNet50 and DenseNet121, the images were resized to 224×224 pixels. For Inception-V3, the images were resized to 299×299 pixels. Finally, for ALBEF, the images were resized to 256×256 pixels. We also applied data augmentation techniques such as color jitter, random rotation, random horizontal flip, and random vertical flip to improve the performance of the trained models and prevent overfitting.
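The resizing and flip augmentations can be sketched without any imaging library. The example below records the per-model input sizes stated above and applies flips to a tiny nested-list "image"; the flip probability `p` and the helper names are assumptions made for illustration, and a direct resize to a square target is assumed when computing scale factors.

```python
import random

# Input resolutions stated in the text, as (width, height).
INPUT_SIZE = {
    "ResNet50": (224, 224),
    "DenseNet121": (224, 224),
    "Inception-V3": (299, 299),
    "ALBEF": (256, 256),
}


def scale_factors(model, original=(600, 450)):
    """Horizontal and vertical scale when resizing directly to the model's input size."""
    target_w, target_h = INPUT_SIZE[model]
    return target_w / original[0], target_h / original[1]


def hflip(img):
    """Mirror left to right by reversing every row."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror top to bottom by reversing the row order."""
    return img[::-1]


def random_flips(img, rng, p=0.5):
    """Apply each flip independently with probability p (p is an assumed value)."""
    if rng.random() < p:
        img = hflip(img)
    if rng.random() < p:
        img = vflip(img)
    return img


tiny = [[1, 2], [3, 4]]
augmented = random_flips(tiny, random.Random(0))
sx, sy = scale_factors("ALBEF")
```

Because the two scale factors differ, a direct resize of a 600×450 image to a square input changes its aspect ratio.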

Skin Lesion Classification Approach

We split the HAM10000 dataset into three sets randomly: the training set (containing 70% of the images), the validation set (containing 20% of the images), and the testing set (containing 10% of the images). The training and validation sets were used for the training phase. We then used the testing set to evaluate the trained models. The test set had the following number of images for each class: AKIEC: 38, BCC: 49, BKL: 110, DF: 11, MEL: 109, NV: 667, and VASC: 18.
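A random 70/20/10 split like the one described can be sketched with the standard library alone. The function name `split_indices`, the fixed seed, and the use of integer percentages are assumptions for illustration; the paper does not state how the split was seeded or implemented.

```python
import random


def split_indices(n, train_pct=70, val_pct=20, seed=0):
    """Shuffle indices 0..n-1 and cut them into train, validation, and test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder, about 10%
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
```

Fixing the seed makes the split reproducible, which matters when several models are compared on the same test set.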

Figure 1 shows the overall approach for skin lesion classification. Our system was implemented in Python using the PyTorch [21], CUDA, NumPy [22], and OpenCV [23] libraries. We used an existing implementation of the ALBEF model for our multimodal lesion classification [24]. The models were trained and tested on a Dell Precision server with an Intel Xeon processor, 96 GB RAM, 2 TB disk storage, and two NVIDIA Quadro RTX 4000 (8 GB) graphics processing units (GPUs).

Figure 1: Skin lesion classification approach

Model Training Settings

The Inception-V3, ResNet50, DenseNet121, and ALBEF models were trained with the same hyperparameters: (a) batch size of 16, (b) 200 epochs, and (c) learning rate of 1e-4. Each model used the Adam optimizer [25] and the binary cross entropy loss function. The model with the highest validation accuracy was saved and used for classification on the test set.
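The text does not say how binary cross entropy was applied to a 7-class problem. One common reading, assumed here, is that each label is one-hot encoded and the loss is averaged over the 7 per-class outputs. The sketch below computes that loss from raw logits in a numerically stable form; the function names are hypothetical.

```python
import math

CLASSES = ["AKIEC", "BCC", "BKL", "DF", "NV", "MEL", "VASC"]


def one_hot(label):
    """Encode a class name as a 7-element 0/1 target vector."""
    return [1.0 if name == label else 0.0 for name in CLASSES]


def bce_with_logits(logits, targets):
    """Mean binary cross entropy over classes, computed from raw logits.

    Uses the stable form max(z, 0) - z*t + log(1 + exp(-|z|)), which equals
    -[t*log(sigmoid(z)) + (1 - t)*log(1 - sigmoid(z))].
    """
    total = 0.0
    for z, t in zip(logits, targets):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


target = one_hot("MEL")
uninformed = bce_with_logits([0.0] * 7, target)          # every output at 0.5
confident = bce_with_logits([-4, -4, -4, -4, -4, 4, -4], target)
```

With all logits at zero, each output is 0.5 and the loss equals ln 2; a confident correct prediction drives it toward zero.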

For the Inception-V3, ResNet50, and DenseNet121 models, we performed data augmentation on the lesion images using RandomHorizontalFlip, RandomVerticalFlip, RandomRotation, and ColorJitter, and then converted them to tensors using the ToTensor transform in PyTorch. We trained on 70% of the data using a batch size of 16 for 200 epochs. We then picked the best model using 20% of the dataset as the validation set.

For the ALBEF model, we resized the images to 256×256 pixels and transformed them using RandomResizedCrop and RandomHorizontalFlip. The other hyperparameters were a patch size of 16, an embedding dimension (embed_dim) of 768, a depth of 12, 12 attention heads (num_heads), a weight decay (weight_decay) of 0.01, a multi-layer perceptron ratio (mlp_ratio) of 4, query-key-value bias (qkv_bias) set to True, and an epsilon (eps) of 1e-4. We trained on 70% of the data using a batch size of 16 for 200 epochs. We then picked the best model using 20% of the dataset as the validation set.
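The Vision Transformer hyperparameters above determine several tensor shapes. The following sketch works through that arithmetic; it assumes RGB input and one extra class token prepended to the patch sequence, which is the usual ViT convention but is not stated in the text. The function name `vit_shapes` is hypothetical.

```python
def vit_shapes(image_size=256, patch_size=16, embed_dim=768,
               num_heads=12, mlp_ratio=4, channels=3):
    """Derive sequence length and layer widths from the ViT hyperparameters."""
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    return {
        "patches_per_side": patches_per_side,
        "num_patches": num_patches,
        # Assumption: one class token is prepended to the patch tokens.
        "sequence_length": num_patches + 1,
        # Each attention head works on an equal slice of the embedding.
        "head_dim": embed_dim // num_heads,
        # Hidden width of the feed-forward block inside each layer.
        "mlp_hidden_dim": embed_dim * mlp_ratio,
        # Number of raw pixel values flattened from one patch.
        "values_per_patch": patch_size * patch_size * channels,
    }


shapes = vit_shapes()
```

Under these assumptions a 256×256 image becomes 256 patch tokens, each attention head has width 64, and the feed-forward hidden width is 3072.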

Results

In this section, we present the performance metrics of Inception-V3, ResNet50, DenseNet121, and the multimodal fusion model (ALBEF) on the HAM10000 dataset. We evaluated the ALBEF model in two settings:

  1. Using the images with the associated text (age, sex, and lesion location) for training on the HAM10000 dataset.

  2. Using only the images and passing a blank string as the text for training on the HAM10000 dataset. We performed this experiment to show the effect of adding text on the overall performance of the model.
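The two settings differ only in the text passed to the text encoder. The sketch below shows one way the metadata could be turned into a string, or into a blank string for the image-only setting. The key-value template, the argument names, and the handling of missing values are assumptions; the paper does not give the exact text format.

```python
def metadata_to_text(age=None, sex=None, localization=None, use_text=True):
    """Build the text input for one lesion; return '' for the image-only setting."""
    if not use_text:
        return ""
    parts = []
    if age is not None:
        parts.append(f"age: {int(age)}")
    if sex:
        parts.append(f"sex: {sex}")
    if localization:
        parts.append(f"location: {localization}")
    # Fields that are missing are simply omitted from the string.
    return ", ".join(parts)


with_text = metadata_to_text(55.0, "male", "back")
image_only = metadata_to_text(55.0, "male", "back", use_text=False)
```

Keeping both settings behind one function makes it easy to run the two experiments with otherwise identical code.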

Performance Metrics

Next, we briefly describe the performance metrics that we used to evaluate the models. The classification models aim to classify the HAM10000 images into 7 classes. We used true positives (Tp), true negatives (Tn), false positives (Fp), and false negatives (Fn) to compute the performance metrics. Tp is the number of skin lesions in the positive class that were correctly predicted as positive, and Tn is the number of lesions in the negative class that were correctly predicted as negative. Fn is the number of lesions in the positive class that were incorrectly predicted as negative, and Fp is the number of lesions in the negative class that were incorrectly predicted as positive.
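For a 7-class problem, these counts are usually obtained per class by treating one class as positive and all others as negative. The sketch below assumes that one-vs-rest convention and uses the standard definitions of sensitivity, specificity, precision, and accuracy; the labels in the example are made up for illustration and are not results from the study.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Count Tp, Fp, Fn, Tn when `positive` is the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth == positive and pred != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, fn, tn):
    """Standard metrics derived from the four confusion counts."""
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


# Made-up labels for illustration only.
y_true = ["NV", "NV", "MEL", "BCC", "MEL", "NV"]
y_pred = ["NV", "MEL", "MEL", "BCC", "NV", "NV"]
counts = one_vs_rest_counts(y_true, y_pred, positive="MEL")
mel_metrics = binary_metrics(*counts)
```

Repeating this for each of the 7 classes and averaging gives per-class and macro-averaged figures.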

Discussion and Conclusion

We implemented multimodal deep learning models to classify the skin lesions in the HAM10000 dataset [13]. Of the five experiments we conducted, the ALBEF model trained on the images with the associated text achieved the highest sensitivity, specificity, AUCROC, accuracy, and precision, and we consider it applicable in a primary care setting.

Multimodal learning is a relatively new field of artificial intelligence that aims to replicate the ability to combine information from multiple modalities. Information from different sources such as audio, text, image, and video helps build more complex models that improve the performance of many applications [26]. Multimodal learning includes fusion-based approaches, alignment-based approaches, and late fusion.

The ALBEF (Align Before Fuse) model enables us to fuse visual and clinical information, providing a holistic view of the skin lesions [19]. This multimodal approach has the potential to significantly enhance the diagnostic capabilities of primary care physicians and nurse practitioners [27]. The ALBEF model uses BERT as the text encoder and the Vision Transformer as the image encoder. We used the joint text-image encoder to encode both the text and the images and added a linear fully connected layer to it. ALBEF has been used in a wide array of domains, such as hateful content detection and image retrieval [19]. Because it learns joint representations, it is very useful in image retrieval tasks, such as text-query image search in e-commerce applications. It has also been applied in natural language processing to generate text captions for images, and it can be used in classification tasks as well [24,28].

Adebiyi et al. applied three different models (Inception-V3, ResNet50, and DenseNet121) for skin lesion classification [11,29]. They collected 770 de-identified dermoscopy images from University of Missouri (MU) Health Care. They then created three unique image sets that contained the original images and the images obtained after applying a hair removal algorithm. DenseNet121 achieved the best accuracy of 80.52% and an AUCROC score of 0.81 in their experiment.

Alam et al. achieved an accuracy of 85% on the HAM10000 dataset using the Inception-V3 model [30]. Akter et al. achieved an accuracy of 0.82 on the HAM10000 dataset using the ResNet50 model [31]. Our ALBEF model outperformed both by achieving an accuracy of 0.9411.

Yap et al. applied multimodal learning for skin lesion classification [32]. They employed two ResNet50 convolutional neural networks (CNNs) followed by a late fusion technique to combine the features. Their results showed that combining dermoscopic images with macroscopic images and metadata may improve network performance. Tiulpin et al. applied multimodal machine learning to predict knee osteoarthritis progression from plain radiographs and clinical data [33]. They used the raw radiographic data, clinical examination results, and the patient’s previous medical history, and achieved an area under the ROC curve (AUC) of 0.79.

We applied the Inception-V3, ResNet50, DenseNet121, and ALBEF models in our experiments. Our Inception-V3 model achieved an accuracy of 0.8653 and an AUCROC of 0.8589 on the HAM10000 dataset. The ResNet50 model achieved an accuracy of 0.8503 and an AUCROC of 0.8388. The DenseNet121 model achieved an accuracy of 0.8862 and an AUCROC of 0.8967. We also implemented a multimodal model that combines the text features in the HAM10000 dataset with the lesion images for classification. The text features we used are the age, sex, and location of the lesion. The multimodal model we used in this work is the well-known ALBEF (Align Before Fuse) model. We performed two experiments with it. In the first experiment, we used the lesion image and passed the text as blank. This achieved an accuracy of 0.9132 and an AUCROC of 0.9136. In the second experiment, we used both the image and the text. This achieved an accuracy of 0.9411 and an AUCROC of 0.9426. Overall, our multimodal model outperformed all the other models. This shows that adding other patient features, specifically text in this instance, can improve the overall performance of skin lesion classification.

Our study has some limitations. We used only three metadata fields (age, sex, and location) with the images in our multimodal classification. We believe that having more metadata in the dataset may further improve performance.

We recommend future studies utilize additional text features, as our study was limited to age, sex, and localization data.

Data Availability

All data produced in the present study are available upon reasonable request to the authors

Authors Contribution

PR, EJS, MB, and EHS conceived the idea of multimodal learning on skin lesion images. AA implemented and evaluated the deep learning models. AA and PR designed the experiments and analyzed the results. JH and EHS provided clinical insights for the study. All authors were involved in drafting and editing the manuscript.

Acknowledgments

This project was funded by the Translational Research Informing Useful and Meaningful Precision Health (TRIUMPH) grant from the University of Missouri-Columbia.

Footnotes

  • This is the updated version

References

  1. Working under the sun causes 1 in 3 deaths from non-melanoma skin cancer, say WHO and. https://www.iarc.who.int/cancer-type/skin-cancer
  2. Skin cancer. https://www.aad.org/media/stats-skin-cancer
  3. Melanoma of the Skin - Cancer Stat Facts. Available from: https://seer.cancer.gov/statfacts/html/melan.html
  4. Blake KD, Moss JL, Gaysynsky A, Srinivasan S, Croyle RT. Making the Case for Investment in Rural Cancer Control: An Analysis of Rural Cancer Incidence, Mortality, and Funding Trends. Cancer Epidemiol Biomarkers Prev. 2017 Jul;26(7):992–7.
  5. Kalia S, Kwong YKK, Haiducu ML, Lui H. Comparison of sun protection behaviour among urban and rural health regions in Canada. Journal of the European Academy of Dermatology and Venereology. 2013;27(11):1452–4.
  6. Pala P, Bergler-Czop BS, Gwiżdż JM. Teledermatology: idea, benefits and risks of modern age – a systematic review based on melanoma. Postepy Dermatol Alergol. 2020 Apr;37(2):159–67.
  7. Jones OT, Jurascheck LC, van Melle MA, Hickman S, Burrows NP, Hall PN, et al. Dermoscopy for melanoma detection and triage in primary care: a systematic review. BMJ Open. 2019 Aug 20;9(8):e027529.
  8. Karavan M, Compton N, Knezevich S, Raugi G, Kodama S, Taylor L, et al. Teledermatology in the diagnosis of melanoma. J Telemed Telecare. 2014 Jan;20(1):18–23.
  9. Brown AE, Najmi M, Duke T, Grabell DA, Koshelev MV, Nelson KC. Skin Cancer Education Interventions for Primary Care Providers: A Scoping Review. J Gen Intern Med. 2022 Jul;37(9):2267–79.
  10. Li H, Pan Y, Zhao J, Zhang L. Skin disease diagnosis with deep learning: A review. Neurocomputing. 2021 Nov 13;464:364–93.
  11. Adebiyi A, Rao P, Hirner J, Anokhin A, Hoffman Smith E, Simoes E, Becevic M. Comparison of Three Deep Learning Models in Accurate Classification of 770 Dermoscopy Skin Lesion Images. In: AMIA 2024 Informatics Summit, 8 pages, Boston, 2024.
  12. Huang Y, Du C, Xue Z, Chen X, Zhao H, Huang L. What Makes Multi-Modal Learning Better than Single (Provably). In: Advances in Neural Information Processing Systems [Internet]. Curran Associates, Inc.; 2021 [cited 2023 Dec 15]. p. 10944–56.
  13. Tschandl P, Rosendahl C, Kittler H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci Data. 2018 Aug 14;5(1):180161.
  14. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, et al. ImageNet Large Scale Visual Recognition Challenge. arXiv; 2015.
  15. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, et al. Going deeper with convolutions. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2015. p. 1–9.
  16. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the Inception Architecture for Computer Vision. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) [Internet]. 2016 [cited 2023 Dec 15]. p. 2818–26.
  17. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) [Internet]. 2016.
  18. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely Connected Convolutional Networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  19. Li J, Selvaraju RR, Gotmare AD, Joty S, Xiong C, Hoi S. Align before Fuse: Vision and Language Representation Learning with Momentum Distillation [Internet]. arXiv; 2021.
  20. imagenet-1k · Datasets at Hugging Face. 2024. https://huggingface.co/datasets/imagenet-1k
  21. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In: Advances in Neural Information Processing Systems [Internet]. Curran Associates, Inc.; 2019.
  22. Harris CR, Millman KJ, van der Walt SJ, Gommers R, Virtanen P, Cournapeau D, et al. Array programming with NumPy. Nature. 2020 Sep;585(7825):357–62.
  23. The OpenCV Library. https://opencv.org/
  24. multimodal-learning-hands-on-tutorial/multimodal_training.ipynb at main · dsaidgovsg/multimodal-learning-hands-on-tutorial. https://github.com/dsaidgovsg/multimodal-learning-hands-on-tutorial/blob/main/multimodal_training.ipynb
  25. Kingma DP, Ba J. Adam: A Method for Stochastic Optimization [Internet]. arXiv; 2017.
  26. Pei X, Zuo K, Li Y, Pang Z. A Review of the Application of Multi-modal Deep Learning in Medicine: Bibliometrics and Future Directions. Int J Comput Intell Syst. 2023 Mar 29;16(1):44.
  27. Kline A, Wang H, Li Y, Dennis S, Hutch M, Xu Z, et al. Multimodal machine learning in precision health: A scoping review. npj Digit Med. 2022 Nov 7;5(1):1–14.
  28. ALBEF [Internet]. SERP AI. 2023. https://serp.ai/albef/
  29. Adebiyi A, Flowers L, Giefer J, Hirner J, Rao P, Smith EH, et al. Accurate classification of benign and malignant dermoscopy skin lesions using three deep learning models. 2023.
  30. Alam TM, Shaukat K, Khan WA, Hameed IA, Almuqren LA, Raza MA, et al. An Efficient Deep Learning-Based Skin Cancer Classifier for an Imbalanced Dataset. Diagnostics. 2022 Sep;12(9):2115.
  31. Akter MS, Shahriar H, Sneha S. Multi-class Skin Cancer Classification Architecture Based on Deep Convolutional Neural Network. In: 2022 IEEE International Conference on Big Data (Big Data).
  32. Yap J, Yolland W, Tschandl P. Multimodal skin lesion classification using deep learning. Exp Dermatol. 2018 Nov;27(11):1261–7.
  33. Tiulpin A, Klein S, Bierma-Zeinstra SMA, Thevenot J, Rahtu E, Meurs J van, et al. Multimodal Machine Learning-based Knee Osteoarthritis Progression Prediction from Plain Radiographs and Clinical Data. Sci Rep. 2019 Dec 27;9(1):20038.
Posted August 28, 2024.