medRxiv

Cost-Sensitive Machine Learning Classification for Mass Tuberculosis Screening

Ali Akbar Septiandri, Aditiawarman, Roy Tjiong, Erlina Burhan, Anuraj H. Shankar
doi: https://doi.org/10.1101/19000190
Ali Akbar Septiandri
*Inovasi Sehat Indonesia, Jakarta, Indonesia
  • For correspondence: aliakbars{at}inovasisehat.co.id
Aditiawarman
*Inovasi Sehat Indonesia, Jakarta, Indonesia
Roy Tjiong
*Inovasi Sehat Indonesia, Jakarta, Indonesia
Erlina Burhan
†Department of Pulmonology and Respiratory Medicine, University of Indonesia, Jakarta, Indonesia
Anuraj H. Shankar
‡Department of Nutrition, Harvard School of Public Health, Boston, Massachusetts 02115

Abstract

Active screening for tuberculosis (TB) is needed to optimize detection and treatment. However, current algorithms for verbal screening perform poorly: misclassification leads to missed cases and to unnecessary, costly laboratory tests for false positives. We investigated whether machine learning can improve on the predefined one-size-fits-all algorithm used to score the verbal screening questionnaire, and we present a cost-sensitive machine learning classification approach for mass tuberculosis screening. We compared the score-based classification defined by clinicians with machine learning classifiers such as SVM-RBF, logistic regression, and XGBoost. We restricted our analyses to data from adults, the population most affected by TB, and compared untuned, unweighted classifiers with their cost-sensitive counterparts. Predictions were compared with the corresponding GeneXpert MTB/RIF results. After setting the weight of the positive class to 40 for XGBoost, we achieved 96.64% sensitivity and 35.06% specificity. Relative to the traditional score-based method defined by our clinicians, sensitivity increased by 1.26 percentage points and specificity by 13.19 percentage points. Our approach further showed that only 2000 data points were sufficient for the model to converge. These results indicate that, even with limited data, a better method for identifying TB suspects from verbal screening can be devised. This approach may be a stepping stone towards more effective TB case identification, especially in primary health centres, and may foster better detection and control of TB.
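The two ideas the abstract relies on, a heavier weight on the positive class and the sensitivity/specificity pair used to judge the result, can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' pipeline: the function names, the toy labels, and the toy probabilities are all hypothetical, and the threshold change at the end only stands in for the shift a heavier positive weight tends to produce in a trained model. In XGBoost, this kind of weighting is commonly set through the `scale_pos_weight` parameter, although the abstract does not say which mechanism was used.

```python
import math


def weighted_log_loss(y_true, p_pred, pos_weight=1.0):
    """Mean binary cross-entropy in which each positive counts pos_weight times."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += -pos_weight * math.log(p)  # missed positives cost more
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, yh in pairs if y == 1 and yh == 1)
    fn = sum(1 for y, yh in pairs if y == 1 and yh == 0)
    tn = sum(1 for y, yh in pairs if y == 0 and yh == 0)
    fp = sum(1 for y, yh in pairs if y == 0 and yh == 1)
    return tp / (tp + fn), tn / (tn + fp)


def predict(p_pred, threshold):
    """Label a case positive when its predicted probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in p_pred]


# Hypothetical toy data: 1 = confirmed TB, 0 = no TB.
y_true = [1, 1, 0, 0, 0, 0]
p_pred = [0.9, 0.3, 0.4, 0.2, 0.1, 0.6]

# With weight 40, the under-scored positive (p = 0.3) dominates the loss.
print(weighted_log_loss(y_true, p_pred, pos_weight=1.0))
print(weighted_log_loss(y_true, p_pred, pos_weight=40.0))

# A lower threshold trades specificity for sensitivity.
print(sensitivity_specificity(y_true, predict(p_pred, 0.5)))   # (0.5, 0.75)
print(sensitivity_specificity(y_true, predict(p_pred, 0.25)))  # (1.0, 0.5)
```

Reading the reported gains as percentage-point differences, the score-based baseline would sit at roughly 95.38% sensitivity (96.64 - 1.26) and 21.87% specificity (35.06 - 13.19). These figures are derived from the abstract, not stated in it.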

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work was funded by the Stop TB Partnership through TB REACH Wave 3.

Author Declarations

All relevant ethical guidelines have been followed and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

Any clinical trials involved have been registered with an ICMJE-approved registry such as ClinicalTrials.gov and the trial ID is included in the manuscript.

NA

I have followed all appropriate research reporting guidelines and uploaded the relevant Equator, ICMJE or other checklist(s) as supplementary files, if applicable.

NA

Data Availability

The data are not publicly available.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted June 28, 2019.

Subject Area

  • Health Informatics