Published
Union Conference on Lung Health (2024)
Authors
Metasebia Mesfin, M.D.1, Abraham Eshetu, M.D.1, Robel Gemechu1,
Abel Worku1, Desalegn Abebaw, Ph.D.2, Tariku Mengesha, M.D.1
Affiliations
1Kidus Petros Hospital, Addis Ababa, Ethiopia.
2Artificial Intelligence Engineering Division, RadiSen Co. Ltd., Seoul, Korea
*to be published soon.
Summary
This study evaluates how well a commercially available Artificial Intelligence (AI) software (AXIR-CX) detects pulmonary Tuberculosis (TB) on chest radiographs at Kidus Petros, a TB treatment hospital in Ethiopia. The predictive performance of the AI and of expert radiologists is evaluated against Xpert MTB test results as the ground truth.
Background
AI is increasingly used worldwide to detect TB and related abnormalities. This study retrospectively evaluates the performance of AXIR-CX (version 2.5.0), an AI software developed by RadiSen in South Korea, for detecting TB using radiographs from Kidus Petros Hospital.
Methods
We collected Xpert results and chest radiographs from 1,579 individuals (10% Xpert-positive) seen at the hospital in 2023. The AI’s predictions were evaluated on the entire dataset. Subsequently, a subset of 321 radiographs (49% positive) was interpreted by three radiologists, two from South Korea and one from the hospital, with 10+ years of experience. The radiologists rated each radiograph for TB presence on a scale of 0 to 5, while the AI software provided probability values (0-100%). We compared the predictive performance of the hospital’s radiologist (without and with the help of AI), the joint South Korean radiologists, and the AI against Xpert results.
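The comparison described above can be illustrated with a minimal sketch of how sensitivity, specificity, and accuracy are computed from binary decisions, and how an AI probability threshold could be chosen so that the AI operates at a reader's sensitivity level. The abstract does not state how the operating point was selected, so the threshold search below is an assumption, and the labels and scores are invented toy values rather than study data.

```python
def confusion_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels (1 = Xpert positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # assumes at least one positive case
    specificity = tn / (tn + fp)  # assumes at least one negative case
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


def binarize(scores, threshold):
    """Call a case positive when its score is at or above the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def threshold_for_sensitivity(y_true, scores, target_sensitivity):
    """Return the highest threshold whose sensitivity reaches the target.

    Hypothetical helper: candidate thresholds are the observed scores,
    tried from highest to lowest, so sensitivity only grows along the way.
    """
    for threshold in sorted(set(scores), reverse=True):
        sensitivity, _, _ = confusion_metrics(y_true, binarize(scores, threshold))
        if sensitivity >= target_sensitivity:
            return threshold
    return min(scores)


# Toy data (not from the study): 4 positives, 6 negatives, AI scores in percent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
ai_scores = [90, 80, 40, 20, 70, 30, 10, 5, 5, 2]

# Suppose a reader achieved 75% sensitivity; match the AI to that level.
threshold = threshold_for_sensitivity(y_true, ai_scores, target_sensitivity=0.75)
sens, spec, acc = confusion_metrics(y_true, binarize(ai_scores, threshold))
print(f"threshold={threshold}, sensitivity={sens:.2f}, "
      f"specificity={spec:.2f}, accuracy={acc:.2f}")
```

Fixing sensitivity in this way lets specificity and accuracy be compared between the AI and a reader at a single operating point. The same binarization could be applied to a reader's 0 to 5 rating, given a chosen cut-off.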
Results
In the entire dataset, the AI achieved sensitivity, specificity, and accuracy of 71%, 83%, and 82%, respectively, with an AUC of 0.83. In the sampled dataset of 321 cases, the joint radiologists (JointRads) demonstrated sensitivity, specificity, and accuracy of 54%, 95%, and 75%, respectively, while the AI achieved specificity and accuracy of 92% and 73% at a similar sensitivity level. The hospital’s radiologist (Rad3) achieved sensitivity, specificity, and accuracy of 83%, 65%, and 74%, respectively. At the same sensitivity level, the AI exhibited specificity and accuracy of 67% and 75%. When the radiologist was assisted by AI (Rad3&AI), a slight improvement was observed, with sensitivity, specificity, and accuracy reaching 85%, 68%, and 76%, respectively.
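The reported accuracies can be cross-checked from sensitivity, specificity, and the share of Xpert-positive cases, since accuracy = prevalence × sensitivity + (1 − prevalence) × specificity. The sketch below applies this identity to the rounded percentages reported above. It assumes that "a similar sensitivity level" means the same rounded sensitivity as the reader being compared, and it uses a tolerance of one percentage point to allow for rounding.

```python
def expected_accuracy(prevalence, sensitivity, specificity):
    """Accuracy implied by prevalence, sensitivity, and specificity."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


# (label, prevalence, sensitivity, specificity, reported accuracy),
# all taken from the rounded figures in the abstract.
reported = [
    ("AI, entire dataset",          0.10, 0.71, 0.83, 0.82),
    ("JointRads",                   0.49, 0.54, 0.95, 0.75),
    ("AI at JointRads sensitivity", 0.49, 0.54, 0.92, 0.73),
    ("Rad3",                        0.49, 0.83, 0.65, 0.74),
    ("AI at Rad3 sensitivity",      0.49, 0.83, 0.67, 0.75),
    ("Rad3&AI",                     0.49, 0.85, 0.68, 0.76),
]

# Absolute gap between the implied and the reported accuracy for each row.
differences = {}
for label, prev, sens, spec, reported_acc in reported:
    implied = expected_accuracy(prev, sens, spec)
    differences[label] = abs(implied - reported_acc)
    print(f"{label}: implied {implied:.3f}, reported {reported_acc:.2f}")

# Every row should agree to within rounding (one percentage point).
consistent = all(diff < 0.01 for diff in differences.values())
print("consistent within rounding:", consistent)
```

The identity also explains why the AI's accuracy on the entire dataset (82%) sits close to its specificity (83%): with only 10% positives, the negative cases dominate the accuracy figure.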
Conclusion
The AI identified TB with accuracy similar to that of experienced radiologists. When the hospital’s radiologist was assisted by the AI, predictive performance improved slightly. These findings indicate the potential usefulness of AI in hospital triage scenarios.