Published
Radiology Advances
Authors
Taehee Kim1, Heejun Shin1, Yong Sub Song, M.D.2, Jong Hyuk Lee, M.D.3, Hyungjin Kim, M.D., Ph.D.3, Dongmyung Shin, Ph.D.1
Affiliations
1Artificial Intelligence Engineering Division, RadiSen Co. Ltd., Seoul, Korea; 2Department of Radiology, Kim’s Eye Hospital, Konyang University College of Medicine, Seoul, Korea; 3Department of Radiology, Seoul National University Hospital, Seoul National University, Seoul, Korea.
Background
Detecting clinically unsuspected lung cancer on chest radiographs is challenging. Artificial intelligence (AI) software that performs comparably to radiologists may serve as a useful tool.
Purpose
To evaluate the lung cancer detection performance of commercially available AI software and to compare it with that of radiologists in a healthy population.
Materials and Methods
This retrospective study used chest radiographs from the Prostate, Lung, Colorectal, and Ovarian (PLCO) cancer screening trial, acquired in the United States between November 1993 and July 2001, with pathologic cancer diagnoses followed up through 2009 (median follow-up, 11.3 years). The software’s predictions were compared with the PLCO radiologists’ reads. A reader study was performed on a subset, comparing the software with three experienced radiologists.
Results
The analysis included 24,370 individuals (mean age, 62.6 ± 5.4 years; median age, 62 years; cancer rate, 2%), of whom 213 (mean age, 63.6 ± 5.5 years; median age, 63 years; cancer rate, 46%) were included in the reader study. AI achieved higher specificity (0.910 for AI vs. 0.803 for radiologists, p < 0.001) and positive predictive value (0.054 vs. 0.032, p < 0.001) but lower sensitivity (0.326 vs. 0.412, p = 0.001) than the PLCO radiologists. When the sensitivity of AI was calibrated to match that of the PLCO radiologists, AI still had higher specificity (0.815 vs. 0.803, p < 0.001). In the reader study, AI achieved higher sensitivity than readers 1 and 3 (0.608 for AI vs. 0.588 for both readers; p = 0.789 and p = 0.803, respectively) but lower specificity than reader 1 (0.888 for AI vs. 0.905, p = 0.814). Compared with reader 2, AI showed higher specificity (0.888 vs. 0.819, p = 0.153) but lower sensitivity.
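As an illustration of how the reported metrics relate, the following is a minimal Python sketch, not taken from the study; the function names, synthetic data, and prevalence values are illustrative assumptions. It computes sensitivity, specificity, and positive predictive value from binary labels and predictions, and picks a score threshold so that the AI's sensitivity matches a target value, analogous to the sensitivity-calibration step described above.

```python
import numpy as np

def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and PPV from binary labels and predictions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }

def threshold_for_sensitivity(y_true, scores, target_sensitivity):
    """Return the highest score threshold whose sensitivity reaches the target,
    mimicking a 'calibrate AI sensitivity to match the radiologists' step."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    # Candidate thresholds: scores of the cancer-positive cases, high to low.
    for thr in np.sort(scores[y_true])[::-1]:
        if confusion_metrics(y_true, scores >= thr)["sensitivity"] >= target_sensitivity:
            return thr
    return scores.min()

# Hypothetical example: match AI sensitivity to the radiologists' 0.412.
rng = np.random.default_rng(0)
y = rng.random(1000) < 0.02                              # ~2% cancer prevalence
scores = np.clip(rng.normal(0.3 + 0.3 * y, 0.2), 0, 1)   # synthetic AI scores
thr = threshold_for_sensitivity(y, scores, 0.412)
print(confusion_metrics(y, scores >= thr))
```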
Conclusions
AI detects lung cancer on chest radiographs in asymptomatic individuals with performance comparable to that of experienced radiologists.