Published
International Journal of Cardiology Heart & Vasculature
Authors
Joshua Ra a,1, Heejun Shin b,1, Christopher Park c, Yong-Xiang Wang d, Dongmyung Shin b,*
Affiliations
aDivision of Cardiology, Mount Sinai Morningside-BronxCare, NY, USA
bArtificial Intelligence Engineering Division, RadiSen Co. Ltd., Seoul, South Korea
cNew York University, NY, USA
dMarketech International Corp., Taipei, Taiwan
*Corresponding author. E-mail address: shinsae11@radisentech.com (D. Shin).
1 These authors contributed equally to this work.
Abstract
Background
This study presents an artificial intelligence (AI) model for automated cardiothoracic ratio (CTR) measurement from chest X-rays (CXRs) and evaluates its association with severe left ventricular hypertrophy (SLVH) and dilated left ventricle (DLV) diagnosed by echocardiography. The study also assesses CTR’s prognostic value for predicting future SLVH/DLV development.
Methods
In this retrospective cohort study, an AI algorithm measured CTR on 71,129 CXRs from 24,673 patients from 2013 to 2018 in the CheXchoNet database. SLVH/DLV was defined using echocardiographic criteria. Diagnostic accuracy was assessed using AUROC and AUPRC alongside sensitivity and specificity at various CTR thresholds. Logistic regression was performed for CXR-echocardiogram pairs. Time-to-event analysis was performed on 9,890 patients without baseline SLVH/DLV.
Results
Among 24,673 patients (mean age: 62.1 years; female sex: 56.9 %), mean CTR was higher in SLVH/DLV patients (0.56 ± 0.07) than those without (0.52 ± 0.07; p < 0.001). AUROC was 0.70 (95 % CI: 0.69–0.70). At a CTR threshold of 0.53, sensitivity was 70 % and specificity 60 %. Increased CTR was associated with SLVH/DLV risk on paired echocardiogram, with an odds ratio of 1.26 at a CTR of 0.65 compared to CTR at 0.50 (95 % CI: 1.24–1.27, p < 0.001). Time-to-event analysis on patients without baseline SLVH/DLV showed patients with baseline CTR > 0.65 had a 4.13-fold increased risk of developing SLVH/DLV in the future compared to patients with CTR ≤ 0.50 (adjusted HR: 4.13; 95 % CI: 2.48–6.89; p < 0.01).
Conclusion
AI-based CTR measurement helps predict SLVH/DLV and could be used for risk stratification for cardiovascular care.
1. Introduction
Cardiothoracic ratio (CTR), defined as the ratio of the maximal horizontal cardiac diameter to the internal thoracic diameter, is a widely used measure in chest radiography to assess for cardiomegaly. Specific cut-offs (e.g., CTR > 0.50) have traditionally been used as a binary classifier for the presence or absence of cardiomegaly [1]. However, radiographic cardiomegaly is known to have limitations as a screening tool for the diagnosis of congestive heart failure (CHF) as confirmed by echocardiography [2–5]. Manual CTR measurement is also time-consuming and subject to inter-observer variability [6].
Artificial Intelligence (AI)-based, automated CTR measurement methods can standardize this process, providing reliable data across large patient cohorts and potentially improving efficiency and consistency in CTR measurement [7–10]. The ease of obtaining AI-based CTR measurements further provides opportunities to investigate the potential for precise CTR values to be evaluated across a numerical spectrum instead of a binary classification of cardiomegaly based on a specific cut-off, as examined in prior clinical studies [11,12]. For instance, subcategorizing CTR into distinct ranges (e.g., 0.55–0.60 compared to < 0.50) may more precisely risk stratify patients for whom pursuing echocardiography may be of higher diagnostic yield.
We used an AI model to calculate CTR of chest x-rays from the CheXchoNet database [13,14], which pairs CXRs to echocardiographic diagnoses of severe left ventricular hypertrophy (SLVH) or dilated left ventricle (DLV). The aims of this study are threefold: to use an AI model for automated CTR measurement in CXRs, assess the performance of AI-measured CTR to predict the composite outcome of SLVH/DLV on paired echocardiography, and assess the prognostic value of baseline CTR to predict future development of SLVH/DLV.
2. Methods
2.1. Data source and cohort construction
We utilized the CheXchoNet database, which is an IRB-approved and publicly available database of 71,589 paired CXRs and echocardiograms performed within 12 months of each other on 24,689 patients [13,14]. Briefly, these data were collected retrospectively from a single center from January 2013 to August 2018. CXRs were extracted in DICOM format and filtered to only include posteroanterior (PA) films. Demographic data on age and sex were extracted from chest X-ray metadata. Echocardiograms performed within one year of each CXR were labeled for the presence of SLVH and/or DLV. SLVH was defined as an interventricular septal thickness at end-diastole (IVSd) or left ventricular posterior wall thickness at end-diastole (LVPWd) of > 1.5 cm in men and > 1.4 cm in women. DLV was defined as a left ventricular internal diameter at end-diastole (LVIDd) of > 5.9 cm in men and > 5.3 cm in women. Our main analysis was performed on a composite binary label based on the presence of either SLVH or DLV, as we hypothesized that CTR on CXR would not provide sufficient diagnostic capacity to distinguish between the entities of SLVH and DLV, which may be more meaningfully differentiated by analyzing characteristics of the cardiac silhouette [14]. Supplementary analyses of the outcomes of SLVH alone and DLV alone were also performed.
After excluding 460 lateral view CXRs and non-CXRs found in the database, our final dataset comprised 71,129 paired CXRs and echocardiograms from 24,673 patients (Fig. 1). A commercially available AI software (AXIR-CX; CE-certified; RadiSen Co. Ltd.; South Korea) using deep learning techniques was used to measure CTRs on individual CXRs (Fig. 2). Measurement of CTR is fully automated and integrated into the AXIR-CX software without need for manual adjustment by a human user. Each input chest radiograph was resized to 512 × 512 pixels and processed using a lung and heart segmentation model based on the attention U-Net architecture [15]. Publicly available CXR datasets, including the NIH database [16], as well as privately collected datasets from Vietnam and Indonesia were used to train the model [17,18]. Radiologists annotated the boundaries of the heart and lung regions on the CXRs in these datasets. CTR was then calculated as the cardiac diameter, the sum of the midline-to-right (MRD) and midline-to-left (MLD) heart diameters, divided by the internal chest diameter. For additional analysis, each CTR was subcategorized using thresholds of 0.50, 0.55, 0.60, and 0.65 as reported in a previous study [11].
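As an illustration of the calculation described above, the following minimal Python sketch computes CTR from the two heart diameters and the internal chest diameter and then assigns a subcategory. It is not the AXIR-CX implementation: the function names are hypothetical, and assigning a value that falls exactly on a threshold to the lower group is an assumption, since only the ≤ 0.50 and > 0.65 boundaries are stated explicitly.

```python
from bisect import bisect_left

# Thresholds from the text; the group labels are illustrative.
THRESHOLDS = [0.50, 0.55, 0.60, 0.65]
LABELS = ["<=0.50", "0.50-0.55", "0.55-0.60", "0.60-0.65", ">0.65"]


def cardiothoracic_ratio(mrd: float, mld: float, thoracic_diameter: float) -> float:
    """CTR = (MRD + MLD) / internal chest diameter, all in the same unit."""
    if thoracic_diameter <= 0:
        raise ValueError("thoracic diameter must be positive")
    return (mrd + mld) / thoracic_diameter


def ctr_category(ctr: float) -> str:
    """Map a CTR to one of five groups.

    Assumption: a value exactly equal to a threshold goes to the lower group.
    """
    return LABELS[bisect_left(THRESHOLDS, ctr)]


# Hypothetical measurements in millimetres.
example_ctr = cardiothoracic_ratio(mrd=45.0, mld=95.0, thoracic_diameter=280.0)
example_group = ctr_category(example_ctr)
```

Because CTR is a ratio of lengths measured on the same image, it is unaffected by the pixel spacing or by the resizing step, provided both diameters are measured in the same units.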


For supplementary analysis of CTR, we identified a subcohort of patients without pulmonary abnormalities of atelectasis, pleural effusion, lung opacities, pulmonary edema, and pneumothorax as detected by the AI software (AXIR-CX; RadiSen Co. Ltd.; South Korea) (Supplementary Fig. 1, Supplementary Table 1). This subcohort without pulmonary abnormalities comprised 46,396 CXRs from 19,920 patients. To evaluate the AI software’s measurement of CTR on an external dataset, CTR was measured on 59,651 CXRs from the MIMIC Chest X-ray (MIMIC-CXR) database [19,20].
2.2. Model evaluation
To evaluate the performance of AI-measured CTR models to predict SLVH/DLV, SLVH, or DLV on echocardiograms, we report metrics on area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) with 95 % confidence intervals constructed using bootstrapping methods. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), likelihood ratio, and accuracy with 95 % confidence intervals were calculated at the specified CTR thresholds.
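The AUROC and its bootstrap confidence interval can be sketched in plain Python as follows. The text does not specify the bootstrap variant, so this sketch assumes a percentile bootstrap that resamples CXR-echocardiogram pairs with replacement (resampling by patient would be an alternative). The function names and default parameters are illustrative rather than those used in the study.

```python
import random


def auroc(scores, labels):
    """AUROC via the rank-sum identity; tied scores receive average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC (assumed variant, see lead-in)."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Here `scores` would be the AI-measured CTR values and `labels` the binary echocardiographic outcome (1 for SLVH/DLV, 0 otherwise).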
Logistic regression analyses were performed to assess the association between CTR and SLVH/DLV, SLVH, or DLV on paired echocardiogram. To account for non-linear relationships between CTR and the outcome, we modeled CTR using B-spline basis functions with 3 degrees of freedom. Both unadjusted analyses and analyses adjusted for age and sex were performed. We computed predicted log-odds for specified CTR cut-offs (0.50, 0.55, 0.60, and 0.65) and obtained odds ratios (ORs) with 95 % CIs.
A multilayer perceptron (MLP) model with two fully connected layers was developed to explore the predictive value of patient age and sex alone, as well as age, sex, and CTR on the outcome SLVH/DLV. The models were trained on 80 % of the dataset and tested on the remaining 20 %. For model evaluation, we used receiver operating characteristic (ROC) curve analysis to calculate the area under the ROC curve (AUROC) and the Youden index. DeLong’s test was used to compare the AUROCs (or AUPRCs, as applicable) between the two MLP models to assess statistical significance of differences in predictive performance.
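For reference, the Youden index used to select an operating threshold is J = sensitivity + specificity − 1, maximized over candidate thresholds. The sketch below is a minimal illustration on made-up data, not the study code; it assumes a CXR is test-positive when its score exceeds the threshold, and the function name is hypothetical.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    Assumption: a case is test-positive when score > threshold.
    This O(n^2) scan is for clarity, not for large datasets.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Made-up CTR values and outcome labels, for illustration only.
toy_ctr = [0.46, 0.49, 0.51, 0.53, 0.54, 0.58, 0.62, 0.66]
toy_label = [0, 0, 0, 0, 1, 0, 1, 1]
toy_threshold, toy_j = youden_threshold(toy_ctr, toy_label)
```

The Youden index weights sensitivity and specificity equally, so the selected threshold does not depend on outcome prevalence, whereas the predictive values at that threshold do.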
2.3. Time-to-event analysis
Patients with more than one CXR and no SLVH or DLV at first CXR (i.e., the baseline CXR) were identified for time-to-event analysis, with the event defined as composite SLVH/DLV, SLVH alone, or DLV alone on a future echocardiogram. We identified a subcohort of 9,890 patients who had no composite SLVH/DLV at time of baseline CTR and had follow-up paired CXR-echocardiograms available. Patients were censored at last available CXR or time of heart or lung transplant. Follow-up times were calculated from baseline CTR to time of composite SLVH/DLV, SLVH only, or DLV only diagnoses on echocardiography. In this subcohort, 310 patients developed composite SLVH/DLV compared to 9,580 patients who did not, 171 patients developed SLVH alone, and 147 developed DLV alone (concomitant SLVH and DLV diagnoses were possible on echocardiography). Median time from baseline CTR to development of SLVH/DLV was 1.8 years (IQR: 1.0–3.0), compared to a median follow-up of 0.5 years (IQR: 0.1–1.3) for those who did not develop SLVH/DLV (p < 0.001). Patients who developed SLVH/DLV during the follow-up period had a significantly higher baseline CTR compared to those who did not (0.57 ± 0.08 vs. 0.52 ± 0.07, p < 0.001).
We calculated hazard ratios (HRs) and 95 % CIs using Cox proportional hazards models. The cohort was divided by baseline CTR of ≤ 0.50 and > 0.50 for an initial time-to-event analysis. Unadjusted HRs and HRs adjusted for age and sex were calculated. We next categorized patients into five baseline CTR groups: ≤ 0.50, 0.50–0.55, 0.55–0.60, 0.60–0.65, and > 0.65. The reference group for comparison was the group with baseline CTR ≤ 0.50. Time-to-event analysis was similarly performed. Sensitivity analysis using a time-varying Cox regression model supported the robustness of the results from the Cox proportional hazards model. For dose–response analysis between CTR and respective outcome, CTR categories were specified as an ordinal variable and entered as a continuous predictor in the Cox proportional hazards models, with the coefficient for the ordinal CTR variable representing the change in the log hazard per one-unit increase and the statistical significance of the trend assessed using the Wald test.
2.4. Statistical analysis
Data were summarized as counts and percentages for categorical variables or as means with standard deviations for continuous variables. Mean CTR was compared between groups using an independent samples t-test. Chi-square tests were performed to compare patient counts and female sex proportions across CTR categories. For age comparisons across CTR categories, a t-test or one-way analysis of variance (ANOVA) was used as appropriate. Time was summarized as median with interquartile range, and group comparisons were conducted using the Mann-Whitney U test. The Pearson correlation coefficient was determined between CTR and LVIDd.
Analyses and visualizations were conducted using Python version 3.9.19 (pandas v2.2.2, SciPy v1.13.1, matplotlib v3.8.4, seaborn v0.13.2, lifelines v0.29.0, statsmodels v0.14.2) (Python Software Foundation). Statistical significance was set at p < 0.05.
3. Results
3.1. Patient cohort
A total of 71,129 paired CXRs and echocardiograms from 24,673 patients in the CheXchoNet database were analyzed (Table 1). Mean age was 62.1 ± 16.1 years, 56.9 % were female, and 9,800 echocardiograms had composite SLVH/DLV labels (13.8 % of total paired CXRs and echocardiograms).
3.2. CTR measurement
When the AI software was used to measure CTR on CXRs, patients without SLVH or DLV on echocardiogram had a mean CTR of 0.52 ± 0.07 on the paired CXR, while those with composite SLVH/DLV had a mean CTR of 0.56 ± 0.07 (p < 0.001) (Supplementary Fig. 2). Similar results were obtained when SLVH or DLV were analyzed separately. Patients without SLVH had a mean CTR of 0.52 ± 0.07, while those with SLVH had a mean CTR of 0.56 ± 0.07 (p < 0.001). Patients without DLV had a mean CTR of 0.52 ± 0.07, while those with DLV had a mean CTR of 0.57 ± 0.08 (p < 0.001). The relationship between CTR and LVIDd, the echocardiographic parameter used to diagnose DLV, was additionally assessed. Mean LVIDd was 46 ± 6.8 mm, and a weak but statistically significant positive correlation was observed between CTR and LVIDd (r = 0.162, p < 0.001). Additional analysis in a subset of patients without pulmonary abnormalities detected by the AI software showed a mean CTR of 0.51 ± 0.06 for patients without SLVH/DLV on paired echocardiogram and 0.55 ± 0.07 for patients with SLVH/DLV (p < 0.001).
To assess the generalizability of the AI model for CTR measurement, we performed external testing using the MIMIC-CXR dataset on 59,651 CXRs. The AI model measured a mean CTR of 0.57 (± 0.08) for 9,812 CXRs identified as having cardiomegaly in the MIMIC-CXR dataset. For 49,839 CXRs without cardiomegaly, the model measured a mean CTR of 0.48 (± 0.06) (p < 0.001).
3.3. Model evaluation
The predictive performance of CTR on CXR to determine composite SLVH/DLV on paired echocardiography was assessed. AUROC was 0.70 (95 % CI: 0.69–0.70) and AUPRC was 0.26 (95 % CI: 0.25–0.27) (Supplementary Fig. 3). At the optimal threshold determined by Youden’s J statistic (CTR = 0.53), the model achieved a sensitivity of 0.70, specificity of 0.60, positive predictive value of 22 %, negative predictive value of 93 %, and accuracy of 61 %. Sensitivity, specificity, and predictive values were evaluated at various CTR thresholds and shown in Table 2. A lower threshold of CTR > 0.50 provided higher sensitivity (82.7 %) and negative predictive value (93.9 %) with a low specificity (42.8 %) and positive predictive value (18.7 %), while higher thresholds such as CTR > 0.65 had comparatively higher specificity (97.1 %) and positive predictive value (35.8 %) with lower sensitivity (10.2 %) and negative predictive value (87.1 %).
Table 1. Patient Characteristics.
Characteristic | Total |
---|---|
CXRs, n | 71,129 |
Patients, n | 24,673 |
Age, years | 62.1 ± 16.1 |
Female sex | 14,026 (56.9 %) |
Labels | |
SLVH | 6,158 (8.7 %) |
DLV | 4,312 (6.1 %) |
Composite SLVH/DLV | 9,800 (13.8 %) |
Data reported as mean ± SD or n (%). Sex is labeled by individual patient. Age and severe left ventricular hypertrophy (SLVH) or dilated left ventricle (DLV) labels are summarized by CXR-echocardiogram pair basis. Because echocardiograms may be labeled as SLVH only, DLV only, or both SLVH and DLV, the total number of echocardiograms labeled SLVH or DLV may not match the composite SLVH/DLV total.
Table 2. Diagnostic performance metrics of cardiothoracic ratio (CTR) thresholds for predicting the composite outcome of severe left ventricular hypertrophy (SLVH) or dilated left ventricle (DLV).
Cardiothoracic Ratio | Sensitivity % (95 % CI) | Specificity % (95 % CI) | + Likelihood ratio (95 % CI) | − Likelihood ratio (95 % CI) | Positive predictive value % (95 % CI) | Negative predictive value % (95 % CI) | Accuracy % (95 % CI) | Total CXR Count |
---|---|---|---|---|---|---|---|---|
>0.50 | 82.7 (81.9–83.4) | 42.8 (42.4–43.2) | 1.44 (1.43–1.46) | 0.41 (0.39–0.42) | 18.7 (18.4–19.1) | 93.9 (93.6–94.2) | 48.3 (47.9–48.6) | 43,204 |
>0.55 | 57.9 (56.9–58.9) | 70.2 (69.8–70.5) | 1.94 (1.91–1.97) | 0.60 (0.58–0.62) | 23.7 (23.1–24.2) | 91.3 (91.0–91.5) | 68.5 (68.1–68.8) | 23,978 |
>0.60 | 29.3 (28.4–30.2) | 89.1 (88.9–89.4) | 2.69 (2.61–2.77) | 0.79 (0.77–0.81) | 30.1 (29.1–31.0) | 88.7 (88.5–89.0) | 80.9 (80.6–81.2) | 9544 |
>0.65 | 10.2 (9.6–10.8) | 97.1 (97.0–97.2) | 3.49 (3.29–3.71) | 0.93 (0.88–0.97) | 35.8 (34.0–37.6) | 87.1 (86.9–87.4) | 85.1 (84.9–85.4) | 2780 |
Performance metrics reported with 95% confidence intervals (CI). The “Total CXR Count” reflects the number of chest X-rays (CXRs) exceeding each respective CTR threshold.
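Each metric in Table 2 follows from a 2 × 2 confusion table. As a worked illustration, the sketch below back-calculates approximate counts for the CTR > 0.50 row from the reported sensitivity (82.7 %), the 9,800 composite SLVH/DLV labels, the 71,129 total pairs, and the 43,204 CXRs above the threshold. Because these inputs are rounded, the derived counts are approximations and not the study's actual counts; the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy metrics from confusion-table counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Values reported in the text and tables for CTR > 0.50.
total_pairs, positives, flagged = 71_129, 9_800, 43_204
reported_sensitivity = 0.827

# Approximate counts, back-calculated from rounded values (assumption).
tp = round(reported_sensitivity * positives)
fp = flagged - tp
fn = positives - tp
tn = total_pairs - positives - fp

metrics_050 = diagnostic_metrics(tp, fp, fn, tn)
```

Under these assumptions the recomputed specificity, predictive values, likelihood ratios, and accuracy land close to the first row of Table 2, with small differences attributable to rounding. The low positive predictive value alongside a high negative predictive value reflects the 13.8 % outcome prevalence.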
Predictive performance of CTR on CXR to determine SLVH or DLV were separately assessed. For SLVH alone, AUROC was 0.68 (95 % CI: 0.68–0.69) and AUPRC was 0.16 (95 % CI: 0.15–0.17) (Supplementary Fig. 4). At the optimal threshold determined by Youden’s J statistic (CTR = 0.53), the model for SLVH alone achieved a sensitivity of 0.69, specificity of 0.59, positive predictive value of 14 %, negative predictive value of 95 %, and accuracy of 59 %. For DLV alone, AUROC was 0.69 (95 % CI: 0.68–0.70) and AUPRC was 0.12 (95 % CI: 0.12–0.13) (Supplementary Fig. 5). At the optimal threshold determined by Youden’s J statistic (CTR = 0.53), the model for DLV alone achieved a sensitivity of 0.71, specificity of 0.58, positive predictive value of 10 %, negative predictive value of 97 %, and accuracy of 58 %. Sensitivity, specificity, and predictive values were evaluated at various CTR thresholds for SLVH alone (Supplementary Table 2) and DLV alone (Supplementary Table 3).
A logistic regression model to analyze the effect of CTR on SLVH/DLV (adjusted for age and sex) revealed that, compared to a reference CTR of 0.50, the odds of SLVH/DLV were 1.05 times higher at a CTR of 0.55 (95 % CI: 1.04–1.05, p < 0.001), 1.13 times higher at a CTR of 0.60 (95 % CI: 1.12–1.14, p < 0.001), and 1.26 times higher at a CTR of 0.65 (95 % CI: 1.24–1.27, p < 0.001). This demonstrated a statistically significant increase in risk of SLVH/DLV with higher CTR values. Similar results when analyzing SLVH alone or DLV alone were observed with statistical significance (Supplementary Table 4).
We additionally employed an MLP model to assess whether CTR had predictive value for SLVH/DLV beyond other patient factors such as age and sex. We first inputted age and sex into the MLP model, which demonstrated an AUROC of 0.57 (95 % CI: 0.55–0.58) and an AUPRC of 0.17 (95 % CI: 0.16–0.18) (Supplementary Fig. 6). We then incorporated CTR along with age and sex as input variables into the MLP model. AUROC improved to 0.71 (95 % CI: 0.70–0.73) and AUPRC to 0.28 (95 % CI: 0.26–0.30). The differences in AUROC and AUPRC between the MLP model with age and sex and the MLP model with age, sex, and CTR were statistically significant (p < 0.001). This demonstrated that CTR added significant predictive value for SLVH/DLV beyond age and sex alone.
3.4. Time-to-Event analysis
Time-to-event analysis to determine risk of developing SLVH/DLV was first performed by dividing patients into two groups based on baseline CTR ≤ 0.50 or > 0.50. Median time from baseline CTR to outcome was 0.52 (0.08–1.29) years for the CTR ≤ 0.50 group compared to 0.55 (0.09–1.47) years for the CTR > 0.50 group (p = 0.05). Increased CTR was associated with increased age and female sex (Supplementary Table 5). Kaplan-Meier curve analysis showed a significantly higher risk of developing SLVH/DLV for the group with baseline CTR > 0.50 compared to ≤ 0.50 (log-rank p < 0.001) (Fig. 3). The unadjusted hazard ratio for developing SLVH/DLV was 2.18 (95 % CI: 1.68–2.82, p < 0.01) for patients with CTR > 0.50 compared to those with CTR ≤ 0.50. An adjusted analysis controlling for age and sex yielded a hazard ratio of 2.21 (95 % CI: 1.69–2.88, p < 0.01), which remained statistically significant.
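To illustrate how censored follow-up enters a Kaplan-Meier curve, the following is a minimal product-limit estimator in plain Python. It is a sketch on made-up data rather than the study code, and it assumes that censored patients at a given time remain at risk for events recorded at that same time.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is 1 if the outcome occurred at times[i], 0 if censored.
    Assumption: subjects censored at time t are still at risk at t.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)
        d_at_t = sum(1 for tt, e in data if tt == t and e == 1)
        if d_at_t > 0:
            survival *= 1 - d_at_t / at_risk
            curve.append((t, survival))
        at_risk -= n_at_t  # events and censored subjects both leave the risk set
        i += n_at_t
    return curve


# Made-up follow-up times in years; 1 = developed outcome, 0 = censored.
toy_curve = kaplan_meier([0.5, 1.0, 1.0, 2.0, 3.0], [0, 1, 0, 1, 0])
```

In this toy example the event-free probability falls to 0.75 at 1.0 year (one event among four at risk) and to 0.375 at 2.0 years (one event among two at risk); censored patients lower the number at risk without lowering the curve.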


Similar results were obtained when outcomes for SLVH or DLV were separately assessed (Supplementary Fig. 7). Baseline CTR of > 0.50 had a higher risk of developing SLVH compared to baseline CTR of ≤ 0.50, with a log-rank test of p < 0.001, unadjusted hazard ratio of 2.08 (95 % CI: 1.48–2.93, p < 0.001), and an adjusted hazard ratio controlling for age and sex of 2.11 (95 % CI: 1.48–3.01, p < 0.001). Baseline CTR of > 0.50 had a higher risk of developing DLV compared to baseline CTR of ≤ 0.50, with a log-rank test of p < 0.001, unadjusted hazard ratio of 2.22 (95 % CI: 1.52–3.23, p < 0.001), and an adjusted hazard ratio controlling for age and sex of 2.33 (95 % CI: 1.58–3.45, p < 0.001).
To examine if further subcategorization of baseline CTR predicted risk of developing SLVH/DLV, we next categorized patients into five baseline CTR groups: ≤ 0.50, 0.50–0.55, 0.55–0.60, 0.60–0.65, and > 0.65 (Supplementary Table 6). The reference group for comparison was the group with baseline CTR ≤ 0.50. Time-to-event analysis revealed a dose–response relationship between increasing baseline CTR categories and the risk of developing SLVH/DLV (Fig. 4). The log-rank test showed significant differences in survival distributions across all CTR groups (p < 0.001) as well as pairwise comparisons between the reference group (CTR ≤ 0.50) and each higher CTR group (Table 3). Compared to the reference group (CTR ≤ 0.50), the unadjusted hazard ratios (HRs) were statistically significant. After adjusting for age and sex, the hazard ratios remained statistically significant, with HR = 4.13 (95 % CI: 2.48–6.89, p < 0.01) for CTR > 0.65. When modeling each CTR category as an ordinal variable, each 0.05-unit increase in CTR was associated with a 46 % increase in unadjusted risk of developing SLVH/DLV (HR = 1.46, 95 % CI: 1.28–1.66, p < 0.01) and a 49 % increase after adjusting for age and sex (HR = 1.49, 95 % CI: 1.30–1.71, p < 0.01).
Table 3. Time-to-event analysis demonstrating the relationship between baseline cardiothoracic ratio (CTR) categories and the risk of developing severe left ventricular hypertrophy (SLVH) or dilated left ventricle (DLV).
CTR Comparison | Log-rank test p-value | Unadjusted HR (95 % CI) | Unadjusted HR p-value | Adjusted HR (95 % CI) | Adjusted HR p-value |
---|---|---|---|---|---|
≤ 0.5 vs 0.5–0.55 | < 0.01 | 1.63 (1.19–2.22) | < 0.01 | 1.68 (1.22–2.31) | < 0.01 |
≤ 0.5 vs 0.55–0.60 | < 0.001 | 2.40 (1.77–3.26) | < 0.01 | 2.48 (1.81–3.39) | < 0.01 |
≤ 0.5 vs 0.60–0.65 | < 0.001 | 2.94 (2.03–4.27) | < 0.01 | 3.06 (2.08–4.50) | < 0.01 |
≤ 0.5 vs > 0.65 | < 0.001 | 3.90 (2.39–6.38) | < 0.01 | 4.13 (2.48–6.89) | < 0.01 |
Hazard ratios (HR) reported with 95 % confidence intervals (CI).
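The ordinal trend model can be read against the categorical estimates in Table 3. Under a log-linear trend, the hazard ratio for a group k category steps above the reference is exp(βk), where exp(β) is the per-step HR. The sketch below applies the adjusted per-step HR of 1.49 reported for composite SLVH/DLV; coding the five groups as 0 to 4 is an assumption, as the text does not state the ordinal codes.

```python
import math

# Adjusted HR per category step for composite SLVH/DLV, from the text.
hr_per_step = 1.49
beta = math.log(hr_per_step)

categories = ["<=0.50", "0.50-0.55", "0.55-0.60", "0.60-0.65", ">0.65"]

# Assumption: ordinal codes 0..4 with <=0.50 as the reference group.
implied_hr = {name: math.exp(beta * k) for k, name in enumerate(categories)}

# Adjusted categorical HRs from Table 3, for side-by-side comparison.
table3_adjusted_hr = {
    "0.50-0.55": 1.68,
    "0.55-0.60": 2.48,
    "0.60-0.65": 3.06,
    ">0.65": 4.13,
}

for name, hr in table3_adjusted_hr.items():
    print(f"{name}: trend-implied {implied_hr[name]:.2f}, categorical {hr:.2f}")
```

The trend model constrains the hazard to rise by the same factor at every step, so its implied values need not match the freely estimated categorical HRs; the comparison only shows how the single trend coefficient summarizes the gradient.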
Similar results were obtained when outcomes for SLVH or DLV were individually assessed (Supplementary Fig. 8). Log-rank tests showed statistical significance between all CTR groups with the reference group of CTR ≤ 0.50 (Supplementary Table 7, Supplementary Table 8). When modeling each CTR category as an ordinal variable, each 0.05-unit increase in CTR was associated with a 46 % increase in unadjusted risk of developing SLVH (HR = 1.46, 95 % CI: 1.28–1.66, p < 0.01) and a 49 % increase after adjusting for age and sex (HR = 1.49, 95 % CI: 1.30–1.71, p < 0.01). Similarly, each 0.05-unit increase in CTR corresponded to a 44 % higher unadjusted risk of developing DLV (HR = 1.44, 95 % CI: 1.26–1.64, p < 0.01) and a 46 % higher risk after adjustment (HR = 1.46, 95 % CI: 1.27–1.67, p < 0.01).
4. Discussion
Our study demonstrates that an AI-based measurement of cardiothoracic ratio (CTR) in chest X-rays (CXRs) is a predictor of severe left ventricular hypertrophy (SLVH) and/or dilated left ventricle (DLV) on echocardiogram. An optimal CTR threshold of 0.53 provided a balanced trade-off between sensitivity (0.70) and specificity (0.60), and the model showed fair discriminative ability (AUROC 0.70) in identifying patients with SLVH/DLV. The large dataset (n = 71,129) may have provided statistical power to show a slight yet statistically significant positive correlation between CTR and left ventricular size, which indicates that CTR may have a measurable, though limited, relationship with left ventricular dimensions. However, the observed effect size in this analysis remains small, and CTR should continue to be interpreted with caution when used as a standalone diagnostic tool for SLVH or DLV on echocardiography performed within days of the CXR, as referenced in prior studies [11,21].
Notably, this is the first study to our knowledge to suggest that an increased CTR value is associated with a higher risk for an echocardiographic diagnosis of congestive heart failure months to years after the initial CXR. A dose–response relationship between CTR and risk of future development of SLVH/DLV was observed, with a baseline CTR > 0.65 representing more than four times the risk of developing SLVH/DLV compared to patients with baseline CTR ≤ 0.50. This suggests that elevated CTR values may serve as an early radiographic predictor for long-term outcomes of heart failure.
Deep learning-based models of CXRs have recently demonstrated ability to detect subtle structural abnormalities in the cardiac silhouette indicative of structural heart disease, which holds considerable promise as a diagnostic tool for detecting SLVH or DLV [14]. However, AI-assisted CTR measurements continue to have utility as a simple, first-look screening tool for identifying patients who may benefit from future evaluation with echocardiography [12]. This aligns with class I indications in the European Society of Cardiology (ESC) and American Heart Association/American College of Cardiology (AHA/ACC) guidelines to perform chest x-rays for suspected or new-onset heart failure [22,23]. Close clinical follow-up and repeat imaging of this patient population within the first few years could facilitate earlier detection and intervention for structural heart disease.
Limitations of our study include its retrospective design and the potential for confounding factors not accounted for in our analysis. The CheXchoNet database consisted of echocardiograms that were conducted within 12 months of the paired CXR, with the exact time interval between each paired CXR and echocardiogram not made available. As a result, the time-to-event analysis must be interpreted with the caveat that the reported times from baseline CXR to clinical outcome have potential variability within this time range. Despite this limitation, the significant dose–response relationship between CTR and development of SLVH/DLV supports our findings that increased CTR is associated with increased risk of future SLVH/DLV. Another limitation is that while strong agreement between CTR measured by human radiologists and CTR measured by an earlier version of the AI algorithm used in this study has previously been demonstrated in a separate, smaller dataset [17], the CheXchoNet database analyzed in this study did not have radiologist-annotated CTR measurements available for comparative analysis. Future studies on inter-observer variability between CTR measurements annotated by radiologists and AI-derived CTR measurements continue to be warranted to establish clinical confidence in the CTR algorithm. Next, the conclusions from this study may have limitations in a real-world cohort: they are not applicable to anteroposterior CXRs commonly obtained in an inpatient setting, and they may not hold for less common presentations, such as patients with severe cardiothoracic anatomical deformities, for which further training would be required to ensure robustness of the CTR segmentation algorithm. Lastly, the CheXchoNet database used is derived from a single institution, and additional validation using paired CXR-echocardiogram data from other institutions is needed.
5. Conclusion
In conclusion, our study highlights the utility of an AI-based approach for CTR measurement in predicting SLVH/DLV. AI-based CTR measurements can potentially help clinicians risk stratify patients for echocardiography and assist earlier detection and follow-up for patients at risk for congestive heart failure.