Table 3 Performance metrics of biomarker-based models

From: Deep learning-based predictive biomarker of pathological complete response to neoadjuvant chemotherapy from histological images in breast cancer

| Metric | pCR-score-based, mean (95% CI) | sTILs-based^a, mean (95% CI) | P | Subtype-based^a, mean (95% CI) | P | Baseline^a, mean (95% CI) | P | Integrated^b, mean (95% CI) | P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| F1 score | 0.503 (0.424–0.581) | 0.250 (0.181–0.320) | < 0.001 | 0.386 (0.252–0.519) | 0.100 | 0.565 (0.491–0.639) | 0.196 | 0.682 (0.631–0.732) | 0.002 |
| Accuracy | 0.853 (0.827–0.879) | 0.810 (0.786–0.834) | < 0.001 | 0.815 (0.785–0.846) | 0.008 | 0.840 (0.808–0.873) | 0.380 | 0.884 (0.859–0.909) | < 0.001 |
| AUC | 0.822 (0.784–0.861) | 0.780 (0.753–0.806) | < 0.001 | 0.818 (0.782–0.855) | 0.858 | 0.839 (0.812–0.866) | 0.485 | 0.890 (0.863–0.916) | 0.001 |
| Sensitivity | 0.396 (0.314–0.474) | 0.173 (0.118–0.226) | < 0.001 | 0.370 (0.230–0.511) | 0.730 | 0.541 (0.433–0.648) | 0.026 | 0.633 (0.552–0.715) | < 0.001 |
| PPV | 0.781 (0.699–0.864) | 0.494 (0.377–0.612) | < 0.001 | 0.418 (0.278–0.559) | < 0.001 | 0.686 (0.571–0.801) | 0.424 | 0.785 (0.707–0.863) | 0.022 |
| Specificity | 0.971 (0.959–0.984) | 0.969 (0.961–0.978) | 1.000 | 0.936 (0.912–0.960) | 0.057 | 0.918 (0.876–0.961) | 0.092 | 0.945 (0.924–0.973) | 0.180 |
| NPV | 0.864 (0.835–0.893) | 0.823 (0.798–0.851) | < 0.001 | 0.855 (0.815–0.895) | 0.472 | 0.889 (0.857–0.922) | 0.070 | 0.910 (0.884–0.936) | 0.144 |
| TP | 3.250 (2.590–3.910) | 1.438 (1.004–1.871) | < 0.001 | 2.875 (1.794–3.956) | 0.791 | 4.500 (3.527–5.473) | 0.092 | 5.313 (4.517–6.108) | 0.021 |
| FN | 5.313 (4.118–6.508) | 7.125 (5.993–8.258) | < 0.001 | 5.688 (3.897–7.478) | 0.791 | 4.063 (2.780–5.345) | 0.092 | 3.250 (2.287–4.213) | 0.021 |
| FP | 1.000 (0.565–1.435) | 1.063 (0.757–1.368) | 1.000 | 2.250 (1.413–3.087) | 0.057 | 2.813 (1.367–4.258) | 0.092 | 1.750 (0.936–2.564) | 0.180 |
| TN | 33.44 (32.41–34.47) | 33.38 (32.29–34.46) | 1.000 | 32.19 (31.14–33.29) | 0.057 | 31.63 (29.84–33.41) | 0.092 | 32.69 (31.24–34.14) | 0.180 |
  1. Bold indicates statistical significance (P < 0.05)
  2. pCR pathological complete response; sTILs stromal tumor-infiltrating lymphocytes; OR odds ratio; CI confidence interval; AUC area under the curve
  3. a: P values are from comparisons with the pCR-score-based model
  4. b: P value is solely from the comparison between the baseline model and the integrated model (baseline: sTILs + subtype; integrated: sTILs + subtype + pCR-score)
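
As a reading aid, the ratio metrics in the table follow from the TP, FN, FP and TN counts by the standard confusion-matrix definitions. The sketch below is illustrative only and is not the authors' code: the function name `confusion_metrics` is hypothetical, and it is fed the mean counts reported for the pCR-score-based model. The non-integer counts suggest they are averages over repeated evaluations (an assumption; the table does not say), so ratios computed from mean counts need not reproduce the reported mean metrics exactly.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),        # harmonic mean of PPV and sensitivity
    }


# Mean counts from the pCR-score-based column of the table.
# Assumption: these are averages over repeated runs, so the derived ratios
# only approximate the reported mean metrics.
metrics = confusion_metrics(tp=3.250, fn=5.313, fp=1.000, tn=33.44)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these inputs, accuracy and specificity round to the reported 0.853 and 0.971, while sensitivity, PPV, NPV and F1 differ slightly from the tabulated means, which is what one would expect if the published values are averages of per-run ratios.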