Formulae for calculating IDI and NRI
Worked example of IDI and NRI use

This example uses data from an article published in the BMJ (1) that investigated whether a new risk factor, the metabolic syndrome, adds to the prediction of total mortality above and beyond established risk factors (smoking, diabetes, hypertension, total cholesterol) that are already used in the primary preventive clinical setting. The question was studied in 2322 50-year-old men of the Uppsala Longitudinal Study of Adult Men cohort. Two logistic regression models with total mortality as outcome (1078 died) were investigated and compared: a small model with only the established risk factors as exposures, and a larger model with the established risk factors plus a metabolic syndrome variable as exposures. Please see the original article for a detailed description (1). The Ethics Committee of Uppsala University Hospital approved the study, and written informed consent was obtained from all participants.

The variables used in this example are abbreviated as follows: “mortality” = mortality, “smo” = smoking, “diab” = diabetes, “htn” = hypertension, “chol” = cholesterol, “mets” = the metabolic syndrome (the new risk factor; a single variable in this example, but the software could equally well handle multiple new risk factors simultaneously). No established thresholds exist for total mortality risk over 30 years in middle-aged men, so for this example we constructed three risk groups: low (<33%), intermediate (33-67%), and high (>67%) risk. This choice reflects the fact that the average remaining lifespan of a 50-year-old Swedish man in 1970 (2) was similar to the follow-up time in the sample.

The SAS software needs the following input in this example:

    %nriidi(ds = name of the data set, y = mortality, id = identification variable, model1n = chol, model1c = smo diab htn, model2n = chol, model2c = smo diab htn mets, nriskcat = 3, riskcutoffs = 33 67, printtable = Y)
The Stata software needs the following input for calculating IDI:

    idi mortality smo diab htn chol, prvars(mets)

and the following input for calculating NRI:

    nri2 mortality smo diab htn chol, prvars(mets) cut(33 67)

The nri2 version of the command was chosen because we used two risk cutpoints in this example.

The R software needs the following input: the model formula for logistic regression (in the format mortality ~ smo + diab + htn + chol + mets, with the new predictor last), ds (name of the dataset), lim (vector with limits for risk strata), and npred (number of new predictors in the model).

The resulting established measures and the NRI and IDI are found in Tables 1 and 2. In the example, the C-statistics of the two models are not statistically significantly different, but the NRI and IDI are both statistically significantly different from zero. The clinical relevance of that phenomenon is unclear, and how the magnitudes of NRI and IDI should be interpreted is also an area for further research. Technically, the IDI is similar to the difference in Pearson’s R² measures between two models, or the difference in scaled Brier scores (3). IDIs from different clinical fields may be valued differently. Notably, the NRI reclassification table is important per se.
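For orientation, the quantities reported by the software can be written out as follows. This is a sketch of the definitions commonly used in the reclassification literature. The notation is an assumption introduced here and is not taken from the original text: "up" and "down" denote movement to a higher or lower risk category when going from the small to the larger model, and \bar{p} denotes a mean predicted probability.

```latex
% Net reclassification improvement (category-based)
\[
\mathrm{NRI} =
  \bigl[ P(\text{up} \mid \text{case}) - P(\text{down} \mid \text{case}) \bigr]
  - \bigl[ P(\text{up} \mid \text{non-case}) - P(\text{down} \mid \text{non-case}) \bigr]
\]

% Asymptotic test of NRI = 0, with n_1 cases and n_0 non-cases
\[
z_{\mathrm{NRI}} =
  \frac{\mathrm{NRI}}
       {\sqrt{\dfrac{P(\text{up} \mid \text{case}) + P(\text{down} \mid \text{case})}{n_1}
            + \dfrac{P(\text{up} \mid \text{non-case}) + P(\text{down} \mid \text{non-case})}{n_0}}}
\]

% Integrated discrimination improvement
\[
\mathrm{IDI} =
  \bigl( \bar{p}_{\text{larger},\,\text{cases}} - \bar{p}_{\text{small},\,\text{cases}} \bigr)
  - \bigl( \bar{p}_{\text{larger},\,\text{non-cases}} - \bar{p}_{\text{small},\,\text{non-cases}} \bigr)
\]
```

Under these definitions, a positive NRI or IDI indicates that the larger model separates cases from non-cases better than the small model does.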
Table 1. Tests of Discriminatory Value, Goodness of Fit, and Calibration

| Model or comparison | Measure | Estimate | P-value |
|---|---|---|---|
| Odds ratio of total mortality corresponding to having the metabolic syndrome (vs. not), adjusted for established risk factors | Odds ratio | 1.55 (95% CI 1.22 to 1.96) | <0.001 |
| Results from a small and a larger logistic regression model | | | |
| Model with established risk factors (small model) | C-statistic | 0.648 (95% CI 0.626 to 0.671) | |
| | HL chi2 | 10.98 | 0.20 |
| | LR chi2 | 160.85 | <0.001 |
| Model with established risk factors and the metabolic syndrome variable (larger model) | C-statistic | 0.653 (95% CI 0.630 to 0.675) | |
| | HL chi2 | 7.13 | 0.52 |
| | LR chi2 | 173.98 | <0.001 |
| Comparing the small and the larger models | | | |
| | Difference between C-statistics | | 0.19 |
| | LR test | 13.12 | 0.0003 |
| | IDI | 0.0053 (95% CI 0.0023 to 0.0083) | 0.0006 |
| | NRI | 0.0328 (95% CI 0.0075 to 0.0582) | 0.011 |
Hosmer-Lemeshow (HL) tests based on 10 groups. CI, confidence interval; LR, likelihood ratio; IDI, integrated discrimination improvement; NRI, net reclassification improvement (risk classification tables corresponding to this NRI are presented in Table 2).
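The LR test in Table 1 can be cross-checked from the two LR chi2 values. The sketch below assumes that the larger model adds exactly one parameter (the single metabolic syndrome variable), so the test has 1 degree of freedom. The helper name chi2_sf_1df is hypothetical and introduced only for this illustration.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    With 1 df the chi-square variable is the square of a standard normal,
    so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# LR chi2 values of the two nested models, copied from Table 1.
lr_small = 160.85
lr_large = 173.98

# The difference of the rounded values; Table 1 reports 13.12,
# presumably computed from unrounded statistics.
lr_diff = lr_large - lr_small

p_from_reported = chi2_sf_1df(13.12)
p_from_difference = chi2_sf_1df(lr_diff)

print(f"LR difference = {lr_diff:.2f}")
print(f"p (reported statistic 13.12) = {p_from_reported:.4f}")
print(f"p (difference of rounded values) = {p_from_difference:.4f}")
```

Both p-values round to 0.0003, in agreement with the LR test row of Table 1.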
Table 2. Risk Classification Tables

Classification tables of predicted risks from the small and the larger model. Rows show predicted risks from the model with established risk factors (small model); columns show predicted risks from the model with established risk factors and the metabolic syndrome variable (larger model).

Non-Cases

| Small model \ Larger model | Low | Intermediate | High | Total |
|---|---|---|---|---|
| Low | 353 | 26 | 0 | 379 |
| Intermediate | 16 | 797 | 31 | 844 |
| High | 0 | 8 | 13 | 21 |
| Total | 369 | 831 | 44 | 1244 |

Cases

| Small model \ Larger model | Low | Intermediate | High | Total |
|---|---|---|---|---|
| Low | 150 | 20 | 0 | 170 |
| Intermediate | 6 | 741 | 79 | 826 |
| High | 0 | 29 | 53 | 82 |
| Total | 156 | 790 | 132 | 1078 |
Data in the tables are numbers of persons in classes of predicted risk from a small model and a larger model (same models as in Table 1), separately for non-cases and cases. Ideally, the larger model assigns a higher risk class to cases and a lower risk class to non-cases than the small model. NRI is positive whenever the net proportion of cases moved to a higher risk class exceeds the net proportion of non-cases moved to a higher risk class, as in this example (the NRI for the comparison of these two models is given in Table 1).
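The NRI in Table 1 can be recomputed directly from the counts in Table 2. The following is a minimal sketch, assuming the usual category-based NRI definition and its asymptotic z-test. The function and variable names are hypothetical and are not part of the SAS, Stata, or R software described above.

```python
import math

# Rows: risk class under the small model; columns: risk class under the
# larger model. Class order is low, intermediate, high (counts from Table 2).
NON_CASES = [
    [353, 26, 0],
    [16, 797, 31],
    [0, 8, 13],
]
CASES = [
    [150, 20, 0],
    [6, 741, 79],
    [0, 29, 53],
]


def count_moves(table):
    """Return (moved_up, moved_down, total) for a square reclassification table."""
    up = down = total = 0
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            total += n
            if j > i:      # larger model assigns a higher risk class
                up += n
            elif j < i:    # larger model assigns a lower risk class
                down += n
    return up, down, total


def nri_with_test(cases, non_cases):
    """Category-based NRI with standard error, z statistic, two-sided p-value."""
    up_c, down_c, n_c = count_moves(cases)
    up_n, down_n, n_n = count_moves(non_cases)
    nri = (up_c - down_c) / n_c - (up_n - down_n) / n_n
    se = math.sqrt((up_c + down_c) / n_c**2 + (up_n + down_n) / n_n**2)
    z = nri / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return nri, se, z, p


nri, se, z, p_value = nri_with_test(CASES, NON_CASES)
ci_low, ci_high = nri - 1.96 * se, nri + 1.96 * se
print(f"NRI = {nri:.4f} (95% CI {ci_low:.4f} to {ci_high:.4f}), p = {p_value:.3f}")
```

With the Table 2 counts, 99 cases move up and 35 move down (net 64 of 1078), while 57 non-cases move up and 24 move down (net 33 of 1244). The sketch gives an NRI of about 0.0328 with a 95% CI of about 0.0075 to 0.0582 and a p-value of about 0.011, matching the NRI row of Table 1.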
References

(1) Sundstrom J, Riserus U, Byberg L, Zethelius B, Lithell H, Lind L. Clinical value of the metabolic syndrome for long term prediction of total and cardiovascular mortality: prospective, population based cohort study. BMJ 2006 April 15;332(7546):878-82.
(2) SCB, Statistics Sweden. 9-12-2005.
(3) Steyerberg EW, Vickers AJ, Cook NR et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 2010 January;21(1):128-38.