Demler et al. developed a modification of the Nam-D'Agostino test that maintains nominal type 1 error rates with censored outcomes. The Greenwood-Nam-D'Agostino (GND) test is applicable in settings where the proportional hazards assumption is invalid. The GND test compares observed and expected risk within risk groups, and the test is unstable when any group contains fewer than 5 events (see Demler et al.).
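As a rough guide to what "observed versus expected risk" means here, the statistic can be written as below. The symbols are chosen for illustration and are not taken from this page: G is the number of risk groups, KM_g(t) is the Kaplan-Meier estimate of risk in group g at the prediction horizon t, p̄_g(t) is the mean predicted risk in that group, and the variance is the Greenwood estimate.

```latex
\chi^2_{\mathrm{GND}}
  = \sum_{g=1}^{G}
    \frac{\left( KM_g(t) - \bar{p}_g(t) \right)^2}
         {\widehat{\mathrm{Var}}\left( KM_g(t) \right)}
  \;\sim\; \chi^2_{G-1}
```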
scalib_gnd(
  scalib_object,
  group_method = "lump",
  group_count_init = 10,
  group_count_min = 2,
  group_min_events = 5,
  verbose = 0
)

scalib_gnd_manual(
  scalib_object,
  pred_risk_col,
  group,
  group_min_events_warn = 5,
  group_min_events_stop = 2,
  verbose = 0
)

vec_gnd(
  pred_risk,
  pred_horizon,
  event_status,
  event_time,
  group_method = "lump",
  group_count_init = 10,
  group_count_min = 2,
  group_min_events = 5,
  verbose = 0
)

vec_gnd_manual(
  pred_risk,
  pred_horizon,
  event_status,
  event_time,
  group,
  group_min_events_warn = 5,
  group_min_events_stop = 2,
  verbose = 0
)
scalib_object: An object of class scalib (see scalib).
group_method: (character value; 'lump' or 'redo') If 'lump', then a 'lumping' procedure will be applied whenever a group has fewer than group_min_events events. The lumping procedure will identify whichever group has the lowest event count and assign members of that group to the group with too few events. If 'redo', then the groups will be re-formed using predrisk_grp_prcnt with one fewer group. 'lump' is the default because this method was studied by Demler et al.
group_count_init: (integer value) the initial number of groups to form based on percentiles of predicted risk values.
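Forming groups from percentiles of predicted risk can be sketched in base R as follows. This is a minimal illustration, not the package's own grouping code; `percentile_groups` is a hypothetical helper and the simulated `pred_risk` values are made up.

```r
# Hypothetical helper (not a package function): assign each observation to a
# risk group using equally spaced percentiles of the predicted risk.
percentile_groups <- function(pred_risk, group_count = 10) {
  probs <- seq(0, 1, length.out = group_count + 1)
  # unique() guards against duplicated cut points when predictions are tied
  breaks <- unique(quantile(pred_risk, probs = probs))
  # Groups are labelled 1, 2, ... from lowest to highest predicted risk
  as.integer(cut(pred_risk, breaks = breaks, include.lowest = TRUE))
}

set.seed(1)
pred_risk <- runif(200)
group <- percentile_groups(pred_risk, group_count = 10)
table(group)
```

The integer labels start at 1 and increase by 1, which is the same shape that the `group` argument of the manual functions expects.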
group_count_min: (integer value) the minimum number of groups to attempt running the GND test with. Only relevant if using scalib_gnd or vec_gnd; the manual variants do not take this argument.
group_min_events: (numeric value) The minimum number of events within a risk group (see details).
verbose: (integer value) If 0, no output will be printed. If 1, some output will be printed. If 2, all output will be printed.
pred_risk_col: (character value) the column name of the variable that contains predicted risk values. This variable should be in scalib_object$data_inputs.
group: (numeric vector) Only relevant if using scalib_gnd_manual. An integer-valued vector with group values starting at 1 and increasing by 1 for each additional group. These risk groups are used to run the GND test.
group_min_events_warn: (numeric value) The lowest event count within a risk group that will not cause a warning (see details). Only relevant if using scalib_gnd_manual.
group_min_events_stop: (numeric value) The lowest event count within a risk group that will not cause a hard stop (see details). Only relevant if using scalib_gnd_manual.
pred_risk: (numeric vector, list, data frame, or matrix) predicted risk values for the event at or before pred_horizon.
pred_horizon: (numeric value) the time horizon of the risk prediction; risk is predicted for the event occurring at or before this time.
event_status: (numeric vector) observed event status. Each value should be 0 (censored) or 1 (event observed).
event_time: (numeric vector) observed event times.
Value: an object of class scalib (see scalib).
scalib_gnd automatically forms risk groups, checks event counts in risk groups and, if necessary, collapses risk groups so that every group has an event count of at least a given threshold (group_min_events).
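The automatic group reduction can be pictured with a 'redo'-style search: form groups, count events, and try again with one fewer group until every group meets the threshold. The sketch below is an illustration in base R, not the package's internals; `find_group_count` is a hypothetical helper, the toy data are made up, and it assumes that only events observed at or before the horizon are counted. The 'lump' method merges groups instead of re-forming them and is not shown.

```r
# Hypothetical sketch of a 'redo'-style search; not the package's internals.
find_group_count <- function(pred_risk, event_time, event_status, pred_horizon,
                             group_count_init = 10, group_count_min = 2,
                             group_min_events = 5) {
  stopifnot(group_count_init >= group_count_min)
  # Assumption: only events observed at or before the horizon are counted
  is_event <- event_status == 1 & event_time <= pred_horizon
  for (group_count in seq(group_count_init, group_count_min, by = -1)) {
    probs <- seq(0, 1, length.out = group_count + 1)
    breaks <- unique(quantile(pred_risk, probs = probs))
    group <- as.integer(cut(pred_risk, breaks = breaks, include.lowest = TRUE))
    events_per_group <- tapply(is_event, group, sum)
    if (min(events_per_group) >= group_min_events) {
      return(list(group = group, events_per_group = events_per_group))
    }
  }
  stop("could not reach ", group_min_events, " events per group with ",
       group_count_min, " or more groups")
}

# Toy data: 100 observations, every fourth one has the event before the horizon
pred_risk <- (1:100) / 100
event_status <- as.numeric((1:100) %% 4 == 0)
event_time <- rep(1, 100)

result <- find_group_count(pred_risk, event_time, event_status,
                           pred_horizon = 2,
                           group_count_init = 10,
                           group_count_min = 5,
                           group_min_events = 5)
length(unique(result$group))
result$events_per_group
```

With this toy data the search rejects 10 through 6 groups and settles on 5 groups with 5 events each.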
scalib_gnd_manual completes the GND test, but requires the user to (1) create a column in scalib_object$data_inputs that contains risk group labels (see predrisk_grp_prcnt) and (2) specify the column name of the risk group column and the predicted risk column.
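To show what the test does once risk groups are supplied, here is a minimal base R sketch of a GND-style statistic: within each group, the Kaplan-Meier risk at the horizon is compared with the mean predicted risk and scaled by the Greenwood variance. It is a sketch under stated assumptions, not the package's implementation; `km_greenwood` and `gnd_sketch` are hypothetical names, the data are simulated, and every group is assumed to contain at least one event before the horizon.

```r
# Kaplan-Meier survival at `horizon` with its Greenwood variance (base R only)
km_greenwood <- function(time, status, horizon) {
  event_times <- sort(unique(time[status == 1 & time <= horizon]))
  surv <- 1
  gw_sum <- 0
  for (event_t in event_times) {
    n_at_risk <- sum(time >= event_t)
    n_events <- sum(time == event_t & status == 1)
    surv <- surv * (1 - n_events / n_at_risk)
    gw_sum <- gw_sum + n_events / (n_at_risk * (n_at_risk - n_events))
  }
  c(surv = surv, var = surv^2 * gw_sum)
}

# Hypothetical sketch of the GND statistic; not the package's implementation
gnd_sketch <- function(pred_risk, event_time, event_status, group, pred_horizon) {
  groups <- sort(unique(group))
  chisq <- 0
  for (g in groups) {
    in_g <- group == g
    km <- km_greenwood(event_time[in_g], event_status[in_g], pred_horizon)
    observed <- 1 - km[["surv"]]       # Kaplan-Meier estimate of risk
    expected <- mean(pred_risk[in_g])  # mean predicted risk in the group
    chisq <- chisq + (observed - expected)^2 / km[["var"]]
  }
  df <- length(groups) - 1
  c(chisq = chisq, df = df,
    pvalue = pchisq(chisq, df = df, lower.tail = FALSE))
}

# Simulated data in which the predicted risks are correct by construction
set.seed(42)
n <- 400
pred_horizon <- 5
pred_risk <- runif(n, min = 0.1, max = 0.9)
true_time <- rexp(n, rate = -log(1 - pred_risk) / pred_horizon)
censor_time <- runif(n, min = 0, max = 2 * pred_horizon)
event_time <- pmin(true_time, censor_time)
event_status <- as.numeric(true_time <= censor_time)
group <- as.integer(cut(pred_risk, quantile(pred_risk, 0:4 / 4),
                        include.lowest = TRUE))

res <- gnd_sketch(pred_risk, event_time, event_status, group, pred_horizon)
res
```

The degrees of freedom equal the number of groups minus one, which matches the gnd_df column in the example output (4 for 5 groups, 5 for 6 groups).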
Minimum event counts for risk groups: Low event counts within any risk group may cause high variability in the GND test results. It is recommended that all risk groups have at least 5 events, and this is why the default value of group_min_events_warn is 5. If there are fewer than 2 events in any group, the GND test is unstable and risk groups should be collapsed. Therefore, the default value of group_min_events_stop is 2.
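The warn and stop thresholds can be illustrated with a small check on event counts per group. This is a hedged sketch in base R: `check_group_events` is a hypothetical helper, the toy data are made up, and it assumes that only events observed at or before the horizon are counted.

```r
# Hypothetical helper mirroring the warn/stop thresholds described above
check_group_events <- function(event_status, event_time, group, pred_horizon,
                               group_min_events_warn = 5,
                               group_min_events_stop = 2) {
  # Assumption: events count only if observed at or before the horizon
  is_event <- event_status == 1 & event_time <= pred_horizon
  events_per_group <- tapply(is_event, group, sum)
  lowest <- min(events_per_group)
  if (lowest < group_min_events_stop) {
    stop("at least one risk group has fewer than ",
         group_min_events_stop, " events")
  }
  if (lowest < group_min_events_warn) {
    warning("at least one risk group has fewer than ",
            group_min_events_warn, " events")
  }
  events_per_group
}

# Toy data: group 1 has 3 events (warning), group 2 has 6 events
toy_group <- rep(1:2, each = 6)
toy_status <- c(1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1)
toy_time <- rep(1, 12)

counts <- check_group_events(toy_status, toy_time, toy_group, pred_horizon = 2)
counts
```

Here the call returns the counts with a warning, because the lowest count (3) is below the warn threshold of 5 but not below the stop threshold of 2.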
Demler, O.V., Paynter, N.P. and Cook, N.R., 2015. Tests of calibration and goodness-of-fit in the survival setting. Statistics in Medicine, 34(10), pp.1659-1680. DOI: 10.1002/sim.6428
sc <- scalib(pred_risk = pbc_scalib$predrisk,
             pred_horizon = 2500,
             event_time = pbc_scalib$test$time,
             event_status = pbc_scalib$test$status)
# run GND test using 10 groups; apply some rules:
# 1. require at least 5 events per group
# 2. try to create 10 groups, and lump the lowest event frequency groups
#    if any groups have fewer than 5 events.
# 3. Hard stop if this is not obtainable with 5 or more groups.
# 4. set verbose = 2 to see every detail.
# (note that these rules are not a recommendation on how to use the test;
#  this example just shows the mechanics of the automatic group reduction.
#  The recommended values are the default values.)
scalib_gnd(sc,
           group_min_events = 5,
           group_count_init = 10,
           group_count_min = 5,
           verbose = 2)
#> Checking event counts using 10 risk groups...
#> too few; trying again
#> Checking event counts using 9 risk groups...
#> too few; trying again
#> Checking event counts using 8 risk groups...
#> too few; trying again
#> Checking event counts using 7 risk groups...
#> too few; trying again
#> Checking event counts using 6 risk groups...
#> too few; trying again
#> Checking event counts using 5 risk groups...
#> okay
#>
#> Survival calibration object with prediction horizon of 2500
#>
#> -- Input data ----------------------------------------------------------------
#>
#> event_time event_status prop_hazard rsf_axis gradient_booster rsf_oblique
#> <int> <num> <num> <num> <num> <num>
#> 1: 400 1 0.9990 0.9026 0.9351 0.9463
#> 2: 4500 0 0.4272 0.3072 0.0524 0.3680
#> 3: 1925 1 0.8286 0.4722 0.2342 0.5982
#> 4: 1832 0 0.0358 0.1474 0.0422 0.1460
#> 5: 2466 1 0.0392 0.1925 0.0568 0.1558
#> ---
#> 134: 1300 0 0.1509 0.1783 0.0629 0.1669
#> 135: 1293 0 0.1805 0.3466 0.1299 0.2416
#> 136: 1250 0 0.9823 0.4743 0.3254 0.5727
#> 137: 1230 0 0.0182 0.0589 0.0322 0.0637
#> 138: 1153 0 0.0718 0.1637 0.0527 0.1220
#>
#>
#> -- Output data --------------------------------------------------
#>
#> Key: <._id_.>
#> ._id_. gnd_df gnd_chisq gnd_pvalue gnd_data
#> <fctr> <num> <num> <num> <list>
#> 1: prop_hazard 4 12.20 1.59e-02 <data.table[5x8]>
#> 2: rsf_axis 4 6.00 1.99e-01 <data.table[5x8]>
#> 3: gradient_booster 5 28.99 2.33e-05 <data.table[6x8]>
#> 4: rsf_oblique 4 5.63 2.28e-01 <data.table[5x8]>
#> 1 variable not shown: [gnd_group_method <char>]
#>