Demler et al. developed a modification of the Nam-D'Agostino test that maintains nominal type 1 error rates with censored outcomes. The Greenwood-Nam-D'Agostino (GND) test is applicable in settings where the proportional hazards assumption is invalid. The GND test is based on groupwise error rates for observed versus expected risk, and it is unstable when any group contains fewer than 5 events (see Demler et al.).
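
To make "observed versus expected risk" concrete, here is a minimal sketch of a GND-style statistic along the lines described by Demler et al. Within each risk group, the Kaplan-Meier event probability at the prediction horizon is compared with the mean predicted risk and scaled by the Greenwood variance. The components are then summed and referred to a chi-squared distribution with one fewer degree of freedom than the number of groups. The helper gnd_sketch and the simulated data are hypothetical and are not this package's implementation. The sketch assumes the survival package is installed.

```r
library(survival)

# Hypothetical helper for illustration only; not the package's code.
gnd_sketch <- function(pred_risk, pred_horizon, event_status, event_time, group) {
  groups <- sort(unique(group))
  parts <- vapply(groups, function(g) {
    in_g <- group == g
    fit <- survfit(Surv(event_time[in_g], event_status[in_g]) ~ 1)
    at_h <- summary(fit, times = pred_horizon, extend = TRUE)
    observed <- 1 - at_h$surv          # Kaplan-Meier event probability
    expected <- mean(pred_risk[in_g])  # mean predicted risk in the group
    variance <- at_h$std.err^2         # Greenwood variance of the KM estimate
    # a group with no usable variance contributes nothing (assumption)
    if (!is.finite(variance) || variance == 0) 0 else (observed - expected)^2 / variance
  }, numeric(1))
  chisq <- sum(parts)
  df <- length(groups) - 1
  c(chisq = chisq, df = df, pvalue = pchisq(chisq, df, lower.tail = FALSE))
}

# Simulated data: event times are drawn so that P(event by time 1) = pred_risk
set.seed(42)
n <- 500
pred_risk <- runif(n, min = 0.05, max = 0.60)
true_time <- rexp(n, rate = -log(1 - pred_risk))
cens_time <- runif(n, min = 0.5, max = 3)
event_time <- pmin(true_time, cens_time)
event_status <- as.integer(true_time <= cens_time)

# Five risk groups formed at quintiles of predicted risk
breaks <- quantile(pred_risk, probs = seq(0, 1, by = 0.2), names = FALSE)
group <- as.integer(cut(pred_risk, breaks = breaks, include.lowest = TRUE))

result <- gnd_sketch(pred_risk, pred_horizon = 1, event_status, event_time, group)
result
```

Because the simulated predictions match the data-generating risks, a large p-value would be the typical outcome here, though any single simulated sample can differ.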

scalib_gnd(
  scalib_object,
  group_method = "lump",
  group_count_init = 10,
  group_count_min = 2,
  group_min_events = 5,
  verbose = 0
)

scalib_gnd_manual(
  scalib_object,
  pred_risk_col,
  group,
  group_min_events_warn = 5,
  group_min_events_stop = 2,
  verbose = 0
)

vec_gnd(
  pred_risk,
  pred_horizon,
  event_status,
  event_time,
  group_method = "lump",
  group_count_init = 10,
  group_count_min = 2,
  group_min_events = 5,
  verbose = 0
)

vec_gnd_manual(
  pred_risk,
  pred_horizon,
  event_status,
  event_time,
  group,
  group_min_events_warn = 5,
  group_min_events_stop = 2,
  verbose = 0
)

Arguments

scalib_object

An object of class scalib (see scalib).

group_method

(character value; 'lump' or 'redo') If 'lump', then a 'lumping' procedure will be applied whenever a group has fewer than group_min_events events. The lumping procedure will identify whichever group has the lowest event count and assign members of that group to the group with too few events. If 'redo', then the groups will be re-done using predrisk_grp_prcnt but with one fewer group. 'lump' is the default as this method was studied by Demler et al.

group_count_init

(integer value) the initial number of groups to form based on percentiles of predicted risk values.

group_count_min

(integer value) the minimum number of groups to attempt running the GND test with. Only relevant if using scalib_gnd or vec_gnd.

group_min_events

(numeric value) The minimum number of events within a risk group (see details).

verbose

(integer value) If 0, no output will be printed. If 1, some output will be printed. If 2, all output will be printed.

pred_risk_col

(character value) the column name of the variable that contains predicted risk values. This variable should be in scalib_object$data_inputs.

group

(numeric vector) Only relevant if using scalib_gnd_manual. An integer valued vector with group values starting at 1 and increasing by 1 for each additional group. These risk groups are used to run the GND test.

group_min_events_warn

(numeric value) The lowest event count within a risk group that will not cause a warning (see details). Only relevant if using scalib_gnd_manual.

group_min_events_stop

(numeric value) The lowest event count within a risk group that will not cause a hard stop (see details). Only relevant if using scalib_gnd_manual.

pred_risk

(numeric vector, list, data frame, or matrix) predicted risk values for the event at or before pred_horizon.

pred_horizon

(numeric value) the time of risk prediction.

event_status

(numeric vector) observed event status. The values of this vector should be 0 (event censored) and 1 (event observed).

event_time

(numeric vector) observed event times.

Value

an object of class scalib (see scalib)

Details

scalib_gnd automatically forms risk groups, checks event counts in risk groups and, if necessary, collapses risk groups so that every group has an event count greater than a given threshold.
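
The form, check, and collapse cycle described above can be sketched in base R. This sketch follows a 'redo'-style strategy (re-cut the percentiles with one fewer group until every group has enough events) and does not show the 'lump' procedure. The helpers count_events and regroup_until_ok are hypothetical, the simulated data are invented for illustration, and counting only events observed at or before the prediction horizon is an assumption rather than a statement about the package's internals.

```r
# Hypothetical helpers for illustration only; not the package's code.

# Count events observed at or before the horizon within each group (assumption)
count_events <- function(group, event_status, event_time, pred_horizon) {
  observed <- event_status == 1 & event_time <= pred_horizon
  tapply(observed, group, sum)
}

# Re-cut percentile groups with one fewer group until all groups pass
regroup_until_ok <- function(pred_risk, event_status, event_time, pred_horizon,
                             group_count_init = 10, group_count_min = 2,
                             group_min_events = 5) {
  stopifnot(group_count_init >= group_count_min)
  for (k in seq(group_count_init, group_count_min)) {
    probs <- seq(0, 1, length.out = k + 1)
    breaks <- unique(quantile(pred_risk, probs = probs, names = FALSE))
    group <- as.integer(cut(pred_risk, breaks = breaks, include.lowest = TRUE))
    events <- count_events(group, event_status, event_time, pred_horizon)
    if (all(events >= group_min_events)) return(group)
  }
  stop("could not form ", group_count_min, " or more groups with at least ",
       group_min_events, " events each")
}

# Simulated data
set.seed(2)
n <- 300
pred_risk <- runif(n)
event_time <- rexp(n, rate = 0.5 + pred_risk)
event_status <- rbinom(n, size = 1, prob = 0.8)

group <- regroup_until_ok(pred_risk, event_status, event_time,
                          pred_horizon = 1, group_min_events = 25)
events <- count_events(group, event_status, event_time, pred_horizon = 1)
events
```

A deliberately high threshold of 25 events is used so that the loop has to drop some groups before it succeeds. As in the package example, this is meant to show the mechanics, not to recommend a threshold.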

scalib_gnd_manual completes the GND test, but requires the user to

  1. create a column in the scalib_object$data_inputs that contains risk group labels (see predrisk_grp_prcnt)

  2. specify the column name of the risk group column and the predicted risk column.
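
A risk group column of the kind required in step 1 can be sketched in base R as follows. The package's own helper for this is predrisk_grp_prcnt; the code below is a stand-in written for illustration and does not call or reproduce that function. The data frame data_inputs and the column names model_a and risk_group are assumptions.

```r
# Hypothetical stand-in data; the column names are assumptions.
set.seed(1)
data_inputs <- data.frame(model_a = runif(200))

# Cut predicted risk at equally spaced percentiles to form 4 groups
group_count <- 4
breaks <- quantile(data_inputs$model_a,
                   probs = seq(0, 1, length.out = group_count + 1),
                   names = FALSE)
data_inputs$risk_group <- as.integer(
  cut(data_inputs$model_a, breaks = breaks, include.lowest = TRUE)
)

# Group labels are consecutive integers starting at 1
sort(unique(data_inputs$risk_group))
table(data_inputs$risk_group)
```

Converting the factor returned by cut() to integer yields labels that start at 1 and increase by 1, which is the format the group argument describes.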

Minimum event counts for risk groups: Low event counts within any risk group may cause high variability in the GND test results. It is recommended that all risk groups have at least 5 events, and this is why the default value of group_min_events_warn is 5. If there are fewer than 2 events in any group, the GND test is unstable and risk groups should be collapsed. Therefore, the default value of group_min_events_stop is 2.
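
The two thresholds can be read as a pair of checks on the smallest group-level event count. The sketch below is one plausible way to express that logic. The function check_group_events is hypothetical, and the order of the checks and the wording of the messages are assumptions rather than the package's behavior.

```r
# Hypothetical helper for illustration only; not the package's code.
check_group_events <- function(events_per_group,
                               group_min_events_warn = 5,
                               group_min_events_stop = 2) {
  lowest <- min(events_per_group)
  # hard stop: a group is below the lowest count that avoids an error
  if (lowest < group_min_events_stop) {
    stop("at least one group has fewer than ",
         group_min_events_stop, " events")
  }
  # warning: a group is below the lowest count that avoids a warning
  if (lowest < group_min_events_warn) {
    warning("at least one group has fewer than ",
            group_min_events_warn, " events")
  }
  invisible(lowest)
}

# All groups have at least 5 events: no warning, no error
check_group_events(c(12, 9, 7))

# One group has 3 events: a warning is raised and captured here
warned <- tryCatch(check_group_events(c(12, 3, 7)),
                   warning = function(w) "warning")

# One group has 1 event: an error is raised and captured here
stopped <- tryCatch(check_group_events(c(12, 1, 7)),
                    error = function(e) "error")
c(warned, stopped)
```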

References

Demler, O.V., Paynter, N.P. and Cook, N.R., 2015. Tests of calibration and goodness‐of‐fit in the survival setting. Statistics in medicine, 34(10), pp.1659-1680. DOI: 10.1002/sim.6428

Examples


sc <- scalib(pred_risk = pbc_scalib$predrisk,
             pred_horizon = 2500,
             event_time = pbc_scalib$test$time,
             event_status = pbc_scalib$test$status)

# run GND test using 10 groups; apply some rules:
#  1. require at least 40 events per group
#  2. try to create 10 groups, and lump the lowest event frequency groups
#     if any groups have less than 40 events.
#  3. Hard stop if this is not obtainable with 5 or more groups.
#  4. set verbose = 2 to see every detail.
# (note that these rules are not a recommendation on how to use the test;
#  this example just shows the mechanics of the automatic group reduction.
#  The recommended values are the default values.)

scalib_gnd(sc,
           group_min_events = 40,
           group_count_init = 10,
           group_count_min = 5,
           verbose = 2)
#> Checking event counts using 10 risk groups...
#> too few; trying again
#> Checking event counts using 9 risk groups...
#> too few; trying again
#> Checking event counts using 8 risk groups...
#> too few; trying again
#> Checking event counts using 7 risk groups...
#> too few; trying again
#> Checking event counts using 6 risk groups...
#> too few; trying again
#> Checking event counts using 5 risk groups...
#> okay
#> 
#> Survival calibration object with prediction horizon of 2500
#> 
#> -- Input data ----------------------------------------------------------------
#> 
#>      event_time event_status prop_hazard rsf_axis gradient_booster rsf_oblique
#>           <int>        <num>       <num>    <num>            <num>       <num>
#>   1:        400            1      0.9990   0.9026           0.9351      0.9463
#>   2:       4500            0      0.4272   0.3072           0.0524      0.3680
#>   3:       1925            1      0.8286   0.4722           0.2342      0.5982
#>   4:       1832            0      0.0358   0.1474           0.0422      0.1460
#>   5:       2466            1      0.0392   0.1925           0.0568      0.1558
#>  ---                                                                          
#> 134:       1300            0      0.1509   0.1783           0.0629      0.1669
#> 135:       1293            0      0.1805   0.3466           0.1299      0.2416
#> 136:       1250            0      0.9823   0.4743           0.3254      0.5727
#> 137:       1230            0      0.0182   0.0589           0.0322      0.0637
#> 138:       1153            0      0.0718   0.1637           0.0527      0.1220
#> 
#> 
#> -- Output data --------------------------------------------------
#> 
#> Key: <._id_.>
#>              ._id_. gnd_df gnd_chisq gnd_pvalue          gnd_data
#>              <fctr>  <num>     <num>      <num>            <list>
#> 1:      prop_hazard      4     12.20   1.59e-02 <data.table[5x8]>
#> 2:         rsf_axis      4      6.00   1.99e-01 <data.table[5x8]>
#> 3: gradient_booster      5     28.99   2.33e-05 <data.table[6x8]>
#> 4:      rsf_oblique      4      5.63   2.28e-01 <data.table[5x8]>
#> 1 variable not shown: [gnd_group_method <char>]
#>