cumulative_dynamic_auc,RuntimeWarning: invalid value encountered in divide true_pos = cumsum_tp / cumsum_tp[-1] #507
-
Dear author, Upon investigation, I find that there's a RuntimeWarning at line 484 in metrics.py indicating an invalid value encountered in divide: true_pos = cumsum_tp / cumsum_tp[-1], where cumsum_tp[-1] contains a zero value. The data I'm using is from the public METABRIC dataset. For calculating c-index and IBS (Integrated Brier Score), I use pycox. The code for calculation directly follows the example of LogisticHazard in pycox. Both the training set and test set contain data with events marked as true and false across different time intervals. I want to know what causes the NaN values in the AUC result and how to solve this problem. If you can help answer this, I would be very grateful. The complete code is as follows: import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn_pandas import DataFrameMapper
import torch
import torchtuples as tt
from pycox.datasets import metabric
from pycox.models import LogisticHazard
from pycox.evaluation import EvalSurv
import sksurv
from sksurv.metrics import cumulative_dynamic_auc
np.random.seed(1234)
_ = torch.manual_seed(123)
df_train = metabric.read_df()
df_test = df_train.sample(frac=0.2)
df_train = df_train.drop(df_test.index)
df_val = df_train.sample(frac=0.2)
df_train = df_train.drop(df_val.index)
cols_standardize = ['x0', 'x1', 'x2', 'x3', 'x8']
cols_leave = ['x4', 'x5', 'x6', 'x7']
standardize = [([col], StandardScaler()) for col in cols_standardize]
leave = [(col, None) for col in cols_leave]
x_mapper = DataFrameMapper(standardize + leave)
x_train = x_mapper.fit_transform(df_train).astype('float32')
x_val = x_mapper.transform(df_val).astype('float32')
x_test = x_mapper.transform(df_test).astype('float32')
num_durations = 10
labtrans = LogisticHazard.label_transform(num_durations)
get_target = lambda df: (df['duration'].values, df['event'].values)
y_train = labtrans.fit_transform(*get_target(df_train))
y_val = labtrans.transform(*get_target(df_val))
train = (x_train, y_train)
val = (x_val, y_val)
durations_test, events_test = get_target(df_test)
durations_train, events_train = get_target(df_train)
in_features = x_train.shape[1]
num_nodes = [32, 32]
out_features = labtrans.out_features
batch_norm = True
dropout = 0.1
net = tt.practical.MLPVanilla(in_features, num_nodes, out_features, batch_norm, dropout)
model = LogisticHazard(net, tt.optim.Adam(0.01), duration_index=labtrans.cuts)
batch_size = 256
epochs = 100
callbacks = [tt.cb.EarlyStopping()]
log = model.fit(x_train, y_train, batch_size, epochs, callbacks, val_data=val)
surv = model.interpolate(10).predict_surv_df(x_test)
ev = EvalSurv(surv, durations_test, events_test, censor_surv='km')
ctd = ev.concordance_td('antolini')
print(f'Cindex Score for test: {ctd}')
# time_grid = np.linspace(durations_test.min(), durations_test.max(), 100)
# IBS = ev.integrated_brier_score(time_grid)
# print(f'IBS Score for test: {IBS}')
start_t = min(durations_test)
end_t = max(durations_test)
times = np.arange(start_t, end_t, ((end_t - start_t) / (model.predict(x_test).squeeze().shape[1])))
def transform_to_struct_array(times, events):
return sksurv.util.Surv.from_arrays(events, times)
estimate = model.predict(x_test).squeeze()
survival_train = transform_to_struct_array(durations_train, events_train)
survival_test = transform_to_struct_array(durations_test, events_test)
AUC_metric = cumulative_dynamic_auc(
survival_train,
survival_test,
estimate,
times)
print(f'AUC metric is:{AUC_metric[1]}') |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 3 replies
-
Thanks for the detailed report. Would it be possible to provide the actual values, you passed to |
Beta Was this translation helpful? Give feedback.
The problem is
times
, i.e. the time points where the time-dependent ROC is evaluated. The first entry is 0, but the first event insurvival_test
occurs at time 3.5. Hence, at time point 0 all samples are censored and there is no information which are the cases and which are the controls.This should be checked in
cumulative_dynamic_auc
and throw an exception that explains why this happened.