K-fold CV with xgboost
import numpy as np
import pandas as pd
import xgboost as xgb

# Hyperparameter grid: every combination of eta and subsample to evaluate.
grid = pd.DataFrame({'eta': [0.01, 0.05, 0.1] * 2,
                     'subsample': np.repeat([0.1, 0.3], 3)})

def fit(x):
    params = {'objective': 'binary:logistic',
              'eval_metric': 'logloss',
              'eta': x['eta'],
              'subsample': x['subsample']}
    # data_dmatrix is the xgb.DMatrix built from the training data earlier.
    xgb_cv = xgb.cv(dtrain=data_dmatrix, params=params,
                    nfold=5, metrics='logloss', seed=42)
    # Train/test logloss mean and std from the last boosting round.
    return xgb_cv.iloc[-1].values

grid[['train-logloss-mean', 'train-logloss-std',
      'test-logloss-mean', 'test-logloss-std']] = grid.apply(fit, axis=1,
                                                             result_type='expand')
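The snippet above assumes data_dmatrix already exists. As a minimal sketch, assuming the training features and labels live in arrays X and y (placeholder data, not the dataset used in the original), the DMatrix could be built like this:

import xgboost as xgb
from sklearn.datasets import make_classification

# Hypothetical stand-in data; the original's actual dataset is not shown.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
data_dmatrix = xgb.DMatrix(X, label=y)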
    eta  subsample  train-logloss-mean  train-logloss-std  test-logloss-mean  test-logloss-std
0  0.01        0.1            0.663682           0.003881           0.666744          0.003598
1  0.05        0.1            0.570629           0.012555           0.580309          0.023561
2  0.10        0.1            0.503440           0.017761           0.526891          0.031659
3  0.01        0.3            0.646587           0.002063           0.653741          0.004201
4  0.05        0.3            0.512229           0.008013           0.545113          0.018700
5  0.10        0.3            0.414103           0.012427           0.472379          0.032606
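From this table, the lowest mean test logloss identifies the best of the six settings (eta=0.10 with subsample=0.3). A small sketch, assuming grid holds the results above:

# Row with the lowest mean test logloss.
best = grid.loc[grid['test-logloss-mean'].idxmin()]
print(best[['eta', 'subsample', 'test-logloss-mean']])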