Open
Labels
enhancement: New feature or request
Description
Which feature do you want to include?
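We need a repeated (non-deterministic) group k-fold splitter: groups should be reshuffled into folds on every repeat, so that each repeat produces a different partition while all samples of a group stay in the same fold.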
How do you imagine this integrated in julearn?
Something like this, from @kaurao (generated with ChatGPT):
import numpy as np


class RepeatedGroupKFold:
    """Group k-fold repeated n_repeats times, reshuffling groups each repeat."""

    def __init__(self, n_splits=5, n_repeats=5, random_state=None):
        self.n_splits = n_splits
        self.n_repeats = n_repeats
        self.random_state = np.random.RandomState(random_state)

    def split(self, X, y=None, groups=None):
        if groups is None:
            raise ValueError("Groups must be provided for RepeatedGroupKFold.")
        groups = np.asarray(groups)
        unique_groups = np.unique(groups)
        # Guard against empty folds from np.array_split
        if self.n_splits > len(unique_groups):
            raise ValueError("n_splits cannot exceed the number of unique groups.")
        for _ in range(self.n_repeats):
            # Shuffle groups before each repeat
            shuffled_groups = self.random_state.permutation(unique_groups)
            folds = np.array_split(shuffled_groups, self.n_splits)
            for fold_groups in folds:
                # All samples of the held-out groups form the test set
                test_idx = np.isin(groups, fold_groups)
                train_idx = ~test_idx
                yield np.where(train_idx)[0], np.where(test_idx)[0]

    def get_n_splits(self, X=None, y=None, groups=None):
        return self.n_splits * self.n_repeats
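A quick way to sanity-check a splitter like the one above is to verify two invariants: no group appears in both train and test, and every sample lands in a test fold exactly once per repeat. The sketch below is only an illustration, not julearn or scikit-learn API. `repeated_group_kfold` and `check_splits` are hypothetical helper names, and the generator is a standard-library re-implementation of the same idea (it deals groups round-robin instead of using `np.array_split`, which gives the same fold sizes), so it runs without NumPy.

```python
import random
from collections import Counter


def repeated_group_kfold(groups, n_splits=5, n_repeats=5, seed=None):
    """Yield (train_idx, test_idx) lists; groups are reshuffled per repeat.

    Hypothetical stdlib-only helper mirroring the proposal above.
    """
    unique_groups = sorted(set(groups))
    if n_splits > len(unique_groups):
        raise ValueError("n_splits cannot exceed the number of unique groups.")
    rng = random.Random(seed)
    for _ in range(n_repeats):
        shuffled = unique_groups[:]
        rng.shuffle(shuffled)
        # Deal the shuffled groups round-robin into n_splits folds.
        folds = [set(shuffled[i::n_splits]) for i in range(n_splits)]
        for fold in folds:
            test_idx = [i for i, g in enumerate(groups) if g in fold]
            train_idx = [i for i, g in enumerate(groups) if g not in fold]
            yield train_idx, test_idx


def check_splits(splits, groups, n_repeats):
    """True if no group leaks and each sample is tested once per repeat."""
    test_counts = Counter()
    for train_idx, test_idx in splits:
        train_groups = {groups[i] for i in train_idx}
        test_groups = {groups[i] for i in test_idx}
        if train_groups & test_groups:
            return False  # a group straddles train and test
        test_counts.update(test_idx)
    return all(test_counts[i] == n_repeats for i in range(len(groups)))


# Toy data: 10 samples in 6 groups of unequal size.
groups = ["a", "a", "b", "b", "c", "c", "d", "e", "e", "f"]
splits = list(repeated_group_kfold(groups, n_splits=3, n_repeats=2, seed=0))
print(len(splits))                                # 6 = n_splits * n_repeats
print(check_splits(splits, groups, n_repeats=2))  # True
```

The same `check_splits` function should work on the output of the class above, since it only indexes `groups` by the yielded positions.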
Do you have a sample code that implements this outside of julearn?
Anything else to say?
No response