Open
Description
HDFS training fails because fixed_window returns an empty array for every line when min_len is 10. The hdfs/train file contains only a single sequence number per line, so the number of samples is 0.
head ../output/hdfs/train
6
13
1
9
3
5
3
3
3
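One way to confirm this is to count how many lines would survive the length filter. The sketch below is not the project's code: `count_usable_lines` is a hypothetical helper, and it assumes that a line's length is its number of whitespace-separated tokens and that lines shorter than min_len are dropped.

```python
def count_usable_lines(lines, min_len=10):
    """Count lines with at least min_len whitespace-separated tokens.

    Assumption: this mirrors how short sessions are filtered out; it is a
    hypothetical helper, not a function from the repository.
    """
    return sum(1 for line in lines if len(line.split()) >= min_len)


# First lines of ../output/hdfs/train as shown above: one number per line.
sample = ["6", "13", "1", "9", "3", "5", "3", "3", "3"]
print(count_usable_lines(sample))  # 0, since no line has 10 tokens
```

Passing an open file handle for ../output/hdfs/train instead of `sample` would give the count for the whole file.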
========================================
10%|████████████████▋ | 213/2131 [00:00<00:00, 657385.40it/s]
=> Total available sequences: 0
=> Using 21 samples for validation
Traceback (most recent call last):
  File "logbert-main/HDFS/logbert.py", line 103, in <module>
    Trainer(options).train()
  File "logbert-main/HDFS/../bert_pytorch/train_log.py", line 62, in train
    logkey_train, logkey_valid, time_train, time_valid = generate_train_valid(self.output_path + "train", window_size=self.window_size,
  File "logbert-main/HDFS/../bert_pytorch/dataset/sample.py", line 99, in generate_train_valid
    logkey_trainset, logkey_validset, time_trainset, time_validset = train_test_split(logkey_seq_pairs,
  File ".local/lib/python3.10/site-packages/sklearn/utils/_param_validation.py", line 216, in wrapper
    return func(*args, **kwargs)
  File ".local/lib/python3.10/site-packages/sklearn/model_selection/_split.py", line 2851, in train_test_split
    n_train, n_test = _validate_shuffle_split(
  File ".local/lib/python3.10/site-packages/sklearn/model_selection/_split.py", line 2426, in _validate_shuffle_split
    raise ValueError(
ValueError: test_size=21 should be either positive and smaller than the number of samples 0 or a float in the (0, 1) range
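The ValueError comes from asking train_test_split for 21 validation samples out of 0. Below is a minimal sketch of a guard that would fail earlier with a clearer message. It assumes the validation size is fixed before short sequences are filtered out, as the log output suggests. `check_split_sizes` is a hypothetical helper, not part of the repository or scikit-learn.

```python
def check_split_sizes(n_samples, test_size):
    """Return (n_train, n_test) or raise a descriptive error.

    Hypothetical helper: requires 0 < test_size < n_samples for an
    integer test_size, so both splits are non-empty.
    """
    if n_samples == 0:
        raise ValueError(
            "No sequences left after filtering; check min_len and the "
            "format of the train file."
        )
    if not 0 < test_size < n_samples:
        raise ValueError(
            f"test_size={test_size} must be between 1 and {n_samples - 1}."
        )
    return n_samples - test_size, test_size
```

Calling `check_split_sizes(0, 21)` with the values from the log above raises the first error, which points at the filtering step rather than at the split.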