Felix Pei edited this page Sep 15, 2021 · 5 revisions

Can I train my model at a different bin size and re-bin or upsample the data to the desired submission bin size?

For the competition, you are allowed to do this, though we recommend noting it in the EvalAI submission metadata, either in the 'Method description' field or the 'Additional materials' field.

Due to the way NWBDataset handles resampling/binning data, the naive approach of simply running the provided tensor-making functions at the desired modeling bin size is not optimal. NWBDataset bins data starting from the beginning of the recording, not individually for each trial window, so all timestamps after resampling are multiples of the new bin size. Thus, if a given trial's alignment window does not lie exactly on a multiple of the bin size, the data actually included in the resampled alignment window is not the same as the data lying within the alignment window at the original 1 ms resolution.
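To make the offset concrete, here is a small numeric sketch (plain Python with NumPy, not nlb_tools; the values are made up for illustration) of how snapping timestamps to multiples of the bin size can shift a trial window:

```python
import numpy as np

bin_size = 5      # submission bin size in ms
trial_start = 103  # trial alignment point in ms -- not a multiple of bin_size

# Binning from the start of the recording snaps every timestamp down to a
# multiple of bin_size, so the first resampled bin available for this trial
# starts at 100 ms, not 103 ms:
resampled_start = (trial_start // bin_size) * bin_size
print(resampled_start)  # 100

# A 50 ms alignment window therefore covers different raw samples:
orig_window = np.arange(trial_start, trial_start + 50)          # 103..152
resampled_window = np.arange(resampled_start, resampled_start + 50)  # 100..149
print(np.setdiff1d(resampled_window, orig_window))  # [100 101 102]
```

Rounding the alignment points up to a multiple of the bin size, as described next, removes this discrepancy.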

When modeling at a smaller bin size than the desired submission bin size, a solution to this is to round the alignment points up to the nearest multiple of the desired submission bin size. The following code snippet demonstrates a possible implementation of this:

```python
from nlb_tools.make_tensors import PARAMS, make_train_input_tensors, make_eval_input_tensors

modeling_bin_size = 1    # modeling at 1 ms bin size
submission_bin_size = 5  # submitting at 5 ms bin size

# Get original alignment points and round up to nearest multiple of submission_bin_size
align_field = PARAMS[dataset_name]['make_params']['align_field']
rounded_align_field = dataset.trial_info[align_field].dt.ceil(f'{submission_bin_size}ms')
dataset.trial_info[f'{align_field}_{submission_bin_size}ms'] = rounded_align_field

# Create new make_params with updated align field
make_params = PARAMS[dataset_name]['make_params'].copy()
make_params['align_field'] = f'{align_field}_{submission_bin_size}ms'

# Use new make_params for tensor creation functions
train_dict = make_train_input_tensors(dataset, dataset_name, train_split, update_params={'make_params': make_params})
eval_dict = make_eval_input_tensors(dataset, dataset_name, eval_split, update_params={'make_params': make_params})
```

The tensors in train_dict and eval_dict are at 1 ms bin size but are aligned just as the 5 ms tensors would be.

If you want to model at a larger bin size than the desired submission bin size, we recommend simply making the tensors at the submission bin size and re-binning separately afterwards.
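As a sketch of that re-binning step, consecutive fine bins of a spike-count tensor can be summed along the time axis. The `rebin` helper below is hypothetical (not part of nlb_tools), and assumes tensors of shape (trials, time, neurons) whose time length is divisible by the bin-size ratio:

```python
import numpy as np

def rebin(tensor, factor):
    """Re-bin spike counts along the time axis by summing consecutive bins.

    tensor: array of shape (trials, time, neurons) at the finer bin size.
    factor: integer ratio of coarse to fine bin size (e.g. 4 for 5 ms -> 20 ms).
    """
    n_trials, n_time, n_neurons = tensor.shape
    assert n_time % factor == 0, "time length must be divisible by factor"
    # Group every `factor` consecutive time bins together, then sum within groups
    return tensor.reshape(n_trials, n_time // factor, factor, n_neurons).sum(axis=2)

# Toy example: 2 trials, 20 time bins at 5 ms, 3 neurons, one spike per bin
x = np.ones((2, 20, 3))
y = rebin(x, 4)
print(y.shape)  # (2, 5, 3)
print(y[0, 0, 0])  # 4.0
```

Summing (rather than averaging) preserves spike counts, which is what the submission tensors contain.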
