Make training deterministic #101
Merged
Using `seed_everything(seed)` helps with reproducibility by setting seeds for Python's `random`, NumPy, and PyTorch (both CPU and CUDA), and it also sets:

- `random.seed(seed)`: Python's built-in random module.
- `np.random.seed(seed)`: NumPy's random number generator.
- `torch.manual_seed(seed)`: PyTorch's RNG on CPU.
- `torch.cuda.manual_seed(seed)`: PyTorch's RNG on the current GPU.
- `torch.cuda.manual_seed_all(seed)`: PyTorch's RNG on all GPUs.
- `torch.backends.cudnn.deterministic = True`: Forces cuDNN to use deterministic algorithms.
- `torch.backends.cudnn.benchmark = False`: Disables the autotuner that selects the fastest convolution algorithm, which can introduce randomness.

However, this doesn't imply that `Trainer(deterministic=True)` is set. The `deterministic` flag in the Trainer does more: it enforces deterministic algorithms during training and can catch known non-deterministic ops (especially on GPU), which is useful for debugging reproducibility issues.

✅ Recommendation: Use both together for maximum reproducibility.
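A minimal sketch of using both together, assuming the `pytorch_lightning` package name. The helper `make_deterministic_trainer`, its default seed of 42, and the `workers=True` argument are illustrative choices, not part of this PR:

```python
def make_deterministic_trainer(seed: int = 42, **trainer_kwargs):
    """Seed all RNGs, then build a Trainer with deterministic algorithms enabled.

    Hypothetical helper for illustration; adapt the name and defaults to the project.
    """
    # Imported lazily so this sketch can be loaded without Lightning installed.
    import pytorch_lightning as pl

    # Seeds Python, NumPy, and PyTorch; workers=True also seeds dataloader workers.
    pl.seed_everything(seed, workers=True)

    # The deterministic flag is separate from seeding and must be set explicitly.
    return pl.Trainer(deterministic=True, **trainer_kwargs)
```

Calling, for example, `make_deterministic_trainer(seed=0, max_epochs=1)` would forward `max_epochs` to the `Trainer` unchanged. Keeping the two calls in one place makes it harder to seed without also enabling the deterministic flag.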
PyTorch Lightning's `seed_everything` uses a seed of 0 by default if no seed is provided, but it does not set the Trainer's `deterministic` flag, which is `None` by default.

Please check the links below: