This repository was archived by the owner on May 13, 2025. It is now read-only.
@mpc.run_multiprocess can not work correctly since CUDA initialization error occurs #511
Open
Description
I have installed the requirements as instructed and can successfully run Tutorial 1, which suggests that the basic dependencies are installed. However, Tutorial 2 does not work for me. When executing
import crypten
import crypten.mpc as mpc
import crypten.communicator as comm

@mpc.run_multiprocess(world_size=2)
def examine_arithmetic_shares():
    x_enc = crypten.cryptensor([1, 2, 3], ptype=crypten.mpc.arithmetic)
    rank = comm.get().get_rank()
    crypten.print(f"\nRank {rank}:\n {x_enc}\n", in_order=True)

x = examine_arithmetic_shares()
I get the following error:
Process Process-2:
Process Process-1:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/CrypTen/tutorials/crypten/mpc/context.py", line 29, in _launch
    crypten.init()
  File "/home/CrypTen/tutorials/crypten/mpc/context.py", line 29, in _launch
    crypten.init()
  File "/home/CrypTen/tutorials/crypten/__init__.py", line 77, in init
    _setup_prng()
  File "/home/CrypTen/tutorials/crypten/__init__.py", line 77, in init
    _setup_prng()
  File "/home/CrypTen/tutorials/crypten/__init__.py", line 202, in _setup_prng
    generators[key][device] = torch.Generator(device=device)
  File "/home/CrypTen/tutorials/crypten/__init__.py", line 202, in _setup_prng
    generators[key][device] = torch.Generator(device=device)
RuntimeError: CUDA error: initialization error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
RuntimeError: CUDA error: initialization error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
ERROR:root:One of the parties failed. Check past logs
It seems that the error is caused by initializing CUDA in multiple processes. Please help me with that!
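A possible explanation, consistent with the traceback: the worker processes go through `multiprocessing/process.py`, and on Linux with Python 3.7 the default start method is `fork`. PyTorch's documentation notes that the CUDA runtime does not support the `fork` start method, so a child forked from a parent that has already initialized CUDA can fail in exactly this way. The snippet below is a minimal sketch of a CPU-only workaround, assuming the tutorial does not need a GPU. The helper names `hide_cuda_devices` and `describe_environment` are hypothetical and not part of CrypTen, and whether CrypTen then skips creating CUDA generators has not been verified against its source here.

```python
import multiprocessing
import os


def hide_cuda_devices():
    """Hide all GPUs from this process and from any child it starts.

    Assumption: this runs before torch or crypten is imported, so CUDA
    is never initialized in the parent process.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


def describe_environment():
    """Report the two settings relevant to the fork/CUDA interaction."""
    return {
        # "fork" is the default on Linux for Python 3.7.
        "start_method": multiprocessing.get_start_method(),
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }


hide_cuda_devices()
info = describe_environment()
print(info)

# Only after this point would one import crypten and run the tutorial cell.
```

With the variable set to an empty string, `torch.cuda.is_available()` should return `False` once torch is imported. If a GPU is actually required, the alternative would be to start the workers with the `spawn` start method, which depends on how `@mpc.run_multiprocess` launches its processes.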