Description
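I ran the training command for the VATEX part and got the error below. After searching online, I found that manually assigning a value to os.environ['RANK'] gets past this error, but a later one then appears: KeyError on os.environ['WORLD_SIZE']. I suspect the underlying problem is not that simple, and I'm stuck. Could anyone advise? Getting the program to run at all is the first step. Thanks.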
File "src/tasks/run_caption_VidSwinBert.py", line 689, in <module>
main(args)
File "src/tasks/run_caption_VidSwinBert.py", line 675, in main
args, vl_transformer, optimizer, scheduler = mixed_precision_init(args, vl_transformer)
File "src/tasks/run_caption_VidSwinBert.py", line 105, in mixed_precision_init
model, optimizer, _, _ = deepspeed.initialize(
File "/home/bwang/anaconda3/envs/qysu_vc/lib/python3.8/site-packages/deepspeed/__init__.py", line 129, in initialize
dist.init_distributed(dist_backend=dist_backend, dist_init_required=dist_init_required)
File "/home/bwang/anaconda3/envs/qysu_vc/lib/python3.8/site-packages/deepspeed/comm/comm.py", line 592, in init_distributed
init_deepspeed_backend(get_accelerator().communication_backend_name(), timeout, init_method)
File "/home/bwang/anaconda3/envs/qysu_vc/lib/python3.8/site-packages/deepspeed/comm/comm.py", line 148, in init_deepspeed_backend
rank = int(os.environ["RANK"])
File "/home/bwang/anaconda3/envs/qysu_vc/lib/python3.8/os.py", line 675, in __getitem__
raise KeyError(key) from None
KeyError: 'RANK'
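For reference, the manual workaround mentioned above can be sketched as follows. This is only a minimal sketch, assuming a single-process, single-GPU run on one machine. The helper name set_single_process_dist_env and the port 29500 are hypothetical choices for illustration, not part of this repository or of DeepSpeed. RANK and WORLD_SIZE are the variables named in the errors; LOCAL_RANK, MASTER_ADDR and MASTER_PORT are added on the assumption that the environment-variable initialization will ask for them next.

```python
import os


def set_single_process_dist_env(port="29500"):
    """Fill in distributed-launch variables for one process on one machine.

    Hypothetical helper for illustration only. Values already present in the
    environment (for example, set by a launcher) are left untouched.
    """
    defaults = {
        "RANK": "0",            # global rank of this process
        "LOCAL_RANK": "0",      # rank of this process on this machine
        "WORLD_SIZE": "1",      # total number of processes
        "MASTER_ADDR": "127.0.0.1",
        "MASTER_PORT": str(port),  # assumed to be a free port
    }
    for key, value in defaults.items():
        os.environ.setdefault(key, value)
    # Return the values actually in effect, for logging or debugging.
    return {key: os.environ[key] for key in defaults}


# Call this before deepspeed.initialize(...) is reached.
env = set_single_process_dist_env()
print(env)
```

These variables are normally set by a distributed launcher (for example torchrun or the deepspeed command), so starting the script through such a launcher, if the repository's instructions allow it, is probably a cleaner fix than setting them by hand.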