Running locally fails: does this code have to run on the High-Flyer (幻方) cluster? #27

@jialiangZ

$ python train_fourcastnet.py --pretrain-epochs 10 --fintune-epochs 4 --batch-size 1

Error (the log line 非集群环境 means "not a cluster environment"):

非集群环境
非集群环境
Traceback (most recent call last):
  File "train_fourcastnet.py", line 207, in <module>
    hfai.multiprocessing.spawn(main, args=(
  File "/home/pineapple/mambaforge/envs/OpenCast/lib/python3.8/site-packages/hfai/multiprocessing/spawn.py", line 66, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn', bind_numa=bind_numa)
  File "/home/pineapple/mambaforge/envs/OpenCast/lib/python3.8/site-packages/hfai/multiprocessing/spawn.py", line 37, in start_processes
    while not context.join():
  File "/home/pineapple/mambaforge/envs/OpenCast/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 130, in join
    raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with signal SIGSEGV
