- The bitahub environment cannot run app.py; use the sample scripts for inference instead.
- NSA sparse attention requires torch 2.9 + CUDA 12.6, which may fail to install; resolved by using torch 2.7.1.
- OSA sparse attention (OpenAI) needs hand-written Triton or CUDA kernels to actually get a speedup. TODO
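For reference, the effect OSA-style sparsity aims for can be sketched densely in NumPy: per query, keep only the top-k attention scores and softmax over those. This is an illustrative sketch of mine, not the OSA kernel; a dense mask like this computes all scores anyway and gives no speedup, which is exactly why a fused Triton/CUDA kernel is needed.

```python
import numpy as np

def topk_sparse_attention(q, k, v, topk=2):
    """Dense reference for top-k sparse attention: per query row,
    keep only the topk largest scores, mask the rest to -inf,
    then softmax and mix values. Ties may keep a few extra entries."""
    scores = q @ k.T / np.sqrt(q.shape[-1])            # (Tq, Tk)
    # threshold = the topk-th largest score in each row
    kth = np.sort(scores, axis=-1)[:, -topk][:, None]
    scores = np.where(scores >= kth, scores, -np.inf)  # drop non-top-k
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)) for _ in range(3))
out = topk_sparse_attention(q, k, v, topk=2)
print(out.shape)  # (4, 8)
```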
git config --global user.email "shuaic@mail.ustc.edu.cn"
git config --global user.name "Chen Shuai"
git config --global --unset http.proxy
git config --global --unset https.proxy
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
source ./venv/bin/activate
GPT prefill time: 0.849 s
GPT decode time: 14.263 s
Full sampling takes about 15.23 s; the VQ decoder takes about 0.77 s.
Naive inference: 4.37 s/prompt
COCO2014 eval:
FID (256px): 15.138
CLIP score: 0.3203
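As I understand it, the CLIP score above is the mean cosine similarity between CLIP image and text embeddings of each generated-image/caption pair. A minimal sketch with made-up embeddings (the real metric uses actual CLIP encoder outputs):

```python
import numpy as np

def clip_score(img_emb, txt_emb):
    """Mean cosine similarity over paired rows of image/text embeddings."""
    i = img_emb / np.linalg.norm(img_emb, axis=-1, keepdims=True)
    t = txt_emb / np.linalg.norm(txt_emb, axis=-1, keepdims=True)
    return float((i * t).sum(axis=-1).mean())

# toy check: identical embeddings give a score of 1.0
e = np.random.default_rng(0).standard_normal((5, 512))
print(clip_score(e, e))  # ~1.0
```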
PYTHONPATH=. python3 autoregressive/sample/sample_t2i.py \
    --vq-ckpt ./pretrained_models/vq_ds16_t2i.pt \
    --gpt-ckpt ./pretrained_models/t2i_XL_stage1_256.pt \
    --gpt-model GPT-XL \
    --image-size 256
PYTHONPATH=. python3 autoregressive/sample/sample_c2i.py \
    --gpt-model ddGPT-L \
    --gpt-ckpt ./pretrained_models/ddllamagen-L.pt \
    --vq-ckpt ./pretrained_models/vq_ds16_c2i.pt \
    --image-size 256 \
    --precision fp16
PYTHONPATH=. python3 autoregressive/train/train_t2i.py \
    --data-path /data/ChenShuai/coco2014/annotations \
    --t5-feat-path /data/ChenShuai/coco2014/t5 \
    --vq-ckpt ./pretrained_models/vq_ds16_t2i.pt \
    --results-dir /output/t2i_XL_512 \
    --global-batch-size 2 \
    --dataset t2i \
    --image-size 256 \
    --mixed-precision fp16 \
    --gpt-model GPT-B \
    --debug \
    --no-compile
(alternative model: --gpt-model Flash-GPT-B)
bash ./scripts/autoregressive/train_t2i.sh
PYTHONPATH=. python3 language/extract_t5_feature.py \
    --data-path /data/ChenShuai/coco2014/annotations \
    --t5-path /data/ChenShuai/coco2014/t5 \
    --data-start 0 --data-end 5000 \
    --t5-model-path ./pretrained_models/t5-ckpt
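Since extract_t5_feature.py takes --data-start/--data-end, the dataset can be sharded into fixed-size chunks and extracted in separate runs. A small helper of my own (not part of the repo) to print the ranges:

```python
def chunk_ranges(total, chunk=5000):
    """Yield (start, end) pairs covering [0, total) in chunk-sized
    slices, matching the --data-start/--data-end convention above."""
    for start in range(0, total, chunk):
        yield start, min(start + chunk, total)

for s, e in chunk_ranges(12000):
    print(f"--data-start {s} --data-end {e}")
# --data-start 0 --data-end 5000
# --data-start 5000 --data-end 10000
# --data-start 10000 --data-end 12000
```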
PYTHONPATH=. bash ./scripts/autoregressive/sample_t2i_coco.sh
PYTHONPATH=. python3 ./evaluations/t2i/evaluation.py \
    --fake_dir /data/ChenShuai/coco2014/val/GPT-XL-t2i_XL_stage1_256-coco_captions-size-256-size-256-VQ-16-topk-1000-topp-1.0-temperature-1.0-cfg-7.5-seed-0 \
    --ref_dir /data/ChenShuai/coco2014/val \
    --ref_data coco2014 \
    --ref_type val
PYTHONPATH=. python3 app.py \
    --gpt-model GPT-B \
    --gpt-type c2i \
    --precision fp16