Fisher-007/SBO-LLMSIM

LLMSIM

Invoking calculon

PYTHONPATH=LLMSIM/calculon/ LLMSIM/calculon/bin/calculon -h
PYTHONPATH=LLMSIM/calculon/ LLMSIM/calculon/bin/calculon llm LLMSIM/calculon/models/megatron-1T.json LLMSIM/calculon/examples/3072_t4_p64_d12_mbs4_full.json LLMSIM/calculon/systems/a100_80g.json -
PYTHONPATH=LLMSIM/calculon/ LLMSIM/calculon/bin/calculon llm-optimal-execution LLMSIM/calculon/models/turing-530B.json 5128 2520 float16 LLMSIM/calculon/systems/a100_80g.json LLMSIM/output/output.json -m
PYTHONPATH=LLMSIM/calculon/ LLMSIM/calculon/bin/calculon llm-all-executions LLMSIM/calculon/models/megatron-1T.json 10256 5040 float16 LLMSIM/calculon/systems/a100_80g.json LLMSIM/output/all-megatron-1T.csv

# Some parameters are fixed
PYTHONPATH=LLMSIM/calculon/ LLMSIM/calculon/bin/calculon llm-optimal-execution LLMSIM/calculon/models/megatron-1T.json 16384 4096 float16 LLMSIM/calculon/systems/a100_80g.json LLMSIM/output/output-fixed.json -m --fixed --execution LLMSIM/calculon/examples/1T_tx_px_dx_mbs1_full.json
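The commands above all share the same prefix, so they can also be assembled from a script. The following is a minimal sketch, not part of this repository: `build_calculon_command` and `run_calculon` are hypothetical helper names, and the only calculon arguments used are the ones already shown in the commands above.

```python
import os
import subprocess


def build_calculon_command(root, subcommand, *args):
    """Return (argv, env) for one calculon invocation under `root`.

    Hypothetical helper: it mirrors the shell commands above by setting
    PYTHONPATH to the calculon directory and calling bin/calculon.
    """
    calculon_dir = f"{root}/calculon"
    argv = [f"{calculon_dir}/bin/calculon", subcommand, *(str(a) for a in args)]
    # Like the shell form, this replaces any PYTHONPATH already set.
    env = {**os.environ, "PYTHONPATH": calculon_dir}
    return argv, env


def run_calculon(root, subcommand, *args):
    """Run calculon and raise CalledProcessError on a non-zero exit code."""
    argv, env = build_calculon_command(root, subcommand, *args)
    return subprocess.run(argv, env=env, check=True)


# Same arguments as the llm-optimal-execution command above.
argv, env = build_calculon_command(
    "LLMSIM", "llm-optimal-execution",
    "LLMSIM/calculon/models/turing-530B.json", 5128, 2520, "float16",
    "LLMSIM/calculon/systems/a100_80g.json", "LLMSIM/output/output.json", "-m",
)
```

Calling `run_calculon` with the same arguments would execute the command; building the argument list separately makes it easy to print or log before running.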

Running llmsim

python LLMSIM/llmsim.py 16384 4096 LLMSIM/calculon/models/megatron-1T.json LLMSIM/calculon/systems/a100_80g.json

TODO

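  • Understand the meaning and valid range of each variable that calculon searches over
  • Complete the SBO-LLMSIM code framework
  • Work through the logic calculon uses to generate parameter sets (it currently sometimes fails to find any valid configuration)
  • Clarify the constraints among the variables, integrate them into the SBO-LLMSIM evaluation function, and test them
  • Fix the unreasonable enumeration logic in calculon and rerun the comparison experiments
  • Determine whether the "Failed" cases can be expressed as constraints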
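As one illustration of a constraint among variables, the example file name `3072_t4_p64_d12_mbs4_full.json` pairs 3072 with `t4`, `p64` and `d12`, and 4 × 64 × 12 = 3072. The sketch below assumes that `t`, `p` and `d` are the tensor, pipeline and data parallelism degrees and that their product must equal the processor count; that reading is an assumption, and `parallelism_splits` is a hypothetical name rather than a calculon function.

```python
def parallelism_splits(num_procs):
    """Yield every (t, p, d) of positive integers with t * p * d == num_procs.

    Assumption: the three degrees must multiply to the processor count.
    """
    for t in range(1, num_procs + 1):
        if num_procs % t:
            continue  # t must divide the processor count
        rest = num_procs // t
        for p in range(1, rest + 1):
            if rest % p == 0:
                yield t, p, rest // p


splits = list(parallelism_splits(3072))
```

Enumerating only the splits that satisfy the product constraint keeps the optimizer from proposing configurations that are invalid by construction.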

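scipy's minimize function currently fails during optimization, raising either TypeError: unsupported operand type(s) for -: 'int' and 'NoneType' or double free or corruption (!prev). It may be necessary to reduce the number of constraints or change their form ==> replace or modify the calculon framework.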
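A `TypeError` involving `NoneType` is what arithmetic on a missing result produces, so one possible mitigation is to turn failed evaluations into a large finite penalty before the optimizer sees them. The sketch below is a guess at that approach, not the repository's code: `with_penalty` and `toy_evaluate` are hypothetical, and `toy_evaluate` only stands in for a calculon-backed evaluation. It does not address the `double free or corruption` crash, which comes from native code.

```python
import math


def with_penalty(evaluate, penalty=1e12):
    """Wrap `evaluate` so the optimizer always receives a finite float."""
    def objective(x):
        try:
            value = evaluate(x)
        except (TypeError, ValueError):
            return penalty  # evaluation itself failed
        if value is None or not math.isfinite(value):
            return penalty  # no valid configuration for this point
        return float(value)
    return objective


def toy_evaluate(x):
    """Hypothetical stand-in: returns None when the point is infeasible."""
    t, p = x
    if t * p > 16:
        return None
    return 100.0 / (t * p)


objective = with_penalty(toy_evaluate)
feasible = objective((2, 4))    # 100 / 8
infeasible = objective((8, 4))  # 32 > 16, so the penalty is returned
```

A constant penalty makes the objective discontinuous, which gradient-based methods handle poorly, so this fits derivative-free search better than the default `minimize` methods.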

About

This project is a preliminary attempt to use SBO to speed up the search for the optimal configuration of large models. It is no longer maintained and can be replaced by optuna, which has proven effective.
