STUN

We provide the scripts we used for our experiments.

We also provide our optimized versions of OWL and wanda, which can prune a 480B model even on a single 80GB GPU.

STUN with snowflake-arctic

python utils/merge_weight.py --model_path snowflake-arctic --output_dir=new_arctic_snowflake_0.5_127_26_3570_nodivnorm_greedy_but_reject_mixed_load_max2 --layer_num 35 --gate_template "model.layers.{}.block_sparse_moe.gate.weight" --threshold=0.5 --top_k=127 --division_by_norm=False --merge_method=greedy_but_reject_mixed --snake_case_model_name=arctic --expert_num_key_in_config=num_local_experts --expert_template=model.layers.{}.block_sparse_moe.experts.{}.w1.weight,model.layers.{}.block_sparse_moe.experts.{}.w2.weight,model.layers.{}.block_sparse_moe.experts.{}.w3.weight --router_logits_file=arctic.txt --load_path=arctic.pt --divide_mean --merge_max_clusters=2 --binary_search_target=3570
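
The gate and expert templates in the command above contain {} placeholders. The following is a minimal sketch of how such templates could expand into per-layer, per-expert weight names, assuming they are filled positionally with the layer index first and the expert index second, as Python's str.format does. The helper names gate_key and expert_keys are hypothetical and are not taken from utils/merge_weight.py.

```python
from typing import List

# Template strings copied from the command above.
GATE_TEMPLATE = "model.layers.{}.block_sparse_moe.gate.weight"
EXPERT_TEMPLATES = (
    "model.layers.{}.block_sparse_moe.experts.{}.w1.weight,"
    "model.layers.{}.block_sparse_moe.experts.{}.w2.weight,"
    "model.layers.{}.block_sparse_moe.experts.{}.w3.weight"
).split(",")  # the CLI passes the three templates as one comma-separated value


def gate_key(layer: int) -> str:
    """Hypothetical helper: router (gate) weight name for one layer."""
    return GATE_TEMPLATE.format(layer)


def expert_keys(layer: int, expert: int) -> List[str]:
    """Hypothetical helper: w1/w2/w3 weight names for one expert.

    Assumes the first {} is the layer index and the second is the expert index.
    """
    return [template.format(layer, expert) for template in EXPERT_TEMPLATES]


if __name__ == "__main__":
    print(gate_key(0))
    for key in expert_keys(0, 5):
        print(key)
```

If the actual script fills the placeholders differently, only the two format calls would need to change.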

Then run OWL or wanda (see the OWL and wanda directories) on the resulting model.

