Description
Regarding the batch size issue I mentioned earlier (Issue 22), I have conducted further investigation. I found a critical performance discrepancy when increasing the batch size, specifically linked to the use of latent steps:
- When Latent Step = 0: Batch size does not have a significant impact on performance.
- When Latent Step > 0: A noticeable performance decrease is observed as the batch size increases.
Settings
```shell
# MBPP+
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task mbppplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 1
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task mbppplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 20
# HumanEval+
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task humanevalplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 1
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task humanevalplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 20
```
Results
| Task | LatentMAS (latent_steps=10, bs=1, HF) | LatentMAS (latent_steps=10, bs=20, HF) |
|---|---|---|
| MBPP+ | 0.621 | 0.219 |
| HumanEval+ | 0.719 | 0.189 |
Due to resource constraints, I haven't been able to test this on larger models yet. Currently, this issue has been confirmed on the Qwen3-4B model.
I would be grateful if you could look into whether the latent step logic is fully compatible with batched generation. Thank you for your time and for maintaining this great project!
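For what it's worth, one common cause of exactly this symptom (correct at `generate_bs=1`, degraded at larger batch sizes, and only when extra hidden states are read out) is indexing the last hidden state at position `-1` for every sequence in a padded batch, which picks up padding positions for the shorter sequences. This is purely a hypothetical sketch of that failure mode, not the repo's actual code; `last_token_index` and the mask layout are my own illustration:

```python
# Hypothetical illustration of a padding bug that only appears with batch size > 1:
# with right-padding, position -1 is a PAD position for shorter sequences, so any
# latent-step logic that reads hidden_states[:, -1] gets garbage for those rows.

def last_token_index(attention_mask_row):
    """Index of the last real (non-pad) token in one sequence."""
    return sum(attention_mask_row) - 1

# Two prompts of different lengths, right-padded to length 5 (1 = token, 0 = pad).
attention_mask = [
    [1, 1, 1, 1, 1],  # full-length prompt
    [1, 1, 1, 0, 0],  # shorter prompt, padded
]

# Naive batched code reads position -1 for every row:
naive = [len(row) - 1 for row in attention_mask]             # [4, 4] -- row 1 is a pad slot
# Mask-aware indexing finds the true last token per row:
correct = [last_token_index(row) for row in attention_mask]  # [4, 2]

print(naive, correct)
```

If the latent-step readout uses the naive indexing above, every non-longest sequence in the batch would condition its latent steps on padding, which would match the bs=1 vs bs=20 gap in the table.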