Performance degradation when Batch Size > 1 with Latent Steps #25

@wonjun-chung

Description

Following up on the batch size issue I mentioned earlier (issue #22), I have investigated further and found a critical performance discrepancy when increasing the batch size, specifically linked to the use of latent steps:

  1. When `latent_steps = 0`: batch size has no significant impact on performance.
  2. When `latent_steps > 0`: performance drops noticeably as the batch size increases.

Settings

# MBPP+ 
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task mbppplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10  --generate_bs 1 
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task mbppplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 20

# HumanEval+
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task humanevalplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10  --generate_bs 1 
python run.py --method latent_mas --model_name Qwen/Qwen3-4B --task humanevalplus --max_samples -1 --prompt sequential --max_new_tokens 4096 --latent_steps 10 --generate_bs 20

Results

|            | LatentMAS (latent_steps=10, bs=1, HF) | LatentMAS (latent_steps=10, bs=20, HF) |
|------------|---------------------------------------|----------------------------------------|
| MBPP+      | 0.621                                 | 0.219                                  |
| HumanEval+ | 0.719                                 | 0.189                                  |

Due to resource constraints, I haven't been able to test this on larger models yet; so far the issue has been confirmed on Qwen3-4B.
I would be grateful if you could look into whether the latent step logic is fully compatible with batched generation. Thank you for your time and for maintaining this great project!
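For what it's worth, one common cause of exactly this symptom (correct at bs=1, degraded at bs>1) is how the hidden state fed back during latent steps is selected once sequences are padded to a common length in a batch. The sketch below is purely illustrative and not taken from this repository; the helper `last_token_hidden` and the right-padding scenario are assumptions on my part, meant only to show how naive `[:, -1]` indexing can silently pick up padding positions for the shorter sequences in a batch.

```python
import torch

# Hypothetical helper (not from this repo): with right-padding,
# hidden_states[:, -1] points at pad positions for the shorter sequences
# in a batch, so a latent step that feeds that vector back would operate
# on padding instead of the real last token.
def last_token_hidden(hidden_states: torch.Tensor,
                      attention_mask: torch.Tensor) -> torch.Tensor:
    """Select the hidden state of the last non-padded token per sequence.

    hidden_states:  (batch, seq_len, hidden_dim)
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    last_idx = attention_mask.sum(dim=1) - 1          # index of last real token, (batch,)
    batch_idx = torch.arange(hidden_states.size(0))
    return hidden_states[batch_idx, last_idx]         # (batch, hidden_dim)


if __name__ == "__main__":
    torch.manual_seed(0)
    hidden = torch.randn(2, 5, 8)                     # batch of 2, seq_len 5
    mask = torch.tensor([[1, 1, 1, 1, 1],             # full-length sequence
                         [1, 1, 1, 0, 0]])            # right-padded sequence
    naive = hidden[:, -1]                             # grabs padding for row 1
    correct = last_token_hidden(hidden, mask)
    print(torch.allclose(naive[0], correct[0]))       # True  (unpadded row agrees)
    print(torch.allclose(naive[1], correct[1]))       # False (padded row differs)
```

If the actual implementation already indexes via the attention mask, the culprit might instead be the padding side, position ids, or KV-cache handling during the latent rollout; I'm only guessing from the symptom.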
