[QUESTION] GPT-OSS example configs: should --window-size be 127,0 to match sliding_window=128? #3690

@returnL


Question
In the GPT-OSS example configs under examples/post_training/modelopt/conf/openai/, the scripts currently use --window-size 128,0.

Under the common window_size=(left, right) semantics for causal attention (right=0), each query sees left + 1 tokens: the left preceding tokens plus the current one. GPT-OSS uses sliding_window = 128 tokens, counted including the current token, so the matching setting appears to be --window-size 127,0 rather than 128,0, which would expose 129 tokens.
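To make the off-by-one concrete, here is a minimal sketch of the counting argument. It assumes the inclusive convention described above (a query at position i attends to keys in [i - left, i + right]); the helper `visible_keys` is hypothetical and written only for illustration, not taken from Megatron-LM or any attention library. Whether the example scripts actually follow this convention is exactly what this issue asks.

```python
def visible_keys(query_pos, seq_len, window_size):
    """Return the key positions a query can attend to.

    Assumed convention (not confirmed for these scripts): the window is
    inclusive on both sides, i.e. keys in [query_pos - left, query_pos + right],
    clipped to the valid range [0, seq_len - 1].
    """
    left, right = window_size
    lo = max(0, query_pos - left)
    hi = min(seq_len - 1, query_pos + right)
    return list(range(lo, hi + 1))


seq_len = 512
query_pos = 300  # far enough from the start that the window is not clipped

count_128 = len(visible_keys(query_pos, seq_len, (128, 0)))
count_127 = len(visible_keys(query_pos, seq_len, (127, 0)))

print(count_128)  # 129 tokens: 128 previous + the current one
print(count_127)  # 128 tokens: 127 previous + the current one
```

Under that assumption, only (127, 0) yields a 128-token window that includes the current token.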

I opened a minimal PR updating the two example scripts accordingly:
#2771

Could you confirm whether 127,0 is the intended setting for these GPT-OSS example scripts?
