Question about CLS/SEP usage: paper description vs code implementation #2

@SuFame920

Description

Hello authors, thanks for releasing the code👍. I’m trying to reproduce the text encoding described in the paper and noticed a possible discrepancy. Could you please clarify?
In the paper (Eq. 1–3), each utterance appears to be preceded by its own [CLS], the whole dialogue ends with a single [SEP], and the i-th utterance representation is the corresponding h_cls,i.
However, in the released code:
1. loader.py pack(...) starts each segment with a single [CLS] and then appends a [SEP] after every utterance.
2. model.py merge_input(...) takes, for each utterance, the last token before its [SEP] and adds the segment-level [CLS] vector, instead of using a per-utterance [CLS].
So the implementation seems to be “segment [CLS] + utterance-ending token”, not “per-utterance [CLS]”.
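To make the discrepancy concrete, here is a minimal sketch of the two packing schemes as I understand them. The function names and the plain token lists are mine for illustration only; the actual loader.py/model.py logic may differ in details:

```python
def pack_as_in_code(utterances):
    """Released code (loader.py pack): one segment-level [CLS],
    then each utterance followed by its own [SEP]."""
    tokens = ["[CLS]"]
    for utt in utterances:
        tokens.extend(utt)
        tokens.append("[SEP]")
    return tokens

def pack_as_in_paper(utterances):
    """Paper (Eq. 1-3), as I read it: a [CLS] before every utterance,
    and a single [SEP] closing the whole dialogue."""
    tokens = []
    for utt in utterances:
        tokens.append("[CLS]")
        tokens.extend(utt)
    tokens.append("[SEP]")
    return tokens

dialogue = [["hi"], ["how", "are", "you"]]
print(pack_as_in_code(dialogue))
# ['[CLS]', 'hi', '[SEP]', 'how', 'are', 'you', '[SEP]']
print(pack_as_in_paper(dialogue))
# ['[CLS]', 'hi', '[CLS]', 'how', 'are', 'you', '[SEP]']
```

Under the first scheme there is no per-utterance [CLS] position to read h_cls,i from, which is why the code falls back to the utterance-ending token plus the segment [CLS].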
Could you confirm which is the intended behavior? If the code is correct, should the paper description be interpreted differently? If the paper description is intended, would you recommend adjusting the code to add [CLS] per utterance?
Thanks again for your work!
Best wishes,
SuFame
