Why is rewards[:-1] used instead of rewards[1:]? #1

@M-Heidari2000

While reading this code, I noticed that line 177 of train.py uses rewards[:-1] rather than rewards[1:] when computing reward_loss. Is that a bug? If not, could you please explain why rewards[:-1] is correct?
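
To make the question concrete, here is a minimal sketch of what the two slices select. It uses a hypothetical toy list, not the actual tensors in train.py, and the names `T`, `dropped_last`, and `dropped_first` are made up for illustration. Both slices have length T-1; they differ only in whether prediction i is compared against rewards[i] or rewards[i+1]. Which one is right would presumably depend on whether rewards[t] is stored alongside the action taken at step t or alongside the observation that results from it.

```python
# Hypothetical toy sequence, not taken from train.py.
T = 5
rewards = [float(t) for t in range(T)]  # rewards[t] == t, so indices are easy to read

dropped_last = rewards[:-1]   # keeps indices 0 .. T-2
dropped_first = rewards[1:]   # keeps indices 1 .. T-1

# Pair each of the T-1 prediction slots with the target each slice would give it.
for i, (a, b) in enumerate(zip(dropped_last, dropped_first)):
    print(f"prediction {i}: rewards[:-1] -> {a}, rewards[1:] -> {b}")
```

The same slicing semantics apply along the first (time) axis of a tensor, so the one-step shift shown here is the whole difference between the two choices.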

Thank you
