Improve Completion Caching #19

@kfallah

Description

Responses from the training engines include templates and chain-of-thought (CoT) text. OpenClaw does not filter these out for every model, so unwanted text ends up in responses. As a workaround, the inference backends strip CoT and templates. Training, however, needs the exact-token rollouts in order to stay on-policy, so the inference backends also keep a hacky cache that retrieves the unfiltered response from the filtered one. We should replace this hack with a more robust approach.
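For discussion, here is a minimal sketch of the kind of filtered-to-unfiltered lookup described above. It is not the code in the repository: the names `Rollout`, `CompletionCache`, and `strip_cot`, and the `<think>...</think>` delimiter, are all assumptions made for illustration.

```python
import hashlib
import re
from dataclasses import dataclass

# Hypothetical CoT delimiter; real models use different markers.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


@dataclass(frozen=True)
class Rollout:
    raw_text: str      # exactly what the engine generated
    token_ids: tuple   # exact sampled tokens, needed for on-policy training


def strip_cot(raw_text: str) -> str:
    """Remove CoT spans and surrounding whitespace (illustrative filter only)."""
    return THINK_RE.sub("", raw_text).strip()


class CompletionCache:
    """Maps a filtered response back to the unfiltered rollout behind it."""

    def __init__(self):
        self._by_key = {}

    @staticmethod
    def _key(filtered_text: str) -> str:
        return hashlib.sha256(filtered_text.encode("utf-8")).hexdigest()

    def put(self, rollout: Rollout) -> str:
        """Store the rollout and return the filtered text sent to the client."""
        filtered = strip_cot(rollout.raw_text)
        self._by_key[self._key(filtered)] = rollout
        return filtered

    def get(self, filtered_text: str):
        """Return the stored rollout, or None if the text does not match."""
        return self._by_key.get(self._key(filtered_text))


cache = CompletionCache()
first = Rollout("<think>2+2</think> 4", (1, 2, 3))
filtered = cache.put(first)
recovered = cache.get(filtered)
```

The sketch also shows why this style of lookup is fragile: two different rollouts that filter to the same text overwrite each other, and any change the client makes to the filtered text (even whitespace) causes a miss.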
