Skip to content

Issue with Judge system prompt, GPT4 refusing to output ratings #17

@NamburiSrinath

Description

@NamburiSrinath

Hi @patrickrchao and @eltociear,

Wonderful repo, thanks a lot!

I am wondering if the Judge System prompt for GPT is actually correct i.e Section E in the paper and/or code - https://github.com/patrickrchao/JailbreakingLLMs/blob/main/system_prompts.py#L50

The judge should have the goal/objective and response to do the rating. But am I missing something here?

P.S: I changed the prompt a bit but GPT4 is refusing to provide ratings. I marked this issue in JailBreakBench as well (JailbreakBench/jailbreakbench#34)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions