attack: ported bidirectional fine-tuning attack#15

Draft
psyonp wants to merge 34 commits into main from psyonp/bidirectional_anchoring

Conversation

psyonp (Collaborator) commented Aug 12, 2025

Changes

Summarize the changes in this PR and describe the context or motivation for
them.

Add a title, prepending the tag [attack], [defense], [evaluation], or [infra] if
appropriate.

Testing

Describe how you tested the changes in this PR. E.g., added tests, or ran
command foo and checked the results looked good.

psyonp added the "attack" label (Adds or modifies attacks) on Aug 12, 2025
psyonp changed the title from "attack: ported bidirectional fine-tuning attack (WIP)" to "attack: ported bidirectional fine-tuning attack" on Aug 12, 2025
Labels

attack Adds or modifies attacks

2 participants