
fix: optim_string_condense #20

Open

pipiPdesu wants to merge 1 commit into GraySwanAI:main from pipiPdesu:fix_concatenating_letters

Conversation

@pipiPdesu
Contributor

Referring to issue #15, problem 2, I address it by interspersing spaces between the letters and systematically removing the characters '!', '?', and '.' from INIT_CHAR.

It should work well with the Llama3 and InternLM2 tokenizers, which previously encountered issues (I have not yet tested other models). I suspect that under these circumstances, the filter_ids parameter might be redundant. Even if we successfully optimize a suffix, it seems infeasible to identify a string that tokenizes back to the suffix's tokens (unless we directly pass input_embeds). It might be beneficial to enhance the warning messages to guide users on reporting model incompatibilities, or to suggest modifications to INIT_CHAR to resolve tokenization problems.
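For concreteness, here is a minimal sketch of the idea (the character pool, helper name, and model name are placeholders, not this PR's actual code):

```python
from transformers import AutoTokenizer

# Hypothetical character pool with '!', '?', and '.' excluded, since some tokenizers
# merge them into multi-character tokens such as "?!".
INIT_CHARS = [c for c in ["x", "y", "@", "&", "%"] if c not in ("!", "?", ".")]

def build_init_str(num_tokens: int) -> str:
    # Intersperse spaces so adjacent characters are less likely to merge into one token.
    return " ".join(INIT_CHARS[i % len(INIT_CHARS)] for i in range(num_tokens))

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in for Llama3 / InternLM2
init = build_init_str(20)
print(len(tokenizer(init, add_special_tokens=False).input_ids))  # ideally 20
```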

Let's discuss it here and figure it out :)

Collaborator

@justinwangx left a comment

there are a few things i'm worried about with this approach, which are:

  • it's not a guarantee that each string in init_str will tokenize to the same length
  • even if all the strings in init_str tokenize to the same length, i don't think it's necessarily the case that this length will be the intended sequence length

in the prior approach, indexing into the ids after tokenization gives us this guarantee. i think it would be rare for a space-separated sequence of n characters to tokenize to m != n tokens, but i'm not sure that this won't ever occur with some tokenizers.
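To spell out the difference, here is a hedged sketch (the function names and stand-in tokenizer are illustrative, not the repo's API): the prior approach slices the first n ids after tokenization, which pins the length, while the space-separated approach only yields n ids if nothing merges across the spaces.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer

def init_ids_by_slicing(tokenizer, char: str, n: int) -> list[int]:
    # Prior approach: tokenize a long run of the character and keep the first n ids,
    # which guarantees exactly n ids no matter how the characters merge.
    ids = tokenizer(char * (20 * n), add_special_tokens=False).input_ids
    assert len(ids) >= n, "character run too short for this tokenizer"
    return ids[:n]

def init_ids_by_spacing(tokenizer, char: str, n: int) -> list[int]:
    # New approach: n space-separated characters, which yields n ids only if the
    # tokenizer never merges across (or splits off) the spaces.
    return tokenizer(" ".join([char] * n), add_special_tokens=False).input_ids

print(len(init_ids_by_slicing(tok, "!", 20)), len(init_ids_by_spacing(tok, "x", 20)))
```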

does using the prior approach, but with '!', '?', and '.' removed, still fix our problem?

re: identifying strings that are tokenized as the proper suffix: if filter_ids=True, then this should be guaranteed, but, yeah, if not, then the string could be tokenized differently

@justinwangx
Collaborator

i do think it would be good courtesy, if filter_ids=False, to check whether the best string at the end of the run actually tokenizes to the best optimized ids (and report to the user if this isn't the case) -- what do you think of that?
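Something like the following sketch could implement that check (the function name and warning wording are hypothetical, not existing nanoGCG code):

```python
import warnings

def check_best_string_roundtrip(tokenizer, best_str: str, best_ids: list[int]) -> None:
    # After the run, re-tokenize the best string and compare with the optimized ids.
    retokenized = tokenizer(best_str, add_special_tokens=False).input_ids
    if retokenized != best_ids:
        warnings.warn(
            "Best optim_str does not re-tokenize to the best optimized ids; the "
            "reported loss may not match what the model sees for this string. "
            "Consider setting filter_ids=True or using a different optim_str_init."
        )
```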

@pipiPdesu
Contributor Author

Hello, I understand your concern. I have kept the original method and tried to add spaces before and after the letters in INIT_CHAR. This works for some tokenizers (essentially changing "!" to " !", making it a new token), but for others the space becomes an additional token. So perhaps we can add a note to the existing warning ("Consider setting `filter_ids=False` or trying a different `optim_str_init`") suggesting that users can solve the problem by adding spaces or deleting some letters. Also, we will recheck the best string when filter_ids=False.
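The per-tokenizer behaviour is easy to inspect directly; a quick illustration (gpt2 is only a stand-in, the tokenizers in question are Llama3/InternLM2):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in; behaviour differs per tokenizer
print(tok("!", add_special_tokens=False).input_ids)   # '!' on its own
print(tok(" !", add_special_tokens=False).input_ids)  # ' !' may be one new token or space + '!'
```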

@justinwangx
Collaborator

sorry, i mean -- it seems like the characters that get condensed are '!', '?', and '.' (with emphasis on '!'). if we simply remove these from being used in the initialization (without adding any spaces or thinking about spaces), do the strings generally work fine without getting condensed?

@pipiPdesu
Contributor Author

The characters that get condensed are not only '!', '?', and '.'; it depends on the tokenizer. For example, as long as the tokenizer encodes "!" and " !" as two separate tokens and the vocabulary also contains other tokens containing "!", such as "?!", this issue will occur. We need to find a better generation method based on the tokenizer.
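A quick way to see the condensation (gpt2 is used here only as a stand-in; the exact merges depend on the vocabulary):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer
sep_ids = tok("?", add_special_tokens=False).input_ids + tok("!", add_special_tokens=False).input_ids
merged_ids = tok(tok.decode(sep_ids), add_special_tokens=False).input_ids
# '?' + '!' decodes to "?!", which may re-tokenize as a single merged token,
# so a two-token optimized sequence can condense into one token when round-tripped.
print(sep_ids, merged_ids)
```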

Moreover, this characteristic of the tokenizer, combined with the implementation of NanoGCG, can cause more serious problems. The issue arises because we embed the input as four separate parts rather than embedding the whole sentence together, so the embeddings at the junctions of each part can differ from the embeddings of the entire sentence. Here are two examples that may cause problems:

  1. Taking the llama2 tokenizer as an example, at the junction of after_str and target, consider the embedding results for [/INST], [/INST]Sure, [/INST] Sure, Sure, and [space]Sure:
    (screenshot of the embedding results omitted)

Although we provide the add_space_before_target parameter, the embedding results still differ between embedding the parts separately and embedding the whole sentence together, whether or not a space is added.

This issue varies depending on the tokenizer. Using internlm as an example, the ideal situation is like this:

(screenshot of the internlm tokenization omitted)

  2. Taking the internlm tokenizer as an example, at the junction of before_str and optim_str, optim_str may collapse together with before_str. Suppose before_str ends with '?' and optim_str starts with '!'; they will collapse together, just like during buffer initialization.

These issues, like the buffer problem, can be summarized as inconsistencies between direct embeddings and separate embeddings. I am exploring whether this issue affects the overall algorithm.
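A small check along these lines makes the inconsistency easy to reproduce (the helper is hypothetical, not nanoGCG code, and gpt2 only stands in for the llama2/internlm tokenizers): tokenize the four parts separately, concatenate the ids, and compare against tokenizing the full prompt at once.

```python
from transformers import AutoTokenizer

def piecewise_vs_whole(tokenizer, parts: list[str]) -> tuple[list[int], list[int]]:
    # ids from tokenizing each part separately (as the four-part embedding effectively
    # does) versus ids from tokenizing the concatenated prompt in one pass.
    piecewise = [i for p in parts for i in tokenizer(p, add_special_tokens=False).input_ids]
    whole = tokenizer("".join(parts), add_special_tokens=False).input_ids
    return piecewise, whole

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer
pw, wh = piecewise_vs_whole(tok, ["[INST] hi ", "? !", " [/INST]", "Sure,"])
print(pw == wh, len(pw), len(wh))  # any mismatch means junction tokens merged or split
```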

It might be a bit confusing to discuss here (◎_◎;). Can we open a related issue? If you have any questions or want to see more examples, feel free to discuss with me!

@justinwangx
Collaborator

This is indeed an issue... if this occurs frequently, then I think the best fix would be a refactor to use the slices approach that the original repo uses. We adapted our current approach from HarmBench, which means the HarmBench implementation would also suffer from the same problem.
