
fix: better "generate more" functionality #891

Open
codedealer wants to merge 6 commits into agnaistic:dev from codedealer:fix/generate-more

Conversation

@codedealer (Contributor) commented Apr 7, 2024

Reasoning:
Empirically, continuation of already-generated text is more consistent when the latest message is supplied as-is. With that in mind, this PR removes {{post}} from the template when the request is passed with kind continue.

Update
Our goal here is not to manipulate {{post}} per se but to cut the prompt as close to the last response in history as possible to let the model continue the generation.

New algorithm:

  1. Manipulate the AST so that the history is the last node (if present)
  2. When rendering complex nodes (conditionals, iterables), cut strictly after the message has been rendered
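Step 1 could look roughly like this. This is a minimal sketch; the PNode shape and the function name cutAfterHistory are illustrative stand-ins, not the parser's real types:

```typescript
// Hypothetical sketch of step 1; the node shape is an illustrative
// stand-in for the template parser's real node types.
type PNode = { kind: 'history' } | { kind: 'text'; value: string }

// Cut the AST so that the history node (if present) becomes the last node.
// Anything after it (ujb, post-history modifiers, etc.) is dropped.
function cutAfterHistory(ast: PNode[]): PNode[] {
  const idx = ast.findIndex((node) => node.kind === 'history')
  return idx === -1 ? ast : ast.slice(0, idx + 1)
}
```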

That means that things like the ujb and whatever was written after history in validated/unvalidated presets (like modifiers) will be erased, which can affect generation results. However, I have tested with different prompt formats (albeit with sub-1 temperatures) and this approach beats the state of the art every time.

Known issues:
Currently, the concatenation of the existing message and the newly generated response will insert either a white space or a new line between the chunks (which extends the existing behavior of always inserting a space). In the case where the existing message chunk cuts off mid-word, this will cause a white space to be inserted in the middle of a word. However, the chances of that are small enough compared to the potential workload of factoring this case in, so we ignore it for now.
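A minimal sketch of the joining behavior described above; joinContinuation is an assumed name for illustration, not from the codebase:

```typescript
// Hypothetical sketch of the chunk-joining behavior: a separator is always
// inserted between the chunks, even when the existing chunk cut off mid-word.
function joinContinuation(existing: string, generated: string): string {
  return `${existing.trimEnd()} ${generated.trimStart()}`
}

joinContinuation('The quick brown', 'fox jumps.') // joins cleanly
joinContinuation('The quick bro', 'wn fox jumps.') // space lands mid-word: "bro wn"
```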

Additional information
Closes: #869
Discord thread

common/prompt.ts (Outdated)

  ) {
    const post = []
-   post.push(`${opts.replyAs.name}:`)
+   opts.kind !== 'continue' && post.push(`${opts.replyAs.name}:`)
Review comment (Member):
Use if here instead of JSX-ish inlining.
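The suggested change, sketched with an explicit if. The opts and post names come from the diff above, stubbed here so the snippet is self-contained:

```typescript
// Stubs standing in for the real values in common/prompt.ts.
const opts = { kind: 'continue', replyAs: { name: 'Bot' } }
const post: string[] = []

// An explicit if instead of the JSX-ish short-circuit expression.
if (opts.kind !== 'continue') {
  post.push(`${opts.replyAs.name}:`)
}
```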

@sceuick (Member) commented Apr 7, 2024

You probably just need to update the snapshots using pnpm snapshot.
I can't see anything obviously wrong with the snapshot failures. It'll be easier to see whether they're okay once you push the updated snapshots.

const ast = parser.parse(template, {}) as PNode[]

// hack: when we continue, remove the post along with the last newline from the tree
if (opts.continue && ast.length > 1) {
Review comment (Member):
In unvalidated prompts the {{post}} placeholder can technically be anywhere.

codedealer (Contributor, Author) replied Apr 8, 2024:

That's a good point! I have reexamined my whole approach to this problem. Our goal here is not to manipulate {{post}} per se but to cut the prompt as close to the last response in history as possible to let the model continue the generation. I have rolled back my change to createPostMessage.

New algorithm:

  1. Manipulate ast so that the history is the last node (if present)
  2. When rendering complex nodes (conditionals, iterables) cut strictly after the message was rendered

That means that things like the ujb and whatever was written after history in validated/unvalidated presets (like modifiers) will be erased, which can affect generation results. However, I have tested with different prompt formats (albeit with sub-1 temperatures) and this approach beats the state of the art every time.

Review comment (Member):

We might be able to avoid modifying the template-parser if we send the correct history/lines from the frontend?

Review comment (Member):

Yeah, realistically the parser shouldn't make any assumptions about the parts (including history) that it is passed, so it shouldn't be modifying them.

codedealer (Contributor, Author) replied:

I don't think it's possible. It works without any modification to the template if {{history}} is used: then we can just cut the tree and that is enough. The complexity arises from the use of {{each msg}}, because it allows users to customize how each individual line in the history is rendered.

So the line we pass from the frontend may be perfectly fine (e.g. Hello world), but upon render it will turn into Hello world\n\n. There could be any symbols at the end, not just new lines, and after rendering it is impossible to determine whether they came from the template or from the message. This is a major cause of interference when it comes to continuing a message.

The only way I found to ensure that the prompt ends on what was sent by the bot is to change how history props are edited in template parser.
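The ambiguity described above can be illustrated like this. renderLine and the template string are illustrative, not the real parser:

```typescript
// Hypothetical illustration: rendering one history line through an
// {{each msg}}-style template that appends trailing newlines.
function renderLine(template: string, msg: { text: string }): string {
  return template.replace('{{msg.text}}', msg.text)
}

const line = renderLine('{{msg.text}}\n\n', { text: 'Hello world' })
// The trailing newlines come from the template, but once rendered they are
// indistinguishable from newlines that were part of the message itself.
```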

if (opts.continue && i === 0 && isHistory) break children_loop

codedealer (Contributor, Author) replied:

I guess an alternative would be to have a completely different "continue" template that is triggered in this particular case?

codedealer added 6 commits May 2, 2024 02:36
Note: this approach doesn't account for the case where the continuation starts mid-word (a white space will be inserted anyway). However, the chance of this happening is statistically low.
@codedealer force-pushed the fix/generate-more branch from ce433fc to e54b72e on May 2, 2024 02:18
@codedealer (Contributor, Author) commented:

I tried another approach that doesn't require editing the template parser, although I think there was nothing wrong with the previous one, and it worked better because it was more granular.

The new method modifies the AST to always put {{history}} in the tree. Although this can interfere with the user's template in a substantial way, it ensures that generation always starts from where it left off.
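The revised approach could be sketched as follows. PNode and ensureHistoryLast are illustrative names under assumed node shapes, not the codebase's real API:

```typescript
// Hypothetical sketch: guarantee that a {{history}} node is present and is
// the final node before rendering a "continue" request.
type PNode = { kind: 'history' } | { kind: 'text'; value: string }

function ensureHistoryLast(ast: PNode[]): PNode[] {
  const withoutHistory = ast.filter((node) => node.kind !== 'history')
  // Re-insert (or add) history as the last node so the prompt ends exactly
  // where the last response left off.
  return [...withoutHistory, { kind: 'history' }]
}
```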

Sentence concatenation also changed a bit, with more reliance on the LLM to compose it. Previously there was a discrepancy between the message on the client and the prompt: the prompt was always trimmed. Now the client message is also trimmed, so the LLM and the user see the same thing.

@codedealer codedealer requested a review from sceuick May 2, 2024 02:25

Labels: None yet
Projects: None yet
Development: Successfully merging this pull request may close: The "Generate more" button does not work properly
2 participants