fix: add add_generation_prompt parameter to ChatCompletionRequest #64
+7
−0
Fix: Add `add_generation_prompt` parameter to `ChatCompletionRequest`

Problem Description

When sending requests through the router with `continue_final_message: true` set but without explicitly setting `add_generation_prompt`, a parameter-conflict error occurs.

Root Cause
The `ChatCompletionRequest` struct in the router was missing the `add_generation_prompt` parameter. When this parameter was not included in the user's request, the router would not forward it to the backend, causing the backend to use its default value of `true`. If `continue_final_message: true` was also set, these two parameters would conflict.

This issue does not occur when sending requests directly to the backend model, as all parameters can be controlled directly.
Solution

Added the `add_generation_prompt` field to the `ChatCompletionRequest` struct with a default value of `true` (aligned with vLLM). This ensures:

- Users can explicitly control the `add_generation_prompt` parameter through the router
- When set to `add_generation_prompt: false`, it can be used compatibly with `continue_final_message: true`

Changes
- Added the `add_generation_prompt: bool` field to the `ChatCompletionRequest` struct
- Defaults to `true` using `#[serde(default = "default_true")]` to align with vLLM behavior

Testing
Verified the fix using a curl request sent through the router. The request now works correctly without parameter conflict errors.
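A request of roughly the following shape exercises the fix. The router URL, port, model name, and message contents here are placeholders, not values from the PR:

```shell
# Hypothetical verification request: explicitly disable add_generation_prompt
# so it can be combined with continue_final_message.
curl -s http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {"role": "user", "content": "Write a haiku about rust."},
      {"role": "assistant", "content": "Metal sleeps in rain,"}
    ],
    "continue_final_message": true,
    "add_generation_prompt": false
  }'
```

With the fix, the router forwards `add_generation_prompt: false` to the backend, which then continues the final assistant message instead of rejecting the request for conflicting parameters.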