Add Support for Toggling Guidance #1931
@conleypri please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.
Thank you for your contribution! You can still dynamically control when and how guidance is applied through the Lark grammar that you construct.

For example, Phi-4 mini reasoning has a Chain-of-Thought (CoT) process in which it reasons before it begins to return output. You can write the grammar so that the model thinks before returning a tool call via JSON output: the CoT tokens are produced first, and the JSON schema is applied to constrain the output only after that.

You can also write multiple alternatives to allow more flexibility in the generated output, for example so that the output after CoT can be either text or a tool call.

As long as you construct enough options in your grammar, you should be able to cover all of your cases without needing to toggle guidance on or off. Therefore, the changes in this PR should not be necessary. You can find more information about the available features in a Lark grammar here.
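As a rough sketch of the grammar-based approach described in this comment, the Python snippet below assembles a Lark-style grammar string that lets reasoning text come first and constrains only the final answer. Everything in it is an assumption for illustration: the `<think>` ... `</think>` delimiters, the hypothetical `TOOL_CALL_SCHEMA`, the helper name `build_grammar`, and the inline `%json` schema syntax, which should be checked against the grammar documentation for the guidance backend in use. The reasoning regex is deliberately simplified and does not allow a `<` character inside the reasoning text.

```python
import json

# Hypothetical tool-call schema, for illustration only.
TOOL_CALL_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "arguments": {"type": "object"},
    },
    "required": ["name", "arguments"],
}


def build_grammar(schema: dict, allow_text: bool = False) -> str:
    """Build a Lark-style grammar: reasoning first, then a constrained answer."""
    lines = [
        "start: thinking answer",
        # Assumption: reasoning is delimited by literal <think> ... </think>
        # and, to keep the regex simple, contains no '<' character.
        r'thinking: "<think>" /[^<]*/ "</think>"',
    ]
    if allow_text:
        # The answer may be free text (not starting with '{') or a tool call.
        lines.append("answer: TEXT | tool_call")
        lines.append(r"TEXT: /[^{](.|\n)*/")
    else:
        # The answer must be a tool call that matches the JSON schema.
        lines.append("answer: tool_call")
    # Assumption: the grammar dialect accepts an inline `%json` schema.
    lines.append("tool_call: %json " + json.dumps(schema))
    return "\n".join(lines)


grammar = build_grammar(TOOL_CALL_SCHEMA, allow_text=True)
print(grammar)
```

With `allow_text=False` the grammar covers the "think, then tool call" case; with `allow_text=True` it covers the "think, then text or tool call" case. The resulting string would then be passed to the generator as its guidance grammar at initialization.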
Add Support for Toggling Guidance
Issue
When JSON guidance constraints are enabled, features like Chain of Thought and Tool Calling are effectively disabled, because the model is forced to output only valid JSON at all times.

Note: The Generator object is immutable, and guidance constraints (such as JSON output enforcement) must be set during generator initialization.
Resolution
This PR introduces the ability to toggle guidance constraints on and off. This allows guidance constraints to be set at generator initialization, but only enforced when needed (e.g., at the Final channel). This enables workflows where Chain of Thought and Tool Calling can proceed unconstrained, and strict JSON output is enforced only at the appropriate stage.
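The idea can be illustrated with a toy sketch in Python. This is not the API added by this PR: the class `ToggleableGuidance`, its `enabled` flag, and the `apply` method are hypothetical names, and the "constraint" is reduced to a simple allow-list over token ids. It only shows the pattern of configuring a constraint up front and enforcing it later.

```python
from typing import Callable, List


class ToggleableGuidance:
    """Toy stand-in for a constraint that is set once and enforced on demand."""

    def __init__(self, is_allowed: Callable[[int], bool]) -> None:
        self._is_allowed = is_allowed
        self.enabled = False  # configured at construction, not yet enforced

    def apply(self, logits: List[float]) -> List[float]:
        """Return logits unchanged when disabled, masked when enabled."""
        if not self.enabled:
            return list(logits)
        return [
            score if self._is_allowed(token_id) else float("-inf")
            for token_id, score in enumerate(logits)
        ]


# Hypothetical constraint: only token ids 0 and 2 are valid in the final output.
guidance = ToggleableGuidance(lambda token_id: token_id in {0, 2})
logits = [0.1, 0.9, 0.3]

free = guidance.apply(logits)         # reasoning / tool-calling phase: no masking
guidance.enabled = True               # final phase: enforce the constraint
constrained = guidance.apply(logits)  # token id 1 is masked out
```

Here the constraint object exists from the start, matching the requirement that guidance be set at generator initialization, while the flag decides whether each decoding step is actually masked.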
Additional Notes: