
Add LLM-assisted instruction extraction from natural language chat#586

Merged
Chris0Jeky merged 12 commits into main from enhance/573-llm-instruction-extraction on Mar 29, 2026

Conversation

@Chris0Jeky
Owner

Summary

  • OpenAI and Gemini providers now extract structured instructions from natural language via LLM structured output (JSON mode)
  • A shared LlmInstructionExtractionPrompt defines the system prompt and response parser
  • ChatService uses LLM-extracted instructions when available and falls back to the static classifier otherwise
  • Multi-instruction support: a single message can produce multiple proposals
  • Mock provider unchanged for deterministic test behavior
  • Review-first gate preserved: extracted instructions become proposals, not direct mutations

Closes #573

Test plan

  • Existing mock-based tests pass unchanged (960 application tests pass)
  • New LlmInstructionExtractionPromptTests (12 tests) verify JSON parsing, code fences, edge cases
  • New ChatServiceTests (4 tests) verify instruction extraction flow, fallback, multi-proposal
  • Updated provider tests account for system prompt injection
  • dotnet test passes clean across all projects (1,544 total tests)
  • Manual: "can you create onboarding tasks?" with real provider should produce a proposal

Extend LlmCompletionResult with optional Instructions list for LLM-extracted
structured instructions. Extend ChatCompletionRequest with optional SystemPrompt
field for provider-specific system prompts.

Shared system prompt and JSON response parser used by OpenAI and Gemini providers
to extract actionable board instructions from natural language chat messages.
Handles markdown code fences, missing fields, and malformed JSON gracefully.

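As a rough sketch of that parsing behavior (Python for brevity; the real implementation is C#, and the function name and return shape here are illustrative, not the actual API):

```python
import json

FENCE = "`" * 3  # markdown code fence marker

def try_parse_structured_response(body):
    """Strip a leading/trailing markdown code fence, then parse JSON.

    Returns None for malformed JSON; missing fields get defaults
    instead of raising. Illustrative sketch only.
    """
    trimmed = body.strip()
    if trimmed.startswith(FENCE):
        first_newline = trimmed.find("\n")
        if first_newline >= 0:
            trimmed = trimmed[first_newline + 1:]
        if trimmed.endswith(FENCE):
            trimmed = trimmed[:-3].rstrip()
    try:
        parsed = json.loads(trimmed)
    except json.JSONDecodeError:
        return None  # caller falls back to the static classifier
    return {
        "reply": parsed.get("reply", ""),
        "instructions": parsed.get("instructions", []),
    }
```
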
Add system prompt for instruction extraction, request JSON mode via
response_format, and parse structured output into LlmCompletionResult.Instructions.
Falls back to the static LlmIntentClassifier when the structured parse fails.

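The JSON-mode request on the OpenAI side could be sketched roughly as follows (Python dict for illustration; `response_format` with type `json_object` is the Chat Completions JSON mode, and the model name is a placeholder):

```python
def build_openai_payload(messages, system_prompt, model="placeholder-model"):
    """Prepend the extraction system prompt and request JSON mode.

    Illustrative sketch of the request body shape, not the provider code.
    """
    return {
        "model": model,
        "messages": [{"role": "system", "content": system_prompt}, *messages],
        # JSON mode: the model must return a valid JSON object
        "response_format": {"type": "json_object"},
    }
```
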
Add system prompt prepended as user message, request JSON via responseMimeType,
and parse structured output into LlmCompletionResult.Instructions.
Falls back to the static LlmIntentClassifier when the structured parse fails.

When LlmCompletionResult.Instructions has entries, iterate each and call
ParseInstructionAsync individually. Falls back to raw user message when no
instructions are extracted. Supports multiple proposals from a single message.

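The control flow described above might be sketched as follows (Python; `parse_instruction` is a hypothetical stand-in for ParseInstructionAsync):

```python
def build_proposals(user_message, extracted_instructions, parse_instruction):
    """Iterate LLM-extracted instructions, one proposal per instruction.

    Falls back to the raw user message when no instructions were
    extracted; surfaces only the first failure. Illustrative sketch.
    """
    candidates = extracted_instructions or [user_message]
    proposals = []
    first_failure = None
    for text in candidates:
        result = parse_instruction(text)  # stand-in for ParseInstructionAsync
        if result is not None:
            proposals.append(result)
        elif first_failure is None:
            first_failure = text
    return proposals, first_failure
```
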
OpenAI and Gemini providers now prepend a system prompt message, so role
mapping tests expect two messages instead of one.

LlmInstructionExtractionPromptTests: 12 tests covering valid/invalid JSON,
code fences, missing fields, multiple instructions, and system prompt content.
ChatServiceTests: 4 new tests for LLM-extracted instruction flow, fallback
behavior, multi-instruction proposals, and empty instructions list handling.


@gemini-code-assist bot left a comment


Code Review

This pull request implements LLM-assisted instruction extraction for the OpenAI and Gemini providers, enabling the conversion of natural language chat messages into structured board actions. Key changes include a new shared system prompt, JSON response parsing logic, and updates to the ChatService to support multiple extracted instructions. Feedback identifies a high-severity issue in the Gemini provider where system prompts are incorrectly prepended as user messages, which may violate API role requirements. Furthermore, the markdown stripping logic for parsing LLM responses is noted as fragile and should be improved using a more robust brace-matching strategy.

Comment on lines 41 to 61
    var systemPrompt = request.SystemPrompt ?? LlmInstructionExtractionPrompt.SystemPrompt;
    var allMessages = new List<object>
    {
        new
        {
            role = "user",
            parts = new[] { new { text = systemPrompt } }
        }
    };
    allMessages.AddRange(request.Messages.Select(MapMessage));

    message.Content = JsonContent.Create(new
    {
    -   contents = request.Messages.Select(MapMessage).ToArray(),
    +   contents = allMessages.ToArray(),
        generationConfig = new
        {
            temperature = request.Temperature,
    -       maxOutputTokens = request.MaxTokens
    +       maxOutputTokens = request.MaxTokens,
    +       responseMimeType = "application/json"
        }
    });


Severity: high

The current implementation prepends the system prompt as a user message. This will cause API errors for any multi-turn conversation, as the Gemini API requires roles to alternate between user and model. Prepending a user message will result in two consecutive user messages at the start of the conversation.

The recommended way to provide system instructions to Gemini models is by using the system_instruction field at the top level of the request payload. This avoids altering the message history and ensures the role sequence remains valid.

            var systemPrompt = request.SystemPrompt ?? LlmInstructionExtractionPrompt.SystemPrompt;
            message.Content = JsonContent.Create(new
            {
                contents = request.Messages.Select(MapMessage).ToArray(),
                system_instruction = new
                {
                    parts = new[] { new { text = systemPrompt } }
                },
                generationConfig = new
                {
                    temperature = request.Temperature,
                    maxOutputTokens = request.MaxTokens,
                    responseMimeType = "application/json"
                }
            });

Comment on lines +65 to +73
    var trimmed = responseBody.Trim();
    if (trimmed.StartsWith("```", StringComparison.Ordinal))
    {
        var firstNewline = trimmed.IndexOf('\n');
        if (firstNewline >= 0)
            trimmed = trimmed[(firstNewline + 1)..];
        if (trimmed.EndsWith("```", StringComparison.Ordinal))
            trimmed = trimmed[..^3].TrimEnd();
    }


Severity: high

The current logic for stripping markdown code fences is not fully robust. It can fail if the LLM returns a fenced JSON block without a newline after the language specifier (e.g., ```json{...}```). This would cause JSON parsing to fail and the system to incorrectly fall back to the static classifier.

A more resilient approach is to find the first opening brace { and the last closing brace } to extract the JSON object, as this is less sensitive to variations in markdown formatting.

            var firstBrace = responseBody.IndexOf('{');
            var lastBrace = responseBody.LastIndexOf('}');

            if (firstBrace == -1 || lastBrace < firstBrace)
                return false; // Not a valid JSON object structure

            var trimmed = responseBody[firstBrace..(lastBrace + 1)];

Probe requests now pass SystemPrompt = "" to avoid forcing JSON mode,
which would cause the LLM to return JSON instead of the expected "OK"
response. Instruction extraction system prompt and JSON mode are only
applied when SystemPrompt is null (the default for chat requests).
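The opt-out convention can be sketched as follows (Python; the helper name is illustrative, not the actual C# code):

```python
def resolve_system_prompt(request_system_prompt, default_extraction_prompt):
    """None -> use the extraction default; "" -> opt out entirely.

    Probe requests pass "" so the extraction prompt and JSON mode are
    never forced onto them. Illustrative sketch of the convention.
    """
    if request_system_prompt is None:
        # normal chat request: extraction prompt (and JSON mode) apply
        return default_extraction_prompt
    # "" becomes None: no system prompt, no JSON mode
    return request_system_prompt or None
```
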
@Chris0Jeky
Owner Author

Adversarial Self-Review

Issues Found and Fixed

  1. Probe requests broken by JSON mode (fixed in 43d5acf): Probe requests (Reply with exactly: OK) were getting the instruction extraction system prompt and JSON mode forced on, which would cause the LLM to either return JSON instead of "OK" or error on non-JSON output. Fixed by having probes pass SystemPrompt = "" to opt out.

Remaining Considerations

Prompt injection risk (low):

  • The system prompt instructs the LLM to respond with JSON only. A malicious user message could attempt to override this (e.g., "ignore the JSON format and ..."). However:
    • The existing ContainsBlockedPromptPattern check in ChatService already blocks common prompt injection phrases
    • Even if the LLM returns non-JSON, TryParseStructuredResponse fails gracefully and falls back to the static classifier
    • Instructions still go through ParseInstructionAsync which validates against structured patterns before creating proposals
    • The review-first gate (proposals, not direct mutations) is preserved

Edge cases handled:

  • Empty/null instructions list: falls back to raw user message
  • Malformed JSON from LLM: falls back to static classifier
  • Markdown code fences in LLM response: stripped before parsing
  • Partial success in multi-instruction: reports successful proposals, surfaces first failure

Potential follow-up items:

  • The system prompt is static and does not include board context (column names, existing cards). Board-context prompting (Board-context-aware LLM prompting for chat proposals #575) would improve instruction quality.
  • Multi-instruction support creates one proposal per instruction rather than batching. Issue Multi-instruction parsing for batch chat requests #574 could optimize this.
  • responseMimeType for Gemini and response_format for OpenAI are always sent for non-probe requests, even when structured output is not strictly needed. This is harmless but slightly wasteful for conversational messages.
  • The firstSuccess variable in ChatService is computed but not directly used (only proposalIds is used for the response). This is intentional for future use but could be cleaned up.

…pt as user message

The Gemini API supports a top-level system_instruction field for system
prompts. Sending the system prompt as a user message breaks multi-turn
conversations by creating consecutive user messages. This moves the
system prompt to system_instruction and omits it when empty (e.g. for
probe requests). Updates tests to verify the new payload structure.
…raction

The string-based code fence stripping fails when the LLM returns JSON
without a newline after the language specifier (e.g. ```json{...}```).
Use brace-matching (first '{' to last '}') to reliably extract the JSON
object regardless of surrounding text or formatting. Adds tests for the
no-newline edge case, bare code fences, and JSON with surrounding prose.
@Chris0Jeky
Owner Author

Gemini code review findings fixed

1. System prompt sent via system_instruction (HIGH)

Commit: aefdbc3

Moved the system prompt from being prepended as a user message in the contents array to the Gemini API's native system_instruction top-level field. This prevents consecutive user messages that break multi-turn conversations. When the system prompt is empty (e.g. probe requests), the system_instruction field is omitted entirely.

Files changed:

  • backend/src/Taskdeck.Application/Services/GeminiLlmProvider.cs
  • backend/tests/Taskdeck.Application.Tests/Services/GeminiLlmProviderTests.cs -- updated null-role test, added two new tests verifying system_instruction presence/absence
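The resulting payload shape, with system_instruction omitted when the prompt is empty, might look roughly like this (Python dict for illustration; field names follow the Gemini REST API as used above, and JSON mode is shown unconditionally for brevity even though probes also skip it):

```python
def build_gemini_payload(contents, system_prompt, temperature, max_tokens):
    """Build the Gemini request body, omitting system_instruction when
    the prompt is empty (e.g. probe requests). Illustrative sketch.
    """
    payload = {
        "contents": contents,
        "generationConfig": {
            "temperature": temperature,
            "maxOutputTokens": max_tokens,
            "responseMimeType": "application/json",
        },
    }
    if system_prompt:  # empty string -> field omitted entirely
        payload["system_instruction"] = {"parts": [{"text": system_prompt}]}
    return payload
```
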

2. Brace-matching JSON extraction (HIGH)

Commit: 351a349

Replaced the string-based markdown code fence stripping with brace-matching (IndexOf('{') to LastIndexOf('}')). The old approach failed when the LLM returned JSON without a newline after the language specifier (e.g. ```json{"reply":...}```). The new approach handles code fences with or without newlines, bare JSON, and JSON with surrounding prose.

Files changed:

  • backend/src/Taskdeck.Application/Services/LlmInstructionExtractionPrompt.cs
  • backend/tests/Taskdeck.Application.Tests/Services/LlmInstructionExtractionPromptTests.cs -- added 3 new edge case tests

Verification

All 1,690 backend tests pass (0 failures): dotnet test backend/Taskdeck.sln -c Release -m:1

@Chris0Jeky Chris0Jeky merged commit f0741cf into main Mar 29, 2026
18 checks passed
@github-project-automation github-project-automation bot moved this from Pending to Done in Taskdeck Execution Mar 29, 2026
@Chris0Jeky Chris0Jeky deleted the enhance/573-llm-instruction-extraction branch March 29, 2026 23:43