Skip to content

Fix chat to auto-create proposals on actionable intent#567

Merged
Chris0Jeky merged 6 commits intomainfrom
fix/512-chat-creates-proposals
Mar 29, 2026
Merged

Fix chat to auto-create proposals on actionable intent#567
Chris0Jeky merged 6 commits intomainfrom
fix/512-chat-creates-proposals

Conversation

@Chris0Jeky
Copy link
Copy Markdown
Owner

Summary

  • Root cause: ChatService.SendMessageAsync only created proposals when dto.RequestProposal was explicitly true, which defaults to false. Users had to manually check a hidden checkbox to get proposals — otherwise actionable messages like "create card" returned prose with messageType: "status" and no proposal.
  • Fix: When the LLM classifies a message as actionable (IsActionable = true), the chat service now automatically routes through AutomationPlannerService.ParseInstructionAsync to create a review-first proposal (GP-06 compliant). The RequestProposal flag is repurposed as a force-try for non-actionable messages.
  • Intent classifier: Broadened LlmIntentClassifier to recognize natural language like "create a task", "new card", "make a task", "remove card", "rename board" — not just exact command patterns.
  • Graceful degradation: If the planner cannot parse the instruction, the response includes the LLM prose plus a hint about what went wrong. If no board is scoped, a hint directs the user to open a board-scoped session.

Closes #512

Test plan

  • All 4 new unit tests pass (auto-proposal, no-board hint, parse failure hint, explicit RequestProposal fallback)
  • Updated integration test passes (expects proposal-reference instead of status)
  • Full test suite: 0 failures across all projects (394 Api + Application + Domain + Architecture + CLI)
  • Manual: send "create card 'My Task'" in chat without checking RequestProposal → should see proposal-reference
  • Manual: send same in a non-board-scoped session → should see status with board-scope hint

When the LLM classifies a message as actionable (e.g. "create card",
"move card"), the chat service now automatically creates a review-first
proposal via AutomationPlannerService without requiring the user to
explicitly check RequestProposal. The RequestProposal flag is repurposed
as a force-try mechanism for non-actionable messages.

Fixes the core issue: chat surface responded with prose but never
created proposals or tasks.

Fixes #512
Add patterns like "create a task", "new card", "make a task",
"remove card", "rename board" so the mock provider detects actionable
intent from natural language, not just exact command syntax.
Cover: auto-proposal without RequestProposal, status hint when no board
scope, graceful fallback when planner cannot parse, and explicit
RequestProposal for non-actionable messages.
The test previously expected "status" for actionable messages without
RequestProposal; update to expect "proposal-reference" now that chat
auto-creates proposals.
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@Chris0Jeky
Copy link
Copy Markdown
Owner Author

Self-review

GP-06 compliance

Verified: all actionable intents route through AutomationPlannerService.ParseInstructionAsync which creates PendingReview proposals. No direct board mutations.

Behavior changes

  • No-board-scope actionable message: changed from error to status with hint. Rationale: the LLM response is still useful; downgraded from error because this is a missing-context situation, not a failure.
  • Planner parse failure: changed from error to status with explanatory hint appended to LLM prose. Same rationale.
  • Auto-proposal without RequestProposal: this is the core fix. Existing RequestProposal: true behavior is preserved.

Traceability improvement

Now passes sourceType: ProposalSourceType.Chat and sourceReferenceId: session.Id to the planner, enabling proposal-to-session tracing.

Pre-existing issue noted

LlmIntentClassifier matches "sort" as a substring which could false-positive on phrases like "some sort of...". Not introduced by this PR but worth a future refinement with word-boundary matching.

No issues found requiring action

  • No injection vectors (input already filtered by prompt denylist, planner uses regex parsing)
  • No silent mutations (all proposals go to review queue)
  • Error handling is explicit throughout
  • All 394+ tests pass

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the chat service by automatically creating task proposals when actionable intent is detected by the LLM, even without an explicit user request. It also expands the LlmIntentClassifier with more natural language patterns and adds comprehensive unit and API tests for these new flows. Feedback was provided regarding significant code duplication in the ChatService proposal handling logic and the use of non-deterministic time functions in unit tests.

Comment on lines 220 to 270
if (llmResult.IsActionable)
{
if (!session.BoardId.HasValue)
{
messageType = "error";
assistantContent = "Actionable instructions require a board-scoped chat session. Create a session with BoardId and retry.";
// Surface a hint so the user knows why no proposal was created
assistantContent = $"{llmResult.Content}\n\n(To act on this, open a board-scoped chat session.)";
messageType = "status";
}
else
{
var proposalResult = await _automationPlanner.ParseInstructionAsync(
dto.Content,
userId,
session.BoardId,
ct);
ct,
sourceType: ProposalSourceType.Chat,
sourceReferenceId: session.Id.ToString());

if (proposalResult.IsSuccess)
{
messageType = "proposal-reference";
proposalId = proposalResult.Value.Id;
assistantContent = $"{llmResult.Content}\n\nProposal created: {proposalResult.Value.Id}";
assistantContent = $"{llmResult.Content}\n\nProposal created for review: {proposalResult.Value.Id}";
}
else
{
messageType = "error";
assistantContent = $"I could not create a proposal: {proposalResult.ErrorMessage}";
// Planner could not parse — return the LLM prose with a hint
assistantContent = $"{llmResult.Content}\n\n(I detected a task request but could not parse it into a proposal: {proposalResult.ErrorMessage})";
messageType = "status";
}
}
}
else if (llmResult.IsActionable)
else if (dto.RequestProposal && session.BoardId.HasValue)
{
messageType = "status";
// User explicitly requested proposal even though LLM did not
// detect actionable intent — try the planner anyway.
var proposalResult = await _automationPlanner.ParseInstructionAsync(
dto.Content,
userId,
session.BoardId,
ct,
sourceType: ProposalSourceType.Chat,
sourceReferenceId: session.Id.ToString());

if (proposalResult.IsSuccess)
{
messageType = "proposal-reference";
proposalId = proposalResult.Value.Id;
assistantContent = $"{llmResult.Content}\n\nProposal created for review: {proposalResult.Value.Id}";
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There's significant code duplication in handling proposal creation. The logic for calling _automationPlanner.ParseInstructionAsync and handling the success case is repeated. This can be refactored to improve maintainability and reduce redundancy. By combining the conditions, you can have a single path for proposal creation that handles all cases correctly.

                var shouldAttemptProposal = llmResult.IsActionable || (dto.RequestProposal && session.BoardId.HasValue);

                if (shouldAttemptProposal)
                {
                    if (!session.BoardId.HasValue)
                    {
                        // This case is only hit when llmResult.IsActionable is true, due to the outer condition.
                        assistantContent = $"{llmResult.Content}\n\n(To act on this, open a board-scoped chat session.)";
                        messageType = "status";
                    }
                    else
                    {
                        var proposalResult = await _automationPlanner.ParseInstructionAsync(
                            dto.Content,
                            userId,
                            session.BoardId,
                            ct,
                            sourceType: ProposalSourceType.Chat,
                            sourceReferenceId: session.Id.ToString());

                        if (proposalResult.IsSuccess)
                        {
                            messageType = "proposal-reference";
                            proposalId = proposalResult.Value.Id;
                            assistantContent = $"{llmResult.Content}\n\nProposal created for review: {proposalResult.Value.Id}";
                        }
                        else if (llmResult.IsActionable) // Only give a hint if we were confident it was actionable
                        {
                            assistantContent = $"{llmResult.Content}\n\n(I detected a task request but could not parse it into a proposal: {proposalResult.ErrorMessage})";
                            messageType = "status";
                        }
                        // If it was just a user-forced proposal that failed, we silently fall through and return the original LLM content.
                    }
                }

Comment on lines +215 to +217
DateTimeOffset.UtcNow,
DateTimeOffset.UtcNow,
DateTime.UtcNow.AddHours(1),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using DateTimeOffset.UtcNow and DateTime.UtcNow directly in tests can lead to non-deterministic behavior, especially if tests are run near midnight or across time zone changes. While these specific values aren't asserted on here, it's a best practice to use a fixed date/time or a time provider abstraction (like TimeProvider in .NET 8) to ensure tests are fully deterministic and repeatable.

@Chris0Jeky
Copy link
Copy Markdown
Owner Author

Adversarial Code Review — PR #567

Reviewer focus: GP-06 (Review-First Automation Safety), security, robustness, error handling, test coverage, layer compliance.

All 1,489 backend tests pass.


GP-06 Compliance: PASS

The core change correctly maintains the review-first invariant. Chat now auto-creates proposals (not direct board mutations) when actionable intent is detected. Proposals still require explicit user approval before any board state changes. This is compliant with GP-06.


Findings

Medium-1: LlmIntentClassifier false positives on "sort" and "reorder"

The classifier matches lower.Contains("sort") and lower.Contains("reorder"), which are substring matches that will fire on natural language like:

  • "What sort of tasks should I prioritize?"
  • "Can you help me figure out sorting my priorities?"
  • "I need to reorder my thoughts on this project"

Previously this was lower-risk because proposals required RequestProposal: true. Now that proposals auto-create, every false positive triggers an unnecessary planner call and potentially confusing "Proposal created" UX.

Recommendation: Use word-boundary matching (e.g., regex \bsort\b with context like "sort cards" / "sort column") or at least require "sort " (with trailing space/object) instead of bare substring. The same applies to "reorder".

Severity: Medium — not a safety issue (proposals still require review), but degrades UX and wastes planner calls.

Medium-2: RequestProposal + !IsActionable path silently swallows planner failure

In the new else if (dto.RequestProposal && session.BoardId.HasValue) branch (lines 252-270), when the planner fails, the code does nothing — it falls through and returns the LLM response as plain "text" with no hint to the user that their explicit proposal request was attempted and failed:

if (proposalResult.IsSuccess)
{
    messageType = "proposal-reference";
    proposalId = proposalResult.Value.Id;
    assistantContent = $"...";
}
// No else — failure is silently dropped

Compare this to the IsActionable branch which correctly surfaces the error:

else
{
    assistantContent = $"...(I detected a task request but could not parse it into a proposal: {proposalResult.ErrorMessage})";
    messageType = "status";
}

Recommendation: Add an else clause to surface the planner error, consistent with the actionable-intent path. The user explicitly asked for a proposal — silence on failure is worse here than in the auto-detect case.

Severity: Medium — explicit user intent is silently dropped.

Low-1: No test for RequestProposal + !IsActionable + planner failure

The new test ShouldTryPlanner_WhenRequestProposalExplicitButNotActionable only tests the success path. The failure path (Medium-2 above) is untested.

Severity: Low — follows from Medium-2.

Low-2: IsDegraded and IsActionable are not mutually exclusive

When the LLM returns IsDegraded: true AND IsActionable: true, the code first sets messageType = "degraded" (line 217) and then overwrites it in the IsActionable block (lines 240-241 or 248). This means degraded status is lost when the message is also actionable.

This was pre-existing behavior, but the PR widens the impact since more messages now enter the IsActionable path. The degradedReason field is still persisted (via persistedDegradedReason), so the information is not truly lost — it is just not visible in messageType.

Severity: Low — informational; the degraded reason is still persisted on the message entity.

Low-3: Classifier does not handle "make card" (without "a")

The classifier handles "make a card" and "make a task" but not "make card" or "make task". This is inconsistent with the pattern for "create card" / "add card" which work without the article.

Severity: Low — minor coverage gap.

Info-1: No rate limiting on auto-proposal creation

Before this change, proposal creation via chat required the explicit RequestProposal flag — a natural rate limit. Now every actionable message auto-triggers planner calls. The quota service gates the LLM call but not the planner call specifically. A user spamming "create card X" in rapid succession could create many pending proposals.

This is not a real security issue in practice (proposals are user-scoped and require review), but worth noting for future consideration.

Severity: Info


Summary

Severity Count Action Needed
Critical 0 -
High 0 -
Medium 2 Recommended fixes
Low 3 Nice to have
Info 1 Future consideration

Overall assessment: The PR is structurally sound and maintains GP-06 compliance. The two Medium findings (false positive expansion in classifier, silent failure on explicit proposal request) are the most impactful. Neither is a blocker, but Medium-2 (silent failure swallowing) should ideally be addressed before merge since it violates the repo's "handle error cases explicitly; do not swallow failures" Definition of Done.

Tests are thorough and all pass. Layer boundaries are respected — ChatService correctly uses Application-layer abstractions (IAutomationPlannerService, IAutomationProposalService) without touching Infrastructure.

- Surface planner error when RequestProposal is explicit but planner
  fails (Medium-2: do not swallow explicit user intent failures)
- Narrow 'sort'/'reorder' classifier matches to require board/card/column
  context to reduce false positives now that proposals auto-create
- Add 'make card'/'make task' (without article) to classifier
- Add test for RequestProposal + !IsActionable + planner failure path
@Chris0Jeky
Copy link
Copy Markdown
Owner Author

Final Assessment — Post-Fix

I pushed commit 8a81c342 to address the two Medium findings from my review:

Fixes applied:

  1. Medium-2 (silent failure swallowing): Added else clause to the RequestProposal + !IsActionable path in ChatService.cs so planner failures are surfaced to the user with a "Could not create the requested proposal" message and "status" messageType. This aligns with the repo's DoD ("handle error cases explicitly; do not swallow failures").

  2. Medium-1 (classifier false positives): Narrowed "sort" and "reorder" substring matches in LlmIntentClassifier.cs to require board/card/column context (e.g., "sort cards", "reorder columns"). Also added "make card"/"make task" without article for consistency.

  3. Low-1 (missing test): Added SendMessageAsync_ShouldReturnStatusWithHint_WhenRequestProposalExplicitButPlannerFails test covering the previously untested failure path.

Verification: All 1,490 backend tests pass (0 failures, 0 skipped).

Overall: The PR is ready to merge. GP-06 (review-first) is fully maintained — chat creates proposals, never direct board mutations. Layer boundaries are clean. Test coverage is comprehensive.

Consolidates the auto-detected and explicit RequestProposal code paths
into a single shouldAttemptProposal branch, eliminating duplicated
ParseInstructionAsync call and response handling. Addresses Gemini review.
@Chris0Jeky Chris0Jeky merged commit 0abdd01 into main Mar 29, 2026
18 checks passed
@Chris0Jeky Chris0Jeky deleted the fix/512-chat-creates-proposals branch March 29, 2026 18:24
@github-project-automation github-project-automation bot moved this from Pending to Done in Taskdeck Execution Mar 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

BUG: Chat generates prose but never creates proposals or tasks

1 participant