Improve LlmIntentClassifier NLP coverage by Chris0Jeky · Pull Request #579 · Chris0Jeky/Taskdeck

Chris0Jeky · 2026-03-29T21:44:51Z

Summary

Fixes #571 (Improve LlmIntentClassifier coverage for natural language phrasing)
Replace brittle exact-substring matching with compiled regex patterns supporting word-distance gaps (up to 5 intervening words), plural forms (cards?, tasks?, items?), and broader verb coverage (generate, build, prepare, set up)
Add negative context filtering: negations (don't create task yet) and questions about other tools (how do I create a card in Jira?) are suppressed
Fix ordering bug where remove card was misclassified as card.move due to substring overlap
All regex patterns use RegexOptions.Compiled and a 100ms timeout to prevent catastrophic backtracking
Backward compatible: all original exact-match patterns still work
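
A minimal sketch of the word-distance idea described above, not the PR's actual patterns. The verb list, the noun list, and the names `cardCreateSketch` and `MatchesCardCreate` as written here are assumptions for illustration; only the bounded gap of up to 5 words, the optional plural, `RegexOptions.Compiled`, and the 100ms timeout come from the summary.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical pattern: a create-style verb, up to 5 intervening words,
// then a singular or plural noun. Verb and noun lists are illustrative.
var cardCreateSketch = new Regex(
    @"\b(create|add|make|generate|build|prepare|set\s+up)\b(\s+\w+){0,5}\s+(cards?|tasks?|items?)\b",
    RegexOptions.Compiled | RegexOptions.IgnoreCase,
    TimeSpan.FromMilliseconds(100));

bool MatchesCardCreate(string input)
{
    try
    {
        return cardCreateSketch.IsMatch(input);
    }
    catch (RegexMatchTimeoutException)
    {
        // Treat a timeout as a non-match rather than failing the request.
        return false;
    }
}

Console.WriteLine(MatchesCardCreate("can you create new onboarding tasks for non-technical people?")); // True
Console.WriteLine(MatchesCardCreate("create card"));                                                   // True
Console.WriteLine(MatchesCardCreate("I could use some help organizing my sprint"));                    // False
```

Because the gap is `(\s+\w+){0,5}` rather than `.*`, the engine can only try a small, fixed number of word counts before giving up, which is what keeps the worst case bounded.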

Test plan

"can you create new onboarding tasks for non-technical people?" classifies as card.create
"I need three new cards for the sprint" classifies as card.create
"how do I create a card in Jira?" does NOT classify as actionable
"don't create task yet, just explain" does NOT classify as actionable
"remove card abc123" now correctly classifies as card.archive (was card.move)
All 14 existing exact-match card.create patterns still pass
All existing board, reorder, update, move, archive patterns still pass
86 unit tests covering exact patterns, NLP patterns, negations, other-tool questions, plurals, word-distance gaps, broader verbs, case insensitivity, edge cases
Full backend test suite passes (all projects)

…entClassifier Replace brittle exact-substring matching with compiled regex patterns that support word-distance gaps, plural forms, and broader verb coverage. Add negative context filtering for negations and questions about other tools. Fix "remove card" being matched as "move card" by reordering checks.

Cover natural language phrasing, plural forms, broader verbs, word-distance gaps, negation filtering, other-tool question filtering, edge cases, and backward compatibility with all existing exact-match patterns.

Remove overly broad `move.*to` pattern from CardMovePattern to prevent matching "move on to the next topic". Replace unbounded `.*` in NegationPattern with bounded word-distance to avoid suppressing intents in compound sentences like "stop the sprint and then create cards".

gemini-code-assist

Code Review

This pull request significantly improves the LlmIntentClassifier by transitioning from basic substring matching to a more robust Regex-based approach. Key enhancements include the ability to detect natural language patterns, handle negations, and filter out questions regarding external tools. It also resolves a classification bug by reordering intent checks. The review feedback suggests simplifying the new helper methods by removing redundant substring checks that are now encompassed by the regex patterns.

gemini-code-assist · 2026-03-29T21:46:04Z

+    private static bool MatchesCardCreate(string lower)
+    {
+        // Exact substring matches (backward compatible)
        if (lower.Contains("create card") || lower.Contains("add card")
            || lower.Contains("create a card") || lower.Contains("add a card")
            || lower.Contains("create task") || lower.Contains("add task")
            || lower.Contains("create a task") || lower.Contains("add a task")
            || lower.Contains("new card") || lower.Contains("new task")
            || lower.Contains("make a card") || lower.Contains("make a task")
            || lower.Contains("make card") || lower.Contains("make task"))
-            return (true, "card.create");
+            return true;

-        if (lower.Contains("move card"))
-            return (true, "card.move");
+        // Regex-based natural language matching
+        try
+        {
+            if (CardCreatePattern.IsMatch(lower))
+                return true;
+            if (NewCardPattern.IsMatch(lower))
+                return true;
+        }
+        catch (RegexMatchTimeoutException)
+        {
+            // Fall through — don't match on timeout
+        }
+
+        return false;
+    }


This method, and the other Matches... methods that follow, can be significantly simplified.

The initial if block with lower.Contains(...) is now redundant. The new regular expressions are supersets that cover these exact-match cases while also providing broader NLP capabilities. Removing these checks simplifies the code without losing backward compatibility.

The try-catch logic can be made more direct by combining the regex checks and returning false from within the catch block.

This refactoring makes the intent clearer and reduces code duplication across all the Matches... methods. The same pattern can be applied to MatchesCardMove, MatchesCardArchive, etc.

Here is a suggested simplification for this method:

private static bool MatchesCardCreate(string lower)
{
    try
    {
        // The regex patterns cover both the old exact matches and new NLP variations.
        return CardCreatePattern.IsMatch(lower) || NewCardPattern.IsMatch(lower);
    }
    catch (RegexMatchTimeoutException)
    {
        // On timeout, treat as a non-match for safety.
        return false;
    }
}

Chris0Jeky · 2026-03-29T21:46:22Z

Self-Review Findings

Regex Performance

All patterns use bounded repetition (\s+\w+){0,N} (max 4-6 words) — no unbounded .* in hot paths
All regex instances use RegexOptions.Compiled and a 100ms TimeSpan timeout
RegexMatchTimeoutException is caught at every call site, falling through gracefully
Tested with 200-word input to verify no catastrophic backtracking — completes without hanging
Fixed in commit 3: Removed the \bmove\b.*\bto\b alternation from CardMovePattern (unbounded .*) and replaced unbounded .* in NegationPattern with (\s+\w+){0,6}
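
To illustrate why the bounded form matters for negation, here is a hedged sketch comparing an unbounded `.*` gap with the `(\s+\w+){0,6}` gap mentioned above. The negation words, the verb list, and the example sentences are assumptions; the thread does not show the real NegationPattern.

```csharp
using System;
using System.Text.RegularExpressions;

var opts = RegexOptions.Compiled | RegexOptions.IgnoreCase;
var timeout = TimeSpan.FromMilliseconds(100);

// Hypothetical negation words and verbs; the PR's real lists are not shown in the thread.
var unbounded = new Regex(@"\b(don't|do\s+not|never)\b.*\b(create|add)\b", opts, timeout);
var bounded = new Regex(@"\b(don't|do\s+not|never)\b(\s+\w+){0,6}\s+(create|add)\b", opts, timeout);

bool IsNegated(Regex pattern, string input)
{
    try { return pattern.IsMatch(input); }
    catch (RegexMatchTimeoutException) { return false; }
}

var direct = "don't create task yet, just explain";
// Nine words separate the negation from the verb in this sentence.
var compound = "do not worry about the old sprint review notes and then create cards";

Console.WriteLine(IsNegated(bounded, direct));     // True: negation directly precedes the verb
Console.WriteLine(IsNegated(unbounded, compound)); // True: .* reaches across the whole sentence
Console.WriteLine(IsNegated(bounded, compound));   // False: the gap exceeds 6 words
```

The bound trades recall for precision: a negation more than six words away from the verb is assumed to belong to a different clause.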

False Positive Analysis

"I deleted the create card button by accident": Still matches as card.archive because "delete" is adjacent to "card". This is a known limitation — detecting past tense / UI references would require POS tagging, which is out of scope for a keyword classifier. Acceptable trade-off.
"move on to the next topic": Previously would have matched card.move via the move.*to pattern. Fixed by removing that alternation.
Other-tool suppression: Only triggers when BOTH a question pattern (how/what/where...?) AND an other-tool name are present. Commands mentioning other tools (e.g., "create a card like in Jira") correctly pass through. Tested.
Negation suppression: Bounded to 6 words between negation and verb, preventing false suppression in compound sentences.
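
The other-tool rule above (suppress only when both a question form and another tool's name are present) could look roughly like the following. This is a sketch under assumptions: the tool list, the question words, and the helper name `IsOtherToolQuestion` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

var opts = RegexOptions.Compiled | RegexOptions.IgnoreCase;
var timeout = TimeSpan.FromMilliseconds(100);

// Hypothetical tool names and question words, chosen for illustration only.
var otherTool = new Regex(@"\b(jira|trello|asana)\b", opts, timeout);
// [^?]* cannot run past the first question mark, unlike an unbounded .*
var question = new Regex(@"\b(how|what|where)\b[^?]*\?", opts, timeout);

bool IsOtherToolQuestion(string input)
{
    try
    {
        // Both conditions must hold; a tool name alone does not suppress.
        return otherTool.IsMatch(input) && question.IsMatch(input);
    }
    catch (RegexMatchTimeoutException)
    {
        return false;
    }
}

Console.WriteLine(IsOtherToolQuestion("how do I create a card in Jira?")); // True
Console.WriteLine(IsOtherToolQuestion("create a card like in Jira"));      // False
Console.WriteLine(IsOtherToolQuestion("how do I create a card?"));         // False
```

Requiring both signals is what lets a command such as "create a card like in Jira" pass through to the positive patterns.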

False Negative Analysis

"please add these items: ...": Now matches via add ... items regex pattern
"can you create new onboarding tasks...": Now matches — "create" + words + "tasks"
Remaining gap: Extremely indirect phrasing like "I could use some help organizing my sprint" (no verb+noun pattern) won't match. This is by design — the classifier is intentionally conservative to avoid false positives.

Breaking Changes

"remove card" now correctly classifies as card.archive instead of card.move. This fixes the documented ordering bug. Test updated to reflect correct behavior.
All 14 original exact-match card.create patterns still pass unchanged.
All other existing patterns (move, archive, update, board, reorder) still pass.
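
For readers unfamiliar with the ordering bug, this small sketch reproduces it with plain substring checks. The two functions are an assumed reconstruction of the old and new check order, not code from the PR.

```csharp
using System;

// Assumed reconstruction of the old order: "remove card" contains "move card",
// so the move check fires first.
string ClassifyOldOrder(string lower)
{
    if (lower.Contains("move card")) return "card.move";
    if (lower.Contains("remove card")) return "card.archive";
    return "none";
}

// Checking the longer phrase first resolves the overlap.
string ClassifyReordered(string lower)
{
    if (lower.Contains("remove card")) return "card.archive";
    if (lower.Contains("move card")) return "card.move";
    return "none";
}

Console.WriteLine(ClassifyOldOrder("remove card abc123"));        // card.move
Console.WriteLine(ClassifyReordered("remove card abc123"));       // card.archive
Console.WriteLine(ClassifyReordered("move card abc123 to done")); // card.move
```

A word-boundary pattern such as `\bmove\b` would also avoid the overlap, since there is no boundary between "re" and "move" inside "remove".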

Test Coverage

86 tests total covering: exact patterns (14), natural language (14), negation (6), other-tool questions (3), other-tool commands (2), non-actionable (7), edge cases (3), case insensitivity (5), plural forms (6), broader verbs (4), word-distance (5), card operations (7), board operations (6), reorder (4)
Edge cases include: empty string, whitespace-only, very long input, mixed intents

Remaining Risk

The items? noun in create patterns could match overly broad phrases like "add these items to my shopping list" — low risk in a Taskdeck chat context where all input is board-related
No coverage for multi-line input (messages with newlines) — regex patterns work per-line by default, which is fine for chat messages

Chris0Jeky · 2026-03-29T22:00:01Z

Adversarial Review — PR #579

Critical

None found.

Major

M1. Negation test "stop creating cards" passes for the wrong reason

The test at line 175 asserts this returns non-actionable and attributes it to negation. But NegationPattern requires \b(create|add|...)\b, which cannot match "creating": the gerund drops the final "e", so the literal "create" never appears in it (and for a verb like "add", the trailing \b fails inside "adding" because "i" is a word character). The test actually passes because no positive pattern matches either ("creating" also fails \bcreate\b in CardCreatePattern). This means:

If someone later adds gerund support to positive patterns, this test will break and the negation won't save it.
The test gives false confidence that negation handles "stop + gerund" forms.

Recommendation: Either (a) add gerund forms to NegationPattern group 3 (e.g., creating|adding|...), or (b) change the test to document that it's non-actionable because neither negation nor positive patterns match, not because negation works.
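
The mechanism behind M1 can be checked in isolation. The sketch below uses a two-verb stand-in for the pattern's verb group (the real group is longer and not shown in full here) and shows what option (a) would change.

```csharp
using System;
using System.Text.RegularExpressions;

var opts = RegexOptions.Compiled | RegexOptions.IgnoreCase;
var timeout = TimeSpan.FromMilliseconds(100);

// Stand-in for the verb group; the full list in the PR is longer.
var baseForms = new Regex(@"\b(create|add)\b", opts, timeout);
// Option (a): list gerund forms explicitly alongside the base forms.
var withGerunds = new Regex(@"\b(create|creating|add|adding)\b", opts, timeout);

Console.WriteLine(baseForms.IsMatch("stop creating cards"));   // False
Console.WriteLine(baseForms.IsMatch("stop adding cards"));     // False
Console.WriteLine(withGerunds.IsMatch("stop creating cards")); // True
```

If gerunds were added only to the positive patterns and not to the negation group, "stop creating cards" would start classifying as actionable, which is the breakage the finding warns about.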

M2. Double negatives trigger negation suppression → false negative

"don't not create a task" — the NegationPattern matches "don't" + "not" (1 word) + "create", so it's suppressed. But semantically, "don't not create" means "do create." Similarly, "I can't avoid creating a task" — "avoid" triggers negation but the actual intent is to create. These are admittedly rare in chat, but worth documenting as a known limitation at minimum.

M3. "I deleted the create card button by accident" is a false positive

This input contains the literal substring "create card" so MatchesCardCreate returns card.create. The self-review acknowledges this (calling it a "past tense / UI reference" limitation) but attributes it to card.archive — it's actually card.create because archive checks \bdelete\b which doesn't match "deleted", and the exact substring "create card" fires in the create path. No test covers this case. The old test suite had this as a documented edge case but the PR removed it without replacement.

Recommendation: Add a test documenting this known false positive, even if you choose not to fix it.

Minor

m1. QuestionAboutHowPattern uses unbounded .*

The .* is technically safe here because: (a) it only fires when OtherToolPattern already matched, (b) on inputs ending with ? the backtracking is O(1), and (c) the 100ms timeout is the backstop. But it's inconsistent with the disciplined bounded-repetition approach used everywhere else. Consider [^?]* or [\s\S]{0,500} for consistency.

m2. NegationPattern does not cover "no" as a negation word

"no need to create a task" would match CardCreatePattern (create + gap + task) and fire as card.create. The word "no" is missing from the negation group. Similarly, "not" alone (without "do") is missing: "not creating tasks right now" — though this case is saved by the gerund not matching.

m3. Ordering sensitivity with "update board" and BoardRenamePattern

"update board settings" matches BoardRenamePattern (\b(rename|update|edit)\b ... \bboards?\b) and returns board.update. But "update the board's card list" could also match — is "update board" always a rename? The semantic mapping of update → board.update (which the test calls "rename") is potentially confusing. Not a bug, but the intent label could mislead consumers.

m4. No test for regex special characters in input

Input like "create card (urgent) [P0]" or "add task: fix the $PATH issue" still works, because the substring checks fire before the regex path. Input that bypasses substring matching and reaches the regex path, e.g., "create a card" typed with multiple spaces between the words, is handled correctly by the \s+\w+ groups. A tab-separated "create\tcard" also works: ToLowerInvariant() preserves tabs and \s+ matches them. Low risk, but worth a test for documentation.

m5. Compiled regex count (10 patterns) — static memory cost

10 compiled regex patterns are held in static fields. Each compiled regex allocates IL code. This is fine for a singleton classifier but worth noting — if this pattern proliferates to other classifiers, consider RegexOptions.NonBacktracking (.NET 7+) as an alternative that avoids the compilation cost and eliminates backtracking risk entirely.
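
As a rough illustration of the alternative mentioned in m5, a bounded-gap pattern can be constructed with `RegexOptions.NonBacktracking`. This assumes a .NET 7 or later target; the pattern itself is a hypothetical stand-in, and the option does not support constructs such as lookarounds or backreferences, so each existing pattern would need to be checked before switching.

```csharp
using System;
using System.Text.RegularExpressions;

// Requires .NET 7 or later. The pattern is an illustrative stand-in,
// not one of the classifier's real patterns.
var nonBacktracking = new Regex(
    @"\b(create|add|make)\b(\s+\w+){0,5}\s+(cards?|tasks?|items?)\b",
    RegexOptions.NonBacktracking | RegexOptions.IgnoreCase);

Console.WriteLine(nonBacktracking.IsMatch("please add these items: milk, eggs")); // True
Console.WriteLine(nonBacktracking.IsMatch("move on to the next topic"));          // False
```

With this option, matching time grows linearly with input length, so the timeout becomes a secondary safeguard rather than the main defence.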

Nits

n1. The Classify_NullInput_ShouldReturnFalse test name says "null input" but actually tests whitespace. Classify accepts string (non-nullable), so true null would be a compiler warning. The test name is misleading.

n2. Classify_VeryLongInput_ShouldNotHang constructs a 200-word input but doesn't assert timing. A Stopwatch assertion (e.g., < 500ms) would make this test actually verify the performance claim rather than just "it didn't crash."

n3. The comment on line 38 of the classifier says "or move + card/task context + to" but that alternation was removed. Stale comment.
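
The timing assertion suggested in n2 might be sketched as follows. It is written as a plain script rather than in the project's test framework, and the pattern and filler input are assumptions; only the 200-word input and the 500ms budget come from the review.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

// Illustrative stand-in for one of the classifier's bounded patterns.
var pattern = new Regex(
    @"\b(create|add|make)\b(\s+\w+){0,5}\s+(cards?|tasks?|items?)\b",
    RegexOptions.Compiled | RegexOptions.IgnoreCase,
    TimeSpan.FromMilliseconds(100));

// "create" followed by 200 filler words: the verb matches, the noun never does.
var longInput = "create " + string.Join(" ", Enumerable.Repeat("filler", 200));

// Warm-up call so one-time setup cost is not counted in the measurement.
pattern.IsMatch("warm up");

var stopwatch = Stopwatch.StartNew();
var matched = pattern.IsMatch(longInput);
stopwatch.Stop();

if (matched)
    throw new InvalidOperationException("Long filler input should not classify as card.create.");
if (stopwatch.ElapsedMilliseconds >= 500)
    throw new InvalidOperationException($"Match took {stopwatch.ElapsedMilliseconds} ms, over the 500 ms budget.");

Console.WriteLine($"No match in {stopwatch.ElapsedMilliseconds} ms");
```

A wall-clock threshold can be flaky on a heavily loaded build agent, so the budget is deliberately far above the expected runtime.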

Overall Assessment

Pass with fixes. The implementation is solid overall — bounded quantifiers, compiled regexes with timeouts, timeout catch blocks, backward-compatible exact-match fast paths, and sensible ordering. The self-review was thorough and caught real issues.

The Major items are primarily about test accuracy (M1, M3) and documentation of known limitations (M2). M1 is the most concerning because it represents a false sense of security in the negation logic — the test passes by coincidence rather than by the mechanism it claims to test. M3 is a regression in test coverage from the old suite.

Recommended before merge:

Fix or re-document the "stop creating cards" test (M1)
Add a test for "I deleted the create card button by accident" documenting the false positive (M3)
Fix stale comment (n3)

…x stale comment - Split "stop creating cards" out of negation tests into its own test that documents the real reason it's non-actionable (gerund form invisible to both negation and positive patterns, not negation suppression) - Add test for known false positive: "I deleted the create card button" matches card.create due to literal substring "create card" - Fix stale comment on CardMovePattern that referenced removed alternation

Chris0Jeky · 2026-03-29T22:01:59Z

Follow-up: Fixes pushed for review findings

Commit 27aed019 addresses the three actionable findings from the adversarial review:

M1 — Gerund test accuracy

Moved "stop creating cards" out of the negation test group into a standalone test Classify_GerundForm_IsNonActionable_ButNotDueToNegation that documents the real mechanism: gerund forms like "creating" don't match \bcreate\b in either the negation or positive patterns. The test assertion message now explains this clearly.

M3 — Missing false-positive documentation

Added Classify_PastTenseNarrative_IsKnownFalsePositive test for "I deleted the create card button by accident" — documents that it matches card.create because the literal substring "create card" is present. Asserts the current (incorrect but known) behavior.

n3 — Stale comment

Fixed the CardMovePattern comment that still referenced the removed move.*to alternation.

All 87 tests pass. No behavior changes — only test accuracy and documentation improvements.

Chris0Jeky · 2026-03-29T22:18:10Z

Addressed Gemini review feedback: removed redundant Contains() substring checks from all Matches... methods in LlmIntentClassifier.cs. The regex patterns already cover those exact-match cases. All 1627 backend tests pass.

Update two analysis docs (chat-to-proposal gap and manual testing findings) to reflect recent fixes and testing status. Key changes: add Last Updated and status notes; mark Tier 1 improvements shipped (intent classifier regex/stemming/negation fixes, substring ordering bug, PR #579), UX parse hints shipped (PR #582), unit/integration tests shipped (PR #580), and note PR range #578–#582. In manual testing findings mark OBS-2/OBS-3 resolved (PR #581) and BUG-M5 resolved (PR #578), update resolutions and remove duplicate checklist items. Minor editorial clarifications and test counts added.

Chris0Jeky added 2 commits March 29, 2026 22:40

Add comprehensive tests for improved LlmIntentClassifier NLP matching

57b0863

github-project-automation bot added this to Taskdeck Execution Mar 29, 2026

github-project-automation bot moved this to Pending in Taskdeck Execution Mar 29, 2026

gemini-code-assist bot reviewed Mar 29, 2026

Remove redundant substring checks now covered by regex patterns

9b9f142

Merge branch 'main' into enhance/571-intent-classifier-nlp

09d73d2

Chris0Jeky merged commit 8037d5d into main Mar 29, 2026
8 checks passed

Chris0Jeky deleted the enhance/571-intent-classifier-nlp branch March 29, 2026 22:23

github-project-automation bot moved this from Pending to Done in Taskdeck Execution Mar 29, 2026

Chris0Jeky merged 6 commits into main from enhance/571-intent-classifier-nlp
