Add property-based and fuzz testing pilot for domain/API contracts#601
Add property-based and fuzz testing pilot for domain/API contracts#601Chris0Jeky merged 11 commits intomainfrom
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Adversarial Self-ReviewStrengths
Issues Found and Addressed
Remaining Observations (non-blocking)
VerdictShip as-is. The tests are well-structured, fast, and already found a real gap in the classifier. The observations above are enhancement opportunities, not blockers. |
There was a problem hiding this comment.
Code Review
This pull request introduces property-based and fuzz testing using FsCheck for DTO serialization, intent classification, and domain entity invariants. The reviewer identified several improvement opportunities: ensuring test determinism by generating values like Guids and timestamps through FsCheck rather than using static methods inside lambdas, deepening assertions to verify collection contents rather than just counts, and converting non-parameterized property tests into standard unit tests for better efficiency.
| deserialized.Columns.Count().Should().Be(dto.Columns.Count()); | ||
| deserialized.Cards.Count().Should().Be(dto.Cards.Count()); | ||
| deserialized.Labels.Count().Should().Be(dto.Labels.Count()); |
There was a problem hiding this comment.
The current assertions only verify the count of items in the nested collections (Columns, Cards, Labels). To ensure full roundtrip fidelity as intended by this test, you should verify that the actual content of these collections is preserved during serialization and deserialization.
deserialized.Columns.Should().BeEquivalentTo(dto.Columns);
deserialized.Cards.Should().BeEquivalentTo(dto.Cards);
deserialized.Labels.Should().BeEquivalentTo(dto.Labels);| $"Description for card {i}", | ||
| columnNames.Length > 0 ? columnNames[i % columnNames.Length] : "Default", | ||
| i, | ||
| i % 2 == 0 ? DateTimeOffset.UtcNow.AddDays(i) : null, |
There was a problem hiding this comment.
Using DateTimeOffset.UtcNow inside a generator introduces non-determinism. If a test fails, replaying it with the same seed will not reproduce the exact same DateTimeOffset value, which can make debugging and shrinking difficult. It is better to generate DateTimeOffset values via FsCheck or use a fixed reference date for offsets.
| }); | ||
| } | ||
|
|
||
| [Property(MaxTest = MaxTests)] |
There was a problem hiding this comment.
This test does not use any generated parameters and performs the same check every time. Using [Property(MaxTest = 200)] causes FsCheck to run this identical test 200 times, which is inefficient for CI. Consider using a standard [Fact] for tests that do not require property-based generation. This pattern is also present in other test classes in this PR (e.g., BoardPropertyTests.NameAtExactly100Chars_Succeeds).
[Fact]
public void EmptyUserId_AlwaysThrows()| var userId = Guid.NewGuid(); | ||
| var correlationId = Guid.NewGuid().ToString(); |
There was a problem hiding this comment.
Using Guid.NewGuid() inside the property lambda makes the test non-deterministic and prevents the FsCheck seed from fully capturing the failing state. It is recommended to let FsCheck generate these values by adding them as parameters to the lambda or the property method to ensure full reproducibility.
Tests with no FsCheck-generated parameters gain nothing from running 200 times. Convert EmptyUserId_AlwaysThrows, MarkAsApplied_OnlyFromApproved, MarkAsFailed_OnlyFromApproved, and MarkAsApplied_FromPending_AlwaysThrows to [Fact].
Summary
Closes #89
Implements a property-based and fuzz testing pilot using FsCheck for xUnit, covering core domain entity invariants and application-layer input parsing/validation contracts.
Domain property-based tests (51 tests)
Application fuzz tests (25 tests)
Finding: LlmIntentClassifier negation gap
Fuzz testing surfaced a real classifier gap: gerund verb forms (e.g., "avoid adding new tasks") bypass the negation regex because it only matches bare infinitives (
add,create,move, etc.). The input "avoid adding new tasks" is incorrectly classified as actionable (matchesNewCardPatternon "new tasks"). This is documented in the test comments and could be addressed by extending the negation pattern to support gerund/present-participle forms.Test runtime and flake controls
MaxTest = 200-300to cap generated cases and keep CI fast (total runtime < 3 seconds)Replay = "seed,size"on the[Property]attribute (documented in test class comments)Verification
Test plan
dotnet test backend/Taskdeck.sln -c Release -m:1passes (1775/1775)