TST-38: Add missing concurrency stress tests for presence and webhooks#814
TST-38: Add missing concurrency stress tests for presence and webhooks#814Chris0Jeky merged 4 commits intomainfrom
Conversation
… delivery Adds 4 new scenarios to ConcurrencyRaceConditionStressTests: - Rapid join/leave presence stress with barrier-synchronized connections - Disconnect during edit clears editing state from presence - Concurrent board mutations each create webhook delivery records - Concurrent webhook subscription creation with distinct IDs/secrets
Wrap editorConn in try/finally to ensure disposal even if assertions fail before the explicit DisposeAsync call.
Adversarial Self-ReviewFindings and Actions
No remaining issues to fix. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9d99b125a1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| var cardsResp = await client.GetAsync($"/api/boards/{board.Id}/cards"); | ||
| cardsResp.StatusCode.Should().Be(HttpStatusCode.OK); | ||
| var cards = await cardsResp.Content.ReadFromJsonAsync<List<CardDto>>(); | ||
| var webhookCards = cards!.Where(c => c.Title.StartsWith("Webhook card ")).ToList(); |
There was a problem hiding this comment.
Validate webhook deliveries, not just created cards
WebhookDelivery_ConcurrentBoardMutations_EachCreatesDeliveryRecord claims to verify that each concurrent mutation enqueues a webhook delivery, but this assertion block only checks that cards were created and had unique titles. Because card writes can succeed even if webhook enqueueing fails, this test would still pass when delivery-record creation regresses, so it does not actually protect the scenario it documents. Assert against persisted outbound webhook deliveries (repository/DB or equivalent API) for the board/event and require mutationCount records.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Adds the remaining concurrency/race-condition stress test scenarios requested in #705, focusing on SignalR board presence behavior and outbound webhook creation under concurrent load.
Changes:
- Added two new SignalR presence stress tests for rapid join/leave and disconnect-during-edit scenarios.
- Added two new webhook stress tests for concurrent delivery creation via board mutations and concurrent subscription creation.
- Introduced SignalR client usage and presence snapshot assertions in the existing
ConcurrencyRaceConditionStressTestssuite.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| using var joinBarrier = new SemaphoreSlim(0, connectionCount); | ||
| var joinTasks = users.Select(async user => | ||
| { | ||
| var conn = SignalRTestHelper.CreateBoardsHubConnection(_factory, user.Token); | ||
| conn.On<BoardPresenceSnapshot>("boardPresence", _ => { }); | ||
| await conn.StartAsync(); | ||
| lock (connections) { connections.Add(conn); } | ||
| await joinBarrier.WaitAsync(); | ||
| await conn.InvokeAsync("JoinBoard", board.Id); | ||
| }).ToArray(); | ||
|
|
||
| joinBarrier.Release(connectionCount); | ||
| await Task.WhenAll(joinTasks); | ||
|
|
There was a problem hiding this comment.
The SemaphoreSlim here doesn’t ensure the JoinBoard invocations happen simultaneously: permits can be released before some tasks reach WaitAsync(), so late arrivals pass immediately and calls become staggered. If the goal is a true start barrier, consider a two-phase gate (e.g., signal "ready" after StartAsync(), then await a shared TaskCompletionSource/Barrier to release all tasks at once).
| // Wait for observer to see all joins | ||
| await SignalRTestHelper.WaitForEventsAsync(observerEvents, connectionCount, | ||
| TimeSpan.FromSeconds(10)); | ||
|
|
||
| // All users should now be visible (plus owner) | ||
| var afterJoin = observerEvents.ToList().Last(); | ||
| afterJoin.Members.Should().HaveCount(connectionCount + 1, | ||
| "all joined users plus the observer owner should be present"); | ||
|
|
There was a problem hiding this comment.
This assertion can be flaky because it assumes the last snapshot at the moment WaitForEventsAsync returns already reflects all joins. Since WaitForEventsAsync returns as soon as at least N events are seen, additional snapshots may arrive after ToList() is taken. To make this deterministic, poll until a snapshot meeting the expected member count is observed (within a timeout) rather than relying on Last() immediately after the wait.
| // Wait for leave events | ||
| await SignalRTestHelper.WaitForEventsAsync(observerEvents, leavingCount, | ||
| TimeSpan.FromSeconds(10)); | ||
|
|
||
| // Final snapshot should show only the remaining connections + owner | ||
| var remaining = connectionCount - leavingCount; | ||
| var afterLeave = observerEvents.ToList().Last(); | ||
| afterLeave.Members.Should().HaveCount(remaining + 1, | ||
| $"after {leavingCount} leaves, {remaining} users + owner should remain"); |
There was a problem hiding this comment.
Same flakiness risk as the join check: the first leavingCount presence events may not include the final settled snapshot, so Last() can still show transient membership. Consider waiting/polling until a snapshot shows exactly remaining + 1 members (or until the leaving users are absent) before asserting.
| // Not using 'await using' because we need to control when disposal happens | ||
| // to simulate an abrupt disconnect. The try/finally ensures cleanup on failure. | ||
| var editorConn = SignalRTestHelper.CreateBoardsHubConnection(_factory, editor.Token); | ||
| try | ||
| { | ||
| editorConn.On<BoardPresenceSnapshot>("boardPresence", _ => { }); | ||
| await editorConn.StartAsync(); | ||
| await editorConn.InvokeAsync("JoinBoard", board.Id); | ||
| await SignalRTestHelper.WaitForEventsAsync(observerEvents, 2); // join event | ||
|
|
||
| observerEvents.Clear(); | ||
| await editorConn.InvokeAsync("SetEditingCard", board.Id, card!.Id); | ||
|
|
||
| // Wait for editing presence update | ||
| var editingEvents = await SignalRTestHelper.WaitForEventsAsync(observerEvents, 1); | ||
| var editorMember = editingEvents.Last().Members | ||
| .FirstOrDefault(m => m.UserId == editor.UserId); | ||
| editorMember.Should().NotBeNull("editor should be visible in presence"); | ||
| editorMember!.EditingCardId.Should().Be(card.Id, | ||
| "editor should show as editing the card"); | ||
|
|
There was a problem hiding this comment.
editorConn isn’t declared with await using / using, so if the test fails before the explicit DisposeAsync() call, the SignalR connection may leak and interfere with other tests. Wrap it in await using with a try/finally (and still call DisposeAsync() at the desired point to simulate an abrupt disconnect).
| var board = await ApiTestHarness.CreateBoardAsync(client, "race-webhook-board"); | ||
|
|
||
| var colResp = await client.PostAsJsonAsync( | ||
| $"/api/boards/{board.Id}/columns", | ||
| new CreateColumnDto(board.Id, "Backlog", null, null)); | ||
| colResp.StatusCode.Should().Be(HttpStatusCode.Created); | ||
| var col = await colResp.Content.ReadFromJsonAsync<ColumnDto>(); | ||
|
|
||
| // Create a webhook subscription on this board. | ||
| // NOTE: the endpoint URL is external-looking; delivery will fail at send time | ||
| // (no real server) but the delivery RECORD should be created. | ||
| var webhookResp = await client.PostAsJsonAsync( | ||
| $"/api/boards/{board.Id}/webhooks", | ||
| new CreateOutboundWebhookSubscriptionDto( | ||
| "https://example.com/webhook-receiver", | ||
| new List<string> { "card.*" })); | ||
| webhookResp.StatusCode.Should().Be(HttpStatusCode.Created); | ||
|
|
||
| // Create multiple cards concurrently — each should trigger a webhook delivery | ||
| using var barrier = new SemaphoreSlim(0, mutationCount); | ||
| var statusCodes = new ConcurrentBag<HttpStatusCode>(); | ||
|
|
||
| var mutationTasks = Enumerable.Range(0, mutationCount).Select(async i => | ||
| { | ||
| using var raceClient = _factory.CreateClient(); | ||
| raceClient.DefaultRequestHeaders.Authorization = client.DefaultRequestHeaders.Authorization; | ||
| await barrier.WaitAsync(); | ||
| var resp = await raceClient.PostAsJsonAsync( | ||
| $"/api/boards/{board.Id}/cards", | ||
| new CreateCardDto(board.Id, col!.Id, $"Webhook card {i}", null, null, null)); | ||
| statusCodes.Add(resp.StatusCode); | ||
| }).ToArray(); | ||
|
|
||
| barrier.Release(mutationCount); | ||
| await Task.WhenAll(mutationTasks); | ||
|
|
||
| // All card creations should succeed | ||
| statusCodes.Should().AllSatisfy(s => | ||
| s.Should().Be(HttpStatusCode.Created), | ||
| "all concurrent card creations should succeed"); | ||
|
|
||
| // Verify all cards were created (no duplicates, no losses) | ||
| var cardsResp = await client.GetAsync($"/api/boards/{board.Id}/cards"); | ||
| cardsResp.StatusCode.Should().Be(HttpStatusCode.OK); | ||
| var cards = await cardsResp.Content.ReadFromJsonAsync<List<CardDto>>(); |
There was a problem hiding this comment.
Despite the test name/comment, this currently only verifies that concurrent card creations succeeded and resulted in the expected number of cards; it never asserts that outbound webhook delivery records were created. To cover the intended behavior, capture the created subscription ID from webhookResp, then query IOutboundWebhookDeliveryRepository (via _factory.Services.CreateScope()) and assert there are exactly mutationCount deliveries for that subscription/board/event type.
| ids.Distinct().Should().HaveCount(subscriptionCount, | ||
| "each subscription should have a unique ID"); | ||
|
|
||
| // Signing secrets should be distinct |
There was a problem hiding this comment.
The final verification only checks that the list endpoint returns 200, so it won’t catch cases where some concurrent creations were lost or overwritten. Parse the response and assert the returned subscriptions include subscriptionCount distinct IDs (and ideally match the IDs created above).
There was a problem hiding this comment.
Code Review
This pull request introduces several concurrency and stress tests for SignalR presence, webhook delivery, and webhook subscription creation. The tests cover scenarios such as rapid join/leave actions, abrupt disconnects during editing, and simultaneous mutations. Feedback focuses on strengthening these tests by ensuring they verify the actual side effects (like database records and response content) rather than just the success of the initial requests, and optimizing the use of HttpClient in test loops.
| barrier.Release(mutationCount); | ||
| await Task.WhenAll(mutationTasks); | ||
|
|
||
| // All card creations should succeed | ||
| statusCodes.Should().AllSatisfy(s => | ||
| s.Should().Be(HttpStatusCode.Created), | ||
| "all concurrent card creations should succeed"); | ||
|
|
||
| // Verify all cards were created (no duplicates, no losses) | ||
| var cardsResp = await client.GetAsync($"/api/boards/{board.Id}/cards"); | ||
| cardsResp.StatusCode.Should().Be(HttpStatusCode.OK); |
There was a problem hiding this comment.
The test name WebhookDelivery_ConcurrentBoardMutations_EachCreatesDeliveryRecord implies that it verifies the creation of webhook delivery records. However, the test only verifies the creation of cards. To make the test complete and match its description, you should add an assertion to check that the expected number of OutboundWebhookDelivery records have been created in the database.
You can achieve this by getting the TaskdeckDbContext from the TestWebApplicationFactory's service provider and querying the OutboundWebhookDeliveries table. You may need to add using Microsoft.EntityFrameworkCore; and using Microsoft.Extensions.DependencyInjection; to the file.
// Verify all cards were created (no duplicates, no losses)
var cardsResp = await client.GetAsync($"/api/boards/{board.Id}/cards");
cardsResp.StatusCode.Should().Be(HttpStatusCode.OK);
var cards = await cardsResp.Content.ReadFromJsonAsync<List<CardDto>>();
var webhookCards = cards!.Where(c => c.Title.StartsWith("Webhook card ")).ToList();
webhookCards.Should().HaveCount(mutationCount,
"each concurrent mutation should create exactly one card (no duplicates or losses)");
// Verify all card titles are unique (no duplicate processing)
webhookCards.Select(c => c.Title).Distinct().Should().HaveCount(mutationCount,
"each card title should be unique, proving no duplicate processing");
// Verify that a webhook delivery record was created for each mutation
using var scope = _factory.Services.CreateScope();
var dbContext = scope.ServiceProvider.GetRequiredService<Taskdeck.Infrastructure.Persistence.TaskdeckDbContext>();
await ApiTestHarness.PollUntilAsync(
() => dbContext.OutboundWebhookDeliveries.CountAsync(d => d.BoardId == board.Id),
count => count == mutationCount,
"waiting for all webhook deliveries to be created");| var users = new List<TestUserContext>(); | ||
| for (var i = 0; i < connectionCount; i++) | ||
| { | ||
| using var c = _factory.CreateClient(); | ||
| var u = await ApiTestHarness.AuthenticateAsync(c, $"race-rapid-{i}"); | ||
| var grant = await ownerClient.PostAsJsonAsync( | ||
| $"/api/boards/{board.Id}/access", | ||
| new GrantAccessDto(board.Id, u.UserId, UserRole.Editor)); | ||
| grant.StatusCode.Should().Be(HttpStatusCode.OK); | ||
| users.Add(u); | ||
| } |
There was a problem hiding this comment.
For improved performance and to follow best practices, it's better to create the HttpClient instance once outside the loop instead of creating and disposing it in each iteration. While not critical for a small loop, it's a good habit to maintain.
var users = new List<TestUserContext>();
using var userClient = _factory.CreateClient();
for (var i = 0; i < connectionCount; i++)
{
var u = await ApiTestHarness.AuthenticateAsync(userClient, $"race-rapid-{i}");
var grant = await ownerClient.PostAsJsonAsync(
$"/api/boards/{board.Id}/access",
new GrantAccessDto(board.Id, u.UserId, UserRole.Editor));
grant.StatusCode.Should().Be(HttpStatusCode.OK);
users.Add(u);
}| "each subscription should have a unique ID"); | ||
|
|
There was a problem hiding this comment.
The verification using the list endpoint is a good final check, but it currently only asserts that the request succeeds. To make the test more robust, you should also verify the content of the response, ensuring that the correct number of subscriptions were returned and that their IDs match the ones created during the test.
var listResp = await client.GetAsync($"/api/boards/{board.Id}/webhooks");
listResp.StatusCode.Should().Be(HttpStatusCode.OK);
var listedSubs = await listResp.Content.ReadFromJsonAsync<List<OutboundWebhookSubscriptionDto>>();
listedSubs.Should().HaveCount(subscriptionCount, "the list endpoint should return all created subscriptions");
listedSubs!.Select(s => s.Id).Should().BeEquivalentTo(ids, "the IDs from the list endpoint should match the created IDs");Address all bot review comments and adversarial review findings: 1. Replace SemaphoreSlim barriers with Barrier class for true synchronization in Scenarios 14, 16, 17. Barrier guarantees all participants arrive before any proceed, unlike SemaphoreSlim which can release permits before all tasks reach WaitAsync(). 2. Add WaitForPresenceCountAsync polling helper to avoid flaky assertions. Instead of checking .Last() snapshot (which may not reflect settled state), poll until a snapshot with the expected member count appears. 3. Wrap editorConn in await using for Scenario 15 to prevent connection leaks on assertion failure. The abrupt disconnect is simulated by explicit DisposeAsync() before the final assertions. 4. Add actual webhook delivery record verification in Scenario 16. The test name says "EachCreatesDeliveryRecord" but previously only verified cards were created. Now queries the delivery repository via DI to confirm delivery records exist with distinct IDs. 5. Parse and verify list endpoint response in Scenario 17. Previously only checked HTTP 200 status. Now deserializes the response, checks subscription count, verifies distinct IDs, and cross-checks that all concurrently-created subscription IDs appear in the list. 6. Reuse single HttpClient for user setup loop in Scenario 14 instead of creating one per iteration.
Adversarial Review Round 2Bot findings addressedAll 7 bot review comments have been investigated and fixed:
Fresh adversarial findingsNo additional issues found beyond what the bots caught. The existing test patterns are sound:
Changes madeSingle commit:
Test resultsAll 17 concurrency stress tests pass (2m 56s total): |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3edf94c71e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| deliveries.Should().HaveCountGreaterThanOrEqualTo(mutationCount, | ||
| $"each of the {mutationCount} card mutations should have created a webhook delivery record"); |
There was a problem hiding this comment.
Require exact delivery count in webhook stress test
This check allows deliveries.Count > mutationCount, so the test still passes if concurrent card creation accidentally enqueues duplicate webhook deliveries. In that failure mode, the scenario is regressed but the test remains green, because HaveCountGreaterThanOrEqualTo only guards against losses. Since this test creates a fresh board/subscription and performs exactly mutationCount mutations, the expected delivery-record count should be exact to catch both dropped and duplicated enqueues.
Useful? React with 👍 / 👎.
Summary
ConcurrencyRaceConditionStressTests, completing the coverage specified in TST-38: Concurrency and race condition stress tests #705:Task.WhenAllwithSemaphoreSlimbarriers for truly simultaneous executionCloses #705
Test plan
dotnet test --filter ConcurrencyRaceConditionStressTestsSignalRTestHelperandEventCollectorpatterns