
[DEV-1440] M2: Daemon RunEvalCommand handler#15

Merged
alexeyzimarev merged 2 commits into main from
alexeyzimarev/dev-1440-m2-daemon-eval-runner
Apr 13, 2026
@alexeyzimarev
Member

Summary

Daemon side of the dashboard-driven eval pipeline (DEV-1440 milestone 2). Pairs with the server M3 endpoint in kurrent-io/Kurrent.Capacitor#477 and depends on the M1 shared eval library in #14.

Changes

  • Wire types in `Models.cs` match server's `DaemonCommands.cs`: `RunEvalCommand` (server → daemon dispatch) plus the four daemon → server progress events (`EvalStarted`, `EvalQuestionCompleted`, `EvalFinished`, `EvalFailed`). Registered in `KapacitorJsonContext`.

  • `ServerConnection` registers a `RunEval` handler and exposes per-event send methods (`EvalStartedAsync` etc.) mirroring the existing `AgentRegisteredAsync` / `LaunchFailedAsync` pattern.

  • `EvalRunner` singleton subscribes to `OnRunEval`. Each incoming command spawns a fire-and-forget Task that builds an authenticated HttpClient, instantiates a `DaemonEvalObserver` bound to the run, and drives `EvalService.RunAsync`. Unhandled exceptions are caught and translated to an `EvalFailed` relay so the dashboard learns about daemon-side failures rather than waiting forever.

  • `DaemonEvalObserver` maps the `IEvalObserver` surface from M1 to SignalR sends. Info / per-question-start / per-question-failure / fact-retained callbacks just log locally — not interesting enough to justify per-judge SignalR chatter.

  • DI wiring: `AddSingleton` plus an explicit `GetRequiredService` at startup so the constructor's subscription happens before the daemon starts taking traffic.

Test plan

  • `dotnet build src/kapacitor/kapacitor.csproj` — clean
  • `dotnet publish -c Release` — zero IL3050/IL2026 (AOT-clean)
  • Full unit suite — 205/205 pass
  • CI
  • End-to-end once #14 (M1, the shared eval library) and Kurrent.Capacitor#477 (M3) land. At that point `POST /api/sessions/{id}/evals/run` on the server should drive the daemon, with progress flowing back via SignalR

Branch base

This PR is based on the M1 branch (`alexeyzimarev/dev-1440-m1-shared-eval-library`) since it consumes the extracted `EvalService` and `IEvalObserver`. After M1 merges to main and its branch is deleted, GitHub should retarget this PR to main automatically.

🤖 Generated with Claude Code

@linear

linear bot commented Apr 13, 2026

@qodo-code-review

Review Summary by Qodo

DEV-1440 M2: Daemon eval runner with SignalR progress relay

✨ Enhancement

Walkthroughs

Description
• Extracted shared eval library from CLI into reusable EvalService and IEvalObserver interfaces
• Implemented daemon-side eval orchestration with EvalRunner and DaemonEvalObserver for SignalR
  progress relay
• Added wire types for eval dispatch (RunEvalCommand) and four progress events (EvalStarted,
  EvalQuestionCompleted, EvalFinished, EvalFailed)
• Registered eval command handlers in ServerConnection and DI wiring in DaemonRunner
Diagram
flowchart LR
  Server["Server<br/>Dashboard"] -->|RunEvalCommand| Daemon["Daemon<br/>EvalRunner"]
  Daemon -->|EvalService.RunAsync| EvalSvc["EvalService<br/>Orchestration"]
  EvalSvc -->|IEvalObserver| Observer["DaemonEvalObserver"]
  Observer -->|EvalStarted<br/>EvalQuestionCompleted<br/>EvalFinished<br/>EvalFailed| SignalR["SignalR<br/>Connection"]
  SignalR -->|Progress Events| Server

File Changes

1. src/kapacitor/Commands/EvalCommand.cs Refactoring +47/-395
   Refactored to thin CLI adapter over EvalService

2. src/kapacitor/Eval/EvalService.cs ✨ Enhancement +410/-0
   Extracted core eval orchestration pipeline

3. src/kapacitor/Eval/IEvalObserver.cs ✨ Enhancement +45/-0
   New observer interface for progress callbacks

4. src/kapacitor/Eval/EvalQuestions.cs ✨ Enhancement +51/-0
   Extracted canonical question taxonomy

5. src/kapacitor/Daemon/Services/EvalRunner.cs ✨ Enhancement +127/-0
   New daemon eval handler with fire-and-forget dispatch

6. src/kapacitor/Daemon/Services/ServerConnection.cs ✨ Enhancement +16/-0
   Added RunEval handler and eval progress send methods

7. src/kapacitor/Daemon/DaemonRunner.cs ⚙️ Configuration changes +5/-0
   Registered EvalRunner singleton and subscription

8. src/kapacitor/Models.cs ✨ Enhancement +51/-0
   Added eval wire types and JSON serialization

9. test/kapacitor.Tests.Unit/EvalServiceTests.cs 🧪 Tests +25/-25
   Renamed and retargeted tests to EvalService namespace


@qodo-code-review

qodo-code-review bot commented Apr 13, 2026

Code Review by Qodo

🐞 Bugs (1)   📘 Rule violations (0)   📎 Requirement gaps (0)
🐞 ☼ Reliability (1)


Action required

1. EvalRunId mismatch 🐞
Description
DaemonEvalObserver sends EvalStarted with the EvalService-generated runId but sends subsequent
progress events with the dispatched cmd.EvalRunId, so the server/dashboard can’t reliably correlate
a single run’s lifecycle. This can also desynchronize what the server persisted
(EvalService-generated id) vs what the dashboard requested (command id).
Code

src/kapacitor/Daemon/Services/EvalRunner.cs[R83-100]

+    public void OnStarted(string runId, string contextSessionId, string judgeModel, int totalQuestions) {
+        logger.LogInformation("Eval {Run} started on session {Sid} (model {Model}, {Count} questions)", runId, sessionId, judgeModel, totalQuestions);
+        Relay(() => connection.EvalStartedAsync(runId, sessionId, judgeModel, totalQuestions), "EvalStarted");
+    }
+
+    public void OnContextFetched(int traceEntries, int traceChars, int toolResultsTotal, int toolResultsTruncated, long bytesSaved) =>
+        logger.LogDebug("Eval {Run} context fetched: {Entries} entries, {Chars} chars", evalRunId, traceEntries, traceChars);
+
+    public void OnQuestionStarted(int index, int total, string category, string questionId) =>
+        logger.LogDebug("[eval {Run}] [{Index}/{Total}] {Category}/{Question} started", evalRunId, index, total, category, questionId);
+
+    public void OnQuestionCompleted(int index, int total, EvalQuestionVerdict verdict, long inputTokens, long outputTokens) {
+        logger.LogInformation(
+            "[eval {Run}] [{Index}/{Total}] {Question} -> {Score} ({Verdict})",
+            evalRunId, index, total, verdict.QuestionId, verdict.Score, verdict.Verdict
+        );
+        Relay(() => connection.EvalQuestionCompletedAsync(evalRunId, sessionId, index, total, verdict.Category, verdict.QuestionId, verdict.Score, verdict.Verdict), "EvalQuestionCompleted");
+    }
Evidence
The server dispatch includes an EvalRunId in RunEvalCommand; EvalRunner passes that into
DaemonEvalObserver, but EvalService generates a new GUID and passes it to OnStarted.
DaemonEvalObserver relays EvalStarted using the OnStarted argument (service-generated), while
EvalQuestionCompleted/EvalFinished/EvalFailed use the captured evalRunId (command-provided),
creating two different ids within one run’s event stream.

src/kapacitor/Models.cs[463-470]
src/kapacitor/Daemon/Services/EvalRunner.cs[40-51]
src/kapacitor/Eval/EvalService.cs[39-40]
src/kapacitor/Eval/EvalService.cs[96-97]
src/kapacitor/Daemon/Services/EvalRunner.cs[83-86]
src/kapacitor/Daemon/Services/EvalRunner.cs[94-100]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The daemon run correlation id (`RunEvalCommand.EvalRunId`) is not used consistently: `EvalService` generates its own `evalRunId`, and `DaemonEvalObserver` relays a mix of service-generated and command-provided ids.

### Issue Context
The server/dashboard dispatches `RunEvalCommand(EvalRunId, ...)` and expects all progress events and the persisted aggregate to refer to that same eval run id.

### Fix Focus Areas
- src/kapacitor/Eval/EvalService.cs[29-41]
- src/kapacitor/Eval/EvalService.cs[96-97]
- src/kapacitor/Daemon/Services/EvalRunner.cs[40-51]
- src/kapacitor/Daemon/Services/EvalRunner.cs[83-100]

### Suggested fix
- Add an optional `string? evalRunIdOverride` (or required `string evalRunId`) parameter to `EvalService.RunAsync`.
- Use that value instead of `Guid.NewGuid()` when provided.
- In the daemon, pass `cmd.EvalRunId` into `EvalService.RunAsync`.
- Ensure `DaemonEvalObserver.OnStarted` relays the same run id used for all other events (ideally the dispatched id).
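
The id-selection step of the suggested fix can be sketched as follows. This is a minimal illustration, not the actual change: `EvalRunIds.Resolve` is a hypothetical helper, and formatting the minted GUID with `ToString()` is an assumption, since the real `EvalService.RunAsync` signature and id format are not shown in this review.

```csharp
using System;

// Hypothetical helper (not from the PR): prefer the caller-supplied run id,
// and mint a fresh one only when none was provided.
static class EvalRunIds {
    public static string Resolve(string? evalRunIdOverride) =>
        string.IsNullOrWhiteSpace(evalRunIdOverride)
            ? Guid.NewGuid().ToString()   // CLI path: no id dispatched (format is an assumption)
            : evalRunIdOverride;          // daemon path: reuse the server's id
}

static class Demo {
    static void Main() {
        // Daemon path: the dispatched id is used for every event in the run.
        Console.WriteLine(EvalRunIds.Resolve("run-from-server"));
        // CLI path: a new id is generated.
        Console.WriteLine(EvalRunIds.Resolve(null));
    }
}
```

With a single resolved id passed to the observer, `EvalStarted` and all later events would carry the same correlation id.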




Remediation recommended

2. Judge-facts ignore cancellation 🐞
Description
EvalService propagates CancellationToken for the main eval-context GET and evals POST, but
judge-facts GET/POST and content reads don’t accept/use the token. Cancellation/shutdown can be
delayed while those calls block/retry.
Code

src/kapacitor/Eval/EvalService.cs[R346-372]

+    static async Task<Dictionary<string, List<JudgeFact>>> FetchAllJudgeFactsAsync(
+            HttpClient    httpClient,
+            string        baseUrl,
+            string        encodedSessionId,
+            IEvalObserver observer
+        ) {
+        var result = new Dictionary<string, List<JudgeFact>>();
+
+        foreach (var category in EvalQuestions.Categories) {
+            try {
+                using var resp = await httpClient.GetWithRetryAsync(
+                    $"{baseUrl}/api/sessions/{encodedSessionId}/judge-facts?category={Uri.EscapeDataString(category)}"
+                );
+                if (!resp.IsSuccessStatusCode) {
+                    observer.OnInfo($"Failed to fetch judge facts for {category}: HTTP {(int)resp.StatusCode}");
+
+                    continue;
+                }
+
+                var json = await resp.Content.ReadAsStringAsync();
+                var list = JsonSerializer.Deserialize(json, KapacitorJsonContext.Default.ListJudgeFact) ?? [];
+                result[category] = list;
+                observer.OnInfo($"Loaded {list.Count} retained facts for category {category}");
+            } catch (HttpRequestException ex) {
+                observer.OnInfo($"Could not load judge facts for {category}: {ex.Message}");
+            }
+        }
Evidence
HttpClientExtensions.GetWithRetryAsync/PostWithRetryAsync support CancellationToken, but
EvalService’s judge-facts helpers don’t take/pass ct and also call ReadAsStringAsync() without ct.

src/kapacitor/HttpClientExtensions.cs[80-91]
src/kapacitor/Eval/EvalService.cs[346-372]
src/kapacitor/Eval/EvalService.cs[395-407]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
CancellationToken is not propagated to judge-facts GET/POST calls, so eval cancellation may not stop promptly.

### Issue Context
`EvalService.RunAsync` already accepts `CancellationToken ct` and uses it for other HTTP calls.

### Fix Focus Areas
- src/kapacitor/Eval/EvalService.cs[100-101]
- src/kapacitor/Eval/EvalService.cs[346-375]
- src/kapacitor/Eval/EvalService.cs[377-409]

### Suggested fix
- Add `CancellationToken ct = default` parameter to `FetchAllJudgeFactsAsync` and `PostJudgeFactAsync`.
- Pass `ct` through to `GetWithRetryAsync`/`PostWithRetryAsync` and `ReadAsStringAsync(ct)`.
- Pass the outer `RunAsync` ct into these helpers.
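
A rough sketch of what token propagation buys, under stated assumptions: `JudgeFactsSketch.FetchAsync` is a simplified, hypothetical stand-in for `FetchAllJudgeFactsAsync`, it uses the plain `HttpClient.GetAsync(string, CancellationToken)` overload rather than the project's `GetWithRetryAsync`, and `SlowHandler` is a stub so the example runs without a network.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Stub handler so the sketch runs offline; it simulates a slow server
// and honors the cancellation token it is given.
sealed class SlowHandler : HttpMessageHandler {
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken ct) {
        await Task.Delay(TimeSpan.FromSeconds(30), ct); // throws once ct is cancelled
        return new HttpResponseMessage(HttpStatusCode.OK) { Content = new StringContent("[]") };
    }
}

static class JudgeFactsSketch {
    // Hypothetical simplified fetch: the token flows into both the request
    // and the body read, so neither can outlive a cancelled run.
    public static async Task<string> FetchAsync(
        HttpClient http, string url, CancellationToken ct = default) {
        using var resp = await http.GetAsync(url, ct);
        return await resp.Content.ReadAsStringAsync(ct);
    }

    static async Task Main() {
        using var http = new HttpClient(new SlowHandler());
        using var cts  = new CancellationTokenSource(TimeSpan.FromMilliseconds(100));

        try {
            await FetchAsync(http, "http://localhost/api/sessions/s1/judge-facts?category=demo", cts.Token);
        } catch (OperationCanceledException) {
            Console.WriteLine("fetch cancelled without waiting for the slow response");
        }
    }
}
```

Without the `ct` arguments, the same call would wait for the full simulated delay (or the client timeout) before returning.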



3. Out-of-order progress events 🐞
Description
DaemonEvalObserver relays SignalR progress events via Task.Run without sequencing, so
EvalStarted/EvalQuestionCompleted/EvalFinished can arrive out of order. This can break
server/dashboard state tracking and produce inconsistent progress UI.
Code

src/kapacitor/Daemon/Services/EvalRunner.cs[R118-126]

+    void Relay(Func<Task> send, string eventName) {
+        _ = Task.Run(async () => {
+            try {
+                await send();
+            } catch (Exception ex) {
+                logger.LogWarning(ex, "Failed to relay {Event} for eval {Run}", eventName, evalRunId);
+            }
+        });
+    }
Evidence
Each Relay call spawns a separate background task and does not await prior sends; multiple progress
callbacks can run quickly back-to-back, so concurrent SendAsync calls may interleave and reorder
in-flight operations.

src/kapacitor/Daemon/Services/EvalRunner.cs[83-86]
src/kapacitor/Daemon/Services/EvalRunner.cs[94-100]
src/kapacitor/Daemon/Services/EvalRunner.cs[108-116]
src/kapacitor/Daemon/Services/EvalRunner.cs[118-126]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Daemon eval progress relays are fire-and-forget and concurrent. This can cause out-of-order delivery of progress events to the server.

### Issue Context
The dashboard/server typically assumes an ordered lifecycle: started -> question completions -> finished/failed.

### Fix Focus Areas
- src/kapacitor/Daemon/Services/EvalRunner.cs[74-126]

### Suggested fix
- Replace per-event `Task.Run` with a single serialized queue (e.g., `Channel<Func<Task>>`) processed by one background loop.
- Alternatively, gate sends with a `SemaphoreSlim(1,1)` and `await` inside the background relay (still non-blocking for EvalService since the observer remains sync).
- Ensure `EvalStarted` is enqueued before any question events and `EvalFinished/EvalFailed` are last.
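
The first option, a single serialized queue, might look like the sketch below. `OrderedRelay` is a hypothetical class, not code from this PR. A channel with one reader processes items in the order they were written; by contrast, the .NET documentation for `SemaphoreSlim` states that there is no guaranteed order in which waiters enter the semaphore, and separately scheduled `Task.Run` bodies may reach the gate in any order, so the semaphore option prevents overlapping sends but may not by itself pin down the order.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical ordered relay: one background reader drains queued sends
// strictly in the order they were enqueued.
sealed class OrderedRelay {
    readonly Channel<Func<Task>> _queue = Channel.CreateUnbounded<Func<Task>>(
        new UnboundedChannelOptions { SingleReader = true });
    readonly Task _pump;

    public OrderedRelay() => _pump = Task.Run(PumpAsync);

    // Synchronous and non-blocking, so a sync observer callback can call it.
    public void Enqueue(Func<Task> send) => _queue.Writer.TryWrite(send);

    // Stop accepting work and wait until everything queued has been attempted.
    public Task CompleteAsync() {
        _queue.Writer.TryComplete();
        return _pump;
    }

    async Task PumpAsync() {
        await foreach (var send in _queue.Reader.ReadAllAsync()) {
            try {
                await send(); // the next send starts only after this one ends
            } catch (Exception ex) {
                Console.Error.WriteLine($"relay failed: {ex.Message}");
            }
        }
    }
}

static class Demo {
    static async Task Main() {
        var relay = new OrderedRelay();
        var seen  = new List<string>();

        foreach (var name in new[] { "EvalStarted", "EvalQuestionCompleted", "EvalFinished" }) {
            relay.Enqueue(async () => { await Task.Delay(10); seen.Add(name); });
        }

        await relay.CompleteAsync();
        Console.WriteLine(string.Join(" -> ", seen));
    }
}
```

Calling `CompleteAsync` at the end of a run also gives the caller a point at which all queued events have been attempted, which is useful before disposing the connection.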



4. Observer exceptions unhandled 🐞
Description
EvalService calls IEvalObserver callbacks without try/catch, despite the IEvalObserver contract
stating observer exceptions are caught and logged. A throwing observer can crash the eval
orchestration and prevent a terminal OnFailed/OnFinished signal.
Code

src/kapacitor/Eval/EvalService.cs[R89-97]

+        observer.OnContextFetched(
+            context.Trace.Count,
+            traceJson.Length,
+            context.Compaction.ToolResultsTotal,
+            context.Compaction.ToolResultsTruncated,
+            context.Compaction.BytesSaved
+        );
+        observer.OnStarted(evalRunId, context.SessionId, model, EvalQuestions.All.Length);
+
Evidence
IEvalObserver explicitly documents that observer exceptions are caught and logged by the service,
but EvalService invokes observer methods directly (e.g., OnContextFetched/OnStarted) without
guarding, so exceptions will propagate out of RunAsync.

src/kapacitor/Eval/IEvalObserver.cs[11-16]
src/kapacitor/Eval/EvalService.cs[89-97]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`EvalService.RunAsync` invokes `IEvalObserver` callbacks directly. If an observer throws, the eval run can abort unexpectedly and violate the observer contract.

### Issue Context
`IEvalObserver` docstring states the service catches/logs observer exceptions and continues.

### Fix Focus Areas
- src/kapacitor/Eval/EvalService.cs[89-180]
- src/kapacitor/Eval/IEvalObserver.cs[11-16]

### Suggested fix
- Introduce a small helper like `SafeNotify(Action call)` / `SafeNotify(string name, Action call)`.
- Wrap every `observer.*` invocation in try/catch and log (or at least swallow) exceptions so the eval can proceed / fail gracefully.
- Ensure `OnFailed`/`OnFinished` are still attempted even if earlier notifications threw.
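
One possible shape for the `SafeNotify` helper is sketched here. The class name `ObserverGuard`, the optional `log` parameter, and the message format are assumptions made for illustration; the real service would presumably log through its own logger.

```csharp
using System;

static class ObserverGuard {
    // Hypothetical helper: invoke one observer callback and report, rather
    // than propagate, anything it throws.
    public static void SafeNotify(string name, Action call, Action<string>? log = null) {
        try {
            call();
        } catch (Exception ex) {
            var message = $"observer callback {name} threw: {ex.Message}";
            if (log is null) Console.Error.WriteLine(message);
            else log(message);
        }
    }
}

static class Demo {
    static void Main() {
        // A throwing callback no longer aborts the run...
        ObserverGuard.SafeNotify("OnStarted", () => throw new InvalidOperationException("boom"));
        // ...so the terminal notification is still attempted afterwards.
        ObserverGuard.SafeNotify("OnFinished", () => Console.WriteLine("OnFinished delivered"));
    }
}
```

Wrapping each `observer.*` call this way keeps the documented contract (observer exceptions are caught and logged) true at every call site.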



5. Daemon evals not cancellable 🐞
Description
EvalRunner starts eval execution via fire-and-forget Task.Run without any linkage to daemon shutdown
(ApplicationStopping) or ServerConnection cancellation. In-flight evals can be abruptly terminated
on process exit or keep running while the SignalR connection is disposing, leading to missing final
progress events.
Code

src/kapacitor/Daemon/Services/EvalRunner.cs[R34-61]

+        // Fire-and-forget so the SignalR hub invocation doesn't block; the
+        // eval's own observer callbacks fan progress back to the server.
+        // Any unhandled exception from the run is caught and translated to
+        // an EvalFailed event so the dashboard learns about it.
+        _ = Task.Run(async () => {
+            try {
+                using var httpClient = await HttpClientExtensions.CreateAuthenticatedClientAsync();
+                var observer = new DaemonEvalObserver(_connection, cmd.EvalRunId, cmd.SessionId, _logger);
+
+                await EvalService.RunAsync(
+                    _baseUrl,
+                    httpClient,
+                    cmd.SessionId,
+                    cmd.Model,
+                    cmd.Chain,
+                    cmd.ThresholdBytes,
+                    observer
+                );
+            } catch (Exception ex) {
+                _logger.LogError(ex, "Unhandled exception running eval {RunId} on session {Sid}", cmd.EvalRunId, cmd.SessionId);
+
+                try {
+                    await _connection.EvalFailedAsync(cmd.EvalRunId, cmd.SessionId, $"daemon error: {ex.GetType().Name}");
+                } catch (Exception relayEx) {
+                    _logger.LogError(relayEx, "Failed to relay EvalFailed for eval {RunId}", cmd.EvalRunId);
+                }
+            }
+        });
Evidence
DaemonRunner connects ServerConnection with lifetime.ApplicationStopping, but EvalRunner doesn’t
receive or use that token; it spawns background tasks that run independently and calls
EvalService.RunAsync without a CancellationToken.

src/kapacitor/Daemon/DaemonRunner.cs[131-134]
src/kapacitor/Daemon/Services/EvalRunner.cs[34-51]
src/kapacitor/Eval/EvalService.cs[29-38]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Daemon-side eval runs are spawned in background tasks with no cancellation/cleanup integration, so shutdown can drop in-flight evals and/or prevent final progress relays.

### Issue Context
The daemon already has a lifecycle CancellationToken (`IHostApplicationLifetime.ApplicationStopping`) used for ServerConnection.

### Fix Focus Areas
- src/kapacitor/Daemon/DaemonRunner.cs[131-134]
- src/kapacitor/Daemon/Services/EvalRunner.cs[28-62]
- src/kapacitor/Eval/EvalService.cs[29-38]

### Suggested fix
- Inject `IHostApplicationLifetime` (or a CancellationToken via DI) into `EvalRunner`.
- Pass `ApplicationStopping` as the `ct` argument to `EvalService.RunAsync`.
- Optionally track spawned tasks and await them during shutdown (with timeout) so the daemon can emit EvalFailed/EvalFinished deterministically.
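
The optional tracking step could be sketched as below. `TrackedRuns` is a hypothetical class, and a plain `CancellationTokenSource` stands in for `IHostApplicationLifetime.ApplicationStopping` so the example needs no hosting packages.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical tracker: background runs receive the stop token and are
// awaited (with a timeout) during shutdown.
sealed class TrackedRuns {
    readonly ConcurrentDictionary<Guid, Task> _inFlight = new();
    readonly CancellationToken _stopping;

    public TrackedRuns(CancellationToken stopping) => _stopping = stopping;

    public void Start(Func<CancellationToken, Task> run) {
        var id   = Guid.NewGuid();
        var task = Task.Run(() => run(_stopping));
        _inFlight[id] = task;
        // Forget the run once it has completed, whatever the outcome.
        _ = task.ContinueWith(t => _inFlight.TryRemove(id, out _), TaskScheduler.Default);
    }

    // Wait for in-flight runs to observe cancellation, but never longer than timeout.
    public async Task DrainAsync(TimeSpan timeout) {
        var all = Task.WhenAll(_inFlight.Values);
        await Task.WhenAny(all, Task.Delay(timeout));
    }
}

static class Demo {
    static async Task Main() {
        using var stopping = new CancellationTokenSource(); // stands in for ApplicationStopping
        var runs = new TrackedRuns(stopping.Token);

        runs.Start(async ct => {
            try {
                await Task.Delay(TimeSpan.FromSeconds(30), ct); // pretend eval work
            } catch (OperationCanceledException) {
                Console.WriteLine("EvalFailed: cancelled"); // terminal event still emitted
            }
        });

        stopping.Cancel();                              // host begins shutting down
        await runs.DrainAsync(TimeSpan.FromSeconds(5)); // give runs time to report
    }
}
```

The timeout in `DrainAsync` is a design choice: it bounds shutdown time while still giving each run a chance to emit its terminal event.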



Comment thread src/kapacitor/Daemon/Services/EvalRunner.cs
@alexeyzimarev force-pushed the alexeyzimarev/dev-1440-m2-daemon-eval-runner branch from db791ed to 1e59042 on April 13, 2026 15:00
alexeyzimarev added a commit that referenced this pull request Apr 13, 2026
Three findings on PR #15 (the other two — observer-throw guard and
judge-fact cancellation propagation — were already addressed by the
M1 follow-up in 1f655f4):

1. EvalRunId mismatch (Action required) — server dispatches
   RunEvalCommand with an EvalRunId, but EvalService generated its own
   GUID, leading to two different ids in one run's event stream
   (EvalStarted used the service-generated id; subsequent question /
   finished / failed events used the dispatched id captured in
   DaemonEvalObserver). Fixed by adding an optional `evalRunId`
   parameter to EvalService.RunAsync; CLI passes null (mints a fresh
   id, current behaviour) and the daemon passes cmd.EvalRunId so the
   whole run, including the persisted SessionEvalCompleted aggregate,
   shares one correlation id end-to-end.

2. Out-of-order progress events (Recommended) — DaemonEvalObserver's
   per-event Task.Run can interleave concurrent SignalR sends. Added a
   SemaphoreSlim(1,1) gate inside Relay so the background sends drain
   in their enqueue order — the dashboard sees EvalStarted before any
   question completion, and EvalFinished/EvalFailed last, deterministically.

3. Daemon evals not cancellable on shutdown (Recommended) — EvalRunner
   spawned Task.Run with no link to the host lifecycle. Now injects
   IHostApplicationLifetime, captures ApplicationStopping, and passes
   it as ct to EvalService.RunAsync. M1's outer try/catch turns
   in-flight cancellation into a clean OnFailed("cancelled") relay so
   the dashboard learns the eval stopped instead of waiting forever.

Full suite 205/205, AOT publish clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
alexeyzimarev and others added 2 commits April 13, 2026 17:06
Daemon side of the dashboard-driven eval pipeline. Pairs with the server
M3 endpoint in kurrent-io/Kurrent.Capacitor#477 and depends on the M1
shared eval library in #14.

- New SignalR wire types in Models.cs match the server's DaemonCommands.cs:
  RunEvalCommand (server -> daemon dispatch) plus the four daemon -> server
  progress events (EvalStarted, EvalQuestionCompleted, EvalFinished,
  EvalFailed). Registered in KapacitorJsonContext for source-gen
  serialization.

- ServerConnection registers a "RunEval" handler and exposes per-event
  send methods (EvalStartedAsync etc.) that mirror the existing
  AgentRegisteredAsync / LaunchFailedAsync pattern.

- New EvalRunner singleton subscribes to OnRunEval. Each incoming
  command spawns a fire-and-forget Task that builds an authenticated
  HttpClient, instantiates a DaemonEvalObserver bound to the run, and
  drives EvalService.RunAsync. Unhandled exceptions are caught and
  translated to an EvalFailed relay so the dashboard learns about
  daemon-side failures rather than waiting forever.

- DaemonEvalObserver maps the IEvalObserver surface to SignalR sends:
  OnStarted -> EvalStartedAsync, OnQuestionCompleted ->
  EvalQuestionCompletedAsync, OnFinished -> EvalFinishedAsync, OnFailed
  -> EvalFailedAsync. Info / per-question-start / per-question-failure /
  fact-retained callbacks just log locally — they're not interesting
  enough to justify SignalR chatter for every judge.

- Wired into DaemonRunner DI: AddSingleton<EvalRunner> + an explicit
  GetRequiredService at startup so the constructor's OnRunEval
  subscription happens before the host starts taking traffic.

Full suite 205/205, AOT publish clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alexeyzimarev force-pushed the alexeyzimarev/dev-1440-m2-daemon-eval-runner branch from ee5bc16 to 2866c00 on April 13, 2026 15:07
@alexeyzimarev merged commit fc1865d into main on Apr 13, 2026
3 checks passed
@alexeyzimarev deleted the alexeyzimarev/dev-1440-m2-daemon-eval-runner branch on April 13, 2026 15:07