-
-
Notifications
You must be signed in to change notification settings - Fork 918
feat(engine): Batch trigger reloaded #2779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: e78b619 The changes in this PR will be included in the next version bump. This PR includes changesets to release 26 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds a two‑phase batch system: Phase 1 API to create batches and Phase 2 NDJSON streaming to submit items. Introduces a Redis‑backed BatchQueue with DRR scheduling, FairQueue primitives (master/worker/visibility/concurrency), completion tracking, payload offload to object storage, per‑organization limits, and RunEngine integration for processing and callbacks. Persists batch state and item errors via Prisma migrations and schema updates. Extends web UI, API routes/presenters, SDK/client streaming surfaces, observability (OTel/Prometheus/Grafana), and many tests for queueing, scheduling, visibility, and streaming logic. Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the final PR Bugbot will review for you during this billing cycle
Your free Bugbot reviews will reset on January 3
Details
You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/trigger-sdk/src/v3/batch.ts (1)
26-46: Update JSDoc return type to matchRetrieveBatchV2Response
retrieveBatchnow correctly returnsApiPromise<RetrieveBatchV2Response>, but the JSDoc still documentsRetrieveBatchResponse. This can mislead consumers and tooling that read JSDoc.Consider updating the comment:
- * @returns {ApiPromise<RetrieveBatchResponse>} A promise that resolves with the batch details + * @returns {ApiPromise<RetrieveBatchV2Response>} A promise that resolves with the batch details
🧹 Nitpick comments (48)
apps/webapp/seed.mts (1)
113-156: Consider dynamically building the tenants array.The tenants array construction is repetitive. A refactored approach would reduce duplication and make it easier to maintain when adding/removing orgs or projects.
- console.log("tenants.json"); - console.log( - JSON.stringify({ - apiUrl: "http://localhost:3030", - tenants: [ - { - id: org1Project1.project.externalRef, - secretKey: org1Project1.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - { - id: org1Project2.project.externalRef, - secretKey: org1Project2.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - { - id: org1Project3.project.externalRef, - secretKey: org1Project3.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - { - id: org2Project1.project.externalRef, - secretKey: org2Project1.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - { - id: org2Project2.project.externalRef, - secretKey: org2Project2.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - { - id: org2Project3.project.externalRef, - secretKey: org2Project3.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - { - id: org3Project1.project.externalRef, - secretKey: org3Project1.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - { - id: org3Project2.project.externalRef, - secretKey: org3Project2.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - { - id: org3Project3.project.externalRef, - secretKey: org3Project3.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - ], - }) - ); + const allProjects = [ + org1Project1, org1Project2, org1Project3, + org2Project1, org2Project2, org2Project3, + org3Project1, org3Project2, org3Project3, + ]; + + const tenants = allProjects.map((p) => ({ + id: p.project.externalRef, + secretKey: p.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, + })); + + console.log("tenants.json"); + console.log(JSON.stringify({ apiUrl: "http://localhost:3030", tenants }));internal-packages/run-engine/src/engine/tests/batchTriggerAndWait.test.ts (3)
816-842: Loop iterates overcreatedRunsbut ignores therunIdwhen dequeuing.The loop variable
runIdfromcreatedRunsisn't used to select which run to dequeue—it relies on queue ordering. IfdequeuedChild.length === 0, the loop silently continues, which could mask timing issues in the test. Consider either:
- Failing the test if no run is dequeued when expected, or
- Adding a comment clarifying that the loop count just controls iteration count and ordering is assumed
for (const { runId } of createdRuns) { // Dequeue and start child await setTimeout(300); const dequeuedChild = await engine.dequeueFromWorkerQueue({ consumerId: "test_12345", workerQueue: "main", }); - if (dequeuedChild.length === 0) continue; + // Expect a run to be dequeued for each created run + expect(dequeuedChild.length).toBeGreaterThan(0);
1159-1185: Same dequeue loop pattern as the success test.Consider adding an assertion that a run is dequeued for each iteration, or documenting why silent continuation is acceptable. This would make test failures more debuggable if timing issues occur.
877-1221: Comprehensive partial failure test.Good coverage of the partial failure scenario. The test verifies:
PARTIAL_FAILEDstatus when some items fail- Error records created in
BatchTaskRunError- Parent still resumes after successful runs complete
- Batch eventually transitions to
COMPLETEDConsider adding a test case for when all items fail (the
ABORTEDbranch at line 983) to complete the status matrix coverage.Would you like me to help generate a test case for the all-items-fail (
ABORTED) scenario?packages/trigger-sdk/src/v3/shared.ts (3)
1578-1599: Streaming path fully buffers items, negating memory benefits.The
executeBatchTwoPhaseStreamingfunction collects all items into an array before processing, which eliminates the memory efficiency advantages of streaming for large batches. This is acknowledged in the comment, but users may expect true streaming behavior when passing anAsyncIterable.Consider documenting this limitation in the public API JSDoc comments (e.g., for
batchTriggerById) so users understand that streaming inputs are currently buffered before transmission.
1663-1961: Significant code duplication across transform functions.The six
transform*Stream*functions share ~90% identical logic for building theBatchItemNDJSONoptions object. The main variations are:
- Task ID extraction (
item.idvsitem.task.idvs parameter)lockToVersionsource (env var vstaskContext.worker?.version)- Optional queue parameter
Consider extracting the common options-building logic into a shared helper to reduce the ~300 lines of duplication:
function buildBatchItemOptions( item: { options?: BatchItemOptions }, payloadPacket: { dataType: string }, batchItemIdempotencyKey: string | undefined, options?: { idempotencyKeyTTL?: string }, overrides?: { lockToVersion?: string; queue?: string } ): BatchItemNDJSON['options'] { return { queue: item.options?.queue ? { name: item.options.queue } : overrides?.queue ? { name: overrides.queue } : undefined, // ... common fields lockToVersion: overrides?.lockToVersion ?? item.options?.version ?? getEnvVar("TRIGGER_VERSION"), }; }
611-730: Array path duplicates the streaming transform logic.The array processing (lines 613-652) duplicates the logic in
transformBatchItemsStream. Both paths could use the same transformation logic:// Unified approach const asyncItems = Array.isArray(items) ? (async function* () { for (const item of items) yield item; })() : normalizeToAsyncIterable(items); const ndJsonItems: BatchItemNDJSON[] = []; for await (const item of transformBatchItemsStream(asyncItems, options)) { ndJsonItems.push(item); }This would reduce duplication and ensure both paths stay in sync. However, the current explicit approach is also valid for clarity.
packages/redis-worker/src/fair-queue/tests/workerQueue.test.ts (1)
26-29: Consider guarding non-null assertions.While the test expects
popto return a result, using a non-null assertion without a preceding null check can make test failures less informative.Consider this pattern for clearer test failures:
const result = await manager.pop("worker-1"); expect(result).not.toBeNull(); -expect(result!.messageKey).toBe("msg-1:queue-1"); -expect(result!.queueLength).toBe(0); +if (result) { + expect(result.messageKey).toBe("msg-1:queue-1"); + expect(result.queueLength).toBe(0); +}docs/batch-queue-metrics.md (2)
74-78: Add language specifier to fenced code block.The code block lacks a language specifier. Based on the content (mathematical relationships), consider using
textorplaintextto satisfy the linter.-``` +```text batches_enqueued × avg_items_per_batch ≈ items_enqueued items_enqueued = items_processed + items_failed + items_pending batches_completed ≤ batches_enqueued (lag indicates processing backlog)--- `180-212`: **Add `promql` language specifier to PromQL code blocks.** The PromQL query blocks at lines 180, 194, and 205 are missing language specifiers. Adding `promql` will enable syntax highlighting and satisfy the linter. ```diff ### Processing Health -``` +```promql # Throughput rate(batch_queue_items_processed_total[5m])Apply similar changes to the code blocks starting at lines 194 and 205.
packages/redis-worker/src/fair-queue/retry.ts (1)
33-42: Minor redundancy in maxAttempts assignment.The
maxAttemptsvalue is already guaranteed to be set in thethis.optionsobject on line 35, making the null coalescing on line 41 redundant.this.options = { maxAttempts: options?.maxAttempts ?? 12, factor: options?.factor ?? 2, minTimeoutInMs: options?.minTimeoutInMs ?? 1_000, maxTimeoutInMs: options?.maxTimeoutInMs ?? 3_600_000, // 1 hour randomize: options?.randomize ?? true, }; - this.maxAttempts = this.options.maxAttempts ?? 12; + this.maxAttempts = this.options.maxAttempts!;apps/webapp/app/v3/batchTriggerWorker.server.ts (1)
24-25: Clarify the initialization pattern.While the comment on lines 9-10 explains the import, the
void enginestatement on line 25 could be more explicit about forcing initialization. Consider:- // Ensure the engine (and its BatchQueue) is initialized - void engine; + // Force engine initialization (triggers BatchQueue setup for v2 batches) + // The void operator discards the value while ensuring the side effect + void engine;packages/redis-worker/src/fair-queue/tests/drr.test.ts (1)
8-356: DRR scheduler test coverage is solid; consider renaming one test for clarityThe tests exercise deficit lifecycle, queue selection semantics, ordering by deficit, and aggregate deficit retrieval against a real Redis instance, which gives good confidence in
DRRScheduler.Minor nit: the test
redisTest("should skip tenants with insufficient deficit", ...);ultimately asserts that both tenants are returned after quantum is added, so the name doesn’t quite describe what’s being verified. Renaming it to something like
"should include tenants once quantum has been added"would make intent clearer for future maintainers.internal-packages/database/prisma/migrations/20251205135152_add_columns_for_run_engine_batch_trigger_v2/migration.sql (1)
17-30: Consider adding a unique constraint on(batchTaskRunId, index).The
BatchTaskRunErrortable allows multiple error entries with the samebatchTaskRunIdandindexcombination. If each batch item (identified by index) should have at most one error record, adding a unique constraint would enforce data integrity and prevent duplicate error entries.-- CreateIndex CREATE INDEX "BatchTaskRunError_batchTaskRunId_idx" ON "public"."BatchTaskRunError"("batchTaskRunId"); + +-- CreateIndex (optional: enforce one error per batch item) +CREATE UNIQUE INDEX "BatchTaskRunError_batchTaskRunId_index_key" ON "public"."BatchTaskRunError"("batchTaskRunId", "index");apps/webapp/app/presenters/v3/BatchPresenter.server.ts (1)
69-71: Consider using a more specific error or returning null.Throwing a generic
Error("Batch not found")may make it harder for the caller to distinguish between different failure modes. Consider using a custom error type or returningnulland letting the caller handle the not-found case, which is a common pattern in presenters.packages/redis-worker/src/fair-queue/tests/fairQueue.test.ts (1)
472-473: Fixed delay before DLQ check could cause flaky tests.Using a fixed 500ms delay to wait for DLQ processing may be insufficient under load or on slower CI machines. Consider using
vi.waitForwith a condition that checks DLQ length, similar to how other assertions in this file are structured.- // Give time for DLQ processing - await new Promise((resolve) => setTimeout(resolve, 500)); - - // Check DLQ - const dlqMessages = await queue.getDeadLetterMessages("t1"); - expect(dlqMessages).toHaveLength(1); + // Wait for DLQ processing and verify + let dlqMessages: Awaited<ReturnType<typeof queue.getDeadLetterMessages>> = []; + await vi.waitFor( + async () => { + dlqMessages = await queue.getDeadLetterMessages("t1"); + expect(dlqMessages).toHaveLength(1); + }, + { timeout: 5000 } + );internal-packages/run-engine/src/engine/tests/batchTwoPhase.test.ts (2)
484-508: Child run completion loop may skip runs if timing varies.The loop continues silently when
dequeuedChild.length === 0. While this handles timing issues gracefully, if a child run is never dequeued due to a bug, the test would still pass (it waits for waitpoints to clear, which could happen for other reasons).Consider adding an assertion after the loop to verify all expected child runs were processed:
// After the loop expect(createdRuns.filter(r => /* was completed */).length).toBe(2);
18-56: Consider extracting common engine configuration.The
RunEngineconfiguration is duplicated across all four tests with minor variations. Extract a helper function to reduce duplication and make tests easier to maintain.function createTestEngine(prisma: PrismaClient, redisOptions: RedisOptions, overrides?: Partial<RunEngineOptions>) { return new RunEngine({ prisma, worker: { redis: redisOptions, workers: 1, tasksPerWorker: 10, pollIntervalMs: 20, }, queue: { redis: redisOptions, masterQueueConsumersDisabled: true, processWorkerQueueDebounceMs: 50, }, runLock: { redis: redisOptions }, machines: { defaultMachine: "small-1x", machines: { "small-1x": { name: "small-1x" as const, cpu: 0.5, memory: 0.5, centsPerMs: 0.0001 }, }, baseCostInCents: 0.0001, }, batchQueue: { redis: redisOptions, consumerCount: 2, consumerIntervalMs: 50, drr: { quantum: 10, maxDeficit: 100 }, }, tracer: trace.getTracer("test", "0.0.0"), ...overrides, }); }Also applies to: 164-202, 278-316, 547-584
apps/webapp/app/routes/api.v3.batches.ts (1)
84-92: Consider redacting idempotencyKey in logs.The
idempotencyKeyis logged directly. If users include sensitive information in their idempotency keys (e.g., user IDs, email-based keys), this could leak PII to logs.logger.debug("Create batch request", { runCount: body.runCount, parentRunId: body.parentRunId, resumeParentOnCompletion: body.resumeParentOnCompletion, - idempotencyKey: body.idempotencyKey, + hasIdempotencyKey: !!body.idempotencyKey, triggerVersion, isFromWorker, triggerClient, });internal-packages/run-engine/src/engine/index.ts (1)
973-975:isBatchQueueEnabled()always returns true.Since
batchQueueis always instantiated in the constructor (lines 325-346), this method will always returntrue. If the intent is to conditionally enable the BatchQueue based on configuration, consider checking a config flag instead.isBatchQueueEnabled(): boolean { - return this.batchQueue !== undefined; + return this.options.batchQueue !== undefined && this.batchQueue !== undefined; }apps/webapp/app/v3/runEngine.server.ts (1)
354-366: Consider usingcreateManyfor batch error insertion.Creating error records in a sequential loop can be slow for batches with many failures. Consider using Prisma's
createManyfor better performance.- if (failures.length > 0) { - for (const failure of failures) { - await prisma.batchTaskRunError.create({ - data: { - batchTaskRunId: batchId, - index: failure.index, - taskIdentifier: failure.taskIdentifier, - payload: failure.payload, - options: failure.options as Prisma.InputJsonValue | undefined, - error: failure.error, - errorCode: failure.errorCode, - }, - }); - } - } + if (failures.length > 0) { + await prisma.batchTaskRunError.createMany({ + data: failures.map((failure) => ({ + batchTaskRunId: batchId, + index: failure.index, + taskIdentifier: failure.taskIdentifier, + payload: failure.payload, + options: failure.options as Prisma.InputJsonValue | undefined, + error: failure.error, + errorCode: failure.errorCode, + })), + }); + }packages/redis-worker/src/fair-queue/schedulers/roundRobin.ts (2)
79-89: Sequential capacity checks may impact performance.With many tenants, calling
context.isAtCapacitysequentially in a loop could introduce latency. Consider batching these checks or usingPromise.allwith a reasonable concurrency limit if the underlying implementation supports it.// Example: batch capacity checks const capacityResults = await Promise.all( rotatedTenants.map(async (tenantId) => ({ tenantId, atCapacity: await context.isAtCapacity("tenant", tenantId), })) ); const eligibleTenants: TenantQueues[] = capacityResults .filter((r) => !r.atCapacity) .map(({ tenantId }) => ({ tenantId, queues: queuesByTenant.get(tenantId) ?? [], }));
146-149: Consider adding TTL to the lastServed key.The
lastServedindex persists indefinitely in Redis. If shards are removed or renamed, stale keys accumulate. Consider setting an expiration or implementing periodic cleanup.async #setLastServedIndex(shardKey: string, index: number): Promise<void> { const key = this.#lastServedKey(shardKey); - await this.redis.set(key, index.toString()); + // Expire after 24 hours - will be refreshed on each selectQueues call + await this.redis.set(key, index.toString(), "EX", 86400); }apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.batches.$batchParam/route.tsx (2)
51-73: Redundant error handling:tryCatchwrapped intry/catch.The
tryCatchutility already returns[error, data]tuple, making the outertry/catchunnecessary unlessBatchPresenterconstructor can throw. The current pattern double-handles errors.- try { - const presenter = new BatchPresenter(); - const [error, data] = await tryCatch( - presenter.call({ - environmentId: environment.id, - batchId: batchParam, - userId, - }) - ); - - if (error) { - throw new Error(error.message); - } - - return typedjson({ batch: data }); - } catch (error) { - console.error(error); - throw new Response(undefined, { - status: 400, - statusText: "Something went wrong, if this problem persists please contact support.", - }); - } + const presenter = new BatchPresenter(); + const [error, data] = await tryCatch( + presenter.call({ + environmentId: environment.id, + batchId: batchParam, + userId, + }) + ); + + if (error) { + console.error(error); + throw new Response(undefined, { + status: 400, + statusText: "Something went wrong, if this problem persists please contact support.", + }); + } + + return typedjson({ batch: data });
75-75: Use named function declaration instead of default export.Per coding guidelines for
**/*.{ts,tsx,js,jsx}, prefer function declarations over default exports.-export default function Page() { +function Page() { // ... component body } + +export { Page as default };Or rename to a more descriptive name like
BatchDetailPage.apps/webapp/app/runEngine/concerns/batchPayloads.server.ts (1)
148-163: Silent fallback on serialization failure may hide issues.When
JSON.stringifyfails for non-JSON types, returning an empty packet without logging could make debugging difficult.Consider adding debug logging for the catch case:
// For other types, try to stringify try { return { data: JSON.stringify(payload), dataType: payloadType }; } catch { + logger.debug("Failed to stringify payload, returning empty packet", { + payloadType, + }); return { dataType: payloadType }; }packages/redis-worker/src/fair-queue/schedulers/drr.ts (2)
184-187: Fragile deficit key construction.The key is built by splitting the master queue key and taking the first segment. This creates tight coupling to the key format and could break if the key structure changes.
Consider adding a dedicated method to the
FairQueueKeyProducerinterface for the DRR deficit key:#deficitKey(): string { - // Use a fixed key for DRR deficit tracking - return `${this.keys.masterQueueKey(0).split(":")[0]}:drr:deficit`; + // Consider adding a deficitKey() method to FairQueueKeyProducer + // For now, use a fixed prefix approach + const prefix = this.keys.masterQueueKey(0).split(":")[0] ?? "fairqueue"; + return `${prefix}:drr:deficit`; }Or better, add
drrDeficitKey()to theFairQueueKeyProducerinterface for consistency with other key methods.
189-215: Hardcoded limit of 1000 queues per shard.The limit is hardcoded without documentation or configuration option. This could silently drop queues in high-volume scenarios.
Consider making this configurable:
+ private queueFetchLimit: number; + constructor(private config: DRRSchedulerConfig) { // ... existing code + this.queueFetchLimit = config.queueFetchLimit ?? 1000; } async #getQueuesFromShard(shardKey: string): Promise<QueueWithScore[]> { const now = Date.now(); const results = await this.redis.zrangebyscore( shardKey, "-inf", now, "WITHSCORES", "LIMIT", 0, - 1000 // Limit for performance + this.queueFetchLimit );apps/webapp/app/runEngine/services/streamBatchItems.server.ts (2)
34-37: Constructor doesn't pass the engine option to base class.The
WithRunEnginebase class accepts an optionalengineparameter, but here onlyprismais passed. While this works because the base class has a default, explicitly passing both options would be more consistent with the class design.- constructor(protected readonly _prisma: PrismaClientOrTransaction = prisma) { - super({ prisma }); + constructor(protected readonly _prisma: PrismaClientOrTransaction = prisma, engine?: RunEngine) { + super({ prisma: _prisma, engine }); this.payloadProcessor = new BatchPayloadProcessor(); }
96-148: Consider adding a timeout or item limit to prevent unbounded processing.The
for awaitloop processes items indefinitely without any timeout or maximum item count check. If a client sends items slowly or never closes the stream, this could hold resources indefinitely.Consider adding a timeout mechanism or enforcing the
batch.runCountlimit during streaming:+ const maxItems = batch.runCount; // Process items from the stream for await (const rawItem of itemsIterator) { + // Safety check: don't accept more items than expected + if (itemsAccepted + itemsDeduplicated >= maxItems) { + throw new ServiceValidationError( + `Received more items than expected runCount ${maxItems}` + ); + }packages/redis-worker/src/fair-queue/keyProducer.ts (1)
94-102: Consider logging or documenting when fallback is used in extractTenantId.The fallback to
parts[0] ?? ""when the queue ID doesn't match the expectedtenant:{tenantId}:...format could silently return incorrect tenant IDs. This might make debugging difficult in production.The current behavior is reasonable for flexibility, but documenting the expected format clearly or logging when fallback is triggered would help with troubleshooting.
packages/core/src/v3/apiClient/index.ts (2)
427-432: Consider using ApiError for consistency in parse failure.When the response fails to parse, a generic
Erroris thrown. Other methods in this class throwApiErrorfor server-related issues. Consider wrapping this inApiErrorfor consistent error handling by callers.if (!parsed.success) { - throw new Error(`Invalid response from server: ${parsed.error.message}`); + throw ApiError.generate( + response.status, + { error: "Invalid response format", details: parsed.error.message }, + undefined, + Object.fromEntries(response.headers.entries()) + ); }
1545-1548: JSON.stringify can throw on circular references.If a
BatchItemNDJSONcontains circular references (unlikely but possible with user data),JSON.stringifywill throw. The error would propagate but might be confusing. Consider wrapping with try-catch for a clearer error message.- const line = JSON.stringify(item) + "\n"; - controller.enqueue(encoder.encode(line)); + try { + const line = JSON.stringify(item) + "\n"; + controller.enqueue(encoder.encode(line)); + } catch (err) { + controller.error(new Error(`Failed to serialize batch item at index ${index - 1}: ${(err as Error).message}`)); + }Also applies to: 1562-1564
packages/redis-worker/src/fair-queue/scheduler.ts (2)
1-1: Unused import: QueueDescriptor.
QueueDescriptoris imported from./types.jsbut never used in this file.-import type { FairScheduler, SchedulerContext, TenantQueues, QueueDescriptor } from "./types.js"; +import type { FairScheduler, SchedulerContext, TenantQueues } from "./types.js";
77-92: Consider parallelizing capacity checks for better performance.The sequential
awaitinside the loop means N tenants require N round-trips. For large tenant counts, parallel checks would be faster:protected async filterAtCapacity( tenants: TenantQueues[], context: SchedulerContext, groupName: string = "tenant" ): Promise<TenantQueues[]> { - const filtered: TenantQueues[] = []; - - for (const tenant of tenants) { - const isAtCapacity = await context.isAtCapacity(groupName, tenant.tenantId); - if (!isAtCapacity) { - filtered.push(tenant); - } - } - - return filtered; + const capacityChecks = await Promise.all( + tenants.map(async (tenant) => ({ + tenant, + isAtCapacity: await context.isAtCapacity(groupName, tenant.tenantId), + })) + ); + return capacityChecks.filter(({ isAtCapacity }) => !isAtCapacity).map(({ tenant }) => tenant); }packages/redis-worker/src/fair-queue/masterQueue.ts (1)
24-30: Consider adding error handling for Redis connection failure.The constructor creates a Redis client but doesn't handle connection errors. If Redis is unavailable, errors will occur on first operation. Consider adding connection verification or at least documenting that the caller should handle connection errors.
constructor(private options: MasterQueueOptions) { this.redis = createRedisClient(options.redis); this.keys = options.keys; this.shardCount = Math.max(1, options.shardCount); this.#registerCommands(); + // Note: Redis connection errors will surface on first operation }packages/redis-worker/src/fair-queue/concurrency.ts (3)
10-14: Consider usingtypeinstead ofinterfaceper coding guidelines.Per the TypeScript coding guidelines for this repository, prefer types over interfaces.
-export interface ConcurrencyManagerOptions { - redis: RedisOptions; - keys: FairQueueKeyProducer; - groups: ConcurrencyGroupConfig[]; -} +export type ConcurrencyManagerOptions = { + redis: RedisOptions; + keys: FairQueueKeyProducer; + groups: ConcurrencyGroupConfig[]; +};
47-62: Sequential capacity check may cause unnecessary round-trips.The
canProcessmethod performs sequential async checks for each group. Consider batching these checks using a pipeline or the Lua script approach used inreservefor better performance with many groups.
94-107: Release operation is not atomic across groups.Unlike
reserve, thereleasemethod uses a pipeline which executes commands sequentially but not atomically. If a failure occurs mid-pipeline, some groups may have the message removed while others retain it.Consider using a Lua script for atomic release similar to the reserve operation, or document that partial release is acceptable:
async release(queue: QueueDescriptor, messageId: string): Promise<void> { + // Note: Pipeline is sufficient here as partial release is recoverable + // (worst case: message is removed from some groups, which is still safe) const pipeline = this.redis.pipeline();packages/redis-worker/src/fair-queue/visibility.ts (1)
5-14: Consider usingtypeinstead ofinterfaceper coding guidelines.-export interface VisibilityManagerOptions { - redis: RedisOptions; - keys: FairQueueKeyProducer; - shardCount: number; - defaultTimeoutMs: number; - logger?: { - debug: (message: string, context?: Record<string, unknown>) => void; - error: (message: string, context?: Record<string, unknown>) => void; - }; -} +export type VisibilityManagerOptions = { + redis: RedisOptions; + keys: FairQueueKeyProducer; + shardCount: number; + defaultTimeoutMs: number; + logger?: { + debug: (message: string, context?: Record<string, unknown>) => void; + error: (message: string, context?: Record<string, unknown>) => void; + }; +};apps/webapp/app/runEngine/services/createBatch.server.ts (1)
173-203: Error handling for P2002 could be more precise.The current logic assumes that if the target doesn't include "oneTimeUseToken", it must be an idempotency key violation. However, there could be other unique constraints on the table.
Consider checking for the specific constraint name or including additional checks:
if ( Array.isArray(target) && target.length > 0 && typeof target[0] === "string" && target[0].includes("oneTimeUseToken") ) { throw new ServiceValidationError( "Cannot create batch with a one-time use token as it has already been used." ); - } else { + } else if ( + Array.isArray(target) && + target.some((t) => typeof t === "string" && t.includes("idempotencyKey")) + ) { throw new ServiceValidationError( "Cannot create batch as it has already been created with the same idempotency key." ); + } else { + // Unknown unique constraint violation - re-throw original error + throw error; }packages/redis-worker/src/fair-queue/workerQueue.ts (3)
4-11: Consider usingtypeinstead ofinterfaceper coding guidelines.-export interface WorkerQueueManagerOptions { - redis: RedisOptions; - keys: FairQueueKeyProducer; - logger?: { - debug: (message: string, context?: Record<string, unknown>) => void; - error: (message: string, context?: Record<string, unknown>) => void; - }; -} +export type WorkerQueueManagerOptions = { + redis: RedisOptions; + keys: FairQueueKeyProducer; + logger?: { + debug: (message: string, context?: Record<string, unknown>) => void; + error: (message: string, context?: Record<string, unknown>) => void; + }; +};
93-150: Blocking pop has a potential resource leak with abort signal.The event listener added to the abort signal is never explicitly removed if the operation completes normally before abort. While
{ once: true }helps, the listener remains attached until either abort or GC.Consider explicitly removing the listener in the finally block:
async blockingPop( workerQueueId: string, timeoutSeconds: number, signal?: AbortSignal ): Promise<string | null> { const workerQueueKey = this.keys.workerQueueKey(workerQueueId); const blockingClient = this.redis.duplicate(); + let cleanup: (() => void) | undefined; try { if (signal) { - const cleanup = () => { + cleanup = () => { blockingClient.disconnect(); }; signal.addEventListener("abort", cleanup, { once: true }); if (signal.aborted) { return null; } } // ... rest of implementation } finally { + if (signal && cleanup) { + signal.removeEventListener("abort", cleanup); + } await blockingClient.quit().catch(() => {}); } }
231-274: Lua script duplication between private and public methods.The
popWithLengthLua script is defined identically in both#registerCommands()andregisterCommands(). Consider extracting the script to a constant to avoid drift.+const POP_WITH_LENGTH_LUA = ` +local workerQueueKey = KEYS[1] +local messageKey = redis.call('LPOP', workerQueueKey) +if not messageKey then + return nil +end +local queueLength = redis.call('LLEN', workerQueueKey) +return {messageKey, queueLength} +`; #registerCommands(): void { this.redis.defineCommand("popWithLength", { numberOfKeys: 1, - lua: ` -local workerQueueKey = KEYS[1] -... - `, + lua: POP_WITH_LENGTH_LUA, }); } registerCommands(redis: Redis): void { redis.defineCommand("popWithLength", { numberOfKeys: 1, - lua: ` -local workerQueueKey = KEYS[1] -... - `, + lua: POP_WITH_LENGTH_LUA, }); }internal-packages/run-engine/src/batch-queue/completionTracker.ts (1)
101-110: Consider validating parsed JSON against BatchMeta schema.The
getMetamethod parses JSON and casts toBatchMetawithout validation. If corrupted data exists in Redis, this could cause unexpected runtime errors.Consider using Zod validation for defense-in-depth:
async getMeta(batchId: string): Promise<BatchMeta | null> { const key = this.metaKey(batchId); const metaJson = await this.redis.get(key); if (!metaJson) { return null; } const result = BatchMeta.safeParse(JSON.parse(metaJson)); if (!result.success) { this.logger.error("Invalid batch metadata", { batchId, error: result.error.message }); return null; } return result.data; }packages/redis-worker/src/fair-queue/index.ts (2)
786-795: Rate limiting waits indefinitely if resetAt is in the future.When rate limited, the code waits until
resetAtbefore proceeding. This could cause a consumer to block for an extended period. Consider adding a maximum wait time or checking abort signal during the wait.if (this.globalRateLimiter) { const result = await this.globalRateLimiter.limit(); if (!result.allowed && result.resetAt) { - const waitMs = Math.max(0, result.resetAt - Date.now()); + const waitMs = Math.min(5000, Math.max(0, result.resetAt - Date.now())); // Cap at 5s if (waitMs > 0) { this.logger.debug("Global rate limit reached, waiting", { waitMs, loopId }); await new Promise((resolve) => setTimeout(resolve, waitMs)); } } }
1026-1036: Duplicate rate limiting code could be extracted.The rate limiting logic at lines 786-795 and 1026-1036 is duplicated. Consider extracting to a private helper method.
private async waitForRateLimit(loopId: string): Promise<void> { if (!this.globalRateLimiter) return; const result = await this.globalRateLimiter.limit(); if (!result.allowed && result.resetAt) { const waitMs = Math.max(0, result.resetAt - Date.now()); if (waitMs > 0) { this.logger.debug("Global rate limit reached, waiting", { waitMs, loopId }); await new Promise((resolve) => setTimeout(resolve, waitMs)); } } }packages/redis-worker/src/fair-queue/types.ts (1)
15-21: Consider usingtypeinstead ofinterfaceper coding guidelines.The coding guidelines specify "Use types over interfaces for TypeScript". However, this is a new file with extensive interface usage. Given the scope and that interfaces provide better error messages for complex types, this may be acceptable.
If strict adherence to guidelines is required:
export type GlobalRateLimiter = { limit(): Promise<{ allowed: boolean; resetAt?: number }>; };
e8388a5 to
3ed008d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/trigger-sdk/src/v3/batch.ts (1)
31-46: JSDoc return type is outdated.The JSDoc comment on line 31 still references
RetrieveBatchResponse, but the actual return type has been updated toRetrieveBatchV2Response. Update the JSDoc to maintain consistency.- * @returns {ApiPromise<RetrieveBatchResponse>} A promise that resolves with the batch details + * @returns {ApiPromise<RetrieveBatchV2Response>} A promise that resolves with the batch details
♻️ Duplicate comments (13)
internal-packages/BATCH_TRIGGER_LIMITS.md (2)
1-6: Remove AI conversation artifacts from documentation.Lines 1–6 contain conversational AI artifacts (
[10 tools called], "Now I have a comprehensive picture") that should be removed so the document begins directly with the technical analysis.-Let me analyze the potential issues with increasing batch size limits by examining the key components involved in batch processing: - -[10 tools called] - -Now I have a comprehensive picture. Let me compile a detailed analysis of the potential issues with increasing batch size limits: - ## Analysis: Potential Issues with Larger Batch Sizes
220-220: Remove trailing AI prompt artifact.Line 220 contains an AI prompt question that doesn't belong in technical documentation.
9. **Batch checkpointing**: For batches > 1,000 runs, checkpoint progress to allow recovery without reprocessing - -Would you like me to start implementing any of these recommendations?apps/webapp/app/env.server.ts (1)
943-950: VerifyBATCH_QUEUE_DRR_QUANTUMandBATCH_QUEUE_MAX_DEFICIThave safe fallbacks.These optional env vars are used by the DRR scheduler. Ensure the consuming code provides safe defaults when these are undefined, otherwise a runtime error may occur when the scheduler tries to use them.
#!/bin/bash # Check how these env vars are consumed and if defaults are provided at the usage site rg "BATCH_QUEUE_DRR_QUANTUM|BATCH_QUEUE_MAX_DEFICIT" -t ts -B 2 -A 5 | head -80.env.example (1)
88-94: Add trailing blank line.The local observability documentation is well-structured. However, the file is missing a trailing newline at the end.
docker/config/grafana/provisioning/dashboards/batch-queue.json (1)
113-118: NaN when no items are processed - previously flagged.The success rate expression
sum(rate(...processed...)) / (sum(rate(...processed...)) + sum(rate(...failed...)))returns NaN when both rates are 0. This was noted in a previous review.packages/redis-worker/src/fair-queue/schedulers/weighted.ts (3)
246-257: Division by zero risk when all tenants have zero avgAge.When all queues have a score of 0,
maxAgewill be 0, causingt.avgAge / maxAgeto produceNaN. This was flagged in a previous review.// Weighted shuffle to select top N tenants const maxAge = Math.max(...tenantAges.map((t) => t.avgAge)); + if (maxAge === 0) { + // All tenants have same age, use equal weights + const selectedTenants = new Set(tenantAges.slice(0, this.maximumTenantCount).map(t => t.tenantId)); + return queues.filter((q) => selectedTenants.has(q.tenantId)); + } const weightedTenants = tenantAges.map((t) => ({
302-314: Division by zero risk when maxLimit is 0.If all tenants have a concurrency limit of 0, the weight calculation at line 313 will divide by zero. This was flagged in a previous review.
// Calculate weights const maxLimit = Math.max( ...tenantIds.map((id) => snapshot.tenants.get(id)!.concurrency.limit) ); + if (maxLimit === 0) { + return this.#shuffle(tenantIds); + }
344-349: Division by zero risk when all queues have age 0.If
maxAgeis 0 (all queues are brand new), the weight calculation will produceNaN. This was flagged in a previous review.// Weighted random based on age const maxAge = Math.max(...queues.map((q) => q.age)); + if (maxAge === 0) { + return queues.map((q) => q.queueId); + } const weightedQueues = queues.map((q) => ({packages/redis-worker/src/fair-queue/visibility.ts (1)
363-372: Member parsing assumes messageId contains no colons.The
#parseMembermethod usesindexOf(":")to split. If a custom messageId contains colons, parsing produces incorrect results. UselastIndexOf(":")instead since queueId is appended last, or document the constraint.packages/redis-worker/src/fair-queue/index.ts (1)
1337-1351: Remove unusedshardIdwhen DLQ is disabled (minor).In the
!deadLetterQueueEnabledbranch of#moveToDeadLetterQueue,shardIdis computed but never used:if (!this.deadLetterQueueEnabled) { // Just complete and discard const shardId = this.masterQueue.getShardForQueue(storedMessage.queueId); await this.visibilityManager.complete(storedMessage.id, storedMessage.queueId); return; }You can safely drop the
shardIdline to avoid linter warnings.packages/trigger-sdk/src/v3/shared.ts (1)
1517-1564: ExposebatchIdon Phase 2 failures to aid recovery (non-blocking).If
createBatchsucceeds butstreamBatchItemsthrows, the error surfaced fromexecuteBatchTwoPhasedoesn’t carry thebatch.id, making post-failure inspection or manual recovery harder for callers. Consider attaching thebatchIdto the thrown error (or wrapping it) so higher-level code can decide whether to inspect/cleanup the batch:- const batch = await apiClient.createBatch( - { - runCount: items.length, - parentRunId: options.parentRunId, - resumeParentOnCompletion: options.resumeParentOnCompletion, - idempotencyKey: options.idempotencyKey, - }, - { spanParentAsLink: options.spanParentAsLink }, - requestOptions - ); - - // If the batch was cached (idempotent replay), skip streaming items - if (!batch.isCached) { - // Phase 2: Stream items - await apiClient.streamBatchItems(batch.id, items, requestOptions); - } + const batch = await apiClient.createBatch( + { + runCount: items.length, + parentRunId: options.parentRunId, + resumeParentOnCompletion: options.resumeParentOnCompletion, + idempotencyKey: options.idempotencyKey, + }, + { spanParentAsLink: options.spanParentAsLink }, + requestOptions + ); + + // If the batch was cached (idempotent replay), skip streaming items + if (!batch.isCached) { + try { + // Phase 2: Stream items + await apiClient.streamBatchItems(batch.id, items, requestOptions); + } catch (error) { + (error as any).batchId = batch.id; + throw error; + } + }This keeps existing behavior but gives callers more context when Phase 2 fails.
packages/redis-worker/src/fair-queue/types.ts (1)
40-79: Consider unifyingmetadatatypes across queue models (optional).
QueueMessage.metadatais typed asRecord<string, unknown>, whileStoredMessage.metadataand the metadata inEnqueueOptions/EnqueueBatchOptionsareRecord<string, string>. SinceStoredMessage.metadatais directly passed through toQueueMessage.metadata, this asymmetry is safe but a bit surprising.If metadata is intended to hold arbitrary JSON-like values everywhere, you could simplify by making these all
Record<string, unknown>:-export interface StoredMessage<TPayload = unknown> { +export interface StoredMessage<TPayload = unknown> { ... - metadata?: Record<string, string>; + metadata?: Record<string, unknown>; } -export interface EnqueueOptions<TPayload = unknown> { +export interface EnqueueOptions<TPayload = unknown> { ... - metadata?: Record<string, string>; + metadata?: Record<string, unknown>; } -export interface EnqueueBatchOptions<TPayload = unknown> { +export interface EnqueueBatchOptions<TPayload = unknown> { ... - metadata?: Record<string, string>; + metadata?: Record<string, unknown>; }Not required for correctness, but it keeps the type surface consistent.
Also applies to: 458-492
internal-packages/run-engine/src/batch-queue/index.ts (1)
63-67: Fix logger initialization to honoroptions.logger.The constructor still ignores a provided
options.logger; both ternary branches create a newLoggerinstance:this.logger = options.logger ? new Logger("BatchQueue", "info") : new Logger("BatchQueue", "info");You likely want to use the supplied logger when present:
- this.logger = options.logger - ? new Logger("BatchQueue", "info") - : new Logger("BatchQueue", "info"); + this.logger = options.logger ?? new Logger("BatchQueue", "info");
🧹 Nitpick comments (15)
internal-packages/BATCH_TRIGGER_LIMITS.md (1)
212-212: Replace overused intensifier "very" with more specific language.Lines 212 and 216 use "very" which can be more precisely expressed.
-7. **Streaming batch completion**: Instead of waiting for all runs, allow partial results callback for very large batches +7. **Streaming batch completion**: Instead of waiting for all runs, allow partial results callback for large batches (>1,000 runs) -8. **Separate batch runs table**: For very large batches, consider a denormalized `BatchRun` junction table optimized for batch queries +8. **Separate batch runs table**: For batches exceeding 1,000 runs, consider a denormalized `BatchRun` junction table optimized for batch queriesAlso applies to: 216-216
internal-packages/run-engine/src/engine/systems/batchSystem.ts (1)
66-78: Consider extracting the version string to a shared constant.The magic string
"runengine:v2"is used in multiple places across the codebase (batchSystem.ts:68, BatchPresenter.server.ts:74, and test files). Extracting it to a constant (e.g.,BATCH_VERSION_RUNENGINE_V2) would improve maintainability and prevent typo-related bugs if this string is referenced elsewhere.apps/webapp/seed.mts (1)
113-156: Consider simplifying tenant array construction and guarding against missing environments.The repetitive tenant object construction could be simplified, and the
secretKeylookup may silently produceundefinedif no DEVELOPMENT environment exists.- console.log("tenants.json"); - console.log( - JSON.stringify({ - apiUrl: "http://localhost:3030", - tenants: [ - { - id: org1Project1.project.externalRef, - secretKey: org1Project1.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - { - id: org1Project2.project.externalRef, - secretKey: org1Project2.environments.find((e) => e.type === "DEVELOPMENT")?.apiKey, - }, - // ... remaining entries - ], - }) - ); + const allProjects = [ + org1Project1, org1Project2, org1Project3, + org2Project1, org2Project2, org2Project3, + org3Project1, org3Project2, org3Project3, + ]; + + const tenants = allProjects.map((p) => { + const devEnv = p.environments.find((e) => e.type === "DEVELOPMENT"); + if (!devEnv) { + console.warn(`⚠️ No DEVELOPMENT environment found for project ${p.project.name}`); + } + return { + id: p.project.externalRef, + secretKey: devEnv?.apiKey, + }; + }); + + console.log("tenants.json"); + console.log(JSON.stringify({ apiUrl: "http://localhost:3030", tenants }));docs/batch-queue-metrics.md (1)
180-212: Add language specifiers to PromQL code blocks.The fenced code blocks containing PromQL queries should have a language specifier for proper syntax highlighting. This was flagged by markdownlint.
-``` +```promql # Throughput rate(batch_queue_items_processed_total[5m])Apply the same change to the code blocks at lines 194 and 205.
apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.batches.$batchParam/route.tsx (1)
51-72: Redundant error handling pattern.The code uses
tryCatchfrom@trigger.dev/corewhich returns[error, data], but then wraps it in a try/catch block. The innerthrow new Error(error.message)at line 62 is caught by the outer catch at line 66, which logs and returns a generic 400. This works but is slightly redundant.Consider simplifying to use only one error handling approach, or making the error response more specific (e.g., 404 for "Batch not found" vs 400 for other errors).
try { const presenter = new BatchPresenter(); const [error, data] = await tryCatch( presenter.call({ environmentId: environment.id, batchId: batchParam, userId, }) ); if (error) { - throw new Error(error.message); + console.error("BatchPresenter error:", error); + if (error.message === "Batch not found") { + throw new Response("Batch not found", { status: 404 }); + } + throw new Response(undefined, { + status: 400, + statusText: error.message, + }); } return typedjson({ batch: data }); } catch (error) { + if (error instanceof Response) { + throw error; + } console.error(error); throw new Response(undefined, { status: 400, statusText: "Something went wrong, if this problem persists please contact support.", }); }apps/webapp/app/runEngine/concerns/batchPayloads.server.ts (1)
86-104: Consider documenting the inline storage limitation.When object storage is unavailable but payload exceeds the threshold, the code logs a warning and continues with inline storage. The comment notes this "may fail downstream for very large payloads."
This is a reasonable fallback, but consider whether this should be a hard failure in production environments where R2 is expected to be configured. As-is, large payloads could silently cause issues downstream.
packages/redis-worker/src/fair-queue/retry.ts (1)
29-52: Minor redundancy in maxAttempts assignment.
maxAttemptsis assigned twice - once within theoptionsobject (line 35) and again directly on line 41. This works but is slightly redundant.constructor(options?: Partial<RetryOptions>) { this.options = { maxAttempts: options?.maxAttempts ?? 12, factor: options?.factor ?? 2, minTimeoutInMs: options?.minTimeoutInMs ?? 1_000, maxTimeoutInMs: options?.maxTimeoutInMs ?? 3_600_000, // 1 hour randomize: options?.randomize ?? true, }; - this.maxAttempts = this.options.maxAttempts ?? 12; + this.maxAttempts = this.options.maxAttempts!; }packages/core/src/v3/apiClient/index.ts (1)
394-435: streamBatchItems lacks retry logic available to other API methods.Unlike other methods that use
zodfetchwith built-in retry support viamergeRequestOptions, this method uses rawfetch. For large batch streams, transient network failures could cause the entire stream to fail without retry.Consider documenting this limitation or implementing retry at a higher level for the streaming use case.
internal-packages/run-engine/src/batch-queue/tests/index.test.ts (1)
338-340: Avoid fixed sleep in tests; prefer event-driven waits.Using
setTimeoutwith a fixed 200ms delay is a potential source of flakiness. Consider using a more reliable mechanism like pollinggetBatchRemainingCountor checking queue state.- // Wait a bit - nothing should be processed - await new Promise((resolve) => setTimeout(resolve, 200)); - expect(processedItems).toHaveLength(0); + // Verify items are enqueued but not processed (consumers not started) + const remaining = await queue.getBatchRemainingCount("batch1"); + expect(remaining).toBe(3); + expect(processedItems).toHaveLength(0);packages/redis-worker/src/fair-queue/concurrency.ts (2)
47-62: TOCTOU race between canProcess and reserve.
canProcesschecks capacity non-atomically beforereserveis called. Between these calls, another consumer could reserve slots, causingreserveto fail unexpectedly. This is acceptable if callers handlereservereturningfalse, but consider documenting this or removingcanProcessifreservealready handles the atomic check.
97-107: Pipeline exec result not checked for errors.
pipeline.exec()returns an array of[error, result]tuples. If anySREMfails, the error is silently ignored. Consider checking for errors or at minimum logging them.- await pipeline.exec(); + const results = await pipeline.exec(); + if (results) { + for (const [err] of results) { + if (err) { + console.error("Error releasing concurrency:", err); + } + } + }packages/redis-worker/src/fair-queue/telemetry.ts (1)
248-318: Consider error handling in gauge callbacks.The gauge callbacks iterate over queues/shards and make async Redis calls. If any call fails, the entire callback throws and potentially disrupts metric collection for other dimensions.
Consider wrapping individual calls in try-catch to ensure partial failures don't prevent other metrics from being observed:
if (callbacks.getQueueLength && callbacks.observedQueues) { const getQueueLength = callbacks.getQueueLength; const queues = callbacks.observedQueues; this.metrics.queueLength.addCallback(async (observableResult) => { for (const queueId of queues) { - const length = await getQueueLength(queueId); - observableResult.observe(length, { - [FairQueueAttributes.QUEUE_ID]: queueId, - }); + try { + const length = await getQueueLength(queueId); + observableResult.observe(length, { + [FairQueueAttributes.QUEUE_ID]: queueId, + }); + } catch { + // Skip this queue on error, continue with others + } } }); }apps/webapp/app/runEngine/services/streamBatchItems.server.ts (1)
213-270: Reuse TextEncoder instance for better performance.A new
TextEncoderis created for each line (lines 234, 255). Since the stream may process many items, consider hoisting the encoder outside the transform callbacks.export function createNdjsonParserStream( maxItemBytes: number ): TransformStream<Uint8Array, unknown> { const decoder = new TextDecoder(); + const encoder = new TextEncoder(); let buffer = ""; let lineNumber = 0; return new TransformStream<Uint8Array, unknown>({ transform(chunk, controller) { buffer += decoder.decode(chunk, { stream: true }); // Split on newlines const lines = buffer.split("\n"); buffer = lines.pop() ?? ""; for (const line of lines) { lineNumber++; const trimmed = line.trim(); if (!trimmed) continue; // Check byte size before parsing - const lineBytes = new TextEncoder().encode(trimmed).length; + const lineBytes = encoder.encode(trimmed).length;packages/redis-worker/src/fair-queue/workerQueue.ts (1)
231-274: Duplicate Lua script definition.The
popWithLengthLua script is defined twice - in#registerCommands()andregisterCommands(). Extract the script to a constant to avoid duplication and potential drift.+const POP_WITH_LENGTH_LUA = ` +local workerQueueKey = KEYS[1] + +-- Pop the first message +local messageKey = redis.call('LPOP', workerQueueKey) +if not messageKey then + return nil +end + +-- Get remaining queue length +local queueLength = redis.call('LLEN', workerQueueKey) + +return {messageKey, queueLength} +`; + export class WorkerQueueManager { // ... #registerCommands(): void { this.redis.defineCommand("popWithLength", { numberOfKeys: 1, - lua: ` -local workerQueueKey = KEYS[1] -... - `, + lua: POP_WITH_LENGTH_LUA, }); } registerCommands(redis: Redis): void { redis.defineCommand("popWithLength", { numberOfKeys: 1, - lua: ` -local workerQueueKey = KEYS[1] -... - `, + lua: POP_WITH_LENGTH_LUA, }); } }apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.batches/route.tsx (1)
212-264: Unify row selection styling by passingisSelectedinto cells (optional).Right now selection is applied via
classNameonTableRowand onlyTableCellMenureceivesisSelected, so hover/focus styles may differ between the menu cell and the others. For more consistent UX and keyboard focus outlines, consider passingisSelectedinto the data cells as well and letting the table primitives handle styling:- <TableRow key={batch.id} className={isSelected ? "bg-grid-dimmed" : undefined}> - <TableCell to={inspectorPath} isTabbableCell> + <TableRow key={batch.id}> + <TableCell to={inspectorPath} isTabbableCell isSelected={isSelected}> ... - <TableCell to={inspectorPath}> + <TableCell to={inspectorPath} isSelected={isSelected}> ... - <BatchActionsCell runsPath={runsPath} /> + <BatchActionsCell runsPath={runsPath} isSelected={isSelected} />and update
BatchActionsCellto accept/forwardisSelectedintoTableCellMenu.Also applies to: 281-291
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/webapp/app/runEngine/services/triggerTask.server.ts (1)
347-359: UsesafeParseto prevent validation errors from bypassing the ServiceValidationError path.
TaskRunError.parse(taskRun.error)will throw iftaskRun.errordoesn't match the expected schema, which exits the try block before reaching your explicit error handling at lines 355–359. This changes the error surface from ServiceValidationError to ZodError. UsesafeParsewith an explicit success check instead:- const error = taskRun.error ? TaskRunError.parse(taskRun.error) : undefined; + const parsedError = taskRun.error ? TaskRunError.safeParse(taskRun.error) : undefined; + const error = parsedError?.success ? parsedError.data : undefined; + + if (taskRun.error && !parsedError?.success) { + logger.error("Failed to parse taskRun.error", { + taskId, + runFriendlyId, + }); + throw new ServiceValidationError("Run failed with an unparseable error payload."); + }
♻️ Duplicate comments (2)
internal-packages/run-engine/src/engine/index.ts (2)
333-333: TLS configuration is incorrectly converted to boolean.This issue was already identified in a previous review. Line 333 converts the TLS options object to a boolean
trueinstead of passing the actual TLS configuration.
968-973:isBatchQueueEnabled()always returns true.This issue was already identified in a previous review. Since
batchQueueis unconditionally initialized in the constructor (lines 325-346), this method will always returntrue.
🧹 Nitpick comments (1)
internal-packages/run-engine/src/engine/systems/batchSystem.ts (1)
66-88: Consider extracting the batch version string to a constant.The hardcoded string
"runengine:v2"at line 68 could be extracted to a named constant (e.g.,BATCH_VERSION_V2) to prevent typos and improve maintainability if this version string is referenced elsewhere in the codebase.For example, at the top of the file or in a shared constants file:
const BATCH_VERSION_V2 = "runengine:v2";Then use it in the comparison:
- const isNewBatch = batch.batchVersion === "runengine:v2"; + const isNewBatch = batch.batchVersion === BATCH_VERSION_V2;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
apps/webapp/app/runEngine/concerns/runNumbers.server.ts(0 hunks)apps/webapp/app/runEngine/services/triggerTask.server.ts(1 hunks)apps/webapp/app/runEngine/types.ts(0 hunks)apps/webapp/app/v3/services/triggerTask.server.ts(0 hunks)internal-packages/run-engine/src/engine/index.ts(5 hunks)internal-packages/run-engine/src/engine/systems/batchSystem.ts(2 hunks)internal-packages/run-engine/src/engine/tests/batchTwoPhase.test.ts(1 hunks)internal-packages/run-engine/src/engine/types.ts(3 hunks)
💤 Files with no reviewable changes (3)
- apps/webapp/app/runEngine/types.ts
- apps/webapp/app/v3/services/triggerTask.server.ts
- apps/webapp/app/runEngine/concerns/runNumbers.server.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- internal-packages/run-engine/src/engine/tests/batchTwoPhase.test.ts
- internal-packages/run-engine/src/engine/types.ts
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
Files:
internal-packages/run-engine/src/engine/index.tsinternal-packages/run-engine/src/engine/systems/batchSystem.tsapps/webapp/app/runEngine/services/triggerTask.server.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
internal-packages/run-engine/src/engine/index.tsinternal-packages/run-engine/src/engine/systems/batchSystem.tsapps/webapp/app/runEngine/services/triggerTask.server.ts
**/*.{js,ts,jsx,tsx,json,md,css,scss}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier
Files:
internal-packages/run-engine/src/engine/index.tsinternal-packages/run-engine/src/engine/systems/batchSystem.tsapps/webapp/app/runEngine/services/triggerTask.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
apps/webapp/app/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access all environment variables through the
envexport ofenv.server.tsinstead of directly accessingprocess.envin the Trigger.dev webapp
Files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: When importing from@trigger.dev/corein the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp
Files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
🧠 Learnings (29)
📓 Common learnings
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.batchTrigger()` to trigger multiple runs of a single task with different payloads
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.trigger()` to trigger multiple different tasks at once from backend code
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger tasks by passing task instances and wait for results
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTask()` to batch trigger tasks by passing task instances for static task sets
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to batch trigger multiple different tasks and wait for results
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTrigger()` to trigger multiple runs of a task from inside another task
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `runs.subscribeToBatch()` to subscribe to changes for all runs in a batch
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `runs.subscribeToBatch()` to subscribe to changes for all runs in a batch
Applied to files:
internal-packages/run-engine/src/engine/index.tsinternal-packages/run-engine/src/engine/systems/batchSystem.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Use the Run Engine 2.0 from `internal/run-engine` for new run lifecycle code in the webapp instead of the legacy run engine
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-08-14T18:35:44.370Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2390
File: apps/webapp/app/env.server.ts:764-765
Timestamp: 2025-08-14T18:35:44.370Z
Learning: The BoolEnv helper in apps/webapp/app/utils/boolEnv.ts uses z.preprocess with inconsistent default value types across the codebase - some usages pass boolean defaults (correct) while others pass string defaults (incorrect), leading to type confusion. The helper should enforce boolean-only defaults or have clearer documentation.
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.trigger()` to trigger multiple different tasks at once from backend code
Applied to files:
internal-packages/run-engine/src/engine/index.tsapps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to internal-packages/database/**/*.{ts,tsx} : Use Prisma for database interactions in internal-packages/database with PostgreSQL
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Leverage the PostgreSQL database through the `trigger.dev/database` Prisma 6.14.0 client in the webapp for all data access patterns
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger.config.ts : Specify runtime environment (node or bun) in trigger.config.ts using the `runtime` property
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to packages/trigger-sdk/**/*.{ts,tsx} : In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to batch trigger multiple different tasks and wait for results
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTrigger()` to trigger multiple runs of a task from inside another task
Applied to files:
internal-packages/run-engine/src/engine/index.tsinternal-packages/run-engine/src/engine/systems/batchSystem.tsapps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.batchTrigger()` to trigger multiple runs of a single task with different payloads
Applied to files:
internal-packages/run-engine/src/engine/index.tsapps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-10-08T11:48:12.327Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2593
File: packages/core/src/v3/workers/warmStartClient.ts:168-170
Timestamp: 2025-10-08T11:48:12.327Z
Learning: The trigger.dev runners execute only in Node 21 and 22 environments, so modern Node.js APIs like AbortSignal.any (introduced in v20.3.0) are supported.
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `idempotencyKeyTTL` option to define a time window during which duplicate triggers return the original run
Applied to files:
internal-packages/run-engine/src/engine/index.tsapps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `idempotencyKeys.create()` to create idempotency keys for preventing duplicate task executions
Applied to files:
internal-packages/run-engine/src/engine/index.tsapps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger tasks by passing task instances and wait for results
Applied to files:
internal-packages/run-engine/src/engine/systems/batchSystem.tsapps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Export tasks with unique IDs within the project to enable proper task discovery and execution
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.trigger()` with type-only imports to trigger tasks from backend code without importing the task implementation
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use the `task()` function from `trigger.dev/sdk/v3` to define tasks with id and run properties
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `schedules.task()` for scheduled/cron tasks instead of regular `task()`
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTask()` to batch trigger tasks by passing task instances for static task sets
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use metadata methods (set, del, replace, append, remove, increment, decrement, stream, flush) to update metadata during task execution
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Subscribe to run updates using `runs.subscribeToRun()` for realtime monitoring of task execution
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Attach metadata to task runs using the metadata option when triggering, and access/update it inside runs using metadata functions
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger.config.ts : Configure OpenTelemetry instrumentations and exporters in trigger.config.ts for enhanced logging
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `schemaTask()` from `trigger.dev/sdk/v3` with Zod schema for payload validation
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.trigger()` to trigger a task from inside another task with specified payload
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTriggerAndWait()` to batch trigger tasks and wait for all results from a parent task
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.triggerAndWait()` to trigger a task and wait for its result from a parent task
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
🧬 Code graph analysis (1)
internal-packages/run-engine/src/engine/systems/batchSystem.ts (1)
apps/webapp/app/runEngine/services/batchTrigger.server.ts (2)
batch(521-664)batch(666-722)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
- GitHub Check: Cursor Bugbot
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: typecheck / typecheck
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (7)
internal-packages/run-engine/src/engine/systems/batchSystem.ts (1)
47-49: LGTM!The Prisma selection correctly includes the new fields needed to support v2 batch processing with success/failure counts.
internal-packages/run-engine/src/engine/index.ts (5)
23-23: LGTM!The import additions correctly bring in the BatchQueue infrastructure and related types needed for the v3 two-phase batch processing system.
Also applies to: 31-38
84-84: LGTM!The
batchQueuefield is appropriately declared as private and used consistently throughout the class.
321-332: LGTM!The BatchQueue initialization is well-structured with sensible defaults and helpful logging. The configuration correctly ties consumer startup to the worker disabled flag.
Also applies to: 334-353
975-1056: LGTM!The new BatchQueue methods are well-documented and provide a clean API surface for the two-phase batch processing system. The JSDoc comments clearly explain the purpose and usage of each method, especially the distinction between Phase 1 (initialization) and Phase 2 (item streaming).
1320-1321: LGTM!The BatchQueue cleanup is correctly integrated into the shutdown sequence, ensuring proper resource cleanup and Redis connection closure.
apps/webapp/app/runEngine/services/triggerTask.server.ts (1)
270-274: Good: trace/span attributes are set consistently (queueName + runId).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (2)
apps/webapp/app/runEngine/services/createBatch.server.ts (1)
119-153: Race condition: idempotency may be saved before batch is fully initialized.The event at line 121 triggers idempotency key saving asynchronously via
Evt.post(), which doesn't await handlers. IfinitializeBatch(line 153) fails after the idempotency key is saved, subsequent retries will return a cached batch that isn't fully initialized in Redis.Consider moving
onBatchTaskRunCreated.post(batch)afterinitializeBatchcompletes, or ensuring idempotency is only saved after all initialization succeeds.internal-packages/run-engine/src/engine/index.ts (1)
325-346: TLS configuration is incorrectly converted to boolean.Line 333 converts the TLS options object to
trueinstead of passing the actual TLS configuration, losing any custom settings (certificates, ciphers, etc.).Apply this diff:
keyPrefix: `${options.batchQueue?.redis.keyPrefix ?? ""}batch-queue:`, enableAutoPipelining: options.batchQueue?.redis.enableAutoPipelining ?? true, - tls: options.batchQueue?.redis.tls !== undefined, + tls: options.batchQueue?.redis.tls, },
🧹 Nitpick comments (1)
apps/webapp/app/entry.server.tsx (1)
1-25: Minor: remove stale inline comment on Remix node import
import { createReadableStreamFromReadable, type EntryContext } from "@remix-run/node"; // or cloudflare/deno(Line 1) reads like leftover guidance; prefer removing to reduce noise.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
apps/webapp/app/entry.server.tsx(2 hunks)apps/webapp/app/routes/api.v3.batches.$batchId.items.ts(1 hunks)apps/webapp/app/routes/api.v3.batches.ts(1 hunks)apps/webapp/app/runEngine/services/createBatch.server.ts(1 hunks)apps/webapp/app/v3/runEngine.server.ts(2 hunks)apps/webapp/app/v3/runEngineHandlers.server.ts(2 hunks)apps/webapp/test/engine/triggerTask.test.ts(0 hunks)internal-packages/run-engine/src/engine/index.ts(5 hunks)
💤 Files with no reviewable changes (1)
- apps/webapp/test/engine/triggerTask.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- apps/webapp/app/routes/api.v3.batches.ts
- apps/webapp/app/routes/api.v3.batches.$batchId.items.ts
- apps/webapp/app/v3/runEngine.server.ts
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
Files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsapps/webapp/app/entry.server.tsxinternal-packages/run-engine/src/engine/index.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsapps/webapp/app/entry.server.tsx
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsapps/webapp/app/entry.server.tsxinternal-packages/run-engine/src/engine/index.ts
apps/webapp/app/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access all environment variables through the
envexport ofenv.server.tsinstead of directly accessingprocess.envin the Trigger.dev webapp
Files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsapps/webapp/app/entry.server.tsx
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: When importing from@trigger.dev/corein the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp
Files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsapps/webapp/app/entry.server.tsx
**/*.{js,ts,jsx,tsx,json,md,css,scss}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier
Files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsapps/webapp/app/entry.server.tsxinternal-packages/run-engine/src/engine/index.ts
🧠 Learnings (26)
📓 Common learnings
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.batchTrigger()` to trigger multiple runs of a single task with different payloads
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.trigger()` to trigger multiple different tasks at once from backend code
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger tasks by passing task instances and wait for results
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to batch trigger multiple different tasks and wait for results
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `runs.subscribeToBatch()` to subscribe to changes for all runs in a batch
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsinternal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.trigger()` to trigger multiple different tasks at once from backend code
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsinternal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.batchTrigger()` to trigger multiple runs of a single task with different payloads
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsinternal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTask()` to batch trigger tasks by passing task instances for static task sets
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTrigger()` to trigger multiple runs of a task from inside another task
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsinternal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger tasks by passing task instances and wait for results
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to batch trigger multiple different tasks and wait for results
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.tsinternal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Use the Run Engine 2.0 from `internal/run-engine` for new run lifecycle code in the webapp instead of the legacy run engine
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/entry.server.tsxinternal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTriggerAndWait()` to batch trigger tasks and wait for all results from a parent task
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/runEngine/services/createBatch.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Attach metadata to task runs using the metadata option when triggering, and access/update it inside runs using metadata functions
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use the `task()` function from `trigger.dev/sdk/v3` to define tasks with id and run properties
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `idempotencyKeys.create()` to create idempotency keys for preventing duplicate task executions
Applied to files:
apps/webapp/app/runEngine/services/createBatch.server.tsinternal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `idempotencyKeyTTL` option to define a time window during which duplicate triggers return the original run
Applied to files:
apps/webapp/app/runEngine/services/createBatch.server.tsinternal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/**/*.{ts,tsx} : Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp
Applied to files:
apps/webapp/app/entry.server.tsx
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/**/*.{ts,tsx} : Access all environment variables through the `env` export of `env.server.ts` instead of directly accessing `process.env` in the Trigger.dev webapp
Applied to files:
apps/webapp/app/entry.server.tsx
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/**/*.{ts,tsx} : When importing from `trigger.dev/core` in the webapp, use subpath exports from the package.json instead of importing from the root path
Applied to files:
apps/webapp/app/entry.server.tsx
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/services/**/*.server.{ts,tsx} : Separate testable services from configuration files; follow the pattern of `realtimeClient.server.ts` (testable service) and `realtimeClientGlobal.server.ts` (configuration) in the webapp
Applied to files:
apps/webapp/app/entry.server.tsx
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/v3/services/**/*.server.{ts,tsx} : Organize services in the webapp following the pattern `app/v3/services/*/*.server.ts`
Applied to files:
apps/webapp/app/entry.server.tsx
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: The webapp at apps/webapp is a Remix 2.1 application using Node.js v20
Applied to files:
apps/webapp/app/entry.server.tsx
📚 Learning: 2025-08-14T18:35:44.370Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2390
File: apps/webapp/app/env.server.ts:764-765
Timestamp: 2025-08-14T18:35:44.370Z
Learning: The BoolEnv helper in apps/webapp/app/utils/boolEnv.ts uses z.preprocess with inconsistent default value types across the codebase - some usages pass boolean defaults (correct) while others pass string defaults (incorrect), leading to type confusion. The helper should enforce boolean-only defaults or have clearer documentation.
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to internal-packages/database/**/*.{ts,tsx} : Use Prisma for database interactions in internal-packages/database with PostgreSQL
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Leverage the PostgreSQL database through the `trigger.dev/database` Prisma 6.14.0 client in the webapp for all data access patterns
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger.config.ts : Specify runtime environment (node or bun) in trigger.config.ts using the `runtime` property
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to packages/trigger-sdk/**/*.{ts,tsx} : In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code
Applied to files:
internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2025-10-08T11:48:12.327Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2593
File: packages/core/src/v3/workers/warmStartClient.ts:168-170
Timestamp: 2025-10-08T11:48:12.327Z
Learning: The trigger.dev runners execute only in Node 21 and 22 environments, so modern Node.js APIs like AbortSignal.any (introduced in v20.3.0) are supported.
Applied to files:
internal-packages/run-engine/src/engine/index.ts
🧬 Code graph analysis (2)
apps/webapp/app/v3/runEngineHandlers.server.ts (4)
apps/webapp/app/v3/runEngine.server.ts (1)
engine(11-11)internal-packages/tracing/src/index.ts (1)
SpanKind(34-34)apps/webapp/app/v3/services/triggerTask.server.ts (1)
TriggerTaskService(51-120)apps/webapp/app/db.server.ts (1)
prisma(101-101)
apps/webapp/app/entry.server.tsx (1)
apps/webapp/app/v3/runEngineHandlers.server.ts (1)
setupBatchQueueCallbacks(645-794)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
- GitHub Check: Cursor Bugbot
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (11)
apps/webapp/app/runEngine/services/createBatch.server.ts (4)
1-12: LGTM!Imports are properly structured with subpath imports from
@trigger.dev/core/v3as per coding guidelines.
14-20: LGTM!Type definition uses
typeoverinterfaceand properly types optional fields with string union forrealtimeStreamsVersion.
34-44: LGTM!Class structure follows the existing service pattern in this codebase, extending
WithRunEngineand initializing dependencies in the constructor.
173-203: LGTM!Error handling properly distinguishes between
oneTimeUseTokenandidempotencyKeyunique constraint violations, providing clear user-facing error messages viaServiceValidationError.internal-packages/run-engine/src/engine/index.ts (3)
31-38: LGTM!BatchQueue imports and types are properly structured with
.jsextensions for ESM compatibility.
964-1048: LGTM!BatchQueue wrapper methods are well-documented with clear JSDoc comments and provide appropriate delegation to the underlying
BatchQueueinstance.
1303-1318: LGTM!BatchQueue is properly closed during engine shutdown, ensuring Redis connections are cleaned up.
apps/webapp/app/v3/runEngineHandlers.server.ts (2)
645-727: Reduce per-item overhead: reuseTriggerTaskServiceinstance
Creatingnew TriggerTaskService()per item (Line 662) will add avoidable overhead at high throughput. Prefer instantiating once insetupBatchQueueCallbacks()and closing over it.export function setupBatchQueueCallbacks() { + const triggerTaskService = new TriggerTaskService(); // Item processing callback - creates a run for each batch item engine.setBatchProcessItemCallback(async ({ batchId, friendlyId, itemIndex, item, meta }) => { … - const triggerTaskService = new TriggerTaskService(); - // Normalize payload - for application/store (R2 paths), this passes through as-is const payload = normalizePayload(item.payload, item.payloadType);[ suggest_recommended_refactor ]
1-19: Fix incomplete environment object passed to determineEngineVersionThe constructed environment object at line 675 is missing the
project.enginefield required bydetermineEngineVersion(). This will cause a runtime crash when the batch queue tries to determine the engine version.The object currently provides only
project: { id: ... }, butdetermineEngineVersionaccessesenvironment.project.engineto determine the RunEngineVersion. Additionally, theas AuthenticatedEnvironmentcast masks this incompatibility.Fetch the full environment from the database (via
findEnvironmentFromRunalready imported at line 10, or use the prisma query directly) to ensure all required fields are available, rather than constructing a partial object.⛔ Skipped due to learnings
Learnt from: CR Repo: triggerdotdev/trigger.dev PR: 0 File: .cursor/rules/writing-tasks.mdc:0-0 Timestamp: 2025-11-27T16:27:35.304Z Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTrigger()` to trigger multiple runs of a task from inside another taskLearnt from: CR Repo: triggerdotdev/trigger.dev PR: 0 File: .cursor/rules/writing-tasks.mdc:0-0 Timestamp: 2025-11-27T16:27:35.304Z Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.trigger()` to trigger multiple different tasks at once from backend codeLearnt from: CR Repo: triggerdotdev/trigger.dev PR: 0 File: .cursor/rules/writing-tasks.mdc:0-0 Timestamp: 2025-11-27T16:27:35.304Z Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger tasks by passing task instances and wait for resultsLearnt from: CR Repo: triggerdotdev/trigger.dev PR: 0 File: .cursor/rules/writing-tasks.mdc:0-0 Timestamp: 2025-11-27T16:27:35.304Z Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to batch trigger multiple different tasks and wait for resultsLearnt from: CR Repo: triggerdotdev/trigger.dev PR: 0 File: .cursor/rules/writing-tasks.mdc:0-0 Timestamp: 2025-11-27T16:27:35.304Z Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.batchTrigger()` to trigger multiple runs of a single task with different payloadsLearnt from: CR Repo: triggerdotdev/trigger.dev PR: 0 File: .cursor/rules/writing-tasks.mdc:0-0 Timestamp: 2025-11-27T16:27:35.304Z Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTriggerAndWait()` to batch trigger tasks and wait for all results from a parent taskLearnt from: CR Repo: triggerdotdev/trigger.dev PR: 0 File: .cursor/rules/writing-tasks.mdc:0-0 Timestamp: 2025-11-27T16:27:35.304Z Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `runs.subscribeToBatch()` to subscribe to changes for all runs in a batchLearnt from: CR Repo: triggerdotdev/trigger.dev PR: 0 File: .cursor/rules/writing-tasks.mdc:0-0 Timestamp: 2025-11-27T16:27:35.304Z Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTask()` to batch trigger tasks by passing task instances for static task setsLearnt from: CR Repo: triggerdotdev/trigger.dev PR: 0 File: .cursor/rules/writing-tasks.mdc:0-0 Timestamp: 2025-11-27T16:27:35.304Z Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Attach metadata to task runs using the metadata option when triggering, and access/update it inside runs using metadata functionsapps/webapp/app/entry.server.tsx (2)
241-242: Good: exportingrunWithHttpContextenables request-scoped async context
This is a nice plumbing improvement for propagating per-request context across async boundaries.
196-203: This concern is based on a misunderstanding of the execution model. WhileWorker.init()andbootstrap()are invoked before thesingleton()calls, neither is awaited, meaning they return promises that execute asynchronously. Thesingleton()calls on lines 236-237 execute synchronously and immediately, registering the handlers before the async work inWorker.init()andbootstrap()begins. Since neither function contains synchronous code that emits events, the handlers will always be registered before any events can fire. The proposed reordering is unnecessary.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 14
♻️ Duplicate comments (9)
apps/webapp/app/runEngine/concerns/batchLimits.server.ts (1)
56-75: Duration format validation still missing.The type assertion
as Durationon line 60 bypasses validation, as previously flagged. The environment variableBATCH_RATE_LIMIT_REFILL_INTERVALis only validated as a string without format checking. An invalid duration format would pass validation but fail at runtime.apps/webapp/app/v3/runEngineHandlers.server.ts (3)
734-796: Make batch completion writes idempotent + atomic (transaction + createMany/skipDuplicates)This is the same issue previously flagged: the callback updates
batchTaskRunthen inserts errors one-by-one (Lines 749-777). If the callback is re-invoked or partially fails, you can get partial updates and/or duplicatebatchTaskRunErrorrows.
812-831:normalizePayload()should not JSON.parse whenpayloadTypeis undefinedAs previously noted: the current condition (Line 815) allows parsing when
payloadTypeisundefined, which can unintentionally transform plain string payloads.- if (payloadType !== "application/json" && payloadType !== undefined) { + if (payloadType !== "application/json") { return payload; }
645-732: PassplanTypewhen callingTriggerTaskService.call()withskipChecks: true(billing/usage correctness)
skipChecks: trueis set (line 698), but noplanTypeis provided in the options. TheRunEngineTriggerTaskServiceexplicitly requiresplanTypewhenskipChecksis enabled to ensure proper billing/usage attribution—when omitted, it logs a warning but processes the run withplanType === undefined, breaking usage tracking.The v2 batch trigger (
batchTrigger.server.ts) correctly passesplanTypefrom batch-level entitlement checks. Adopt the same pattern here: ensureplanTypeis included in the batchmetaobject during batch creation, then passmeta.planTypein theTriggerTaskService.call()options.const result = await triggerTaskService.call( item.task, environment, { payload, options: { ...(item.options as Record<string, unknown>), payloadType: item.payloadType, parentRunId: meta.parentRunId, resumeParentOnCompletion: meta.resumeParentOnCompletion, parentBatch: batchId, }, }, { triggerVersion: meta.triggerVersion, traceContext: meta.traceContext as Record<string, unknown> | undefined, spanParentAsLink: meta.spanParentAsLink, batchId, batchIndex: itemIndex, skipChecks: true, // Already validated at batch level + planType: meta.planType, realtimeStreamsVersion: meta.realtimeStreamsVersion, }, "V2" );apps/webapp/app/runEngine/services/streamBatchItems.server.ts (1)
150-170: Response does not indicate whether the batch was sealed (client can’t distinguish partial ingestion)
Line 154-169 returns the same shape as the success path (Line 194-198), so clients can’t tell “sealed & processing” vs “needs retry with missing items”. This matches the prior review feedback.Also: set span attributes before the early return so telemetry isn’t missing counts on partial ingestions.
if (enqueuedCount !== batch.runCount) { + span.setAttribute("itemsAccepted", itemsAccepted); + span.setAttribute("itemsDeduplicated", itemsDeduplicated); logger.warn("Batch item count mismatch", { batchId: batchFriendlyId, expected: batch.runCount, received: enqueuedCount, itemsAccepted, itemsDeduplicated, });packages/trigger-sdk/src/v3/shared.ts (1)
1533-1585: Good: two-phase errors now preserve phase + batchId context. That addresses a big chunk of “Phase 2 failed after Phase 1 succeeded” debuggability. Remaining question is whether you want any explicit server-side cleanup/cancel on stream failure.packages/redis-worker/src/fair-queue/index.ts (2)
798-832: Incorrect access toworkerQueuefield (should not be underpayload).
This matches a previously reported issue:workerQueueis a top-level field onStoredMessage, not insidepayload.- const workerQueueId = message.payload.workerQueue ?? queueId; + const workerQueueId = message.workerQueue ?? queueId;
1090-1103: Release reserved concurrency when dequeue-time payload validation fails.
This matches a previously reported issue: the early-return path moves to DLQ but doesn’t release the previously reserved concurrency slot.if (!result.success) { @@ // Move to DLQ await this.#moveToDeadLetterQueue(storedMessage, "Payload validation failed"); + // Release reserved concurrency so the queue/group doesn't get stuck + if (this.concurrencyManager) { + await this.concurrencyManager.release(descriptor, storedMessage.id).catch((e) => { + this.logger.error("Failed to release concurrency after validation failure", { + queueId, + messageId: storedMessage.id, + error: e instanceof Error ? e.message : String(e), + }); + }); + } return; }internal-packages/run-engine/src/batch-queue/index.ts (1)
664-700: Don’t cleanup Redis state if completion callback fails.
This matches an existing report: if the completion callback fails (e.g., DB update), deleting Redis progress makes the batch unrecoverable/stuck.if (this.completionCallback) { try { await this.completionCallback(result); } catch (error) { this.logger.error("Error in batch completion callback", { batchId, error: error instanceof Error ? error.message : String(error), }); + // Preserve Redis state for inspection/retry; do NOT cleanup on callback failure. + return; } } // Clean up Redis keys for this batch await this.completionTracker.cleanup(batchId);(If you still need eventual cleanup, consider a separate “reaper” job gated on DB state.)
🧹 Nitpick comments (6)
apps/webapp/app/runEngine/services/streamBatchItems.server.ts (2)
95-113: Index/order + payloadType handling is too loose for “in-order streaming”
- Line 105-113: only upper-bound check exists. If the contract expects “items are enqueued in order”, you should enforce
indexis an integer>= 0and (optionally) strictly increasing (or at least non-decreasing).- Line 116: deriving payload type from
item.options?.payloadTypeis brittle (and bypasses schema intent). Prefer an explicit field from the NDJSON schema (if present) or validate the option value is a string.Also applies to: 115-133
276-287: AsyncIterable wrapper should cancel the reader on early exit
If the consumer stops early (break/throw), releasing the lock (Line 285) doesn’t necessarily cancel the underlying stream. Considerawait reader.cancel()infinally(best-effort) beforereleaseLock().packages/trigger-sdk/src/v3/shared.ts (1)
613-733: Consider runtime validation for “stream inputs” (clearer errors thanfor awaitTypeError). Today, any non-array value that isn’t actuallyAsyncIterable/ReadableStreamwill fail later with a generic error. A small guard at the branch point would improve DX.Also applies to: 868-996, 1127-1250, 1387-1518
internal-packages/run-engine/src/batch-queue/types.ts (2)
22-40: Consider tighteningpayloadType/optionsshape (optional).
If the engine expects"application/json"vs"application/store"semantics, consider at least documenting/encoding those as a union (orz.enum([...]).catchall(...)) to reduce downstream ambiguity while staying permissive.
46-79:BatchMetavsInitializeBatchOptions: keep them intentionally in-sync.
These two structures mirror each other; a drift here will be painful. Consider a shared base type/schema or a single source-of-truth conversion helper.Also applies to: 133-164
packages/core/src/v3/apiClient/index.ts (1)
410-416: Harden retry option handling (avoidmaxAttempts!).
BecauseretryOptionsis built with spreads, it’s possible to end up withmaxAttempts: undefined(if a caller explicitly provides it asundefined). Prefer normalizing once (e.g.,const maxAttempts = retryOptions.maxAttempts ?? DEFAULT...) and removing the non-null assertion.Also applies to: 1602-1608, 1616-1673
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
references/hello-world/src/trigger/batches.tsis excluded by!references/**
📒 Files selected for processing (12)
apps/webapp/app/env.server.ts(3 hunks)apps/webapp/app/runEngine/concerns/batchLimits.server.ts(1 hunks)apps/webapp/app/runEngine/services/streamBatchItems.server.ts(1 hunks)apps/webapp/app/v3/runEngineHandlers.server.ts(2 hunks)internal-packages/run-engine/src/batch-queue/index.ts(1 hunks)internal-packages/run-engine/src/batch-queue/types.ts(1 hunks)packages/core/src/v3/apiClient/index.ts(6 hunks)packages/redis-worker/src/fair-queue/index.ts(1 hunks)packages/redis-worker/src/fair-queue/schedulers/weighted.ts(1 hunks)packages/redis-worker/src/fair-queue/tests/concurrency.test.ts(1 hunks)packages/redis-worker/src/fair-queue/types.ts(1 hunks)packages/trigger-sdk/src/v3/shared.ts(13 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- packages/redis-worker/src/fair-queue/tests/concurrency.test.ts
- packages/redis-worker/src/fair-queue/schedulers/weighted.ts
- packages/redis-worker/src/fair-queue/types.ts
🧰 Additional context used
📓 Path-based instructions (7)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
Files:
packages/core/src/v3/apiClient/index.tsapps/webapp/app/runEngine/concerns/batchLimits.server.tsinternal-packages/run-engine/src/batch-queue/types.tsapps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/env.server.tsapps/webapp/app/runEngine/services/streamBatchItems.server.tsinternal-packages/run-engine/src/batch-queue/index.tspackages/trigger-sdk/src/v3/shared.tspackages/redis-worker/src/fair-queue/index.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
packages/core/src/v3/apiClient/index.tsapps/webapp/app/runEngine/concerns/batchLimits.server.tsapps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/env.server.tsapps/webapp/app/runEngine/services/streamBatchItems.server.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
packages/core/src/v3/apiClient/index.tsapps/webapp/app/runEngine/concerns/batchLimits.server.tsinternal-packages/run-engine/src/batch-queue/types.tsapps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/env.server.tsapps/webapp/app/runEngine/services/streamBatchItems.server.tsinternal-packages/run-engine/src/batch-queue/index.tspackages/trigger-sdk/src/v3/shared.tspackages/redis-worker/src/fair-queue/index.ts
**/*.{js,ts,jsx,tsx,json,md,css,scss}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier
Files:
packages/core/src/v3/apiClient/index.tsapps/webapp/app/runEngine/concerns/batchLimits.server.tsinternal-packages/run-engine/src/batch-queue/types.tsapps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/env.server.tsapps/webapp/app/runEngine/services/streamBatchItems.server.tsinternal-packages/run-engine/src/batch-queue/index.tspackages/trigger-sdk/src/v3/shared.tspackages/redis-worker/src/fair-queue/index.ts
apps/webapp/app/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access all environment variables through the
envexport ofenv.server.tsinstead of directly accessingprocess.envin the Trigger.dev webapp
Files:
apps/webapp/app/runEngine/concerns/batchLimits.server.tsapps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/env.server.tsapps/webapp/app/runEngine/services/streamBatchItems.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: When importing from@trigger.dev/corein the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp
Files:
apps/webapp/app/runEngine/concerns/batchLimits.server.tsapps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/env.server.tsapps/webapp/app/runEngine/services/streamBatchItems.server.ts
packages/trigger-sdk/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code
Files:
packages/trigger-sdk/src/v3/shared.ts
🧠 Learnings (29)
📓 Common learnings
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.batchTrigger()` to trigger multiple runs of a single task with different payloads
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.trigger()` to trigger multiple different tasks at once from backend code
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger tasks by passing task instances and wait for results
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to batch trigger multiple different tasks and wait for results
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTask()` to batch trigger tasks by passing task instances for static task sets
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTrigger()` to trigger multiple runs of a task from inside another task
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `runs.subscribeToBatch()` to subscribe to changes for all runs in a batch
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Use `trigger.dev/redis-worker` for background job and worker system needs in the webapp and run engine
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.trigger()` to trigger multiple different tasks at once from backend code
Applied to files:
packages/core/src/v3/apiClient/index.tsinternal-packages/run-engine/src/batch-queue/types.tsapps/webapp/app/v3/runEngineHandlers.server.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.batchTrigger()` to trigger multiple runs of a single task with different payloads
Applied to files:
packages/core/src/v3/apiClient/index.tsinternal-packages/run-engine/src/batch-queue/types.tsapps/webapp/app/v3/runEngineHandlers.server.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to batch trigger multiple different tasks and wait for results
Applied to files:
packages/core/src/v3/apiClient/index.tsinternal-packages/run-engine/src/batch-queue/types.tsapps/webapp/app/v3/runEngineHandlers.server.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `runs.subscribeToBatch()` to subscribe to changes for all runs in a batch
Applied to files:
packages/core/src/v3/apiClient/index.tsinternal-packages/run-engine/src/batch-queue/types.tsapps/webapp/app/v3/runEngineHandlers.server.tsinternal-packages/run-engine/src/batch-queue/index.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger tasks by passing task instances and wait for results
Applied to files:
packages/core/src/v3/apiClient/index.tsinternal-packages/run-engine/src/batch-queue/types.tsapps/webapp/app/v3/runEngineHandlers.server.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-14T16:03:06.917Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2681
File: apps/webapp/app/services/platform.v3.server.ts:258-302
Timestamp: 2025-11-14T16:03:06.917Z
Learning: In `apps/webapp/app/services/platform.v3.server.ts`, the `getDefaultEnvironmentConcurrencyLimit` function intentionally throws an error (rather than falling back to org.maximumConcurrencyLimit) when the billing client returns undefined plan limits. This fail-fast behavior prevents users from receiving more concurrency than their plan entitles them to. The org.maximumConcurrencyLimit fallback is only for self-hosted deployments where no billing client exists.
Applied to files:
apps/webapp/app/runEngine/concerns/batchLimits.server.tsapps/webapp/app/env.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Limit task duration using the `maxDuration` property (in seconds)
Applied to files:
apps/webapp/app/runEngine/concerns/batchLimits.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `schemaTask()` from `trigger.dev/sdk/v3` with Zod schema for payload validation
Applied to files:
internal-packages/run-engine/src/batch-queue/types.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTask()` to batch trigger tasks by passing task instances for static task sets
Applied to files:
internal-packages/run-engine/src/batch-queue/types.tsapps/webapp/app/v3/runEngineHandlers.server.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTrigger()` to trigger multiple runs of a task from inside another task
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Use the Run Engine 2.0 from `internal/run-engine` for new run lifecycle code in the webapp instead of the legacy run engine
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2025-09-03T14:34:41.781Z
Learnt from: myftija
Repo: triggerdotdev/trigger.dev PR: 2464
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.settings/route.tsx:357-371
Timestamp: 2025-09-03T14:34:41.781Z
Learning: When using Zod's safeParse, the .data property is undefined when parsing fails, but TypeScript may still complain about accessing .data without checking .success first. The suggested approach of checking .success before accessing .data improves type safety and code clarity.
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to packages/trigger-sdk/**/*.{ts,tsx} : In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-08-14T18:35:44.370Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2390
File: apps/webapp/app/env.server.ts:764-765
Timestamp: 2025-08-14T18:35:44.370Z
Learning: The BoolEnv helper in apps/webapp/app/utils/boolEnv.ts uses z.preprocess with inconsistent default value types across the codebase - some usages pass boolean defaults (correct) while others pass string defaults (incorrect), leading to type confusion. The helper should enforce boolean-only defaults or have clearer documentation.
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tsapps/webapp/app/env.server.ts
📚 Learning: 2025-10-08T11:48:12.327Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2593
File: packages/core/src/v3/workers/warmStartClient.ts:168-170
Timestamp: 2025-10-08T11:48:12.327Z
Learning: The trigger.dev runners execute only in Node 21 and 22 environments, so modern Node.js APIs like AbortSignal.any (introduced in v20.3.0) are supported.
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Attach metadata to task runs using the metadata option when triggering, and access/update it inside runs using metadata functions
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use the `task()` function from `trigger.dev/sdk/v3` to define tasks with id and run properties
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTriggerAndWait()` to batch trigger tasks and wait for all results from a parent task
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/**/*.{ts,tsx} : Access all environment variables through the `env` export of `env.server.ts` instead of directly accessing `process.env` in the Trigger.dev webapp
Applied to files:
apps/webapp/app/env.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Control concurrency using the `queue` property with `concurrencyLimit` option
Applied to files:
internal-packages/run-engine/src/batch-queue/index.tspackages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `.withStreams()` to subscribe to realtime streams from task metadata in addition to run changes
Applied to files:
packages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use metadata methods (set, del, replace, append, remove, increment, decrement, stream, flush) to update metadata during task execution
Applied to files:
packages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Generate example payloads for tasks when possible
Applied to files:
packages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.trigger()` to trigger a task from inside another task with specified payload
Applied to files:
packages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.trigger()` with type-only imports to trigger tasks from backend code without importing the task implementation
Applied to files:
packages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.triggerAndWait()` to trigger a task and wait for its result from a parent task
Applied to files:
packages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `idempotencyKeys.create()` to create idempotency keys for preventing duplicate task executions
Applied to files:
packages/trigger-sdk/src/v3/shared.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `idempotencyKeyTTL` option to define a time window during which duplicate triggers return the original run
Applied to files:
packages/trigger-sdk/src/v3/shared.ts
🧬 Code graph analysis (3)
apps/webapp/app/runEngine/concerns/batchLimits.server.ts (3)
apps/webapp/app/env.server.ts (1)
env(1289-1289)apps/webapp/app/services/rateLimiter.server.ts (3)
createRedisRateLimitClient(74-105)RateLimiter(21-72)Duration(17-17)apps/webapp/app/runEngine/concerns/batchGlobalRateLimiter.server.ts (1)
limit(27-33)
apps/webapp/app/runEngine/services/streamBatchItems.server.ts (4)
apps/webapp/app/v3/services/baseService.server.ts (1)
WithRunEngine(52-59)apps/webapp/app/runEngine/concerns/batchPayloads.server.ts (1)
BatchPayloadProcessor(28-164)packages/core/src/v3/isomorphic/friendlyId.ts (1)
BatchId(96-96)internal-packages/run-engine/src/batch-queue/types.ts (2)
BatchItem(22-40)BatchItem(41-41)
packages/trigger-sdk/src/v3/shared.ts (9)
packages/core/src/v3/schemas/api.ts (2)
BatchItemNDJSON(358-367)BatchItemNDJSON(369-369)packages/core/src/v3/utils/ioSerialization.ts (1)
stringifyIO(88-105)packages/core/src/v3/idempotencyKeys.ts (2)
makeIdempotencyKey(39-53)flattenIdempotencyKey(12-37)packages/core/src/v3/workers/index.ts (1)
getEnvVar(4-4)packages/core/src/v3/taskContext/index.ts (1)
ctx(26-28)packages/trigger-sdk/src/v3/batch.ts (1)
batch(18-24)packages/core/src/v3/runMetadata/index.ts (1)
stream(80-86)internal-packages/run-engine/src/index.ts (1)
BatchItem(16-16)packages/core/src/v3/types/schemas.ts (1)
SchemaParseFn(101-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
- GitHub Check: Cursor Bugbot
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (9)
apps/webapp/app/runEngine/concerns/batchLimits.server.ts (5)
9-16: LGTM!The BatchLimitsConfig schema is well-defined with appropriate defaults.
18-31: Good use of singleton pattern.The Redis client singleton addresses the previous concern about connection exhaustion from creating new clients on every call.
81-87: LGTM!The function provides a clean API for retrieving batch limits and rate limiter for an organization.
89-105: LGTM!The config resolution logic properly validates and falls back to defaults when needed.
111-123: LGTM!The error class provides all necessary information for constructing proper 429 responses.
apps/webapp/app/env.server.ts (2)
943-950: Defaults added for DRR settings.The previously flagged critical issue has been resolved—
BATCH_QUEUE_DRR_QUANTUMandBATCH_QUEUE_MAX_DEFICITnow have defaults (5 and 50). The optionalCONSUMER_COUNTandCONSUMER_INTERVAL_MSare safe since FairQueue provides defaults.
952-960: LGTM!Plan-based batch limits and concurrency settings are well-defined with sensible defaults for each tier.
packages/trigger-sdk/src/v3/shared.ts (2)
56-57: Type import forBatchItemNDJSONlooks right (keeps SDK isomorphic).
128-130: This concern does not apply—BatchTriggerErroris not part of the public API.
BatchTriggerErroris exported fromshared.tsbut never re-exported throughindex.ts, which is the sole public entry point (perpackage.jsonexports). As an internal-only export, the early re-export pattern does not affect consumers and poses no CJS transpilation risk. This same pattern is used consistently elsewhere in the file (e.g., line 124).Likely an incorrect or invalid review comment.
…e trigger pipeline
… efficiently create errors in the completion callback
…encyKeyError is thrown
daa0b5b to
a5791a0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/trigger-sdk/src/v3/batch.ts (1)
1-9: Update JSDoc@returnsannotation to match new return typeThe function return type on line 46 has been correctly updated to
ApiPromise<RetrieveBatchV2Response>, but the JSDoc@returnsannotation on line 31 still referencesApiPromise<RetrieveBatchResponse>. Update line 31 to document the actual return type:@returns {ApiPromise<RetrieveBatchV2Response>}.
♻️ Duplicate comments (10)
.env.example (1)
90-94: Add trailing blank line to end of file.This is a duplicate of a previous review comment — the file is missing a trailing blank line at EOF. This is required by linting tools and standard POSIX text file conventions.
Apply this fix:
# Enable local observability stack (requires `pnpm run docker` to start otel-collector) # Uncomment these to send metrics to the local Prometheus via OTEL Collector: # INTERNAL_OTEL_METRIC_EXPORTER_ENABLED=1 # INTERNAL_OTEL_METRIC_EXPORTER_URL=http://localhost:4318/v1/metrics # INTERNAL_OTEL_METRIC_EXPORTER_INTERVAL_MS=15000 +docker/config/grafana/provisioning/dashboards/nodejs-runtime.json (1)
1-445: ❌ Unresolved issues from previous review: Prettier formatting and metric definition mismatch.This review comment was previously flagged and remains unresolved. Two critical issues must be addressed:
Formatting: The JSON file is not Prettier-formatted. Per coding guidelines, all
.jsonfiles must be formatted with Prettier. This violates the requirement:**/*.{js,ts,jsx,tsx,json,md,css,scss}: Format code using Prettier.Metrics mismatch: The dashboard references 10 metrics that do not appear to be exported by the application:
triggerdotdev_nodejs_event_loop_utilization_ratiotriggerdotdev_nodejs_eventloop_lag_p50_seconds,p90_seconds,p99_seconds,max_seconds,mean_secondstriggerdotdev_nodejs_uv_threadpool_size_threadstriggerdotdev_nodejs_active_handles_total_handles,active_handles_handlestriggerdotdev_nodejs_active_requests_total_requestsWithout these metrics exported, the dashboard queries will return no data, rendering it non-functional.
Required actions:
- Run Prettier on the file:
prettier --write docker/config/grafana/provisioning/dashboards/nodejs-runtime.json- Verify and confirm that your Node.js instrumentation exports all referenced metrics, or update dashboard queries to match actual exported metric names.
apps/webapp/app/runEngine/concerns/batchPayloads.server.ts (1)
150-170: Double stringification issue has been addressed.The current implementation correctly checks
typeof payload === "string"before callingJSON.stringifyforapplication/jsonpayloads (lines 153-154). Pre-serialized JSON strings from the SDK are now handled correctly without double-encoding.packages/redis-worker/src/fair-queue/types.ts (1)
62-79: StoredMessage.metadata type alignment confirmed.The
metadatafield is nowRecord<string, unknown>(line 78), consistent withQueueMessage.metadata(line 55). This addresses the type mismatch identified in past reviews.packages/redis-worker/src/fair-queue/tests/fairQueue.test.ts (1)
700-731: Potential race condition: handler registered afterstart()andenqueue().The message handler is registered at lines 714-718, after
queue.start()(line 701) andqueue.enqueue()(line 708). This ordering could cause the message to be processed before the handler is attached.While a previous review comment was marked as addressed, the code still shows the handler being registered after the queue starts and messages are enqueued. Consider moving the handler registration before
queue.start():+ const processed: string[] = []; + queue.onMessage(async (ctx) => { + processed.push(ctx.message.payload.value); + await ctx.complete(); + }); + // Start without any messages (will trigger empty dequeues) queue.start(); // Wait a bit for cooloff to kick in await new Promise((resolve) => setTimeout(resolve, 500)); await queue.enqueue({ queueId: "tenant:t1:queue:q1", tenantId: "t1", payload: { value: "after-cooloff" }, }); - const processed: string[] = []; - queue.onMessage(async (ctx) => { - processed.push(ctx.message.payload.value); - await ctx.complete(); - });internal-packages/run-engine/src/batch-queue/index.ts (2)
514-652: Batch item handling and completion callback behavior look robust
#handleMessagecorrectly:
- Validates metadata presence.
- Calls
processItemCallback, then records success/failure via the idempotent completion tracker.- Always sets
processedCount(including unexpected-error path) before completing the FairQueue message.- Triggers
#finalizeBatchonly whenprocessedCount === meta.runCount.
#finalizeBatchnow only cleans up Redis after a successfulcompletionCallback, and propagates callback errors so the batch can be retried or inspected with full Redis state, while still cleaning up when no callback is registered. This addresses the earlier “cleanup after callback failure” bug.Also applies to: 658-698
267-321: ValidateenvIdagainst batch metadata before enqueueing items
enqueueBatchItem()currently trusts the caller-providedenvIdfor routing:const meta = await this.completionTracker.getMeta(batchId); // ... const queueId = this.#makeQueueId(envId, batchId); await this.fairQueue.enqueue({ queueId, tenantId: envId, ... });But
metaalready contains the authoritativeenvironmentIdfor the batch. If a caller accidentally passes the wrongenvId, items will be enqueued under the wrong environment’s queue while the DB batch still belongs to the original environment, breaking fairness/concurrency guarantees and making debugging very hard.Add a strict check before routing:
const meta = await this.completionTracker.getMeta(batchId); if (!meta) { throw new Error(`Batch ${batchId} not found or not initialized`); } + + if (meta.environmentId !== envId) { + throw new Error( + `Batch ${batchId} belongs to environment ${meta.environmentId} but enqueue was attempted for ${envId}` + ); + }This keeps routing consistent with stored metadata and protects against cross-environment bugs.
internal-packages/run-engine/src/engine/index.ts (1)
84-85: Guard BatchQueue initialization and fix Redis keyPrefix orderingTwo issues here:
BatchQueue is created even when
options.batchQueueisundefined, which meansredisbecomes{ keyPrefix: "batch-queue:" }andcreateRedisClientwill fall back to its own host/port defaults (typicallylocalhost:6379). That’s risky in production: RunEngine will silently connect a second client to the wrong Redis ifbatchQueueisn’t explicitly configured. Either:
- Make
options.batchQueuerequired and fail fast if it’s missing, or- Treat it as truly optional and skip BatchQueue initialization (and gate the public batch methods accordingly).
The
redis.keyPrefixoverride is backwards:redis: { keyPrefix: `${options.batchQueue?.redis.keyPrefix ?? ""}batch-queue:`, ...options.batchQueue?.redis, },The spread comes after
keyPrefix, so any caller-providedkeyPrefixoverwrites your"…batch-queue:"suffix. This can cause key collisions with other queues. Flip the order so you always append the suffix:- this.batchQueue = new BatchQueue({ - redis: { - keyPrefix: `${options.batchQueue?.redis.keyPrefix ?? ""}batch-queue:`, - ...options.batchQueue?.redis, - }, + if (!options.batchQueue) { + throw new Error("RunEngine batchQueue configuration is required"); + } + + this.batchQueue = new BatchQueue({ + redis: { + ...options.batchQueue.redis, + keyPrefix: `${options.batchQueue.redis.keyPrefix ?? ""}batch-queue:`, + }, drr: { - quantum: options.batchQueue?.drr?.quantum ?? 5, - maxDeficit: options.batchQueue?.drr?.maxDeficit ?? 50, + quantum: options.batchQueue.drr?.quantum ?? 5, + maxDeficit: options.batchQueue.drr?.maxDeficit ?? 50, }, consumerCount: options.batchQueue?.consumerCount ?? 2, consumerIntervalMs: options.batchQueue?.consumerIntervalMs ?? 100, defaultConcurrency: options.batchQueue?.defaultConcurrency ?? 10, globalRateLimiter: options.batchQueue?.globalRateLimiter, startConsumers, tracer: options.tracer, meter: options.meter, });(If you prefer keeping
batchQueueoptional, drop the throw and only constructBatchQueuewhenoptions.batchQueueis set, plus null‑check in the public batch methods.)Also applies to: 321-348
packages/redis-worker/src/fair-queue/index.ts (2)
762-832: Fix StoredMessage usage: id vs messageId, and direct-mode message processingThere are a few tightly related correctness bugs here:
- Using
message.messageIdinstead ofmessage.id
StoredMessagedefinesid: stringbut nomessageId. In both two-stage and direct modes you usemessage.messageIdfor concurrency and worker-queue keys:// In #claimAndPushToWorkerQueue const reserved = await this.concurrencyManager.reserve(descriptor, message.messageId); await this.visibilityManager.release(message.messageId, queueId, queueKey, queueItemsKey); const messageKey = `${message.messageId}:${queueId}`; // In #processOneMessage const reserved = await this.concurrencyManager.reserve(descriptor, message.messageId); await this.visibilityManager.release(message.messageId, queueId, queueKey, queueItemsKey);Because
message.messageIdisundefined, you end up reserving and releasing concurrency under inconsistent keys, and building worker-queue keys like"undefined:queue". This will leak concurrency slots and corrupt worker-queue behavior.
- Direct mode calls
#processMessagewithmessage.payloadinstead of the StoredMessage
#processMessagenow expects a fullStoredMessage:async #processMessage(loopId: string, storedMessage: StoredMessage<...>, queueId: string)But in
#processOneMessageyou call:await this.#processMessage(loopId, message.payload, queueId);which passes only the payload. This will fail type-checking and, at runtime, break visibility/concurrency handling inside
#processMessage.A minimal, consistent fix is:
@@ async #claimAndPushToWorkerQueue(...) - if (this.concurrencyManager) { - const reserved = await this.concurrencyManager.reserve(descriptor, message.messageId); + if (this.concurrencyManager) { + const reserved = await this.concurrencyManager.reserve(descriptor, message.id); if (!reserved) { // Release message back to queue - await this.visibilityManager.release(message.messageId, queueId, queueKey, queueItemsKey); + await this.visibilityManager.release(message.id, queueId, queueKey, queueItemsKey); return false; } } @@ - const workerQueueId = message.payload.workerQueue ?? queueId; + const workerQueueId = message.workerQueue ?? queueId; @@ - const messageKey = `${message.messageId}:${queueId}`; + const messageKey = `${message.id}:${queueId}`;@@ async #processOneMessage(...) - if (this.concurrencyManager) { - const reserved = await this.concurrencyManager.reserve(descriptor, message.messageId); + if (this.concurrencyManager) { + const reserved = await this.concurrencyManager.reserve(descriptor, message.id); if (!reserved) { // Release message back to queue - await this.visibilityManager.release(message.messageId, queueId, queueKey, queueItemsKey); + await this.visibilityManager.release(message.id, queueId, queueKey, queueItemsKey); return false; } } @@ - await this.#processMessage(loopId, message.payload, queueId); + await this.#processMessage(loopId, message, queueId);This makes concurrency reservation/release and worker-key construction consistent with
StoredMessageand ensures direct mode uses the same processing path as worker-queue mode.Also applies to: 1003-1067, 1073-1221, 1223-1349
838-889: Worker-queue routing likely never matches consumer queuesIn worker-queue mode:
- Producers push messages to a queue derived from the
StoredMessage:// After fixing as above: const workerQueueId = message.workerQueue ?? queueId; await this.workerQueueManager!.push(workerQueueId, messageKey);
- But consumers always block-pop from
worker-{consumerId}:const loopId = `worker-${consumerId}`; const workerQueueId = loopId; // Each consumer has its own worker queue by default const messageKey = await this.workerQueueManager!.blockingPop( workerQueueId, this.workerQueueBlockingTimeoutSeconds, this.abortController.signal );Unless your
workerQueueResolverhappens to return"worker-0","worker-1", etc., messages will be pushed to one set of keys (e.g., per-queue or per-tenant) and consumed from a completely different set (worker-{n}), so worker-queue mode will sit idle.You’ll want a single, consistent routing scheme. Two concrete options:
- Per-consumer worker queues: keep
worker-{consumerId}naming, but haveworkerQueueResolver(or an internal helper) map each message deterministically to a specific consumer ID and setstoredMessage.workerQueue = worker-{id}; or- Per-queue/per-tenant worker queues: have consumers block-pop from the same IDs
workerQueueResolverwrites to (e.g., via akeys.workerQueueKey(...)helper), so all consumers can process from those queues.Right now the producer/consumer queue IDs don’t line up, so pick one model and align both
push()andblockingPop()with it, plus add tests to lock the behavior in.Also applies to: 1516-1620
🧹 Nitpick comments (11)
apps/webapp/app/routes/api.v1.batches.$batchId.ts (1)
41-52: Response extensions look good; consider slightly more defensiveerrorsaccessThe new
successfulRunCount,failedRunCount, anderrorsfields are added in a backward‑compatible way and the?? undefinedusage ensures optional fields are omitted from the JSON when not present.If you want this to be a bit more future‑proof against changes to the
includeabove, you could guard theerrorsarray access with optional chaining:- errors: - batch.errors.length > 0 - ? batch.errors.map((err) => ({ + errors: + batch.errors?.length + ? batch.errors.map((err) => ({ index: err.index, taskIdentifier: err.taskIdentifier, error: err.error, errorCode: err.errorCode ?? undefined, })) : undefined,Optionally, you might also tweak the comment to reflect that errors will be included for any batch with errors, not just
PARTIAL_FAILEDstatuses.packages/core/src/v3/idempotencyKeys.ts (1)
50-52: Add JSDoc documentation tomakeIdempotencyKeyexplaining its behavior and scope.The function
makeIdempotencyKeyexplicitly usesscope: "run"to align withcreateIdempotencyKey's default behavior. However, unlikecreateIdempotencyKey, which has comprehensive JSDoc explaining the scope options and implications,makeIdempotencyKeylacks any documentation. This makes it unclear to callers what scope behavior to expect.Add JSDoc similar to
createIdempotencyKey(lines 55-89) to document:
- The purpose of the function
- That it uses run-scoped idempotency keys by default
- How scope affects key derivation
- Usage examples for batch and non-batch scenarios
apps/webapp/seed.mts (1)
236-248: Consider reducing verbosity for existing projects (optional).The environment listing is logged every time
findOrCreateProjectis called, even when the project already exists. While this is useful for local development to see API keys, it can be verbose when re-running the seed.Consider only logging environments when the project is newly created, or add a flag to control verbosity:
console.log(`✅ Project ready: ${project.name} (${project.externalRef})`); -// list environments for this project -const environments = await prisma.runtimeEnvironment.findMany({ - where: { projectId: project.id }, - select: { - slug: true, - type: true, - apiKey: true, - }, -}); -console.log(` Environments for ${project.name}:`); -for (const env of environments) { - console.log(` - ${env.type.toLowerCase()} (${env.slug}): ${env.apiKey}`); -} +// list environments for this project +const environments = await prisma.runtimeEnvironment.findMany({ + where: { projectId: project.id }, + select: { + slug: true, + type: true, + apiKey: true, + }, +}); + +if (process.env.VERBOSE_SEED) { + console.log(` Environments for ${project.name}:`); + for (const env of environments) { + console.log(` - ${env.type.toLowerCase()} (${env.slug}): ${env.apiKey}`); + } +} return { project, environments };apps/webapp/test/engine/streamBatchItems.test.ts (1)
319-343: Clever race condition simulation via Prisma mock.The
racingPrismaobject interceptsupdateManyto simulate a concurrent request winning the race. Theas unknown as PrismaClientcast is necessary for this mocking approach.Consider extracting this pattern into a reusable test utility if similar race condition tests are needed elsewhere.
packages/redis-worker/src/fair-queue/workerQueue.ts (2)
93-150: Minor: Abort listener not removed on normal completion.The abort event listener is added with
{ once: true }, but ifblpopcompletes normally (without abort), the listener remains attached to the signal until it's aborted or garbage collected. For short-lived signals this is fine, but consider explicitly removing the listener in thefinallyblock for cleaner resource management.async blockingPop( workerQueueId: string, timeoutSeconds: number, signal?: AbortSignal ): Promise<string | null> { const workerQueueKey = this.keys.workerQueueKey(workerQueueId); // Create a separate client for blocking operation // This is required because BLPOP blocks the connection const blockingClient = this.redis.duplicate(); + let cleanup: (() => void) | undefined; + try { // Set up abort handler if (signal) { - const cleanup = () => { + cleanup = () => { blockingClient.disconnect(); }; signal.addEventListener("abort", cleanup, { once: true }); if (signal.aborted) { return null; } } const result = await blockingClient.blpop(workerQueueKey, timeoutSeconds); // ... rest of try block } catch (error) { // ... error handling } finally { + if (signal && cleanup) { + signal.removeEventListener("abort", cleanup); + } await blockingClient.quit().catch(() => { // Ignore quit errors (may already be disconnected) }); } }
231-274: Consider extracting duplicated Lua script to a constant.The
popWithLengthLua script is duplicated between#registerCommands()(lines 235-248) andregisterCommands()(lines 257-272). Consider extracting to a module-level constant for DRY compliance and easier maintenance.+const POP_WITH_LENGTH_LUA = ` +local workerQueueKey = KEYS[1] + +-- Pop the first message +local messageKey = redis.call('LPOP', workerQueueKey) +if not messageKey then + return nil +end + +-- Get remaining queue length +local queueLength = redis.call('LLEN', workerQueueKey) + +return {messageKey, queueLength} +`; + export class WorkerQueueManager { // ... #registerCommands(): void { this.redis.defineCommand("popWithLength", { numberOfKeys: 1, - lua: ` -local workerQueueKey = KEYS[1] -... - `, + lua: POP_WITH_LENGTH_LUA, }); } registerCommands(redis: Redis): void { redis.defineCommand("popWithLength", { numberOfKeys: 1, - lua: ` -local workerQueueKey = KEYS[1] -... - `, + lua: POP_WITH_LENGTH_LUA, }); } }packages/redis-worker/src/fair-queue/types.ts (1)
461-492: Alignmetadatatypes between enqueue and queue message interfaces.
EnqueueOptions.metadataandEnqueueBatchOptions.metadatauseRecord<string, string>(lines 473, 491), whileQueueMessage.metadata,StoredMessage.metadata, andQueueDescriptor.metadatauseRecord<string, unknown>(lines 37, 55, 78). The metadata is passed through directly without conversion, creating type friction.Change
EnqueueOptionsandEnqueueBatchOptionsto useRecord<string, unknown>for consistency, unless string-only values are required for a specific reason (e.g., Redis serialization constraints), in which case document this constraint and validate on enqueue.apps/webapp/test/engine/triggerTask.test.ts (1)
782-782: Consider using explicit wait conditions instead of fixedsetTimeout(500).The fixed 500ms delays may cause flaky tests under load or slow CI environments. Consider polling for expected state or using event-based synchronization where possible.
Also applies to: 812-812, 887-887
packages/redis-worker/src/fair-queue/tests/workerQueue.test.ts (1)
8-8: Consider movingkeysinitialization into each test.The
keysvariable is declared at describe scope but reassigned in every test. Since eachredisTestcreates a fresh instance anyway, declaring it locally within each test would be cleaner and more explicit.describe("WorkerQueueManager", () => { - let keys: FairQueueKeyProducer; - describe("push and pop", () => { redisTest( "should push and pop a single message", { timeout: 10000 }, async ({ redisOptions }) => { - keys = new DefaultFairQueueKeyProducer({ prefix: "test" }); + const keys = new DefaultFairQueueKeyProducer({ prefix: "test" });packages/redis-worker/src/fair-queue/schedulers/roundRobin.ts (1)
146-149: Consider adding TTL to lastServed key.The
lastServedIndexkey persists indefinitely. If shards are removed or renamed, stale keys could accumulate. Consider adding an expiry (e.g., 24 hours) to auto-cleanup:async #setLastServedIndex(shardKey: string, index: number): Promise<void> { const key = this.#lastServedKey(shardKey); - await this.redis.set(key, index.toString()); + await this.redis.set(key, index.toString(), "EX", 86400); // 24h TTL }apps/webapp/app/runEngine/services/streamBatchItems.server.ts (1)
39-42: Streaming batch sealing and response semantics look correct nowThe service now:
- Validates
batchFriendlyIdviaparseBatchFriendlyId()and returns a 400-styleServiceValidationErroron malformed IDs.- Ensures the batch exists, is unsealed, and
PENDINGbefore accepting items.- Tracks
itemsAcceptedvsitemsDeduplicated, then comparesgetBatchEnqueuedCount(batchId)tobatch.runCountand, on mismatch, returns a response withsealed: false,enqueuedCount, andexpectedCountso clients can retry missing items instead of silently “succeeding”.- Seals via a conditional
updateMany(sealed=falseandstatus="PENDING"), handles concurrent sealers by re-reading the row, and surfaces an explicit error if the batch ends up in an unexpected state.This addresses the earlier race and “no sealed flag” issues cleanly; the flow is consistent with the v3
StreamBatchItemsResponseschema.One small nit:
StreamBatchItemsServiceOptionsis accepted but not used insidecall(). If size limits are enforced solely in the route viacreateNdjsonParserStream(maxItemBytes), consider either dropping theoptionsparameter here or documenting it as reserved for future behavior to avoid confusion.Also applies to: 48-107, 108-190, 192-277
| if (updates) { | ||
| organization = await prisma.organization.update({ | ||
| where: { id: organization.id }, | ||
| data: updates, | ||
| }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unconditional update may cause unexpected overwrites.
The updates parameter is applied even when the organization already exists, which means re-running the seed will overwrite existing batchQueueConcurrencyConfig values. This violates the "find or create" semantics and could reset intentional configuration changes made during testing.
Consider applying updates only during creation:
if (!organization) {
console.log(`Creating organization: ${title}...`);
organization = await createOrganization({
title: title,
userId: user.id,
companySize: "1-10",
});
-}
-
-if (updates) {
- organization = await prisma.organization.update({
- where: { id: organization.id },
- data: updates,
- });
+
+ if (updates) {
+ organization = await prisma.organization.update({
+ where: { id: organization.id },
+ data: updates,
+ });
+ }
+} else {
+ console.log(`✅ Organization already exists: ${title}`);
}Alternatively, if updates should always be applied, rename the function to findOrCreateAndUpdateOrganization to make the behavior explicit.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In apps/webapp/seed.mts around lines 194-199 the code unconditionally applies
`updates` via prisma.update which overwrites existing organization fields on
re-seed; change behavior to preserve "find or create" semantics by only applying
`updates` when the organization was newly created (i.e., move the update into
the creation branch or set `data` with `...updates` only in the create call), or
if the intent is to always modify existing records, rename the function to
`findOrCreateAndUpdateOrganization` and keep the update but update the function
name and its call sites to reflect the new contract.
| error.message.includes("exceeds maximum size") | ||
| ) { | ||
| return json({ error: error.message }, { status: 400 }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Invalid UTF-8 errors return wrong HTTP status code
The NDJSON parser in streamBatchItems.server.ts can throw "Invalid UTF-8" errors when the client sends malformed data, but the error handling in this route only checks for "Invalid JSON" and "exceeds maximum size" to return HTTP 400. Invalid UTF-8 errors fall through to return HTTP 500 with x-should-retry: false, which is semantically incorrect since encoding errors are client-side issues (not server errors). The error.message.includes() check is missing "Invalid UTF-8" as a condition for returning 400.
| batchId, | ||
| error: error instanceof Error ? error.message : String(error), | ||
| }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Batch completion callback swallows errors preventing retry
The batch completion callback catches errors but doesn't re-throw them. The BatchQueue.#finalizeBatch method (in batch-queue/index.ts lines 681-697) is designed to preserve Redis data for retry only when the callback throws an error. When the error is swallowed here, the BatchQueue proceeds to call cleanup(batchId), deleting the Redis data needed for retry. If the Postgres transaction fails (e.g., database temporarily unavailable), the batch gets stuck in PROCESSING state with no recovery path since the completion data is gone.
New batch trigger system with larger payloads, streaming ingestion, larger batch sizes, and a fair processing system.
This PR introduces a new
FairQueueabstraction inspired by our ownRunQueuethat enables multi-tenant fair queueing with concurrency limits. The newBatchQueueis built on top of theFairQueue, and handles processing Batch triggers in a fair manner with per-environment concurrency limits defined per-org. Additionally, there is a global concurrency limit to prevent the BatchQueue system from creating too many runs too quickly, which can cause downstream issues.For this new BatchQueue system we have a completely new batch trigger creation and ingestion system. Previously this was a single endpoint with a single JSON body that defined details about the batch as well as all the items in the batch.
We're introducing a two-phase batch trigger ingestion system. In the first phase, the BatchTaskRun record is created (and possibly rate limited). The second phase is another endpoint that accepts an NDJSON body with each line being a single item/run with payload and options.
At ingestion time all items are added to a queue, in order, and then processed by the BatchQueue system.
New batch trigger rate limits
This PR implements a new batch trigger specific rate limit, configured on the
Organization.batchRateLimitConfigcolumn, and defaults using these environment variables:BATCH_RATE_LIMIT_REFILL_RATEdefaults to 10BATCH_RATE_LIMIT_REFILL_INTERVALthe duration interval, defaults to"10s"BATCH_RATE_LIMIT_MAXdefaults to 1200This rate limiter is scoped to the environment ID and controls how many runs can be submitted via batch triggers per interval. The SDK handles the retrying side.
Batch queue concurrency limits
The new column
Organization.batchQueueConcurrencyConfignow defines an org specificprocessingConcurrencyvalue, with a backup of the env varBATCH_CONCURRENCY_LIMIT_DEFAULTwhich defaults to 10. This controls how many batch queue items are processed concurrently per environment.There is also a global rate limit for the batch queue set via the
BATCH_QUEUE_GLOBAL_RATE_LIMITwhich defaults to being disabled. If set, the entire batch queue system won't process more thanBATCH_QUEUE_GLOBAL_RATE_LIMITitems per second. This allows controlling the maximum number of runs created per second via batch triggers.Batch trigger limits
STREAMING_BATCH_MAX_ITEMScontrols the maximum number of items in a single batchSTREAMING_BATCH_ITEM_MAXIMUM_SIZEcontrols the maximum size of each item in a batch