Add decision document for new Azure upload flow with user confirmation#236
Add decision document for new Azure upload flow with user confirmation#236kishor-gupta wants to merge 18 commits intomainfrom
Conversation
Reviewer's GuideAdds an architectural decision record describing a new deferred Azure Blob upload flow that ties blob creation and malware scanning to explicit user confirmation and a successful database transaction, replacing immediate uploads on file selection to avoid orphaned files and improve data integrity. Sequence diagram for deferred Azure Blob upload with user confirmationsequenceDiagram
actor User
participant Frontend
participant Backend
participant BlobStorage
participant MalwareScanner
participant Database
User->>Frontend: Select file and fill form
Frontend->>Frontend: Validate file and store locally
User->>Frontend: Click SaveAndContinue
Frontend->>Backend: RequestSASToken(fileName, fileType, fileSize)
Backend->>Backend: BuildBlobPathAndMetadata
Backend-->>Frontend: AuthResult(blobUrl, sasToken, tags, metadata)
Frontend->>BlobStorage: PUTFileBytes(headers, auth, tags, metadata)
BlobStorage-->>Frontend: Created(versionId)
BlobStorage->>MalwareScanner: TriggerScan(versionId)
MalwareScanner-->>Backend: ScanResult(clean)
alt ScanClean
Backend->>Database: SaveFormAndBlobLink(formData, blobReference)
alt DbSaveSuccess
Database-->>Backend: SaveSuccess
Backend-->>Frontend: Success(recordAndBlobCommitted)
Frontend-->>User: ShowSuccessAndNavigate
else DbSaveFailure
Database-->>Backend: SaveError
Backend->>BlobStorage: DeleteBlob(versionId)
Backend-->>Frontend: Error(dbFailureBlobDeleted)
Frontend-->>User: ShowErrorFormPreserved
end
else ScanMalicious
MalwareScanner-->>Backend: ScanResult(malicious)
Backend->>BlobStorage: DeleteBlob(versionId)
Backend-->>Frontend: Error(maliciousFileDetected)
Frontend-->>User: ShowReuploadPromptFormPreserved
end
Flow diagram for transactional upload, scan, and save decision logicflowchart TD
A[User selects file and fills form] --> B[Frontend validates file and stores locally]
B --> C[User clicks SaveAndContinue]
C --> D[Backend issues SAS token and blob metadata]
D --> E[Frontend uploads file to AzureBlobStorage]
E --> F[MalwareScanner scans uploaded file]
F --> G{Scan result}
G -->|Clean| H[Backend attempts DB transaction to save form and blob reference]
G -->|Malicious| I[Backend deletes blob and returns malicious file error]
H --> J{DB transaction result}
J -->|Success| K[Record and blob link committed]
J -->|Failure| L[Backend deletes blob and returns DB error]
I --> M[Frontend shows reupload prompt, form preserved]
K --> N[Frontend shows success and navigates]
L --> O[Frontend shows error, form preserved]
M --> P[User may select a new file and retry]
O --> P
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- Consider adding a note about handling large file uploads and memory constraints (e.g., using IndexedDB or temporary local storage instead of in-memory).
- Outline retry and error handling strategies for deferred uploads after user confirmation to ensure reliability in case of network failures.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider adding a note about handling large file uploads and memory constraints (e.g., using IndexedDB or temporary local storage instead of in-memory).
- Outline retry and error handling strategies for deferred uploads after user confirmation to ensure reliability in case of network failures.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
|
|
||
| Chosen option: **Option 2: Deferred upload triggered by user confirmation. Transactional Upload + Scan + Save.** | ||
| Upload, scan, and save now execute as a single atomic workflow. | ||
| No data is finalized unless every stage succeeds. Cleanup and user feedback happen immediately on failure. |
There was a problem hiding this comment.
Double check with nick about this point, basically you are saying this entire series of events is all or nothing
There was a problem hiding this comment.
I'll add onto this that it may not be a one size fits all solution. We might have to consider different circumstances of when it would be beneficial to partially save, versus clearing everything. Like for example, on a profile form with multiple inputs and an upload button for your profile image, if the user fills out the form and uploads, would we want to reject all of the other valid form inputs if the blob upload failed for whatever reason?
There was a problem hiding this comment.
in this case, we can preserve the form data in state, and also page doesn't refresh/redirect until it success so we don't have to worry about user have to fill those info again. So even if the blob upload fails, the user won’t lose their entered information and can just retry the file upload without having to re-enter the rest of the details.
…se transaction for enhanced data integrity and user feedback
…ng user clarity on file persistence linked to save actions.
|
|
||
| Chosen option: **Option 2: Deferred upload triggered by user confirmation. Transactional Upload + Scan + Save.** | ||
| Upload, scan, and save now execute as a single atomic workflow. | ||
| No data is finalized unless every stage succeeds. Cleanup and user feedback happen immediately on failure. |
There was a problem hiding this comment.
I'll add onto this that it may not be a one size fits all solution. We might have to consider different circumstances of when it would be beneficial to partially save, versus clearing everything. Like for example, on a profile form with multiple inputs and an upload button for your profile image, if the user fills out the form and uploads, would we want to reject all of the other valid form inputs if the blob upload failed for whatever reason?
…rity and consistency in positive and negative impacts.
Fixes #234
…ncing security and clarity in the process.
… of upload, scan, and save, ensuring no data is finalized unless all stages succeed, and improving user experience by retaining validated form inputs during upload failures.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In the mock participants route,
PageSizedefaults to'50'but falls back with|| 0, so a missing/invalid PageSize results inpageSize = 0and empty result sets; consider defaulting to 50 consistently (e.g.|| 50) or relying on the store’s default. - In
ServiceMessagingTwilio.startUp, the client is only instantiated whenNODE_ENV === 'development', leaving the service unusable (and throwing at call sites) in other environments; if this is meant as a feature flag, it might be clearer to either fail fast instartUpor gate the whole service behind an explicit configuration flag. - The Twilio service constructor currently ignores the commented-out environment-variable code and always sets
accountSid/authTokento empty strings by default, which only surfaces as runtime errors later; consider restoring the env-based configuration or enforcing required constructor arguments so misconfiguration fails earlier.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In the mock participants route, `PageSize` defaults to `'50'` but falls back with `|| 0`, so a missing/invalid PageSize results in `pageSize = 0` and empty result sets; consider defaulting to 50 consistently (e.g. `|| 50`) or relying on the store’s default.
- In `ServiceMessagingTwilio.startUp`, the client is only instantiated when `NODE_ENV === 'development'`, leaving the service unusable (and throwing at call sites) in other environments; if this is meant as a feature flag, it might be clearer to either fail fast in `startUp` or gate the whole service behind an explicit configuration flag.
- The Twilio service constructor currently ignores the commented-out environment-variable code and always sets `accountSid`/`authToken` to empty strings by default, which only surfaces as runtime errors later; consider restoring the env-based configuration or enforcing required constructor arguments so misconfiguration fails earlier.
## Individual Comments
### Comment 1
<location> `packages/sthrift/mock-messaging-server/dist/src/routes/participants.js:62` </location>
<code_context>
+ // biome-ignore lint/complexity/useLiteralKeys: Required by TypeScript noPropertyAccessFromIndexSignature
+ const page = Number.parseInt(req.query['Page'] ?? '0', 10) || 0;
+ // biome-ignore lint/complexity/useLiteralKeys: Required by TypeScript noPropertyAccessFromIndexSignature
+ const pageSize = Number.parseInt(req.query['PageSize'] ?? '50', 10) || 0;
+ const conversation = store.getConversation(conversationSid);
+ if (!conversation) {
</code_context>
<issue_to_address>
**issue (bug_risk):** Defaulting `pageSize` to 0 breaks pagination and can lead to inconsistent responses.
Using `|| 0` means any missing, zero, or NaN `PageSize` produces `pageSize = 0`, so you always request `slice(start, 0)` while `hasNextPage` may still be `true`. This makes the pagination metadata inconsistent. To match the other routes and keep behaviour coherent, `pageSize` should default to `50` instead of `0`.
</issue_to_address>
### Comment 2
<location> `packages/sthrift/mock-messaging-server/dist/src/index.js:45` </location>
<code_context>
+ more_info: 'https://www.twilio.com/docs/errors/20404',
+ });
+ });
+ app.use((err, _req, res) => {
+ console.error('Unhandled error:', err);
+ res.status(500).json({
</code_context>
<issue_to_address>
**issue (bug_risk):** Express error handler is missing the `next` parameter, so it will not be treated as an error-handling middleware.
Express only treats middleware with four parameters `(err, req, res, next)` as error handlers. With the current `(err, _req, res)` signature, this will behave like normal middleware and won’t be called for errors. Add the `next` parameter (even if unused), e.g. `app.use((err, _req, res, _next) => { ... })` so it runs on errors as intended.
</issue_to_address>
### Comment 3
<location> `packages/sthrift/mock-messaging-server/dist/src/store.d.ts:1` </location>
<code_context>
+import type { Conversation, Message, Participant } from './types.ts';
+declare class MockMessagingStore {
+ private readonly conversations;
</code_context>
<issue_to_address>
**issue (bug_risk):** The `.d.ts` file imports from `./types.ts`, which is unlikely to exist in the published dist and may break type resolution.
In the emitted dist you have `types.d.ts` / `types.js`, but this declaration file points to `./types.ts`. That file won’t exist in the published package for consumers, so the import should target the built artifact instead (e.g. `./types.js` or `./types`, depending on how your bundler resolves extensions).
</issue_to_address>
### Comment 4
<location> `packages/sthrift/messaging-service-twilio/dist/src/index.d.ts:10` </location>
<code_context>
+ private readonly accountSid;
+ private readonly authToken;
+ constructor(accountSid?: string, authToken?: string);
+ startUp(): Promise<Exclude<ServiceMessagingTwilio, ServiceBase>>;
+ shutDown(): Promise<void>;
+ get service(): TwilioClient;
</code_context>
<issue_to_address>
**issue (bug_risk):** The declared return type of `startUp()` does not match the runtime implementation, which returns a plain instance rather than a Promise.
This mismatch means `startUp()` returns `this` synchronously, so `service.startUp().then(...)` will fail at runtime because the result has no `.then`. Either change the implementation to be truly async or update the `.d.ts` return type to reflect the synchronous behavior and avoid this runtime contract violation.
</issue_to_address>
### Comment 5
<location> `apps/docs/docs/decisions/0023-new-azure-upload.md:55` </location>
<code_context>
+## Consequences
+- Good: Orphaned and unreferenced files are eliminated.
+- Good: Ensures consistent and secure linkage between uploaded blobs and saved data.
+- Good: Regulatory and operational risks are minimized (by ensuring that only properly scanned: malware-free files are stored and linked to committed database records).
+- Good: No background sweeps or delayed cleanups required.
+- Good: Improves user clarity by creating a direct, predictable link between user actions (“Save”) and data persistence — users know their file and form are saved together, or not at all.
</code_context>
<issue_to_address>
**issue (typo):** Fix punctuation in "properly scanned: malware-free files"
The colon in that phrase reads as a punctuation error. Please switch to a comma ("properly scanned, malware-free files") or parentheses ("properly scanned (malware-free) files") to improve grammar and clarity.
```suggestion
- Good: Regulatory and operational risks are minimized (by ensuring that only properly scanned, malware-free files are stored and linked to committed database records).
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| // biome-ignore lint/complexity/useLiteralKeys: Required by TypeScript noPropertyAccessFromIndexSignature | ||
| const page = Number.parseInt(req.query['Page'] ?? '0', 10) || 0; | ||
| // biome-ignore lint/complexity/useLiteralKeys: Required by TypeScript noPropertyAccessFromIndexSignature | ||
| const pageSize = Number.parseInt(req.query['PageSize'] ?? '50', 10) || 0; |
There was a problem hiding this comment.
issue (bug_risk): Defaulting pageSize to 0 breaks pagination and can lead to inconsistent responses.
Using || 0 means any missing, zero, or NaN PageSize produces pageSize = 0, so you always request slice(start, 0) while hasNextPage may still be true. This makes the pagination metadata inconsistent. To match the other routes and keep behaviour coherent, pageSize should default to 50 instead of 0.
| more_info: 'https://www.twilio.com/docs/errors/20404', | ||
| }); | ||
| }); | ||
| app.use((err, _req, res) => { |
There was a problem hiding this comment.
issue (bug_risk): Express error handler is missing the next parameter, so it will not be treated as an error-handling middleware.
Express only treats middleware with four parameters (err, req, res, next) as error handlers. With the current (err, _req, res) signature, this will behave like normal middleware and won’t be called for errors. Add the next parameter (even if unused), e.g. app.use((err, _req, res, _next) => { ... }) so it runs on errors as intended.
| @@ -0,0 +1,34 @@ | |||
| import type { Conversation, Message, Participant } from './types.ts'; | |||
There was a problem hiding this comment.
issue (bug_risk): The .d.ts file imports from ./types.ts, which is unlikely to exist in the published dist and may break type resolution.
In the emitted dist you have types.d.ts / types.js, but this declaration file points to ./types.ts. That file won’t exist in the published package for consumers, so the import should target the built artifact instead (e.g. ./types.js or ./types, depending on how your bundler resolves extensions).
| private readonly accountSid; | ||
| private readonly authToken; | ||
| constructor(accountSid?: string, authToken?: string); | ||
| startUp(): Promise<Exclude<ServiceMessagingTwilio, ServiceBase>>; |
There was a problem hiding this comment.
issue (bug_risk): The declared return type of startUp() does not match the runtime implementation, which returns a plain instance rather than a Promise.
This mismatch means startUp() returns this synchronously, so service.startUp().then(...) will fail at runtime because the result has no .then. Either change the implementation to be truly async or update the .d.ts return type to reflect the synchronous behavior and avoid this runtime contract violation.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- The document describes upload + malware scan + DB save as a single atomic transaction, but malware scanning is typically asynchronous; consider clarifying the actual transactional boundaries and failure modes (e.g., how long-running or async scans are handled and what ‘atomic’ precisely guarantees).
- It might be worth calling out how retries and idempotency are handled for the ‘Save and Continue’ flow (e.g., user double-clicks, network retries, or partial failures) to avoid duplicate uploads or records while still guaranteeing cleanup on failure.
- Since files are held locally until confirmation, consider adding a short note on maximum file sizes, memory constraints, and whether any browser-side persistence (e.g., IndexedDB) is needed to avoid issues on slower devices or large uploads.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The document describes upload + malware scan + DB save as a single atomic transaction, but malware scanning is typically asynchronous; consider clarifying the actual transactional boundaries and failure modes (e.g., how long-running or async scans are handled and what ‘atomic’ precisely guarantees).
- It might be worth calling out how retries and idempotency are handled for the ‘Save and Continue’ flow (e.g., user double-clicks, network retries, or partial failures) to avoid duplicate uploads or records while still guaranteeing cleanup on failure.
- Since files are held locally until confirmation, consider adding a short note on maximum file sizes, memory constraints, and whether any browser-side persistence (e.g., IndexedDB) is needed to avoid issues on slower devices or large uploads.
## Individual Comments
### Comment 1
<location> `apps/docs/docs/decisions/0023-new-azure-upload.md:55` </location>
<code_context>
+## Consequences
+- Good: Orphaned and unreferenced files are eliminated.
+- Good: Ensures consistent and secure linkage between uploaded blobs and saved data.
+- Good: Regulatory and operational risks are minimized (by ensuring that only properly scanned: malware-free files are stored and linked to committed database records).
+- Good: No background sweeps or delayed cleanups required.
+- Good: Improves user clarity by creating a direct, predictable link between user actions (“Save”) and data persistence — users know their file and form are saved together, or not at all.
</code_context>
<issue_to_address>
**issue (typo):** Replace the colon in "properly scanned: malware-free" with a comma or dash for correct grammar.
The colon makes this phrase read awkwardly. Please change it to use a comma or em dash instead (e.g., "properly scanned, malware-free files" or "properly scanned—malware-free—files").
```suggestion
- Good: Regulatory and operational risks are minimized (by ensuring that only properly scanned, malware-free files are stored and linked to committed database records).
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Introduce a decision document outlining a new Azure upload flow that delays file uploads until user confirmation. This change aims to prevent orphaned files, reduce storage costs, and enhance data integrity by ensuring uploads align with user actions. The proposed approach improves user experience while maintaining security and compliance.
Summary by Sourcery
Documentation: