31 commits
4ccd515
Clarify SAS token terminology in Azure upload documentation - Because…
kishor-gupta Dec 4, 2025
ab97e93
Fix grammatical error in decision outcome for Azure upload documentation
kishor-gupta Dec 4, 2025
70c857d
Clarify authorization strategies for Azure Blob uploads and emphasize…
kishor-gupta Dec 4, 2025
e5934a2
Clarify terminology for Shared Key SAS in Azure upload documentation
kishor-gupta Dec 4, 2025
cde3bf1
Fix grammatical errors and improve clarity in Azure upload documentation
kishor-gupta Dec 4, 2025
1159e59
Refine authorization strategies for Azure Blob uploads, emphasizing S…
kishor-gupta Dec 4, 2025
d233d9d
Correct option labeling for Shared Key SAS tokens in Azure upload doc…
kishor-gupta Dec 4, 2025
25c5d7b
Refine wording for clarity in SAS token generation description for Az…
kishor-gupta Dec 4, 2025
c58ea43
Refine documentation to clarify the use of Valet Key pattern (Shared …
kishor-gupta Dec 17, 2025
454cb0d
Enhance documentation for Azure upload by clarifying security measure…
kishor-gupta Dec 17, 2025
dec9c61
Refine Valet Key pattern documentation by updating HTML syntax for im…
kishor-gupta Dec 17, 2025
fc5ce34
Refactor Azure upload documentation for clarity and consistency; upda…
kishor-gupta Dec 17, 2025
342f22a
Fix description wording for clarity and consistency in Azure upload d…
kishor-gupta Dec 17, 2025
4b20164
Refine documentation language for clarity in Azure upload consequence…
kishor-gupta Dec 17, 2025
2e5770a
Clarify Shared Key authorization details in Azure upload documentatio…
kishor-gupta Dec 18, 2025
6f58b3b
Expand documentation on Shared Key authorization, detailing reasons f…
kishor-gupta Dec 23, 2025
3e8e0fb
Merge branch 'main' into dev-kishor/issue230-2
kishor-gupta Dec 23, 2025
4051767
Update documentation to correct 'Microsoft Defender for Cloud' to 'Mi…
kishor-gupta Dec 23, 2025
252de7f
Merge branch 'dev-kishor/issue230-2' of https://github.com/simnova/sh…
kishor-gupta Dec 23, 2025
743c391
Clarify documentation on Shared Key authorization, specifying server-…
kishor-gupta Dec 23, 2025
791be7f
Clarify documentation on the Valet Key pattern in the consequences se…
kishor-gupta Dec 23, 2025
eb8d8b7
Clarify documentation on the Valet Key pattern in the consequences se…
kishor-gupta Dec 24, 2025
deac6df
Merge branch 'main' into dev-kishor/issue230-2
kishor-gupta Dec 26, 2025
3f4608d
Refine documentation on server-signed request header generation for A…
kishor-gupta Jan 6, 2026
dd9a761
Merge branch 'dev-kishor/issue230-2' of https://github.com/simnova/sh…
kishor-gupta Jan 6, 2026
398621f
Merge branch 'main' into dev-kishor/issue230-2
kishor-gupta Jan 6, 2026
ba1a59a
Revised || Clarify reasons for not adopting Entra ID for Azure Blob S…
kishor-gupta Jan 6, 2026
30dabed
Quick Summary || Enhance documentation on Azure Blob Storage uploads,…
kishor-gupta Jan 6, 2026
7462486
Refine documentation on frontend components, removing redundancy in t…
kishor-gupta Jan 6, 2026
40885c3
Enhance security description in Azure upload documentation, detailing…
kishor-gupta Jan 6, 2026
09175fc
Clarify terminology in Azure upload documentation, defining the Valet…
kishor-gupta Jan 6, 2026
118 changes: 102 additions & 16 deletions apps/docs/docs/decisions/0022-existing-azure-upload.md
@@ -1,7 +1,7 @@
---
sidebar_position: 22
sidebar_label: 0022 Existing Azure Upload
description: "Existing Azure Upload with Direct client uploads using Azure Blob SAS tokens."
description: "Existing Azure Upload using the Valet Key architectural pattern with direct client uploads and Shared Key authorization."
status: proposed
date: 2025-10-21
deciders: gidich
@@ -11,35 +11,121 @@ informed:

# Existing Azure Upload Implementation

## TL;DR (Summary)

**Decision:** Use direct client uploads to Azure Blob Storage, authorized with Shared Key authorization and server-signed request headers (an application of the Valet Key architectural pattern).

**Key reasons:**
- Enables secure, time-limited, and permission-scoped uploads without exposing storage account keys to clients.
- Avoids the complexity and risks of SAS tokens and direct key distribution.
- Aligns with our DDD security model and allows for future domain-driven permission checks.


## Terminology

- **Valet Key architectural pattern**: A Microsoft-defined architectural pattern in which a backend service acts as a gatekeeper and grants a client narrowly scoped, limited permission to access a storage resource directly. The backend validates intent and issues temporary or constrained access, while the client performs the data transfer directly to the storage service. The Valet Key pattern is independent of the specific authorization mechanism used.

- **Shared Key authorization**: An Azure Blob Storage authentication mechanism where requests are authenticated by signing a canonical representation of the HTTP request using the storage account access key. This mechanism allows the backend to cryptographically bind authorization to exact request details such as HTTP method, resource path, headers, metadata, and index tags.

- **Server-signed request headers**: The specific implementation used in this system to apply the Valet Key pattern. The backend generates and signs the required HTTP request headers using Shared Key authorization and provides those headers to the client, which must send them unmodified when uploading directly to Azure Blob Storage.

For clarity, this ADR uses **Valet Key architectural pattern** to describe the overall design approach and **server-signed request headers (Shared Key authorization)** to describe the specific authorization mechanism chosen for direct client uploads.
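
To make "signing a canonical representation of the HTTP request" concrete, the following is a minimal Python sketch of Shared Key signing for a Put Blob request. It follows the string-to-sign layout from Azure's Shared Key documentation, but it is an illustration under stated assumptions rather than this system's implementation: the account name `exampleaccount`, the blob path, the `x-ms-meta-uploadedby` metadata name, the tag values, and the key are all hypothetical, and the blob path is assumed to be URL-encoded already.

```python
import base64
import hashlib
import hmac


def build_string_to_sign(method, account, blob_path, headers, query=None):
    """Build the Shared Key string-to-sign for a Blob service request."""
    lower = {name.lower(): value.strip() for name, value in headers.items()}
    # Content-Length is signed as an empty string when it is zero.
    content_length = lower.get("content-length", "")
    if content_length == "0":
        content_length = ""
    standard = [
        lower.get("content-encoding", ""),
        lower.get("content-language", ""),
        content_length,
        lower.get("content-md5", ""),
        lower.get("content-type", ""),
        "",  # Date stays empty because x-ms-date is sent instead
        lower.get("if-modified-since", ""),
        lower.get("if-match", ""),
        lower.get("if-none-match", ""),
        lower.get("if-unmodified-since", ""),
        lower.get("range", ""),
    ]
    # Every x-ms-* header (metadata and tags included) is part of the signature.
    canonical_headers = "".join(
        f"{name}:{lower[name]}\n" for name in sorted(lower) if name.startswith("x-ms-")
    )
    canonical_resource = f"/{account}/{blob_path}"
    for name in sorted(query or {}):
        canonical_resource += f"\n{name.lower()}:{query[name]}"
    return (
        method.upper() + "\n" + "\n".join(standard) + "\n"
        + canonical_headers + canonical_resource
    )


def sign_request(account, account_key_b64, string_to_sign):
    """Return the Authorization header value for the given string-to-sign."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return f"SharedKey {account}:{base64.b64encode(digest).decode('ascii')}"


# Hypothetical example values; the key below is not a real account key.
example_key = base64.b64encode(b"not-a-real-account-key").decode("ascii")
example_headers = {
    "x-ms-date": "Tue, 06 Jan 2026 12:00:00 GMT",
    "x-ms-version": "2021-08-06",
    "x-ms-blob-type": "BlockBlob",
    "x-ms-meta-uploadedby": "user-123",
    "x-ms-tags": "category=pdf",
    "Content-Type": "application/pdf",
    "Content-Length": "1024",
}
example_auth = sign_request(
    "exampleaccount",
    example_key,
    build_string_to_sign("PUT", "exampleaccount", "uploads/report.pdf", example_headers),
)
print(example_auth)
```

Because every `x-ms-*` header is folded into the signed string, changing a metadata or tag value on the client produces a different signature, which is the tamper-resistance property this ADR relies on.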

## Context and Problem Statement

Our application requires a secure and scalable mechanism for handling file uploads. From the start, the design approach was to leverage Azure Blob Storage as the primary file storage service due to its reliability, scalability, and seamless integration with the Azure ecosystem.

Users need to upload various file types (PDFs, images) along with metadata and tags for tracking. The system implements a direct upload flow where the client requests authorization from the backend, which issues a short-lived SAS (Shared Access Signature) token allowing the file to be uploaded directly to Azure Blob Storage.
Users need to upload various file types (PDFs, images) along with metadata and tags for tracking. The system implements a direct upload flow where the client requests authorization from the backend, which issues server-signed request headers (using Shared Key authorization) allowing the file to be uploaded directly to Azure Blob Storage.

## Decision Drivers

- **Scalability**: The system must efficiently handle large file uploads and multiple concurrent requests without overloading backend services.
- **Performance**: Direct client-to-Azure Blob uploads reduce backend latency and improve user upload speed.
- **Cost Optimization**: Offloading upload bandwidth from backend servers to Azure Blob Storage minimizes infrastructure and data transfer costs.
- **Security**: Uploads must be secure and authenticated. Short-lived SAS tokens (valet keys) provide controlled, time-bound access to the storage account.
- **Security**: Uploads must be secure and authenticated. The Valet Key pattern enables the backend to grant narrowly scoped upload permissions with strong server-side control over the blob path, headers, metadata, and index tags, preventing clients from altering or escalating the approved upload intent.
- **Malware Scanning**: Uploaded files must undergo malware scanning. Any malicious files must be identified, quarantined, and deleted immediately to maintain data integrity and user safety.

## Considered Options
## Considered Architectural Upload Options

- **Option 1: Backend-mediated uploads (server uploads to Blob Storage)**
- The client uploads files to the backend, which then transfers them to Azure Blob Storage.

- **Option 2: Direct client uploads using Azure Blob SAS tokens (current approach)**
- The backend generates a short-lived SAS token authorizing the client to upload directly to Blob Storage.
- **Option 2: Valet Key architectural pattern (direct client uploads) (current approach)**
- Client requests permission from the backend; backend validates and grants time‑bound, scope‑limited permission for a specific upload; client uploads directly to Azure Blob Storage.
- The Valet Key pattern is an architectural approach where the backend acts as a gatekeeper that grants narrowly scoped, time‑bound permission for a specific operation against storage. The client never has broad storage credentials and only performs the single intended operation within a short window using parameters the server defines (path, headers, metadata, tags, and expiry).

## Pros and Cons of the Options

### Backend-mediated uploads
- Good, because simpler client logic; backend can stream/transform data inline.
- Bad, because higher backend load and egress, potential bottlenecks, increased infrastructure cost.

### Valet Key architectural pattern
- Good, because excellent scalability, reduced backend bandwidth/cost, improved upload performance and latency, precise server control over per‑upload intent (path, metadata, tags).
- Bad, because validation and security checks occur after upload, adding complexity to post-upload workflows (e.g., malware scanning) and requiring careful orchestration of server-issued parameters and client behavior.

## Architectural Upload Decision Outcome

Chosen option: **Valet Key architectural pattern (direct client uploads) (current approach)**, because it offloads upload traffic from the backend, reduces latency, lowers cost, and maintains security.

## Considered Authorization Mechanism (within Valet Key)

- **Option 1: Short‑lived SAS token**
- A short‑lived Shared Access Signature (SAS) is a token with a very limited validity period that grants specific permissions to access a blob or container. If such a token is intercepted, it can still be misused for the duration of its lifetime, so it must be kept extremely short and carefully managed.

- **Option 2: Shared Key authorization with server-signed request headers (current approach)**
- The server signs the request using the storage account’s key (request signing), and the client sends the signed headers exactly as the backend provided them. This is not a token; it is a single request authenticated via Shared Key authorization. The client can perform one narrowly scoped upload without ever seeing the account key, and the backend controls exactly which method, blob path, headers, metadata, tags, and signing time apply to each upload request.

### Pros/Cons

### Short‑lived SAS token
- Good, because client integration is simple; permission and expiry are embedded in a single token; link‑style distribution is easy when appropriate.
- Bad, because token becomes a reusable artifact during its lifetime; needs strict expirations and storage discipline; leakage risk exists for the token window.

### Shared Key (current approach)
- Good, because no reusable token artifact; tighter server‑side control over the exact method, resource path, headers, metadata, tags, and narrow time window; matches current implementation.
- Bad, because backend must construct canonicalized request details; stronger coupling between backend‑issued headers and client upload code.

## Authorization Mechanism (within Valet Key) Decision Outcome

Chosen option: **Option 2: Shared Key authorization with server-signed request headers (current approach)**, because server-signed request headers are generated by the backend per request with the correct blob path, metadata, and tags, and are cryptographically signed so that clients or third parties cannot tamper with or escalate their permissions. Although Shared Key authorization does not include an explicit expiry field like SAS, request validity is effectively time-bounded by Azure Blob Storage through the x-ms-date header and strict clock-skew enforcement, with the backend issuing these signed headers just-in-time per upload.
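
The time bound described above can be illustrated with a short Python sketch. Azure's Shared Key documentation states that the service rejects requests older than 15 minutes; the check below approximates that rule on the client side of the discussion and is an assumption about how the window behaves, not a copy of the service's logic. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Documented maximum age for Shared Key requests; treated here as a
# symmetric window, which is an assumption made for illustration.
MAX_REQUEST_AGE = timedelta(minutes=15)


def make_x_ms_date(now=None):
    """Format a UTC timestamp as the RFC 1123 value used for x-ms-date."""
    now = now or datetime.now(timezone.utc)
    return format_datetime(now, usegmt=True)


def is_within_window(x_ms_date, now):
    """Return True if a request signed at x_ms_date would still be fresh at now."""
    signed_at = parsedate_to_datetime(x_ms_date)
    return abs(now - signed_at) <= MAX_REQUEST_AGE


signed_at = datetime(2026, 1, 6, 12, 0, 0, tzinfo=timezone.utc)
header_value = make_x_ms_date(signed_at)
print(header_value)
print(is_within_window(header_value, signed_at + timedelta(minutes=5)))
print(is_within_window(header_value, signed_at + timedelta(minutes=16)))
```

Since `x-ms-date` is part of the signed string, a client cannot extend the window by rewriting the header without invalidating the signature.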

### Why we did not choose SAS tokens

- **Insufficient granularity for headers/metadata/tags**: A SAS scopes permissions (such as read, write, create, and list) and a validity window, but it does not bind the request to specific header values like `x-ms-meta-*` or blob index tags. A SAS with `w` (write) or `c` (create) permission allows any actor possessing the token to upload any content and arbitrary metadata/tags to the scoped resource during the token lifetime.
- **Reusable artifact risk**: A leaked SAS can be used by unauthorized parties for the duration of its validity window. While short expirations help, they do not eliminate the risk of misuse within that window.
- **No content-level binding**: SAS does not bind the authorization to the server-approved payload (file digest, exact headers, or preapproved metadata/tags). With server-signed request headers, we include the precise headers the client must send; any deviation invalidates the signature.
- **Operational blast radius**: Using SAS for direct client uploads would require wide distribution of short-lived tokens and introduce complex token lifecycle management without improving our fine-grained control over what is uploaded.

We considered placing a proxy in front of uploads (client → Function App → Blob) to perform server-side validation with SAS or Entra ID, but this approach reintroduces backend bandwidth/latency bottlenecks and increases exposure to DDoS and malicious payload processing on the server. Our current design avoids those risks by keeping uploads direct while maintaining strong, per-request constraints through server-signed request headers.

### Why we have not switched to Entra ID (Azure AD)

We evaluated Entra ID–based authorization patterns (including Managed Identity and user delegation SAS) as part of the design. While Entra ID is our preferred approach for service-to-service authentication in many areas of the platform, it does not meet the specific requirements of this upload flow.

- **Header-binding requirement**: Our design relies on cryptographically binding the upload authorization to the exact canonical request, including HTTP method, blob path, headers, metadata, index tags, and timing constraints. Entra ID–based RBAC and OAuth tokens authorize *who* can access storage, but they do not provide a mechanism to bind authorization to exact per-request header values in the way Shared Key request signing does.

- **Managed Identity design choice**: Although Managed Identity can authenticate the backend to Azure Resource Manager and the storage data plane, it intentionally does not expose raw storage account keys to workloads. We deliberately chose not to retrieve or manage account keys via the control plane or alternative mechanisms, as doing so would undermine our narrowly scoped, request-level signing model.

- **User delegation SAS limitations**: Entra ID can mint user delegation SAS tokens, but these still follow the SAS model—time- and permission-scoped access that is reusable within its validity window. User delegation SAS does not bind uploads to exact server-approved headers, metadata, or tags, which is a core requirement of this system.

For uploads requiring strict, per-request binding to exact headers, metadata, and tags, Shared Key–based request signing remains the only mechanism that satisfies our constraints. We continue to prefer Entra ID where applicable and periodically re-evaluate whether future platform capabilities can meet these requirements without compromising security or control.


### Mitigations for storage account key exposure

## Decision Outcome
Although Shared Key authorization requires access to the storage account key, the associated risk is mitigated through the following controls:

Chosen option: **Option 2: Direct client uploads using Azure Blob SAS tokens (current approach)**, because it offloads upload traffic from the backend, reduces latency, lower cost, maintains security via short-lived tokens.
- **Key Vault only, no inline secrets**: The storage account key is never stored in code or configuration. It resides solely in Azure Key Vault.
- **Access via Managed Identity**: The Function App accesses the Key Vault using its Managed Identity, with least-privilege RBAC and no direct exposure of the key outside the process memory of the signing operation.
- **Network and access controls**: Key Vault firewall, private endpoints, and access policies restrict retrieval to the Function App identity. Storage account network rules further limit access.
- **Key rotation**: Regular rotation of storage account keys reduces the exposure window if a compromise were ever to occur.
- **Monitoring and scanning**: Defender for Storage malware scanning, audit logs, and alerts monitor anomalous activity; malicious uploads are quarantined and rolled back using blob versioning.
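- **Time-bounded signed requests**: Each upload is signed just-in-time with a fresh `x-ms-date` value. Azure Storage rejects Shared Key requests that are more than 15 minutes old, so a signature is only usable within that short window, and clients cannot alter the signed method, path, or headers without invalidating it.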

## Consequences

- Good, because uploading directly from the client to Azure Blob Storage significantly reduces backend bandwidth usage and infrastructure costs.
- Good, because uploading directly from the client to Azure Blob Storage using the Valet Key/direct upload pattern significantly reduces backend bandwidth usage and infrastructure costs.
- Good, because versioning support allows easy rollback in case of corruption or malicious file detection.
- Bad, because malware scanning occurs after upload, introducing a brief exposure window before a file is fully validated.

@@ -48,14 +134,14 @@ Chosen option: **Option 2: Direct client uploads using Azure Blob SAS tokens (cu
**Frontend Components:**
- Handles client-side file validation (type, size, dimensions).
- Requests authorization from the backend to upload a specific file.
- Uses the received SAS token to upload the file directly to Azure Blob Storage.
- Uses the server-signed request headers to upload the file directly to Azure Blob Storage.
- After upload, notifies the backend to trigger malware scanning and persist upload metadata.
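
As a rough illustration of the third step above, the sketch below builds the PUT request from a backend response without sending it. It is written in Python only for consistency with the other sketches in this document; the `auth_result` shape (`blob_url`, `headers`) and all values are hypothetical and are not taken from the actual frontend code.

```python
import urllib.request


def build_upload_request(auth_result, file_bytes):
    """Build the PUT request using the server-signed headers unmodified."""
    return urllib.request.Request(
        url=auth_result["blob_url"],
        data=file_bytes,
        # The headers are copied as-is; changing any signed value would make
        # the Authorization signature invalid.
        headers=dict(auth_result["headers"]),
        method="PUT",
    )


# Hypothetical backend response; the signature is a placeholder.
auth_result = {
    "blob_url": "https://exampleaccount.blob.core.windows.net/uploads/report.pdf",
    "headers": {
        "Authorization": "SharedKey exampleaccount:PLACEHOLDER",
        "x-ms-date": "Tue, 06 Jan 2026 12:00:00 GMT",
        "x-ms-version": "2021-08-06",
        "x-ms-blob-type": "BlockBlob",
        "Content-Type": "application/pdf",
    },
}
request = build_upload_request(auth_result, b"%PDF-1.7 example bytes")
print(request.get_method(), request.full_url)
```

The client adds nothing of its own to the header set; its only inputs are the file bytes and the response it received from the backend.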

**Backend Services:**
- SAS Token Generation:
- The backend handles SAS token generation and validation for Azure Blob Storage uploads, ensuring secure and controlled access for file uploads. There are different mutations for PDF and image files. The backend service encapsulates all business logic enforcing file upload restrictions and security requirements before enabling clients to upload files directly to Azure Blob Storage with short-lived and carefully permissioned SAS tokens.
- Server-signed request header generation:
- The backend handles generation and validation of server-signed request headers for Azure Blob Storage uploads, ensuring secure and controlled access for file uploads. There are different mutations for PDF and image files. The backend service encapsulates all business logic enforcing file upload restrictions and security requirements before enabling clients to upload files directly to Azure Blob Storage using carefully permissioned, server-signed request headers.
- Post-Upload Malware Handling:
- The backend polls the blob for the Microsoft Defender for Cloud scan result tag: `No threats found` or `Malicious`.
- The backend polls the blob for the Microsoft Defender for Storage scan result tag: `No threats found` or `Malicious`.
- `No threats found` → retain the blob.
- `Malicious` → delete the current blob version and restore the previous non-malicious version.
- The previous version is promoted to current using copyBlob.
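
The post-upload handling above amounts to a small decision function over the scan result tag. The Python sketch below shows one way to express it; the `ScanAction` names and the treatment of a missing tag as "keep polling" are assumptions for illustration, and the actual blob operations (delete, `copyBlob`) are intentionally left out.

```python
from enum import Enum


class ScanAction(Enum):
    RETAIN = "retain"
    RESTORE_PREVIOUS = "restore_previous"
    KEEP_POLLING = "keep_polling"


def decide_action(scan_result):
    """Map the Defender for Storage scan result tag value to a backend action."""
    if scan_result == "No threats found":
        return ScanAction.RETAIN
    if scan_result == "Malicious":
        # Delete the current version and promote the previous clean version.
        return ScanAction.RESTORE_PREVIOUS
    # Tag not written yet (None) or an unrecognized value: poll again later.
    return ScanAction.KEEP_POLLING


print(decide_action("No threats found"))
print(decide_action("Malicious"))
print(decide_action(None))
```

Keeping the decision separate from the storage calls makes the retain/restore rule easy to unit test without a storage account.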
@@ -76,9 +162,9 @@ participant Blob

User->>Frontend: Click upload and select file
Frontend->>Frontend: Sanitize & validate (type, size, dimensions)
Frontend->>Backend: Request SAS token (name, type, size)
Frontend->>Backend: Request server-signed request headers (name, type, size)
Backend->>Backend: Build blob path + tags + metadata, validate upload rules
Backend-->>Frontend: AuthResult (blob URL + SAS token + x-ms-date + tags + metadata)
Backend-->>Frontend: AuthResult (blob URL + server-signed request headers + x-ms-date + tags + metadata)
Frontend->>Blob: PUT file bytes (headers + auth + tags + metadata)
Blob-->>Frontend: 201 Created (x-ms-version-id)
Frontend->>Backend: Persist blob reference/version ID
@@ -97,4 +183,4 @@ Frontend-->>User: Show success / preview / error if malicious

- Malware scanning occurs after upload. The system uses Microsoft Defender for Storage to automatically scan uploaded blobs for known malware signatures, embedded scripts, and other suspicious file patterns. Files flagged as malicious are deleted or reverted to the previous version to preserve data integrity. This introduces a small window where a malicious file may exist in storage before removal.
- At present, backend permission enforcement for blob upload is minimal. The frontend restricts upload actions according to application state, but users could potentially bypass this if they possess valid credentials.
- Future improvements will focus on implementing domain-driven permission checks before SAS tokens are issued and exploring pre-upload scanning alternatives to further reduce risk.
- Future improvements will focus on implementing domain-driven permission checks before server-signed request headers (upload authorizations) are issued and exploring pre-upload scanning alternatives to further reduce risk.