feat(enclave-aws-nitro): self-hosted Nitro enclave #12
Conversation
Drop-in replacement for the Evervault enclave layer using AWS Nitro Enclaves hosted on a self-managed EC2 instance. Exposes the identical internal HTTP API surface so api/src/enclaveClient.ts requires zero changes.

Key design decisions:
- enclave-aws-nitro/enclave/src/index.ts mirrors enclave/src/index.ts 1-to-1 (same endpoints, same JWT ticket protocol, same secp256k1 Schnorr signing)
- Key sealing replaced: Evervault's http://127.0.0.1:9999 → AWS KMS Encrypt/Decrypt with NSM attestation (RecipientInfo flow)
- Transport replaced: Evervault data-plane sidecar → vsock with two helpers: vsock-connect.c (enclave→parent, for KMS proxy) and vsock-listen on parent
- parent/ daemon bridges HTTPS from API to enclave (http-bridge.ts) and proxies KMS calls from enclave to AWS KMS (kms-proxy.ts)
- Dev mode (ENCLAVE_DEV_MODE/NITRO_DEV_MODE) falls back to local AES-256-GCM and TCP sockets — identical behaviour to the existing enclave dev mode
- infra/ contains KMS key policy (PCR-gated decrypt), IAM policy, and scripts for setup, EIF build, and enclave launch (analogous to ev enclave build/deploy)

Evervault → Nitro equivalences documented in README.md and nitro.toml.

https://claude.ai/code/session_01SfrY68KpQBFG71qu8SAiJk
Dialectical autocoding loop: write → run → fix → repeat until green.
## Test coverage (74 tests, 3 files)
### test/enclave-aws-nitro/enclave.test.ts (HTTP API surface)
- GET /health → 200 without auth
- Auth enforcement: missing / wrong key → 401
- POST /internal/generate: 201 + compressed secp256k1 pubkey, 409 duplicate, 400 bad UUID
- POST /internal/sign: Schnorr sig roundtrip, 404 unknown, 401 tampered ticket,
403 wrong identity, 403 wrong digest, 409 replay, 0x-prefixed digest accepted
- POST /internal/destroy: 200 ok, 404 not found, key inaccessible after destroy
- POST /internal/backup/export + import: AES roundtrip, public key preserved,
signing works after restore, 404 for non-existent key
### test/enclave-aws-nitro/kms.test.ts (unit)
- sealKeyLocal/unsealKeyLocal: AES-256-GCM roundtrip, unique IVs, malformed error
- sealKey/unsealKey dev mode: full roundtrip, non-kms: passthrough
- sealKey production: mocked spawn simulates vsock-connect + KMS proxy response
- kmsRequest: error propagation (ENOENT, KMS error)
### test/enclave-aws-nitro/kms-proxy.test.ts (parent daemon)
- Encrypt → KMSClient.send called, CiphertextBlob returned
- Decrypt + RecipientInfo → CiphertextForRecipient returned (NSM flow)
- KMS SDK throws → { error } response
- Invalid JSON request → { error } response
- Decrypt without Recipient → plaintext fallback as CiphertextBlob
## Fixes applied during loop
- enclave/src/app.ts: extracted createApp() factory for test isolation (no
global state — each test call gets a fresh key store + nonce cache)
- enclave/src/kms.ts: replaced dynamic require() with static import { spawn };
fixed unsealKey return to plaintextBuf.toString('hex') (not base64 chain);
exported kmsRequest, sealKeyLocal, unsealKeyLocal for unit testing
- enclave/src/nsm.ts: removed misleading fs.openSync/closeSync (nsm-ioctl.c
binary opens /dev/nsm itself); nsm.ts now only calls execFileSync
- test/enclave-aws-nitro/helpers.ts: uuid() uses crypto.randomUUID() for
RFC 4122 v4 compliance (Zod v4 z.string().uuid() is strict)
- test/enclave-aws-nitro/kms-proxy.test.ts: KMSClient mock uses regular
function constructor (not arrow) so `new KMSClient()` works in vitest 4;
port 0 + dynamic port discovery for parallel-safe test isolation
- pnpm-workspace.yaml: added enclave-aws-nitro/enclave + parent packages
- package.json: added test-time devDependencies (jose, @aws-sdk/client-kms,
@noble/secp256k1, express, zod) so vitest can resolve imports
https://claude.ai/code/session_01SfrY68KpQBFG71qu8SAiJk
Deploying claw-cash-landing-page with Cloudflare Pages

| | |
|---|---|
| Latest commit: | 184fbfa |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://66e69a9b.claw-cash-landing-page.pages.dev |
| Branch Preview URL: | https://claude-evervault-aws-nitro-y.claw-cash-landing-page.pages.dev |
Pull request overview
This PR introduces a self-hosted AWS Nitro Enclave implementation as a drop-in replacement for the Evervault enclave layer. The implementation maintains API compatibility while replacing the underlying infrastructure with AWS KMS for key sealing and vsock for inter-process communication between the enclave and parent EC2 instance.
Changes:
- Added complete AWS Nitro Enclave infrastructure with enclave (inside Nitro) and parent (EC2 daemon) components
- Implemented KMS-based key sealing with NSM attestation for production, AES-256-GCM fallback for dev mode
- Added vsock communication layer with C helper binaries for AF_VSOCK socket support
- Created comprehensive test suite covering KMS operations, HTTP API surface, and proxy functionality
- Provided infrastructure scripts for EC2 setup, EIF building, and enclave deployment
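The dev-mode key-sealing fallback mentioned above can be pictured with a short sketch. This is a hedged illustration only: the names `sealKeyLocal`/`unsealKeyLocal` come from the PR's tests, but the exact wire format (iv | tag | ciphertext, hex-encoded) and the in-memory dev KEK are assumptions, not the PR's actual code.

```ts
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Dev-only key-encryption key, held in memory (assumption: the real code may
// derive or configure this differently).
const DEV_KEK = randomBytes(32);

// Seal a hex-encoded private key with AES-256-GCM. Layout (assumed):
// 12-byte IV | 16-byte auth tag | ciphertext, all hex-encoded.
function sealKeyLocal(plaintextHex: string): string {
  const iv = randomBytes(12); // fresh IV per call
  const cipher = createCipheriv("aes-256-gcm", DEV_KEK, iv);
  const ct = Buffer.concat([cipher.update(Buffer.from(plaintextHex, "hex")), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ct]).toString("hex");
}

// Reverse of sealKeyLocal: split the layout, verify the tag, decrypt.
function unsealKeyLocal(sealedHex: string): string {
  const sealed = Buffer.from(sealedHex, "hex");
  const iv = sealed.subarray(0, 12);
  const tag = sealed.subarray(12, 28);
  const ct = sealed.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", DEV_KEK, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("hex");
}
```

Because GCM authenticates the ciphertext, a tampered blob fails at `decipher.final()` rather than decrypting to garbage, which matches the "malformed error" case the unit tests exercise.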
Reviewed changes
Copilot reviewed 33 out of 34 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| test/enclave-aws-nitro/helpers.ts | Shared test utilities and constants |
| test/enclave-aws-nitro/kms.test.ts | Unit tests for KMS seal/unseal operations |
| test/enclave-aws-nitro/kms-proxy.test.ts | Tests for parent-side KMS vsock proxy |
| test/enclave-aws-nitro/enclave.test.ts | HTTP API surface integration tests |
| enclave-aws-nitro/enclave/src/config.ts | Enclave configuration management |
| enclave-aws-nitro/enclave/src/kms.ts | KMS integration with vsock proxy |
| enclave-aws-nitro/enclave/src/nsm.ts | NSM attestation document client |
| enclave-aws-nitro/enclave/src/app.ts | Express application with signing endpoints |
| enclave-aws-nitro/enclave/src/index.ts | Enclave entry point |
| enclave-aws-nitro/enclave/src/graceful-shutdown.ts | Shutdown handler |
| enclave-aws-nitro/parent/src/config.ts | Parent daemon configuration |
| enclave-aws-nitro/parent/src/vsock.ts | vsock socket abstraction layer |
| enclave-aws-nitro/parent/src/kms-proxy.ts | KMS vsock proxy implementation |
| enclave-aws-nitro/parent/src/http-bridge.ts | HTTP-to-vsock bridge |
| enclave-aws-nitro/parent/src/index.ts | Parent daemon entry point |
| enclave-aws-nitro/enclave/vsock-connect.c | C helper for vsock connections |
| enclave-aws-nitro/enclave/nsm-ioctl.c | C helper for NSM attestation |
| enclave-aws-nitro/enclave/Dockerfile | Multi-stage enclave image build |
| enclave-aws-nitro/infra/setup-parent.sh | EC2 instance bootstrap script |
| enclave-aws-nitro/infra/build-eif.sh | EIF build and PCR extraction script |
| enclave-aws-nitro/infra/run-enclave.sh | Enclave launch script |
| enclave-aws-nitro/infra/kms-key-policy.json | KMS key policy with PCR conditions |
| enclave-aws-nitro/infra/iam-policy.json | EC2 instance IAM policy |
| enclave-aws-nitro/nitro.toml | Nitro enclave configuration |
| enclave-aws-nitro/README.md | Architecture and usage documentation |
| package.json | Added test dependencies |
| pnpm-workspace.yaml | Added enclave-aws-nitro workspaces |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
```ts
export async function kmsRequest(payload: KmsProxyRequest): Promise<KmsProxyResponse> {
  return new Promise((resolve, reject) => {
    const child = spawn(
      "/usr/local/bin/vsock-connect",
      ["3", String(config.kmsProxyPort)],
      { stdio: ["pipe", "pipe", "inherit"] }
    );

    let out = "";
    child.stdout.on("data", (chunk: Buffer) => { out += chunk.toString(); });
    child.stdout.on("end", () => {
      try { resolve(JSON.parse(out) as KmsProxyResponse); }
      catch { reject(new Error(`KMS proxy bad response: ${out}`)); }
    });
    child.on("error", reject);
    child.stdin.write(JSON.stringify(payload) + "\n");
    child.stdin.end();
  });
}
```
The kmsRequest function accumulates stdout data in the 'out' variable without any size limit, similar to the issue in kms-proxy.ts. A malicious or malfunctioning KMS proxy could send unlimited data, causing memory exhaustion. Additionally, there's no timeout mechanism, so if the vsock-connect process hangs, the promise will never resolve or reject, causing a memory leak.
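A hedged sketch of one possible mitigation (the names `collectBounded`, `MAX_RESPONSE_BYTES`, and `TIMEOUT_MS` are illustrative, not from the PR): cap the accumulated output and attach a deadline so the promise always settles.

```ts
import type { Readable } from "node:stream";

// Illustrative limits (not values from the PR).
const MAX_RESPONSE_BYTES = 1 << 20; // 1 MiB
const TIMEOUT_MS = 10_000;

// Accumulate a readable stream (e.g. child.stdout) with a hard size cap and a
// deadline, so a flooding or hung proxy can neither exhaust memory nor leave
// the promise pending forever.
function collectBounded(
  stream: Readable,
  maxBytes: number = MAX_RESPONSE_BYTES,
  timeoutMs: number = TIMEOUT_MS
): Promise<string> {
  return new Promise((resolve, reject) => {
    let out = "";
    const timer = setTimeout(() => {
      stream.destroy();
      reject(new Error("KMS proxy timed out"));
    }, timeoutMs);
    stream.on("data", (chunk: Buffer) => {
      out += chunk.toString();
      if (out.length > maxBytes) {
        clearTimeout(timer);
        stream.destroy();
        reject(new Error("KMS proxy response too large"));
      }
    });
    stream.on("end", () => {
      clearTimeout(timer);
      resolve(out);
    });
    stream.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
  });
}
```

`kmsRequest` would then `await collectBounded(child.stdout)` before the `JSON.parse` step; killing the child on timeout (`child.kill()`) is a further refinement left out here.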
```ts
const enclaveSocket = connectVsock(enclaveCid, enclavePort);

enclaveSocket.on("error", (err) => {
  console.error("[bridge] vsock error:", err.message);
  if (!res.headersSent) {
    res.writeHead(502, { "content-type": "application/json" });
    res.end(JSON.stringify({ error: "Enclave connection failed" }));
  }
});

enclaveSocket.on("connect", () => {
```
The HTTP bridge implementation has a critical timing issue. The "connect" event listener is attached after calling connectVsock(), but for TCP connections in dev mode, the connection may already be established synchronously. This means the "connect" event might never fire, causing requests to hang. The proper pattern is to attach event listeners before initiating the connection, or check if the socket is already connected before setting up listeners.
```ts
  // Parse the enclave's HTTP/1.1 response and write back to the client
  let responseBuffer = Buffer.alloc(0);
  let headersParsed = false;

  enclaveSocket.on("data", (chunk: Buffer) => {
    responseBuffer = Buffer.concat([responseBuffer, chunk]);

    if (!headersParsed) {
      const headerEnd = responseBuffer.indexOf("\r\n\r\n");
      if (headerEnd === -1) return; // still accumulating headers

      const headerSection = responseBuffer.subarray(0, headerEnd).toString();
      const body = responseBuffer.subarray(headerEnd + 4);

      const [statusLine, ...headerLines] = headerSection.split("\r\n");
      const statusMatch = statusLine?.match(/^HTTP\/1\.[01] (\d+)/);
      const statusCode = statusMatch ? parseInt(statusMatch[1]!, 10) : 200;

      const responseHeaders: Record<string, string> = {};
      for (const line of headerLines) {
        const colon = line.indexOf(":");
        if (colon > 0) {
          const key = line.slice(0, colon).trim().toLowerCase();
          const val = line.slice(colon + 1).trim();
          responseHeaders[key] = val;
        }
      }

      res.writeHead(statusCode, responseHeaders);
      if (body.length > 0) res.write(body);
      headersParsed = true;
    } else {
      res.write(chunk);
    }
  });

  enclaveSocket.on("end", () => res.end());
});
```
The HTTP parsing implementation lacks proper error handling. If the response from the enclave is malformed (e.g., missing status line, invalid headers), the code will crash or behave unexpectedly. The statusLine could be undefined if the split returns an empty array, and accessing statusMatch[1] without proper null checks could throw. Additionally, there's no timeout mechanism, so a slow or hanging enclave response will leave the client connection open indefinitely.
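A minimal hardening sketch for the status-line handling (mapping malformed responses to 502 is an assumption about the desired behaviour, not the PR's):

```ts
// Parse an HTTP/1.x status line defensively. A malformed or missing status
// line yields 502 (bad gateway) instead of being silently treated as 200.
function parseStatusLine(headerSection: string): number {
  const statusLine = headerSection.split("\r\n")[0] ?? "";
  const match = statusLine.match(/^HTTP\/1\.[01] (\d{3})\b/);
  return match ? parseInt(match[1]!, 10) : 502;
}
```

A timeout on `enclaveSocket` (e.g. `socket.setTimeout(ms, () => socket.destroy())`) would address the hanging-response half of the comment.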
```ts
const { publicKey: pubDer, privateKey: privDer } = generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "der" },
  privateKeyEncoding: { type: "pkcs8", format: "der" }
});
```
The unsealKey function generates a new RSA-2048 keypair on every call for the RecipientInfo flow. RSA key generation is computationally expensive and can take tens to hundreds of milliseconds. This will significantly impact performance if unsealing happens frequently. Consider caching the ephemeral keypair and reusing it for multiple unseal operations, or using a faster key size if security requirements allow.
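A sketch of the caching the reviewer suggests. Whether keypair reuse is acceptable under the NSM/RecipientInfo threat model is an assumption to verify, not something the PR states; `getRecipientKeyPair` is a hypothetical helper name.

```ts
import { generateKeyPairSync } from "node:crypto";

type DerKeyPair = { publicKey: Buffer; privateKey: Buffer };

// Lazily generate the RSA-2048 RecipientInfo keypair once and reuse it across
// unseal calls, avoiding tens-to-hundreds of milliseconds of keygen per call.
let cachedPair: DerKeyPair | null = null;

function getRecipientKeyPair(): DerKeyPair {
  if (!cachedPair) {
    const { publicKey, privateKey } = generateKeyPairSync("rsa", {
      modulusLength: 2048,
      publicKeyEncoding: { type: "spki", format: "der" },
      privateKeyEncoding: { type: "pkcs8", format: "der" }
    });
    cachedPair = { publicKey, privateKey };
  }
  return cachedPair;
}
```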
```ts
    if (exp <= now) nonceReplayCache.delete(nonce);
  }
};
```
The nonceReplayCache Map grows unbounded and is never automatically pruned except when pruneNonces() is called during sign requests. If sign requests stop coming in but the server continues running, expired nonces will accumulate indefinitely, causing a memory leak. Consider using a setInterval to periodically prune expired nonces, or use a time-based eviction data structure.
Suggested change:

```ts
// Periodically prune expired nonces to prevent unbounded growth of the cache
setInterval(pruneNonces, 60_000);
```
```ts
/** A zeroed-out 32-byte hex digest, suitable as test input to /internal/sign. */
export const ZERO_DIGEST = "a".repeat(64);
```
ZERO_DIGEST is defined as "a".repeat(64) which creates a string of 64 'a' characters. However, this is not actually a "zeroed-out" digest as described in the comment. A zero digest should be "0".repeat(64) (all zeros). The current value represents the hex string "aaaa..." which is not zero.
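One way to reconcile the constant with its doc comment, keeping the PR's name (shown without the `export` so the snippet is self-contained):

```ts
/** A zeroed-out 32-byte hex digest, suitable as test input to /internal/sign. */
const ZERO_DIGEST = "0".repeat(64);
```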
```ts
function createVsockConnectionViaHelper(cid: number, port: number): net.Socket {
  const child = spawn("/usr/local/bin/vsock-connect", [String(cid), String(port)], {
    stdio: ["pipe", "pipe", "inherit"]
  });

  // Wrap child stdio as a net.Socket-compatible duplex
  const socket = new net.Socket();
  socket.connect(0); // not actually used — we override read/write
  // @ts-expect-error — duck-type: wire child's streams into the socket-like object
  socket.pipe = (dest: NodeJS.WritableStream) => { child.stdout.pipe(dest); return dest; };
  child.stdout.pipe(socket as unknown as NodeJS.WritableStream);
  (socket as unknown as NodeJS.WritableStream).write = (chunk: Buffer | string) =>
    child.stdin.write(chunk);

  child.on("exit", () => socket.destroy());
  return socket;
```
The createVsockConnectionViaHelper function creates a mock socket object that overrides the write method but doesn't properly implement the full Socket interface. The socket.connect(0) call on line 54 will fail because 0 is not a valid port. Additionally, the implementation doesn't handle errors from child.stdin.write properly - it should check the return value and handle backpressure. The pipe operation on line 57 is also incorrect as socket is cast to WritableStream but it's actually a Socket.
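A conventional alternative, sketched under the assumption that callers only need read/write/end semantics from the returned object: wrap the child's stdin/stdout in a real `stream.Duplex` instead of monkey-patching a `net.Socket`. This also handles backpressure via the write callback.

```ts
import { Duplex } from "node:stream";
import type { Readable, Writable } from "node:stream";

// Expose a child process's stdin (writable) and stdout (readable) as a single
// Duplex stream. Writes respect backpressure by waiting for "drain"; reads
// are pushed from stdout and paused when the Duplex's buffer fills.
function duplexFromStdio(stdin: Writable, stdout: Readable): Duplex {
  const duplex = new Duplex({
    read() {
      stdout.resume(); // consumer wants more: let stdout flow again
    },
    write(chunk, _enc, cb) {
      if (!stdin.write(chunk)) stdin.once("drain", cb); // honour backpressure
      else cb();
    },
    final(cb) {
      stdin.end(cb); // duplex.end() half-closes toward the child
    }
  });
  stdout.on("data", (chunk) => {
    if (!duplex.push(chunk)) stdout.pause(); // stop reading until read() resumes
  });
  stdout.on("end", () => duplex.push(null));
  stdout.on("error", (err) => duplex.destroy(err));
  return duplex;
}
```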
```ts
  // The vsock-listen helper writes accepted connection fds via SCM_RIGHTS.
  // We use a net.Server with a pre-created vsock listening fd.
  const helperPath = "/usr/local/bin/vsock-listen";
  const child = spawn(helperPath, [String(port)], {
    stdio: ["inherit", "pipe", "inherit"]
  });

  const emitter = new net.Server();
  child.stdout.on("data", (fdMsg: Buffer) => {
    // The helper writes: [u32 fd] for each accepted connection
    const fd = fdMsg.readUInt32LE(0);
    const sock = new net.Socket({ fd, readable: true, writable: true });
    emitter.emit("connection", sock);
  });

  return emitter as unknown as VsockServer;
}
```
The createVsockServer function creates a net.Server emitter but never calls its listen() method, meaning it won't actually bind to any port. Additionally, there's no error handling for the spawned child process, and the emitter is never properly closed when the child exits. The implementation also assumes the vsock-listen helper exists and is executable, but there's no validation or error handling if it doesn't.
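A related robustness gap in the same hunk: `child.stdout` is a byte stream, so several 4-byte fd frames can arrive coalesced in one `"data"` event, or one frame can be split across two. A hypothetical frame parser (`makeFdFrameParser` is an illustrative name, not PR code) buffers and emits each complete u32:

```ts
// Buffer raw stdout chunks and invoke onFd once per complete little-endian
// u32 frame, regardless of how the chunks are split or coalesced.
function makeFdFrameParser(onFd: (fd: number) => void): (chunk: Buffer) => void {
  let pending = Buffer.alloc(0);
  return (chunk: Buffer) => {
    pending = Buffer.concat([pending, chunk]);
    while (pending.length >= 4) {
      onFd(pending.readUInt32LE(0));
      pending = pending.subarray(4);
    }
  };
}
```

The server would attach this with `child.stdout.on("data", makeFdFrameParser(fd => ...))` in place of the one-fd-per-event assumption.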
```ts
function handleConnection(socket: net.Socket): void {
  let buf = "";

  socket.on("data", (chunk: Buffer) => {
    buf += chunk.toString();
    // Each request is a single JSON line terminated with \n
    const newline = buf.indexOf("\n");
    if (newline === -1) return;

    const line = buf.slice(0, newline);
    buf = buf.slice(newline + 1);
```
The handleConnection function has a security vulnerability. It accumulates data in the 'buf' variable without any size limit, which could lead to a memory exhaustion attack if a malicious client sends a very large request without a newline character. There should be a maximum buffer size check to prevent this denial-of-service vector.
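A hedged sketch of the cap (the 64 KiB limit and the helper name `makeLineReader` are illustrative). It also drains every complete line per chunk, where the original's single `indexOf` handled at most one:

```ts
const MAX_REQUEST_BYTES = 64 * 1024; // illustrative cap, not a PR value

// Accumulate newline-delimited requests with a hard buffer limit. When a
// client sends maxBytes without a newline, the buffer is dropped and
// onOverflow fires so the caller can destroy the socket.
function makeLineReader(
  onLine: (line: string) => void,
  onOverflow: () => void,
  maxBytes: number = MAX_REQUEST_BYTES
): (chunk: Buffer) => void {
  let buf = "";
  return (chunk: Buffer) => {
    buf += chunk.toString();
    if (buf.length > maxBytes) {
      buf = "";
      onOverflow();
      return;
    }
    let newline: number;
    while ((newline = buf.indexOf("\n")) !== -1) {
      onLine(buf.slice(0, newline));
      buf = buf.slice(newline + 1);
    }
  };
}
```

`handleConnection` would wire it as `socket.on("data", makeLineReader(handleRequest, () => socket.destroy()))`.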
```ts
function encodeCborAttestationRequest(publicKeyDer: Buffer): Buffer {
  const cborNull = Buffer.from([0xf6]);
  const encodeText = (s: string): Buffer => {
    const b = Buffer.from(s, "utf8");
    if (b.length >= 24) throw new Error("NSM: text key too long");
    return Buffer.concat([Buffer.from([0x60 | b.length]), b]);
  };
  const encodeBytes = (b: Buffer): Buffer => {
    if (b.length === 0) return Buffer.from([0x40]);
    if (b.length < 24) return Buffer.concat([Buffer.from([0x40 | b.length]), b]);
    if (b.length < 256) return Buffer.concat([Buffer.from([0x58, b.length]), b]);
    return Buffer.concat([Buffer.from([0x59, b.length >> 8, b.length & 0xff]), b]);
  };
  const innerMap = Buffer.concat([
    Buffer.from([0xa3]),
    encodeText("UserData"), cborNull,
    encodeText("Nonce"), cborNull,
    encodeText("PublicKey"), encodeBytes(publicKeyDer)
  ]);
  return Buffer.concat([Buffer.from([0xa1]), encodeText("AttestationDoc"), innerMap]);
}

function decodeCborAttestationResponse(buf: Buffer): Buffer {
  let cursor = 0;
  cursor += 1; // skip outer map header 0xa1
  const keyLen = buf[cursor]! & 0x1f;
  cursor += 1 + keyLen; // skip key
  const majorType = (buf[cursor]! & 0xe0) >> 5;
  if (majorType !== 2) throw new Error("NSM: expected bytes in response");
  const additionalInfo = buf[cursor]! & 0x1f;
  cursor += 1;
  let docLen: number;
  if (additionalInfo < 24) {
    docLen = additionalInfo;
  } else if (additionalInfo === 24) {
    docLen = buf[cursor]!; cursor += 1;
  } else if (additionalInfo === 25) {
    docLen = (buf[cursor]! << 8) | buf[cursor + 1]!; cursor += 2;
  } else {
    throw new Error("NSM: unsupported bytes length");
  }
  return buf.subarray(cursor, cursor + docLen);
}
```
The CBOR encoding and decoding functions lack proper bounds checking. In encodeCborAttestationRequest, the encodeBytes function doesn't handle buffers larger than 65535 bytes (the two-byte length encoding maxes out at 0xFFFF). In decodeCborAttestationResponse, there's no validation that cursor + docLen doesn't exceed buf.length, which could lead to an out-of-bounds read. Additionally, array access like buf[cursor]! uses non-null assertions without checking if cursor is within bounds.
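A sketch of the bounds checks the comment asks for, confined to the byte-string read (the helper name `readCborBytes` and the error messages are illustrative):

```ts
// Read one CBOR byte string (major type 2) starting at cursor, refusing to
// run past the end of the buffer at any step. Supports the same length
// encodings as the original (immediate, one-byte, and two-byte lengths).
function readCborBytes(buf: Buffer, cursor: number): { bytes: Buffer; next: number } {
  if (cursor >= buf.length) throw new Error("NSM: truncated response");
  const majorType = (buf[cursor]! & 0xe0) >> 5;
  if (majorType !== 2) throw new Error("NSM: expected bytes in response");
  const additionalInfo = buf[cursor]! & 0x1f;
  cursor += 1;
  let len: number;
  if (additionalInfo < 24) {
    len = additionalInfo;
  } else if (additionalInfo === 24) {
    if (cursor + 1 > buf.length) throw new Error("NSM: truncated length");
    len = buf[cursor]!; cursor += 1;
  } else if (additionalInfo === 25) {
    if (cursor + 2 > buf.length) throw new Error("NSM: truncated length");
    len = buf.readUInt16BE(cursor); cursor += 2;
  } else {
    throw new Error("NSM: unsupported bytes length");
  }
  if (cursor + len > buf.length) throw new Error("NSM: length exceeds buffer");
  return { bytes: buf.subarray(cursor, cursor + len), next: cursor + len };
}
```

On the encode side, the symmetric fix is a guard such as `if (b.length > 0xffff) throw new Error("NSM: payload too large")` before emitting the two-byte (0x59) length.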
Adds a new `localstack-integration` CI job that spins up LocalStack (via Docker service) and runs a real Encrypt→Decrypt roundtrip through the kms-proxy without any mocks.

- test/enclave-aws-nitro/kms-proxy-localstack.test.ts: integration test that auto-skips locally (no LOCALSTACK_ENDPOINT) and activates in CI; creates a throwaway KMS key, exercises the proxy's full JSON framing and base64 codec, and verifies cryptographic roundtrip correctness.
- .github/workflows/ci.yml: new `localstack-integration` job using a localstack/localstack Docker service (SERVICES=kms), waits for the health endpoint, then runs only the integration test file.

Existing 74 unit tests are unchanged and still pass with 2 new skips.

https://claude.ai/code/session_01SfrY68KpQBFG71qu8SAiJk
Hint: Recently I was experimenting with AWS Nitro Enclaves too (inspired by your project). You can vibecode enclave code in Rust — it's much more effective. My tests show you can run ~2000 enclaves on a single EC2 machine instead of ~150 when you replace Node with Rust.