-
Notifications
You must be signed in to change notification settings - Fork 0
feat/optional write keys #85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
davidkelley
commented
Jan 29, 2026
- chore: add workspace lock symlinks for release-plz
- chore: temporarily force release-plz
- fix: pushed up changes to release workflow
- chore: release (chore: release #56)
- chore(searchlite-cli): release v0.1.2 (chore(searchlite-cli): release v0.1.2 #57)
- fix: update release command
- fix: updated wasm build
- chore(searchlite-wasm): release v0.1.2 (chore(searchlite-wasm): release v0.1.2 #58)
- chore/updated examples (chore/updated examples #60)
- chore: release (chore: release #59)
- remove filters final (remove filters final #62)
- chore: release (chore: release #61)
- perf/optimize wand and http (perf/optimize wand and http #63)
- chore: release (chore: release #64)
- fix: stream ndjson batches with single writer (fix: stream ndjson batches with single writer #65)
- fix: default search request limit and return_stored
- fix: unify doc id validation across ingest and delete paths
- fix(ffi): contain panics across C boundary and stabilize ffi tests
- fix: address remaining limit=0 review feedback
- chore/skills (chore/skills #78)
- feat: vector cap env
- feat: implemented docker release
- chore: release (chore: release #82)
- fix: make ffi/wasm search requests atomic and align return_stored defaults (fix: make ffi/wasm search requests atomic and align return_stored defaults #73)
- chore: release (chore: release #83)
- feat: introduced optional write keys for protecting index files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3c086ba626
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let index_res = if create_if_missing && !path_buf.exists() && write_key.is_some() { | ||
| Index::create_with_write_key( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honor write key when directory exists but index is missing
The open_index helper only calls Index::create_with_write_key when !path_buf.exists(), but Index::open will also create a new index when create_if_missing is true and the manifest is missing—even if the directory already exists. That means searchlite_index_open_with_write_key can silently create an unprotected index whenever the target directory exists (e.g., pre-created or reused path), and subsequent writes won’t require the key. This breaks the write-key security promise for a common setup; the create decision should be based on manifest presence (or retry create-with-key on “index does not exist”) rather than the directory existence check.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds optional “write key” protection for index mutation operations across the core library, HTTP server, CLI, and C FFI, including persisted metadata/bindings to detect tampering.
Changes:
- Introduces write-key KDF + binding utilities and persists write-key metadata/bindings in the manifest/WAL/segment metadata.
- Extends core APIs (create/writer/compact) and updates CLI/HTTP/FFI entrypoints to accept an optional write key.
- Updates OpenAPI and workspace dependencies to reflect the new write-key support.
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| searchlite-http/src/lib.rs | Accepts X-Write-Key header for init/write/commit/compact and forwards it into core APIs. |
| searchlite-ffi/src/lib.rs | Adds FFI entrypoints that accept optional write keys for open/add/commit. |
| searchlite-ffi/searchlite.h | Exposes new FFI functions and an error code for write-key failures. |
| searchlite-core/src/util/write_key.rs | Implements write-key hashing/verification and UUID-bound binding utilities. |
| searchlite-core/src/util/mod.rs | Exposes the new write_key util module. |
| searchlite-core/src/index/wal.rs | Adds WAL entry type to persist a write-binding token. |
| searchlite-core/src/index/segment.rs | Persists optional write-binding metadata alongside segments. |
| searchlite-core/src/index/mod.rs | Adds create_with_write_key, writer_with_key, and compact_with_key plumbing. |
| searchlite-core/src/index/manifest.rs | Extends manifest/segment metadata with optional write-key and binding fields. |
| searchlite-core/src/api/writer.rs | Enforces write-key requirements and binding checks when opening a writer and writing segments. |
| searchlite-core/src/api/builder.rs | Adds builder helper for creating an index with an optional write key. |
| searchlite-core/Cargo.toml | Adds crypto/KDF dependencies needed for write-key support. |
| searchlite-cli/src/main.rs | Adds --write-key options to mutating CLI commands and wires them to core APIs. |
| openapi.yaml | Documents X-Write-Key header on relevant endpoints. |
| Cargo.toml | Adds workspace dependencies for crypto/KDF libraries used by write-key support. |
| Cargo.lock | Locks new dependencies added for write-key support. |
| .codex/skills/notify-on-completion/SKILL.md | Adds a new Codex skill document (non-runtime change). |
Comments suppressed due to low confidence (1)
openapi.yaml:326
BulkRequest(and subsequent objects likeDocument,InspectResponse,Manifest, etc.) are currently nested undercomponents.parameters, but they are schemas. This makes the OpenAPI document invalid for tooling; move these back undercomponents.schemasand keep onlyWriteKeyHeaderundercomponents.parameters.
BulkRequest:
type: object
properties:
docs:
type: array
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| HttpError::from_anyhow( | ||
| "writer_open", | ||
| if write_key.is_some() { | ||
| StatusCode::FORBIDDEN | ||
| } else { | ||
| StatusCode::UNAUTHORIZED | ||
| }, | ||
| e, | ||
| ) |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
writer_with_key failures can be caused by non-auth problems (I/O, corruption, etc.), but this mapping turns any error into 401/403 purely based on whether the client provided a key. Consider mapping 401/403 only when the error indicates a missing/invalid write key, and otherwise returning 500 to avoid masking real server failures.
| HttpError::from_anyhow( | |
| "writer_open", | |
| if write_key.is_some() { | |
| StatusCode::FORBIDDEN | |
| } else { | |
| StatusCode::UNAUTHORIZED | |
| }, | |
| e, | |
| ) | |
| // Only treat errors that look like write-key auth failures as 401/403. | |
| // All other errors are treated as internal server errors to avoid | |
| // masking real server-side problems (I/O, corruption, etc.). | |
| let status = if e.to_string().to_lowercase().contains("write key") { | |
| if write_key.is_some() { | |
| StatusCode::FORBIDDEN | |
| } else { | |
| StatusCode::UNAUTHORIZED | |
| } | |
| } else { | |
| StatusCode::INTERNAL_SERVER_ERROR | |
| }; | |
| HttpError::from_anyhow("writer_open", status, e) |
| HttpError::from_anyhow( | ||
| "writer_open", | ||
| if write_key.is_some() { | ||
| StatusCode::FORBIDDEN | ||
| } else { | ||
| StatusCode::UNAUTHORIZED | ||
| }, | ||
| e, | ||
| ) |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as other writer open paths: mapping all writer_with_key errors to 401/403 based only on header presence will misclassify internal failures. Consider distinguishing auth errors (missing/invalid key) from other errors and returning 500 for non-auth failures.
| HttpError::from_anyhow( | |
| "writer_open", | |
| if write_key.is_some() { | |
| StatusCode::FORBIDDEN | |
| } else { | |
| StatusCode::UNAUTHORIZED | |
| }, | |
| e, | |
| ) | |
| let msg = e.to_string(); | |
| let status = if msg.contains("write key") | |
| || msg.contains("unauthorized") | |
| || msg.contains("forbidden") | |
| { | |
| if write_key.is_some() { | |
| StatusCode::FORBIDDEN | |
| } else { | |
| StatusCode::UNAUTHORIZED | |
| } | |
| } else { | |
| StatusCode::INTERNAL_SERVER_ERROR | |
| }; | |
| HttpError::from_anyhow("writer_open", status, anyhow::anyhow!(msg)) |
| HttpError::from_anyhow( | ||
| "writer_open", | ||
| if write_key.is_some() { | ||
| StatusCode::FORBIDDEN | ||
| } else { | ||
| StatusCode::UNAUTHORIZED | ||
| }, |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as other writer open paths: mapping all writer_with_key errors to 401/403 based only on header presence will misclassify internal failures. Consider distinguishing auth errors (missing/invalid key) from other errors and returning 500 for non-auth failures.
| HttpError::from_anyhow( | |
| "writer_open", | |
| if write_key.is_some() { | |
| StatusCode::FORBIDDEN | |
| } else { | |
| StatusCode::UNAUTHORIZED | |
| }, | |
| // Errors from opening the writer here are treated as internal failures. | |
| HttpError::from_anyhow( | |
| "writer_open", | |
| StatusCode::INTERNAL_SERVER_ERROR, |
| .filter_map(|b64| BASE64.decode(b64).ok()) | ||
| .collect(); |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write_binding_b64 decode failures are silently ignored via .ok(). If the field is present but invalid base64, that indicates corruption/tampering and should error out (otherwise a bad/modified binding can be dropped and bypass the binding check).
| .filter_map(|b64| BASE64.decode(b64).ok()) | |
| .collect(); | |
| .map(|b64| { | |
| BASE64 | |
| .decode(b64) | |
| .map_err(|e| anyhow!("invalid base64 in segment write_binding_b64: {e}")) | |
| }) | |
| .collect::<Result<Vec<_>>>()?; |
| let index_res = if create_if_missing && !path_buf.exists() && write_key.is_some() { | ||
| Index::create_with_write_key( | ||
| &path_buf, | ||
| Schema::default_text_body(), | ||
| opts, |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This create_if_missing branch keys off !path_buf.exists(). If the directory exists but MANIFEST.json does not (e.g., empty dir), providing a write_key will still fall through to Index::open and create an unprotected index. Consider checking for MANIFEST.json existence instead so the write key is honored whenever the index is actually being created.
| manifest.write_key.is_some() || wal_binding.is_some() || !segments_binding.is_empty(); | ||
| if binding_required { | ||
| let key = write_key.ok_or_else(|| anyhow!("write key required for this index"))?; | ||
| if let Some(meta) = manifest.write_key.as_ref() { | ||
| verify_write_key(key, meta)?; |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new write-key enforcement/binding logic is security-sensitive, but there are no unit tests in this crate exercising: (1) creating an index with a write key, (2) failing writes/commit/compact without a key, and (3) failing with a wrong key. Adding direct core-level tests (in addition to the FFI test) would help prevent regressions.
| if write_key.is_some() { | ||
| StatusCode::FORBIDDEN | ||
| } else { | ||
| StatusCode::UNAUTHORIZED | ||
| }, |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This maps any failure from the blocking commit task to 401/403 purely based on whether a key header was present. Commit can fail for non-auth reasons (fsync, disk full, corruption), which should surface as 500; only return 401/403 when the failure is specifically missing/invalid write key.
| if write_key.is_some() { | |
| StatusCode::FORBIDDEN | |
| } else { | |
| StatusCode::UNAUTHORIZED | |
| }, | |
| StatusCode::INTERNAL_SERVER_ERROR, |
| if write_key.is_some() { | ||
| StatusCode::FORBIDDEN | ||
| } else { | ||
| StatusCode::UNAUTHORIZED | ||
| }, |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This maps any failure from compaction to 401/403 purely based on whether a key header was present. Compaction can fail for many non-auth reasons; consider returning 500 for non-auth failures and reserving 401/403 for missing/invalid write key cases.
| if write_key.is_some() { | |
| StatusCode::FORBIDDEN | |
| } else { | |
| StatusCode::UNAUTHORIZED | |
| }, | |
| StatusCode::INTERNAL_SERVER_ERROR, |
| const DEFAULT_M_COST: u32 = 19; // ~512 MiB | ||
| const DEFAULT_T_COST: u32 = 2; | ||
| const DEFAULT_P_COST: u32 = 1; | ||
| const DEFAULT_OUTPUT_LEN: u32 = 32; | ||
|
|
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DEFAULT_M_COST is set to 19 with a comment indicating "~512 MiB", but Argon2 Params::new expects memory cost in KiB. If the intent is ~512 MiB, this should be ~524_288 KiB (or rename/comment to reflect the real cost) to avoid substantially weakening the KDF.
| let mut segments_binding: Vec<Vec<u8>> = manifest | ||
| .segments | ||
| .iter() | ||
| .filter_map(|seg| seg.write_binding_b64.as_deref()) | ||
| .filter_map(|b64| BASE64.decode(b64).ok()) | ||
| .collect(); | ||
| for seg in manifest.segments.iter() { | ||
| if let Ok(bytes) = inner.storage.read_to_end(Path::new(&seg.paths.meta)) { | ||
| if let Ok(seg_meta) = serde_json::from_slice::<SegmentFileMeta>(&bytes) { | ||
| if let Some(b64) = seg_meta.write_binding_b64.as_deref() { | ||
| if let Ok(decoded) = BASE64.decode(b64) { | ||
| segments_binding.push(decoded); | ||
| } |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write_binding_b64 decode failures are silently ignored via .ok(). If a segment/manifest contains an invalid binding string, that should be treated as corruption/tampering (error) rather than being dropped, otherwise binding enforcement can be bypassed.
| let mut segments_binding: Vec<Vec<u8>> = manifest | |
| .segments | |
| .iter() | |
| .filter_map(|seg| seg.write_binding_b64.as_deref()) | |
| .filter_map(|b64| BASE64.decode(b64).ok()) | |
| .collect(); | |
| for seg in manifest.segments.iter() { | |
| if let Ok(bytes) = inner.storage.read_to_end(Path::new(&seg.paths.meta)) { | |
| if let Ok(seg_meta) = serde_json::from_slice::<SegmentFileMeta>(&bytes) { | |
| if let Some(b64) = seg_meta.write_binding_b64.as_deref() { | |
| if let Ok(decoded) = BASE64.decode(b64) { | |
| segments_binding.push(decoded); | |
| } | |
| let mut segments_binding: Vec<Vec<u8>> = Vec::new(); | |
| for seg in manifest.segments.iter() { | |
| if let Some(b64) = seg.write_binding_b64.as_deref() { | |
| let decoded = BASE64 | |
| .decode(b64) | |
| .map_err(|e| anyhow!("invalid base64 in segment manifest write_binding_b64: {e}"))?; | |
| segments_binding.push(decoded); | |
| } | |
| } | |
| for seg in manifest.segments.iter() { | |
| if let Ok(bytes) = inner.storage.read_to_end(Path::new(&seg.paths.meta)) { | |
| if let Ok(seg_meta) = serde_json::from_slice::<SegmentFileMeta>(&bytes) { | |
| if let Some(b64) = seg_meta.write_binding_b64.as_deref() { | |
| let decoded = BASE64 | |
| .decode(b64) | |
| .map_err(|e| anyhow!("invalid base64 in segment metadata write_binding_b64: {e}"))?; | |
| segments_binding.push(decoded); |