Skip to content

Conversation

@davidkelley
Copy link
Owner

Copilot AI review requested due to automatic review settings January 29, 2026 15:13
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c086ba626

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +128 to +129
let index_res = if create_if_missing && !path_buf.exists() && write_key.is_some() {
Index::create_with_write_key(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Honor write key when directory exists but index is missing

The open_index helper only calls Index::create_with_write_key when !path_buf.exists(), but Index::open will also create a new index when create_if_missing is true and the manifest is missing—even if the directory already exists. That means searchlite_index_open_with_write_key can silently create an unprotected index whenever the target directory exists (e.g., pre-created or reused path), and subsequent writes won’t require the key. This breaks the write-key security promise for a common setup; the create decision should be based on manifest presence (or retry create-with-key on “index does not exist”) rather than the directory existence check.

Useful? React with 👍 / 👎.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds optional “write key” protection for index mutation operations across the core library, HTTP server, CLI, and C FFI, including persisted metadata/bindings to detect tampering.

Changes:

  • Introduces write-key KDF + binding utilities and persists write-key metadata/bindings in the manifest/WAL/segment metadata.
  • Extends core APIs (create/writer/compact) and updates CLI/HTTP/FFI entrypoints to accept an optional write key.
  • Updates OpenAPI and workspace dependencies to reflect the new write-key support.

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
searchlite-http/src/lib.rs Accepts X-Write-Key header for init/write/commit/compact and forwards it into core APIs.
searchlite-ffi/src/lib.rs Adds FFI entrypoints that accept optional write keys for open/add/commit.
searchlite-ffi/searchlite.h Exposes new FFI functions and an error code for write-key failures.
searchlite-core/src/util/write_key.rs Implements write-key hashing/verification and UUID-bound binding utilities.
searchlite-core/src/util/mod.rs Exposes the new write_key util module.
searchlite-core/src/index/wal.rs Adds WAL entry type to persist a write-binding token.
searchlite-core/src/index/segment.rs Persists optional write-binding metadata alongside segments.
searchlite-core/src/index/mod.rs Adds create_with_write_key, writer_with_key, and compact_with_key plumbing.
searchlite-core/src/index/manifest.rs Extends manifest/segment metadata with optional write-key and binding fields.
searchlite-core/src/api/writer.rs Enforces write-key requirements and binding checks when opening a writer and writing segments.
searchlite-core/src/api/builder.rs Adds builder helper for creating an index with an optional write key.
searchlite-core/Cargo.toml Adds crypto/KDF dependencies needed for write-key support.
searchlite-cli/src/main.rs Adds --write-key options to mutating CLI commands and wires them to core APIs.
openapi.yaml Documents X-Write-Key header on relevant endpoints.
Cargo.toml Adds workspace dependencies for crypto/KDF libraries used by write-key support.
Cargo.lock Locks new dependencies added for write-key support.
.codex/skills/notify-on-completion/SKILL.md Adds a new Codex skill document (non-runtime change).
Comments suppressed due to low confidence (1)

openapi.yaml:326

  • BulkRequest (and subsequent objects like Document, InspectResponse, Manifest, etc.) are currently nested under components.parameters, but they are schemas. This makes the OpenAPI document invalid for tooling; move these back under components.schemas and keep only WriteKeyHeader under components.parameters.
    BulkRequest:
      type: object
      properties:
        docs:
          type: array

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +530 to +538
HttpError::from_anyhow(
"writer_open",
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
e,
)
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writer_with_key failures can be caused by non-auth problems (I/O, corruption, etc.), but this mapping turns any error into 401/403 purely based on whether the client provided a key. Consider mapping 401/403 only when the error indicates a missing/invalid write key, and otherwise returning 500 to avoid masking real server failures.

Suggested change
HttpError::from_anyhow(
"writer_open",
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
e,
)
// Only treat errors that look like write-key auth failures as 401/403.
// All other errors are treated as internal server errors to avoid
// masking real server-side problems (I/O, corruption, etc.).
let status = if e.to_string().to_lowercase().contains("write key") {
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
}
} else {
StatusCode::INTERNAL_SERVER_ERROR
};
HttpError::from_anyhow("writer_open", status, e)

Copilot uses AI. Check for mistakes.
Comment on lines +727 to +735
HttpError::from_anyhow(
"writer_open",
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
e,
)
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as other writer open paths: mapping all writer_with_key errors to 401/403 based only on header presence will misclassify internal failures. Consider distinguishing auth errors (missing/invalid key) from other errors and returning 500 for non-auth failures.

Suggested change
HttpError::from_anyhow(
"writer_open",
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
e,
)
let msg = e.to_string();
let status = if msg.contains("write key")
|| msg.contains("unauthorized")
|| msg.contains("forbidden")
{
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
}
} else {
StatusCode::INTERNAL_SERVER_ERROR
};
HttpError::from_anyhow("writer_open", status, anyhow::anyhow!(msg))

Copilot uses AI. Check for mistakes.
Comment on lines +785 to +791
HttpError::from_anyhow(
"writer_open",
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as other writer open paths: mapping all writer_with_key errors to 401/403 based only on header presence will misclassify internal failures. Consider distinguishing auth errors (missing/invalid key) from other errors and returning 500 for non-auth failures.

Suggested change
HttpError::from_anyhow(
"writer_open",
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
// Errors from opening the writer here are treated as internal failures.
HttpError::from_anyhow(
"writer_open",
StatusCode::INTERNAL_SERVER_ERROR,

Copilot uses AI. Check for mistakes.
Comment on lines +151 to +152
.filter_map(|b64| BASE64.decode(b64).ok())
.collect();
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write_binding_b64 decode failures are silently ignored via .ok(). If the field is present but invalid base64, that indicates corruption/tampering and should error out (otherwise a bad/modified binding can be dropped and bypass the binding check).

Suggested change
.filter_map(|b64| BASE64.decode(b64).ok())
.collect();
.map(|b64| {
BASE64
.decode(b64)
.map_err(|e| anyhow!("invalid base64 in segment write_binding_b64: {e}"))
})
.collect::<Result<Vec<_>>>()?;

Copilot uses AI. Check for mistakes.
Comment on lines +128 to +132
let index_res = if create_if_missing && !path_buf.exists() && write_key.is_some() {
Index::create_with_write_key(
&path_buf,
Schema::default_text_body(),
opts,
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This create_if_missing branch keys off !path_buf.exists(). If the directory exists but MANIFEST.json does not (e.g., empty dir), providing a write_key will still fall through to Index::open and create an unprotected index. Consider checking for MANIFEST.json existence instead so the write key is honored whenever the index is actually being created.

Copilot uses AI. Check for mistakes.
Comment on lines +82 to +86
manifest.write_key.is_some() || wal_binding.is_some() || !segments_binding.is_empty();
if binding_required {
let key = write_key.ok_or_else(|| anyhow!("write key required for this index"))?;
if let Some(meta) = manifest.write_key.as_ref() {
verify_write_key(key, meta)?;
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new write-key enforcement/binding logic is security-sensitive, but there are no unit tests in this crate exercising: (1) creating an index with a write key, (2) failing writes/commit/compact without a key, and (3) failing with a wrong key. Adding direct core-level tests (in addition to the FFI test) would help prevent regressions.

Copilot uses AI. Check for mistakes.
Comment on lines +846 to +850
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This maps any failure from the blocking commit task to 401/403 purely based on whether a key header was present. Commit can fail for non-auth reasons (fsync, disk full, corruption), which should surface as 500; only return 401/403 when the failure is specifically missing/invalid write key.

Suggested change
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
StatusCode::INTERNAL_SERVER_ERROR,

Copilot uses AI. Check for mistakes.
Comment on lines +905 to +909
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This maps any failure from compaction to 401/403 purely based on whether a key header was present. Compaction can fail for many non-auth reasons; consider returning 500 for non-auth failures and reserving 401/403 for missing/invalid write key cases.

Suggested change
if write_key.is_some() {
StatusCode::FORBIDDEN
} else {
StatusCode::UNAUTHORIZED
},
StatusCode::INTERNAL_SERVER_ERROR,

Copilot uses AI. Check for mistakes.
Comment on lines +25 to +29
const DEFAULT_M_COST: u32 = 19; // ~512 MiB
const DEFAULT_T_COST: u32 = 2;
const DEFAULT_P_COST: u32 = 1;
const DEFAULT_OUTPUT_LEN: u32 = 32;

Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEFAULT_M_COST is set to 19 with a comment indicating "~512 MiB", but Argon2 Params::new expects memory cost in KiB. If the intent is ~512 MiB, this should be ~524_288 KiB (or rename/comment to reflect the real cost) to avoid substantially weakening the KDF.

Copilot uses AI. Check for mistakes.
Comment on lines +64 to +76
let mut segments_binding: Vec<Vec<u8>> = manifest
.segments
.iter()
.filter_map(|seg| seg.write_binding_b64.as_deref())
.filter_map(|b64| BASE64.decode(b64).ok())
.collect();
for seg in manifest.segments.iter() {
if let Ok(bytes) = inner.storage.read_to_end(Path::new(&seg.paths.meta)) {
if let Ok(seg_meta) = serde_json::from_slice::<SegmentFileMeta>(&bytes) {
if let Some(b64) = seg_meta.write_binding_b64.as_deref() {
if let Ok(decoded) = BASE64.decode(b64) {
segments_binding.push(decoded);
}
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write_binding_b64 decode failures are silently ignored via .ok(). If a segment/manifest contains an invalid binding string, that should be treated as corruption/tampering (error) rather than being dropped, otherwise binding enforcement can be bypassed.

Suggested change
let mut segments_binding: Vec<Vec<u8>> = manifest
.segments
.iter()
.filter_map(|seg| seg.write_binding_b64.as_deref())
.filter_map(|b64| BASE64.decode(b64).ok())
.collect();
for seg in manifest.segments.iter() {
if let Ok(bytes) = inner.storage.read_to_end(Path::new(&seg.paths.meta)) {
if let Ok(seg_meta) = serde_json::from_slice::<SegmentFileMeta>(&bytes) {
if let Some(b64) = seg_meta.write_binding_b64.as_deref() {
if let Ok(decoded) = BASE64.decode(b64) {
segments_binding.push(decoded);
}
let mut segments_binding: Vec<Vec<u8>> = Vec::new();
for seg in manifest.segments.iter() {
if let Some(b64) = seg.write_binding_b64.as_deref() {
let decoded = BASE64
.decode(b64)
.map_err(|e| anyhow!("invalid base64 in segment manifest write_binding_b64: {e}"))?;
segments_binding.push(decoded);
}
}
for seg in manifest.segments.iter() {
if let Ok(bytes) = inner.storage.read_to_end(Path::new(&seg.paths.meta)) {
if let Ok(seg_meta) = serde_json::from_slice::<SegmentFileMeta>(&bytes) {
if let Some(b64) = seg_meta.write_binding_b64.as_deref() {
let decoded = BASE64
.decode(b64)
.map_err(|e| anyhow!("invalid base64 in segment metadata write_binding_b64: {e}"))?;
segments_binding.push(decoded);

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants