Feature/anonymizer dev #598
base: main
Conversation
Walkthrough

The PR updates model detection and usage to qwen3-17b and qwen3-06b, simplifies LlamaCpp setup to use the 0.6b model as primary, and expands llama-server CLI flags. The anonymizer flow is refactored to use OpenAI-style function tool calls (structured JSON) instead of manual parsing. CI workflow string quoting and several environment values were standardized, a POSTHOG key was set, .gguf files are ignored, and minor formatting/array changes were made in dependency download logic. A test and a frontend dependency-check interval were also adjusted.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Frontend
    participant LlamaCppServer
    participant Backend (Go)
    participant OllamaClient
    Frontend->>LlamaCppServer: ensure models present & start server (qwen3-06b primary)
    Frontend->>Backend (Go): send anonymization request
    Backend (Go)->>OllamaClient: Anonymize(prompt) with tool definition
    OllamaClient->>OllamaClient: send system+user messages + tools to model
    OllamaClient->>Backend (Go): receive tool_call with JSON args -> return replacements map
    Backend (Go)->>Frontend: respond with anonymized replacements
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~35 minutes
Actionable comments posted: 4
🔭 Outside diff range comments (1)
backend/golang/pkg/localmodel/ollama/client_test.go (1)
11-20: Add assertions to verify anonymization behavior. While the test checks for no errors, it should also verify that the anonymization actually occurred and produced expected results.
Add assertions to verify the anonymization results:
```diff
 func TestNewOllamaClient_Anonymize(t *testing.T) {
 	// t.Skip()
 	client := NewOllamaClient("http://localhost:11435/v1", "qwen3-4b_q4_k_m", log.Default())
 	prompt := "Im cooking chicken with Elisabeth and David. Then going to the park with Bishal"
 	result, err := client.Anonymize(context.Background(), prompt)
 	assert.NoError(t, err)
+	assert.NotNil(t, result)
+
+	// Verify that PII entities were detected
+	assert.Greater(t, len(result), 0, "Expected at least one PII entity to be anonymized")
+
+	// Verify specific entities were detected (optional, depends on model consistency)
+	for original := range result {
+		assert.Contains(t, prompt, original, "Original text should contain the PII entity")
+	}
 	t.Logf("Anonymization result: %+v", result)
 }
```
🧹 Nitpick comments (2)
backend/golang/pkg/localmodel/ollama/client.go (1)
92-97: Consider making MaxTokens configurable. The hardcoded limit of 256 tokens might be insufficient for anonymizing longer texts with many entities. Consider making this configurable or dynamically calculating it based on input length (a minimal sketch follows after these nitpick comments).
app/src/main/llamaCppServer.ts (1)
245-255: Variable naming inconsistency with model size. The function searches for "qwen3-17b" models but still uses "model4b" variable names and return keys. This creates confusion about what model is actually being used.
Consider updating variable names to match the actual model:
```diff
-  const model17b = ggufs.find((file) => file.toLowerCase().includes('qwen3-17b'))
+  const model17b = ggufs.find((file) => file.toLowerCase().includes('qwen3-17b'))
   const model06b = ggufs.find((file) => file.toLowerCase().includes('qwen3-06b'))

-  const model17bPath = model17b ? path.join(modelDir, model17b) : null
+  const model17bPath = model17b ? path.join(modelDir, model17b) : null
   const model06bPath = model06b ? path.join(modelDir, model06b) : null

   log.info(`[LlamaCpp] Found 17b model: ${model17bPath}`)
   log.info(`[LlamaCpp] Found 0.6b model: ${model06bPath}`)

-  return { model4b: model17bPath, model06b: model06bPath }
+  return { model17b: model17bPath, model06b: model06bPath }
```

Note: This would require updating all callers of this function.
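On the MaxTokens nitpick above, here is a minimal sketch of one way the completion budget could be derived from the prompt length instead of being hard-coded. The `estimateMaxTokens` helper, the bounds, and the characters-per-token heuristic are illustrative assumptions, not code from the existing client:

```go
package ollama // assumed to sit alongside the existing client

// estimateMaxTokens scales the completion budget with the prompt size,
// using a crude ~4-characters-per-token heuristic and clamping the result
// between the current hard-coded floor (256) and an upper cap.
func estimateMaxTokens(prompt string) int {
	const (
		floor         = 256  // today's fixed value becomes the minimum
		ceiling       = 2048 // avoid unbounded generations
		charsPerToken = 4    // rough, model-dependent heuristic
	)
	estimated := floor + len(prompt)/charsPerToken
	if estimated > ceiling {
		return ceiling
	}
	return estimated
}
```

The result could then be wired into the request's MaxTokens field or exposed as a client option, depending on how configurable the client is meant to be.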
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (7)
- .github/workflows/dev-release.yml (1 hunks)
- app/src/main/dependenciesDownload.ts (7 hunks)
- app/src/main/llamaCppServer.ts (3 hunks)
- app/src/renderer/src/components/setup/DependenciesGate.tsx (2 hunks)
- backend/golang/.gitignore (1 hunks)
- backend/golang/pkg/localmodel/ollama/client.go (2 hunks)
- backend/golang/pkg/localmodel/ollama/client_test.go (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
backend/golang/pkg/localmodel/ollama/client_test.go (1)
backend/golang/pkg/localmodel/ollama/client.go (1)
NewOllamaClient(61-67)
backend/golang/pkg/localmodel/ollama/client.go (1)
backend/golang/pkg/ai/ai.go (1)
Completion(9-11)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint-build-test
- GitHub Check: lint-build-test
🔇 Additional comments (5)
backend/golang/.gitignore (1)
20-21: LGTM! Appropriate additions to exclude VS Code debug configurations and GGUF model files from version control.
backend/golang/pkg/localmodel/ollama/client.go (2)
21-53: Well-structured tool definition for entity replacement. The tool schema correctly follows OpenAI's function calling specification with appropriate parameter types and constraints.
105-148: Excellent refactoring to use structured tool calls. The new implementation is more robust with proper error handling, structured response parsing, and comprehensive logging. This approach eliminates the fragility of manual JSON extraction (a small parsing sketch follows after this list of comments).
.github/workflows/dev-release.yml (1)
103-103: LGTM! Enabling the local anonymizer aligns with the PR's objective to implement local anonymization functionality.
app/src/renderer/src/components/setup/DependenciesGate.tsx (1)
211-211: Good improvement to monitoring frequency. Reducing the health check interval to 30 seconds provides more responsive server monitoring without being excessive.
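For context on the structured tool-call approach praised above, a self-contained sketch of decoding a tool call's JSON arguments into a replacements map with the standard library. The `replacements` field name and the example payload are assumptions for illustration, not the exact schema used in `client.go`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// replaceEntitiesArgs mirrors the JSON arguments a model might return for a
// "replace entities"-style tool call: original text mapped to a placeholder.
type replaceEntitiesArgs struct {
	Replacements map[string]string `json:"replacements"`
}

func main() {
	// Example arguments string as it might arrive in a tool_call payload.
	raw := `{"replacements":{"Elisabeth":"PERSON_1","David":"PERSON_2","Bishal":"PERSON_3"}}`

	var args replaceEntitiesArgs
	if err := json.Unmarshal([]byte(raw), &args); err != nil {
		// In a real client this would be surfaced as an anonymization error.
		panic(err)
	}
	fmt.Println(args.Replacements) // map[Bishal:PERSON_3 David:PERSON_2 Elisabeth:PERSON_1]
}
```

Decoding into a typed struct like this is what removes the need for the old bracket-scanning and quote-replacement logic.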
app/src/main/dependenciesDownload.ts
Outdated
```typescript
const extractVersionFromUrl = (url: string): string | null => {
  const match = url.match(/-(\d{4}-\d{2}-\d{2})\.zip$/)
  return match ? match[1] : null
}
```
🛠️ Refactor suggestion
Remove unused version extraction and duplicated logic.
The extractVersionFromUrl function is defined but its result is not meaningfully used. The conditional block (lines 178-188) duplicates the exact same logic as lines 189-195.
```diff
-const extractVersionFromUrl = (url: string): string | null => {
-  const match = url.match(/-(\d{4}-\d{2}-\d{2})\.zip$/)
-  return match ? match[1] : null
-}
-
```

And simplify the `isDownloaded` method:
```diff
   try {
-    const version = extractVersionFromUrl(this.url)
-    if (!version) {
-      const files = fs.readdirSync(this.dir)
-      const ggufs = files.filter((file) => file.endsWith('.gguf'))
-
-      const has17bModel = ggufs.some((file) => file.toLowerCase().includes('qwen3-17b'))
-
-      const has06bModel = ggufs.some((file) => file.toLowerCase().includes('qwen3-06b'))
-
-      return has17bModel && has06bModel
-    }
-
     const files = fs.readdirSync(this.dir)
     const ggufs = files.filter((file) => file.endsWith('.gguf'))
```

Also applies to: 177-195
🤖 Prompt for AI Agents
In app/src/main/dependenciesDownload.ts between lines 13-16 and 177-195, remove
the unused extractVersionFromUrl function and eliminate the duplicated logic
found in the conditional blocks from lines 178-188 and 189-195 by consolidating
them into a single implementation. Additionally, simplify the isDownloaded
method by refactoring it to avoid redundant checks and streamline its logic for
clarity and efficiency.
app/src/main/dependenciesDownload.ts
Outdated
```typescript
// Naming pattern: qwen3-4b_q4_k_m-YYYY-MM-DD.zip - e.g: qwen3-4b_q4_k_m-2025-01-15.zip
url: 'https://dgbaewh9qedok.cloudfront.net/models/qwen3-4b_q4_k_m-2025-08-01.zip',
```
Inconsistency between URL comment and model checks.
The comment mentions "qwen3-4b" naming pattern, but the code checks for "qwen3-17b" models. This suggests either the comment is outdated or the wrong model URL is being used.
Update the comment to match the actual model being checked:
```diff
-  // Naming pattern: qwen3-4b_q4_k_m-YYYY-MM-DD.zip - e.g: qwen3-4b_q4_k_m-2025-01-15.zip
+  // Naming pattern: qwen3-17b_q4_k_m-YYYY-MM-DD.zip and qwen3-06b_q4_k_m-YYYY-MM-DD.zip
```

Or verify if the correct model URL is being used that contains both 17b and 0.6b models.
Also applies to: 182-183, 192-193
🤖 Prompt for AI Agents
In app/src/main/dependenciesDownload.ts around lines 147-148, the comment
describing the model naming pattern references "qwen3-4b" while the code checks
for "qwen3-17b" models, causing inconsistency. Update the comment to accurately
reflect the model naming pattern used in the code, or verify and correct the URL
to match the intended model version (either 17b or 4b). Apply the same
correction to lines 182-183 and 192-193 to maintain consistency throughout the
file.
app/src/main/llamaCppServer.ts
Outdated
```diff
       '-md',
       this.model06bPath || '',
       '--flash-attn',
       '--ctx-size',
-      '8192',
+      '4096',
       '--cache-type-k',
-      'q4_0',
+      'q8_0',
       '--cache-type-v',
-      'q4_0',
+      'q8_0',
       '-ngl',
       '99',
       '-t',
       '-1',
       '-ngld',
       '32',
       '--draft-max',
       '32',
       '--draft-min',
       '12',
       '--draft-p-min',
       '0.5',
       '-b',
-      '2048',
+      '8192',
       '--ubatch-size',
       '3072',
       '--parallel',
       '2',
       '--reasoning-format',
       'none',
       '--reasoning-budget',
       '0',
       '--kv-unified',
       '--mlock',
       '-t',
       '-1',
       '-tb',
       '4',
       '--poll',
       '0',
```
💡 Verification agent
🧩 Analysis chain
Significant server configuration changes need documentation.
The server parameters have been extensively modified with potential performance implications:
- Context size reduced from 8192 to 4096 (50% reduction)
- Cache quantization changed from q4_0 to q8_0 (higher memory usage)
- Multiple new parameters added without explanation
Consider documenting the rationale for these changes, especially the context size reduction which may impact the model's ability to handle longer conversations.
Also, validate that all new parameters are supported by your llama.cpp version:
🏁 Script executed:
#!/bin/bash
# Check if llama.cpp documentation or help mentions the new parameters
fd -e md -e txt . | xargs rg -i "reasoning-format|reasoning-budget|kv-unified|draft-p-min" || echo "Parameters not found in documentation"

Length of output: 177
🌐 Web query:
Does llama.cpp support the command-line flags --reasoning-format, --reasoning-budget, --kv-unified, and --draft-p-min?
💡 Result:
Based on recent public sources, llama.cpp's command-line support for the specified flags is as follows:
Supported Flags:
- `--reasoning-format`: This flag is supported. Recent merged pull requests and code discussions for llama.cpp document usage of `--reasoning-format`, especially related to tool-calling support and models like DeepSeek R1 and Qwen, to control how reasoning or step-by-step thinking is output[2]. You can use values such as `none` to disable it or alternative formats to enable structured reasoning output[2].
Not Found / Unconfirmed, Possibly Unsupported:
- `--reasoning-budget`: No current official documentation, merged pull requests, or discussions explicitly reference a flag called `--reasoning-budget`. Some users discuss influencing "thinking" or reasoning time (e.g., for DeepSeek models) through indirect means like logit bias or custom code, but there is no command-line flag by this name in authoritative sources[10].
- `--kv-unified`: There is no official mention in documentation, default CLI argument examples, or model usage guides for llama.cpp referring to a `--kv-unified` flag. There are flags managing KV (key-value) cache types (e.g., `--cache-type-k`, `--ctx-size`), but not specifically `--kv-unified`[1]. If this flag exists, it is undocumented or highly experimental as of the latest available sources.
- `--draft-p-min`: No references exist to a `--draft-p-min` flag in discussions, PRs, or example usages[4]. There is ongoing discussion regarding draft models and adding more CLI flags for draft device control[4], suggesting related features are being explored, but `--draft-p-min` is not established as a supported flag.
Summary Table
| Flag | Supported? | Notes |
|---|---|---|
| --reasoning-format | Yes | Controls reasoning output format, used for tool-calling, DeepSeek, etc.[2] |
| --reasoning-budget | No | Not documented as CLI flag; no evidence of support[10] |
| --kv-unified | No | Not documented; closest are KV cache-related flags[1] |
| --draft-p-min | No | Not documented; draft model options discussed but not this flag[4] |
If you need to check flags supported by your build, it is advisable to run:
./llama-cli --help

This will output the complete and current list of supported flags for your compiled version.
If you have a specific use case for any of the unsupported flags, you may need to build custom versions or request the feature upstream.
Citations:
- 1: https://modal.com/docs/examples/llama_cpp
- 2: https://app.semanticdiff.com/gh/ggml-org/llama.cpp/pull/11607/overview
- 3: https://news.ycombinator.com/item?id=42886680
- 4: How to use a second system to run a draft model, and use it with the main model in the primary system? ggml-org/llama.cpp#12928
- 5: https://withcatai.github.io/node-llama-cpp/guide/
- 6: https://huggingface.co/Triangle104/Deep-Reasoning-Llama-3.2-COT-3B-Q5_K_S-GGUF
- 7: https://news.ycombinator.com/item?id=35393284
- 8: https://www.youtube.com/watch?v=r-05yuXTEPE
- 9: https://lmql.ai/docs/models/llama.cpp.html
- 10: Reasoning effort experiments ggml-org/llama.cpp#12339
Review and Update Unsupported CLI Flags & Document Configuration Changes
The new server arguments include several flags that aren’t recognized by the current llama.cpp CLI and will cause startup errors. Please:
- Document the rationale for the major tuning changes (context size → 4096, cache quantization → q8_0, draft model always enabled, batch sizes, threading, etc.).
- Remove or replace unsupported flags in `app/src/main/llamaCppServer.ts` (around lines 50–65):
  - `--reasoning-budget` (not a documented CLI flag)
  - `--kv-unified` (not supported by llama.cpp)
  - `--draft-p-min` (not recognized by the CLI)
- Retain `--reasoning-format` (this is supported).
- Validate all other flags against your compiled binary by running `./llama-cli --help`.
This will ensure no invalid parameters slip into production.
🤖 Prompt for AI Agents
In app/src/main/llamaCppServer.ts between lines 41 and 77, some CLI flags like
--reasoning-budget, --kv-unified, and --draft-p-min are not supported by the
current llama.cpp binary and will cause startup errors. Remove these unsupported
flags from the server arguments list. Keep the --reasoning-format flag as it is
supported. Additionally, add comments explaining the reasoning behind key tuning
parameters such as context size set to 4096, cache quantization to q8_0,
enabling draft model, batch sizes, and threading settings. Finally, verify all
remaining flags by running ./llama-cli --help against your compiled binary to
ensure compatibility.
```typescript
if (process.env.NODE_ENV === 'development') {
  startLlamaCpp()
}
```
🛠️ Refactor suggestion
Add error handling for startLlamaCpp call.
The development mode auto-start is a good addition, but should handle potential errors.
```diff
 if (process.env.NODE_ENV === 'development') {
-  startLlamaCpp()
+  startLlamaCpp().catch((error) => {
+    console.error('[DependenciesGate] Failed to start LlamaCpp in development:', error)
+  })
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```typescript
if (process.env.NODE_ENV === 'development') {
  startLlamaCpp().catch((error) => {
    console.error('[DependenciesGate] Failed to start LlamaCpp in development:', error)
  })
}
```
🤖 Prompt for AI Agents
In app/src/renderer/src/components/setup/DependenciesGate.tsx around lines 194
to 196, the call to startLlamaCpp() inside the development environment check
lacks error handling. Wrap the startLlamaCpp() call in a try-catch block to
catch any potential errors and handle them appropriately, such as logging the
error or displaying a notification, to ensure the app remains stable during
development.
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/golang/pkg/localmodel/ollama/client.go (1)
131-138: Consider handling multiple tool calls more robustly. The code only processes the first tool call without checking if multiple tool calls exist. While this may be the expected behavior, consider adding validation or logging when multiple tool calls are present.
```diff
 if len(response.ToolCalls) == 0 {
 	c.logger.Warn("Anonymization deserialization failed", "error", "no tool calls found in response")
 	return nil, fmt.Errorf("no tool calls found in response")
 }
+if len(response.ToolCalls) > 1 {
+	c.logger.Warn("Multiple tool calls received, using first one", "count", len(response.ToolCalls))
+}
+
+toolCall := response.ToolCalls[0]
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (1)
backend/golang/pkg/localmodel/ollama/client.go(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-release-dev
- GitHub Check: lint-build-test
- GitHub Check: lint-build-test
🔇 Additional comments (3)
backend/golang/pkg/localmodel/ollama/client.go (3)
21-53: Well-structured tool definition for PII anonymization. The `replaceEntitiesTool` definition follows OpenAI's function calling schema correctly. The schema is comprehensive with proper type definitions, clear descriptions, and appropriate required fields for the anonymization use case.
105-161: Excellent refactor to use structured tool calls. The complete refactor from manual JSON parsing to OpenAI function tool calls significantly improves the robustness and reliability of the anonymization process. The comprehensive system prompt provides clear guidelines, and the structured response handling is much more reliable than manual parsing.
92-95: Verify ToolChoice union field. Please confirm that the `ChatCompletionToolChoiceOptionUnionParam` variant is being used correctly to enforce "required" tool usage:
- File: `backend/golang/pkg/localmodel/ollama/client.go` (lines 92–95)

```go
ToolChoice: openai.ChatCompletionToolChoiceOptionUnionParam{
	OfAuto: openai.String("required"),
},
```

- Check the OpenAI Go SDK documentation or source for the `ChatCompletionToolChoiceOptionUnionParam` union to ensure there isn't an `OfRequired` field or another intended way to specify "required" instead of using `OfAuto`.
- Update the field variant if necessary so that the SDK actually enforces required tool invocation.
```go
jsonStr := strings.TrimSpace(content[startIndex : endIndex+1])
jsonStr = strings.ReplaceAll(jsonStr, "'", "\"")
c.logger.Info("Local anonymizer response", "original", prompt)
```
Security concern: Logging original prompt may expose PII.
The original prompt is being logged, which could contain the PII that the function is designed to anonymize. This creates a potential data leak in logs.
```diff
-c.logger.Info("Local anonymizer response", "original", prompt)
+c.logger.Info("Local anonymizer response received")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
c.logger.Info("Local anonymizer response received")
```
🤖 Prompt for AI Agents
In backend/golang/pkg/localmodel/ollama/client.go at line 129, the code logs the
original prompt which may contain PII, risking data exposure. Remove or replace
the logging of the original prompt with a safer alternative that does not
include sensitive data, such as logging only metadata or anonymized information
to prevent leaking PII in logs.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/dev-release.yml (1)
31-31: Pin Node.js version with a minor/patch wildcard for smoother future updates.
Using the bare major version can break builds when Node 21 becomes the default.

```diff
-          node-version: "20"
+          node-version: "20.x"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (1)
.github/workflows/dev-release.yml(5 hunks)
🧰 Additional context used
🪛 Gitleaks (8.27.2)
.github/workflows/dev-release.yml
93-93: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🪛 actionlint (1.7.7)
.github/workflows/dev-release.yml
116-116: the runner of "softprops/action-gh-release@v1" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint-build-test
- GitHub Check: lint-build-test
🔇 Additional comments (1)
.github/workflows/dev-release.yml (1)
41-42: Invalid Go version — workflow will fail to resolve `1.24.2`. Go 1.24 has not been released (current latest is 1.22.x). The action will error out trying to download a non-existent toolchain.
```diff
-          go-version: "1.24.2"
+          # Pin to the latest stable Go version or a specific released version.
+          go-version: "stable"
```

Likely an incorrect or invalid review comment.
```diff
 AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
 ENCHANTED_MCP_URL: https://proxy-api-dev.ep-use1.ghostagent.org/mcp
-POSTHOG_API_KEY: ''
+POSTHOG_API_KEY: phc_j0wkdCa4BCngP92yQBK3PbsuH5LWv3tliaasJEhw6v4
```
Hard-coded PostHog API key committed — immediate secret leakage.
Secrets must never be committed in plaintext; this key is now publicly exposed and should be revoked.
```diff
-POSTHOG_API_KEY: phc_j0wkdCa4BCngP92yQBK3PbsuH5LWv3tliaasJEhw6v4
+POSTHOG_API_KEY: ${{ secrets.POSTHOG_API_KEY }}
```

Action items:
- Revoke the exposed key in PostHog immediately.
- Rotate it and store the new value in the repository’s secrets.
- Force-push a commit that references the secret via `${{ secrets.POSTHOG_API_KEY }}` instead of hard-coding it.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
POSTHOG_API_KEY: ${{ secrets.POSTHOG_API_KEY }}
```
🧰 Tools
🪛 Gitleaks (8.27.2)
93-93: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🤖 Prompt for AI Agents
In .github/workflows/dev-release.yml at line 93, the PostHog API key is
hard-coded, exposing a secret publicly. Immediately revoke the exposed key in
PostHog, generate a new key, and store it securely in the repository's secrets.
Then, replace the hard-coded key in the workflow file with a reference to the
secret using the syntax `${{ secrets.POSTHOG_API_KEY }}` and force-push the
updated commit.
Actionable comments posted: 0
♻️ Duplicate comments (3)
app/src/main/dependenciesDownload.ts (2)
13-16: Remove the unused `extractVersionFromUrl` function. This function is defined but never used, causing TypeScript compilation errors and linting warnings.

```diff
-const extractVersionFromUrl = (url: string): string | null => {
-  const match = url.match(/-(\d{4}-\d{2}-\d{2})\.zip$/)
-  return match ? match[1] : null
-}
-
```
147-149: Fix the inconsistency between URL comment and model checks. The comment mentions "qwen3-4b" but the code now checks for "qwen3-06b" models, and the URL points to a "qwen3-06b" model.

```diff
-  // Naming pattern: qwen3-4b_q4_k_m-YYYY-MM-DD.zip - e.g: qwen3-4b_q4_k_m-2025-01-15.zip
+  // Naming pattern: qwen3-06b-q4_K_M-YYYY-MM-DD.zip - e.g: qwen3-06b-q4_K_M-2025-08-04.zip
   // url: 'https://dgbaewh9qedok.cloudfront.net/models/qwen3-4b_q4_k_m-2025-08-01.zip',
   url: 'https://dgbaewh9qedok.cloudfront.net/models/qwen3-06b-q4_K_M-2025-08-04.zip',
```

app/src/main/llamaCppServer.ts (1)
43-74: Remove unsupported CLI flags and document configuration changes. Several CLI flags are not supported by llama.cpp and will cause startup failures. Additionally, significant configuration changes need documentation.
Remove the unsupported flags:
```diff
       '--draft-max',
       '12',
       '--draft-min',
       '8',
-      '--draft-p-min',
-      '0.5',
       '-b',
       '4096',
       '--ubatch-size',
       '1536',
       '--reasoning-format',
       'none',
-      '--reasoning-budget',
-      '0',
-      '--kv-unified',
       '--mlock',
```

Also document the rationale for major changes:
- Context size reduced from 8192 to 4096 (50% reduction)
- Cache quantization changed from q4_0 to q8_0 (higher memory usage)
- New draft model parameters added
🧹 Nitpick comments (1)
app/src/main/dependenciesDownload.ts (1)
184-184: Remove unused error variable. The `error` variable in the catch block is defined but not used, causing static analysis warnings.

```diff
-  } catch (error) {
+  } catch {
     return false
   }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
- app/src/main/dependenciesDownload.ts (7 hunks)
- app/src/main/llamaCppServer.ts (3 hunks)
🧰 Additional context used
🪛 GitHub Check: lint-build-test
app/src/main/dependenciesDownload.ts
[warning] 13-13:
'extractVersionFromUrl' is assigned a value but never used
[warning] 184-184:
'error' is defined but never used
🪛 GitHub Actions: Frontend CI (app)
app/src/main/dependenciesDownload.ts
[error] 13-13: TypeScript error TS6133: 'extractVersionFromUrl' is declared but its value is never read.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: lint-build-test
🔇 Additional comments (3)
app/src/main/dependenciesDownload.ts (1)
181-183: LGTM! Simplified model checking logic is more appropriate. The updated logic now only checks for the "qwen3-06b" model, which aligns with the new model usage pattern and eliminates unnecessary complexity.
app/src/main/llamaCppServer.ts (2)
187-194: LGTM! Primary model switch to 0.6b is consistent. The change to use the 0.6b model as the primary model aligns with the dependency management updates and simplifies the server setup.
239-249: Model detection logic updated correctly for new versions. The function now correctly detects "qwen3-17b" and "qwen3-06b" models instead of the previous "4b" and "0.6b" patterns. The variable naming maintains backward compatibility while supporting the new model versions.
Note: The `model4b` key now holds the 17b model path for backward compatibility, which might be confusing but maintains the existing API.
Actionable comments posted: 2
♻️ Duplicate comments (1)
app/src/main/llamaCppServer.ts (1)
43-75: Address the unsupported CLI flags identified in previous review. The server arguments still contain several flags that were identified as unsupported by llama.cpp and will cause startup errors:
- `--reasoning-budget` (lines 62-63)
- `--kv-unified` (line 64)
- `--draft-p-min` (lines 54-55)

Additionally, the significant parameter changes (context size reduction to 4096, cache quantization to q8_0, new threading settings) still lack documentation explaining the rationale.
Remove the unsupported flags and add documentation:
```diff
-      '--draft-p-min',
-      '0.5',
       '-b',
       '4096',
       '--ubatch-size',
       '1536',
       '--reasoning-format',
       'none',
-      '--reasoning-budget',
-      '0',
-      '--kv-unified',
       '--mlock',
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
- app/src/main/dependenciesDownload.ts (7 hunks)
- app/src/main/llamaCppServer.ts (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- app/src/main/dependenciesDownload.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-and-release-dev
- GitHub Check: lint-build-test
- GitHub Check: lint-build-test
- GitHub Check: dead-code-check
```diff
   const { model06b } = findModelFiles(modelDir)

-  if (!model4b) {
-    log.warn('[LlamaCpp] No 4b GGUF model file found in anonymizer directory, skipping setup')
+  if (!model06b) {
+    log.warn('[LlamaCpp] No 6gb GGUF model file found in anonymizer directory, skipping setup')
     return
   }

   if (!llamaCppInstance) {
-    llamaCppInstance = new LlamaCppServerManager(model4b, model06b, 11435)
+    llamaCppInstance = new LlamaCppServerManager(model06b, null, 11435)
   }
```
🛠️ Refactor suggestion
Clarify misleading parameter naming in constructor call.
The setup logic now passes model06b to the model4bPath parameter, which creates confusion about which model is actually being used. The constructor parameter names don't match the actual model being passed.
Consider either:
- Updating the constructor parameter names to reflect actual usage:

```diff
-  constructor(model4bPath: string, model06bPath: string | null = null, port: number = 11435) {
-    this.model4bPath = model4bPath
+  constructor(primaryModelPath: string, secondaryModelPath: string | null = null, port: number = 11435) {
+    this.primaryModelPath = primaryModelPath
```

- Or passing the correct model based on the parameter name:

```diff
-  llamaCppInstance = new LlamaCppServerManager(model06b, null, 11435)
+  llamaCppInstance = new LlamaCppServerManager(model17b || model06b, model06b, 11435)
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In app/src/main/llamaCppServer.ts around lines 185 to 194, the constructor call
to LlamaCppServerManager passes model06b to a parameter named model4bPath,
causing confusion. To fix this, either rename the constructor parameters to
accurately reflect the models being passed or adjust the arguments in the
constructor call to match the expected parameter names, ensuring the correct
model file is passed to the corresponding parameter.
```diff
+  const model17b = ggufs.find((file) => file.toLowerCase().includes('qwen3-17b'))

-  const model06b = ggufs.find(
-    (file) =>
-      file.toLowerCase().includes('qwen') &&
-      (file.toLowerCase().includes('0.6b') || file.toLowerCase().includes('0.6-b'))
-  )
+  const model06b = ggufs.find((file) => file.toLowerCase().includes('qwen3-06b'))

-  const model4bPath = model4b ? path.join(modelDir, model4b) : null
+  const model17bPath = model17b ? path.join(modelDir, model17b) : null
   const model06bPath = model06b ? path.join(modelDir, model06b) : null

-  log.info(`[LlamaCpp] Found 4b model: ${model4bPath}`)
+  log.info(`[LlamaCpp] Found 17b model: ${model17bPath}`)
   log.info(`[LlamaCpp] Found 0.6b model: ${model06bPath}`)

-  return { model4b: model4bPath, model06b: model06bPath }
+  return { model4b: model17bPath, model06b: model06bPath }
```
🛠️ Refactor suggestion
Fix misleading return structure - 17b model returned as model4b.
The function finds a 17b model but returns it under the model4b key, creating significant confusion:
- Variable `model17bPath` contains the 17b model path
- The log message correctly identifies it as "17b model"
- But it's returned as `model4b` in the result object
This naming inconsistency will confuse future maintainers and could lead to bugs.
Update the return structure to match the actual models:
```diff
-  return { model4b: model17bPath, model06b: model06bPath }
+  return { model17b: model17bPath, model06b: model06bPath }
```
+ return { model17b: model17bPath, model06b: model06bPath }And update the caller accordingly:
```diff
-  const { model06b } = findModelFiles(modelDir)
+  const { model17b, model06b } = findModelFiles(modelDir)
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In app/src/main/llamaCppServer.ts around lines 239 to 249, the 17b model path is
mistakenly returned under the key `model4b`, which is misleading. Change the
return object key from `model4b` to `model17b` to accurately reflect the model
version. Also, ensure any code calling this function is updated to use the new
`model17b` key instead of `model4b`.