Feature/anonymizer dev #598
base: main
Conversation
Walkthrough

The PR updates model detection and usage to qwen3-17b and qwen3-06b, simplifies LlamaCpp setup to use the 0.6b model as primary, and expands llama-server CLI flags. The anonymizer flow is refactored to use OpenAI-style function tool calls (structured JSON) instead of manual parsing. CI workflow string quoting and several environment values were standardized, a POSTHOG key was set, .gguf files are ignored, and minor formatting/array changes were made in dependency download logic. A test and a frontend dependency-check interval were also adjusted.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Frontend
    participant LlamaCppServer
    participant Backend (Go)
    participant OllamaClient
    Frontend->>LlamaCppServer: ensure models present & start server (qwen3-06b primary)
    Frontend->>Backend (Go): send anonymization request
    Backend (Go)->>OllamaClient: Anonymize(prompt) with tool definition
    OllamaClient->>OllamaClient: send system+user messages + tools to model
    OllamaClient->>Backend (Go): receive tool_call with JSON args -> return replacements map
    Backend (Go)->>Frontend: respond with anonymized replacements
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~35 minutes
Actionable comments posted: 4
🔭 Outside diff range comments (1)
backend/golang/pkg/localmodel/ollama/client_test.go (1)
11-20: Add assertions to verify anonymization behavior. While the test checks for no errors, it should also verify that the anonymization actually occurred and produced expected results.
Add assertions to verify the anonymization results:
```diff
 func TestNewOllamaClient_Anonymize(t *testing.T) {
 	// t.Skip()
 	client := NewOllamaClient("http://localhost:11435/v1", "qwen3-4b_q4_k_m", log.Default())
 	prompt := "Im cooking chicken with Elisabeth and David. Then going to the park with Bishal"
 	result, err := client.Anonymize(context.Background(), prompt)
 	assert.NoError(t, err)
+	assert.NotNil(t, result)
+
+	// Verify that PII entities were detected
+	assert.Greater(t, len(result), 0, "Expected at least one PII entity to be anonymized")
+
+	// Verify specific entities were detected (optional, depends on model consistency)
+	for original := range result {
+		assert.Contains(t, prompt, original, "Original text should contain the PII entity")
+	}
 	t.Logf("Anonymization result: %+v", result)
 }
```
🧹 Nitpick comments (2)
backend/golang/pkg/localmodel/ollama/client.go (1)
92-97: Consider making MaxTokens configurable. The hardcoded limit of 256 tokens might be insufficient for anonymizing longer texts with many entities. Consider making this configurable or dynamically calculating it based on input length (a minimal sketch follows after these nitpick comments).
app/src/main/llamaCppServer.ts (1)
245-255: Variable naming inconsistency with model size. The function searches for "qwen3-17b" models but still uses "model4b" variable names and return keys. This creates confusion about what model is actually being used.
Consider updating variable names to match the actual model:
```diff
-  const model17b = ggufs.find((file) => file.toLowerCase().includes('qwen3-17b'))
+  const model17b = ggufs.find((file) => file.toLowerCase().includes('qwen3-17b'))
   const model06b = ggufs.find((file) => file.toLowerCase().includes('qwen3-06b'))

-  const model17bPath = model17b ? path.join(modelDir, model17b) : null
+  const model17bPath = model17b ? path.join(modelDir, model17b) : null
   const model06bPath = model06b ? path.join(modelDir, model06b) : null

   log.info(`[LlamaCpp] Found 17b model: ${model17bPath}`)
   log.info(`[LlamaCpp] Found 0.6b model: ${model06bPath}`)

-  return { model4b: model17bPath, model06b: model06bPath }
+  return { model17b: model17bPath, model06b: model06bPath }
```

Note: This would require updating all callers of this function.
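On the MaxTokens nitpick above, here is a minimal sketch of one way the completion budget could be derived from the prompt length instead of being hard-coded. The `estimateMaxTokens` helper, the bounds, and the characters-per-token heuristic are illustrative assumptions, not code from the existing client:

```go
package ollama // assumed to sit alongside the existing client

// estimateMaxTokens scales the completion budget with the prompt size,
// using a crude ~4-characters-per-token heuristic and clamping the result
// between the current hard-coded floor (256) and an upper cap.
func estimateMaxTokens(prompt string) int {
	const (
		floor         = 256  // today's fixed value becomes the minimum
		ceiling       = 2048 // avoid unbounded generations
		charsPerToken = 4    // rough, model-dependent heuristic
	)
	estimated := floor + len(prompt)/charsPerToken
	if estimated > ceiling {
		return ceiling
	}
	return estimated
}
```

The result could then be wired into the request's MaxTokens field or exposed as a client option, depending on how configurable the client is meant to be.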
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (7)
- .github/workflows/dev-release.yml (1 hunks)
- app/src/main/dependenciesDownload.ts (7 hunks)
- app/src/main/llamaCppServer.ts (3 hunks)
- app/src/renderer/src/components/setup/DependenciesGate.tsx (2 hunks)
- backend/golang/.gitignore (1 hunks)
- backend/golang/pkg/localmodel/ollama/client.go (2 hunks)
- backend/golang/pkg/localmodel/ollama/client_test.go (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
backend/golang/pkg/localmodel/ollama/client_test.go (1)
backend/golang/pkg/localmodel/ollama/client.go (1)
NewOllamaClient(61-67)
backend/golang/pkg/localmodel/ollama/client.go (1)
backend/golang/pkg/ai/ai.go (1)
Completion(9-11)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint-build-test
- GitHub Check: lint-build-test
🔇 Additional comments (5)
backend/golang/.gitignore (1)
20-21: LGTM! Appropriate additions to exclude VS Code debug configurations and GGUF model files from version control.
backend/golang/pkg/localmodel/ollama/client.go (2)
21-53: Well-structured tool definition for entity replacement. The tool schema correctly follows OpenAI's function calling specification with appropriate parameter types and constraints.
105-148: Excellent refactoring to use structured tool calls. The new implementation is more robust with proper error handling, structured response parsing, and comprehensive logging. This approach eliminates the fragility of manual JSON extraction (a small parsing sketch follows after this list of comments).
.github/workflows/dev-release.yml (1)
103-103: LGTM! Enabling the local anonymizer aligns with the PR's objective to implement local anonymization functionality.
app/src/renderer/src/components/setup/DependenciesGate.tsx (1)
211-211: Good improvement to monitoring frequency. Reducing the health check interval to 30 seconds provides more responsive server monitoring without being excessive.
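For context on the structured tool-call approach praised above, a self-contained sketch of decoding a tool call's JSON arguments into a replacements map with the standard library. The `replacements` field name and the example payload are assumptions for illustration, not the exact schema used in `client.go`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// replaceEntitiesArgs mirrors the JSON arguments a model might return for a
// "replace entities"-style tool call: original text mapped to a placeholder.
type replaceEntitiesArgs struct {
	Replacements map[string]string `json:"replacements"`
}

func main() {
	// Example arguments string as it might arrive in a tool_call payload.
	raw := `{"replacements":{"Elisabeth":"PERSON_1","David":"PERSON_2","Bishal":"PERSON_3"}}`

	var args replaceEntitiesArgs
	if err := json.Unmarshal([]byte(raw), &args); err != nil {
		// In a real client this would be surfaced as an anonymization error.
		panic(err)
	}
	fmt.Println(args.Replacements) // map[Bishal:PERSON_3 David:PERSON_2 Elisabeth:PERSON_1]
}
```

Decoding into a typed struct like this is what removes the need for the old bracket-scanning and quote-replacement logic.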
app/src/main/dependenciesDownload.ts
Outdated
```typescript
const extractVersionFromUrl = (url: string): string | null => {
  const match = url.match(/-(\d{4}-\d{2}-\d{2})\.zip$/)
  return match ? match[1] : null
}
```
🛠️ Refactor suggestion
Remove unused version extraction and duplicated logic.
The extractVersionFromUrl function is defined but its result is not meaningfully used. The conditional block (lines 178-188) duplicates the exact same logic as lines 189-195.
```diff
-const extractVersionFromUrl = (url: string): string | null => {
-  const match = url.match(/-(\d{4}-\d{2}-\d{2})\.zip$/)
-  return match ? match[1] : null
-}
-
```

And simplify the `isDownloaded` method:
```diff
   try {
-    const version = extractVersionFromUrl(this.url)
-    if (!version) {
-      const files = fs.readdirSync(this.dir)
-      const ggufs = files.filter((file) => file.endsWith('.gguf'))
-
-      const has17bModel = ggufs.some((file) => file.toLowerCase().includes('qwen3-17b'))
-
-      const has06bModel = ggufs.some((file) => file.toLowerCase().includes('qwen3-06b'))
-
-      return has17bModel && has06bModel
-    }
-
     const files = fs.readdirSync(this.dir)
     const ggufs = files.filter((file) => file.endsWith('.gguf'))
```

Also applies to: 177-195
🤖 Prompt for AI Agents
In app/src/main/dependenciesDownload.ts between lines 13-16 and 177-195, remove
the unused extractVersionFromUrl function and eliminate the duplicated logic
found in the conditional blocks from lines 178-188 and 189-195 by consolidating
them into a single implementation. Additionally, simplify the isDownloaded
method by refactoring it to avoid redundant checks and streamline its logic for
clarity and efficiency.
app/src/main/dependenciesDownload.ts
Outdated
```typescript
// Naming pattern: qwen3-4b_q4_k_m-YYYY-MM-DD.zip - e.g: qwen3-4b_q4_k_m-2025-01-15.zip
url: 'https://dgbaewh9qedok.cloudfront.net/models/qwen3-4b_q4_k_m-2025-08-01.zip',
```
Inconsistency between URL comment and model checks.
The comment mentions "qwen3-4b" naming pattern, but the code checks for "qwen3-17b" models. This suggests either the comment is outdated or the wrong model URL is being used.
Update the comment to match the actual model being checked:
```diff
-  // Naming pattern: qwen3-4b_q4_k_m-YYYY-MM-DD.zip - e.g: qwen3-4b_q4_k_m-2025-01-15.zip
+  // Naming pattern: qwen3-17b_q4_k_m-YYYY-MM-DD.zip and qwen3-06b_q4_k_m-YYYY-MM-DD.zip
```

Or verify if the correct model URL is being used that contains both 17b and 0.6b models.
Also applies to: 182-183, 192-193
🤖 Prompt for AI Agents
In app/src/main/dependenciesDownload.ts around lines 147-148, the comment
describing the model naming pattern references "qwen3-4b" while the code checks
for "qwen3-17b" models, causing inconsistency. Update the comment to accurately
reflect the model naming pattern used in the code, or verify and correct the URL
to match the intended model version (either 17b or 4b). Apply the same
correction to lines 182-183 and 192-193 to maintain consistency throughout the
file.
app/src/main/llamaCppServer.ts
Outdated
```diff
       '-md',
       this.model06bPath || '',
       '--flash-attn',
       '--ctx-size',
-      '8192',
+      '4096',
       '--cache-type-k',
-      'q4_0',
+      'q8_0',
       '--cache-type-v',
-      'q4_0',
+      'q8_0',
       '-ngl',
       '99',
       '-t',
       '-1',
       '-ngld',
       '32',
       '--draft-max',
       '32',
       '--draft-min',
       '12',
       '--draft-p-min',
       '0.5',
       '-b',
-      '2048',
+      '8192',
       '--ubatch-size',
       '3072',
       '--parallel',
       '2',
       '--reasoning-format',
       'none',
       '--reasoning-budget',
       '0',
       '--kv-unified',
       '--mlock',
       '-t',
       '-1',
       '-tb',
       '4',
       '--poll',
       '0',
```
💡 Verification agent
🧩 Analysis chain
Significant server configuration changes need documentation.
The server parameters have been extensively modified with potential performance implications:
- Context size reduced from 8192 to 4096 (50% reduction)
- Cache quantization changed from q4_0 to q8_0 (higher memory usage)
- Multiple new parameters added without explanation
Consider documenting the rationale for these changes, especially the context size reduction which may impact the model's ability to handle longer conversations.
Also, validate that all new parameters are supported by your llama.cpp version:
🏁 Script executed:
#!/bin/bash
# Check if llama.cpp documentation or help mentions the new parameters
fd -e md -e txt . | xargs rg -i "reasoning-format|reasoning-budget|kv-unified|draft-p-min" || echo "Parameters not found in documentation"

Length of output: 177
🌐 Web query:
Does llama.cpp support the command-line flags --reasoning-format, --reasoning-budget, --kv-unified, and --draft-p-min?
💡 Result:
Based on recent public sources, llama.cpp's command-line support for the specified flags is as follows:
Supported Flags:
- `--reasoning-format`: This flag is supported. Recent merged pull requests and code discussions for llama.cpp document usage of `--reasoning-format`, especially related to tool-calling support and models like DeepSeek R1 and Qwen, to control how reasoning or step-by-step thinking is output[2]. You can use values such as `none` to disable it or alternative formats to enable structured reasoning output[2].
Not Found / Unconfirmed, Possibly Unsupported:
- `--reasoning-budget`: No current official documentation, merged pull requests, or discussions explicitly reference a flag called `--reasoning-budget`. Some users discuss influencing "thinking" or reasoning time (e.g., for DeepSeek models) through indirect means like logit bias or custom code, but there is no command-line flag by this name in authoritative sources[10].
- `--kv-unified`: There is no official mention in documentation, default CLI argument examples, or model usage guides for llama.cpp referring to a `--kv-unified` flag. There are flags managing KV (key-value) cache types (e.g., `--cache-type-k`, `--ctx-size`), but not specifically `--kv-unified`[1]. If this flag exists, it is undocumented or highly experimental as of the latest available sources.
- `--draft-p-min`: No references exist to a `--draft-p-min` flag in discussions, PRs, or example usages[4]. There is ongoing discussion regarding draft models and adding more CLI flags for draft device control[4], suggesting related features are being explored, but `--draft-p-min` is not established as a supported flag.
Summary Table
| Flag | Supported? | Notes |
|---|---|---|
| --reasoning-format | Yes | Controls reasoning output format, used for tool-calling, DeepSeek, etc.[2] |
| --reasoning-budget | No | Not documented as CLI flag; no evidence of support[10] |
| --kv-unified | No | Not documented; closest are KV cache-related flags[1] |
| --draft-p-min | No | Not documented; draft model options discussed but not this flag[4] |
If you need to check flags supported by your build, it is advisable to run:
./llama-cli --help

This will output the complete and current list of supported flags for your compiled version.
If you have a specific use case for any of the unsupported flags, you may need to build custom versions or request the feature upstream.
Citations:
- 1: https://modal.com/docs/examples/llama_cpp
- 2: https://app.semanticdiff.com/gh/ggml-org/llama.cpp/pull/11607/overview
- 3: https://news.ycombinator.com/item?id=42886680
- 4: How to use a second system to run a draft model, and use it with the main model in the primary system? ggml-org/llama.cpp#12928
- 5: https://withcatai.github.io/node-llama-cpp/guide/
- 6: https://huggingface.co/Triangle104/Deep-Reasoning-Llama-3.2-COT-3B-Q5_K_S-GGUF
- 7: https://news.ycombinator.com/item?id=35393284
- 8: https://www.youtube.com/watch?v=r-05yuXTEPE
- 9: https://lmql.ai/docs/models/llama.cpp.html
- 10: Reasoning effort experiments ggml-org/llama.cpp#12339
Review and Update Unsupported CLI Flags & Document Configuration Changes
The new server arguments include several flags that aren’t recognized by the current llama.cpp CLI and will cause startup errors. Please:
- Document the rationale for the major tuning changes (context size → 4096, cache quantization → q8_0, draft model always enabled, batch sizes, threading, etc.).
- Remove or replace unsupported flags in `app/src/main/llamaCppServer.ts` (around lines 50–65):
  - `--reasoning-budget` (not a documented CLI flag)
  - `--kv-unified` (not supported by llama.cpp)
  - `--draft-p-min` (not recognized by the CLI)
- Retain `--reasoning-format` (this is supported).
- Validate all other flags against your compiled binary by running `./llama-cli --help`.
This will ensure no invalid parameters slip into production.
🤖 Prompt for AI Agents
In app/src/main/llamaCppServer.ts between lines 41 and 77, some CLI flags like
--reasoning-budget, --kv-unified, and --draft-p-min are not supported by the
current llama.cpp binary and will cause startup errors. Remove these unsupported
flags from the server arguments list. Keep the --reasoning-format flag as it is
supported. Additionally, add comments explaining the reasoning behind key tuning
parameters such as context size set to 4096, cache quantization to q8_0,
enabling draft model, batch sizes, and threading settings. Finally, verify all
remaining flags by running ./llama-cli --help against your compiled binary to
ensure compatibility.
```typescript
if (process.env.NODE_ENV === 'development') {
  startLlamaCpp()
}
```
🛠️ Refactor suggestion
Add error handling for startLlamaCpp call.
The development mode auto-start is a good addition, but should handle potential errors.
```diff
 if (process.env.NODE_ENV === 'development') {
-  startLlamaCpp()
+  startLlamaCpp().catch((error) => {
+    console.error('[DependenciesGate] Failed to start LlamaCpp in development:', error)
+  })
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```typescript
if (process.env.NODE_ENV === 'development') {
  startLlamaCpp().catch((error) => {
    console.error('[DependenciesGate] Failed to start LlamaCpp in development:', error)
  })
}
```
🤖 Prompt for AI Agents
In app/src/renderer/src/components/setup/DependenciesGate.tsx around lines 194
to 196, the call to startLlamaCpp() inside the development environment check
lacks error handling. Wrap the startLlamaCpp() call in a try-catch block to
catch any potential errors and handle them appropriately, such as logging the
error or displaying a notification, to ensure the app remains stable during
development.
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/golang/pkg/localmodel/ollama/client.go (1)
131-138: Consider handling multiple tool calls more robustly. The code only processes the first tool call without checking if multiple tool calls exist. While this may be the expected behavior, consider adding validation or logging when multiple tool calls are present.
```diff
 if len(response.ToolCalls) == 0 {
 	c.logger.Warn("Anonymization deserialization failed", "error", "no tool calls found in response")
 	return nil, fmt.Errorf("no tool calls found in response")
 }
+if len(response.ToolCalls) > 1 {
+	c.logger.Warn("Multiple tool calls received, using first one", "count", len(response.ToolCalls))
+}
+
+toolCall := response.ToolCalls[0]
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (1)
backend/golang/pkg/localmodel/ollama/client.go(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-release-dev
- GitHub Check: lint-build-test
- GitHub Check: lint-build-test
🔇 Additional comments (3)
backend/golang/pkg/localmodel/ollama/client.go (3)
21-53: Well-structured tool definition for PII anonymization. The `replaceEntitiesTool` definition follows OpenAI's function calling schema correctly. The schema is comprehensive with proper type definitions, clear descriptions, and appropriate required fields for the anonymization use case.
105-161: Excellent refactor to use structured tool calls. The complete refactor from manual JSON parsing to OpenAI function tool calls significantly improves the robustness and reliability of the anonymization process. The comprehensive system prompt provides clear guidelines, and the structured response handling is much more reliable than manual parsing.
92-95: Verify ToolChoice union field. Please confirm that the `ChatCompletionToolChoiceOptionUnionParam` variant is being used correctly to enforce "required" tool usage:
- File: `backend/golang/pkg/localmodel/ollama/client.go` (lines 92–95)

```go
ToolChoice: openai.ChatCompletionToolChoiceOptionUnionParam{
	OfAuto: openai.String("required"),
},
```

- Check the OpenAI Go SDK documentation or source for the `ChatCompletionToolChoiceOptionUnionParam` union to ensure there isn't an `OfRequired` field or another intended way to specify "required" instead of using `OfAuto`.
- Update the field variant if necessary so that the SDK actually enforces required tool invocation.
```go
jsonStr := strings.TrimSpace(content[startIndex : endIndex+1])
jsonStr = strings.ReplaceAll(jsonStr, "'", "\"")
c.logger.Info("Local anonymizer response", "original", prompt)
```
Security concern: Logging original prompt may expose PII.
The original prompt is being logged, which could contain the PII that the function is designed to anonymize. This creates a potential data leak in logs.
```diff
-c.logger.Info("Local anonymizer response", "original", prompt)
+c.logger.Info("Local anonymizer response received")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
c.logger.Info("Local anonymizer response received")
```
🤖 Prompt for AI Agents
In backend/golang/pkg/localmodel/ollama/client.go at line 129, the code logs the
original prompt which may contain PII, risking data exposure. Remove or replace
the logging of the original prompt with a safer alternative that does not
include sensitive data, such as logging only metadata or anonymized information
to prevent leaking PII in logs.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/dev-release.yml (1)
31-31: Pin Node.js version with a minor/patch wildcard for smoother future updates.
Using the bare major version can break builds when Node 21 becomes the default.

```diff
-          node-version: "20"
+          node-version: "20.x"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (1)
.github/workflows/dev-release.yml(5 hunks)
🧰 Additional context used
🪛 Gitleaks (8.27.2)
.github/workflows/dev-release.yml
93-93: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🪛 actionlint (1.7.7)
.github/workflows/dev-release.yml
116-116: the runner of "softprops/action-gh-release@v1" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint-build-test
- GitHub Check: lint-build-test
🔇 Additional comments (1)
.github/workflows/dev-release.yml (1)
41-42: Invalid Go version — workflow will fail to resolve `1.24.2`. Go 1.24 has not been released (current latest is 1.22.x). The action will error out trying to download a non-existent toolchain.
```diff
-          go-version: "1.24.2"
+          # Pin to the latest stable Go version or a specific released version.
+          go-version: "stable"
```

Likely an incorrect or invalid review comment.
```diff
 AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
 ENCHANTED_MCP_URL: https://proxy-api-dev.ep-use1.ghostagent.org/mcp
-POSTHOG_API_KEY: ''
+POSTHOG_API_KEY: phc_j0wkdCa4BCngP92yQBK3PbsuH5LWv3tliaasJEhw6v4
```
Hard-coded PostHog API key committed — immediate secret leakage.
Secrets must never be committed in plaintext; this key is now publicly exposed and should be revoked.
```diff
-POSTHOG_API_KEY: phc_j0wkdCa4BCngP92yQBK3PbsuH5LWv3tliaasJEhw6v4
+POSTHOG_API_KEY: ${{ secrets.POSTHOG_API_KEY }}
```

Action items:
- Revoke the exposed key in PostHog immediately.
- Rotate it and store the new value in the repository’s secrets.
- Force-push a commit that references the secret via `${{ secrets.POSTHOG_API_KEY }}` instead of hard-coding it.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
POSTHOG_API_KEY: ${{ secrets.POSTHOG_API_KEY }}
```
🧰 Tools
🪛 Gitleaks (8.27.2)
93-93: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🤖 Prompt for AI Agents
In .github/workflows/dev-release.yml at line 93, the PostHog API key is
hard-coded, exposing a secret publicly. Immediately revoke the exposed key in
PostHog, generate a new key, and store it securely in the repository's secrets.
Then, replace the hard-coded key in the workflow file with a reference to the
secret using the syntax `${{ secrets.POSTHOG_API_KEY }}` and force-push the
updated commit.
Actionable comments posted: 0
♻️ Duplicate comments (3)
app/src/main/dependenciesDownload.ts (2)
13-16: Remove the unused `extractVersionFromUrl` function. This function is defined but never used, causing TypeScript compilation errors and linting warnings.

```diff
-const extractVersionFromUrl = (url: string): string | null => {
-  const match = url.match(/-(\d{4}-\d{2}-\d{2})\.zip$/)
-  return match ? match[1] : null
-}
-
```
147-149: Fix the inconsistency between URL comment and model checks. The comment mentions "qwen3-4b" but the code now checks for "qwen3-06b" models, and the URL points to a "qwen3-06b" model.

```diff
-  // Naming pattern: qwen3-4b_q4_k_m-YYYY-MM-DD.zip - e.g: qwen3-4b_q4_k_m-2025-01-15.zip
+  // Naming pattern: qwen3-06b-q4_K_M-YYYY-MM-DD.zip - e.g: qwen3-06b-q4_K_M-2025-08-04.zip
   // url: 'https://dgbaewh9qedok.cloudfront.net/models/qwen3-4b_q4_k_m-2025-08-01.zip',
   url: 'https://dgbaewh9qedok.cloudfront.net/models/qwen3-06b-q4_K_M-2025-08-04.zip',
```

app/src/main/llamaCppServer.ts (1)
43-74: Remove unsupported CLI flags and document configuration changes. Several CLI flags are not supported by llama.cpp and will cause startup failures. Additionally, significant configuration changes need documentation.
Remove the unsupported flags:
```diff
       '--draft-max',
       '12',
       '--draft-min',
       '8',
-      '--draft-p-min',
-      '0.5',
       '-b',
       '4096',
       '--ubatch-size',
       '1536',
       '--reasoning-format',
       'none',
-      '--reasoning-budget',
-      '0',
-      '--kv-unified',
       '--mlock',
```

Also document the rationale for major changes:
- Context size reduced from 8192 to 4096 (50% reduction)
- Cache quantization changed from q4_0 to q8_0 (higher memory usage)
- New draft model parameters added
🧹 Nitpick comments (1)
app/src/main/dependenciesDownload.ts (1)
184-184: Remove unused error variable. The `error` variable in the catch block is defined but not used, causing static analysis warnings.

```diff
-  } catch (error) {
+  } catch {
     return false
   }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
- app/src/main/dependenciesDownload.ts (7 hunks)
- app/src/main/llamaCppServer.ts (3 hunks)
🧰 Additional context used
🪛 GitHub Check: lint-build-test
app/src/main/dependenciesDownload.ts
[warning] 13-13:
'extractVersionFromUrl' is assigned a value but never used
[warning] 184-184:
'error' is defined but never used
🪛 GitHub Actions: Frontend CI (app)
app/src/main/dependenciesDownload.ts
[error] 13-13: TypeScript error TS6133: 'extractVersionFromUrl' is declared but its value is never read.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: lint-build-test
🔇 Additional comments (3)
app/src/main/dependenciesDownload.ts (1)
181-183: LGTM! Simplified model checking logic is more appropriate. The updated logic now only checks for the "qwen3-06b" model, which aligns with the new model usage pattern and eliminates unnecessary complexity.
app/src/main/llamaCppServer.ts (2)
187-194: LGTM! Primary model switch to 0.6b is consistent. The change to use the 0.6b model as the primary model aligns with the dependency management updates and simplifies the server setup.
239-249: Model detection logic updated correctly for new versions. The function now correctly detects "qwen3-17b" and "qwen3-06b" models instead of the previous "4b" and "0.6b" patterns. The variable naming maintains backward compatibility while supporting the new model versions.
Note: The `model4b` key now holds the 17b model path for backward compatibility, which might be confusing but maintains the existing API.
Actionable comments posted: 2
♻️ Duplicate comments (1)
app/src/main/llamaCppServer.ts (1)
43-75: Address the unsupported CLI flags identified in previous review. The server arguments still contain several flags that were identified as unsupported by llama.cpp and will cause startup errors:
- `--reasoning-budget` (lines 62-63)
- `--kv-unified` (line 64)
- `--draft-p-min` (lines 54-55)

Additionally, the significant parameter changes (context size reduction to 4096, cache quantization to q8_0, new threading settings) still lack documentation explaining the rationale.
Remove the unsupported flags and add documentation:
```diff
-      '--draft-p-min',
-      '0.5',
       '-b',
       '4096',
       '--ubatch-size',
       '1536',
       '--reasoning-format',
       'none',
-      '--reasoning-budget',
-      '0',
-      '--kv-unified',
       '--mlock',
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
- app/src/main/dependenciesDownload.ts (7 hunks)
- app/src/main/llamaCppServer.ts (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- app/src/main/dependenciesDownload.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-and-release-dev
- GitHub Check: lint-build-test
- GitHub Check: lint-build-test
- GitHub Check: dead-code-check
```diff
   const { model06b } = findModelFiles(modelDir)

-  if (!model4b) {
-    log.warn('[LlamaCpp] No 4b GGUF model file found in anonymizer directory, skipping setup')
+  if (!model06b) {
+    log.warn('[LlamaCpp] No 6gb GGUF model file found in anonymizer directory, skipping setup')
     return
   }

   if (!llamaCppInstance) {
-    llamaCppInstance = new LlamaCppServerManager(model4b, model06b, 11435)
+    llamaCppInstance = new LlamaCppServerManager(model06b, null, 11435)
   }
```
🛠️ Refactor suggestion
Clarify misleading parameter naming in constructor call.
The setup logic now passes model06b to the model4bPath parameter, which creates confusion about which model is actually being used. The constructor parameter names don't match the actual model being passed.
Consider either:
- Updating the constructor parameter names to reflect actual usage:

```diff
-  constructor(model4bPath: string, model06bPath: string | null = null, port: number = 11435) {
-    this.model4bPath = model4bPath
+  constructor(primaryModelPath: string, secondaryModelPath: string | null = null, port: number = 11435) {
+    this.primaryModelPath = primaryModelPath
```

- Or passing the correct model based on the parameter name:

```diff
-  llamaCppInstance = new LlamaCppServerManager(model06b, null, 11435)
+  llamaCppInstance = new LlamaCppServerManager(model17b || model06b, model06b, 11435)
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In app/src/main/llamaCppServer.ts around lines 185 to 194, the constructor call
to LlamaCppServerManager passes model06b to a parameter named model4bPath,
causing confusion. To fix this, either rename the constructor parameters to
accurately reflect the models being passed or adjust the arguments in the
constructor call to match the expected parameter names, ensuring the correct
model file is passed to the corresponding parameter.
```diff
+  const model17b = ggufs.find((file) => file.toLowerCase().includes('qwen3-17b'))

-  const model06b = ggufs.find(
-    (file) =>
-      file.toLowerCase().includes('qwen') &&
-      (file.toLowerCase().includes('0.6b') || file.toLowerCase().includes('0.6-b'))
-  )
+  const model06b = ggufs.find((file) => file.toLowerCase().includes('qwen3-06b'))

-  const model4bPath = model4b ? path.join(modelDir, model4b) : null
+  const model17bPath = model17b ? path.join(modelDir, model17b) : null
   const model06bPath = model06b ? path.join(modelDir, model06b) : null

-  log.info(`[LlamaCpp] Found 4b model: ${model4bPath}`)
+  log.info(`[LlamaCpp] Found 17b model: ${model17bPath}`)
   log.info(`[LlamaCpp] Found 0.6b model: ${model06bPath}`)

-  return { model4b: model4bPath, model06b: model06bPath }
+  return { model4b: model17bPath, model06b: model06bPath }
```
🛠️ Refactor suggestion
Fix misleading return structure - 17b model returned as model4b.
The function finds a 17b model but returns it under the model4b key, creating significant confusion:
- Variable `model17bPath` contains the 17b model path
- The log message correctly identifies it as "17b model"
- But it's returned as `model4b` in the result object
This naming inconsistency will confuse future maintainers and could lead to bugs.
Update the return structure to match the actual models:
```diff
-  return { model4b: model17bPath, model06b: model06bPath }
+  return { model17b: model17bPath, model06b: model06bPath }
```
+ return { model17b: model17bPath, model06b: model06bPath }And update the caller accordingly:
```diff
-  const { model06b } = findModelFiles(modelDir)
+  const { model17b, model06b } = findModelFiles(modelDir)
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In app/src/main/llamaCppServer.ts around lines 239 to 249, the 17b model path is
mistakenly returned under the key `model4b`, which is misleading. Change the
return object key from `model4b` to `model17b` to accurately reflect the model
version. Also, ensure any code calling this function is updated to use the new
`model17b` key instead of `model4b`.