Reviewer's Guide

Updates the Ramalama remote inference provider to align with llama-stack 0.5.1 by refactoring the adapter to use the shared OpenAIMixin, migrating configuration and provider spec to the new llama-stack-api types, replacing the run configuration with the starter distro template, and updating Python dependencies while removing the obsolete ramalama package and providers.d wiring.
Hey - I've found 5 issues, and left some high level feedback:
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In setup.py, the error message in the run step still references `providers_dir`, which is no longer defined after removing the providers copying logic and will raise a NameError; update the exception handler to log the actual variables used (e.g., `run_yaml` and `target_dir`).
- RamalamaImplConfig defaults `base_url` to `http://localhost:8080/v1` while ramalama-run.yaml wires `RAMALAMA_URL` directly into `base_url` without appending `/v1`; consider aligning these so users don’t accidentally pass a non-/v1 URL and break the OpenAI-compatible endpoint expectations.
- In RamalamaInferenceAdapter.get_api_key, returning the literal string "NO KEY REQUIRED" for unauthenticated setups may not match what OpenAIMixin expects for a missing key; it’s safer to return None or an empty string and let the mixin handle the absence of credentials.
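A minimal sketch of the suggested `get_api_key` change (the constructor and attribute name here are assumptions; only the method name comes from the diff):

```python
from typing import Optional

class RamalamaInferenceAdapter:
    """Illustrative stub; the real adapter mixes in llama-stack's OpenAIMixin."""

    def __init__(self, api_key: Optional[str] = None):
        self._api_key = api_key

    def get_api_key(self) -> Optional[str]:
        # Return None (not a placeholder string like "NO KEY REQUIRED") when
        # no key is configured, so the mixin's own missing-credential
        # handling applies.
        return self._api_key or None

adapter = RamalamaInferenceAdapter()
print(adapter.get_api_key())  # → None
```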
## Individual Comments
### Comment 1
<location path="setup.py" line_range="20-21" />
<code_context>
+ os.makedirs(target_dir, exist_ok=True)
+ shutil.copy(run_yaml, target_dir)
+ print(f"Copied {run_yaml} to {target_dir}")
except Exception as error:
- print(f"Failed to copy {providers_dir} to {target_dir_1}. Error: {error}")
+ print(f"Failed to copy {providers_dir} to {target_dir}. Error: {error}")
raise
</code_context>
<issue_to_address>
**issue (bug_risk):** Exception message references `providers_dir`, which is no longer defined and will raise a `NameError` in the error path.
Since `providers_dir` is no longer defined, this error path will raise a `NameError` and hide the real copy failure. Update the message to use an existing variable that reflects what was being copied (e.g., `run_yaml`), or remove the reference entirely.
</issue_to_address>
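A hedged sketch of the corrected handler, wrapped in a function for illustration (the variable names `run_yaml` and `target_dir` come from the diff context above; the function itself is hypothetical):

```python
import os
import shutil

def install_run_yaml(run_yaml: str, target_dir: str) -> None:
    try:
        os.makedirs(target_dir, exist_ok=True)
        shutil.copy(run_yaml, target_dir)
        print(f"Copied {run_yaml} to {target_dir}")
    except Exception as error:
        # Reference only variables that are still in scope, so the error
        # path reports the real copy failure instead of raising a NameError.
        print(f"Failed to copy {run_yaml} to {target_dir}. Error: {error}")
        raise
```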
### Comment 2
<location path="src/ramalama_stack/ramalama-run.yaml" line_range="22-31" />
<code_context>
+ - provider_id: ${env.RAMALAMA_URL:+ramalama}
</code_context>
<issue_to_address>
**issue (bug_risk):** The `ramalama` inference provider is conditional on `RAMALAMA_URL`, but the registered model always references `ramalama`.
This creates a config where the model can point to a provider that may not exist. Either make the provider unconditional (e.g., use `:=` with a default URL) or apply the same condition to the model registration so they remain consistent.
</issue_to_address>
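One way to keep the provider and the model registration consistent, sketched with an assumed default URL and model id (the `:=` default syntax follows the conventions already used in this run.yaml; the exact keys are assumptions):

```yaml
providers:
  inference:
    - provider_id: ramalama        # unconditional, so the model below always resolves
      provider_type: remote::ramalama
      config:
        base_url: ${env.RAMALAMA_URL:=http://localhost:8080/v1}
models:
  - model_id: ${env.INFERENCE_MODEL:=llama3.2:3b}   # model id is an assumption
    provider_id: ramalama
    model_type: llm
```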
### Comment 3
<location path="src/ramalama_stack/ramalama-run.yaml" line_range="157-163" />
<code_context>
+ provider_id: ramalama
+ model_type: llm
+ shields:
+ - shield_id: llama-guard
+ provider_id: ${env.SAFETY_MODEL:+llama-guard}
+ provider_shield_id: ${env.SAFETY_MODEL:=}
+ - shield_id: code-scanner
+ provider_id: ${env.CODE_SCANNER_MODEL:+code-scanner}
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Safety shield `provider_id`/`provider_shield_id` environment interpolation can produce an invalid or partially configured shield.
With `provider_id: ${env.SAFETY_MODEL:+llama-guard}`, if `SAFETY_MODEL` is unset, `provider_id` resolves to an empty string, and `provider_shield_id: ${env.SAFETY_MODEL:=}` also becomes empty. This creates a shield with a valid `shield_id` but no usable provider reference. If you want to fall back to the built‑in `llama-guard` when the env var is unset, either set `provider_id: llama-guard` unconditionally and use `SAFETY_MODEL` only for `provider_shield_id`, or use `:=llama-guard` so both fields default to a valid value.
```suggestion
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: ${env.SAFETY_MODEL:=llama-guard}
- shield_id: code-scanner
provider_id: code-scanner
provider_shield_id: ${env.CODE_SCANNER_MODEL:=code-scanner}
```
</issue_to_address>
### Comment 4
<location path="src/ramalama_stack/ramalama-run.yaml" line_range="177-179" />
<code_context>
server:
port: 8321
-external_providers_dir: ${env.EXTERNAL_PROVIDERS_DIR:=~/.llama/providers.d}
+vector_stores:
+ default_provider_id: faiss
+ default_embedding_model:
+ provider_id: sentence-transformers
+ model_id: nomic-ai/nomic-embed-text-v1.5
</code_context>
<issue_to_address>
**issue (bug_risk):** `vector_stores.default_provider_id` is set to `faiss`, but no matching `faiss` vector_io provider is configured.
Because only a conditional `milvus` `vector_io` provider is configured, this `faiss` default will not resolve unless a matching `faiss` provider is defined elsewhere. Please either point `default_provider_id` to an existing provider (e.g. `milvus`) or add a corresponding `faiss` `vector_io` provider configuration.
</issue_to_address>
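If `faiss` is meant to stay the default, a matching `vector_io` provider entry would look roughly like this (the config keys are assumptions modeled on typical llama-stack starter configs, not taken from this PR):

```yaml
providers:
  vector_io:
    - provider_id: faiss
      provider_type: inline::faiss
      config:
        kvstore:
          type: sqlite
          db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/starter}/faiss_store.db
```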
### Comment 5
<location path="src/ramalama_stack/ramalama-run.yaml" line_range="1-10" />
<code_context>
+version: 2
</code_context>
<issue_to_address>
**question:** The distro is named `starter` in paths, but setup still installs under the `ramalama` distribution directory, which may be confusing or inconsistent.
This file uses `distro_name: starter` and paths under `~/.llama/distributions/starter`, while `setup.py` installs `ramalama-run.yaml` under `~/.llama/distributions/ramalama`. Please align the distro name and directory paths (all `starter` or all `ramalama`) to avoid confusion about where the distro’s state lives.
</issue_to_address>
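One way to keep the two in sync is a single constant shared by `setup.py` and any path construction (the constant and helper here are hypothetical; pick `"ramalama"` or `"starter"` once and use it everywhere):

```python
import os

# Hypothetical single source of truth for the distro name.
DISTRO_NAME = "ramalama"

def distro_dir(home: str = "~") -> str:
    """Directory where the distro's run.yaml and state live."""
    return os.path.expanduser(
        os.path.join(home, ".llama", "distributions", DISTRO_NAME)
    )

print(distro_dir())
```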
Code Review
This pull request upgrades `llama-stack` to version 0.5.1, which is a significant update. The changes correctly adapt the codebase to the new APIs and dependency requirements of `llama-stack`. The refactoring of `RamalamaInferenceAdapter` to use `OpenAIMixin` is a great simplification. I found one issue in the `setup.py` script that needs to be addressed.
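The mixin refactor follows the usual Python pattern: the adapter supplies only provider-specific hooks and inherits the OpenAI-compatible plumbing. A schematic illustration — the class and method names below are illustrative, not llama-stack's actual API:

```python
class OpenAICompatMixinSketch:
    """Stand-in for the shared mixin: generic OpenAI-compatible plumbing."""

    def chat_url(self) -> str:
        # Relies on the concrete adapter providing get_base_url().
        return f"{self.get_base_url()}/chat/completions"

class RamalamaAdapterSketch(OpenAICompatMixinSketch):
    """The adapter only supplies provider-specific configuration."""

    def __init__(self, base_url: str):
        self._base_url = base_url

    def get_base_url(self) -> str:
        return self._base_url

adapter = RamalamaAdapterSketch("http://localhost:8080/v1")
print(adapter.chat_url())  # → http://localhost:8080/v1/chat/completions
```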
Signed-off-by: Christian Meier <meier.kristian@gmail.com>
I will have a look at the tests now...
`test-build.sh` needs to be modified to no longer check for
Same for the other two tests.
Signed-off-by: Christian Meier <meier.kristian@gmail.com>
Signed-off-by: Christian Meier <meier.kristian@gmail.com>
@cdoern could you re-run the tests, or do I need to do something to trigger them?
The adapter is basically a copy and paste of https://github.com/llamastack/llama-stack/tree/main/src/llama_stack/providers/remote/inference/llama_cpp_server, as that provider is not released.
Replaced the run.yaml with the current starter run.yaml from llama-stack, adjusted to more or less follow what was there before.
Removed the obsolete ramalama dependency, eliminating the circular dependency on ramalama.
It renders #148 obsolete.
It fixes containers/ramalama#2480
Summary by Sourcery
Upgrade the Ramalama remote inference provider to align with llama-stack 0.5.1 and its new remote provider infrastructure.