Update embedding model defaults for Gemini API compatibility#46
Update embedding model defaults for Gemini API compatibility#46Asesino-K wants to merge 1 commit intodbpedia:mainfrom
Conversation
📝 WalkthroughWalkthroughUpdates default embedding model identifiers from "embedding-001" to "gemini-embedding-001" across embedding and retrieval modules, including REST batch endpoints, request payloads, and PredicateEmbeddingRetriever initialization signature, with explicit CLI assignment of the model parameter. Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
GSoC25/NEF/NEF.py (1)
449-450: Prefer constructor wiring over post-init mutation forembed_model.Right now the retriever is initialized with a hardcoded model and then overridden later. Passing
embed_modelthroughEnhancedNEFPipeline.__init__avoids split configuration and future ordering bugs.♻️ Proposed refactor
class EnhancedNEFPipeline: def __init__( self, client: "genai.Client", embeddings_path: Optional[str] = None, predicates_path: Optional[str] = None, + embed_model: str = "gemini-embedding-001", llm_model: str = "gemini-2.5-flash", predicate_threshold: float = 0.5, new_predicate_namespace: str = "http://nef.local/rel/", redis_host: Optional[str] = None, redis_port: Optional[int] = None, @@ self.pred = PredicateEmbeddingRetriever( client=self.client, embeddings_path=embeddings_path, predicates_path=predicates_path, - embed_model="gemini-embedding-001", + embed_model=embed_model, verbose=verbose, ) @@ pipe = EnhancedNEFPipeline( client=client, embeddings_path=args.embeddings, predicates_path=args.predicates, + embed_model=args.embed_model, llm_model=args.llm_model, predicate_threshold=args.predicate_threshold, new_predicate_namespace=args.new_predicate_namespace, redis_host=args.redis_host, redis_port=args.redis_port, redis_password=args.redis_password, verbose=args.verbose, ) - # ensure retriever uses requested embed model - pipe.pred.embed_model = args.embed_modelAlso applies to: 777-777
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@GSoC25/NEF/NEF.py` around lines 449 - 450, The code currently constructs the retriever with a hardcoded embed_model and mutates it afterward; update EnhancedNEFPipeline to accept embed_model as a constructor parameter and wire it into retriever creation (instead of setting embed_model post-init). Specifically, add an embed_model parameter to EnhancedNEFPipeline.__init__, propagate that value into the retriever initialization (the place where embed_model="gemini-embedding-001" is passed), and remove any subsequent post-construction assignment to retriever.embed_model to ensure configuration is done only in the constructor.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@GSoC25/NEF/NEF.py`:
- Around line 449-450: The code currently constructs the retriever with a
hardcoded embed_model and mutates it afterward; update EnhancedNEFPipeline to
accept embed_model as a constructor parameter and wire it into retriever
creation (instead of setting embed_model post-init). Specifically, add an
embed_model parameter to EnhancedNEFPipeline.__init__, propagate that value into
the retriever initialization (the place where embed_model="gemini-embedding-001"
is passed), and remove any subsequent post-construction assignment to
retriever.embed_model to ensure configuration is done only in the constructor.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 95fbd87b-18c3-414c-859e-a4a79437f696
📒 Files selected for processing (2)
GSoC25/NEF/Emeddings.pyGSoC25/NEF/NEF.py
This PR updates the default embedding model names used in the current GSoC25 NEF workflow to match the currently supported Gemini embedding API.
While testing the predicate artifact generation and retrieval workflow locally, I encountered model resolution errors with the older embedding model defaults. This patch updates the endpoint and default model settings to the currently working Gemini embedding model so that the local setup path is consistent with the active API behavior.
This is intended as a small compatibility/setup fix for the current pipeline.
Summary by CodeRabbit
Release Notes