Hi, awesome project!
Many RAG projects use offline embedding/LLM models for local usage. These projects usually rely on hosting services (Ollama, vLLM, Xinference, and more) that support the OpenAI schema.
Adding a configurable base URL to the OpenAI integration would let many more users adopt this project!
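To illustrate why only the base URL needs to change: every OpenAI-compatible server exposes the same `/chat/completions` route, so a request built for OpenAI works against a local host too. This is just a sketch using the standard library; the `http://localhost:11434/v1` endpoint and `llama3` model name are illustrative (Ollama's defaults), not part of this project.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str,
                       api_key: str = "not-needed") -> urllib.request.Request:
    """Build an OpenAI-schema chat completion request against a custom base URL.

    Any server speaking the OpenAI API (Ollama, vLLM, Xinference, a LiteLLM
    proxy, ...) serves the same route, so swapping base_url is enough.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Pointed at a hypothetical local Ollama endpoint instead of api.openai.com:
req = build_chat_request("http://localhost:11434/v1", "llama3", "Hello!")
```

Sending `req` with `urllib.request.urlopen` (or using the official `openai` client with its `base_url` parameter) would then hit the local server with no other code changes.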
Side note:
Model selection in LlamaIndex's OpenAI integration is limited to the actual OpenAI models, so an OpenAI-like option would be awesome; alternatively, users could run a proxy like LiteLLM on their end :)