From e603d90d673e2273797d327dff4840301fbfaae1 Mon Sep 17 00:00:00 2001
From: Drew Minnear
Date: Tue, 3 Jun 2025 12:42:49 -0400
Subject: [PATCH] add blog post about deploying rag-llm-gitops pattern on Azure with Azure SQL Server

---
 content/blog/2025-06-03-rag-llm-azure.adoc | 148 +++++++++++++++++++++
 1 file changed, 148 insertions(+)
 create mode 100644 content/blog/2025-06-03-rag-llm-azure.adoc

diff --git a/content/blog/2025-06-03-rag-llm-azure.adoc b/content/blog/2025-06-03-rag-llm-azure.adoc
new file mode 100644
index 000000000..609a8dfcc
--- /dev/null
+++ b/content/blog/2025-06-03-rag-llm-azure.adoc
@@ -0,0 +1,148 @@
+---
+ date: 2025-06-03
+ title: Deploying the RAG-LLM GitOps Pattern on Azure
+ summary: How to deploy the RAG-LLM GitOps Validated Pattern on an ARO cluster
+ author: Drew Minnear
+ blog_tags:
+ - patterns
+ - how-to
+ - Azure
+ - Azure SQL Server
+ - ARO
+ - rag-llm-gitops
+---
+:toc:
+:imagesdir: /images
+
+[IMPORTANT]
+====
+Currently, the Azure SQL (MSSQL) support and related Azure deployment improvements are available only in the branch https://github.com/dminnear-rh/rag-llm-gitops/tree/use-mssql-db[`use-mssql-db`] of my fork.
+
+You must fork or base your deployment off this branch until the changes in https://github.com/validatedpatterns/rag-llm-gitops/pull/66[PR #66] are merged into the main pattern repository.
+====
+
+== Prerequisites
+
+Before you start, ensure the following:
+
+* You are logged into an existing ARO cluster.
+* Your Azure subscription has sufficient quota for GPU instances (default: `Standard_NC8as_T4_v3`, requiring at least 8 CPUs).
+* You've created a token on https://huggingface.co[HuggingFace] and accepted the terms of the model you'll deploy. By default, the pattern uses the https://huggingface.co/solidrust/Mistral-7B-Instruct-v0.3-AWQ[Mistral-7B-Instruct-v0.3-AWQ] model.
+
+TIP: Model and database defaults are defined in `overrides/values-Azure.yaml`. You can override them by editing this file.
+
+== Database Options
+
+The pattern defaults to using Azure SQL Server. Alternatively, you may deploy a local Redis, PostgreSQL, or Elasticsearch instance within your cluster.
+
+To select your database type, edit `overrides/values-Azure.yaml`:
+
+[source,yaml]
+----
+global:
+  db:
+    type: "AZURESQL" # Options: AZURESQL, REDIS, EDB, ELASTIC
+----
+
+WARNING: Choosing Redis, PostgreSQL (EDB), or Elasticsearch (ELASTIC) will deploy local database instances. Ensure your cluster has sufficient resources available.
+
+== Deploying Azure SQL Server (Optional)
+
+Follow these steps if you plan to use Azure SQL Server:
+
+. Navigate to the Azure portal and create a new SQL Database server.
+. Select `Use SQL authentication`.
+. Record your `Server name`, `Server admin login`, and `Password` (these will be needed later).
+. On the *Networking* tab, set `Allow Azure services and resources to access this server` to `Yes`.
+. Click *Review + create*, and then *Create*.
+
+Wait until the server status shows as active before proceeding.
+
+== Creating Required Secrets
+
+Before installation, create a secrets YAML file at `~/values-secret-rag-llm-gitops.yaml`. Populate it as follows:
+
+[source,yaml]
+----
+version: "2.0"
+
+secrets:
+  - name: hfmodel
+    fields:
+      - name: hftoken
+        value: hf_your_huggingface_token
+  - name: azuresql
+    fields:
+      - name: user
+        value: adminuser
+      - name: password
+        value: your_password
+      - name: server
+        value: yourservername.database.windows.net
+----
+
+Replace these placeholders with your actual credentials:
+
+* `hftoken`: Your HuggingFace token (you must accept the model's terms).
+* `user`: Azure SQL server admin username.
+* `password`: Azure SQL admin password.
+* `server`: Fully qualified Azure SQL server name.
+
+TIP: If you're not using Azure SQL Server, omit the entire `azuresql` section.
+
+== Creating GPU Nodes (MachineSet)
+
+Your cluster requires GPU nodes with a specific taint to host the vLLM inference service:
+
+[source,yaml]
+----
+- key: odh-notebook
+  value: "true"
+  effect: NoSchedule
+----
+
+=== Creating GPU Nodes Automatically
+
+If no GPU nodes exist, run this command to provision one default GPU node:
+
+[source,shell]
+----
+./pattern.sh make create-gpu-machineset-azure
+----
+
+This creates a single `Standard_NC8as_T4_v3` GPU node.
+
+=== Customizing GPU Node Creation
+
+To control GPU node specifics, provide additional parameters:
+
+[source,shell]
+----
+./pattern.sh make create-gpu-machineset-azure GPU_REPLICAS=3 OVERRIDE_ZONE=2 GPU_VM_SIZE=Standard_NC16as_T4_v3
+----
+
+Parameters available:
+
+* `GPU_REPLICAS`: Number of GPU nodes to provision.
+* `OVERRIDE_ZONE`: Availability zone (optional).
+* `GPU_VM_SIZE`: Azure VM SKU for GPU nodes.
+
+The script automatically applies the required taint. The Nvidia Operator installed by the pattern will handle CUDA driver installation on GPU nodes.
+
+== Installing the Pattern
+
+Ensure you've completed the following steps:
+
+. Logged into your ARO cluster.
+. Created your database (Azure SQL Server) if applicable.
+. Prepared the secrets YAML file (`~/values-secret-rag-llm-gitops.yaml`).
+. Provisioned GPU nodes with the required taint.
+
+Finally, install the pattern by running:
+
+[source,shell]
+----
+./pattern.sh make install
+----
+
+Your RAG-LLM GitOps Validated Pattern will now deploy to your Azure Red Hat OpenShift cluster.
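
The pre-install checklist in the post could be partly automated before running `./pattern.sh make install`. The following is only a rough sketch, not part of the pattern: the function names `secrets_ready` and `has_gpu_taint` and the `SECRETS_FILE` variable are hypothetical, and the placeholder strings it searches for are simply the sample values shown in the secrets example.

```shell
# Hypothetical pre-flight helpers; none of these names come from the pattern itself.

# Assumed default location, taken from the path used in the post.
SECRETS_FILE="${SECRETS_FILE:-${HOME}/values-secret-rag-llm-gitops.yaml}"

# Succeeds only if the secrets file exists and none of the sample
# placeholder values from the post are still present in it.
secrets_ready() {
  [ -f "$1" ] || { echo "missing secrets file: $1"; return 1; }
  if grep -Eq 'hf_your_huggingface_token|your_password|yourservername' "$1"; then
    echo "placeholder values still present in $1"
    return 1
  fi
  echo "secrets file looks populated"
}

# Succeeds if the text on stdin mentions the odh-notebook taint key.
# Intended to be fed the taints reported for the cluster's nodes.
has_gpu_taint() {
  grep -q 'odh-notebook'
}

# Example usage against a live cluster (left commented out on purpose):
#   oc whoami \
#     && secrets_ready "$SECRETS_FILE" \
#     && oc get nodes -o jsonpath='{.items[*].spec.taints}' | has_gpu_taint \
#     && ./pattern.sh make install
```

This covers only the login, secrets file, and taint items; whether the Azure SQL server is active still has to be confirmed in the Azure portal.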