Welcome to this Llama Stack demo, which you can run easily on Red Hat OpenShift AI (RHOAI). It is a comprehensive demo that deploys everything you need to test agents with MCP tools on RHOAI using Llama Stack:
- Models: IBM Granite 3.3 and Llama 3.1 deployed using vLLM Serving Runtime and automatically added to Llama Stack
- MCP Servers: deployed as normal deployments and automatically added to Llama Stack
- Llama Stack Server: deployed using the Llama Stack Operator included in Red Hat OpenShift AI
I started with an MCP server written in TypeScript, generated with Cursor.ai straight from the legal documents. It tried to cover every legal element, so it was complex and its logic was tricky. It took several iterations of fine-tuning the descriptions to make it more or less work with small language models (SLMs) such as IBM Granite and Llama 3.2 (~7B both of them). I then changed both my strategy and the programming language: from TypeScript to Rust, and from full logic generated from the legal documents to a decision table, also generated with Cursor.ai but with much more guidance (and simplification) from my side. This time it worked pretty well with Claude Desktop but not quite with the SLMs. The latest iteration had to do with the names of some variables: is_single_parent_family and number_of_children_after...
The inner to outer loop:
- decision table
- code
- unit testing
- MCP Inspector
- Claude Desktop
- Llama Stack Local with models in RHOAI
- Llama Stack on RHOAI
Welcome to the Eligibility Assessment System, powered by Llama Stack and the Model Context Protocol (MCP)! This system helps assess eligibility for Family Care Unpaid Leave Support based on the Republic of Lysmark's Act No. 2025/47-SA. To get started quickly, jump straight to installation.
The Eligibility Assessment MCP Llama Stack system is an intelligent solution for evaluating eligibility for Family Care Unpaid Leave Support based on the Republic of Lysmark's legislation (Act No. 2025/47-SA and related regulations). The system combines the power of Llama Stack with Model Context Protocol (MCP) servers and Retrieval Augmented Generation (RAG) to provide accurate, context-aware assessments.
This system is designed to help users understand their eligibility for unpaid leave assistance in various family care situations, including care for sick/injured family members, childcare for multiple children, adoption cases, and single-parent family scenarios.
This deployment includes a Helm chart for setting up:
- An OpenShift AI Project with all necessary components
- Llama Stack distribution with Llama 3.1 8B model for natural language processing
- MCP server for the eligibility engine with specialized knowledge processing
- Vector database (Milvus) with embedded legal documents for RAG capabilities
- Document loader service to populate the knowledge base with Act No. 2025/47-SA and regulations
- Llama Stack Playground interface for interactive eligibility consultations
Use this project to quickly deploy an intelligent eligibility assessment system that provides accurate, legally-informed guidance on Family Care Unpaid Leave Support eligibility. 🏛️
Experience the Eligibility Assessment System through the Llama Stack Playground interface. After deployment, you can interact with the system to:
- Ask questions about eligibility requirements for different family care scenarios
- Get detailed assessments based on the legal framework of Act No. 2025/47-SA
- Understand the documentation needed for applications
- Learn about financial assistance amounts and duration limits
- Llama Stack Distribution - The foundational platform for LLM applications
- Model Context Protocol (MCP) - Protocol for integrating context sources with language models
- Eligibility Engine MCP Server - Rust-based MCP server for eligibility processing
- Llama Stack Docker images: quay.io/opendatahub/llama-stack:odh
- Model images from Red Hat AI Services ModelCar Catalog
- Based on Republic of Lysmark's Act No. 2025/47-SA: Family Care Unpaid Leave Support Act
- GPU: 1x NVIDIA A10G or equivalent (for optimal LLM performance)
- CPU: 8 cores
- Memory: 24 Gi (to handle Llama Stack, MCP servers, and vector database)
- Storage: 20Gi (for models, vector database, and document storage)
Note: The system uses a quantized Llama 3.1 8B model (w4a16) for efficient GPU utilization while maintaining good performance.
- GPU: 1x NVIDIA A10G-SHARED (shared GPU allocation)
- CPU: 6 cores
- Memory: 16 Gi
- Storage: 10Gi
- Red Hat OpenShift 4.16+
- Red Hat OpenShift AI 2.16+
- Dependencies for Single-model server:
  - Red Hat OpenShift Service Mesh
  - Red Hat OpenShift Serverless
- Llama Stack Components:
  - Llama Stack Distribution with vLLM runtime
  - Model Context Protocol (MCP) server support
  - Vector database capabilities (Milvus)
- Container Images:
  - Llama Stack: quay.io/opendatahub/llama-stack:odh
  - vLLM Runtime: quay.io/modh/vllm:rhoai-2.23-cuda
  - Eligibility Engine MCP: quay.io/atarazana/eligibility-engine-mcp-rs:latest
- Standard user. No elevated cluster permissions required
Please note before you start
This system was tested on Red Hat OpenShift 4.16.24 & Red Hat OpenShift AI v2.16.2.
Ensure you have access to GPU resources and the required container registries.
git clone https://github.com/alpha-hack-program/eligibility-mcp-llamastack.git && \
cd eligibility-mcp-llamastack/
oc login ...
PROJECT="llama-stack-demo"
oc new-project ${PROJECT}
Label the project with:
- modelmesh-enabled: 'false'
- opendatahub.io/dashboard: 'true'
oc label namespace ${PROJECT} modelmesh-enabled=false opendatahub.io/dashboard=true
The default deployment deploys one model... TODO.
helm install llama-stack-demo helm/ --namespace ${PROJECT} \
--set namespace=${PROJECT} --timeout 10m
If you have access to Intel Gaudi accelerators, you can use this command instead, which applies helm/intel.yaml:
helm install llama-stack-demo helm/ --namespace ${PROJECT} \
--values helm/intel.yaml --set namespace=${PROJECT} --timeout 10m
If you want an NVIDIA deployment with two models, run this. TODO explain which models...
helm install llama-stack-demo helm/ --namespace ${PROJECT} \
--values helm/nvidia.yaml --set namespace=${PROJECT} --timeout 10m
oc -n ${PROJECT} get pods -w
Expected pods (may take 5-10 minutes to start):
(Output)
NAME READY STATUS RESTARTS AGE
eligibility-lsd-0 1/1 Running 0 8m
eligibility-lsd-playground-0 1/1 Running 0 8m
eligibility-engine-0 1/1 Running 0 7m
loader-0 0/1 Completed 0 6m
llama-3-1-8b-w4a16-predictor-df76b56d6-fw8fp 2/2 Running 0 10m
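Instead of watching the list by eye, the readiness check can be scripted. The following is only a sketch under stated assumptions: `pods_settled` is a hypothetical helper, not part of the repository, that reads `oc get pods` output on stdin and succeeds only when every listed pod is either Completed (like the loader) or Running with all of its containers ready.

```shell
# Hypothetical helper (not part of the repository): succeeds when every pod
# listed on stdin is either Completed or Running with all containers ready.
pods_settled() {
  awk '
    NR == 1 && $1 == "NAME" { next }   # skip the header row
    NF == 0 { next }                   # skip blank lines
    { seen = 1 }                       # at least one pod was listed
    $3 == "Completed" { next }         # finished jobs such as the loader
    {
      split($2, ready, "/")            # READY column, e.g. "2/2"
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END { exit ((seen && !bad) ? 0 : 1) }
  '
}

# Usage (assumes PROJECT is set as in the steps above):
#   until oc -n "${PROJECT}" get pods | pods_settled; do sleep 10; done
```

A plain `oc wait --for=condition=Ready pod --all` is avoided here on purpose: the loader pod ends in the Completed state and would never report Ready.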
You can access the system in multiple ways:
Get the OpenShift AI Dashboard URL:
oc get routes rhods-dashboard -n redhat-ods-applications
Navigate to Data Science Projects -> llama-stack-demo. You'll see the deployed models and workbenches.
Get the Llama Stack Playground URL:
oc get routes eligibility-lsd-playground -n ${PROJECT}
Access the playground directly to interact with the eligibility assessment system.
For programmatic access, get the Llama Stack API endpoint:
oc get routes eligibility-lsd -n ${PROJECT}
Use this endpoint to integrate the eligibility assessment capabilities into your applications.
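As a first smoke test of the endpoint, you could list the models registered in Llama Stack. This is a minimal sketch, not something shipped with the chart: `lls_url` and `lls_list_models` are hypothetical helpers, and they assume that the route is served over HTTPS and that the server exposes its REST API under `/v1` (for example `/v1/models`). Adjust both if your deployment differs.

```shell
# Hypothetical helpers (assumptions, not part of the repository).

# Build an API URL for a route host. Assumes HTTPS and a /v1 API prefix.
lls_url() {
  local host="$1" path="$2"
  printf 'https://%s/v1/%s\n' "$host" "${path#/}"   # drop a leading slash from path
}

# Look up the route host with oc, then list the registered models with curl.
# Assumes PROJECT is set as in the steps above.
lls_list_models() {
  local host
  host="$(oc get route eligibility-lsd -n "${PROJECT}" -o jsonpath='{.spec.host}')" || return 1
  curl -sS "$(lls_url "$host" models)"
}

# Usage:
#   lls_list_models
```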
| Family relationship | Situation | Single-parent family | Number of children | Potentially eligible | Monthly benefit | Case | Description | Output | DESCRIPTION | Rule ID |
|---|---|---|---|---|---|---|---|---|---|---|
| true | delivery, birth | true | | true | 500 | E | Single-parent family with newborn | The single-parent status must be documented | Case E: Single-parent family with any child | regla-005 |
| true | delivery, birth | | >=3 | true | 500 | B | Third child or more with newborn | The number of children must be 3 or more, the ages of at least 2 of the minors must be less than 6, if there is disability greater than 33% then the limit is 9 years | Case B: Third child or more with newborn | regla-002 |
| true | delivery, birth | | | false | 0 | B | The number of children must be 3 or more, must consult with administration | The number of children must be 3 or more, must consult with administration | | 9ec43eb2-484f-4fcf-9dd7-6510da30850c |
| true | illness, accident | | | true | 725 | A | First-degree family care sick or accident victim | The person must have been hospitalized and the care of the person must be continued | Case A: First-degree family care sick/injured | regla-001 |
| true | adoption, foster_care | | | true | 500 | C | Adoption or foster care | In the foster care case the duration must be longer than one year | Case C: Adoption or foster care | regla-003 |
| true | multiple_birth, multiple_delivery, multiple_adoption, multiple_foster_care | | | true | 500 | D | Delivery, adoption or foster care multiple | | Case D: Delivery, adoption or foster care multiple | regla-004 |
| true | | | | false | 0 | NONE | No case applies | No case applies | | 515afd1f-43cc-44ed-971c-fefb273840b2 |
| false | | | | false | 0 | NONE | Not applicable by relationship (first degree) | Only father, mother, son, daughter, spouse or partner are accepted | | 058dd988-90dd-46da-8478-ee458aacde6f |
| | | | | false | 0 | NONE | UNKNOWN_ERROR | | | f32bfb0f-801d-4d6c-b5bd-13a1edd0eaca |
This table contains the evaluation criteria and outcomes for unpaid leave assistance eligibility. The data shows different cases (A through E) with varying monthly benefits:
- Case A: First-degree family care sick/injured - 725€
- Case B: Third child or more with newborn - 500€
- Case C: Adoption or foster care - 500€
- Case D: Multiple delivery/adoption/foster care - 500€
- Case E: Single-parent family with any child - 500€
- NONE: Cases where no assistance applies - 0€
The table includes input parameters (family relationship, situation, single-parent status, number of children) and corresponding outputs (eligibility, benefit amount, case classification, descriptions, and rule IDs).
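To make the table's logic concrete, here is a minimal sketch of how its rows could be evaluated. It is not the Rust engine's actual code: `assess` is a hypothetical function, and it assumes a "first matching row wins" policy with rows checked in the order listed above. It takes the four inputs and prints the eligibility flag, the monthly benefit and the case.

```shell
# Hypothetical sketch of the decision table (not the engine's actual code).
# Assumes the first matching row wins, in the order the rows are listed.
# Arguments: first_degree (true|false), situation, single_parent (true|false),
#            children (integer, defaults to 0)
# Prints:    "<potentially_eligible> <monthly_benefit> <case>"
assess() {
  local first_degree="$1" situation="$2" single_parent="$3" children="${4:-0}"

  # Relationship is not first degree, or unknown input: no case applies.
  if [ "$first_degree" != true ]; then
    echo "false 0 NONE"
    return
  fi

  case "$situation" in
    delivery|birth)
      if [ "$single_parent" = true ]; then
        echo "true 500 E"      # single-parent family with newborn
      elif [ "$children" -ge 3 ]; then
        echo "true 500 B"      # third child or more with newborn
      else
        echo "false 0 B"       # fewer than 3 children
      fi ;;
    illness|accident)
      echo "true 725 A" ;;     # first-degree family care
    adoption|foster_care)
      echo "true 500 C" ;;
    multiple_birth|multiple_delivery|multiple_adoption|multiple_foster_care)
      echo "true 500 D" ;;
    *)
      echo "false 0 NONE" ;;   # no case applies
  esac
}

# Usage:
#   assess true accident false 0
```

The additional requirements in the table (hospitalization, documentation, duration of foster care) are deliberately left out of this sketch. It only covers the input-to-case mapping.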
- My mother had an accident and she's at the hospital. I have to take care of her, can I get access to the unpaid leave aid?
- My mother had an accident and she's at the hospital. I have to take care of her, tell me if I can get access to the unpaid leave aid and the requirements I have to meet.
- I have just adopted two children, at the same time, aged 3 and 5, am I eligible for the unpaid leave aid? How much?
- I have just adopted two children, at the same time, aged 3 and 5, tell me if I'm eligible for the unpaid leave aid and which requirements I should meet.
- I'm a single mom and I just had a baby, may I get access to the unpaid leave aid?
- Enumerate the legal requirements to get the aid for unpaid leave.
You are a helpful AI assistant that uses tools to help citizens of the Republic of Lysmark. Answers should be concise and human readable. AVOID references to tools or function calling nor show any JSON. Infer parameters for function calls or instead use default values or request the needed information from the user. Call the RAG tool first if unsure. Parameter single_parent_family only is necessary if birth/adoption/foster_care otherwise use false.
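The last rule in the prompt can be read as a small defaulting function. The sketch below is illustrative only: `single_parent_param` is a hypothetical name, and the list of situations is taken literally from the prompt text (birth, adoption, foster_care). It returns the user's answer for those situations and `false` for everything else.

```shell
# Hypothetical helper mirroring the prompt's rule (not part of the repository):
# single_parent_family only matters for birth, adoption and foster care.
# For any other situation, or when no answer is given, it defaults to false.
single_parent_param() {
  local situation="$1" answer="${2:-false}"
  case "$situation" in
    birth|adoption|foster_care) printf '%s\n' "$answer" ;;
    *) printf 'false\n' ;;
  esac
}

# Usage:
#   single_parent_param birth true
#   single_parent_param illness true
```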
Uninstall the Helm chart.
helm uninstall llama-stack-demo --namespace ${PROJECT}
Delete all remaining objects, such as jobs created in hooks.
oc delete jobs -l "app.kubernetes.io/part-of=eligibility-mcp-llamastack"
Finally, remove the project:
oc delete project ${PROJECT}