🧠⚡ Kāraka NexusGraph

Agentic Intelligence, Deployed at Cloud Speed | NVIDIA NIM hosted on EKS

For Judges: 📖 Deployment Guide - Complete setup instructions for testing this project


🏎️ Overview

Kāraka NexusGraph is a full-stack Agentic AI application built for the 2025 NVIDIA × AWS Generative AI Hackathon. It showcases how Llama-3.1 Nemotron-Nano 8B v1 (deployed as an NVIDIA NIM inference microservice) can power an agentic reasoning system hosted efficiently on AWS EKS, paired with a Retrieval Embedding NIM for contextual memory.

🪶 Kāraka NexusGraph combines scalable cloud deployment with modular agentic reasoning — making intelligence fast, composable, and cloud-native.


The Problem with Traditional RAG

Today's AI retrieves by keyword. We retrieve by meaning.

Traditional Retrieval-Augmented Generation (RAG) is noisy because it pulls text, not facts. When you ask "Who replicated findings funded by the NIH?", keyword-based systems return entire paragraphs containing "NIH" and "replicated" — forcing the LLM to re-parse unstructured text, leading to hallucinations and imprecise answers.

Our Solution: Kāraka Knowledge Graphs

We fix this by using LLMs to map unstructured text into a deterministic Kāraka knowledge graph. The graph records who did what, to whom, with what, and where, so an agent can retrieve structured, source-grounded facts and compose precise answers instead of re-parsing raw paragraphs.

What is Kāraka Theory?

Kāraka (कारक) is a 2,500-year-old grammatical framework from Pāṇini's Sanskrit grammar that defines semantic roles in sentences:

  • Agent (Kartā): Who does the action
  • Object (Karma): What receives the action
  • Instrument (Karaṇa): Tool/means used
  • Recipient (Sampradāna): Beneficiary/destination
  • Source (Apādāna): Origin/separation point
  • Locus (Adhikaraṇa): Where/when/what-about the action occurs

Unlike dependency parsing (which captures syntax), Kāraka captures semantic intent — the deep structure of meaning that remains constant across paraphrases.
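
To make the roles concrete, here is a minimal illustration (in Python, with field names of our own choosing, not the project's actual schema) of how one sentence decomposes into a Kāraka frame:

# Sentence: "Maria analyzed the samples with a microscope in the lab."
# NOTE: illustrative structure only; the real pipeline's schema may differ.
karaka_frame = {
    "kriya": "analyzed",            # the action (verb)
    "karta": "Maria",               # Agent: who does the action
    "karma": "the samples",         # Object: what receives the action
    "karana": "a microscope",       # Instrument: tool/means used
    "adhikarana_space": "the lab",  # Locus (Space): where the action occurs
}

The passive paraphrase "The samples were analyzed by Maria with a microscope in the lab" yields the same frame, which is exactly the paraphrase-invariance the pipeline relies on.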

Architecture Overview

graph TB
    subgraph "Document Ingestion"
        A[Raw Document] -->|Upload| B[L1: Upload Handler]
        B -->|Store| C[S3: Raw Bucket]
        C -->|Trigger| D[L2: Validate Doc]
        D -->|LLM Verification| E[L3: Sanitize Doc]
        E -->|Split Sentences| F[S3: Verified Bucket]
    end
    
    subgraph "Knowledge Graph Construction (Step Functions)"
        F -->|Trigger| G[SFN: Per-Document Workflow]
        G -->|Get Sentences| H[L8: Get Sentences]
        
        H -->|For Each Sentence| I[Parallel Extraction]
        I -->|Extract| J[L9: Extract Entities]
        I -->|Extract| K[L10: Extract Kriyā]
        I -->|Embed| L[L8: Embedding Call]
        
        J & K -->|Build| M[L11: Build Events]
        M -->|Audit| N[L12: Audit Events]
        N -->|Extract| O[L13: Extract Relations]
        
        O -->|Store| P[L15: Graph Node Ops]
        O -->|Store| Q[L16: Graph Edge Ops]
        L -->|Store| R[S3: Embeddings]
    end
    
    subgraph "Query Processing"
        S[User Query] -->|Submit| T[L21: Query Submit]
        T -->|Process| U[L23: Query Processor]
        U -->|Embed Query| V[L8: Embedding Call]
        V -->|Retrieve| W[L17: Retrieve from Embedding]
        W -->|Graph Traversal| X[NetworkX Graph]
        X -->|Synthesize| Y[L18: Synthesize Answer]
        Y -->|LLM Call| Z[L7: LLM Call]
        Z -->|Return| AA[Structured Answer + Citations]
    end
    
    subgraph "Storage Layer"
        AB[(DynamoDB: Jobs)]
        AC[(DynamoDB: Sentences)]
        AD[(DynamoDB: LLM Logs)]
        AE[S3: Knowledge Graph]
    end
    
    subgraph "Model Infrastructure (EKS)"
        AF[NVIDIA NIM: Generator]
        AG[NVIDIA NIM: Embedder]
    end
    
    D -.->|Log| AD
    E -.->|Update| AB
    E -.->|Store| AC
    P & Q -.->|Store| AE
    Z -.->|Call| AF
    V & L -.->|Call| AG

🧩 Core Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| Inference | NVIDIA NIM: Llama-3.1-Nemotron-Nano-8B-v1 | Large-language reasoning engine |
| Embedding / Retrieval | nvidia/llama-3.2-nv-embedqa-1b-v2 | Vector memory for contextual recall |
| Compute Platform | AWS EKS (AWS CDK provisioned) | Containerized microservice orchestration |
| Orchestration | AWS Step Functions | Agentic workflow coordination |
| Serverless Compute | AWS Lambda (Python 3.12) | Event-driven processing agents |
| Storage | AWS S3 + DynamoDB | Document storage + metadata |
| Monitoring / Logging | CloudWatch | Usage tracking + observability |
| API Layer | AWS API Gateway | RESTful endpoints for frontend |
| Graph Operations | NetworkX | Knowledge graph structure |
| Frontend (UI) | React | Interactive visualization & query interface |
| IaC | AWS CDK (Python) | Automated, reproducible infra setup |

🚀 Features

  • ⚙️ Agentic AI Core – Modular reasoning agents coordinated via AWS Step Functions
  • ☁️ NVIDIA NIM Inference – Runs Llama 3.1 Nano 8B as a microservice on EKS
  • 🧭 Retrieval Embedding Memory – Uses nv-embedqa NIM for contextual grounding
  • 📊 AWS Native Infra – Scalable, monitored, cost-controlled Kubernetes cluster
  • 🔄 Iterative Validation – LLM-driven quality assurance with retry logic
  • 🎯 Semantic Role Labeling – Kāraka theory for precise fact extraction
  • 💬 Structured Evidence Retrieval – Graph traversal for hallucination-free answers
  • 🛠️ Infrastructure as Code – AWS CDK for EKS, Lambda, DynamoDB, S3
  • 📈 Observability Tools – Processing chain visualization and LLM call logs

🛠️ Tools & Technologies Deep Dive

NVIDIA NIM (NVIDIA Inference Microservices)

  • Generator Model: llama-3.1-nemotron-nano-8b-v1
    • Deployed as containerized microservice on EKS
    • Handles all LLM reasoning tasks (entity extraction, validation, synthesis)
    • OpenAI-compatible API endpoints
  • Embedder Model: llama-3.2-nv-embedqa-1b-v2
    • Generates 2048-dimensional embeddings
    • Powers semantic search and retrieval
    • Optimized for query-document matching
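
Because both NIMs expose OpenAI-compatible HTTP endpoints, a Lambda can reach them with plain requests (already in the Lambda layer). The sketch below is hedged: the endpoint URLs, environment variable names, and the input_type field are our assumptions, not the project's actual code.

import os
import requests

GENERATOR_URL = os.environ.get("NIM_GENERATOR_URL", "http://nim-generator/v1")
EMBEDDER_URL = os.environ.get("NIM_EMBEDDER_URL", "http://nim-embedder/v1")

def generate(prompt: str) -> str:
    # OpenAI-compatible chat completion against the generator NIM
    resp = requests.post(
        f"{GENERATOR_URL}/chat/completions",
        json={
            "model": "nvidia/llama-3.1-nemotron-nano-8b-v1",
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.0,  # deterministic extraction
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def embed(text: str) -> list[float]:
    resp = requests.post(
        f"{EMBEDDER_URL}/embeddings",
        json={
            "model": "nvidia/llama-3.2-nv-embedqa-1b-v2",
            "input": [text],
            "input_type": "passage",  # embedqa models distinguish query vs. passage
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["data"][0]["embedding"]  # a 2048-dim vector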

AWS Services Architecture

  • EKS (Elastic Kubernetes Service)
    • 2x g5.xlarge GPU nodes (NVIDIA A10G)
    • Auto-scaling node groups
    • Managed Kubernetes control plane
  • Lambda Functions (18 total)
    • Python 3.12 runtime
    • Reserved concurrency for LLM calls (3) and RAG queries (2 each)
    • Custom layer with requests, networkx, numpy
  • Step Functions
    • Per-document workflow orchestration
    • Map state for parallel sentence processing
    • Max concurrency = 1 to prevent LLM overload (see the CDK sketch after this list)
  • DynamoDB
    • Jobs table: Document processing status
    • Sentences table: Sentence metadata with GSI by job_id
    • LLM Logs table: Complete audit trail with GSIs
  • S3 Buckets
    • Raw documents bucket
    • Verified documents bucket
    • Knowledge graph bucket (nodes.json, edges.json, embeddings)
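
Both throttles referenced above (reserved Lambda concurrency and the Map state's concurrency cap) are one-liners in CDK. A hedged sketch, written as if inside a Stack class; the construct ids, handler path, and asset path are illustrative, not the repo's actual code:

from aws_cdk import Duration
from aws_cdk import aws_lambda as _lambda
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks

# Inside the serverless stack's __init__:
llm_call = _lambda.Function(
    self, "LlmCall",
    runtime=_lambda.Runtime.PYTHON_3_12,
    handler="llm_call.handler",
    code=_lambda.Code.from_asset("src/lambda_src"),
    timeout=Duration.minutes(5),
    reserved_concurrent_executions=3,  # at most 3 concurrent LLM calls
)

# Fan out over sentences, but process one at a time so the single
# NIM generator never sees a burst of parallel requests.
per_sentence = sfn.Map(
    self, "PerSentence",
    items_path="$.sentences",
    max_concurrency=1,
)
per_sentence.item_processor(
    tasks.LambdaInvoke(self, "ExtractEntities", lambda_function=llm_call)
)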

Agentic Pipeline Components

  1. Document Lifecycle Agents (L1-L4)
    • Upload handler, validator, sanitizer, status tracker
  2. Knowledge Graph Extraction Agents (L9-L14)
    • Entity extractor, Kriyā extractor, event builder, auditor, relation extractor, attribute extractor
  3. Graph Operations (L15-L16)
    • Node operations (NetworkX graph construction)
    • Edge operations (Kāraka links + relations)
  4. RAG Agents (L17-L18, L21-L23)
    • Embedding retrieval, answer synthesis, query processor
  5. Observability Tools (L19-L20)
    • Processing chain viewer, sentence chain viewer

The Kāraka RAG Pipeline

Phase 1: Document Ingestion & Validation

  1. Upload Handler: Generates pre-signed S3 URL for document upload
  2. Validate Doc: LLM verifies document quality and coherence
  3. Sanitize Doc: Splits document into atomic sentences using LLM
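
Step 1 boils down to a single pre-signed URL call in boto3. A hedged sketch of the upload handler (the bucket env var and key scheme are our assumptions):

import json
import os
import uuid
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    doc_id = f"doc_{uuid.uuid4().hex[:8]}"
    # Pre-signed PUT URL: the browser uploads straight to S3,
    # so the document body never passes through Lambda.
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": os.environ["RAW_BUCKET"], "Key": f"{doc_id}.txt"},
        ExpiresIn=900,  # valid for 15 minutes
    )
    return {"statusCode": 200, "body": json.dumps({"doc_id": doc_id, "upload_url": url})}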

Phase 2: Knowledge Graph Construction (Per Sentence)

For each sentence, we run a deterministic extraction pipeline:

Step 1: Parallel Extraction

  • Extract Entities (L9): Identify all nouns/entities
  • Extract Kriyā (L10): Identify verbs and their voice (active/passive)
  • Generate Embedding (L8): Create vector representation

Step 2: Event Instance Creation

  • Build Events (L11): Create event instances with Kāraka links
    • Maps entities to semantic roles (Agent, Object, Instrument, etc.)
    • Handles passive voice transformations
    • Distinguishes Locus types (Space, Time, Topic)
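
For illustration, the passive sentence "The findings were replicated by Maria Santos" might produce an event instance like the following (field names are ours, not necessarily the stored schema):

event_instance = {
    "event_id": "ev_001",
    "kriya": "replicated",
    "voice": "passive",            # detected by the Kriyā extractor (L10)
    "karaka_links": {
        "karta": "Maria Santos",   # "by Maria Santos" stays the Agent despite passive voice
        "karma": "the findings",   # the grammatical subject becomes the Object
    },
    "source": "doc1:s3",           # provenance back to the source sentence
}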

Step 3: Quality Assurance

  • Audit Events (L12): LLM validates Kāraka assignments
    • Checks for Locus misclassification (most common error)
    • Verifies passive voice handling
    • Ensures no invented entities
    • Iterative retry until score = 100
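
The retry loop itself is simple. A hedged Python sketch of the mechanism: the audit and apply_fixes callables stand in for the real LLM round-trips, and the retry cap is our assumption, not a documented limit.

from typing import Callable

MAX_ATTEMPTS = 3  # assumed retry budget

def audit_until_clean(
    event_instance: dict,
    audit: Callable[[dict], dict],        # LLM auditor: returns {"score": int, ...}
    apply_fixes: Callable[[dict, dict], dict],
) -> dict:
    for _ in range(MAX_ATTEMPTS):
        report = audit(event_instance)
        if report["score"] == 100:        # only a perfect audit passes
            return event_instance
        # Fold the auditor's corrections (e.g. a reclassified Locus) back in,
        # then audit again.
        event_instance = apply_fixes(event_instance, report)
    raise RuntimeError("audit did not converge within retry budget")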

Step 4: Relationship Extraction

  • Extract Relations (L13): Finds non-Kāraka relationships
    • Sambandha (Relations): "with", "of", "between" connections
    • Sāmānādhikaraṇya (Characteristics): Appositive phrases
    • Compound Events: Sequential actions by same agent
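
A hedged illustration of the first two edge kinds (field names are ours):

relations = [
    # Sambandha: "the study of neuroplasticity" → an "of" connection
    {"type": "sambandha", "from": "the study", "to": "neuroplasticity", "marker": "of"},
    # Sāmānādhikaraṇya: "Dr. Chen, a neuroscientist, ..." → apposition
    {"type": "samanadhikaranya", "from": "Dr. Chen", "to": "a neuroscientist"},
]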

Step 5: Graph Storage

  • Graph Node Ops (L15): Stores entities and events as nodes
  • Graph Edge Ops (L16): Stores Kāraka links and relations as edges
  • Uses NetworkX for graph operations
  • Stored in S3 as JSON (nodes.json, edges.json)
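
A hedged sketch of what the node/edge ops might look like with NetworkX, including the JSON serialization for S3; attribute names and key layout are assumptions consistent with the description above:

import json
import networkx as nx
from networkx.readwrite import json_graph

G = nx.MultiDiGraph()

# Entities and events become nodes...
G.add_node("Maria Santos", kind="entity")
G.add_node("ev_001", kind="event", kriya="replicated", source="doc1:s3")

# ...and Kāraka links become labeled edges from event to participant.
G.add_edge("ev_001", "Maria Santos", karaka="karta")
G.add_edge("ev_001", "the findings", karaka="karma")  # adds the node implicitly

# Serialize for S3: node-link JSON splits cleanly into nodes and edges.
data = json_graph.node_link_data(G)
nodes_json = json.dumps(data["nodes"])  # → nodes.json in the KG bucket
edges_json = json.dumps(data["links"])  # → edges.json in the KG bucket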

Phase 3: Query Processing (RAG)

When a user asks a question:

  1. Query Submit (L21): Accepts query, returns query_id
  2. Query Processor (L23): Orchestrates retrieval
    • Embeds query using NVIDIA NIM Embedder
    • Retrieves top-k semantically similar sentences
    • Traverses knowledge graph to find connected facts
  3. Retrieve from Embedding (L17): Cosine similarity search (sketched after this list)
  4. Synthesize Answer (L18): LLM generates answer using:
    • Structured Evidence: Facts from knowledge graph with semantic roles
    • Retrieved Sentences: Original text for context
    • Citations: Every fact cited with source (doc_id:sentence_id)
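
The cosine-similarity step in item 3 is a few lines of NumPy. A hedged sketch; the embedding matrix layout and sentence ids are assumptions about what L8 stores in S3:

import numpy as np

def top_k(query_vec: np.ndarray, matrix: np.ndarray, ids: list[str], k: int = 5):
    """matrix: one 2048-dim row per stored sentence; ids: matching sentence ids."""
    q = query_vec / np.linalg.norm(query_vec)
    m = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    scores = m @ q                          # cosine similarity per sentence
    best = np.argsort(scores)[::-1][:k]     # indices of the k highest scores
    return [(ids[i], float(scores[i])) for i in best]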

Why This Reduces Hallucinations

Traditional RAG:

Query: "Who replicated NIH-funded findings?"
Retrieved: [3 paragraphs of text mentioning NIH and replication]
LLM: *re-parses text, may hallucinate connections*

Kāraka RAG:

Query: "Who replicated NIH-funded findings?"
Retrieved Structured Facts:
  - Event: "was funded" → Agent: "NIH", Object: "gut-brain research"
  - Event: "replicated" → Agent: "Maria Santos", Object: "findings on gut-brain"
LLM: *synthesizes from structured facts only*
Answer: "Maria Santos replicated the NIH-funded findings (doc1:s3)"

The knowledge graph provides:

  • Deterministic facts (not ambiguous text)
  • Semantic roles (who did what to whom)
  • Provenance (every fact traceable to source sentence)
  • Structured context (LLM can't invent connections)

Technology Stack

Infrastructure

  • AWS CDK: Infrastructure as Code (Python)
  • AWS EKS: Kubernetes cluster for GPU workloads
  • AWS Lambda: Serverless compute (Python 3.12)
  • AWS Step Functions: Orchestration of KG construction
  • AWS DynamoDB: Metadata storage
  • AWS S3: Document and graph storage

AI/ML

  • NVIDIA NIM: Containerized model inference
    • Generator: llama-3.1-nemotron-nano-8b-v1
    • Embedder: llama-3.2-nv-embedqa-1b-v2
  • NetworkX: Graph data structure and operations
  • NumPy: Vector operations for embeddings

Compute Resources

  • GPU Nodes: 2x g5.xlarge (NVIDIA A10G)
  • Lambda Concurrency:
    • LLM calls: 3 reserved
    • RAG queries: 2 reserved per function
  • Step Functions: Max concurrency = 1 (prevents LLM overload)

Hackathon Scope: Proof of Concept

This hackathon project demonstrates the core hypothesis:

What We Built:

  • Complete document ingestion pipeline
  • LLM-driven Kāraka extraction with iterative validation
  • Knowledge graph construction (entities, events, relations)
  • Semantic retrieval with graph traversal
  • Answer synthesis with structured evidence + citations

🚧 Future Work (Beyond Hackathon):

  • Multi-hop reasoning across documents
  • Temporal reasoning (event sequences)
  • Contradiction detection
  • Graph visualization UI
  • Advanced query decomposition
  • Coreference resolution across sentences

Key Prompts

Our system uses carefully engineered prompts for each extraction step:

  • Entity Extraction (entity_prompt.txt): Identifies all nouns
  • Kriyā Extraction (kriya_extraction_prompt.txt): Identifies verbs and voice
  • Event Instance Creation (event_instance_prompt.txt): Maps Kāraka roles
  • Auditor (auditor_prompt.txt): Validates semantic correctness
  • Relation Extraction (relation_prompt.txt): Finds non-Kāraka relationships
  • Answer Synthesis (answer_synthesizer_prompt.txt): Generates cited answers

All prompts are stored in prompts/ and synced to S3 during deployment.
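
A hedged sketch of how an agent might fetch its prompt from that S3 location at cold start; the bucket env var and key prefix are assumptions:

import os
import boto3

s3 = boto3.client("s3")

def load_prompt(name: str) -> str:
    # Read the prompt text synced to S3 during deployment.
    obj = s3.get_object(
        Bucket=os.environ["PROMPTS_BUCKET"],
        Key=f"prompts/{name}",
    )
    return obj["Body"].read().decode("utf-8")

AUDITOR_PROMPT = load_prompt("auditor_prompt.txt")  # cached per Lambda container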

Testing the System

1. Deploy Infrastructure

See DEPLOYMENT-GUIDE.md for complete setup instructions.

2. Upload a Document

./test-fresh-upload.sh

3. Monitor Processing

# Check sentence processing status
python check-sentence-status.py

# View processing chain for a sentence
curl "$API_URL/processing-chain?sentence_hash=<hash>"

4. Query the Knowledge Graph

./test-query-api.sh

Example query: "Who collaborated on neuroplasticity research?"

Response:

{
  "query_id": "q_abc123",
  "status": "completed",
  "answer": "Dr. Elena Kowalski collaborated with Dr. James Chen on a groundbreaking study examining neuroplasticity (doc1:s2).",
  "structured_evidence": [
    {
      "event": "collaborated",
      "agent": "Dr. Elena Kowalski",
      "locus_topic": "a groundbreaking study",
      "source": "doc1:s2"
    }
  ]
}

Project Structure

.
├── app.py                          # CDK app entry point
├── nvidia_aws_agentic_ai/          # CDK stack definitions
│   ├── serverless_stack.py         # Lambda, DynamoDB, API Gateway
│   └── eks_stack.py                # EKS cluster, GPU nodes, NIM models
├── src/lambda_src/                 # Lambda function code
│   ├── job_mgmt/                   # Document lifecycle management
│   ├── kg_agents/                  # Knowledge graph extraction agents
│   ├── graph_ops/                  # NetworkX graph operations
│   ├── rag/                        # Retrieval and synthesis
│   └── api_tools/                  # Observability and query APIs
├── prompts/                        # LLM prompts for each agent
├── lambda_layer/                   # Python dependencies (requests, networkx, numpy)
├── deploy-model.sh                 # Deploy EKS + NVIDIA NIM models
├── deploy-backend.sh               # Deploy serverless backend
├── test-model-endpoints.sh         # Test NVIDIA NIM endpoints
├── test-query-api.sh               # Test RAG query flow
└── DEPLOYMENT-GUIDE.md             # Setup instructions for judges

Team & Acknowledgments

This project was built for the NVIDIA + AWS Generative AI Hackathon.

Core Innovation:

  • Application of ancient Pāṇinian grammar to modern knowledge graphs
  • LLM-driven semantic extraction with iterative validation
  • Structured evidence retrieval for hallucination reduction

Technologies:

  • NVIDIA NIM for efficient model inference
  • AWS serverless architecture for scalability
  • Kāraka theory for semantic role labeling

License

This project is submitted for the NVIDIA + AWS Generative AI Hackathon.


The Fountain of Intellect: Where 2,500-year-old linguistic theory meets cutting-edge AI to build knowledge graphs that think like humans do.
