Accuracy over Speed • Local-First • Privacy-Preserving
AVA v4.2 implements the Sentinel Architecture, a state-of-the-art cognitive system
that prioritizes verified accuracy over probabilistic token generation.
Unlike standard LLMs that "guess" tokens, AVA verifies its answers before responding:

- Privacy-preserving: your data never leaves your machine.
- Lightweight: designed for 4GB VRAM GPUs.
- Research-grounded: built on cutting-edge research.
AVA's four-stage cognitive loop ensures accurate, verified responses:
                 ┌─────────────────┐
                 │   USER QUERY    │
                 └────────┬────────┘
                          │
       ┌──────────────────┴──────────────────┐
       │         STAGE 1: PERCEPTION         │
       │  ┌───────────────────────────────┐  │
       │  │ Embedding → KL Divergence     │  │
       │  │           → Surprise Score    │  │
       │  └───────────────────────────────┘  │
       └──────────────────┬──────────────────┘
                          │
       ┌──────────────────┴──────────────────┐
       │         STAGE 2: APPRAISAL          │
       │  ┌───────────────────────────────┐  │
       │  │    Active Inference Engine    │  │
       │  │   G(π) = -Pragmatic           │  │
       │  │          - Epistemic + Effort │  │
       │  └───────────────────────────────┘  │
       └──────────────────┬──────────────────┘
                          │
       ┌──────────────────┼──────────────────┐
       │                  │                  │
       ▼                  ▼                  ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│   MEDULLA    │  │    SEARCH    │  │    CORTEX    │
│  Fast Path   │  │    Tools     │  │  Deep Path   │
│  ─────────   │  │  ─────────   │  │  ─────────   │
│  gemma3:4b   │  │  DDG/Google  │  │ qwen2.5:32b  │
│   <200ms     │  │  Bing/Brave  │  │    3-30s     │
└──────────────┘  └──────────────┘  └──────────────┘
       │                  │                  │
       └──────────────────┼──────────────────┘
                          │
       ┌──────────────────┴──────────────────┐
       │          STAGE 4: LEARNING          │
       │  ┌───────────────────────────────┐  │
       │  │     Titans Memory Update      │  │
       │  │     M_t = M_{t-1} - η∇θL      │  │
       │  │      (Surprise-Weighted)      │  │
       │  └───────────────────────────────┘  │
       └──────────────────┬──────────────────┘
                          │
                 ┌────────┴────────┐
                 │ VERIFIED OUTPUT │
                 └─────────────────┘
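To make the loop concrete, here is a minimal Python sketch of how the four stages could compose. Everything in it is illustrative: the function names, the toy distributions, and the exact routing rule around `SURPRISE_THRESHOLD` are assumptions for exposition, not AVA's actual internals (see ARCHITECTURE.md for those).

```python
import math

SURPRISE_THRESHOLD = 0.5  # mirrors cognitive.surprise_threshold in config/cortex_medulla.yaml

def surprise_score(p, q):
    # Stage 1: KL(p || q) between the query's embedding distribution p
    # and the current belief distribution q; higher means more novel.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0 and qi > 0)

def expected_free_energy(pragmatic, epistemic, effort):
    # Stage 2: G(pi) = -Pragmatic - Epistemic + Effort (lower is better).
    return -pragmatic - epistemic + effort

def route(surprise, g_scores):
    # Stage 3 (the three-way branch): low surprise takes the fast path;
    # otherwise pick the policy with the lowest expected free energy.
    if surprise < SURPRISE_THRESHOLD:
        return "medulla"  # gemma3:4b, <200 ms
    return min(g_scores, key=g_scores.get)  # "search" or "cortex"

def titans_update(memory, grad, surprise, lr=0.01):
    # Stage 4: M_t = M_{t-1} - eta * grad(L), weighted by surprise so that
    # novel interactions move memory more than routine ones.
    return [m - lr * surprise * g for m, g in zip(memory, grad)]

# Example: a surprising query gets routed past the fast path.
s = surprise_score([0.9, 0.1], [0.2, 0.8])
g = {"search": expected_free_energy(0.8, 0.5, 0.2),
     "cortex": expected_free_energy(0.9, 0.7, 0.6)}
print(route(s, g))  # -> "search"
```

Lower G(π) marks a more attractive policy, so once the surprise gate decides the fast path is not enough, the router simply takes the minimum.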
# 1. Install Ollama (required)
# Download from: https://ollama.ai
# 2. Pull models
ollama pull gemma3:4b # Fast responses
ollama pull nomic-embed-text # Surprise calculation
ollama serve # Start server
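Before launching AVA, it can help to confirm that Ollama is reachable and both models are present. This short check queries Ollama's standard `/api/tags` endpoint (the same one the Troubleshooting section hits with curl); the required-model list simply mirrors the pull commands above:

```python
import requests  # pip install requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
names = [m["name"] for m in resp.json().get("models", [])]

for required in ("gemma3:4b", "nomic-embed-text"):
    if any(n.startswith(required) for n in names):
        print(f"{required}: OK")
    else:
        print(f"{required}: MISSING (run: ollama pull {required})")
```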
📦 Option A: Download Release (Recommended)

Download the installer from the Releases page.
🔧 Option B: Build from Source

git clone https://github.com/NAME0x0/AVA.git
cd AVA/ui
npm install
npm run tauri build
# Desktop App (GUI)
./AVA.exe # or double-click
# Terminal UI (Power Users)
cd AVA && pip install -e .
python -m tui.app
# API Server Only
python server.py # http://127.0.0.1:8085

| Interface | Description | Launch |
|---|---|---|
| 🖥️ Desktop App | Native GUI with neural visualization | AVA.exe |
| ⌨️ Terminal UI | Keyboard-driven power-user interface | python -m tui.app |
| 🌐 HTTP API | REST + WebSocket for integrations | http://127.0.0.1:8085 |
| Key | Action | Key | Action |
|---|---|---|---|
| Ctrl+K | Command palette | Ctrl+S | Force search |
| Ctrl+L | Clear chat | Ctrl+D | Deep thinking |
| Ctrl+T | Toggle metrics | Ctrl+E | Export chat |
| F1 | Help | Ctrl+Q | Quit |
# Health check
curl http://127.0.0.1:8085/health
# Send message
curl -X POST http://127.0.0.1:8085/chat \
-H "Content-Type: application/json" \
-d '{"message": "Explain quantum computing"}'
# Get cognitive state (entropy, surprise, varentropy)
curl http://127.0.0.1:8085/cognitive
# WebSocket streaming
wscat -c ws://127.0.0.1:8085/ws

| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Server health & Ollama status |
| /chat | POST | Send message, get response |
| /ws | WS | Real-time bidirectional chat |
| /cognitive | GET | Entropy, surprise, confidence |
| /belief | GET | Active Inference belief state |
| /memory | GET | Memory statistics |
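For scripted access, the same endpoints can be exercised from Python. A minimal sketch with `requests`; since the response schemas are not documented in this README, the script just prints the raw JSON rather than assuming field names:

```python
import requests

BASE = "http://127.0.0.1:8085"

# Health check, equivalent to: curl http://127.0.0.1:8085/health
print(requests.get(f"{BASE}/health", timeout=5).json())

# Send a message, then inspect the cognitive state it produced.
reply = requests.post(
    f"{BASE}/chat",
    json={"message": "Explain quantum computing"},
    timeout=120,  # deep-path (Cortex) answers can take up to ~30 s
)
print(reply.json())
print(requests.get(f"{BASE}/cognitive", timeout=5).json())
```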
AVA/
├── 📁 config/                 # Configuration files
│   ├── cortex_medulla.yaml    # Main config
│   └── tools.yaml             # Tool definitions
├── 📁 docs/                   # Documentation
│   ├── GETTING_STARTED.md     # Quick start guide
│   ├── ARCHITECTURE.md        # Sentinel architecture
│   └── API_EXAMPLES.md        # API reference
├── 📁 src/                    # Python source (TUI/tools)
│   ├── core/                  # Cortex-Medulla system
│   ├── hippocampus/           # Titans memory
│   └── tools/                 # Tool implementations
├── 📁 tui/                    # Terminal UI (Textual)
├── 📁 ui/                     # Desktop GUI (Tauri + Next.js)
│   ├── src-tauri/             # Rust backend
│   └── src/engine/            # Cognitive engine
├── 📁 tests/                  # Test suite
└── 📄 README.md               # You are here
Edit config/cortex_medulla.yaml:
cognitive:
  fast_model: "gemma3:4b"       # Medulla (fast)
  deep_model: "qwen2.5:32b"     # Cortex (deep)
  surprise_threshold: 0.5       # Routing threshold

search:
  enabled: true
  min_sources: 3                # Verify with N sources

agency:
  epistemic_weight: 0.6         # Curiosity level
  pragmatic_weight: 0.4         # Goal focus

thermal:
  max_gpu_power_percent: 15     # Safe for laptops

See CONFIGURATION.md for all options.
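The file is plain YAML, so it can also be inspected or tweaked programmatically. A small sketch using PyYAML; the key paths mirror the snippet above, and the new threshold value is only an example:

```python
import yaml  # pip install pyyaml

with open("config/cortex_medulla.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["cognitive"]["surprise_threshold"])  # 0.5 in the default config above

# Example tweak (see ARCHITECTURE.md for how routing uses this value).
cfg["cognitive"]["surprise_threshold"] = 0.7
with open("config/cortex_medulla.yaml", "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)
```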
| Component | Minimum | Recommended |
|---|---|---|
| GPU VRAM | 4GB | 8GB+ |
| System RAM | 8GB | 16GB+ |
| Storage | 10GB | 50GB |
| OS | Windows 10 / Linux | Windows 11 / Ubuntu 22.04 |
Component            │ Resident  │ Peak
─────────────────────┼───────────┼──────────
System Overhead      │   300 MB  │   300 MB
Medulla (gemma3:4b)  │ 2,000 MB  │ 2,000 MB
Embedding Model      │   200 MB  │   200 MB
Titans Memory        │   100 MB  │   100 MB
─────────────────────┼───────────┼──────────
Total                │ 2,600 MB  │ 2,600 MB
Headroom             │ 1,400 MB  │ 1,400 MB
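The headroom row assumes the 4GB VRAM minimum from the requirements table, treated here as a 4,000 MB working budget; the arithmetic checks out:

```python
components_mb = {
    "System Overhead": 300,
    "Medulla (gemma3:4b)": 2000,
    "Embedding Model": 200,
    "Titans Memory": 100,
}
total = sum(components_mb.values())
print(total)         # 2600 -> the Total row
print(4000 - total)  # 1400 -> the Headroom row (assumed 4,000 MB budget)
```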
❌ "Ollama is not running"
# Start Ollama server
ollama serve
# Verify it's running
curl http://localhost:11434/api/tags

❌ "No models available"
# Pull required models
ollama pull gemma3:4b
ollama pull nomic-embed-text
# Verify models
ollama list

❌ "Port 8085 already in use"
# Windows
netstat -ano | findstr :8085
taskkill /F /PID <pid>
# Linux/macOS
lsof -i :8085
kill -9 <pid>

❌ "Out of GPU memory"
# Use smaller model
export OLLAMA_MODEL=gemma2:2b
# Or limit GPU memory
export AVA_GPU_MEMORY_LIMIT=3000

See TROUBLESHOOTING.md for more solutions.
| Document | Description |
|---|---|
| Getting Started | Installation & first steps |
| Architecture | Sentinel architecture deep-dive |
| Configuration | All configuration options |
| API Examples | HTTP/WebSocket examples |
| TUI Guide | Terminal UI reference |
| Environment Variables | All env vars |
| Troubleshooting | Common issues |
Contributions are welcome! Please read our Contributing Guide first.
# Fork the repo, then:
git clone https://github.com/YOUR_USERNAME/AVA.git
cd AVA
pip install -e ".[dev]"
pre-commit install
# Make changes, then:
pytest # Run tests
cargo test # Rust tests
git commit -m "feat: your feature"
git push origin your-branch

This project is licensed under the MIT License; see LICENSE for details.
Built with ❤️ for the research community
⭐ Star this repo • 🐛 Report Bug • 💡 Request Feature
