A persistent conversational AI agent with long-term memory, dynamic identity, proximity awareness, and real-time streaming responses — powered by Google Gemini API.
- 🔄 Real-Time Streaming — Typewriter-style character-by-character response display via Gemini streaming API
- 🧠 Bifurcated Memory System — Dual-layer memory with episodic (SQLite FTS5) and semantic (FAISS) recall
- 📝 Auto-Summarization — Every 5 turns, the conversation buffer is compressed into a single memory sentence and indexed into long-term storage
- 🎭 Dynamic Lore Retrieval — Personality and knowledge chunks retrieved via semantic search, not static injection
- 📍 Proximity Detection — Nomic embedding-based detection of physical/remote/transitional presence states
- 🕐 Temporal Awareness — Tracks time between conversations and adjusts context accordingly
- 🛡️ Traffic Control — Hold-Wait-Commit pattern ensures only valid responses are logged
- 💾 Response Caching — Deduplicates API calls with local hash-based cache
- 🔧 Memory Management CLI — List, delete, clear, and rebuild memory from the command line
project/
├── main.py # Entry point — CLI loop, streaming, 5-turn cycles
├── model_config.py # LLM configuration (model, API, generation params)
├── setup.py # Initialize project structure and default files
├── manage_memory.py # Memory management CLI tool
├── check_models.py # List available models for your API key
├── requirements.txt
│
├── agent/ # State & Retrieval Layer
│ ├── temporal.py # Time delta calculation
│ ├── memory.py # SQLite FTS5 episodic storage
│ ├── semantic_search.py # FAISS vector search (nomic-embed-text)
│ ├── conversation.py # Session logging + buffer management
│ ├── dynamic_lore.py # Semantic lore retrieval
│ ├── lore/ # Static personality files
│ │ ├── self.md # AI identity
│ │ ├── user.md # User profile
│ │ └── relationship.md # Connection definition
│ └── episodes/ # Raw memory source files
│
├── pipeline/ # Prompt Construction & Rendering
│ ├── packet_builder.py # XML-tagged prompt assembly
│ ├── renderer.py # Gemini API (non-streaming, cached)
│ └── summarizer_builder.py # 5-turn summarization pipeline
│
├── streaming/ # Real-Time Response
│ └── renderer_streaming.py # Streaming with typewriter effect
│
├── proximity/ # Presence Detection
│ └── proximity_manager.py # Nomic-based proximity state engine
│
├── memory/ # Memory Intent & Retrieval
│ └── memory_loader.py # Intent detection + multi-source fetching
│
├── tools/ # Utilities
│ └── index_lore.py # Rebuild lore index
│
└── data/
├── nomic-embed-text-v1.5.Q8_0.gguf # Local embedding model
└── logs_raw/ # Session conversation logs
User Input → HOLD (temporary, not logged)
↓
Build Packet (XML-tagged prompt)
├── Dynamic Lore → Semantic search over personality chunks
├── Proximity State → Inject if changed (embedding similarity)
├── Memory Bank → Fetch if memory intent detected
└── Chat History → Last 6 turns
↓
Stream to Gemini API (gemma-3-12b-it)
↓
Typewriter Display → Clean [AI]: prefix → Validate
↓
Valid? → COMMIT both messages to log + buffer
Invalid? → DISCARD (clean retry, no history pollution)
↓
Turn == 5? → Summarize → Index to brain.db + FAISS → Reset buffer
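The Hold-Wait-Commit step above can be sketched in a few lines. This is an illustrative minimal version, not the project's actual code; class and method names here are assumptions.

```python
# Minimal sketch of the Hold-Wait-Commit pattern: user input is held,
# and only a validated AI response commits both messages to history.

class TurnBuffer:
    """Holds a pending user turn until the AI response validates."""

    def __init__(self):
        self.history = []      # committed turns only
        self._pending = None   # held user input, not yet logged

    def hold(self, user_input):
        self._pending = user_input

    def commit(self, ai_response):
        # Valid response: both messages reach the permanent history.
        self.history.append(("user", self._pending))
        self.history.append(("ai", ai_response))
        self._pending = None

    def discard(self):
        # Invalid response: drop the held input, history stays clean.
        self._pending = None

buf = TurnBuffer()
buf.hold("hello")
buf.commit("hi there")
buf.hold("???")
buf.discard()              # failed validation, no history pollution
print(len(buf.history))    # → 2
```

The point of the pattern is that a malformed or empty model response never poisons the chat history used to build the next prompt.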
- Python 3.10+
- Google AI Studio API Key (free tier supports Gemma models)
- ~200MB disk space (for the Nomic embedding model)
# Clone the repository
git clone https://github.com/optimist1101jan/Persistent-AI-Systems-.git
# Navigate to the project directory
cd Persistent-AI-Systems-
# Install dependencies
pip install -r requirements.txt
# Add your API key (Creates API_KEY.txt)
echo "API_KEY=your_gemini_api_key_here" > API_KEY.txt
# Initialize database and project structure
python setup.py
# Start the agent
python main.py

Download the nomic-embed-text-v1.5 GGUF model and place it in data/:
# Place the file at:
data/nomic-embed-text-v1.5.Q8_0.gguf

Note: The agent works without the embedding model (using a fallback), but semantic search and proximity detection require it.
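The fallback behavior might look like the following sketch, which degrades gracefully when llama-cpp-python or the GGUF file is missing. The function name and structure are assumptions for illustration, not the project's actual loader.

```python
# Hedged sketch: load the local embedding model if available, else
# return None so the agent can run in fallback mode.
from pathlib import Path

MODEL_PATH = Path("data/nomic-embed-text-v1.5.Q8_0.gguf")

def load_embedder(path=MODEL_PATH):
    """Return an embedding callable, or None when the model is unavailable."""
    try:
        from llama_cpp import Llama  # llama-cpp-python runtime
    except ImportError:
        return None
    if not path.exists():
        return None
    llm = Llama(model_path=str(path), embedding=True, verbose=False)
    return lambda text: llm.embed(text)  # 768-dim nomic embedding

embed = load_embedder()
if embed is None:
    print("fallback mode: semantic search and proximity detection disabled")
```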
All model settings are in model_config.py:
| Parameter | Default | Description |
|---|---|---|
| `MODEL` | `gemma-3-12b-it` | Gemini/Gemma model to use |
| `TEMPERATURE` | `0.7` | Response creativity |
| `MAX_OUTPUT_TOKENS` | `1000` | Max response length |
| `TIMEOUT` | `60s` | API request timeout |
| `MAX_RETRIES` | `3` | Retry attempts on failure |
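A model_config.py matching the defaults above might look like this (an illustrative fragment, not the file's exact contents):

```python
# Illustrative model_config.py with the documented defaults.
MODEL = "gemma-3-12b-it"   # Gemini/Gemma model to use
TEMPERATURE = 0.7          # response creativity
MAX_OUTPUT_TOKENS = 1000   # max response length
TIMEOUT = 60               # seconds per API request
MAX_RETRIES = 3            # retry attempts on failure
```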
Free Tier (Gemma):
- gemma-3-1b-it (fastest)
- gemma-3-4b-it (balanced)
- gemma-3-12b-it (recommended)
- gemma-3-27b-it (best quality)

Paid Tier (Gemini):
- gemini-2.0-flash
- gemini-2.0-flash-lite
- gemini-2.5-flash
- gemini-2.5-pro
Run `python check_models.py` to see all models available for your API key.
The agent uses a 3-stage memory pipeline:

1. Buffer: Raw conversation turns are held in memory for the current 5-turn cycle.
2. Summarize: After 5 turns, the buffer is sent to Gemini for compression into a single factual sentence.
3. Index: The compressed memory is simultaneously indexed into:
- Episodic Store (SQLite FTS5) — keyword searchable
- Semantic Index (FAISS) — embedding-based similarity search
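The episodic half of the indexing step can be sketched with Python's built-in sqlite3 module. The table schema and function names are assumptions for illustration; the FAISS half would embed the same sentence and add the vector to the index.

```python
# Sketch: write one compressed memory sentence into an SQLite FTS5
# table, then recall it with a keyword search.
import sqlite3

def index_memory(conn, sentence):
    conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS memories USING fts5(text)")
    conn.execute("INSERT INTO memories(text) VALUES (?)", (sentence,))
    conn.commit()

def search_episodic(conn, query):
    cur = conn.execute("SELECT text FROM memories WHERE memories MATCH ?", (query,))
    return [row[0] for row in cur]

conn = sqlite3.connect(":memory:")
index_memory(conn, "User mentioned they adopted a cat named Miso in March.")
print(search_episodic(conn, "cat"))
```

FTS5 gives cheap keyword recall, while the parallel FAISS index catches paraphrases ("my pet" vs "my cat") that keyword search would miss.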
When the user asks a memory-related question (e.g., "do you remember..."), the system:
- Detects memory intent via keyword patterns
- Searches both episodic and semantic stores
- Injects relevant memories into the prompt
python manage_memory.py list # View all stored memories
python manage_memory.py delete <id> # Delete a specific memory
python manage_memory.py stats # Show memory statistics
python manage_memory.py rebuild # Rebuild FAISS index
python manage_memory.py clear       # Clear all memories

The agent detects physical presence context using embedding similarity:
| State | Description | Example Input |
|---|---|---|
| `PHYSICAL` | User is present, face-to-face | "sits next to you" |
| `REMOTE` | Chatting remotely | "texting from work" |
| `TRANSITION_TOWARD` | User arriving | "walks over to you" |
| `TRANSITION_AWAY` | User leaving | "I need to go now" |
Proximity context is only injected when the state changes or on the first turn, saving tokens.
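The classification mechanics are nearest-prototype matching by cosine similarity. In this toy sketch, `embed` is a stand-in bag-of-words vectorizer so the example runs without the nomic model; the real system compares 768-dim nomic embeddings, and the prototype phrases here are illustrative guesses.

```python
# Toy sketch of nearest-prototype proximity classification.
import math
from collections import Counter

PROTOTYPES = {
    "PHYSICAL": "sits next to you face to face",
    "REMOTE": "texting you from work far away",
    "TRANSITION_TOWARD": "walks over toward you arriving",
    "TRANSITION_AWAY": "I need to go now leaving",
}

def embed(text):  # stand-in for the real 768-dim nomic embedding
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(user_input):
    vec = embed(user_input)
    return max(PROTOTYPES, key=lambda s: cosine(vec, embed(PROTOTYPES[s])))

print(classify("sits next to you on the couch"))  # → PHYSICAL
print(classify("texting from work"))              # → REMOTE
```

Because the state is tracked turn to turn, the prompt only carries proximity context when `classify` returns something new, which is what saves tokens.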
| Component | Technology |
|---|---|
| LLM | Google Gemini API (Gemma 3 12B) |
| Embeddings | nomic-embed-text-v1.5 (GGUF, 768-dim) |
| Vector Search | FAISS (IndexFlatIP, cosine similarity) |
| Episodic Memory | SQLite FTS5 |
| Embedding Runtime | llama-cpp-python |
| Streaming | Gemini streamGenerateContent API |
| Language | Python 3.10+ |
- Never commit `API_KEY.txt`; add it to `.gitignore`
- The API key is loaded at runtime from a local file
- No external data is stored beyond the local cache and logs
This project is for educational and personal use. See individual dependency licenses for third-party components.