🚀 Production-Ready Knowledge Organization Infrastructure Pipeline
✅ Provenance System Enhanced (Sept 27, 2025): Complete parent-child document relationships with full URL preservation through provenance chain.
✅ Text Extraction Fixed (Sept 14, 2025): Pipeline now processing 100% clean, uncorrupted text after fixing sensor extraction issues.
A comprehensive sensor-to-agent pipeline that processes real-time content from KOI sensors, generates embeddings, handles deduplication and versioning, and provides immediate semantic search capabilities for AI agents.
- Overview
- Architecture
- Key Features
- Installation
- Configuration
- Usage
- API Documentation
- Database Schema
- Testing
- Deployment
- Troubleshooting
The KOI Processor is the central processing hub of the Knowledge Organization Infrastructure (KOI) ecosystem. It receives events from distributed sensors, processes content into searchable embeddings, and makes knowledge immediately available to AI agents through semantic search.
- ✅ RID-based Deduplication: Prevents duplicate content ingestion
- ✅ Version Control: Tracks content updates with full audit trail
- ✅ CAT Receipts: Complete provenance tracking for all transformations
- ✅ Isolated Tables: Separates sensor data from scraped content
- ✅ Parent-Child Documents: Hierarchical relationships for forum topics and posts
- ✅ Provenance API: Navigate document lineage via /api/koi/graph/provenance/{rid}
- ✅ Enhanced Curators: Daily/Weekly content curation with proper URL attribution
- ✅ Production Embeddings: Model-agnostic embedding server (currently BGE-large-en-v1.5)
- ✅ MCP Integration: Semantic search via Model Context Protocol
- ✅ Content Operations Dashboard: Web-based monitoring for Daily Bot and Weekly Digest
- ✅ Daily Content Curator: LLM-enhanced daily X posts with comprehensive ledger integration
- ✅ Weekly Aggregator: AI-powered weekly digest with on-chain activity tracking
- ✅ Code Graph Service: Automatic code entity extraction into Apache AGE graph
- ✅ Knowledge Graph Quality Improvement (Dec 2025): Modular post-processing pipeline with regression suite and 99.7% quality (up from 62%)
Status: ✅ PRODUCTION DEPLOYMENT v1.1 | Quality: 99.7% | Tests: KG regression suite passing | Dedup: 70.10%
Comprehensive three-phase quality improvement project completed successfully:
Phase 1-2: Quality Filters & Pipeline ✅
- 62% → 99.7% quality improvement
- Modular Pipeline Framework: 6 operational modules (ConfidenceFilter, DocumentLevelDeduplicator, CanonicalResolver, OntologyNormalizer, ListSplitter, EntityQualityFilter)
- Regression Suite: Targeted tests for KG stability
- Production Deployment: Zero errors, < 1% performance overhead
Phase 3: Cross-Document Deduplication ✅ COMPLETE
- Entity Deduplication System with a three-tier waterfall (sketched in code after this section):
- Tier 1 (Exact): B-Tree index match (~microseconds) - 58.0% hit rate
- Tier 2 (Semantic): pgvector HNSW + OpenAI embeddings (~milliseconds) - 10.6% hit rate
- Tier 3 (New): Insert new entities - 31.4% new entities
- Production Stats (as of 2025-12-11, pre-Stage 6 re-extraction):
- 12,985 unique entities from 43,430 raw entity mentions
- 70.10% deduplication rate (target: 65-75%)
- 64,925 RDF triples (Fuseki knowledge graph deployed)
- Zero type collisions (all type mismatches resolved)
- Zero placeholder entities (all "Unknown"/"Anonymous" removed)
- Zero errors in production
- Quality Improvements:
- JIRA IDs: 509 → 0 (100% eliminated)
- Template text: 444 → 0 (100% eliminated)
- Chunk repetition: 95% reduction
- Type consolidation: 678 entities merged
- Code Quality: A+ grade (expert reviewed)
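The three-tier waterfall above resolves each raw mention against the registry before anything new is inserted. Below is a minimal sketch, assuming a psycopg2 connection, an entity_registry table with name/entity_type/embedding columns, and an embed() helper; the names and threshold are illustrative, not the exact entity_resolver.py API.

# Minimal sketch of the three-tier resolution waterfall (illustrative names only).
import psycopg2

SIMILARITY_THRESHOLD = 0.9  # assumed Tier 2 cutoff

def to_vector_literal(vec):
    # pgvector accepts a text literal like "[0.1,0.2,...]" cast with ::vector
    return "[" + ",".join(str(x) for x in vec) + "]"

def resolve_entity(conn, name, entity_type, embed):
    with conn.cursor() as cur:
        # Tier 1 (Exact): B-Tree match on normalized name + type (~microseconds).
        cur.execute(
            "SELECT id FROM entity_registry WHERE lower(name) = lower(%s) AND entity_type = %s",
            (name, entity_type),
        )
        row = cur.fetchone()
        if row:
            return row[0]

        # Tier 2 (Semantic): pgvector HNSW nearest neighbour on the name embedding (~milliseconds).
        vec = to_vector_literal(embed(name))
        cur.execute(
            "SELECT id, 1 - (embedding <=> %s::vector) AS similarity "
            "FROM entity_registry WHERE entity_type = %s "
            "ORDER BY embedding <=> %s::vector LIMIT 1",
            (vec, entity_type, vec),
        )
        row = cur.fetchone()
        if row and row[1] >= SIMILARITY_THRESHOLD:
            return row[0]

        # Tier 3 (New): no match, insert a new canonical entity.
        cur.execute(
            "INSERT INTO entity_registry (name, entity_type, embedding) "
            "VALUES (%s, %s, %s::vector) RETURNING id",
            (name, entity_type, vec),
        )
        conn.commit()
        return cur.fetchone()[0]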
Stage 6 rebuilds the semantic KG from a docs-only corpus (Notion + Discourse + Website + GitHub/GitLab markdown), using Gemini extraction and PostgreSQL as the authoritative store. Fuseki is rebuilt from PostgreSQL after completion.
The Code Bridge makes semantic entities and code structure joinable:
- koi_code_artifacts: canonical code entities (exported from code graph provenance)
- koi_doc_code_links: doc → code links (MENTIONS edges preserved)
- entity_registry.metadata.code_uri: entity-level code links (post-Stage 6)
- AGE stub sync for single-query access to semantic anchors
Key scripts:
- scripts/reextraction/stage6_canary_gemini.py
- scripts/reextraction/stage6_full_reextract_gemini.py
- scripts/code_bridge/export_code_artifacts.py
- scripts/code_bridge/link_docs_to_code.py
- scripts/code_bridge/link_entities_to_code.py
- scripts/code_bridge/sync_stubs_to_age.py
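As a hedged illustration of how the bridge tables can be joined, the sketch below lists code artifacts mentioned by a given document; the column names (doc_rid, code_uri, name, path) are assumptions, not the confirmed schema.

# Hedged sketch: code artifacts mentioned by a document via the Code Bridge tables.
# Column names (doc_rid, code_uri, name, path) are assumed, not confirmed.
import psycopg2

def code_artifacts_for_doc(conn, doc_rid):
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT a.code_uri, a.name, a.path
            FROM koi_doc_code_links l
            JOIN koi_code_artifacts a ON a.code_uri = l.code_uri
            WHERE l.doc_rid = %s
            """,
            (doc_rid,),
        )
        return cur.fetchall()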
Quick Start:
from knowledge_graph.graph_integration import KnowledgeGraphIntegrator
# With pipeline and deduplication
kg = KnowledgeGraphIntegrator(
use_pipeline=True,
use_entity_resolver=True # Enable deduplication
)
valid_entities = kg.process_entities_batch(entities)

Documentation:
- PRODUCTION_DEPLOYMENT_SUMMARY.md - Production deployment status & rollback procedures
- prompts/ALL_PROMPTS_SUMMARY.md - Complete project workflow
- CLAUDE.md - Current project context
koi-processor/
├── src/ # Source code
│ ├── core/ # Core KOI processing
│ │ ├── koi_event_bridge_v2.py
│ │ ├── koi_types.py
│ │ ├── koi_permissions_api.py
│ │ └── bge_server.py
│ ├── content/ # Content generation & monitoring
│ │ ├── content_dashboard.py
│ │ ├── daily_curator_llm.py
│ │ ├── weekly_curator_llm.py
│ │ └── quality_control.py
│ ├── audio/ # Podcast generation
│ │ ├── audio_pipeline_enhanced.py
│ │ └── podcast_integration.py
│ ├── knowledge_graph/ # Knowledge graph & entity processing
│ │ ├── postprocessing/
│ │ │ ├── pipeline.py # Pipeline framework
│ │ │ └── modules/ # Processing modules
│ │ ├── entity_resolver.py # Entity deduplication (3-tier)
│ │ ├── uri_generator.py # Deterministic URI generation
│ │ └── graph_integration.py # Fuseki integration
│ ├── services/ # External service integrations
│ │ ├── regen_ledger.py
│ │ └── regen_ledger_comprehensive.py
│ └── utils/ # Utilities & helpers
├── scripts/ # Operational scripts
│ ├── setup.sh # One-command setup
│ ├── run_migrations.sh # Database migrations
│ ├── run_daily_curator.py
│ └── code_bridge/ # Code/semantic bridge scripts
├── migrations/ # Database migrations
├── docs/ # Documentation
├── tests/ # Test suite (use targeted KG regression subset)
├── prompts/ # Project planning & documentation
│ ├── PROMPT_1-23_*.md # Active prompts
│ └── archive/ # Superseded prompts
├── config/ # Configuration files
├── static/ # Dashboard static files
├── templates/ # Dashboard templates
└── requirements.txt # Python dependencies
DATA INGESTION PIPELINE:
┌─────────────┐ ┌──────────────┐ ┌──────────────┐
│ KOI Sensors │────▶│ Coordinator │────▶│ Event Bridge │
│ (Various) │ │ (Port 8005) │ │ (Port 8100) │
└─────────────┘ └──────────────┘ └──────────────┘
│
┌──────────────┴──────────────┐
│ │
▼ ▼
┌──────────────┐ ┌──────────────────┐
│ Embedding │ │ Entity Extractor │
│ Server │ │ (JSON-LD/RDF) │
│ (Port 8090) │ │ [PLANNED] │
└──────────────┘ └──────────────────┘
│ │
▼ ▼
┌──────────────┐ ┌──────────────────┐
│ PostgreSQL │ │ Apache Jena │
│ (pgvector) │ │ Fuseki │
│ • Embeddings │ │ (Port 3030) │
└──────────────┘ │ • RDF Triples │
│ • Ontologies │
└──────────────────┘
QUERY/ACCESS LAYER:
┌──────────────────────────────────────┐
│ Hybrid RAG API (Port 8301) │
│ • Reciprocal Rank Fusion (RRF) │
│ • BGE Semantic Search │
│ • Adaptive Extraction Triggers │
└──────────────────────────────────────┘
│
┌─────────────────┴──────────────────┐
│ │
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ MCP Knowledge │ │ GAIA React │
│ Server │ │ Frontend │
│ (Port 8200) │ │ (Port 3000 web) │
└──────────────────┘ └──────────────────┘
STORAGE & AGENT LAYER:
┌───────────────────────────────────┐
│ PostgreSQL │
│ • koi_memories (KOI knowledge) │
│ • koi_embeddings (pgvector) │
│ • memories (agent state) │
│ • conversations (agent history) │
└───────────────────────────────────┘
│
▼
┌─────────────────────────────────┐
│ Eliza Agents │
│ (5 AI Agents) │
│ │
│ • Direct SQL for state │
│ • MCP tools for external data │
└─────────────────────────────────┘
▲ ▲
│ │
┌────────────┴───┐ ┌────┴──────────┐
│ Knowledge MCP │ │ Regen MCP │
│ Server │ │ Server │
│ │ │ │
│ Routes to: │ │ Connects to: │
│ • PostgreSQL │ │ • Regen │
│ (pgvector) │ │ Ledger │
│ • Apache Jena │ │ • Blockchain │
│ Fuseki │ │ data │
└────────────────┘ └───────────────┘
▲ ▲
│ │
┌───────┴────────┐ │
│ │ │
PostgreSQL Apache Jena Regen Ledger
(pgvector) Fuseki (Blockchain)
- KOI Sensors: Monitor websites, documents, and other sources
- KOI Coordinator (port 8200): Routes events to the processing pipeline
- KOI Event Bridge v2 (port 8100): Distributes content to processors
  - Handles deduplication, versioning, chunking
  - Routes to both embedding and entity extraction paths
- Embedding Server (port 8090): Generates semantic embeddings
  - Currently using BAAI/bge-large-en-v1.5 (1024 dimensions)
  - Model-agnostic API allows swapping to other models
  - Stores embeddings in PostgreSQL pgvector
- Entity Extractor (PLANNED): Extracts structured data
  - Processes content into JSON-LD/RDF format
  - Extracts entities, relationships, and ontological information
  - Uses unified metabolic ontology (36 classes)
  - Loads RDF triples directly into Apache Jena
- PostgreSQL: Dual-purpose database
  - Stores KOI knowledge (koi_memories, koi_embeddings with pgvector)
  - Stores agent state (memories, conversations, relationships)
- Apache Jena Fuseki (port 3030): SPARQL triplestore
  - Stores RDF triples and OWL ontologies
  - Populated by the Entity Extractor (when implemented)
  - Handles complex ontological/semantic reasoning queries
- Knowledge MCP Server: KOI knowledge query API for agents
  - Routes semantic searches to PostgreSQL pgvector
  - Routes ontological queries to Apache Jena Fuseki
  - Provides a unified knowledge interface via stdio transport
- Regen MCP Server: Blockchain data API for agents
  - Connects to the Regen Ledger blockchain
  - Provides access to on-chain data (carbon credits, ecological state, etc.)
  - Handles blockchain queries and transactions
  - Separate from the knowledge infrastructure
- Eliza Agents: Three connection patterns
  - Direct PostgreSQL: For agent state, conversations, memories
  - Via Knowledge MCP: For KOI knowledge queries (embeddings and ontologies)
  - Via Regen MCP: For blockchain/ledger queries
- RID-based tracking: Each document has a unique Resource Identifier
- Version control: UPDATE events create new versions, preserving history
- Audit trail: Complete provenance tracking with CAT receipts
- Intelligent chunking: 1000 chars with 200 char overlap
- Multi-format support: Handles JSON, HTML, plain text
- Event types: NEW, UPDATE, FORGET with appropriate handling
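A minimal sketch of the chunking parameters above (1000 characters with a 200-character overlap); the Event Bridge's actual splitter may additionally respect sentence or paragraph boundaries.

# Sketch of fixed-size chunking with overlap (1000 chars, 200 overlap).
def chunk_text(text, chunk_size=1000, overlap=200):
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

# Example: a 2,500-character document yields 3 overlapping chunks.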
- Production embeddings: State-of-the-art semantic vectors
- MCP integration: Standard protocol for agent tool use
- Permission filtering: Agent-specific content access control
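To show how the embedding server and pgvector combine for agent queries, here is a hedged sketch of a semantic search over the koi_memories/koi_embeddings tables documented in the Database Schema section; the exact production query may differ.

# Hedged sketch: semantic search via the /encode endpoint + pgvector similarity.
import psycopg2
import requests

BGE_API_URL = "http://localhost:8090/encode"
POSTGRES_URL = "postgresql://postgres:postgres@localhost:5433/eliza"

def semantic_search(query, limit=5):
    embedding = requests.post(BGE_API_URL, json={"text": query}).json()["embedding"]
    vec = "[" + ",".join(str(x) for x in embedding) + "]"  # pgvector text literal
    with psycopg2.connect(POSTGRES_URL) as conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT m.rid, m.metadata, 1 - (e.dim_1024 <=> %s::vector) AS similarity
            FROM koi_embeddings e
            JOIN koi_memories m ON m.id = e.memory_id
            WHERE m.superseded_at IS NULL
            ORDER BY e.dim_1024 <=> %s::vector
            LIMIT %s
            """,
            (vec, vec, limit),
        )
        return cur.fetchall()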
- Dual-table pattern: koi_memories for source documents, memories for chunked content
- No contamination: Clean separation of data sources
- Migration support: Gradual transition from legacy systems
- Full documentation: See STORAGE_ARCHITECTURE.md for details
- Export publication dates from the DB to a JSON mapping: node scripts/export_published_map.js (uses POSTGRES_URL, writes to src/core/published_map.json by default)
- Refine the RDF graph with regx:publishedAt and load into Jena: bash scripts/refine_with_published.sh (uses CONSOLIDATION_PATH, PUBLISHED_MAP_PATH, JENA_DATA_ENDPOINT)
- Enables SPARQL date gating in MCP when present.
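A hedged sketch of the date-gated SPARQL this enables against the Fuseki endpoint; the regx: namespace IRI below is an assumption and should be replaced with the project's actual ontology namespace.

# Hedged sketch: date-gated SPARQL against Fuseki using regx:publishedAt.
# The regx: namespace IRI is an assumption.
import requests

FUSEKI_SPARQL = "http://localhost:3030/koi/sparql"

query = """
PREFIX regx: <http://example.org/regx#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
SELECT ?doc ?published WHERE {
  ?doc regx:publishedAt ?published .
  FILTER (?published >= "2025-01-01T00:00:00Z"^^xsd:dateTime)
}
LIMIT 10
"""

resp = requests.post(
    FUSEKI_SPARQL,
    data=query,
    headers={
        "Content-Type": "application/sparql-query",
        "Accept": "application/sparql-results+json",
    },
)
print(resp.json()["results"]["bindings"])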
- Python 3.8+
- PostgreSQL with pgvector extension (or Docker)
- 4GB+ RAM recommended
./scripts/setup.sh

- Run setup:
./scripts/setup.sh
- Configure OAuth (see config/README.md):
cp /path/to/client_secret.json config/client_secret.json
- Service account setup (for automated ingestion):
  - Share Google Drive folders with: rag-ingestion-bot@koi-sensor.iam.gserviceaccount.com
  - Grant "Viewer" access

Once configured, users authenticate at:

- Auth URL: https://your-domain.com/api/koi/auth/initiate
- Callback: https://your-domain.com/api/koi/auth/callback
- Status: https://your-domain.com/api/koi/auth/status
- Clone and setup:
cd /opt/projects/koi-processor
git pull origin regen-prod
bash scripts/setup.sh
- Configure environment:
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY (required for podcast generation)
- Run migrations:
bash scripts/run_migrations_with_backup.sh
- Start services:
# Start core KOI services
bash scripts/start_all_services.sh
# Start Hybrid RAG API (required for semantic search)
cd /opt/projects/koi-processor && bun koi-query-api.ts &
# Start MCP Knowledge Server (for agent access)
source venv/bin/activate && python3 src/core/koi_knowledge_mcp_server.py &

Service Ports:
- Port 8005: KOI Coordinator
- Port 8090: BGE Embedding Server
- Port 8100: Event Bridge
- Port 8200: MCP Knowledge Server
- Port 8301: Hybrid RAG API (semantic search)
- Port 8400: Content Dashboard
- Access dashboard: Open http://localhost:8400 in your browser
To set up HTTPS access at https://regen.gaiaai.xyz/digests:
# Run the nginx setup script
sudo bash /opt/projects/koi-processor/setup_nginx_digests.sh

This will:
- Install nginx (if needed)
- Configure SSL with Let's Encrypt
- Set up proxy from https://regen.gaiaai.xyz/digests to localhost:8400
- Enable WebSocket support for real-time updates
After setup, the dashboard will be accessible at: https://regen.gaiaai.xyz/digests
# Clone the repository
git clone https://github.com/yourusername/koi-processor.git
cd koi-processor
# Run the setup script
chmod +x scripts/setup.sh
./scripts/setup.sh

The setup script will:
- ✅ Check Python version
- ✅ Create virtual environment
- ✅ Install all dependencies
- ✅ Set up PostgreSQL (optionally with Docker)
- ✅ Run database migrations
- ✅ Create configuration files
- ✅ Set up the monitoring dashboard
source venv/bin/activate
python src/content/content_dashboard.py
# Open http://localhost:8400 in your browser

source venv/bin/activate
python scripts/run_daily_curator.py

source venv/bin/activate
python scripts/run_daily_curator.py status

If you prefer to set up components manually:
# Create database
createdb -U postgres eliza
# Enable pgvector extension
psql -U postgres -d eliza -c "CREATE EXTENSION IF NOT EXISTS vector;"
# Run migrations individually
psql -U postgres -d eliza < migrations/001_create_transformation_receipts.sql
psql -U postgres -d eliza < migrations/002_create_agent_knowledge_permissions.sql
psql -U postgres -d eliza < migrations/003_create_isolated_koi_tables.sql
psql -U postgres -d eliza < migrations/004_add_publication_dates.sql
psql -U postgres -d eliza < migrations/005_create_dashboard_tables.sql

# For testing/development
python bge_server.py
# For production (requires GPU)
# See bge_server_real.py for Hugging Face implementation

# Download and extract Fuseki
wget https://dlcdn.apache.org/jena/binaries/apache-jena-fuseki-4.10.0.tar.gz
tar -xzf apache-jena-fuseki-4.10.0.tar.gz
cd apache-jena-fuseki-4.10.0
# Start Fuseki server
./fuseki-server --loc=/path/to/data --port=3030 /koi
# Or use Docker
docker run -p 3030:3030 -e ADMIN_PASSWORD=admin stain/jena-fuseki

cd bge-mcp-ts
bun install
bun run bge-server.ts

# See separate Regen MCP repository for blockchain integration
# https://github.com/yourusername/regen-mcp-server

Create a .env file in the project root:
# Database
POSTGRES_URL=postgresql://postgres:postgres@localhost:5433/eliza
# Embedding Server
BGE_API_URL=http://localhost:8090/encode
# Event Bridge Configuration
USE_ISOLATED_TABLES=true # Use new deduplication tables
KOI_COORDINATOR_URL=http://localhost:8200
# MCP Server (optional)
MCP_SERVER_PORT=3000
# Logging
LOG_LEVEL=INFO

- 8090: Embedding Server
- 8100: KOI Event Bridge
- 8200: KOI Coordinator
- 3000: MCP Server (stdio transport)
- 3030: Apache Jena Fuseki SPARQL endpoint
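A small sketch of reading these settings in Python; it assumes python-dotenv is installed and uses the variable names from the .env example above.

# Sketch: load the .env settings above (assumes python-dotenv is installed).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

POSTGRES_URL = os.environ["POSTGRES_URL"]
BGE_API_URL = os.getenv("BGE_API_URL", "http://localhost:8090/encode")
USE_ISOLATED_TABLES = os.getenv("USE_ISOLATED_TABLES", "true").lower() == "true"
KOI_COORDINATOR_URL = os.getenv("KOI_COORDINATOR_URL", "http://localhost:8200")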
python bge_server.py # Currently using BGE model
# Server will run on http://localhost:8090

USE_ISOLATED_TABLES=true python koi_event_bridge_v2.py
# Server will run on http://localhost:8100

./fuseki-server --loc=/path/to/data --port=3030 /koi
# SPARQL endpoint will be at http://localhost:3030/koi

cd bge-mcp-ts
bun run bge-server.ts
# Knowledge MCP server handles query routing to PostgreSQL and Apache Jena

# See Regen MCP repository for setup
# Provides blockchain data access to agents

curl -X POST http://localhost:8100/process-koi-event \
-H "Content-Type: application/json" \
-d '{
"event_type": "NEW",
"source_sensor": "website_monitor",
"timestamp": "2025-09-09T12:00:00Z",
"bundle": {
"rid": "sensor.website.example.com.page1",
"cid": "bafyreiabc123...",
"content": {
"text": "This is the content to be processed..."
},
"metadata": {
"title": "Example Page",
"url": "https://example.com/page1"
},
"manifest": {
"version": "1.0.0"
}
}
}'

curl -X POST http://localhost:8100/process-koi-event \
-H "Content-Type: application/json" \
-d '{
"event_type": "UPDATE",
"source_sensor": "website_monitor",
"timestamp": "2025-09-09T13:00:00Z",
"bundle": {
"rid": "sensor.website.example.com.page1",
"cid": "bafyreiabc456...",
"content": {
"text": "This is the UPDATED content..."
},
"metadata": {
"title": "Example Page (Updated)"
},
"manifest": {
"version": "1.0.0"
}
}
}'

# Event Bridge health
curl http://localhost:8100/
# Pipeline statistics
curl http://localhost:8100/stats
# Embedding server test
curl -X POST http://localhost:8090/encode \
-H "Content-Type: application/json" \
-d '{"text": "test embedding"}'
# Apache Jena SPARQL test
curl http://localhost:3030/koi/sparql \
-H "Content-Type: application/sparql-query" \
-d "SELECT * WHERE { ?s ?p ?o } LIMIT 10"The dual MCP Server architecture provides specialized query interfaces:
- Semantic Search (via PostgreSQL pgvector):
  - Agent sends: {"tool": "bge_search", "query": "regenerative agriculture"}
  - Routes to PostgreSQL for embedding similarity search
  - Returns relevant documents with similarity scores
- Ontological Query (via Apache Jena):
  - Agent sends: {"tool": "sparql_query", "query": "SELECT ?entity WHERE..."}
  - Routes to Apache Jena Fuseki
  - Returns RDF triples and relationships
- Hybrid Query (see the sketch below):
  - Combines results from both systems
  - Semantic context from embeddings + ontological relationships
- Blockchain Query:
  - Agent sends: {"tool": "ledger_query", "query": "carbon_credits"}
  - Connects to Regen Ledger
  - Returns on-chain data (credits, attestations, ecological state)
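A hedged sketch of a hybrid query that combines both backends, reusing the semantic_search sketch from the Features section; this is not the Knowledge MCP server's actual implementation.

# Hedged sketch of a hybrid query: pgvector semantic hits + Fuseki triples.
import requests

FUSEKI_SPARQL = "http://localhost:3030/koi/sparql"

def hybrid_query(question, semantic_search):
    # 1. Semantic context: top documents by embedding similarity (see earlier sketch).
    docs = semantic_search(question, limit=5)

    # 2. Ontological context: triples whose literals mention the question (no escaping; sketch only).
    sparql = (
        "SELECT ?s ?p ?o WHERE { ?s ?p ?o . "
        'FILTER (isLiteral(?o) && regex(str(?o), "' + question + '", "i")) } LIMIT 25'
    )
    triples = requests.post(
        FUSEKI_SPARQL,
        data=sparql,
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
    ).json()["results"]["bindings"]

    # 3. Merge: hand both result sets to the agent for fusion/reasoning.
    return {"semantic": docs, "ontological": triples}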
GET / - Returns service status and configuration.
Response:
{
"service": "KOI Event Bridge v2",
"status": "operational",
"version": "2.0.0",
"features": [...],
"isolated_tables": true
}

POST /process-koi-event - Processes a KOI event with deduplication and versioning.
Request Body:
{
"event_type": "NEW|UPDATE|FORGET",
"source_sensor": "string",
"timestamp": "ISO 8601",
"bundle": {
"rid": "unique resource identifier",
"cid": "content identifier",
"content": {},
"metadata": {},
"manifest": {}
}
}

Response:
{
"success": true,
"rid": "string",
"cid": "string",
"chunks_created": 1,
"embeddings_created": 1,
"version": 1,
"previous_version_id": null,
"error": null
}

GET /stats - Returns current pipeline metrics.
POST /encode - Generates a semantic embedding for text (currently using the BGE model).
Request:
{
"text": "content to embed"
}

Response:
{
"embedding": [0.123, -0.456, ...] // 1024 dimensions
}

CREATE TABLE koi_memories (
id UUID PRIMARY KEY,
rid VARCHAR(500) NOT NULL,
cid VARCHAR(500),
version INTEGER DEFAULT 1,
previous_version_id UUID,
event_type VARCHAR(20),
source_sensor VARCHAR(200),
content JSONB,
metadata JSONB,
superseded_at TIMESTAMP,
created_at TIMESTAMP,
UNIQUE(rid, version)
);

CREATE TABLE koi_embeddings (
id SERIAL PRIMARY KEY,
memory_id UUID REFERENCES koi_memories(id),
dim_768 vector(768), -- For alternative models
dim_1024 vector(1024), -- For BGE
dim_1536 vector(1536), -- For OpenAI
created_at TIMESTAMP,
UNIQUE(memory_id)
);

-- Get latest version of all documents
SELECT * FROM current_koi_memories;
-- Get version history for a RID
SELECT * FROM get_koi_memory_history('sensor.website.example.com.page1');
-- Pipeline statistics
SELECT * FROM koi_pipeline_stats;
-- Check for duplicates
SELECT rid, COUNT(*)
FROM koi_memories
WHERE superseded_at IS NULL
GROUP BY rid
HAVING COUNT(*) > 1;

Status: Fully Implemented and Tested
The podcast publishing system generates weekly audio digests from aggregated content using automated audio generation (Podcastfy) or manual export (NotebookLM).
Key Components:
- podcast_publisher.py - RSS 2.0 feed generation with iTunes extensions
- podcastfy_generator.py - Automated audio generation (no manual steps)
- podcast_integration.py - Complete pipeline orchestration
- PODCAST_HOSTING_GUIDE.md - Comprehensive documentation
Features:
- Automated audio generation from weekly digests
- RSS feed with full podcast metadata
- Google Drive backup integration (optional)
- Episode management and versioning
- Configurable voices and conversation styles
Status: Fully Implemented
Aggregates content from past 7 days and generates comprehensive weekly digests.
Status: Fully Implemented
The Daily Content Curator is a specialized processor component that aggregates and curates content for daily X posts and weekly digests.
Architecture Decision:
- Component Type: Processor/Aggregator (NOT a KOI node)
- Location: /koi-processor/daily_curator.py
- Integration: Queries KOI infrastructure rather than acting as a sensor
Key Features:
- Query PostgreSQL for recent koi_memories (24-48 hours)
- Embedding similarity search for trending topic identification
- Stats aggregation from ledger sensor data
- Thread generation (3-5 posts with headline, stat, links, CTA)
- Style guide compliance checking
- JSON output for X bot consumption
Data Flow:
KOI Sensors → Event Bridge → PostgreSQL
↓
Daily Content Curator
↓
X Bot / Weekly Digest
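A minimal sketch of the curator's ingestion step (pulling koi_memories from the last 24-48 hours); trending-topic detection and thread generation are omitted, and details differ from daily_curator_llm.py.

# Minimal sketch of the Daily Content Curator's ingestion step.
import os
import psycopg2

POSTGRES_URL = os.getenv("POSTGRES_URL", "postgresql://postgres:postgres@localhost:5433/eliza")

def recent_memories(hours=48):
    with psycopg2.connect(POSTGRES_URL) as conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT rid, source_sensor, metadata, created_at
            FROM koi_memories
            WHERE created_at >= NOW() - make_interval(hours => %s)
              AND superseded_at IS NULL
            ORDER BY created_at DESC
            """,
            (hours,),
        )
        return cur.fetchall()

# These rows then feed trending-topic detection (embedding similarity)
# and thread generation (3-5 posts), as outlined above.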
python -m pytest tests/

# Start all services
./scripts/start_services.sh
# Run integration tests
python tests/test_integration.py

# Send test event
python scripts/send_test_event.py
# Check if processed
psql -U postgres -d eliza -c "SELECT * FROM koi_pipeline_stats;"- Use environment variables for all configuration
- Enable SSL for PostgreSQL connections
- Use real embedding model instead of mock server
- Set up monitoring (Prometheus metrics available at /metrics)
- Configure log rotation for production logs
# Build image
docker build -t koi-processor .
# Run with environment file
docker run --env-file .env.production koi-processor

[Unit]
Description=KOI Event Bridge v2
After=network.target postgresql.service
[Service]
Type=simple
User=koi
WorkingDirectory=/opt/koi-processor
Environment="USE_ISOLATED_TABLES=true"
ExecStart=/usr/bin/python3 koi_event_bridge_v2.py
Restart=always
[Install]
WantedBy=multi-user.target

- Check if the embedding server is running: curl http://localhost:8090/encode -d '{"text":"test"}'
- Verify the BGE_API_URL environment variable
- Check firewall rules for port 8090
- If NEW events for an existing RID are skipped, deduplication is working as intended
- Use UPDATE event type for changed content
- Check RID uniqueness before sending NEW events
- Verify the pgvector extension: \dx in psql
- Check that the embedding dimension matches the model output
- Review Event Bridge logs for errors
- Adjust chunk size and overlap in configuration
- Implement rate limiting for sensor events
- Consider horizontal scaling with multiple Event Bridge instances
# Enable debug logging
LOG_LEVEL=DEBUG python koi_event_bridge_v2.py
# Check specific component
python -c "from koi_event_bridge_v2 import test_connection; test_connection()"- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- koi-sensors - Sensor implementations
- koi-research - Research and documentation
- GAIA - Eliza AI agent framework
MIT License - see LICENSE file for details
Built with 💚 for the regenerative future