Core Concepts
Understanding FACT's architecture and core concepts will help you leverage its full potential for high-performance data processing.
FACT (Fast Augmented Context Tools) is organized into four layers, shown below from the user interface down to storage:
┌─────────────────────────────────────────────────────────┐
│ User Interface │
│ (CLI / API / Library / Web Interface) │
└────────────────────┬────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────┐
│ Processing Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────────┐ │
│ │ Templates │ │ Query │ │ Cache │ │
│ │ Registry │ │ Processor │ │ Manager │ │
│ └──────────────┘ └──────────────┘ └────────────────┘ │
└────────────────────┬────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────┐
│ Core Engine │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────────┐ │
│ │ FACT │ │ Security │ │ Tools │ │
│ │ Driver │ │ Manager │ │ Executor │ │
│ └──────────────┘ └──────────────┘ └────────────────┘ │
└────────────────────┬────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────┐
│ Storage Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────────┐ │
│ │ Database │ │ File System │ │ Remote Storage │ │
│ │ (SQLite) │ │ (Cache) │ │ (Arcade.dev) │ │
│ └──────────────┘ └──────────────┘ └────────────────┘ │
└─────────────────────────────────────────────────────────┘
Cognitive templates are pre-built processing patterns optimized for specific types of data analysis.
Templates are reusable configurations that define:
- Input Schema - Expected data structure
- Processing Operations - Transform, analyze, filter, aggregate
- Output Format - Result structure
- Performance Hints - Optimization strategies
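As a rough illustration, the four parts of a template could be modeled as a small data structure with a validation step. The `Template` class, its field names, and `ALLOWED_OPERATIONS` are hypothetical and only mirror the list above; they are not FACT's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical set of operation types, taken from the list above
ALLOWED_OPERATIONS = {"transform", "analyze", "filter", "aggregate"}


@dataclass
class Template:
    """Illustrative template container (not FACT's real class)."""
    name: str
    input_schema: dict                 # expected data structure
    operations: list                   # ordered processing steps
    output_format: str = "json"        # result structure
    performance_hints: dict = field(default_factory=dict)

    def validate(self) -> None:
        """Raise ValueError if any operation type is unknown."""
        for op in self.operations:
            if op.get("type") not in ALLOWED_OPERATIONS:
                raise ValueError(f"unknown operation type: {op.get('type')!r}")


template = Template(
    name="example",
    input_schema={"amount": "number"},
    operations=[{"type": "filter", "config": {}}, {"type": "aggregate", "config": {}}],
    performance_hints={"ttl": 3600},
)
template.validate()  # passes silently: both operation types are allowed
```

Validating the operation types up front means a misspelled step is rejected before any data is processed.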
The pre-built templates include:
- analysis-basic
  - Statistical analysis (sum, average, min, max)
  - Best for numerical datasets
  - Sub-50ms processing time
- pattern-detection
  - Identifies trends and patterns
  - Anomaly detection
  - Time-series analysis
- data-aggregation
  - Grouping and summarization
  - Multi-dimensional aggregation
  - High-performance for large datasets
- quick-transform
  - Fast data transformation
  - Optimized for caching
  - Minimal processing overhead
{
  "name": "financial-analysis",
  "description": "Comprehensive financial data analysis",
  "operations": [
    {
      "type": "transform",
      "config": {
        "normalize_currency": true,
        "calculate_percentages": true
      }
    },
    {
      "type": "analyze",
      "config": {
        "metrics": ["revenue_growth", "profit_margin", "roi"],
        "time_period": "quarterly"
      }
    },
    {
      "type": "aggregate",
      "config": {
        "group_by": ["sector", "quarter"],
        "calculations": ["sum", "average", "trend"]
      }
    }
  ],
  "cache_hints": {
    "ttl": 3600,
    "key_pattern": "finance_{sector}_{quarter}"
  }
}

FACT's caching system is central to its performance, providing sub-100ms response times.
┌─────────────────────────────────────────┐
│ Cache Manager │
├─────────────────────────────────────────┤
│ ┌─────────────┐ ┌────────────────┐ │
│ │ Hot Cache │ │ Cold Cache │ │
│ │ (Memory) │ │ (Disk) │ │
│ └─────────────┘ └────────────────┘ │
├─────────────────────────────────────────┤
│ Eviction Policy (LRU) │
├─────────────────────────────────────────┤
│ Cache Statistics │
└─────────────────────────────────────────┘
- Multi-Tier Caching
  - Memory cache for hot data
  - Disk cache for warm data
  - Remote cache for cold data
- Smart Eviction
  - LRU (Least Recently Used) policy
  - TTL (Time To Live) support
  - Priority-based retention
- Cache Warming
  - Predictive pre-loading
  - Background refresh
  - Dependency tracking
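To make the eviction behavior concrete, here is a minimal single-tier sketch that combines LRU eviction with a per-entry TTL. The `LRUTTLCache` class and its parameters are assumptions made for illustration, not FACT's cache manager; the `now` argument exists only so the example is deterministic.

```python
import time
from collections import OrderedDict


class LRUTTLCache:
    """Illustrative in-memory cache with LRU eviction and per-entry TTL."""

    def __init__(self, max_entries: int = 128, default_ttl: float = 3600.0):
        self.max_entries = max_entries
        self.default_ttl = default_ttl
        self._data = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if now >= expires_at:
            del self._data[key]        # expired: drop it and report a miss
            return None
        self._data.move_to_end(key)    # mark as most recently used
        return value

    def set(self, key, value, ttl=None, now=None):
        now = time.monotonic() if now is None else now
        ttl = self.default_ttl if ttl is None else ttl
        self._data[key] = (now + ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used entry


cache = LRUTTLCache(max_entries=2)
cache.set("a", 1, now=0)
cache.set("b", 2, now=0)
cache.get("a", now=1)     # touching "a" makes "b" the eviction candidate
cache.set("c", 3, now=1)  # the cache is full, so "b" is evicted
```

A multi-tier setup would chain several such stores, checking the fastest first and promoting entries on a hit.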
import hashlib
import json

def generate_cache_key(query: str, context: dict) -> str:
    """Generate deterministic cache key"""
    # Normalize query so case and surrounding whitespace do not change the key
    normalized = query.lower().strip()
    # Add context; sort_keys makes the serialization order-independent
    context_str = json.dumps(context, sort_keys=True)
    # Generate hash and keep the first 16 hex characters (64 bits)
    key_data = f"{normalized}:{context_str}"
    return hashlib.sha256(key_data.encode()).hexdigest()[:16]

Tools extend FACT's capabilities by providing secure, sandboxed execution of specific operations.
class Tool:
    """Base class for FACT tools"""

    # Declared at class level; concrete tools are expected to set these
    name: str
    description: str
    parameters: dict
    security_level: str

    async def execute(self, params: dict) -> dict:
        """Execute tool with parameters"""
        # Validate parameters
        self._validate(params)
        # Execute in sandbox
        result = await self._sandboxed_execute(params)
        # Validate output
        self._validate_output(result)
        return result
- SQL Query Tool
  - Read-only database queries
  - SQL injection protection
  - Result caching
- Data Transform Tool
  - Format conversion
  - Data cleaning
  - Schema validation
- Analysis Tool
  - Statistical calculations
  - Pattern recognition
  - Trend analysis
- Export Tool
  - Multiple format support
  - Streaming for large datasets
  - Compression options
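The read-only and injection-protection properties of a SQL tool can be sketched with Python's built-in `sqlite3` module. The `run_readonly_query` helper and the `sales` table are hypothetical, and the `SELECT` prefix check is deliberately crude (it would reject a query starting with `WITH`, for example); this is not how FACT's SQL Query Tool is implemented.

```python
import sqlite3


def run_readonly_query(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list:
    """Run a single SELECT statement with bound parameters."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    # Values are passed separately from the SQL text, never interpolated into it
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Demo data in an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sector TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("tech", 120.0), ("energy", 80.0)])
conn.commit()
conn.execute("PRAGMA query_only = ON")  # SQLite-level guard against later writes

# A hostile-looking value is treated as plain data, not as SQL
rows = run_readonly_query(conn, "SELECT * FROM sales WHERE sector = ?", ("tech' OR '1'='1",))
tech = run_readonly_query(conn, "SELECT revenue FROM sales WHERE sector = ?", ("tech",))
```

Because the parameter is bound rather than concatenated, the injection attempt simply matches no rows.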
FACT implements defense-in-depth security:
- Input Validation
  - SQL injection prevention
  - Path traversal protection
  - Command injection blocking
  - Size limits enforcement
- Authentication & Authorization
  - API key validation
  - Role-based access control
  - Token management
  - Session handling
- Sandboxed Execution
  - Resource limits
  - Network isolation
  - File system restrictions
  - Time limits
- Audit Logging
  - Query logging
  - Access tracking
  - Error monitoring
  - Performance metrics
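Two of the input-validation checks, size limits and path traversal protection, are easy to illustrate. The sketch below uses hypothetical helper names and a made-up limit (`MAX_QUERY_BYTES`); it shows the general technique rather than FACT's security manager.

```python
from pathlib import Path

MAX_QUERY_BYTES = 10_000  # hypothetical limit, chosen for illustration


def check_query_size(query: str) -> str:
    """Reject queries whose UTF-8 encoding exceeds the size limit."""
    if len(query.encode("utf-8")) > MAX_QUERY_BYTES:
        raise ValueError("query exceeds size limit")
    return query


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting escapes such as '../'."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes {base}")
    return candidate


safe = resolve_inside("/tmp/fact_demo", "data/report.csv")
```

Resolving the path before comparing it is the important step: it collapses `..` segments and absolute paths, so the check sees where the request would actually land.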
Performance optimization covers three areas:
- Query Optimization
  - Query plan caching
  - Parallel execution
  - Early termination
  - Result streaming
- Memory Management
  - Object pooling
  - Lazy loading
  - Memory-mapped files
  - Garbage collection tuning
- Async Processing
  - Non-blocking I/O
  - Concurrent operations
  - Background tasks
  - Event-driven architecture
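The benefit of non-blocking I/O with concurrent operations can be seen in a small `asyncio` sketch. `fetch_part` is a hypothetical stand-in for any I/O-bound call; it is not part of FACT.

```python
import asyncio


async def fetch_part(name: str, delay: float) -> str:
    """Stand-in for a non-blocking I/O call (database, file, network)."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_concurrently() -> list:
    # The three coroutines wait at the same time, so the total wait is
    # roughly one delay rather than the sum of all three
    return await asyncio.gather(
        fetch_part("sales", 0.1),
        fetch_part("inventory", 0.1),
        fetch_part("customers", 0.1),
    )


results = asyncio.run(run_concurrently())
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of which finishes first.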
from dataclasses import dataclass

@dataclass
class PerformanceMetrics:
    cache_hit_rate: float     # Target: >85%
    avg_response_time: float  # Target: <100ms
    queries_per_second: int   # Target: >100
    memory_usage: int         # Target: <500MB
    cpu_utilization: float    # Target: <70%

1. Request Reception
├── Input validation
├── Authentication
└── Rate limiting
2. Cache Check
├── Generate cache key
├── Check memory cache
├── Check disk cache
└── Return if hit
3. Query Processing
├── Parse query
├── Build execution plan
├── Execute tools
└── Transform results
4. Response Generation
├── Format results
├── Update cache
├── Log metrics
└── Return response
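The four steps above can be sketched as a single function. This is a simplified, synchronous illustration: `handle_request`, the in-process `_cache` dictionary, and the `process` callback are assumptions, and authentication, rate limiting, and metric logging are left out.

```python
import hashlib
import json

_cache = {}  # stand-in for the memory and disk caches


def handle_request(query: str, context: dict, process) -> dict:
    """Walk a query through the four steps; `process` is the expensive step."""
    # 1. Request reception: reject bad input before doing any work
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")

    # 2. Cache check: deterministic key from the normalized query and context
    raw = f"{query.lower().strip()}:{json.dumps(context, sort_keys=True)}"
    key = hashlib.sha256(raw.encode()).hexdigest()[:16]
    if key in _cache:
        return {"result": _cache[key], "cache_hit": True}

    # 3. Query processing
    result = process(query, context)

    # 4. Response generation: store the result for later requests
    _cache[key] = result
    return {"result": result, "cache_hit": False}


calls = []

def fake_process(query, context):
    calls.append(query)
    return len(query)

first = handle_request("Total revenue", {"year": 2024}, fake_process)
second = handle_request("  total REVENUE ", {"year": 2024}, fake_process)
```

The second call differs only in case and whitespace, so it normalizes to the same key and is answered from the cache without calling `fake_process` again.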
- Synchronous Mode
  - Direct request-response
  - Immediate results
  - Best for simple queries
- Asynchronous Mode
  - Non-blocking execution
  - Concurrent processing
  - Best for complex operations
- Streaming Mode
  - Progressive results
  - Lower memory usage
  - Best for large datasets
- Batch Mode
  - Multiple queries together
  - Optimized execution
  - Best for bulk operations
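The difference between the synchronous, streaming, and batch modes can be shown with a toy summation. The function names are hypothetical and the "query" is reduced to adding numbers; asynchronous mode is omitted here.

```python
from typing import Iterable, Iterator


def process_sync(rows: list) -> float:
    """Synchronous mode: take the whole input, return one result."""
    return sum(rows)


def process_streaming(rows: Iterable, chunk_size: int = 2) -> Iterator[float]:
    """Streaming mode: yield a running total after each chunk."""
    total, count = 0.0, 0
    for value in rows:
        total += value
        count += 1
        if count % chunk_size == 0:
            yield total
    if count % chunk_size:
        yield total  # final partial chunk


def process_batch(queries: list) -> list:
    """Batch mode: answer several queries in one call."""
    return [process_sync(q) for q in queries]


data = [1.0, 2.0, 3.0, 4.0, 5.0]
partials = list(process_streaming(iter(data)))
```

The streaming version never holds more than one value and the running total, which is why this mode suits large datasets.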
# REST API pattern
# Assumes a Flask-style app with async view support (for example Flask 2.0+)
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/api/query', methods=['POST'])
async def query_endpoint():
    data = request.json
    driver = await get_driver()
    result = await driver.process_query(data['query'])
    return jsonify(result)

# Event processing pattern
async def process_event(event):
    if event.type == 'data_update':
        # Invalidate related cache
        await cache.invalidate_pattern(f"*{event.entity}*")
        # Process updated data
        result = await driver.process_query(
            f"Analyze {event.entity} changes"
        )
        # Publish results
        await publish_results(result)

# Docker Compose pattern
services:
  fact-processor:
    image: fact-system:latest
    environment:
      - FACT_MODE=microservice
      - CACHE_REDIS_URL=redis://cache:6379
    depends_on:
      - cache
      - database
- Performance Metrics
  - Response times (p50, p95, p99)
  - Throughput (requests/second)
  - Error rates
  - Cache performance
- Resource Metrics
  - CPU usage
  - Memory consumption
  - Disk I/O
  - Network traffic
- Business Metrics
  - Query patterns
  - User engagement
  - Feature usage
  - Cost efficiency
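For the response-time percentiles, a minimal sketch using the standard library's `statistics.quantiles` is shown here. The `latency_percentiles` helper and the sample data are made up for illustration; a production system would more likely read these values from its metrics backend.

```python
import statistics


def latency_percentiles(samples_ms: list) -> dict:
    """Return p50, p95 and p99 of a list of response times in milliseconds."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1 through 99
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


samples = list(range(1, 102))  # synthetic response times: 1..101 ms
summary = latency_percentiles(samples)
```

Tail percentiles such as p95 and p99 expose slow requests that an average would hide.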
# Prometheus metrics
from prometheus_client import Counter, Histogram

fact_query_duration = Histogram(
    'fact_query_duration_seconds',
    'Query processing duration',
    ['query_type', 'cache_hit']
)
fact_cache_operations = Counter(
    'fact_cache_operations_total',
    'Cache operations',
    ['operation', 'result']
)
- Cache First
  - Design for cacheability
  - Use deterministic keys
  - Set appropriate TTLs
- Fail Fast
  - Validate early
  - Set timeouts
  - Provide fallbacks
- Scale Horizontally
  - Stateless design
  - Distributed caching
  - Load balancing
- Monitor Everything
  - Track metrics
  - Log important events
  - Alert on anomalies
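The "set timeouts" and "provide fallbacks" points under Fail Fast can be combined in a few lines of `asyncio`. `slow_query` and `query_with_fallback` are hypothetical names, and the fallback value stands in for something like a stale cached result.

```python
import asyncio


async def slow_query() -> str:
    await asyncio.sleep(1.0)  # simulates a backend that is too slow
    return "fresh result"


async def query_with_fallback(timeout: float, fallback: str) -> str:
    """Return the query result, or `fallback` if it exceeds `timeout` seconds."""
    try:
        return await asyncio.wait_for(slow_query(), timeout=timeout)
    except asyncio.TimeoutError:
        return fallback


answer = asyncio.run(query_with_fallback(timeout=0.05, fallback="cached result"))
```

`asyncio.wait_for` cancels the slow coroutine when the timeout expires, so the caller gets an answer quickly instead of waiting on a struggling dependency.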
# Circuit breaker pattern
import time

class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open"""

class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure = None
        self.state = 'closed'

    async def call(self, func, *args, **kwargs):
        if self.state == 'open':
            # Once the timeout has passed, let one trial call through
            if time.time() - self.last_failure > self.timeout:
                self.state = 'half-open'
            else:
                raise CircuitOpenError()
        try:
            result = await func(*args, **kwargs)
            # A successful trial call closes the circuit again
            if self.state == 'half-open':
                self.state = 'closed'
                self.failures = 0
            return result
        except Exception:
            self.failures += 1
            self.last_failure = time.time()
            if self.failures >= self.failure_threshold:
                self.state = 'open'
            raise
- Distributed Processing
  - Multi-node clusters
  - Federated queries
  - Global cache synchronization
- AI Enhancement
  - Query understanding
  - Automatic optimization
  - Predictive caching
- Advanced Templates
  - ML model integration
  - Custom operators
  - Visual programming
- Real-time Capabilities
  - WebSocket support
  - Server-sent events
  - Change data capture
Understanding these core concepts will help you build efficient, scalable applications with FACT. For implementation details, see the language-specific guides.