Phase 4 wraps everything into a middleware pipeline. It builds the remaining data structure (Sliding Focus Buffer), the remaining algorithm (Context Budgeting), and the HCEPipeline class that orchestrates all components.
`hce_pipeline.py` (~350 lines)
- `hce_core`: EntityGraph, NodeType, EdgeType, spreading_activation
- `semantic_tree`: SemanticTree, hierarchical_relevance_search
- `entity_extractor`: EntityExtractor
- Standard library: `collections.deque`, `dataclasses`, `time`, `json`, `pathlib`
- `role: str` ("user" or "assistant")
- `content: str`
- `timestamp: float`
The short-term memory component from the architecture.
| Method | Purpose |
|---|---|
| `__init__(max_size)` | Create buffer with max N entries (default 10) |
| `add(role, content)` | Push entry; oldest drops off when full |
| `get_recent(n)` | Get last n entries (or all if n is None) |
| `clear()` | Empty the buffer |
| `size` property | Current entry count |
| `to_text()` | Format buffer as readable text for LLM context |
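The buffer and its entry type can be sketched as follows. This is a minimal illustration of the interface in the table above, assuming `deque(maxlen=...)` is used for the FIFO eviction; the real implementation may differ in details.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional
import time

@dataclass
class FocusEntry:
    role: str       # "user" or "assistant"
    content: str
    timestamp: float = field(default_factory=time.time)

class SlidingFocusBuffer:
    """Short-term memory: a bounded FIFO of recent conversation turns."""

    def __init__(self, max_size: int = 10):
        # deque with maxlen drops the oldest entry automatically when full
        self._entries = deque(maxlen=max_size)

    def add(self, role: str, content: str) -> None:
        self._entries.append(FocusEntry(role, content))

    def get_recent(self, n: Optional[int] = None) -> List[FocusEntry]:
        items = list(self._entries)
        return items if n is None else items[-n:]

    def clear(self) -> None:
        self._entries.clear()

    @property
    def size(self) -> int:
        return len(self._entries)

    def to_text(self) -> str:
        return "\n".join(f"{e.role}: {e.content}" for e in self._entries)
```

Using `deque(maxlen=...)` keeps `add` O(1) and makes the "oldest drops off when full" behavior free.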
- `content: str`
- `source: str` ("graph", "tree", or "buffer")
- `utility: float`
- `token_cost: int`
- `metadata: dict`
Simple heuristic: `len(text.split()) * 4 // 3` (≈1.33 tokens per word).
Pluggable — users can replace with tiktoken.
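The heuristic fits in one line; this sketch shows it as a standalone function (the name `estimate_tokens` is illustrative):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: English text averages ~1.33 tokens per word,
    # so words * 4 // 3 approximates the token count.
    return len(text.split()) * 4 // 3
```

For exact counts against a specific model, this function could be swapped for tiktoken's `encoding.encode`, keeping the same `str -> int` signature.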
Standalone function. Greedy knapsack:
- Compute ratio = utility / token_cost for each candidate
- Sort by ratio descending
- Greedily pack until budget exhausted
- Return selected candidates
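The greedy knapsack above can be sketched directly from the candidate fields. This is a minimal version assuming the `ContextCandidate` fields listed earlier; the `max(token_cost, 1)` guard against zero-cost candidates is an assumption, not necessarily in the real code.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContextCandidate:
    content: str
    source: str        # "graph", "tree", or "buffer"
    utility: float
    token_cost: int
    metadata: dict = field(default_factory=dict)

def context_budgeting(candidates: List[ContextCandidate],
                      budget: int) -> List[ContextCandidate]:
    """Greedy knapsack: pack highest utility-per-token candidates first."""
    # Sort by utility/token ratio, best first (guard against zero cost)
    ranked = sorted(candidates,
                    key=lambda c: c.utility / max(c.token_cost, 1),
                    reverse=True)
    selected, spent = [], 0
    for cand in ranked:
        if spent + cand.token_cost <= budget:
            selected.append(cand)
            spent += cand.token_cost
    return selected
```

Greedy-by-ratio is not optimal for the general 0/1 knapsack, but it is fast, simple, and close enough when individual candidates are small relative to the budget.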
| Method | Purpose |
|---|---|
| `__init__(max_focus, context_budget, graph, tree)` | Initialize with optional pre-built components |
| `retrieve_context(query) -> str` | Run all retrieval algorithms, budget, format context block |
| `update(user_query, ai_response)` | Store interaction in all three structures |
| `build_prompt(user_query) -> str` | Combine [Context Block] + [Focus Buffer] + [User Query] |
| `wrap_chat(chat_func) -> Callable` | Decorator: intercept → enrich → call LLM → update |
| `save(directory)` / `load(directory)` | Persist all state to a directory |
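The `wrap_chat` intercept → enrich → call → update cycle can be illustrated with a deliberately stripped-down stand-in for the pipeline. `MiniPipeline` and its naive history-based `build_prompt` are hypothetical; only the decorator pattern itself mirrors the table above.

```python
from functools import wraps

class MiniPipeline:
    """Hypothetical stand-in for HCEPipeline, showing the wrap_chat pattern."""

    def __init__(self):
        self.history = []

    def build_prompt(self, user_query: str) -> str:
        # Placeholder for real retrieval: just replay recent turns
        context = "\n".join(self.history[-4:])
        return f"[Context]\n{context}\n[Query]\n{user_query}"

    def update(self, user_query: str, ai_response: str) -> None:
        self.history += [f"user: {user_query}", f"assistant: {ai_response}"]

    def wrap_chat(self, chat_func):
        @wraps(chat_func)
        def wrapped(user_query: str) -> str:
            enriched = self.build_prompt(user_query)   # intercept -> enrich
            response = chat_func(enriched)             # call LLM
            self.update(user_query, response)          # store interaction
            return response
        return wrapped
```

Usage: decorate any `prompt -> response` chat function with `@pipe.wrap_chat`, and every call is transparently enriched with context and recorded for future retrieval.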
- Extract entities from query → seed nodes for spreading activation
- Run `spreading_activation(graph, seeds)` → graph candidates
- Run `hierarchical_relevance_search(tree, query)` → tree candidates
- Collect focus buffer entries → buffer candidates
- Merge all into a `ContextCandidate` list
- Run `context_budgeting(candidates, budget)` → selected candidates
- Format selected candidates into a context block string
- Add user query + AI response to focus buffer
- Store combined interaction in semantic tree
- Extract entities from both query and response, update graph
- Import check
- End-to-end: create pipeline, process several queries, verify context retrieval improves
- Context budgeting: verify budget is respected
- wrap_chat: verify decorator enriches prompts and stores responses
- Save/load round-trip