Summary
Implement selective context compression to reduce token usage while preserving important information for generation.
Background
The Python REFRAG implementation uses a compress-select-expand pipeline: chunk passages into fixed-size segments, compute importance via query similarity, expand only the top p% of chunks, and compress the rest with LLM summarization.
Reference: refrag_ollama.py:610-625 (REFRAGOllama.compress_and_select)
Features
Chunking
- Split passages into k-token chunks (default: k=64)
- Use GPT-2 tokenizer for chunking (lightweight, consistent)
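A minimal chunking sketch, assuming the tokenizer sits behind a small interface (a Go BPE port such as tiktoken-go could back it; the interface and function names here are illustrative, not an existing API):

package refrag

// Tokenizer is the minimal surface chunking needs; a GPT-2 BPE
// implementation (e.g., a Go tiktoken port) would satisfy it.
type Tokenizer interface {
    Encode(text string) []int
    Decode(tokens []int) string
}

// chunkText splits text into consecutive k-token chunks; the final
// chunk may be shorter than k.
func chunkText(tok Tokenizer, text string, k int) []string {
    if k <= 0 {
        k = 64 // default chunk size from the proposal
    }
    ids := tok.Encode(text)
    var chunks []string
    for start := 0; start < len(ids); start += k {
        end := start + k
        if end > len(ids) {
            end = len(ids)
        }
        chunks = append(chunks, tok.Decode(ids[start:end]))
    }
    return chunks
}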
Importance Scoring
- Encode chunks with query context
- Compute cosine similarity between chunk and query
- Rank chunks by importance
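Scoring reduces to a cosine similarity between the chunk and query embeddings. A sketch, assuming some embedder that returns float32 vectors (the embed parameter is a placeholder, not a committed signature):

package refrag

import "math"

// cosine returns the cosine similarity of two equal-length vectors,
// or 0 when lengths differ or either vector has zero magnitude.
func cosine(a, b []float32) float64 {
    if len(a) != len(b) {
        return 0
    }
    var dot, na, nb float64
    for i := range a {
        dot += float64(a[i]) * float64(b[i])
        na += float64(a[i]) * float64(a[i])
        nb += float64(b[i]) * float64(b[i])
    }
    if na == 0 || nb == 0 {
        return 0
    }
    return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// scoreChunks ranks chunks by similarity to the query embedding.
// embed is any function producing fixed-width embeddings.
func scoreChunks(embed func(string) []float32, query string, chunks []string) []float64 {
    q := embed(query)
    scores := make([]float64, len(chunks))
    for i, c := range chunks {
        scores[i] = cosine(q, embed(c))
    }
    return scores
}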
Selective Expansion
- Expand top p% of chunks (default: p=0.25, i.e., 25%)
- Compress low-importance chunks with LLM (Claude Haiku or similar)
- Build compressed context string
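A sketch of the expand-vs-compress split, with the LLM call abstracted behind a summarize function (the helper names and output format are assumptions, not the Python implementation verbatim):

package refrag

import (
    "sort"
    "strings"
)

// buildCompressedContext keeps the top p fraction of chunks verbatim and
// folds the remainder into a single LLM-generated summary. Expanded
// chunks are emitted in importance order here; preserving original
// document order is a possible refinement.
func buildCompressedContext(
    chunks []string,
    scores []float64,
    p float64, // e.g. 0.25 to expand the top 25%
    summarize func(text string) (string, error), // LLM backend, e.g. a small model like Claude Haiku
) (string, error) {
    // Order chunk indices by descending importance.
    idx := make([]int, len(chunks))
    for i := range idx {
        idx[i] = i
    }
    sort.Slice(idx, func(a, b int) bool { return scores[idx[a]] > scores[idx[b]] })

    nKeep := int(float64(len(chunks)) * p)
    if nKeep < 1 && len(chunks) > 0 {
        nKeep = 1 // always expand at least one chunk
    }

    var expanded, rest []string
    for rank, i := range idx {
        if rank < nKeep {
            expanded = append(expanded, chunks[i])
        } else {
            rest = append(rest, chunks[i])
        }
    }

    out := strings.Join(expanded, "\n\n")
    if len(rest) > 0 {
        summary, err := summarize(strings.Join(rest, "\n\n"))
        if err != nil {
            return "", err
        }
        out += "\n\n[compressed] " + summary
    }
    return out, nil
}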
Performance
- Reduces context from ~10k tokens to ~2k tokens
- Preserves most important information
- Enables longer generation with limited context windows
Implementation Tasks
- Add tokenizer for chunking (use tiktoken or similar)
- Implement chunk importance scoring
- Add LLM compression for low-importance chunks
- Integrate with HybridIndex API
- Add compression metrics/logging
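For the metrics/logging task, one possible shape (field names are illustrative, not part of any existing API):

package refrag

// CompressionStats records what a CompressContext call did, for logging
// and for verifying the claimed token reduction.
type CompressionStats struct {
    InputTokens    int     // tokens across all retrieved chunks
    OutputTokens   int     // tokens in the compressed context
    ChunksTotal    int     // chunks produced
    ChunksExpanded int     // chunks kept verbatim
    Ratio          float64 // OutputTokens / InputTokens
}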
API Design
type CompressionOptions struct {
    ChunkSize      int     // Chunk size in tokens (default: 64)
    SelectionRatio float64 // Fraction of chunks to expand (default: 0.25)
    CompressModel  string  // LLM model for compression (e.g., "claude-haiku")
}

type HybridIndex struct {
    // ... existing fields ...
}
func (h *HybridIndex) CompressContext(results []*SearchResult, query string, opts *CompressionOptions) (string, error)
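A usage sketch of the proposed signature (the Search call, its arguments, and the option values are assumed for illustration):

func buildPrompt(index *HybridIndex, query string) (string, error) {
    opts := &CompressionOptions{
        ChunkSize:      64,
        SelectionRatio: 0.25,
        CompressModel:  "claude-haiku",
    }
    results, err := index.Search(query, 20) // hypothetical retrieval call
    if err != nil {
        return "", err
    }
    // Compress retrieved passages into a ~2k-token context string.
    return index.CompressContext(results, query, opts)
}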
Benefits
- 5x reduction in context tokens (10k → 2k)
- Enables longer generation with limited context
- Preserves query-relevant information
- Proven effective in Python REFRAG