Add selective context compression for RAG generation #3

@colek42

Summary

Implement selective context compression to reduce token usage while preserving important information for generation.

Background

The Python REFRAG implementation uses a compress-select-expand pipeline: it chunks passages into fixed-size segments, scores each chunk's importance by its similarity to the query, keeps only the top fraction p of chunks in expanded form, and compresses the rest with LLM summarization.

Reference: refrag_ollama.py:610-625 (REFRAGOllama.compress_and_select)

Features

Chunking

  • Split passages into k-token chunks (default: k=64)
  • Use GPT-2 tokenizer for chunking (lightweight, consistent)
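
A minimal sketch of the chunking step, to make the k-token split concrete. Assumptions: whitespace splitting stands in for a real GPT-2 BPE tokenizer, and `chunkTokens` is a hypothetical helper name, not part of the existing code.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkTokens splits tokens into consecutive chunks of at most k tokens.
// The final chunk may be shorter than k. A non-positive k falls back to
// the default of 64 proposed in this issue.
func chunkTokens(tokens []string, k int) [][]string {
	if k <= 0 {
		k = 64
	}
	var chunks [][]string
	for start := 0; start < len(tokens); start += k {
		end := start + k
		if end > len(tokens) {
			end = len(tokens)
		}
		chunks = append(chunks, tokens[start:end])
	}
	return chunks
}

func main() {
	// Assumption: whitespace tokens stand in for GPT-2 BPE tokens.
	tokens := strings.Fields("a b c d e f g")
	for i, c := range chunkTokens(tokens, 3) {
		fmt.Printf("chunk %d: %s\n", i, strings.Join(c, " "))
	}
}
```

With a real tokenizer the same loop would run over token IDs and decode each chunk back to text before scoring.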

Importance Scoring

  • Encode chunks with query context
  • Compute cosine similarity between chunk and query
  • Rank chunks by importance
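
The scoring step could look roughly like the sketch below. It assumes chunk and query embeddings are already available as `[]float64` vectors (the encoder itself is out of scope here); `cosine` and `rankChunks` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of a and b. It returns 0 when the
// lengths differ or either vector has zero norm.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// rankChunks returns chunk indices ordered from most to least similar
// to the query. Ties keep their original relative order.
func rankChunks(query []float64, chunks [][]float64) []int {
	scores := make([]float64, len(chunks))
	order := make([]int, len(chunks))
	for i, c := range chunks {
		scores[i] = cosine(query, c)
		order[i] = i
	}
	sort.SliceStable(order, func(i, j int) bool {
		return scores[order[i]] > scores[order[j]]
	})
	return order
}

func main() {
	query := []float64{1, 0}
	chunks := [][]float64{{0, 1}, {1, 0}, {1, 1}}
	fmt.Println(rankChunks(query, chunks)) // prints [1 2 0]
}
```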

Selective Expansion

  • Expand the top fraction p of chunks (default: p=0.25, i.e., 25%)
  • Compress the remaining low-importance chunks with an LLM (Claude Haiku or similar)
  • Build compressed context string
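
A sketch of how selection and context assembly might fit together. Assumptions not taken from the Python reference: the number of expanded chunks is rounded up, chunks are emitted in their original order, and the LLM call is hidden behind a hypothetical `Compressor` function type so the example runs without a model.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

// Compressor shortens a chunk. Hypothetical hook: in practice it would
// call an LLM; here a stub is injected.
type Compressor func(chunk string) (string, error)

// buildContext keeps the top `ratio` fraction of chunks (by score)
// verbatim, compresses the rest, and joins all parts in original order.
func buildContext(chunks []string, scores []float64, ratio float64, compress Compressor) (string, error) {
	if len(chunks) != len(scores) {
		return "", fmt.Errorf("got %d chunks but %d scores", len(chunks), len(scores))
	}
	order := make([]int, len(chunks))
	for i := range order {
		order[i] = i
	}
	sort.SliceStable(order, func(i, j int) bool {
		return scores[order[i]] > scores[order[j]]
	})

	// Assumption: round up, so at least one chunk is expanded when ratio > 0.
	keep := int(math.Ceil(ratio * float64(len(chunks))))
	if keep < 0 {
		keep = 0
	}
	if keep > len(chunks) {
		keep = len(chunks)
	}
	expand := make(map[int]bool, keep)
	for _, idx := range order[:keep] {
		expand[idx] = true
	}

	parts := make([]string, 0, len(chunks))
	for i, c := range chunks {
		if expand[i] {
			parts = append(parts, c)
			continue
		}
		short, err := compress(c)
		if err != nil {
			return "", fmt.Errorf("compress chunk %d: %w", i, err)
		}
		parts = append(parts, short)
	}
	return strings.Join(parts, "\n"), nil
}

func main() {
	stub := func(c string) (string, error) { return "[summary]", nil }
	out, err := buildContext(
		[]string{"alpha", "beta", "gamma", "delta"},
		[]float64{0.1, 0.9, 0.3, 0.2}, 0.25, stub)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Injecting the compressor keeps the selection logic testable without network access. Adjacent low-importance chunks could also be batched into a single LLM call, which this sketch does not attempt.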

Performance

  • Reduces context from ~10k tokens to ~2k tokens
  • Preserves the most important information
  • Enables longer generation with limited context windows

Implementation Tasks

  • Add tokenizer for chunking (use tiktoken or similar)
  • Implement chunk importance scoring
  • Add LLM compression for low-importance chunks
  • Integrate with HybridIndex API
  • Add compression metrics/logging
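
For the metrics/logging task, one possible shape is a small record of token counts before and after compression. `CompressionMetrics` and its fields are hypothetical names, and the figures in `main` are just the target numbers quoted in this issue, not measurements.

```go
package main

import "fmt"

// CompressionMetrics is a hypothetical record of one compression call.
type CompressionMetrics struct {
	OriginalTokens   int // tokens in the retrieved context before compression
	CompressedTokens int // tokens in the context string after compression
}

// Reduction returns OriginalTokens / CompressedTokens, or 0 when the
// compressed count is not positive.
func (m CompressionMetrics) Reduction() float64 {
	if m.CompressedTokens <= 0 {
		return 0
	}
	return float64(m.OriginalTokens) / float64(m.CompressedTokens)
}

// String formats the metrics for a log line.
func (m CompressionMetrics) String() string {
	return fmt.Sprintf("tokens %d -> %d (%.1fx)",
		m.OriginalTokens, m.CompressedTokens, m.Reduction())
}

func main() {
	// Target figures from this issue, used only for illustration.
	m := CompressionMetrics{OriginalTokens: 10000, CompressedTokens: 2000}
	fmt.Println(m) // prints tokens 10000 -> 2000 (5.0x)
}
```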

API Design

type CompressionOptions struct {
    ChunkSize      int     // Chunk size in tokens (default: 64)
    SelectionRatio float64 // Fraction of chunks to expand (default: 0.25)
    CompressModel  string  // Model used to compress low-importance chunks (e.g., "claude-haiku")
}

type HybridIndex struct {
    // ... existing fields ...
}

func (h *HybridIndex) CompressContext(results []*SearchResult, query string, opts *CompressionOptions) (string, error)
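
Since `opts` is a pointer, `CompressContext` will probably need to handle `nil` and zero-valued fields. A sketch of one way to apply the documented defaults follows; `withDefaults` is a hypothetical helper, and the struct is repeated only so the example is self-contained.

```go
package main

import "fmt"

// CompressionOptions mirrors the struct proposed in this issue.
type CompressionOptions struct {
	ChunkSize      int
	SelectionRatio float64
	CompressModel  string
}

const (
	defaultChunkSize      = 64
	defaultSelectionRatio = 0.25
)

// withDefaults returns a copy of opts in which unset or out-of-range
// fields are replaced by the defaults. A nil opts yields all defaults.
// CompressModel is left untouched because the issue names no default.
func withDefaults(opts *CompressionOptions) CompressionOptions {
	var out CompressionOptions
	if opts != nil {
		out = *opts
	}
	if out.ChunkSize <= 0 {
		out.ChunkSize = defaultChunkSize
	}
	if out.SelectionRatio <= 0 || out.SelectionRatio > 1 {
		out.SelectionRatio = defaultSelectionRatio
	}
	return out
}

func main() {
	fmt.Printf("%+v\n", withDefaults(nil))
	fmt.Printf("%+v\n", withDefaults(&CompressionOptions{ChunkSize: 128, CompressModel: "claude-haiku"}))
}
```

One trade-off of treating zero as "unset": a caller cannot request a ratio of 0 (compress everything). If that case matters, a `*float64` field or an explicit flag would be needed.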

Benefits

  • 5x reduction in context tokens (10k → 2k)
  • Enables longer generation with limited context
  • Preserves query-relevant information
  • Proven effective in Python REFRAG
