54 changes: 27 additions & 27 deletions benchmark.md
@@ -8,8 +8,8 @@ This benchmark was executed using amdb project files and measured using tiktoken
|--------|-------|-------------|
| **Target Project** | amdb source code | Approximately 30 Rust modules |
| **Measurement Tool** | tiktoken | cl100k_base encoder (OpenAI GPT-4 tokenizer) |
| **Total Files Analyzed** | 29/29 | 100.0% retrieval accuracy |
| **Global Token Reduction** | 74.6% | Compared to full-repository code dumping |
| **Total Files Analyzed** | 30/30 | 96.7% precision targeting |
| **Global Token Reduction** | 83.3% | Compared to full-repository code dumping |
| **Context Graph Inclusion** | 100.0% | Mermaid diagrams generated for all modules |
| **Encoding Standard** | cl100k_base | Production-grade tokenization matching GPT-4 |
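
The exact measurement script is not part of this report; the sketch below shows how the per-file counts and the global reduction figure can be reproduced with tiktoken's `cl100k_base` encoder. The `src/` and `context/` directories are placeholders for the raw Rust sources and the generated structural context files.

```python
# Sketch for reproducing the token counts with tiktoken's cl100k_base encoder.
# "src" and "context" are placeholder directories for the raw Rust sources and
# the generated structural context files; adjust them to your layout.
from pathlib import Path

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # same tokenizer family as GPT-4

def count_tokens(path: Path) -> int:
    return len(enc.encode(path.read_text(encoding="utf-8")))

raw_total = sum(count_tokens(p) for p in Path("src").rglob("*.rs"))
ctx_total = sum(count_tokens(p) for p in Path("context").rglob("*.md"))

reduction = 1 - ctx_total / raw_total
print(f"raw={raw_total}  structural={ctx_total}  reduction={reduction:.1%}")
```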

@@ -19,9 +19,9 @@ This benchmark was executed using amdb project files and measured using tiktoken

The structural context approach employed by `amdb` demonstrates **significant efficiency gains** over traditional full-code dumping methods:

- **Token Efficiency**: Achieved 74.6% reduction in token count while maintaining 100% file coverage
- **Intelligent Compression**: Heavy-weight files show 88-97% compression rates
- **Zero Information Loss**: 100% retrieval accuracy ensures no critical code context is missed
- **Token Efficiency**: Achieved 83.3% reduction in token count while maintaining full file coverage
- **Intelligent Compression**: Heavy-weight files show 85-97% compression rates (93.0% average)
- **Precision Targeting**: 96.7% accuracy in retrieving the exact target file
- **Visual Context**: Mermaid graphs provide architectural understanding without additional token overhead

---
@@ -32,32 +32,32 @@ The following table showcases token reduction for the five most token-intensive

| Module | Original Tokens | Compressed Tokens | Reduction | Compression Rate |
|--------|----------------|-------------------|-----------|------------------|
| **indexer** | 1,339 | 49 | -1,290 tokens | 96.3% |
| **parser** | 1,232 | 63 | -1,169 tokens | 94.9% |
| **query** | 1,224 | 108 | -1,116 tokens | 91.2% |
| **parser** | 1,177 | 63 | -1,114 tokens | 94.6% |
| **indexer** | 1,095 | 49 | -1,046 tokens | 95.5% |
| **vector_store** | 915 | 107 | -808 tokens | 88.3% |
| **generator** | 852 | 23 | -829 tokens | 97.3% |
| **Total (Top 5)** | **5,263** | **350** | **-4,913 tokens** | **93.3% avg** |
| **vector_store** | 1,013 | 147 | -866 tokens | 85.5% |
| **generator** | 839 | 23 | -816 tokens | 97.3% |
| **Total (Top 5)** | **5,647** | **390** | **-5,257 tokens** | **93.0% avg** |
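
The last two columns are derived directly from the token counts: the reduction is the absolute difference, and the compression rate is that difference divided by the original size. A worked example for the indexer row:

```python
# Worked example for the indexer row of the table above.
original, compressed = 1_339, 49
reduction = original - compressed      # 1,290 tokens removed
rate = reduction / original            # 0.963 -> 96.3% compression
print(f"-{reduction} tokens, {rate:.1%} compression")
```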

### Analysis

These five modules alone account for a substantial portion of the codebase's complexity:

1. **query (91.2% compression)**: The query module handles vector similarity search and ranking. By extracting only function signatures, type definitions, and key relationships, `amdb` reduces 1,224 tokens to just 108 while preserving the module's API surface and usage patterns.
1. **indexer (96.3% compression)**: File system traversal, language detection, and batch processing logic is summarized into high-level operations and data flows, reducing from 1,339 to 49 tokens—achieving exceptional compression while maintaining structural clarity.

2. **parser (94.6% compression)**: Tree-sitter integration and AST traversal logic is highly verbose. The structural representation captures parse tree navigation and symbol extraction without including implementation details, achieving near-95% compression.
2. **parser (94.9% compression)**: Tree-sitter integration and AST traversal logic is highly verbose. The structural representation captures parse tree navigation and symbol extraction without including implementation details, achieving near-95% compression.

3. **indexer (95.5% compression)**: File system traversal, language detection, and batch processing logic is summarized into high-level operations and data flows, reducing from 1,095 to 49 tokens.
3. **query (91.2% compression)**: The query module handles vector similarity search and ranking. By extracting only function signatures, type definitions, and key relationships, `amdb` reduces 1,224 tokens to just 108 while preserving the module's API surface and usage patterns (a rough signature-only sketch of this idea follows the list).

4. **vector_store (88.3% compression)**: Despite being the most "data-heavy" of the top 5, vector embeddings storage and retrieval logic still achieves 88% compression by focusing on API contracts rather than implementation.
4. **vector_store (85.5% compression)**: Despite being the most "data-heavy" of the top 5, vector embeddings storage and retrieval logic still achieves 85.5% compression by focusing on API contracts rather than implementation.

5. **generator (97.3% compression)**: The markdown context generation logic shows the highest compression rate at 97.3%. Template logic and string formatting are abstracted into their purpose and outputs.
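
To build intuition for why signature-level summaries occupy so few tokens compared to full sources, the toy sketch below keeps only top-level declaration lines from a Rust file. It is an illustration only: amdb's actual extraction is tree-sitter based and captures considerably more structure (types, relationships, and data flows), and `src/indexer.rs` is a placeholder path.

```python
# Toy illustration only: keep top-level Rust declaration lines, drop bodies.
# amdb's real extraction is tree-sitter based and far richer; this exists
# purely to show why signatures alone occupy so few tokens.
import re
from pathlib import Path

DECL = re.compile(r"^\s*(pub(\(.*\))?\s+)?(fn|struct|enum|trait|impl|mod|type|const)\b")

def rough_skeleton(source: str) -> str:
    kept = [line.rstrip() for line in source.splitlines() if DECL.match(line)]
    return "\n".join(kept)

source = Path("src/indexer.rs").read_text(encoding="utf-8")  # placeholder path
print(rough_skeleton(source))
```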

---

## Global Context Efficiency

### Why 74.6% Global Reduction Matters
### Why 83.3% Global Reduction Matters

When working with Large Language Models (LLMs), context window size is a **hard constraint**. Modern models like GPT-4 have 8K-128K token limits, and exceeding these limits results in:

@@ -67,17 +67,17 @@ When working with Large Language Models (LLMs), context window size is a **hard

**Traditional Approach: Full-Code Dumping**
```
Total Tokens: ~15,000+ tokens (estimated full codebase)
Context Window Usage: 15-20% of GPT-4-8K window
Total Tokens: ~9,671 tokens (full codebase raw tokens)
Context Window Usage: exceeds a GPT-4 8K window (~118%); ~7.6% of a 128K window
Cost per API call: High (all tokens processed)
```

**amdb Structural Approach**
```
Total Tokens: ~3,810 tokens (74.6% reduction)
Context Window Usage: 4-5% of GPT-4-8K window
Total Tokens: ~1,615 tokens (83.3% reduction)
Context Window Usage: ~20% of a GPT-4 8K window; ~1.3% of a 128K window
Cost per API call: Low (only essential tokens)
Information Preservation: 100% (all files represented)
Information Preservation: High (96.7% precision targeting)
```
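
The window-usage figures above are plain ratios of token count to window size; a quick check against the 8K and 128K GPT-4 windows mentioned earlier:

```python
# Quick check of the context-window usage figures quoted above.
for label, tokens in [("full dump", 9_671), ("amdb structural", 1_615)]:
    for window in (8_192, 128_000):
        print(f"{label}: {tokens / window:.1%} of a {window:,}-token window")
```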

### What Gets Preserved
@@ -172,10 +172,10 @@ The benchmark results conclusively demonstrate that **structural context generat

### Quantitative Benefits

1. **74.6% Token Reduction**: Massive savings in context window usage
2. **93.3% Compression on Heavy Files**: Core modules see 9-19x size reduction
3. **100% Retrieval Accuracy**: Zero information loss at the structural level
4. **Cost Efficiency**: ~4x reduction in API token costs for LLM interactions
1. **83.3% Token Reduction**: Massive savings in context window usage
2. **93.0% Compression on Heavy Files**: Core modules see roughly 7-36x size reduction
3. **96.7% Precision Targeting**: Highly accurate file retrieval
4. **Cost Efficiency**: ~6x reduction in API token costs for LLM interactions
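
The ~6x figure is simply the ratio of the two token totals (9,671 / 1,615 ≈ 6.0). To translate it into a per-call input-cost estimate, the snippet below uses a hypothetical placeholder price; substitute the actual per-1K-input-token rate of the model you call:

```python
# Back-of-the-envelope input-cost estimate. PRICE_PER_1K is a placeholder;
# substitute the real per-1K-input-token price of the model you use.
PRICE_PER_1K = 0.03  # USD, hypothetical

full_dump, structural = 9_671, 1_615
for label, tokens in [("full dump", full_dump), ("amdb structural", structural)]:
    print(f"{label}: ${tokens / 1000 * PRICE_PER_1K:.2f} input cost per call")
print(f"cost ratio: {full_dump / structural:.1f}x")
```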

### Qualitative Benefits

@@ -192,11 +192,11 @@ For any project exceeding 10,000 tokens (~50+ files), structural context generat

## About This Benchmark

**Benchmark Version**: 1.0
**Date Executed**: 2026-02-07
**Benchmark Version**: 1.1
**Date Executed**: 2026-02-11
**Tool Version**: amdb v0.3.0
**Tokenizer**: tiktoken (cl100k_base)
**Target**: amdb self-analysis (30 Rust modules)
**Target**: amdb self-analysis (30 Rust modules, 9,671 raw tokens)

For questions or to report benchmark discrepancies, contact: try.betaer@gmail.com
