AI-powered personal web indexing with privacy-first local search
A privacy-first Chrome extension that automatically indexes your browsing history locally and provides instant AI-powered search from your new tab. Never lose track of that important article or documentation again.
# 1. Clone
git clone https://github.com/ZaynJarvis/newtab && cd newtab
# 2. Configure LLM Provider (choose one):
# Option A: OpenAI (Recommended)
cp backend/.env.example backend/.env
# Edit .env: set API_TOKEN=sk-your-openai-key
# Option B: Keep ARK (Existing Users)
cp backend/.env.ark backend/.env
# Edit .env: set API_TOKEN=your-ark-token
# Option C: Other Providers (Claude, Groq)
# See LLM_PROVIDERS.md for detailed setup
# 3. Start backend
docker compose -f docker-compose.yml up -d
# 4. Load extension: chrome://extensions/ → Developer mode → Load unpacked → 'extension' folder

Full installation guide: INSTALL.md
LLM Provider setup: LLM_PROVIDERS.md
# Test with Docker (recommended)
docker compose exec backend python -m pytest tests/test_simple_backend.py -v
# Or run full test suite
python run_tests.py all

# Test health endpoint
curl http://localhost:8000/health
# Add test content and search
curl -X POST "http://localhost:8000/index" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com", "title": "Test", "content": "Test content"}'
curl "http://localhost:8000/search?q=test"

Full testing guide: E2E_TESTING_GUIDE.md
Choose your preferred AI provider for optimal performance and cost:
| Provider | LLM Support | Embeddings | Speed | Cost | Setup |
|---|---|---|---|---|---|
| OpenAI | ✅ GPT-4 | ✅ text-embedding-3 | ⭐⭐⭐ | ⭐⭐ | Quick setup |
| Claude | ✅ Claude-3 | ❌ | ⭐⭐⭐ | ⭐⭐ | Setup guide |
| Groq | ✅ Llama3 | ❌ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Setup guide |
| ARK | ✅ Custom | ✅ Custom | ⭐⭐⭐ | ⭐⭐⭐ | Internal setup |
Recommendations:
- New users: Start with OpenAI (most reliable)
- Speed focused: Use Groq for LLM + OpenAI for embeddings
- Privacy focused: Deploy local OpenAI-compatible models
Detailed Provider Guide →
- Auto-captures every unique webpage you visit
- AI-generated keywords and descriptions via multiple LLM providers
- Vector embeddings for semantic similarity search
- Background processing - no interruption to browsing
- Frequency tracking with ARC-based visit analytics and page scoring
- Keyword search with SQLite FTS5 full-text indexing
- Semantic search using 2048-dimensional vector embeddings
- LRU cached embeddings for offline resilience and performance
- 3-step fallback strategy for API-independent search reliability
- Frequency-boosted ranking for commonly accessed pages
- ARC-based relevance scoring combining recency and access patterns
- Sub-100ms response times with 600+ requests/second throughput
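The frequency-boosted, recency-aware ranking described above can be sketched as a small scoring function. The weights, the logarithmic dampening, and the 7-day half-life here are illustrative assumptions, not the project's actual parameters:

```python
import math
import time
from typing import Optional

def relevance_score(text_score: float, visit_count: int, last_visit_ts: float,
                    now: Optional[float] = None,
                    half_life_days: float = 7.0) -> float:
    """Boost a base search score by visit frequency and recency (sketch)."""
    if now is None:
        now = time.time()
    # Log-dampened frequency so heavily visited pages don't dominate.
    frequency_boost = math.log1p(visit_count)
    # Exponential recency decay with a configurable half-life.
    age_days = max(0.0, (now - last_visit_ts) / 86400.0)
    recency_boost = 0.5 ** (age_days / half_life_days)
    return text_score * (1.0 + 0.3 * frequency_boost + 0.5 * recency_boost)
```

With equal text relevance, a page visited often and recently outranks a stale or rarely visited one.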
- Adaptive Replacement Cache (ARC) algorithm for intelligent page eviction
- LRU Query Embedding Cache with 1000-query capacity and 7-day TTL
- Visit frequency tracking with automatic count suppression
- Smart re-indexing (only when content is >3 days old)
- Configurable storage limits with automatic cleanup
- Offline-first design for API-independent functionality
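The query-embedding cache above combines LRU eviction with a TTL; a minimal sketch using `OrderedDict`. The 1000-entry capacity and 7-day TTL match the numbers quoted here, but the real implementation additionally uses RLock-based thread safety and JSON persistence:

```python
import time
from collections import OrderedDict

class LRUTTLCache:
    """Minimal LRU cache with per-entry TTL expiry (illustrative sketch)."""

    def __init__(self, capacity=1000, ttl_seconds=7 * 24 * 3600):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self._data = OrderedDict()  # key -> (value, stored_at)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        item = self._data.get(key)
        if item is None:
            return None
        value, stored_at = item
        if now - stored_at > self.ttl:   # expired: drop entry, report a miss
            del self._data[key]
            return None
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key, value, now=None):
        now = time.time() if now is None else now
        self._data[key] = (value, now)
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:   # evict least recently used
            self._data.popitem(last=False)
```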
- 100% local storage - no cloud syncing or external data sharing
- User-controlled exclusions - blacklist sensitive domains
- Complete data ownership - export/import your entire index
- GDPR compliant - you control your data
| Metric | Target | Achieved |
|---|---|---|
| Indexing Response | <100ms | <10ms |
| Keyword Search | <500ms | <5ms |
| Vector Search | <1s | <100ms |
| Throughput | 100 req/s | 600+ req/s |
| Memory Usage | <100MB | 0.06MB/1000 vectors |
graph TB
A[Chrome Extension] --> B[FastAPI Backend]
B --> C[SQLite FTS5]
B --> D[Vector Store]
B --> E[ByteDance Ark API]
subgraph "Local Storage"
C[SQLite FTS5<br/>Keyword Index]
D[In-Memory<br/>Vector Store]
end
subgraph "AI Processing"
E[ByteDance Ark<br/>LLM + Embeddings]
end
| Component | Purpose | Technology |
|---|---|---|
| API Server | RESTful backend service | FastAPI + Uvicorn |
| Database | Keyword indexing & frequency tracking | SQLite FTS5 + ARC metadata |
| Vector Store | Semantic similarity search | NumPy + Cosine similarity |
| Query Cache | LRU embedding cache for offline search | Thread-safe LRU + JSON persistence |
| ARC Cache | Intelligent page eviction | Adaptive Replacement Cache algorithm |
| AI Client | LLM processing & embeddings | Multi-provider (OpenAI, Claude, Groq, ARK) |
| Extension | Browser integration | Chrome Manifest V3 |
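The vector store's core operation is cosine-similarity ranking over in-memory embeddings. A pure-Python illustration of that ranking (the backend uses NumPy, and real embeddings are 2048-dimensional rather than the toy 2-dimensional vectors here):

```python
import math

class VectorStore:
    """Toy in-memory vector store ranking documents by cosine similarity."""

    def __init__(self):
        self._vectors = {}  # url -> embedding (list of floats)

    def add(self, url, embedding):
        self._vectors[url] = embedding

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def search(self, query_embedding, top_k=5):
        # Score every stored page against the query, highest similarity first.
        scored = [(self._cosine(query_embedding, vec), url)
                  for url, vec in self._vectors.items()]
        scored.sort(reverse=True)
        return [(url, score) for score, url in scored[:top_k]]
```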
# Index a webpage
POST /index
{
"url": "https://example.com",
"title": "Page Title",
"content": "Main page content..."
}
# Unified search (keyword + semantic + frequency)
GET /search?q=machine+learning
# Track page visits for frequency analytics
POST /track-visit
{
"url": "https://example.com"
}
# Get frequency analytics
GET /analytics/frequency?days=30
# Manual eviction management
POST /eviction/run
GET /eviction/preview?count=10
GET /eviction/stats
# Query embedding cache management
GET /cache/query/stats
GET /cache/query/top?limit=10
POST /cache/query/clear
POST /cache/query/cleanup
# Health check & system statistics
GET /health
GET /stats

Full API documentation is available when the server is running
The system implements a sophisticated 3-step fallback strategy for embedding-based search:
1. CACHE HIT → Use the cached embedding (instant)
2. API CALL → Generate a new embedding and cache it
3. FALLBACK → Use the keyword search top result's embedding

- LRU Eviction: 1000-query capacity with intelligent eviction
- TTL Expiration: 7-day automatic expiration for freshness
- Thread Safety: Concurrent access with RLock protection
- Persistence: Auto-save every 20 operations to JSON file
- Statistics: Hit/miss rates, access patterns, performance metrics
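The fallback chain can be expressed directly in code. In this sketch, `cache`, `embed_api`, and `keyword_top_hit` are placeholder callables standing in for the real cache, embedding API client, and keyword-search component:

```python
def query_embedding(query, cache, embed_api, keyword_top_hit):
    """Resolve a query embedding via the 3-step fallback chain (sketch)."""
    # Step 1: cache hit -- instant, no network.
    cached = cache.get(query)
    if cached is not None:
        return cached
    # Step 2: call the embedding API and cache the result.
    try:
        embedding = embed_api(query)
        cache.put(query, embedding)
        return embedding
    except Exception:
        pass  # API unreachable: fall through to the keyword fallback
    # Step 3: reuse the embedding of the best keyword-search hit.
    return keyword_top_hit(query)
```

This is what makes search API-independent: even with a cold cache and the embedding API down, step 3 still returns a usable vector.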
# View cache statistics
curl localhost:8000/cache/query/stats
# Get most popular queries
curl localhost:8000/cache/query/top?limit=5
# Clear all cached embeddings
curl -X POST localhost:8000/cache/query/clear
# Remove expired entries
curl -X POST localhost:8000/cache/query/cleanup

- 10x faster repeated searches (cache hits)
- Works offline when the embedding API is down
- Cost reduction by minimizing API calls
- Analytics for query patterns and optimization
New Tab includes optional monitoring and observability features for production deployments:
# Start with observability (Prometheus, Grafana, Loki)
docker compose -f docker-compose.observe.yml up -d
# Access monitoring interfaces
open http://localhost:3000 # Grafana dashboards
open http://localhost:9090 # Prometheus metrics

| Component | Purpose | Port | Technology |
|---|---|---|---|
| Grafana | Visual dashboards & alerting | :3000 | Grafana 10.0 |
| Prometheus | Metrics collection & storage | :9090 | Prometheus 2.45 |
| Loki | Log aggregation & search | :3100 | Loki 2.8 |
| Promtail | Log collection agent | N/A | Promtail 2.8 |
| cAdvisor | Container metrics | :8080 | cAdvisor 0.47 |
| Node Exporter | System metrics | :9100 | Node Exporter 1.6 |
The backend uses comprehensive structured JSON logging:
{
"timestamp": "2025-08-17T12:03:37.665865Z",
"level": "INFO",
"logger": "src.services.api_client",
"message": "Generated and cached new embedding for query",
"extra": {
"query_preview": "machine learning fundamentals",
"embedding_dimension": 2048,
"event": "embedding_generated"
}
}

- API Performance: Request latency, throughput, error rates
- Search Analytics: Query patterns, cache hit rates, response times
- Memory Usage: Vector store efficiency, database growth
- AI Processing: Embedding generation, LLM API calls, error rates
- Cache Performance: Query cache hits/misses, eviction rates
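Log lines in the JSON shape shown above can be produced with a custom `logging.Formatter`. A sketch whose field names follow the example (the project's actual formatter in `core/logging.py` may differ in detail):

```python
import json
import logging
from datetime import datetime, timezone

class JSONFormatter(logging.Formatter):
    """Render each log record as a single-line structured JSON object."""

    def format(self, record):
        payload = {
            "timestamp": datetime.now(timezone.utc)
                         .isoformat().replace("+00:00", "Z"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Extra fields attached via logger.info(..., extra={...})
            "extra": getattr(record, "extra_fields", {}),
        }
        return json.dumps(payload)

logger = logging.getLogger("src.services.api_client")
handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Generated and cached new embedding for query",
            extra={"extra_fields": {"event": "embedding_generated"}})
```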
- Setup Guide: docs/OBSERVABILITY.md
- Monitoring Guide: docs/MONITORING.md
- Dashboard Configs: config/grafana/
# Generate test data (10 realistic web pages)
uv run python demo/test-data-generator.py
# Run validation suite
uv run python demo/quick-test.py
# Test query embedding cache
uv run python backend/test_query_cache.py
# Performance benchmark
uv run python demo/test_backend.py

newtab/
├── backend/                     # Production Ready
│   ├── src/
│   │   ├── main.py              # FastAPI application
│   │   ├── core/
│   │   │   ├── database.py      # SQLite + FTS5 + frequency tracking
│   │   │   ├── logging.py       # Structured JSON logging
│   │   │   └── models.py        # Pydantic models + frequency types
│   │   ├── services/
│   │   │   ├── vector_store.py  # In-memory vector search
│   │   │   └── api_client.py    # ByteDance Ark integration
│   │   ├── cache/
│   │   │   └── query_embedding_cache.py  # LRU cache for query embeddings
│   │   └── api/                 # API endpoints
│   │       ├── indexing.py      # Page indexing
│   │       ├── search.py        # Search endpoints
│   │       └── monitoring.py    # Metrics & observability
│   ├── arc/                     # ARC-based eviction system
│   │   ├── eviction.py          # Eviction policies
│   │   ├── arc_cache.py         # ARC algorithm implementation
│   │   └── utils.py             # Cache utilities
│   └── tests/                   # Test suite
├── config/                      # Observability (add-on)
│   ├── grafana/                 # Dashboard configs
│   ├── prometheus/              # Metrics collection
│   ├── loki/                    # Log aggregation
│   └── promtail/                # Log shipping
├── extension/                   # In Development
│   ├── manifest.json            # Chrome Extension config
│   ├── newtab/                  # New tab override UI
│   └── content/                 # Content extraction scripts
├── docs/                        # Documentation
│   ├── OBSERVABILITY.md         # Monitoring setup guide
│   └── MONITORING.md            # Dashboard guide
├── demo/                        # Complete
│   ├── test-data-generator.py
│   └── quick-test.py
├── docker-compose.yml           # Basic deployment
└── docker-compose.observe.yml   # With monitoring stack
- Docker or Colima (see INSTALL.md)
- Chrome browser
- Python 3.11+ and uv (for local development)
# Development with live reload
docker compose -f docker-compose.dev.yml up --build
# View logs
docker compose -f docker-compose.dev.yml logs -f backend-dev

# Backend development
cd backend
uv sync
uv run uvicorn src.main:app --reload
# Extension development
cd extension
# Load unpacked extension at chrome://extensions

# Optional - ByteDance Ark API integration
export ARK_API_TOKEN="your-api-token-here"
# Without an API token, the system uses mock data for development

Optimized for efficiency - New Tab is designed to be lightweight:
| Component | Memory Usage | Notes |
|---|---|---|
| Docker Container | ~122MB | FastAPI + Python runtime |
| Vector Store | 0.5MB per 1000 pages | In-memory embeddings (auto-evicts at 10k limit) |
| Database | ~60KB per page | SQLite with full-text search |
| Total (1000 pages) | ~125MB | Excellent efficiency |
Scaling estimates:
- 10 pages: ~122MB
- 1,000 pages: ~125MB
- 10,000 pages: ~130MB (with auto-eviction)
Your indexed web pages and search history are stored locally in:
- Docker setup: ./data/backend/web_memory.db
- Local setup: ./backend/web_memory.db
# Backup your data
cp ./data/backend/web_memory.db ./backup_$(date +%Y%m%d).db
# Restore from backup
cp ./backup_20240315.db ./data/backend/web_memory.db
docker compose restart backend

# Check memory usage and stats
curl http://localhost:8000/metrics
# View detailed statistics
curl http://localhost:8000/stats

# Reset all data (Docker)
docker compose down
rm -rf ./data/backend/*
docker compose up -d
# Reset all data (Local)
rm ./backend/web_memory.db ./backend/query_embeddings_cache.json

- Phase 1A: Backend API with AI integration (Complete)
- Phase 1B: Testing infrastructure & validation (Complete)
- Phase 2A: Chrome extension core functionality
- Phase 2B: New tab search interface
- Phase 3: Advanced features (export, analytics, filters)
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: API Docs • Implementation Plan
- Issues: GitHub Issues
- Discussions: GitHub Discussions

⭐ Star this repo if New Tab helps you rediscover the web!

Built with ❤️ for developers who never want to lose that perfect Stack Overflow answer again