A parallel data exporter for Quickwit with adaptive rate limiting and automatic splitting for large datasets.
Because sometimes you just need those logs on your own machine for analysis.
- ✨ Parallel Processing - Export multiple days concurrently with configurable worker pools
- 🚀 Adaptive Rate Limiting - Automatically adjusts request rate (1-20 req/s) based on server response times and errors
- 📦 Automatic Subdivision - Handles large datasets by intelligently splitting time ranges based on record counts
- 💾 Compressed Output - Automatic gzip compression of exported JSONL data
- 🔄 Resume Support - Skip already-exported days and time ranges on restart
- 🏗️ Work Queue Architecture - Worker-local subdivision queues prevent deadlock
- 📋 Coverage-Based Aggregation - Detects completion and merges temp files when ranges are fully covered
- Go 1.21 or later
```
git clone https://github.com/miguelbernadi/quickwit-exporter.git
cd quickwit-exporter
go build -o quickwit-exporter cmd/exporter/main.go
```

Or install directly:
```
go install github.com/miguelbernadi/quickwit-exporter/cmd/exporter@latest
```

```
./quickwit-exporter --server https://quickwit.example.com
```

This will export the last 30 days (default) using the query "*" (all records).
```
./quickwit-exporter \
  --server https://quickwit.example.com \
  --index logs \
  --query "level:error" \
  --days 7 \
  --workers 5
```

| Flag | Default | Description |
|---|---|---|
| `--server` | required | Quickwit server URL |
| `--index` | `myindex` | Index name to query |
| `--query` | `*` | Quickwit search query |
| `--days` | `30` | Number of days to export (counting backwards from today) |
| `--output` | `quickwit_export_{date}` | Output directory path |
| `--temp-dir` | `{output}/.tmp` | Temporary files directory |
| `--workers` | `3` | Number of parallel workers |
| `--debug` | `false` | Enable debug logging |
```
./quickwit-exporter \
  --server https://quickwit.example.com \
  --days 7
```

```
./quickwit-exporter \
  --server https://quickwit.example.com \
  --query "level:error AND service:api"
```

```
./quickwit-exporter \
  --server https://quickwit.example.com \
  --index logs
```

For faster exports with powerful servers:

```
./quickwit-exporter \
  --server https://quickwit.example.com \
  --workers 5
```

```
quickwit_export_20251112/
├── export_1731369600-1731456000.jsonl.gz
├── export_1731283200-1731369600.jsonl.gz
├── export_1731196800-1731283200.jsonl.gz
└── ...
```
Files are named with Unix timestamps: `export_{startUnix}-{endUnix}.jsonl.gz`
Each file contains all records for a single day in JSONL format (one JSON object per line), compressed with gzip.
Each line in the uncompressed file is a complete JSON object:

```
{"timestamp":"2025-11-12T10:23:45Z","message":"API key validated","level":"info",...}
{"timestamp":"2025-11-12T10:24:12Z","message":"User authentication successful","level":"debug",...}
```

Decompress and view:

```
zcat quickwit_export_20251112/export_*.jsonl.gz | head -10
```

Combine multiple days:

```
zcat quickwit_export_20251112/export_*.jsonl.gz > combined_all.jsonl
```

Sort by timestamp:
```
zcat quickwit_export_20251112/export_*.jsonl.gz | \
  jq -s 'sort_by(.timestamp) | .[]' -c > combined_sorted.jsonl
```

Filter and analyze:

```
# Count records by level
zcat export_*.jsonl.gz | jq -r '.level' | sort | uniq -c

# Extract specific fields
zcat export_*.jsonl.gz | jq '{timestamp, level, message}' -c

# Find all errors
zcat export_*.jsonl.gz | jq 'select(.level == "error")'
```

```
Main Process
     ↓
Orchestrator (splits time range into ≤1 day chunks)
     ↓
Shared Work Queue → [Day1, Day2, Day3, ...]
     ↓
Workers (N parallel, default: 3)
 ├─ Each worker has a local subdivision queue (capacity: 100)
 ├─ Check count for time range
 ├─ If count ≤ 10K: Fetch & write temp file
 └─ If count > 10K: Subdivide & enqueue to local queue
     ↓
Adaptive Rate Limiter (1-20 req/s, adjusts automatically)
     ↓
Quickwit API
     ↓
Compactor (coverage-based aggregation)
 ├─ Monitors completed work items
 ├─ Aggregates when range fully covered
 └─ Produces: export_{startUnix}-{endUnix}.jsonl.gz
```
The exporter intelligently handles datasets exceeding the 10,000 record fetch limit:
- Check count first: Use lightweight count API (MaxHits: 0) to check record count
- Smart decision:
  - If count ≤ 10K: Fetch all records directly
  - If count > 10K: Subdivide the time range before fetching
- Work queue subdivision: Split range and enqueue to worker-local queue
- Parallel processing: Each worker processes its own subdivisions
- Coverage-based aggregation: Compactor merges temp files when range is fully covered
Example for a day with 35,000 records:
```
Day (35K records) → Check count
         ↓ Count > 10K, subdivide into 4 quarters
Q1 (8,750 records) → Fetch directly → write temp file
Q2 (8,750 records) → Fetch directly → write temp file
Q3 (8,750 records) → Fetch directly → write temp file
Q4 (8,750 records) → Fetch directly → write temp file
         ↓ All quarters complete, coverage check passes
Compactor → Merge Q1+Q2+Q3+Q4 → Final file (35K records)
```
Key advantage: the exporter never hits the offset limit, because it checks the count first and subdivides proactively.
The rate limiter automatically adjusts based on server health:
- Initial rate: 5 requests/second (conservative start)
- Speed up: If response times < 500ms and error rate < 1%, increase by 20%
- Slow down: If response times > 2s or error rate > 10%, decrease by 30%
- Limits: Min 1 req/s, Max 20 req/s
This ensures optimal performance without overwhelming the server.
- Workers: Set to 3-5 for balanced performance (diminishing returns beyond 5)
- Adaptive rate limiting: Automatically adjusts between 1-20 req/s based on server health
- Network: Run on same cloud region as Quickwit for best performance
- Disk I/O: Use fast storage for output directory (SSD recommended)
- Automatically subdivides large time ranges to handle Quickwit's 10K record limit
- Worker-local subdivision queues prevent deadlock
- Coverage-based aggregation enables incremental completion
- Resume capability allows restarts without re-downloading completed days
The adaptive rate limiter should handle this automatically, but if you see persistent 429s:
- Reduce `--workers` (try 1 or 2)
- The rate limiter will automatically slow down on errors
- Check network latency to the Quickwit server
- Monitor server resource usage

If exports are slower than expected:

- Increase `--workers` if the server can handle it
- Verify disk I/O isn't the bottleneck
- Check that the adaptive rate limiter isn't being too conservative
```
quickwit-exporter/
├── cmd/
│   └── exporter/
│       └── main.go              # CLI entry point
├── internal/
│   ├── client/
│   │   ├── quickwit.go          # Quickwit API client
│   │   └── quickwit_test.go
│   ├── contextlog/
│   │   ├── contextlog.go        # Context-based logging helpers
│   │   └── contextlog_test.go
│   ├── exporter/
│   │   ├── orchestrator.go      # Main orchestration & coordination
│   │   ├── worker.go            # Work queue processing
│   │   ├── compactor.go         # Coverage-based aggregation
│   │   ├── coverage.go          # Coverage checking logic
│   │   ├── file_writer.go       # JSONL file writing
│   │   └── *_test.go            # Comprehensive test suite
│   └── ratelimit/
│       ├── adaptive.go          # Adaptive rate limiter
│       └── adaptive_test.go
├── .github/workflows/
│   └── pr-checks.yml            # CI/CD pipeline
├── Makefile                     # Build automation
├── CLAUDE.md                    # Development guide
└── README.md
```
```
make test            # Run tests with race detection (recommended)
go test ./...        # Basic test run
go test -race ./...  # With race detection
```

```
# Linux
GOOS=linux GOARCH=amd64 go build -o quickwit-exporter-linux cmd/exporter/main.go

# macOS (Intel)
GOOS=darwin GOARCH=amd64 go build -o quickwit-exporter-darwin-amd64 cmd/exporter/main.go

# macOS (Apple Silicon)
GOOS=darwin GOARCH=arm64 go build -o quickwit-exporter-darwin-arm64 cmd/exporter/main.go

# Windows
GOOS=windows GOARCH=amd64 go build -o quickwit-exporter.exe cmd/exporter/main.go
```

Contributions are welcome! Please feel free to submit a Pull Request.
- Follow Go best practices and idioms
- Maintain minimal external dependencies
- Add tests for new functionality
- Update README for user-facing changes
- Use meaningful commit messages
For issues, questions, or feature requests, please open an issue on GitHub.