Quickwit Exporter


A parallel data exporter for Quickwit with adaptive rate limiting and automatic splitting for large datasets.

Because sometimes you need those logs on your own machine for analysis.

Features

  • ⚡ Parallel Processing - Export multiple days concurrently with configurable worker pools
  • 🚀 Adaptive Rate Limiting - Automatically adjusts request rate (1-20 req/s) based on server response times and errors
  • 📦 Automatic Subdivision - Handles large datasets by intelligently splitting time ranges based on record counts
  • 💾 Compressed Output - Automatic gzip compression of exported JSONL data
  • 🔄 Resume Support - Skip already-exported days and time ranges on restart
  • 🏗️ Work Queue Architecture - Worker-local subdivision queues prevent deadlock
  • 📋 Coverage-Based Aggregation - Detects completion and merges temp files when ranges are fully covered

Installation

Prerequisites

  • Go 1.21 or later

Build from source

git clone https://github.com/miguelbernadi/quickwit-exporter.git
cd quickwit-exporter
go build -o quickwit-exporter cmd/exporter/main.go

Or install directly:

go install github.com/miguelbernadi/quickwit-exporter/cmd/exporter@latest

Usage

Basic Usage

./quickwit-exporter --server https://quickwit.example.com

This will export the last 30 days (default) using query "*" (all records).

Full Options

./quickwit-exporter \
  --server https://quickwit.example.com \
  --index logs \
  --query "level:error" \
  --days 7 \
  --workers 5

Command-Line Flags

Flag         Default                    Description
--server     (required)                 Quickwit server URL
--index      myindex                    Index name to query
--query      *                          Quickwit search query
--days       30                         Number of days to export (counting backwards from today)
--output     quickwit_export_{date}     Output directory path
--temp-dir   {output}/.tmp              Temporary files directory
--workers    3                          Number of parallel workers
--debug      false                      Enable debug logging

Examples

Export with custom time range

./quickwit-exporter \
  --server https://quickwit.example.com \
  --days 7

Export with custom query

./quickwit-exporter \
  --server https://quickwit.example.com \
  --query "level:error AND service:api"

Export specific index

./quickwit-exporter \
  --server https://quickwit.example.com \
  --index logs

High-performance export

For faster exports with powerful servers:

./quickwit-exporter \
  --server https://quickwit.example.com \
  --workers 5

Output Format

Directory Structure

quickwit_export_20251112/
├── export_1731369600-1731456000.jsonl.gz
├── export_1731283200-1731369600.jsonl.gz
├── export_1731196800-1731283200.jsonl.gz
└── ...

Files are named with Unix timestamps: export_{startUnix}-{endUnix}.jsonl.gz

Each file contains all records for a single day in JSONL format (one JSON object per line), compressed with gzip.

File Format

Each line in the uncompressed file is a complete JSON object:

{"timestamp":"2025-11-12T10:23:45Z","message":"API key validated","level":"info",...}
{"timestamp":"2025-11-12T10:24:12Z","message":"User authentication successful","level":"debug",...}

Working with exported data

Decompress and view:

zcat quickwit_export_20251112/export_*.jsonl.gz | head -10

Combine multiple days:

zcat quickwit_export_20251112/export_*.jsonl.gz > combined_all.jsonl

Sort by timestamp:

zcat quickwit_export_20251112/export_*.jsonl.gz | \
  jq -s 'sort_by(.timestamp) | .[]' -c > combined_sorted.jsonl

Filter and analyze:

# Count records by level
zcat export_*.jsonl.gz | jq -r '.level' | sort | uniq -c

# Extract specific fields
zcat export_*.jsonl.gz | jq '{timestamp, level, message}' -c

# Find all errors
zcat export_*.jsonl.gz | jq 'select(.level == "error")'

How It Works

Architecture

Main Process
     ↓
Orchestrator (splits time range into ≤1 day chunks)
     ↓
Shared Work Queue → [Day1, Day2, Day3, ...]
     ↓
Workers (N parallel, default: 3)
   ├─ Each worker has local subdivision queue (capacity: 100)
   ├─ Check count for time range
   ├─ If count ≤ 10K: Fetch & write temp file
   └─ If count > 10K: Subdivide & enqueue to local queue
     ↓
Adaptive Rate Limiter (1-20 req/s, adjusts automatically)
     ↓
Quickwit API
     ↓
Compactor (coverage-based aggregation)
   ├─ Monitors completed work items
   ├─ Aggregates when range fully covered
   └─ Produces: export_{startUnix}-{endUnix}.jsonl.gz
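The worker-local subdivision queue can be sketched as a loop that drains its own buffered channel before pulling from the shared queue — this is what keeps oversized items from deadlocking on a full shared channel. All types and names below are illustrative, not the exporter's actual code:

```go
package main

import "fmt"

// work is an illustrative unit of export work (the time range is elided).
type work struct {
	id  string
	big bool // true when the count check says the range must be subdivided
}

// runWorker drains its local subdivision queue before pulling from the
// shared queue, then exits once both queues are empty.
func runWorker(shared <-chan work, done chan<- string) {
	local := make(chan work, 100) // worker-local subdivision queue
	for {
		select {
		case w := <-local:
			process(w, local, done)
		default:
			w, ok := <-shared
			if !ok {
				// shared queue closed: drain remaining local items, then exit
				for len(local) > 0 {
					process(<-local, local, done)
				}
				close(done)
				return
			}
			process(w, local, done)
		}
	}
}

func process(w work, local chan<- work, done chan<- string) {
	if w.big {
		// subdivide and requeue locally instead of blocking the shared queue
		local <- work{id: w.id + ".a"}
		local <- work{id: w.id + ".b"}
		return
	}
	done <- w.id
}

func main() {
	shared := make(chan work, 2)
	shared <- work{id: "day1", big: true}
	shared <- work{id: "day2"}
	close(shared)

	done := make(chan string, 10)
	runWorker(shared, done)
	for id := range done {
		fmt.Println("completed:", id)
	}
}
```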

Handling Large Datasets

The exporter intelligently handles datasets exceeding the 10,000 record fetch limit:

  1. Check count first: Use lightweight count API (MaxHits: 0) to check record count
  2. Smart decision:
    • If count ≤ 10K: Fetch all records directly
    • If count > 10K: Subdivide time range before fetching
  3. Work queue subdivision: Split range and enqueue to worker-local queue
  4. Parallel processing: Each worker processes its own subdivisions
  5. Coverage-based aggregation: Compactor merges temp files when range is fully covered

Example for a day with 35,000 records:

Day (35K records) → Check count
  ↓ Count > 10K, subdivide into 4 quarters
Q1 (8,750 records) → Fetch directly → write temp file
Q2 (8,750 records) → Fetch directly → write temp file
Q3 (8,750 records) → Fetch directly → write temp file
Q4 (8,750 records) → Fetch directly → write temp file
  ↓ All quarters complete, coverage check passes
Compactor → Merge Q1+Q2+Q3+Q4 → Final file (35K records)

Key advantage: the exporter never hits Quickwit's offset limit, because it checks the count first and subdivides proactively.

Adaptive Rate Limiting

The rate limiter automatically adjusts based on server health:

  • Initial rate: 5 requests/second (conservative start)
  • Speed up: If response times < 500ms and error rate < 1%, increase by 20%
  • Slow down: If response times > 2s or error rate > 10%, decrease by 30%
  • Limits: Min 1 req/s, Max 20 req/s

This ensures optimal performance without overwhelming the server.
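The adjustment policy can be sketched as a pure function over the thresholds listed above (`adjustRate` and its parameters are illustrative names, not the exporter's actual API):

```go
package main

import "fmt"

// adjustRate applies the thresholds described above: speed up 20% when
// the server is healthy, back off 30% on slow responses or errors, and
// clamp to the 1-20 req/s band.
func adjustRate(rate, avgLatencyMs, errorRate float64) float64 {
	switch {
	case avgLatencyMs > 2000 || errorRate > 0.10:
		rate *= 0.7 // slow down by 30%
	case avgLatencyMs < 500 && errorRate < 0.01:
		rate *= 1.2 // speed up by 20%
	}
	if rate < 1 {
		rate = 1
	}
	if rate > 20 {
		rate = 20
	}
	return rate
}

func main() {
	rate := 5.0                        // conservative starting rate
	rate = adjustRate(rate, 300, 0.0)  // fast, error-free responses: speed up
	fmt.Println(rate)
	rate = adjustRate(rate, 2500, 0.0) // slow responses: back off
	fmt.Println(rate)
}
```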

Performance

Optimization Tips

  1. Workers: Set to 3-5 for balanced performance (diminishing returns beyond 5)
  2. Adaptive rate limiting: Automatically adjusts between 1-20 req/s based on server health
  3. Network: Run on same cloud region as Quickwit for best performance
  4. Disk I/O: Use fast storage for output directory (SSD recommended)

Performance Characteristics

  • Automatically subdivides large time ranges to handle Quickwit's 10K record limit
  • Worker-local subdivision queues prevent deadlock
  • Coverage-based aggregation enables incremental completion
  • Resume capability allows restarts without re-downloading completed days
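The coverage check behind incremental completion can be sketched as an interval sweep over completed sub-ranges (illustrative names and types, not the exporter's code):

```go
package main

import (
	"fmt"
	"sort"
)

// span is a completed [start, end) sub-range, in Unix seconds.
type span struct{ start, end int64 }

// fullyCovered reports whether the completed spans cover [start, end)
// with no gaps -- the condition under which a compactor could merge
// temp files into the final export file.
func fullyCovered(start, end int64, done []span) bool {
	sort.Slice(done, func(i, j int) bool { return done[i].start < done[j].start })
	cursor := start
	for _, s := range done {
		if s.start > cursor {
			return false // gap before this span
		}
		if s.end > cursor {
			cursor = s.end
		}
	}
	return cursor >= end
}

func main() {
	quarters := []span{{0, 21600}, {21600, 43200}, {43200, 64800}}
	fmt.Println(fullyCovered(0, 86400, quarters)) // false: last quarter missing
	quarters = append(quarters, span{64800, 86400})
	fmt.Println(fullyCovered(0, 86400, quarters)) // true: day fully covered
}
```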

Troubleshooting

Rate limit errors (429)

The adaptive rate limiter should handle this automatically, but if you see persistent 429s:

  • Reduce --workers (try 1 or 2)
  • The rate limiter will automatically slow down on errors

Slow performance

  • Check network latency to Quickwit server
  • Monitor server resource usage
  • Increase --workers if server can handle it
  • Verify disk I/O isn't bottlenecked
  • Check adaptive rate limiter isn't being too conservative

Development

Project Structure

quickwit-exporter/
├── cmd/
│   └── exporter/
│       └── main.go                  # CLI entry point
├── internal/
│   ├── client/
│   │   ├── quickwit.go              # Quickwit API client
│   │   └── quickwit_test.go
│   ├── contextlog/
│   │   ├── contextlog.go            # Context-based logging helpers
│   │   └── contextlog_test.go
│   ├── exporter/
│   │   ├── orchestrator.go          # Main orchestration & coordination
│   │   ├── worker.go                # Work queue processing
│   │   ├── compactor.go             # Coverage-based aggregation
│   │   ├── coverage.go              # Coverage checking logic
│   │   ├── file_writer.go           # JSONL file writing
│   │   └── *_test.go                # Comprehensive test suite
│   └── ratelimit/
│       ├── adaptive.go              # Adaptive rate limiter
│       └── adaptive_test.go
├── .github/workflows/
│   └── pr-checks.yml                # CI/CD pipeline
├── Makefile                         # Build automation
├── CLAUDE.md                        # Development guide
└── README.md

Running Tests

make test          # Run tests with race detection (recommended)
go test ./...      # Basic test run
go test -race ./...  # With race detection

Building for Different Platforms

# Linux
GOOS=linux GOARCH=amd64 go build -o quickwit-exporter-linux cmd/exporter/main.go

# macOS (Intel)
GOOS=darwin GOARCH=amd64 go build -o quickwit-exporter-darwin-amd64 cmd/exporter/main.go

# macOS (Apple Silicon)
GOOS=darwin GOARCH=arm64 go build -o quickwit-exporter-darwin-arm64 cmd/exporter/main.go

# Windows
GOOS=windows GOARCH=amd64 go build -o quickwit-exporter.exe cmd/exporter/main.go

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Guidelines

  1. Follow Go best practices and idioms
  2. Maintain minimal external dependencies
  3. Add tests for new functionality
  4. Update README for user-facing changes
  5. Use meaningful commit messages

License

MIT License

Support

For issues, questions, or feature requests, please open an issue on GitHub.
