Feature Request
Add optional metrics collection for troubleshooting and monitoring long-running processes.
Use Cases
- Debugging performance issues
- Monitoring log processing throughput
- Analyzing cache effectiveness
- Identifying bottlenecks
Proposed Metrics
Processing Metrics
- Lines processed per second
- Total lines processed
- Bytes processed
- Processing latency (p50, p95, p99; a sampling sketch follows these lists)
Cache Metrics
- Cache hit rate (%)
- Cache size (entries)
- Cache hits / misses
Stream Metrics
- Stdout vs stderr line counts
- Lines per log level
- Average line length
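The latency percentiles above imply some form of per-line sampling. A minimal sketch of one way to track them, computing percentiles from raw sorted samples (assumes Go 1.21+ for the slices package; none of these names exist in logwrap yet):

// Sketch: naive percentile tracking. A bounded reservoir or HDR histogram
// would be safer for long runs, since this slice grows without limit.
package metrics

import (
    "slices"
    "time"
)

type latencyTracker struct {
    samples []time.Duration
}

func (t *latencyTracker) Record(d time.Duration) {
    t.samples = append(t.samples, d)
}

// Percentile returns an approximate p-th percentile (0 < p <= 100)
// by indexing into a sorted copy of the samples.
func (t *latencyTracker) Percentile(p float64) time.Duration {
    if len(t.samples) == 0 {
        return 0
    }
    sorted := slices.Clone(t.samples)
    slices.Sort(sorted)
    idx := int(float64(len(sorted)-1) * p / 100)
    return sorted[idx]
}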
Proposed Implementation
Option 1: CLI Flag with Summary
# Enable metrics collection
logwrap --metrics -- long-running-command
# Output at end:
# Metrics Summary:
# Lines processed: 125,432
# Processing rate: 1,234 lines/sec
# Cache hit rate: 94.3%
# Log levels: ERROR=12 WARN=45 INFO=125,375
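A rough sketch of how the flag and end-of-run summary could be wired with the standard library flag package (the import path is hypothetical, NewMetrics is defined under Implementation below, and logwrap's real CLI code may differ):

// cmd/logwrap/main.go (sketch)
package main

import (
    "flag"
    "fmt"
    "os"

    "example.com/logwrap/pkg/metrics" // hypothetical import path
)

func main() {
    metricsEnabled := flag.Bool("metrics", false, "print a metrics summary on exit")
    flag.Parse()
    cmdArgs := flag.Args() // the wrapped command; flag parsing stops at "--"

    m := metrics.NewMetrics()
    _ = cmdArgs // the existing run loop would execute the command here,
    // calling m.RecordLine and m.RecordCacheHit for each processed line

    if *metricsEnabled {
        fmt.Fprintln(os.Stderr, "Metrics Summary:")
        fmt.Fprintln(os.Stderr, m.Summary())
    }
}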
Option 2: Metrics File
# Write metrics to file
logwrap --metrics-file /tmp/logwrap-metrics.json -- command
{
  "duration_seconds": 120,
  "lines_processed": 125432,
  "lines_per_second": 1045.27,
  "bytes_processed": 15234567,
  "cache": {
    "hits": 118234,
    "misses": 7198,
    "hit_rate": 0.943,
    "size": 7198
  },
  "levels": {
    "ERROR": 12,
    "WARN": 45,
    "INFO": 125375
  },
  "streams": {
    "stdout": 125400,
    "stderr": 32
  }
}
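A sketch of how this file could be produced from the Metrics struct defined under Implementation below; the snapshot types and writeMetricsFile are assumptions that simply mirror the schema above:

// Sketch: marshalling a point-in-time snapshot into the Option 2 schema.
package metrics

import (
    "encoding/json"
    "os"
)

type cacheSnapshot struct {
    Hits    uint64  `json:"hits"`
    Misses  uint64  `json:"misses"`
    HitRate float64 `json:"hit_rate"`
    Size    uint64  `json:"size"`
}

type snapshot struct {
    DurationSeconds float64           `json:"duration_seconds"`
    LinesProcessed  uint64            `json:"lines_processed"`
    LinesPerSecond  float64           `json:"lines_per_second"`
    BytesProcessed  uint64            `json:"bytes_processed"`
    Cache           cacheSnapshot     `json:"cache"`
    Levels          map[string]uint64 `json:"levels"`
    Streams         map[string]uint64 `json:"streams"`
}

// writeMetricsFile writes the snapshot as indented JSON, matching the
// example file above.
func writeMetricsFile(path string, s snapshot) error {
    data, err := json.MarshalIndent(s, "", "  ")
    if err != nil {
        return err
    }
    return os.WriteFile(path, append(data, '\n'), 0o644)
}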
Option 3: Periodic Stats
# Print stats every 10 seconds
logwrap --metrics --metrics-interval 10s -- command
# Output:
# [10s] 12,543 lines (1,254/s) - Cache: 92.1% - Levels: E=1 W=5 I=12,537
# [20s] 25,123 lines (1,256/s) - Cache: 93.4% - Levels: E=2 W=8 I=25,113
Implementation
// pkg/metrics/metrics.go
package metrics

import (
    "fmt"
    "sync"
    "time"
)

// Metrics holds thread-safe counters for a single logwrap run.
type Metrics struct {
    StartTime      time.Time
    LinesProcessed uint64
    BytesProcessed uint64
    CacheHits      uint64
    CacheMisses    uint64
    LevelCounts    map[string]uint64
    StreamCounts   map[string]uint64
    mu             sync.Mutex
}

// NewMetrics initializes the maps so the Record methods can increment them.
func NewMetrics() *Metrics {
    return &Metrics{
        StartTime:    time.Now(),
        LevelCounts:  make(map[string]uint64),
        StreamCounts: make(map[string]uint64),
    }
}

// RecordLine updates the per-line counters.
func (m *Metrics) RecordLine(line, level, stream string) {
    m.mu.Lock()
    defer m.mu.Unlock()
    m.LinesProcessed++
    m.BytesProcessed += uint64(len(line))
    m.LevelCounts[level]++
    m.StreamCounts[stream]++
}

// RecordCacheHit records the result of one cache lookup.
func (m *Metrics) RecordCacheHit(hit bool) {
    m.mu.Lock()
    defer m.mu.Unlock()
    if hit {
        m.CacheHits++
    } else {
        m.CacheMisses++
    }
}

// Summary renders a one-line human-readable report, guarding against
// division by zero when no time has elapsed or no cache lookups occurred.
func (m *Metrics) Summary() string {
    m.mu.Lock()
    defer m.mu.Unlock()
    duration := time.Since(m.StartTime).Seconds()
    var lps float64
    if duration > 0 {
        lps = float64(m.LinesProcessed) / duration
    }
    var hitRate float64
    if total := m.CacheHits + m.CacheMisses; total > 0 {
        hitRate = float64(m.CacheHits) / float64(total) * 100
    }
    return fmt.Sprintf(
        "Lines: %d (%.0f/s) - Cache hit rate: %.1f%% - Levels: E=%d W=%d I=%d",
        m.LinesProcessed, lps, hitRate,
        m.LevelCounts["ERROR"], m.LevelCounts["WARN"], m.LevelCounts["INFO"],
    )
}
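For Option 3, a sketch of how the periodic output could be driven; the function signature and stop channel are assumptions, and "os" would need to be added to the imports above:

// reportPeriodically prints the summary at each interval until done is
// closed. Sketch only; how logwrap signals shutdown is an assumption.
func reportPeriodically(m *Metrics, interval time.Duration, done <-chan struct{}) {
    ticker := time.NewTicker(interval)
    defer ticker.Stop()
    for {
        select {
        case <-ticker.C:
            elapsed := time.Since(m.StartTime).Round(time.Second)
            fmt.Fprintf(os.Stderr, "[%s] %s\n", elapsed, m.Summary())
        case <-done:
            return
        }
    }
}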
Configuration
# config.yaml
metrics:
  enabled: false
  output: "stderr"    # where to write metrics
  interval: "10s"     # periodic stats (0 = summary only)
  format: "text"      # text or json
Benefits
Troubleshooting:
- Identify slow processing
- Detect cache inefficiency
- Monitor resource usage
Optimization:
- Measure impact of changes
- Compare configurations
- Validate improvements
Monitoring:
- Track long-running processes
- Alert on performance degradation
- Analyze log patterns
Implementation Checklist
- Create metrics package
- Add Metrics struct with thread-safe counters
- Integrate metrics collection in processor (see the sketch after this checklist)
- Add CLI flag --metrics
- Add summary output at completion
- Add periodic stats option
- Add JSON output format
- Document metrics in README
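For the processor integration item above, the hook could look roughly like this; Processor, detectLevel, and write are stand-ins for whatever logwrap actually uses:

// Sketch: hypothetical hook in the per-line processing path.
type Processor struct {
    metrics *Metrics // nil unless --metrics is enabled
}

func (p *Processor) handleLine(stream, line string) {
    level := p.detectLevel(line)
    if p.metrics != nil {
        p.metrics.RecordLine(line, level, stream)
    }
    p.write(stream, line)
}

func (p *Processor) detectLevel(line string) string { return "INFO" } // stub
func (p *Processor) write(stream, line string)      {}                // stub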
Example Usage
# Basic metrics
$ logwrap --metrics -- ./long-running-app
# ... app output ...
# Metrics: 1,234,567 lines (12,345/s), Cache: 94.2%, E=12 W=456 I=1,234,099
# JSON metrics to file
$ logwrap --metrics-file metrics.json -- ./app
$ cat metrics.json
{"lines": 1234567, "rate": 12345.67, ...}
# Live stats every 5s
$ logwrap --metrics --metrics-interval 5s -- ./app
[5s] 61,728 lines (12,345/s) ...
[10s] 123,456 lines (12,346/s) ...
Related Issues
- Unbounded log level cache causes memory leak in long-running processes #8 - Cache optimization (metrics help validate fixes)
- Optimize string building with strings.Builder #25 - String building (measure performance impact)
- Monitor and optimize mutex lock in hot path #26 - Mutex optimization (measure contention)
References
- Prometheus metrics: https://prometheus.io/docs/practices/naming/
- Go metrics: https://pkg.go.dev/expvar