This document describes all the new features and options added to the GPU tracker application.
The application now supports comprehensive command-line configuration:
- -interval <seconds>: Set custom sampling interval (default: 5 seconds)
  - Example:
    ./gpuwatch -interval 10
- -db <path>: Specify custom database location
  - Example:
    ./gpuwatch -db /custom/path/gpuwatch.db
- -version: Display version information
  - Example:
    ./gpuwatch -version
- -once: Sample once and exit without starting the TUI
  - Use case: Quick status check or scripting
  - Example:
    ./gpuwatch -once
- -continuous: Continuous background monitoring mode
  - Samples at regular intervals and saves to database
  - No TUI, runs until stopped with Ctrl+C
  - Example:
    ./gpuwatch -continuous -interval 30
- -list-users: List all users currently using GPUs
  - Quick summary of user memory consumption
  - Example:
    ./gpuwatch -list-users
- -export <format>: Export snapshot data (formats: json, csv)
  - JSON: Full structured data export
  - CSV: Tabular format for spreadsheets
  - Example:
    ./gpuwatch -export json
- -output <file>: Specify output file for exports
  - If omitted, prints to stdout
  - Example:
    ./gpuwatch -export csv -output report.csv
- -max-temp <degrees>: GPU temperature alert threshold (default: 90°C)
  - Visual alerts in TUI when exceeded
  - Example:
    ./gpuwatch -max-temp 85
- -max-mem <percent>: Memory utilization alert threshold (default: 95%)
  - Visual alerts in TUI when exceeded
  - Example:
    ./gpuwatch -max-mem 90
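When gpuwatch is driven from a script, the flags listed above can be assembled programmatically. The following is a minimal Python sketch, not part of gpuwatch itself: `build_gpuwatch_args` is a hypothetical helper, the `./gpuwatch` path is an assumption, and the rule that `-output` requires `-export` is this sketch's own choice rather than documented behaviour.

```python
from typing import List, Optional


def build_gpuwatch_args(
    binary: str = "./gpuwatch",  # assumed location of the binary
    once: bool = False,
    interval: Optional[int] = None,
    export: Optional[str] = None,
    output: Optional[str] = None,
    max_temp: Optional[float] = None,
    max_mem: Optional[float] = None,
) -> List[str]:
    """Assemble an argv list using only flags documented above."""
    if export is not None and export not in ("json", "csv"):
        raise ValueError("export format must be 'json' or 'csv'")
    if output is not None and export is None:
        # Sketch-level rule: -output is described as the file for exports.
        raise ValueError("-output is only used together with -export")

    args = [binary]
    if once:
        args.append("-once")
    if interval is not None:
        args += ["-interval", str(interval)]
    if export is not None:
        args += ["-export", export]
    if output is not None:
        args += ["-output", output]
    if max_temp is not None:
        args += ["-max-temp", format(max_temp, "g")]
    if max_mem is not None:
        args += ["-max-mem", format(max_mem, "g")]
    return args


if __name__ == "__main__":
    # Prints the command line only; nothing is executed here.
    print(" ".join(build_gpuwatch_args(export="csv", output="report.csv")))
```

The returned list can be passed to `subprocess.run(args, check=True)` on a machine where the binary is installed.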
The JSON export (-export json) writes complete snapshot data, including:
- Full GPU information (utilization, memory, temperature, power)
- Process details (PID, name, user, memory usage)
- Timestamp information
Example:
./gpuwatch -export json -output snapshot.json
Use cases:
- Integration with monitoring systems
- Data analysis with jq or Python
- API consumption
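As an illustration of the Python use case, the sketch below loads an exported snapshot and derives two summaries. The field names `GPUs`, `Procs`, `User`, `UsedMemMB`, `TempC`, and `UtilMem` are taken from the jq examples elsewhere in this document; the sample payload is invented, the real export likely contains more fields, and treating the array position as the GPU index is an assumption.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Invented sample; only the field names come from the jq examples.
SAMPLE = """
{
  "GPUs": [
    {"TempC": 71, "UtilMem": 42},
    {"TempC": 83, "UtilMem": 91}
  ],
  "Procs": [
    {"User": "alice", "UsedMemMB": 2048},
    {"User": "bob", "UsedMemMB": 512},
    {"User": "alice", "UsedMemMB": 1024}
  ]
}
"""


def memory_by_user(snapshot_json: str) -> Dict[str, int]:
    """Sum UsedMemMB over all processes, grouped by User."""
    snapshot = json.loads(snapshot_json)
    totals: Dict[str, int] = defaultdict(int)
    for proc in snapshot.get("Procs") or []:
        totals[proc["User"]] += proc["UsedMemMB"]
    return dict(totals)


def gpus_over_memory(snapshot_json: str, threshold: float = 80.0) -> List[int]:
    """Return array positions of GPUs whose UtilMem exceeds threshold."""
    snapshot = json.loads(snapshot_json)
    gpus = snapshot.get("GPUs") or []
    return [i for i, gpu in enumerate(gpus) if gpu["UtilMem"] > threshold]


if __name__ == "__main__":
    for user, mem in sorted(memory_by_user(SAMPLE).items()):
        print(f"{user}: {mem} MB")
    print("GPUs over 80% memory:", gpus_over_memory(SAMPLE))
```

To run this against real data, replace `SAMPLE` with the contents of a file written by `./gpuwatch -export json -output snapshot.json`.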
The CSV export (-export csv) writes data in a tabular format suitable for spreadsheets:
- Columns: Timestamp, GPU Index, GPU Name, Utilization %, Memory %, Temperature, Power, PID, Process, User, Memory MB
- Easy to import into Excel, Google Sheets, or process with awk/sed
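Besides spreadsheets and awk/sed, an exported report can be read with Python's standard `csv` module. This is a rough sketch under assumptions: it presumes the file starts with a header row that uses exactly the column names listed above and that numeric cells carry no unit suffix. The sample rows, including the timestamp format, are invented for illustration.

```python
import csv
import io
from typing import Dict

# Invented sample rows; the header reuses the column names listed above.
SAMPLE_CSV = "\n".join([
    "Timestamp,GPU Index,GPU Name,Utilization %,Memory %,Temperature,Power,PID,Process,User,Memory MB",
    "2024-01-01T10:00:00Z,0,NVIDIA GeForce RTX 3090,85,60,71,250,1234,python,alice,2048",
    "2024-01-01T10:00:05Z,0,NVIDIA GeForce RTX 3090,90,62,74,260,1234,python,alice,2100",
    "2024-01-01T10:00:05Z,1,NVIDIA GeForce RTX 3090,10,5,65,80,5678,blender,bob,512",
])


def max_temperature_by_gpu(csv_text: str) -> Dict[int, float]:
    """Return the highest Temperature value seen for each GPU Index."""
    peaks: Dict[int, float] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        gpu = int(row["GPU Index"])
        temp = float(row["Temperature"])
        # Keep the larger of the stored peak and the current reading.
        peaks[gpu] = max(temp, peaks.get(gpu, temp))
    return peaks


if __name__ == "__main__":
    for gpu, temp in sorted(max_temperature_by_gpu(SAMPLE_CSV).items()):
        print(f"GPU {gpu}: peak {temp:.1f}")
```

For a real report, pass the text of the exported file, for example `open("report.csv", newline="").read()`.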
Example:
./gpuwatch -export csv -output report.csv
User filter (f key):
- Cycle through users to filter the view by a specific user
- Shows only processes belonging to the selected user
- An indicator shows the active user filter
- Press f multiple times to cycle through all users
- Clear with c
GPU filter (g key):
- Cycle through GPUs to focus on a specific GPU
- Shows only the selected GPU and its processes
- Useful for multi-GPU systems
- Press g multiple times to cycle through all GPUs
- Clear with c
Sort by memory (m key):
- Toggle sorting of processes by memory usage
- Helps identify memory-intensive processes
- Works in combination with other filters
Clear filters (c key):
- Resets all active filters
- Returns to full system view
- Temperature Alerts: Red warning when GPU temperature exceeds threshold
- Memory Alerts: Red warning when memory utilization exceeds threshold
- Displayed directly in TUI next to GPU stats
- Example:
⚠️ HIGH TEMP 92°C
In one-shot and continuous modes, alerts are printed to stderr:
⚠️ ALERT: GPU 0 (NVIDIA GeForce RTX 3090) temperature 91.0°C exceeds threshold 90.0°C
This allows for easy integration with monitoring scripts and email alerts.
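A monitoring script may want structured values rather than a raw grep match. The sketch below parses the temperature alert line shown above with a regular expression. `TempAlert` and `parse_temp_alert` are hypothetical names, and the pattern is derived only from that single sample line; the format of memory alerts is not shown in this document, so they are not handled.

```python
import re
from typing import NamedTuple, Optional


class TempAlert(NamedTuple):
    gpu_index: int
    gpu_name: str
    temp_c: float
    threshold_c: float


# Pattern inferred from the one sample alert line in this document.
_ALERT_RE = re.compile(
    r"ALERT: GPU (\d+) \((.+)\) temperature ([\d.]+)°C "
    r"exceeds threshold ([\d.]+)°C"
)


def parse_temp_alert(line: str) -> Optional[TempAlert]:
    """Return a TempAlert for a matching stderr line, else None."""
    match = _ALERT_RE.search(line)
    if match is None:
        return None
    return TempAlert(
        gpu_index=int(match.group(1)),
        gpu_name=match.group(2),
        temp_c=float(match.group(3)),
        threshold_c=float(match.group(4)),
    )


if __name__ == "__main__":
    sample = (
        "ALERT: GPU 0 (NVIDIA GeForce RTX 3090) "
        "temperature 91.0°C exceeds threshold 90.0°C"
    )
    print(parse_temp_alert(sample))
```

Because the pattern uses `search`, the leading warning symbol in the real output does not need to be matched explicitly.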
Interactive TUI mode (default):
- Full terminal user interface
- Real-time updates
- History browsing
- Filtering and sorting
Usage:
./gpuwatch
One-shot mode (-once):
- Sample once and display
- Perfect for scripting
- Can be combined with export
Usage:
./gpuwatch -once
./gpuwatch -once -export json
Continuous mode (-continuous):
- Background monitoring
- Automatic database saves
- No TUI overhead
- Alert notifications to stderr
Usage:
./gpuwatch -continuous -interval 60 >> /var/log/gpuwatch.log 2>&1
User list mode (-list-users):
- Quick summary of GPU users
- Shows memory usage per user
- Fast and lightweight
Usage:
./gpuwatch -list-users
Cron job:
# Sample every 10 minutes; -once exits after each sample, so runs do not pile up
*/10 * * * * /usr/local/bin/gpuwatch -once >> /var/log/gpuwatch.log 2>&1
Temperature alert email script:
#!/bin/bash
./gpuwatch -once -max-temp 85 2>&1 | grep "ALERT" && \
echo "GPU temperature alert!" | mail -s "GPU Alert" admin@example.com
Daily CSV report script:
#!/bin/bash
DATE=$(date +%Y-%m-%d)
./gpuwatch -export csv -output "/reports/gpu-report-${DATE}.csv"
Posting metrics to an HTTP API:
# Get current GPU data as JSON
curl -X POST https://api.example.com/gpu-metrics \
-H "Content-Type: application/json" \
-d "$(./gpuwatch -export json)"
Querying the JSON export with jq:
# Get GPU 0 temperature
./gpuwatch -export json | jq '.GPUs[0].TempC'
# Get all users and their memory usage
./gpuwatch -export json | jq '.Procs[] | {user: .User, mem: .UsedMemMB}'
# Check if any GPU is over 80% memory
./gpuwatch -export json | jq '.GPUs[] | select(.UtilMem > 80)'
Existing keyboard shortcuts:
- a: Toggle auto-recording
- r: Refresh snapshot once
- s: Save snapshot manually
- h: Toggle History mode
- ←/→: Previous/Next snapshot (in History)
- ↑/↓: Previous/Next day (in History)
- t: Jump to today/live mode
- q: Quit
- ?: Toggle help overlay
New keyboard shortcuts:
- f: Cycle through users to filter
- g: Cycle through GPUs to filter
- m: Toggle sort by memory usage
- c: Clear all active filters
Example configurations:
# Fast sampling, low thresholds
./gpuwatch -interval 2 -max-temp 70 -max-mem 80
# Moderate sampling, reasonable thresholds
./gpuwatch -interval 30 -max-temp 85 -max-mem 90
# Slow sampling, save resources
./gpuwatch -continuous -interval 300 -db /data/gpuwatch/history.db
# Very fast sampling, strict alerts
./gpuwatch -interval 1 -max-temp 75 -max-mem 85
Breaking changes: None. All new features are opt-in via flags or keyboard shortcuts.
- Default sampling interval remains 5 seconds
- Alert thresholds default to 90°C and 95%
- TUI mode is still the default when no flags are specified
- Database location unchanged: ~/.local/share/gpuwatch/gpuwatch.db
- Existing database files work without modification
- Old keyboard shortcuts remain unchanged
- New shortcuts don't conflict with existing ones