A clear and concise description of what you want to happen.
What the feature would actually do
Add a mechanism that periodically saves telemetry snapshots such as:
Example JSON:
timestamp: 2026-03-04T21:31:22
node: node-42
gpu: GPU3
job_id: 91283
user: research_team
gpu_utilization: 88
memory_used_gb: 71
temperature: 78
power_watts: 310
ecc_errors: 0
Output formats:
• JSON
• CSV
A clear and concise description of any alternative solutions or features you've considered, if any.
No response
Additional context
No response