feat: enrich run record with execution metrics #62
Merged
- adds duration_seconds per layer and for the total run, derived from the existing timestamps in to_dict() at no extra cost
- adds metrics per layer: output_rows (clean/mart), output_bytes (all), tables_count (mart)
- output_rows on mart is defined as the sum of rows across tables
- metrics are additive-only: no contract change for existing consumers
- adds set_layer_metrics() on RunContext and hooks up the runners via cmd_run.py
- targeted tests: default null, persist, round-trip, duration, mart sum

Closes #50
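As a sketch of the fields named above, an enriched run record might look like the following. The exact structure of to_dict() is an assumption; only the field names (duration_seconds, metrics, output_rows, output_bytes, tables_count) come from the PR.

```python
# Hypothetical shape of an enriched run record. duration_seconds is derived
# from the two timestamps already stored on the layer, so no extra clock
# reads are needed at serialization time.
from datetime import datetime


def duration_seconds(started_at: str, finished_at: str) -> float:
    """Derive a duration from two ISO-8601 timestamps."""
    start = datetime.fromisoformat(started_at)
    end = datetime.fromisoformat(finished_at)
    return (end - start).total_seconds()


record = {
    "layers": {
        "clean": {
            "started_at": "2024-01-01T10:00:00+00:00",
            "finished_at": "2024-01-01T10:00:12+00:00",
            # metrics fields default to null when a runner reports nothing
            "metrics": {
                "output_rows": 1200,
                "output_bytes": 48000,
                "tables_count": None,
            },
        }
    },
}

layer = record["layers"]["clean"]
layer["duration_seconds"] = duration_seconds(
    layer["started_at"], layer["finished_at"]
)
print(layer["duration_seconds"])  # 12.0
```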
Closes #50
What changes
toolkit/core/run_context.py
- duration_seconds added to each layer and to the total run, derived from the timestamps already present in to_dict(), zero extra cost
- metrics added to each layer with fields output_rows, output_bytes, tables_count (default null)
- set_layer_metrics() on RunContext

toolkit/cli/cmd_run.py
- _execute_layer captures the runners' return value and calls context.set_layer_metrics() when the runner returns a dict

toolkit/clean/run.py
- _run_sql also returns the row count (SELECT count(*) FROM clean_out before closing the DuckDB connection)
- run_clean returns {"output_rows": N, "output_bytes": M}

toolkit/mart/run.py
- total_rows accumulated per table in the loop (SELECT count(*) FROM {name} while the connection is open)
- run_mart returns {"output_rows": total_rows, "output_bytes": total_bytes, "tables_count": len(written)}
- output_rows is the sum of the rows of the written tables, not a per-table value

toolkit/raw/run.py
- run_raw returns {"output_bytes": sum(files_written[].bytes)}
- output_rows is not populated on raw: raw files, not structured rows

Explicit tradeoffs
- duration_seconds
- output_rows on raw
- input_rows
- sql_size_bytes

Backward compatibility
Records without metrics continue to load correctly. cmd_status and cmd_resume are untouched: they read fields with .get() and are already defensive.

Verification
- pytest: 190 passed
- ruff: clean
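The backward-compatibility claim rests on the defensive-read pattern attributed above to cmd_status and cmd_resume; a minimal sketch, with the record shapes assumed for illustration:

```python
# Old records lack "metrics" and "duration_seconds", so every access goes
# through .get() with a None default instead of raising KeyError.
old_record = {"layers": {"clean": {"status": "ok"}}}  # pre-PR record
new_record = {"layers": {"clean": {"status": "ok",
                                   "duration_seconds": 12.0,
                                   "metrics": {"output_rows": 1200}}}}


def summarize(record: dict) -> list[str]:
    lines = []
    for name, layer in record["layers"].items():
        metrics = layer.get("metrics") or {}  # missing on old records
        lines.append(f"{name}: rows={metrics.get('output_rows')} "
                     f"duration={layer.get('duration_seconds')}")
    return lines


print(summarize(old_record))  # ['clean: rows=None duration=None']
print(summarize(new_record))  # ['clean: rows=1200 duration=12.0']
```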