A specialized sanitization utility designed for infrastructure security and auditing. This tool processes raw connection logs to remove malformed entries and extract high-value data for cleaner reporting.
- Automated Sanitization: Identifies and filters out lines with missing IP addresses or corrupted timestamps.
- Security Focused: Ensures that only well-formed connection data is passed on to downstream analytics tools.
- UV-Powered: Optimized for fast, isolated execution using the
uvPython toolchain.
-
Prepare Data: Place your
raw_connections.login the root directory. -
Execute Cleanup:
uv run clean
-
Data Logic: The script performs a three-point check on every log entry to ensure data quality:
- Validates the presence of a timestamp.
- Ensures a non-empty IP address field.
- Confirms a valid status (e.g., SUCCESS/ERROR)