@VladimirKadlec (Contributor) commented Jan 30, 2026

Description

This PR implements a --cache-warmup flag that clears the caches at startup.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

  • Assisted-by: Claude Code

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Summary by CodeRabbit

  • New Features

    • Added cache-warmup mode accessible via a CLI flag to rebuild caches without reading existing entries.
  • Documentation

    • Added local-embeddings and NLP metrics installation instructions for pip and uv-based setups.
    • Included cache-warmup usage example and minor formatting adjustments.
  • Dependencies

    • Added rapidfuzz as an NLP-metrics dependency.
  • Tests

    • Expanded unit-test coverage for cache-warmup behavior, CLI propagation, and evaluation workflows.


@coderabbitai (bot, Contributor) commented Jan 30, 2026

Walkthrough

Adds a cache-warmup mode to the evaluation runner that clears and recreates configured cache directories before evaluation; documentation and pyproject notes updated; unit tests expanded extensively; a new CLI flag (--cache-warmup) and a private helper _clear_caches were added to the runner.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation & Packaging: README.md, pyproject.toml | Added optional NLP-metrics/local-embeddings install instructions and a cache-warmup usage example; added rapidfuzz>=3.0.0 to nlp-metrics dependencies and alternate uv-based install comments. |
| Cache Warmup Implementation: src/lightspeed_evaluation/runner/evaluation.py | Added private _clear_caches(system_config: SystemConfig) to remove/recreate enabled cache dirs; call integrated into run_evaluation() when eval_args.cache_warmup is true; added CLI --cache-warmup flag and imports (Path, shutil, SystemConfig). |
| Tests — Evaluation Runner: tests/unit/runner/test_evaluation.py | Extensive unit tests for _clear_caches() (all/partial/no caches, safety checks, dir creation) and for run_evaluation()/main() CLI flows including propagation of --cache-warmup, error cases, and output behavior. |
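
For readers skimming the summary, the remove-and-recreate behavior described for _clear_caches can be sketched roughly as follows. This is only an illustration, not the code in the PR: the function name `clear_cache_dirs` and the plain `cache_dirs` mapping are hypothetical stand-ins for the real helper, which takes a SystemConfig and reads the enabled caches from it.

```python
import shutil
from pathlib import Path


def clear_cache_dirs(cache_dirs: dict[str, str]) -> list[str]:
    """Remove and recreate each cache directory; return the names cleared.

    `cache_dirs` (cache name -> directory) is a hypothetical stand-in for
    the enabled caches that the real helper reads from SystemConfig.
    """
    cleared = []
    for name, directory in cache_dirs.items():
        path = Path(directory)
        if path.exists():
            shutil.rmtree(path)  # drop every existing cache entry
        # Recreate the directory so later cache writes have a place to land.
        path.mkdir(parents=True, exist_ok=True)
        cleared.append(name)
    return cleared
```

A directory that does not exist yet is simply created, which matches the "dir creation" case the tests are said to cover.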

Sequence Diagram(s)

sequenceDiagram
    participant User as User/CLI
    participant Main as main()
    participant Eval as run_evaluation()
    participant ClearC as _clear_caches()
    participant FS as Filesystem<br/>(cache dirs)

    User->>Main: invoke CLI with --cache-warmup
    Main->>Eval: call run_evaluation(eval_args{cache_warmup: true})
    Eval->>Eval: detect cache_warmup == True
    Eval->>Eval: print "warming caches" message
    Eval->>ClearC: call _clear_caches(system_config)
    ClearC->>ClearC: enumerate enabled caches from SystemConfig
    ClearC->>FS: remove existing cache directories
    ClearC->>FS: recreate empty cache directories
    ClearC-->>Eval: return
    Eval->>Eval: continue loading heavy modules and run evaluation
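
The flag propagation in the diagram above can be sketched with argparse, which exposes `--cache-warmup` as the attribute `cache_warmup`. This is a minimal sketch under assumptions: `build_parser` and `run` are hypothetical names, and the returned list of step names only stands in for the real calls to _clear_caches and the evaluation pipeline.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser exposing only the --cache-warmup flag."""
    parser = argparse.ArgumentParser(description="evaluation runner (illustrative)")
    parser.add_argument(
        "--cache-warmup",
        action="store_true",  # defaults to False when the flag is absent
        help="Clear configured caches before running the evaluation",
    )
    return parser


def run(argv: list[str]) -> list[str]:
    """Return the ordered steps the runner would take for these arguments."""
    args = build_parser().parse_args(argv)
    steps = []
    if args.cache_warmup:  # argparse maps --cache-warmup to cache_warmup
        steps.append("clear_caches")
    steps.append("evaluate")
    return steps
```

Calling `run(["--cache-warmup"])` yields the clearing step before evaluation, while `run([])` goes straight to evaluation.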

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • asamal4
  • tisnik

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'add cache-warmup flag' directly and clearly summarizes the main change: introducing a new CLI flag for cache warmup functionality. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |




@VladimirKadlec VladimirKadlec marked this pull request as ready for review January 30, 2026 10:27

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/lightspeed_evaluation/runner/evaluation.py`:
- Around line 95-99: Replace the print calls in run_evaluation (the cache-warmup
branch using eval_args.cache_warmup) and inside _clear_caches with structured
logging: after configuration loads call setup_logging() from
system_config.logging to get a logger instance, then use logger.info(...)
instead of print(...) for the "Cache warmup mode" message and any output
produced by _clear_caches so messages follow the codebase logging conventions
and levels.
- Around line 17-46: The _clear_caches function must validate each cache_dir
before calling shutil.rmtree to avoid deleting high-level or unexpected
locations; for each (cache_name, cache_dir) in cache_dirs, resolve the path with
Path(cache_dir).resolve() and ensure it is not root, not equal to Path.cwd(),
and is a subdirectory of an allowed base (e.g., contains a trusted segment like
".caches" or matches a configured base cache directory); if any check fails
raise DataValidationError with a clear message referencing cache_name and
cache_dir, otherwise proceed to shutil.rmtree and recreate the directory as
before (keep references to shutil.rmtree, Path.mkdir, cache_dirs, and
_clear_caches when locating code).
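
One possible shape for the path check requested above, as a sketch only. `validate_cache_dir` and its `allowed_base` parameter are hypothetical, and the built-in ValueError stands in for the project's DataValidationError, whose import path is not shown in this conversation.

```python
from pathlib import Path


def validate_cache_dir(cache_name: str, cache_dir: str, allowed_base: str) -> Path:
    """Return the resolved cache path, or raise if it is unsafe to delete.

    ValueError is a stand-in for the project's DataValidationError.
    """
    resolved = Path(cache_dir).resolve()
    base = Path(allowed_base).resolve()
    if resolved == Path(resolved.anchor):
        raise ValueError(f"{cache_name}: refusing to clear filesystem root {cache_dir!r}")
    if resolved == Path.cwd().resolve():
        raise ValueError(f"{cache_name}: refusing to clear working directory {cache_dir!r}")
    # Require a strict subdirectory of the allowed base, not the base itself.
    if base not in resolved.parents:
        raise ValueError(f"{cache_name}: {cache_dir!r} is not inside {allowed_base!r}")
    return resolved
```

The caller would run this for each (cache_name, cache_dir) pair before shutil.rmtree, so a misconfigured path fails loudly instead of deleting an unexpected location.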

Comment on lines +95 to +99
    # Clear caches if cache warmup mode is enabled
    if eval_args.cache_warmup:
        print("\n🔥 Cache warmup mode: Clearing existing caches...")
        _clear_caches(system_config)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how logging is set up and used in the runner to align with existing patterns
rg -n "setup_logging|logging\.getLogger|logger\." src/lightspeed_evaluation/runner -g '*.py' -A 2

Repository: lightspeed-core/lightspeed-evaluation

Length of output: 63


🏁 Script executed:

#!/bin/bash
# First, check the file size and then read it
wc -l src/lightspeed_evaluation/runner/evaluation.py

Repository: lightspeed-core/lightspeed-evaluation

Length of output: 131


🏁 Script executed:

#!/bin/bash
# Read the evaluation.py file to understand logging setup and the context of lines 95-99
cat -n src/lightspeed_evaluation/runner/evaluation.py

Repository: lightspeed-core/lightspeed-evaluation

Length of output: 9300


🏁 Script executed:

#!/bin/bash
# Search for logging patterns in the broader codebase
rg -n "import logging|logging\.getLogger" src/lightspeed_evaluation -g '*.py' | head -20

Repository: lightspeed-core/lightspeed-evaluation

Length of output: 1811


🏁 Script executed:

#!/bin/bash
# Check what log levels are used in similar runner/initialization contexts
rg -n "logger\.(info|debug|warning)" src/lightspeed_evaluation/pipeline/evaluation/pipeline.py src/lightspeed_evaluation/core/system/loader.py -A 1 | head -30

Repository: lightspeed-core/lightspeed-evaluation

Length of output: 2355


🏁 Script executed:

#!/bin/bash
# Verify the logging setup is available in the loader module
rg -n "logger\s*=" src/lightspeed_evaluation/core/system/loader.py -A 2 | head -10

Repository: lightspeed-core/lightspeed-evaluation

Length of output: 192


🏁 Script executed:

#!/bin/bash
# Check the setup_logging function to understand how to initialize logging in evaluation.py
rg -n "def setup_logging" src/lightspeed_evaluation -A 10

Repository: lightspeed-core/lightspeed-evaluation

Length of output: 1023


Replace print statements with structured logging.
The cache-warmup messaging at lines 97-98 and all output in _clear_caches use print, which bypasses the configured logger. Update to use structured logging at INFO level throughout run_evaluation and _clear_caches to align with established patterns in the codebase (e.g., pipeline.py, processor.py) and the coding guideline requiring structured logging with appropriate log levels. Initialize the logger using setup_logging() from system_config.logging after configuration loads.

🤖 Prompt for AI Agents
In `@src/lightspeed_evaluation/runner/evaluation.py` around lines 95 - 99, Replace
the print calls in run_evaluation (the cache-warmup branch using
eval_args.cache_warmup) and inside _clear_caches with structured logging: after
configuration loads call setup_logging() from system_config.logging to get a
logger instance, then use logger.info(...) instead of print(...) for the "Cache
warmup mode" message and any output produced by _clear_caches so messages follow
the codebase logging conventions and levels.
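
A minimal sketch of the print-to-logger swap suggested here, using only the standard library. It assumes a module-level logger obtained via logging.getLogger, deliberately leaves out setup_logging() because its signature is not shown in this conversation, and uses a hypothetical helper name `announce_cache_warmup`.

```python
import logging

# Module-level logger; handlers and levels are assumed to be configured
# elsewhere (for example by the project's setup_logging()).
logger = logging.getLogger(__name__)


def announce_cache_warmup(cache_warmup: bool) -> bool:
    """Log the cache-warmup notice at INFO level instead of printing it."""
    if cache_warmup:
        logger.info("Cache warmup mode: clearing existing caches")
    return cache_warmup
```

Routing the message through the logger means it respects the configured level and handlers, whereas print always writes to stdout.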
