
Implement configurable Karl context compression strategy (heuristic + LLM summarizer) #160

Open
YulYen wants to merge 1 commit into main from codex/implement-karl-context-compression-feature

Conversation


YulYen (Owner) commented on Feb 25, 2026

Motivation

  • Provide a configurable context-compression system to replace the ad-hoc karl_prepare_quick_and_dirty trimming and allow an LLM-based summarization strategy while keeping the original heuristic available.
  • Make all behavior explicit in config.yaml with no hidden defaults or silent fallbacks.
  • Keep the implementation KISS, modular, and testable with clear error paths when LLM summarization fails.

Description

  • Added a new context_management section to config.yaml with strategy: "heuristic" | "karl" and a required karl sub-section containing model, summary_max_tokens, keep_last_messages, and log_dir.
  • Implemented src/core/context_summarizer.py introducing KarlSummarizer (pure service) and KarlSummarizationError, which accept an injected llm_core and raw config and implement summarize(messages: list[dict]) -> list[dict] returning one system summary plus the tail without mutating the original list.
  • Integrated strategy routing into WebUI and TerminalUI so that heuristic preserves existing trimming and karl invokes KarlSummarizer; LLM failures raise KarlSummarizationError and only fall back to heuristic when fallback_strategy: heuristic is explicitly configured.
  • Karl writes its own daily log files logs/karl_YYYY-MM-DD.log containing timestamp, number of messages summarized, summary character length, and model used, without altering global logging configuration.
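Put together, the new `context_management` section would look roughly like this (a sketch: the key names come from this PR's description, while the nesting, the placeholder values, and the placement of `fallback_strategy` are my assumptions):

```yaml
# Illustrative config.yaml excerpt; values are placeholders.
context_management:
  strategy: "karl"                 # "heuristic" | "karl"
  fallback_strategy: "heuristic"   # optional; omit to fail hard on LLM errors
  karl:                            # required sub-section when strategy is "karl"
    model: "qwen2.5:7b"            # hypothetical model name
    summary_max_tokens: 512
    keep_last_messages: 6
    log_dir: "logs"
```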

Testing

  • Added tests/test_context_summarizer.py (validating reduction, tail preservation, no-mutation, model/options handling, and log creation), and updated tests/test_web_ui.py / tests/test_terminal_ui.py to include context_management, assert strategy switching, and verify that Karl is not instantiated under the heuristic strategy.
  • Ran pytest -q tests/test_context_summarizer.py tests/test_web_ui.py tests/test_terminal_ui.py and all tests passed (22 passed).
  • The new behavior enforces explicit config validation paths and raises clear exceptions when required config keys are missing or when summarization fails without an explicit fallback.
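The reduction, tail-preservation, and no-mutation contract tested above can be sketched as follows (a self-contained illustration of the `summarize()` contract, not the actual KarlSummarizer or test code from this PR):

```python
import copy

def summarize(messages, keep_last_messages=2, summary_text="[summary]"):
    """Sketch of the summarize() contract: return one system summary
    plus the untouched tail, without mutating the input list."""
    tail = messages[-keep_last_messages:]
    return [{"role": "system", "content": summary_text}] + list(tail)

# Contract checks mirroring the reduction / tail / no-mutation tests:
msgs = [{"role": "user", "content": f"m{i}"} for i in range(6)]
before = copy.deepcopy(msgs)
out = summarize(msgs, keep_last_messages=2)

assert len(out) == 3              # 1 summary + 2 tail messages
assert out[0]["role"] == "system" # summary leads the compressed context
assert out[1:] == before[-2:]     # tail preserved verbatim
assert msgs == before             # original list not mutated
```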

Codex Task


chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 061d84d55f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

```python
stream = self._llm_core.stream_chat(
    model_name=self._model_name,
    messages=prompt_messages,
    options={"num_predict": self._summary_max_tokens},
)
```

P1: Preserve persona context options in Karl summary calls

The Karl summarization request is sent with options={"num_predict": ...} only, so persona-specific options like num_ctx are dropped for this call even though context compression is triggered specifically in near-limit scenarios. In setups where personas run with a larger num_ctx than the backend default, summarization can fail or truncate before compression happens, causing KarlSummarizationError (and a hard failure when no fallback is configured). Pass through relevant persona options (at least num_ctx) when building the summary request options.


```python
def _append_log_entry(
    self, summarized_count: int, summary_length: int, model_name: str
) -> None:
    Path(self._log_dir).mkdir(parents=True, exist_ok=True)
```

P2: Guard Karl log writes from breaking successful summaries

After a summary is successfully produced, _append_log_entry performs filesystem writes without error handling; if log_dir cannot be created or appended to (e.g., read-only path, permission issues, disk full), the turn raises and fails despite having a valid summary. Because this logging is non-critical telemetry, these I/O failures should be caught and downgraded (e.g., warning) so context compression still completes.
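A guarded variant in line with this suggestion could look like the following (a sketch; the function name, logger, and entry format are illustrative, not code from this PR):

```python
import logging
from datetime import date
from pathlib import Path

logger = logging.getLogger(__name__)

def append_log_entry_safe(log_dir: str, entry: str) -> None:
    """Best-effort Karl telemetry write: I/O failures are downgraded to a
    warning so a successful summary is never discarded over logging."""
    try:
        path = Path(log_dir)
        path.mkdir(parents=True, exist_ok=True)
        log_file = path / f"karl_{date.today():%Y-%m-%d}.log"
        with log_file.open("a", encoding="utf-8") as fh:
            fh.write(entry + "\n")
    except OSError as exc:  # read-only dir, permissions, disk full, ...
        logger.warning("Karl log write failed (non-fatal): %s", exc)
```

Catching only `OSError` keeps genuine programming errors loud while absorbing exactly the filesystem failures the comment describes.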

