feat(text_tasks): add external AI helper modules #884
1larity wants to merge 6 commits into ace-step:main
Conversation
📝 Walkthrough

Adds a suite of external-AI text-task modules: types, protocol validation, request builders, response extraction, and robust JSON/text parsing to convert free-form AI outputs into a normalized ExternalAIPlan.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant ReqHelpers as RequestHelpers
    participant Protocols as Protocols
    participant ExtAI as ExternalAI
    participant RespParser as ResponseParser
    participant JSONParser as JSONParser
    participant Plan as ExternalAIPlan
    App->>ReqHelpers: build_planning_messages(intent, task_focus)
    ReqHelpers->>ReqHelpers: build_task_focus_guidance() / build_intent_specific_guidance()
    ReqHelpers->>Protocols: build_request_for_protocol(protocol, provider, messages...)
    Protocols->>Protocols: normalize_request_protocol() / require_message_pair()
    Protocols-->>ReqHelpers: (payload, headers)
    App->>ExtAI: POST payload + headers
    ExtAI-->>App: raw_response (JSON)
    App->>RespParser: extract_protocol_message_content(raw_response, protocol)
    RespParser->>RespParser: decode & normalize protocol-specific shape
    RespParser-->>App: assistant_text
    App->>JSONParser: load_plan_json_object(assistant_text)
    JSONParser->>JSONParser: iter_json_candidates() / repair / fallback labelled extraction
    JSONParser-->>App: dict (plan candidate)
    App->>RespParser: parse_plan_from_content(dict)
    RespParser->>RespParser: coerce fields (to_bool,to_int,to_float) / post-normalize
    RespParser->>Plan: construct ExternalAIPlan
    Plan-->>RespParser: plan_instance
    RespParser-->>App: ExternalAIPlan
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
acestep/core/generation/handler/init_service_test.py (1)
969-1071: ⚠️ Potential issue | 🔴 Critical: Resolve the unresolved merge conflict on line 969.
Line 969 contains `<<<<<<< HEAD`, making this file invalid Python and preventing import. Lines 989 and 1031 both define `test_load_text_encoder_uses_cpu_safe_dtype_when_offloaded`; the second will overwrite the first. Resolve the conflict and keep a single test definition.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/core/generation/handler/init_service_test.py` around lines 969 - 1071, Remove the leftover merge conflict markers (e.g., the "<<<<<<< HEAD" at the top) and eliminate the duplicated test definition for test_load_text_encoder_uses_cpu_safe_dtype_when_offloaded so only a single valid test remains; keep the working block that constructs _Host, sets offload_to_cpu and dtype, defines the _FakeEncoder helper, patches transformers and os.path.exists, calls host._load_text_encoder_and_tokenizer, and asserts the expected returned path, host.text_encoder/tokenizer, fake_encoder.to_calls and eval_called (references: test_load_text_encoder_uses_cpu_safe_dtype_when_offloaded, _Host, _FakeEncoder, host._load_text_encoder_and_tokenizer).
🧹 Nitpick comments (1)
acestep/text_tasks/external_ai_response_parsing_test.py (1)
17-24: Optional: use `assertRaisesRegex` to make error assertions tighter and shorter.
This removes the extra `assertIn(...)` and keeps expected error text closer to the raise site.

♻️ Suggested refactor

```diff
-        with self.assertRaises(ExternalAIClientError) as exc:
+        with self.assertRaisesRegex(ExternalAIClientError, "missing choices"):
             extract_protocol_message_content(
-                raw_response='{"choices":[]}',
+                raw_response='{"choices":[]}',
                 protocol="openai_chat",
             )
-
-        self.assertIn("missing choices", str(exc.exception))
-        with self.assertRaises(ExternalAIClientError) as exc:
+        with self.assertRaisesRegex(
+            ExternalAIClientError, "Unsupported external response protocol"
+        ):
             extract_protocol_message_content(
-                raw_response='{"choices":[{"message":{"content":"ok"}}]}',
+                raw_response='{"choices":[{"message":{"content":"ok"}}]}',
                 protocol="mystery_protocol",
             )
-
-        self.assertIn("Unsupported external response protocol", str(exc.exception))
```

Also applies to: 28-34
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/text_tasks/external_ai_response_parsing_test.py` around lines 17 - 24, Replace the two-step raise-and-assert pattern with a single assertRaisesRegex to tighten the test: call self.assertRaisesRegex(ExternalAIClientError, "missing choices") around the extract_protocol_message_content invocation (the tested function), and do the same for the second occurrence covering lines 28-34; this removes the separate self.assertIn(...) and keeps the expected "missing choices" message assertion adjacent to the raise site.
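As a side-by-side illustration of the two assertion styles the nitpick compares, here is a self-contained sketch; the stand-in `BoomError` and `explode` are hypothetical names, not code from the PR:

```python
import unittest


class BoomError(Exception):
    """Hypothetical stand-in for ExternalAIClientError."""


def explode() -> None:
    raise BoomError("missing choices in response")


class AssertionStyleDemo(unittest.TestCase):
    def test_two_step_style(self) -> None:
        # Looser pattern: capture the exception, then assert on its text.
        with self.assertRaises(BoomError) as exc:
            explode()
        self.assertIn("missing choices", str(exc.exception))

    def test_regex_style(self) -> None:
        # Tighter pattern: one statement asserts both type and message.
        with self.assertRaisesRegex(BoomError, "missing choices"):
            explode()
```

Both tests pass; the second keeps the expected message adjacent to the raising call.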
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acestep/text_tasks/external_ai_protocols.py`:
- Around line 35-42: The function require_message_pair currently only checks
length; update it to validate that messages[0] has role "system" and messages[1]
has role "user" (and that each message contains a "role" key) and raise
ExternalAIClientError if roles are missing or not exactly that pair (including
swapped roles like user/assistant or assistant/user), so malformed prompts fail
fast; reference the require_message_pair function and ExternalAIClientError when
implementing the role checks and error messages.
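A sketch of the role validation this prompt describes; the names `require_message_pair` and `ExternalAIClientError` mirror the PR, but the body is illustrative, not the repository's implementation:

```python
class ExternalAIClientError(Exception):
    """Raised when an external-AI request or response is malformed."""


def require_message_pair(messages) -> tuple[dict, dict]:
    # Fail fast unless messages is exactly [system, user].
    if not isinstance(messages, (list, tuple)) or len(messages) != 2:
        raise ExternalAIClientError("Expected exactly a system and a user message.")
    system_message, user_message = messages
    for message in (system_message, user_message):
        if not isinstance(message, dict) or not isinstance(message.get("role"), str):
            raise ExternalAIClientError("Each message must be a dict with a 'role' key.")
    # Swapped or unexpected role pairs (user/assistant etc.) are rejected.
    if system_message["role"] != "system" or user_message["role"] != "user":
        raise ExternalAIClientError(
            "Messages must be a system message followed by a user message."
        )
    return system_message, user_message
```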
In `@acestep/text_tasks/external_ai_request_helpers.py`:
- Around line 42-54: The current substring check on normalized_signal wrongly
sets instrumental when a caption contains the word "instrumental" and misses
explicit field-style inputs like "Instrumental: true" because
extract_intent_signal_text strips the field name; update the logic in
external_ai_request_helpers.py around normalized_signal (and the
no_vocals_markers usage) to first detect an explicit Instrumental field
(case-insensitive patterns like "instrumental:true" or "instrumental: true" or a
standalone token "instrumental: true") and only treat that as an instruction to
force instrumental = true (return the existing no-vocals/instrumental message);
otherwise, for freeform captions, tighten the match to only trigger when the
marker is a standalone word or phrase (use word-boundary or exact-token checks)
so phrases like "Caption: instrumental intro with female vocals" do not assert
instrumental. Add unit/regression tests covering "Instrumental: true" and
"Caption: instrumental intro with female vocals" to ensure the explicit field is
honored and freeform captions are not.
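One way the explicit-field-first logic could look; `wants_instrumental` and its exact marker set are illustrative assumptions, not the PR's helper:

```python
import re


def wants_instrumental(signal: str) -> bool:
    """Return True only for an explicit instrumental instruction.

    An explicit field-style "instrumental: true" wins outright; freeform
    captions only count when the marker stands alone, not mid-sentence.
    """
    normalized = signal.strip().lower()
    # Field-style input: "instrumental: true" or "instrumental:true".
    if re.search(r"\binstrumental\s*:\s*true\b", normalized):
        return True
    # Freeform: only exact standalone phrases, never substring matches.
    return normalized in {"instrumental", "instrumental only", "no vocals"}
```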
In `@acestep/text_tasks/external_ai_response_parsing.py`:
- Around line 35-43: The parser currently only reads "key_scale" and
"time_signature" from the loaded JSON (via load_plan_json_object) and will drop
provider fields named "keyscale" or "timesignature"; update the normalization to
accept those aliases by falling back to obj.get("keyscale") when computing
key_scale and obj.get("timesignature") when computing time_signature (i.e., read
obj.get("key_scale") or obj.get("keyscale") for key_scale and
obj.get("time_signature") or obj.get("timesignature") for time_signature, then
apply the same str(...).strip() logic).
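A minimal sketch of the alias fallback, assuming a plain dict input; `normalize_plan_fields` is a hypothetical name for illustration:

```python
def normalize_plan_fields(obj: dict) -> dict:
    """Accept both underscored keys and provider-style aliases."""

    def first_text(*keys: str) -> str:
        # Return the first non-empty value among the candidate keys.
        for key in keys:
            value = obj.get(key)
            if value is not None and str(value).strip():
                return str(value).strip()
        return ""

    return {
        "key_scale": first_text("key_scale", "keyscale"),
        "time_signature": first_text("time_signature", "timesignature"),
    }
```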
In `@acestep/text_tasks/external_ai_types.py`:
- Around line 26-29: The to_dict method currently returns asdict(self) which
serializes fields like key_scale and time_signature but does not include the
pipeline's canonical metadata keys expected by downstream hydration
(metadata["keyscale"] and metadata["timesignature"]); update the to_dict
implementation (method to_dict) to include/translate key_scale -> "keyscale" and
time_signature -> "timesignature" in the returned dict (e.g., add these entries
to the dict returned from asdict(self) or construct the dict manually) so
downstream generation param hydration sees metadata["keyscale"] and
metadata["timesignature"].
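A dataclass sketch of the dual-key serialization the prompt asks for; `PlanSketch` is a stand-in for the PR's `ExternalAIPlan`, reduced to the two affected fields:

```python
from dataclasses import asdict, dataclass


@dataclass
class PlanSketch:
    key_scale: str = ""
    time_signature: str = ""

    def to_dict(self) -> dict:
        # Emit the canonical names plus the runtime aliases that
        # downstream generation-param hydration reads.
        data = asdict(self)
        data["keyscale"] = self.key_scale
        data["timesignature"] = self.time_signature
        return data
```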
---
Outside diff comments:
In `@acestep/core/generation/handler/init_service_test.py`:
- Around line 969-1071: Remove the leftover merge conflict markers (e.g., the
"<<<<<<< HEAD" at the top) and eliminate the duplicated test definition for
test_load_text_encoder_uses_cpu_safe_dtype_when_offloaded so only a single valid
test remains; keep the working block that constructs _Host, sets offload_to_cpu
and dtype, defines the _FakeEncoder helper, patches transformers and
os.path.exists, calls host._load_text_encoder_and_tokenizer, and asserts the
expected returned path, host.text_encoder/tokenizer, fake_encoder.to_calls and
eval_called (references:
test_load_text_encoder_uses_cpu_safe_dtype_when_offloaded, _Host, _FakeEncoder,
host._load_text_encoder_and_tokenizer).
---
Nitpick comments:
In `@acestep/text_tasks/external_ai_response_parsing_test.py`:
- Around line 17-24: Replace the two-step raise-and-assert pattern with a single
assertRaisesRegex to tighten the test: call
self.assertRaisesRegex(ExternalAIClientError, "missing choices") around the
extract_protocol_message_content invocation (the tested function), and do the
same for the second occurrence covering lines 28-34; this removes the separate
self.assertIn(...) and keeps the expected "missing choices" message assertion
adjacent to the raise site.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 29d64329-3546-494f-8325-b5bbe4f33b18
📒 Files selected for processing (10)
- acestep/core/generation/handler/init_service_test.py
- acestep/text_tasks/external_ai_json_parsing.py
- acestep/text_tasks/external_ai_json_parsing_test.py
- acestep/text_tasks/external_ai_protocols.py
- acestep/text_tasks/external_ai_protocols_test.py
- acestep/text_tasks/external_ai_request_helpers.py
- acestep/text_tasks/external_ai_request_helpers_test.py
- acestep/text_tasks/external_ai_response_parsing.py
- acestep/text_tasks/external_ai_response_parsing_test.py
- acestep/text_tasks/external_ai_types.py
ChuxiJ left a comment
Code Review: PRs #881-884 (External LM Feature Set)
Reviewing all four PRs from this feature set together since they form a layered architecture.
Cross-cutting issue (all 4 PRs)
init_service_test.py: All four PRs contain literal <<<<<<< HEAD conflict markers in the shared test file. This must be resolved before any PR can merge — the file will fail to parse/run as-is. Please rebase all branches onto current main.
PR #881 (Storage & Provider Config) — +1139/-0
Strengths: Good separation of concerns, correct file permissions (0o700/0o600), keyring fallback, proper temp-file passphrase handling for OpenSSL.
Issues:
- Security: The passphrase temp file in `_run_openssl` gets default permissions between creation and `unlink`. Use `os.open` with `0o600` or `os.chmod` immediately after creation.
- Thread safety: `save_external_lm_runtime_settings` does read-modify-write on a JSON file without file locking and could corrupt under concurrent access.
- `hydrate_external_lm_env_from_store()` mutates `os.environ` globally; document the intended call site clearly.
PR #882 (Captioning Helpers) — +269/-0
Strengths: Clean, testable helpers. caption_needs_retry heuristic is straightforward.
Issues:
- `apply_user_metadata_overrides` sets both `plan.keyscale`/`plan.key_scale` and `plan.timesignature`/`plan.time_signature`; this dual-attribute pattern suggests the plan object schema is not well-defined. If `plan` is an `ExternalAIPlan` dataclass (from PR #884), setting undefined attributes will fail.
- `build_fallback_caption` produces hardcoded English with no i18n consideration.
PR #883 (Model Discovery Cache) — +584/-0
Strengths: Respects XDG conventions, configurable TTL, good deduplication with OrderedDict.
Issues:
- Blocking HTTP: `urllib.request.urlopen` in `discover_external_models` will block the event loop in async contexts (FastAPI/Gradio). Consider running it in a thread pool.
- Cache key inconsistency: `_build_model_list_urls` does `rstrip("/")` but `_cache_key` doesn't, so a trailing slash creates separate cache entries.
- `invalidate_cached_external_models` silently swallows all `OSError`s.
PR #884 (AI Helper Modules) — +936/-0
Strengths: Robust JSON extraction with multiple fallback layers (fenced blocks → balanced braces → labelled fields). Good error types. 21 tests.
Issues:
- `extract_balanced_json_objects` doesn't reset the escape flag when `start is None`; a backslash in non-JSON text before the first `{` could skip the next character.
- `build_request_for_protocol` hardcodes `temperature: 0.4`; this should be configurable per task.
- `require_json_output` for Anthropic only adds `` stop_sequences: ["```"] `` and doesn't actually enforce JSON output like OpenAI's `response_format`.
Summary
Solid foundation work overall. Blockers: Fix merge conflicts in all 4 PRs. Recommended: Address the passphrase temp-file permissions and blocking HTTP call before merge.
Reviewed the CodeRabbit findings against the current branch. Applied across
Verified as not applicable to the current code:
Validation rerun:

```shell
"/media/stellaris/Linux AI/Acestep/acestep_git/ACE-Step-1.5/ACE-Step-1.5/.venv/bin/python" -m unittest \
  acestep.text_tasks.external_ai_json_parsing_test \
  acestep.text_tasks.external_ai_protocols_test \
  acestep.text_tasks.external_ai_request_helpers_test \
  acestep.text_tasks.external_ai_response_parsing_test \
  acestep.text_tasks.external_ai_types_test
```

Ran 28 tests, all passing.
Looked through the PR-884-specific points from @ChuxiJ against the current branch. What I verified:
So I did not make additional code changes for those review notes beyond the fixes already pushed for the earlier CodeRabbit findings.
🧹 Nitpick comments (1)
acestep/text_tasks/external_ai_request_helpers.py (1)
137-141: Consider normalizing provider strings for robustness.
The checks `provider in {"openai", "zai"}` and `provider == "zai"` are case-sensitive exact matches. If callers pass variants like `"z.ai"`, `"ZAI"`, or `"OpenAI"`, the JSON output settings and thinking-disable logic won't apply.

♻️ Optional: normalize provider at function entry

````diff
 def build_request_for_protocol(
     *,
     protocol: str,
     provider: str,
     ...
 ) -> tuple[dict[str, Any], dict[str, str]]:
     """Build protocol-specific payload and headers."""
     normalized_protocol = normalize_request_protocol(protocol)
+    normalized_provider = (provider or "").strip().lower().replace(".", "")
     if normalized_protocol == "anthropic_messages":
         ...
-    if require_json_output and provider in {"openai", "zai"}:
+    if require_json_output and normalized_provider in {"openai", "zai"}:
         payload["response_format"] = {"type": "json_object"}
         payload["stop"] = ["```"]
-    if disable_thinking and provider == "zai":
+    if disable_thinking and normalized_provider == "zai":
         payload["thinking"] = {"type": "disabled"}
````
Verify each finding against the current code and only fix it if needed. In `@acestep/text_tasks/external_ai_request_helpers.py` around lines 137 - 141, Normalize the provider string at the start of the function (e.g., create a normalized_provider = provider.lower().replace(".", "") or similar) and then use normalized_provider in the existing checks instead of provider; update the require_json_output check to use normalized_provider in {"openai", "zai"} and the disable_thinking check to use normalized_provider == "zai", keeping the same payload assignments to payload["response_format"], payload["stop"], and payload["thinking"] so variants like "OpenAI", "ZAI", or "z.ai" trigger the same behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@acestep/text_tasks/external_ai_request_helpers.py`:
- Around line 137-141: Normalize the provider string at the start of the
function (e.g., create a normalized_provider = provider.lower().replace(".", "")
or similar) and then use normalized_provider in the existing checks instead of
provider; update the require_json_output check to use normalized_provider in
{"openai", "zai"} and the disable_thinking check to use normalized_provider ==
"zai", keeping the same payload assignments to payload["response_format"],
payload["stop"], and payload["thinking"] so variants like "OpenAI", "ZAI", or
"z.ai" trigger the same behavior.
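The normalization step on its own, as a runnable sketch (`normalize_provider` is an illustrative name; the PR's suggested fix inlines the same expression):

```python
def normalize_provider(provider: str) -> str:
    """Collapse case and punctuation variants to one canonical token."""
    # None-safe: treat a missing provider as the empty string.
    return (provider or "").strip().lower().replace(".", "")
```

With this in place, `"z.ai"`, `"ZAI"`, and `" OpenAI "` all hit the same branches as the canonical `"zai"` and `"openai"` spellings.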
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4e7950fd-2c56-48f5-9cb7-dadeceea9236
📒 Files selected for processing (8)
- acestep/text_tasks/external_ai_protocols.py
- acestep/text_tasks/external_ai_protocols_test.py
- acestep/text_tasks/external_ai_request_helpers.py
- acestep/text_tasks/external_ai_request_helpers_test.py
- acestep/text_tasks/external_ai_response_parsing.py
- acestep/text_tasks/external_ai_response_parsing_test.py
- acestep/text_tasks/external_ai_types.py
- acestep/text_tasks/external_ai_types_test.py
✅ Files skipped from review due to trivial changes (1)
- acestep/text_tasks/external_ai_types_test.py
🚧 Files skipped from review as they are similar to previous changes (5)
- acestep/text_tasks/external_ai_protocols_test.py
- acestep/text_tasks/external_ai_response_parsing_test.py
- acestep/text_tasks/external_ai_response_parsing.py
- acestep/text_tasks/external_ai_protocols.py
- acestep/text_tasks/external_ai_request_helpers_test.py
Actionable comments posted: 4
🧹 Nitpick comments (1)
acestep/text_tasks/external_ai_response_parsing.py (1)
15-16: Consider centralizing protocol normalization.
This file redefines the same protocol set and lowercase/strip validation that `acestep/text_tasks/external_ai_protocols.py` already owns. Pulling the supported set and normalizer into one shared helper would keep request and response support from drifting the next time a protocol is added.
As per coding guidelines: Prefer single-responsibility modules with clear boundaries.
Also applies to: 101-109
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/text_tasks/external_ai_response_parsing.py` around lines 15 - 16, The SUPPORTED_RESPONSE_PROTOCOLS constant and inline lowercase/strip validation in external_ai_response_parsing.py duplicate logic already owned by external_ai_protocols.py; refactor by removing the local SUPPORTED_RESPONSE_PROTOCOLS and any in-file normalization and instead import the canonical supported set and normalizer (e.g., SUPPORTED_PROTOCOLS and normalize_protocol) from external_ai_protocols.py and use those in functions that currently reference SUPPORTED_RESPONSE_PROTOCOLS (and the validation logic around protocol handling, including the logic around the code near lines 101-109) so that protocol definitions and normalization live in one single-responsibility module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acestep/text_tasks/external_ai_json_parsing.py`:
- Around line 130-152: The labelled-text fallback currently only recognizes
underscored aliases; update the field_map and the regex in
external_ai_json_parsing.py so it also accepts runtime aliases without
underscores (e.g., "keyscale", "timesignature", "vocallanguage", "vocallanguage"
etc.) to match the JSON-path alias handling in parse_plan_from_content() and
apply_user_metadata_overrides(); specifically add those alias keys to field_map
(mapping them to the canonical names like "key_scale", "time_signature",
"vocal_language") and extend the pattern alternation to include the
non-underscored forms (keep flags and use the same normalization of raw_key via
.lower().replace("-", " ")). Ensure behavior is consistent with
apply_user_metadata_overrides() which writes both canonical and runtime aliases.
In `@acestep/text_tasks/external_ai_protocols.py`:
- Around line 42-50: The code assumes messages[0] and messages[1] are mappings
and that content is non-empty; update the validation around system_message and
user_message to first ensure messages is a sequence with at least two items and
that both entries are dict-like (e.g., isinstance(..., dict)), then check that
each dict has a non-empty string "role" and a non-empty string "content" before
accessing them; if any check fails, raise ExternalAIClientError. Apply these
checks where system_message and user_message are assigned (the variables
system_message, user_message) and keep the existing role-equality checks and
error messages but fail fast on malformed types or content==None/empty string.
In `@acestep/text_tasks/external_ai_request_helpers.py`:
- Around line 55-64: The branch that treats "wordless" as an explicit no-vocals
marker is too broad; remove the bare "wordless" token from the marker tuple in
the conditional that checks normalized_signal (the tuple currently written as
("no vocals", "wordless", "no lead vocals", "instrumental only")). Leave the
other explicit no-vocals strings and the set of exact instrumental matches
intact so requests like "wordless choir" or "wordless vocalise" are not
misclassified as instrumental, and keep the existing return text and the
normalized_signal variable usage unchanged.
In `@acestep/text_tasks/external_ai_response_parsing.py`:
- Around line 82-98: The parser currently assumes choice["message"] is a dict
and blindly stringifies non-string/invalid content which lets malformed payloads
slip through; update the block that handles choices/message/content (variables:
choices, choice, message, content) to first validate message is a dict (raise
ExternalAIClientError if not), then only accept content that is a str or a
list-of-dicts with type=="text" (for list, extract "text" fields as before); for
any other content types (e.g., dicts, numbers, None) raise ExternalAIClientError
instead of coercing to str so invalid shapes fail fast before reaching
load_plan_json_object(), and ensure you no longer rely on attribute access like
message.get without the isinstance check.
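A sketch of the strict shape validation described above, assuming OpenAI-chat-style message dicts; `extract_openai_chat_content` is a hypothetical helper, not the PR's function:

```python
class ExternalAIClientError(Exception):
    """Raised when an external-AI response has an invalid shape."""


def extract_openai_chat_content(message) -> str:
    # Reject non-dict message shapes instead of coercing them.
    if not isinstance(message, dict):
        raise ExternalAIClientError("Chat choice message must be an object.")
    content = message.get("content")
    if isinstance(content, str):
        return content
    # Accept only a list of {"type": "text", "text": ...} parts.
    if isinstance(content, list) and all(
        isinstance(part, dict) and part.get("type") == "text" for part in content
    ):
        return "".join(part.get("text", "") for part in content)
    # Anything else (dict, number, None) fails fast rather than
    # being stringified and passed on to JSON parsing.
    raise ExternalAIClientError("Unsupported message content shape.")
```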
---
Nitpick comments:
In `@acestep/text_tasks/external_ai_response_parsing.py`:
- Around line 15-16: The SUPPORTED_RESPONSE_PROTOCOLS constant and inline
lowercase/strip validation in external_ai_response_parsing.py duplicate logic
already owned by external_ai_protocols.py; refactor by removing the local
SUPPORTED_RESPONSE_PROTOCOLS and any in-file normalization and instead import
the canonical supported set and normalizer (e.g., SUPPORTED_PROTOCOLS and
normalize_protocol) from external_ai_protocols.py and use those in functions
that currently reference SUPPORTED_RESPONSE_PROTOCOLS (and the validation logic
around protocol handling, including the logic around the code near lines
101-109) so that protocol definitions and normalization live in one
single-responsibility module.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2fcc9989-ce00-4ddf-9824-70dd4c8a0d1e
📒 Files selected for processing (10)
- acestep/text_tasks/external_ai_json_parsing.py
- acestep/text_tasks/external_ai_json_parsing_test.py
- acestep/text_tasks/external_ai_protocols.py
- acestep/text_tasks/external_ai_protocols_test.py
- acestep/text_tasks/external_ai_request_helpers.py
- acestep/text_tasks/external_ai_request_helpers_test.py
- acestep/text_tasks/external_ai_response_parsing.py
- acestep/text_tasks/external_ai_response_parsing_test.py
- acestep/text_tasks/external_ai_types.py
- acestep/text_tasks/external_ai_types_test.py
✅ Files skipped from review due to trivial changes (2)
- acestep/text_tasks/external_ai_types.py
- acestep/text_tasks/external_ai_protocols_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
- acestep/text_tasks/external_ai_types_test.py
- acestep/text_tasks/external_ai_json_parsing_test.py
Addressed the new Included updates:
Validation:
-
No unrelated pre-existing issues from
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acestep/text_tasks/external_ai_json_parsing.py`:
- Around line 102-103: The two string replace calls on the repaired variable use
visually confusable mojibake characters; replace those hard-to-read glyphs with
explicit Unicode escapes (e.g., \u201C, \u201D, \u2018, \u2019) or use
chr(0x201C)/chr(0x201D)/chr(0x2018)/chr(0x2019) so the mappings in
external_ai_json_parsing.py are unambiguous and lint-friendly; update the two
replacements that currently target the curly double quotes (U+201C/U+201D) and
curly single quotes (U+2018/U+2019) to use the corresponding explicit codepoint
escapes while keeping the same replacement characters (double or single quotes).
- Around line 146-152: The label-capturing regex assigned to the variable
pattern in external_ai_json_parsing.py fails to match hyphenated aliases (e.g.,
key-scale, time-signature, vocal-language) even though later raw_key
normalization converts '-' to space; update the pattern's alternation groups
(the captured labels:
caption|lyrics|bpm|duration|key...|time...|vocal...|instrumental) to accept
hyphens as well as underscores/spaces by changing occurrences of [_ ] to a
character class that includes hyphen (e.g., [-_ ] or equivalent) or otherwise
add explicit hyphenated variants so lines like "key-scale:" and
"time-signature:" are matched by pattern.
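Both json-parsing fixes above can be sketched together; `repair_curly_quotes`, `LABEL_PATTERN`, and `find_labels` are illustrative names, and the pattern covers only three labels for brevity:

```python
import re


def repair_curly_quotes(text: str) -> str:
    # Explicit \uXXXX escapes keep the quote mapping unambiguous.
    return (
        text.replace("\u201c", '"')  # U+201C left double quotation mark
        .replace("\u201d", '"')      # U+201D right double quotation mark
        .replace("\u2018", "'")      # U+2018 left single quotation mark
        .replace("\u2019", "'")      # U+2019 right single quotation mark
    )


# The character class [-_ ] accepts hyphen, underscore, or space
# between label words, so "key-scale:" matches alongside "key_scale:".
LABEL_PATTERN = re.compile(
    r"^(key[-_ ]?scale|time[-_ ]?signature|vocal[-_ ]?language)\s*:\s*(.+)$",
    re.IGNORECASE | re.MULTILINE,
)


def find_labels(text: str) -> dict:
    out = {}
    for raw_key, value in LABEL_PATTERN.findall(text):
        # Normalize hyphen/space variants to the canonical underscore key.
        out[raw_key.lower().replace("-", "_").replace(" ", "_")] = value.strip()
    return out
```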
In `@acestep/text_tasks/external_ai_request_helpers_test.py`:
- Around line 88-91: The test
test_resolve_max_tokens_for_format_uses_tighter_budget is flaky because
resolve_max_tokens_for_task_focus("format") reads
ACESTEP_EXTERNAL_FORMAT_MAX_TOKENS from the environment; before calling
resolve_max_tokens_for_task_focus("format") ensure the env var is controlled
(e.g., use pytest's monkeypatch to delenv or set
ACESTEP_EXTERNAL_FORMAT_MAX_TOKENS to a known value) so the test is
deterministic, then assert the expected 768; reference
resolve_max_tokens_for_task_focus and ACESTEP_EXTERNAL_FORMAT_MAX_TOKENS when
making the change.
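The deterministic-env pattern in `unittest` terms; this sketch uses a hypothetical `DEMO_FORMAT_MAX_TOKENS` variable and `resolve_budget` stand-in rather than the PR's `ACESTEP_EXTERNAL_FORMAT_MAX_TOKENS` and `resolve_max_tokens_for_task_focus`:

```python
import os
import unittest
from unittest import mock


def resolve_budget() -> int:
    # Stand-in for an env-configurable token budget with a 768 default.
    return int(os.environ.get("DEMO_FORMAT_MAX_TOKENS", "768"))


class BudgetTest(unittest.TestCase):
    def test_default_budget_is_deterministic(self) -> None:
        # patch.dict clears the variable for the duration and restores
        # the real environment afterwards, so the test cannot be flaky.
        with mock.patch.dict(os.environ, {}, clear=True):
            self.assertEqual(resolve_budget(), 768)

    def test_env_override(self) -> None:
        with mock.patch.dict(os.environ, {"DEMO_FORMAT_MAX_TOKENS": "512"}):
            self.assertEqual(resolve_budget(), 512)
```

In pytest-style suites, `monkeypatch.delenv`/`monkeypatch.setenv` achieves the same isolation.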
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e20129a0-f551-42c3-99a6-bb4e201412b0
📒 Files selected for processing (8)
acestep/text_tasks/external_ai_json_parsing.pyacestep/text_tasks/external_ai_json_parsing_test.pyacestep/text_tasks/external_ai_protocols.pyacestep/text_tasks/external_ai_protocols_test.pyacestep/text_tasks/external_ai_request_helpers.pyacestep/text_tasks/external_ai_request_helpers_test.pyacestep/text_tasks/external_ai_response_parsing.pyacestep/text_tasks/external_ai_response_parsing_test.py
✅ Files skipped from review due to trivial changes (1)
- acestep/text_tasks/external_ai_request_helpers.py
🚧 Files skipped from review as they are similar to previous changes (3)
- acestep/text_tasks/external_ai_json_parsing_test.py
- acestep/text_tasks/external_ai_response_parsing_test.py
- acestep/text_tasks/external_ai_protocols.py
|
Follow-up
🧹 Nitpick comments (1)
acestep/text_tasks/external_ai_request_helpers_test.py (1)
171-182: Prefer `assertRaisesRegex` for tighter exception-message checks.
Using `assertRaisesRegex(...)` here would make failures more precise and consistent with the stricter assertion style used elsewhere in this PR.

♻️ Suggested test refinement

```diff
-        with self.assertRaises(ExternalAIClientError) as exc:
+        with self.assertRaisesRegex(
+            ExternalAIClientError, "requires both system and user messages"
+        ):
             build_request_for_protocol(
                 protocol="anthropic_messages",
                 provider="claude",
                 api_key="test-key",
                 model="claude-sonnet-4-5",
                 messages=[{"role": "system", "content": "s"}],
                 base_url="https://api.anthropic.com/v1/messages",
             )
-
-        self.assertIn("requires both system and user messages", str(exc.exception))
-        with self.assertRaises(ExternalAIClientError) as exc:
+        with self.assertRaisesRegex(
+            ExternalAIClientError, "system message followed by a user message"
+        ):
             build_request_for_protocol(
                 protocol="anthropic_messages",
                 provider="claude",
                 api_key="test-key",
                 model="claude-sonnet-4-5",
                 messages=[{"role": "user", "content": "u"}, {"role": "assistant", "content": "a"}],
                 base_url="https://api.anthropic.com/v1/messages",
             )
-
-        self.assertIn("system message followed by a user message", str(exc.exception))
-        with self.assertRaises(ExternalAIClientError) as exc:
+        with self.assertRaisesRegex(
+            ExternalAIClientError, "Unsupported external request protocol"
+        ):
             build_request_for_protocol(
                 protocol="mystery_protocol",
                 provider="openai",
                 api_key="test-key",
                 model="gpt-4o-mini",
                 messages=[{"role": "system", "content": "s"}, {"role": "user", "content": "u"}],
                 base_url="https://api.openai.com/v1/chat/completions",
             )
-
-        self.assertIn("Unsupported external request protocol", str(exc.exception))
```

Also applies to: 186-197, 201-212
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/text_tasks/external_ai_request_helpers_test.py` around lines 171 - 182, Replace the loose exception checks using self.assertRaises + self.assertIn with self.assertRaisesRegex to assert the ExternalAIClientError message exactly; e.g., in the test cases that call build_request_for_protocol (protocol="anthropic_messages", provider="claude", ... ) and expect the "requires both system and user messages" error, change to with self.assertRaisesRegex(ExternalAIClientError, r"requires both system and user messages"): and do the same pattern for the other two failing blocks (the similar calls around the other assertions at the next two ranges) so the test directly asserts the error type and message in one statement.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@acestep/text_tasks/external_ai_request_helpers_test.py`:
- Around line 171-182: Replace the loose exception checks using
self.assertRaises + self.assertIn with self.assertRaisesRegex to assert the
ExternalAIClientError message exactly; e.g., in the test cases that call
build_request_for_protocol (protocol="anthropic_messages", provider="claude",
... ) and expect the "requires both system and user messages" error, change to
with self.assertRaisesRegex(ExternalAIClientError, r"requires both system and
user messages"): and do the same pattern for the other two failing blocks (the
similar calls around the other assertions at the next two ranges) so the test
directly asserts the error type and message in one statement.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b063ce2f-d7ea-4b1b-9ba0-af67b0dda8ea
📒 Files selected for processing (3)
- acestep/text_tasks/external_ai_json_parsing.py
- acestep/text_tasks/external_ai_json_parsing_test.py
- acestep/text_tasks/external_ai_request_helpers_test.py
✅ Files skipped from review due to trivial changes (1)
- acestep/text_tasks/external_ai_json_parsing.py
🚧 Files skipped from review as they are similar to previous changes (1)
- acestep/text_tasks/external_ai_json_parsing_test.py
ChuxiJ left a comment
Code Review: External AI Helper Modules
Nice work on the clean module separation and thorough test coverage (21 tests). A few items to address:
Bugs
- `extract_balanced_json_objects` escape handling: the `escape` flag is tracked globally but should only apply inside strings. A stray `\{` outside a JSON string causes the `{` to be skipped, potentially missing an object start.
- `extract_json_block` greedy fallback: the `content.find("{")` to `content.rfind("}")` approach captures everything between the first and last brace in the entire input. With multiple JSON objects, this produces invalid JSON. Since this candidate is tried first via `iter_json_candidates`, it runs before the balanced extractor.
- `to_int` truncates floats silently: `int(float("3.7"))` evaluates to 3. For BPM, use `round()` instead, or document the truncation.
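The escape-handling bug can be avoided by only tracking backslashes while inside a string. This is an illustrative sketch of that approach, not the PR's implementation:

```python
def extract_balanced_json_objects(text: str) -> list:
    """Collect top-level {...} spans, tracking escapes only inside strings."""
    objects = []
    depth = 0
    start = None
    in_string = False
    escape = False
    for index, char in enumerate(text):
        if start is None:
            # Outside any object: a backslash here must NOT swallow the
            # next character, so we only look for an opening brace.
            if char == "{":
                start, depth = index, 1
            continue
        if in_string:
            if escape:
                escape = False
            elif char == "\\":
                escape = True
            elif char == '"':
                in_string = False
            continue
        if char == '"':
            in_string = True
        elif char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                objects.append(text[start : index + 1])
                start = None
    return objects
```

Braces inside string literals are ignored, and leading non-JSON text containing backslashes no longer shifts the scan.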
Code Quality
- Redundant validation in `require_message_pair`: after confirming `role == "system"` / `role == "user"` (lines ~390-393), the subsequent `isinstance(role, str)` and non-empty checks are guaranteed to pass.
- `to_dict()` dual keys: both `key_scale`/`keyscale` and `time_signature`/`timesignature` are emitted. Document whether this is intentional, or remove one form.
- `SUPPORTED_REQUEST_PROTOCOLS`: defined as an alias of `SUPPORTED_PROTOCOLS` but unused in this PR; dead code.
Architecture
- Hardcoded magic strings: provider names (`"openai"`, `"zai"`, `"ollama"`) and `"temperature": 0.4` are hardcoded. Consider constants or an enum for providers, and a parameter with a default for temperature.
- Missing `__init__.py`: verify whether the project uses explicit packages; if so, the new `text_tasks/` directory needs one.
- No input size guard: `extract_balanced_json_objects` iterates char-by-char with no length limit. A large malformed response could cause high CPU usage.
Overall solid foundation — the modular decomposition is clean and tests are well-written. The escape-handling bug (#1) is the most impactful item.
Summary
This PR adds the provider-agnostic helper modules used to build requests and parse structured responses for external AI integrations.
Scope
In scope:
Out of scope:
What Changed
- `external_ai_types.py` for shared plan and error types
- `external_ai_protocols.py` for supported protocol validation and message-shape helpers
- `external_ai_json_parsing.py` for JSON extraction, repair, and labelled-field fallbacks
- `external_ai_request_helpers.py` for request guidance, payload construction, and HTTP error guidance
- `external_ai_response_parsing.py` for protocol-specific response content extraction and plan parsing
- `external_ai_protocols_test.py`
Ran 21 tests, all passing.
Risk / Compatibility
This PR is based directly on `main`. It stays entirely inside helper modules under `acestep/text_tasks` and does not add network execution or raw-response debug logging.

Related Upstream Context
This PR is one small slice of the broader external-LM work that was originally bundled into #808.
Related upstream references:
This slice focuses on shared external-AI helper logic such as message validation, prompt shaping, JSON extraction, and response parsing. It does not include provider secret storage, model cache persistence, or UI wiring.